* [PATCH 0/2] Using MMIO based flips on VLV
@ 2014-01-09 11:26 akash.goel
  2014-01-09 11:26 ` [PATCH 1/2] drm/i915: Creating a new workqueue to handle MMIO flip work items akash.goel
                   ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: akash.goel @ 2014-01-09 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Replace blitter command streamer based flips with MMIO flips on VLV.
This is for Media power well residency optimization. The blitter ring
is currently being used just for the command streamer based flip calls.
For pure 3D workloads, with MMIO flips, there will be no use of the blitter
ring, and this will ensure 100% residency (in D0i3 state) for the Media well.
This change contributed to decent power savings.
The other alternative of having Render ring based flip calls is not being
used, as that option adversely affects the performance (FPS) of certain 3D apps.
Also, going forward, newer platforms like CHV will use atomic flips,
and those will most probably use MMIO based flips only.

Akash Goel (2):
  drm/i915: Creating a new workqueue to handle MMIO flip work items
  drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for
    VLV.

 drivers/gpu/drm/i915/i915_dma.c      |  25 ++++++
 drivers/gpu/drm/i915/i915_drv.h      |   3 +
 drivers/gpu/drm/i915/i915_gem.c      |   4 +-
 drivers/gpu/drm/i915/intel_display.c | 147 +++++++++++++++++++++++++++++++++++
 4 files changed, 177 insertions(+), 2 deletions(-)

-- 
1.8.5.2

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 1/2] drm/i915: Creating a new workqueue to handle MMIO flip work items
  2014-01-09 11:26 [PATCH 0/2] Using MMIO based flips on VLV akash.goel
@ 2014-01-09 11:26 ` akash.goel
  2014-01-09 11:26 ` [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV akash.goel
  2014-01-09 11:29 ` [PATCH 0/2] Using MMIO based flips on VLV Chris Wilson
  2 siblings, 0 replies; 67+ messages in thread
From: akash.goel @ 2014-01-09 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

MMIO flips will replace the blitter command streamer based flips on VLV.
Create our own private workqueue for handling the MMIO based flips,
so as to avoid blocking the display thread (which issued the flip ioctl)
on the synchronization needed between the rendering & flip stages.
With MMIO based flips, the synchronization requires a wait in sw
for the ongoing rendering operation, if any, to complete before issuing
the flip. With this workqueue, the wait for rendering to complete
is done by the worker thread.

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c | 25 +++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index ee9502b..ee4b445 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1609,6 +1609,29 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 		goto out_mtrrfree;
 	}
 
+	/*
+	 * Create our own private workqueue for handling the
+	 * MMIO based flips. This avoids blocking the display
+	 * thread (which issued the flip ioctl) on the
+	 * synchronization needed between the rendering & flip stages.
+	 * With MMIO based flips, the synchronization requires a
+	 * wait in sw for the ongoing rendering operation, if any, to
+	 * complete before issuing the flip. With this workqueue,
+	 * the wait on rendering is done by the worker thread.
+	 *
+	 * Since the flip work has to be processed as early as possible,
+	 * the HIGHPRI flag is used here. We also use an ordered queue, as
+	 * we need serialized execution, although we support only one
+	 * outstanding flip call, so it probably doesn't really matter.
+	 */
+	dev_priv->flipwq = alloc_ordered_workqueue("i915_flip", WQ_HIGHPRI);
+	if (dev_priv->flipwq == NULL) {
+		DRM_ERROR("Failed to create flip workqueue.\n");
+		ret = -ENOMEM;
+		destroy_workqueue(dev_priv->wq);
+		goto out_mtrrfree;
+	}
+
 	intel_irq_init(dev);
 	intel_uncore_sanitize(dev);
 
@@ -1685,6 +1708,7 @@ out_gem_unload:
 
 	intel_teardown_gmbus(dev);
 	intel_teardown_mchbar(dev);
+	destroy_workqueue(dev_priv->flipwq);
 	destroy_workqueue(dev_priv->wq);
 out_mtrrfree:
 	arch_phys_wc_del(dev_priv->gtt.mtrr);
@@ -1792,6 +1816,7 @@ int i915_driver_unload(struct drm_device *dev)
 	intel_teardown_mchbar(dev);
 
 	destroy_workqueue(dev_priv->wq);
+	destroy_workqueue(dev_priv->flipwq);
 	pm_qos_remove_request(&dev_priv->pm_qos);
 
 	dev_priv->gtt.base.cleanup(&dev_priv->gtt.base);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cc8afff..2e22430 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1451,6 +1451,7 @@ typedef struct drm_i915_private {
 	 * result in deadlocks.
 	 */
 	struct workqueue_struct *wq;
+	struct workqueue_struct *flipwq;
 
 	/* Display functions */
 	struct drm_i915_display_funcs display;
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
  2014-01-09 11:26 [PATCH 0/2] Using MMIO based flips on VLV akash.goel
  2014-01-09 11:26 ` [PATCH 1/2] drm/i915: Creating a new workqueue to handle MMIO flip work items akash.goel
@ 2014-01-09 11:26 ` akash.goel
  2014-01-09 11:31   ` Chris Wilson
  2014-01-09 11:29 ` [PATCH 0/2] Using MMIO based flips on VLV Chris Wilson
  2 siblings, 1 reply; 67+ messages in thread
From: akash.goel @ 2014-01-09 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel

From: Akash Goel <akash.goel@intel.com>

Use MMIO based flips on VLV for Media power well residency optimization.
The blitter ring is currently being used just for the command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use of the
blitter ring, and this will ensure 100% residency in D0i3 for the Media well.
The other alternative of having Render ring based flip calls is not being used,
as that option adversely affects the performance (FPS) of certain 3D apps.

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |   2 +
 drivers/gpu/drm/i915/i915_gem.c      |   4 +-
 drivers/gpu/drm/i915/intel_display.c | 147 +++++++++++++++++++++++++++++++++++
 3 files changed, 151 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2e22430..6d1e496 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -74,6 +74,7 @@ enum plane {
 	PLANE_A = 0,
 	PLANE_B,
 	PLANE_C,
+	I915_MAX_PLANES
 };
 #define plane_name(p) ((p) + 'A')
 
@@ -2114,6 +2115,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 656406d..1a33501 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -957,7 +957,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+int
 i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
 {
 	int ret;
@@ -1008,7 +1008,7 @@ static bool can_wait_boost(struct drm_i915_file_private *file_priv)
  * Returns 0 if the seqno was found within the alloted time. Else returns the
  * errno with remaining time filled in timeout argument.
  */
-static int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
+int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno,
 			unsigned reset_counter,
 			bool interruptible,
 			struct timespec *timeout,
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4d1357a..25aa3a8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -52,6 +52,9 @@ static void ironlake_pch_clock_get(struct intel_crtc *crtc,
 static int intel_set_mode(struct drm_crtc *crtc, struct drm_display_mode *mode,
 			  int x, int y, struct drm_framebuffer *old_fb);
 
+int __wait_seqno(struct intel_ring_buffer *ring, u32 seqno, unsigned reset_counter,
+		 bool interruptible, struct timespec *timeout,
+		 struct drm_i915_file_private *file_priv);
 
 typedef struct {
 	int	min, max;
@@ -68,6 +71,24 @@ struct intel_limit {
 	intel_p2_t	    p2;
 };
 
+struct i915_flip_data {
+	struct drm_crtc *crtc;
+	u32 seqno;
+	u32 ring_id;
+};
+
+struct i915_flip_work {
+	struct i915_flip_data flipdata;
+	struct work_struct  work;
+};
+
+/*
+ * Need one work item only for each primary plane,
+ * as we support only one outstanding flip request
+ * on each plane at a time.
+ */
+static struct i915_flip_work flip_works[I915_MAX_PLANES];
+
 int
 intel_pch_rawclk(struct drm_device *dev)
 {
@@ -8588,6 +8609,123 @@ err:
 	return ret;
 }
 
+static void intel_gen7_queue_mmio_flip_work(struct work_struct *__work)
+{
+	struct i915_flip_work *flipwork =
+		container_of(__work, struct i915_flip_work, work);
+	int ret = 0;
+	unsigned int reset_counter;
+	struct drm_crtc *crtc = flipwork->flipdata.crtc;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *ring =
+			&dev_priv->ring[flipwork->flipdata.ring_id];
+
+	if (dev_priv->ums.mm_suspended || (ring->obj == NULL)) {
+		DRM_ERROR("flip attempted while the ring is not ready\n");
+		return;
+	}
+
+	/*
+	 * Wait is needed only for non-zero seqnos, as a zero seqno indicates
+	 * that the rendering on the object (through the GPU) has either
+	 * already completed or was not initiated at all.
+	 */
+	if (flipwork->flipdata.seqno > 0) {
+		reset_counter =
+			atomic_read(&dev_priv->gpu_error.reset_counter);
+		/* sleep wait until the seqno has passed */
+		ret = __wait_seqno(ring, flipwork->flipdata.seqno,
+					reset_counter, true, NULL, NULL);
+		if (ret)
+			DRM_ERROR("wait_seqno failed on seqno 0x%x(%d)\n",
+				flipwork->flipdata.seqno,
+				flipwork->flipdata.ring_id);
+	}
+
+	intel_mark_page_flip_active(intel_crtc);
+	i9xx_update_plane(crtc, crtc->fb, 0, 0);
+}
+
+/*
+ * Using MMIO based flips starting from VLV, for Media power well
+ * residency optimization. The other alternative of having Render
+ * ring based flip calls is not being used, as that option adversely
+ * affects the performance (FPS) of certain 3D Apps.
+ */
+static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct i915_flip_work *work = &flip_works[intel_crtc->plane];
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	switch (intel_crtc->plane) {
+	case PLANE_A:
+	case PLANE_B:
+	case PLANE_C:
+		break;
+	default:
+		WARN_ONCE(1, "unknown plane in flip command\n");
+		ret = -ENODEV;
+		goto err_unpin;
+	}
+
+	work->flipdata.crtc  = crtc;
+	work->flipdata.seqno = obj->last_write_seqno;
+	work->flipdata.ring_id = RCS;
+
+	if (obj->last_write_seqno > 0) {
+		if (obj->ring) {
+			work->flipdata.ring_id = obj->ring->id;
+			/*
+			 * Check if there is a need to add the request
+			 * in the ring to emit the seqno for this fb obj
+			 */
+			ret = i915_gem_check_olr(obj->ring,
+						obj->last_write_seqno);
+			if (ret)
+				goto err_unpin;
+		} else {
+			DRM_ERROR("NULL ring for active obj with seqno %x\n",
+				obj->last_write_seqno);
+			ret = -EINVAL;
+			goto err_unpin;
+		}
+	}
+
+	/*
+	 * Flush the work as it could be running now.
+	 * This can only happen in a particular, very rare
+	 * condition, as we allow only 1 outstanding flip at
+	 * a time through the 'unpin_work' variable.
+	 * To be checked whether it's really needed or not.
+	 */
+	flush_work(&work->work);
+
+	/*
+	 * Queue the MMIO flip work in our private workqueue.
+	 */
+	queue_work(dev_priv->flipwq, &work->work);
+
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
+
 static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
@@ -10171,6 +10309,12 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
 	dev_priv->pipe_to_crtc_mapping[intel_crtc->pipe] = &intel_crtc->base;
 
 	drm_crtc_helper_add(&intel_crtc->base, &intel_helper_funcs);
+
+	/*
+	 * Initialize the flip work item (one per primary plane)
+	 */
+	INIT_WORK(&flip_works[intel_crtc->plane].work,
+		  intel_gen7_queue_mmio_flip_work);
 }
 
 enum pipe intel_get_pipe_from_connector(struct intel_connector *connector)
@@ -10675,6 +10819,9 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
+	if (IS_VALLEYVIEW(dev))
+		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
+
 	intel_panel_init_backlight_funcs(dev);
 }
 
-- 
1.8.5.2

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 0/2] Using MMIO based flips on VLV
  2014-01-09 11:26 [PATCH 0/2] Using MMIO based flips on VLV akash.goel
  2014-01-09 11:26 ` [PATCH 1/2] drm/i915: Creating a new workqueue to handle MMIO flip work items akash.goel
  2014-01-09 11:26 ` [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV akash.goel
@ 2014-01-09 11:29 ` Chris Wilson
  2 siblings, 0 replies; 67+ messages in thread
From: Chris Wilson @ 2014-01-09 11:29 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Thu, Jan 09, 2014 at 04:56:37PM +0530, akash.goel@intel.com wrote:
> The other alternative of having Render ring based flip calls is not being
> used, as that option adversely affects the performance (FPS) of certain 3D Apps
> Also going forward, for newer platforms like CHV, the atomic flips will be
> used and for that most probably the MMIO based flips only will be used.

That's interesting as there is no mention of render support in the spec.
Care to elaborate?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
  2014-01-09 11:26 ` [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV akash.goel
@ 2014-01-09 11:31   ` Chris Wilson
  2014-01-13  9:47     ` Goel, Akash
  0 siblings, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-01-09 11:31 UTC (permalink / raw)
  To: akash.goel; +Cc: intel-gfx

On Thu, Jan 09, 2014 at 04:56:39PM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Using MMIO based flips now on VLV for Media power well residency optimization.
> The blitter ring is currently being used just for the command streamer based
> flip calls. For pure 3D workloads, with MMIO flips, there will be no use of
> blitter ring and this will ensure the 100% residency in D0i3 for Media well.
> The other alternative of having Render ring based flip calls is not being used,
> as that option adversely affects the performance (FPS) of certain 3D Apps

Rather than exporting deep magic from i915_gem, just emit the request after
the mmio flip and use the normal signalling mechanisms. There are other
users who could also use a request after a flip.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
  2014-01-09 11:31   ` Chris Wilson
@ 2014-01-13  9:47     ` Goel, Akash
  2014-02-07 11:59       ` Goel, Akash
  0 siblings, 1 reply; 67+ messages in thread
From: Goel, Akash @ 2014-01-13  9:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

>> Rather than exporting deep magic from i915_gem, just emit the request after the mmio flip and use the normal signalling mechanisms. There are other users who could also use a request after a flip.

Sorry, I couldn't understand your point.

With command streamer based flips, we could use the cross-ring MBOX mechanism at the HW level to ensure that the buffer is flipped only when the rendering is completed.

But with MMIO flips, we need to somehow introduce a wait for the rendering to complete before updating the Display Surface Address register to effect the flip.
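
To make the ordering requirement concrete, here is a minimal userspace sketch of the check that such a wait has to satisfy. It is not driver code: the function names are hypothetical, and the only thing borrowed from the driver is the wrap-safe signed-difference comparison that i915 uses for seqnos. A zero seqno is treated as "nothing to wait for", matching the comment in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wrap-safe test for "has `current` reached `target`?" on 32-bit seqnos.
 * The signed difference stays correct across a counter wrap, as long as
 * the two values are less than 2^31 apart.
 */
bool seqno_passed(uint32_t current, uint32_t target)
{
	return (int32_t)(current - target) >= 0;
}

/*
 * Hypothetical helper: may the surface address register be written yet?
 * last_write_seqno == 0 means no GPU write is outstanding on the buffer.
 */
bool flip_may_proceed(uint32_t ring_seqno, uint32_t last_write_seqno)
{
	if (last_write_seqno == 0)
		return true;
	return seqno_passed(ring_seqno, last_write_seqno);
}
```

With command streamer flips the hardware enforces this condition; with MMIO flips, software has to evaluate it (or sleep until it becomes true) before touching the register.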

Best Regards
Akash


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
  2014-01-13  9:47     ` Goel, Akash
@ 2014-02-07 11:59       ` Goel, Akash
  2014-02-07 14:47         ` Daniel Vetter
  0 siblings, 1 reply; 67+ messages in thread
From: Goel, Akash @ 2014-02-07 11:59 UTC (permalink / raw)
  To: 'Chris Wilson'; +Cc: 'intel-gfx@lists.freedesktop.org'

Could you please elaborate here? It will help us proceed further with this patch.

Best Regards
Akash


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
  2014-02-07 11:59       ` Goel, Akash
@ 2014-02-07 14:47         ` Daniel Vetter
  2014-02-07 17:17           ` Goel, Akash
       [not found]           ` <8BF5CF93467D8C498F250C96583BC09CC718E3@BGSMSX103.gar.corp.intel.com>
  0 siblings, 2 replies; 67+ messages in thread
From: Daniel Vetter @ 2014-02-07 14:47 UTC (permalink / raw)
  To: Goel, Akash; +Cc: 'intel-gfx@lists.freedesktop.org'

On Fri, Feb 07, 2014 at 11:59:29AM +0000, Goel, Akash wrote:
> Could you please elaborate here? It will help us proceed further with this patch.

As Chris said, instead of rolling your own code to track when flips are
emitted to the ring, simply add a real request (with the add_request
function) like the execbuf paths. Then add any additional tracking you need
to our request structure.
-Daniel


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
  2014-02-07 14:47         ` Daniel Vetter
@ 2014-02-07 17:17           ` Goel, Akash
       [not found]           ` <8BF5CF93467D8C498F250C96583BC09CC718E3@BGSMSX103.gar.corp.intel.com>
  1 sibling, 0 replies; 67+ messages in thread
From: Goel, Akash @ 2014-02-07 17:17 UTC (permalink / raw)
  To: Daniel Vetter, 'intel-gfx; +Cc: Vetter, Daniel

>> As Chris said, instead of rolling your own code to track when flips are emitted to the ring, simply add a real request (with the add_request function)
>> like the execbuf paths. Then add any additional tracking you need to our request structure.
>> -Daniel

I am really sorry, but I still don't follow.
Earlier, when we added an MI_DISPLAY_FLIP command to the blitter ring, the cross-ring MBOX synchronization at the HW level ensured that the flip command would be parsed/executed by BCS only after the rendering had completed on RCS.
But now, with MMIO based flips, we somehow need to wait for the rendering to complete before updating the Display Surface register. We want to avoid a synchronous wait in the page flip call, so we call 'wait_seqno' from the context of a worker thread, and once the wait completes we update the plane register to effect the flip.
I don't understand how 'add_request' will help here.
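
The shape of that approach can be sketched in userspace with pthreads. This is only an analogy under stated assumptions: `fake_gpu`, `flip_job`, `queue_flip` and `gpu_complete` are hypothetical names, a condition variable stands in for the seqno wait, and a plain struct field stands in for the plane register. The point is that the caller returns immediately while the worker sleeps until rendering is done.

```c
#include <pthread.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins, not i915 symbols. */
struct fake_gpu {
	pthread_mutex_t lock;
	pthread_cond_t  seqno_advanced;
	uint32_t        completed_seqno;  /* last seqno the "GPU" finished */
	uint32_t        display_surface;  /* stands in for the plane register */
};

struct flip_job {
	struct fake_gpu *gpu;
	uint32_t wait_seqno;   /* 0 means nothing to wait for */
	uint32_t new_surface;  /* address to flip to */
};

/* Worker: sleep until rendering is done, then do the "MMIO" write. */
void *flip_worker(void *arg)
{
	struct flip_job *job = arg;
	struct fake_gpu *gpu = job->gpu;

	pthread_mutex_lock(&gpu->lock);
	/* Wrap-safe comparison: keep waiting while the seqno has not passed. */
	while (job->wait_seqno != 0 &&
	       (int32_t)(gpu->completed_seqno - job->wait_seqno) < 0)
		pthread_cond_wait(&gpu->seqno_advanced, &gpu->lock);
	gpu->display_surface = job->new_surface;
	pthread_mutex_unlock(&gpu->lock);
	return NULL;
}

/* "Flip ioctl" path: hand the job to a worker and return without waiting. */
int queue_flip(struct flip_job *job, pthread_t *worker)
{
	return pthread_create(worker, NULL, flip_worker, job);
}

/* Called when the simulated GPU finishes all work up to `seqno`. */
void gpu_complete(struct fake_gpu *gpu, uint32_t seqno)
{
	pthread_mutex_lock(&gpu->lock);
	gpu->completed_seqno = seqno;
	pthread_cond_broadcast(&gpu->seqno_advanced);
	pthread_mutex_unlock(&gpu->lock);
}
```

In the patch itself the worker is a work item on a private ordered workqueue and the wait is done by __wait_seqno; the sketch only mirrors the control flow.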

Best Regards
Akash


^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
       [not found]           ` <8BF5CF93467D8C498F250C96583BC09CC718E3@BGSMSX103.gar.corp.intel.com>
@ 2014-03-07 13:17             ` Gupta, Sourab
  2014-03-07 13:30               ` Ville Syrjälä
  0 siblings, 1 reply; 67+ messages in thread
From: Gupta, Sourab @ 2014-03-07 13:17 UTC (permalink / raw)
  To: daniel, Chris Wilson; +Cc: intel-gfx, Goel, Akash

Hi Daniel/Chris,

> As Chris said, instead of rolling your own code to track when flips are emitted to the ring, simply add a real request (with the add_request function) like the execbuf paths. Then add any additional tracking you need to our request structure.

We can use the 'add_request' function. But since we already have, and can track, the seqno of the object (for which we want the rendering to complete), we think there would be no additional benefit in adding a new request and then tracking that.

> Rather than exporting deep magic from i915_gem, just emit the request after the mmio flip and use the normal signalling mechanisms. There are other users who could also use a request after a flip.

We had the following point in mind when implementing the MMIO based page flips:
We wanted to completely avoid locking the device mutex in the flip path, as we had sometimes seen flips getting delayed while waiting for concurrent execbuffer processing to release the mutex.
Since the public functions (i915_wait_seqno) require the mutex to be taken beforehand, we had no choice but to expose the private __wait_seqno function.
Also, we couldn't find any other signaling mechanism (other than the wait_seqno type of functions) for this.

Could you please provide your feedback on the above points?

Regards,
Sourab



^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV.
  2014-03-07 13:17             ` Gupta, Sourab
@ 2014-03-07 13:30               ` Ville Syrjälä
  2014-03-13  7:21                 ` [PATCH] drm/i915: Replaced Blitter ring based flips with MMIO flips " sourab.gupta
  2014-03-13  9:01                 ` [PATCH v2] " sourab.gupta
  0 siblings, 2 replies; 67+ messages in thread
From: Ville Syrjälä @ 2014-03-07 13:30 UTC (permalink / raw)
  To: Gupta, Sourab; +Cc: intel-gfx, Goel, Akash

On Fri, Mar 07, 2014 at 01:17:17PM +0000, Gupta, Sourab wrote:
> Hi Daniel/Chris,
> 
> > As Chris said, instead of rolling your own code to track when flips are emitted to the ring, simply add a real request (with the add_request function) like the execbuf paths. Then add any additional tracking you need to our request structure.
> 
> We can use the 'add_request' function. But since we already have & can track the seqno of the object (for which we want the rendering to complete), we think there shall be no additional benefit in adding a new request & then tracking the same.
> 
> > Rather exporting deep magic from i915_gem, just emit the request after the mmio flip and use the normal signalling mechanisms. There are other users who could also use a request after a flip.
> 
> We had the following point in mind, when implementing the Mmio based Page flips :
> We wanted to completely avoid locking of the device mutex from the flip path. As we had seen sometimes the flips getting delayed because of concurrent exec buffers processing, while we are waiting for them to release the mutex. 
> Since the public functions (i915_wait_seqno ) require mutex to be taken beforehand, we had no choice but to expose the private __wait_seqno function in order to do so.
> Also, we couldn't find any other signaling mechanism (other than wait_seqno  type of functions) to do so.
> 
> Can you please provide your feedback on the above points.

I'd like to move this towards the atomic age. In my atomic branch I had
an interrupt driven mechanism for issuing the flips when the target
seqno(s) is/are reached. So my hope would be to move towards something
similar so that it can be easily used for the nuclear flip later.

Here's my latest code for that:
https://gitorious.org/vsyrjala/linux/commit/4cad93ab1ac09d4649419789a80408c5d8505cdc

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 67+ messages in thread
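
The interrupt-driven approach described above can be illustrated with a small user-space sketch. It is not the code from the linked branch or from any patch in this thread: `struct pending_flip`, `notify_flips()` and the `flips_done` counter are hypothetical names, the counter stands in for the actual MMIO write, and locking is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PIPES 3

/* Hypothetical per-pipe record of a flip waiting for rendering to finish. */
struct pending_flip {
	uint32_t seqno;          /* target seqno; 0 means nothing is pending */
	uint32_t ring_id;        /* ring whose seqno is being awaited */
	unsigned int flips_done; /* stands in for the MMIO register write */
};

/* Wraparound-safe "cur is at or past target" check on 32-bit seqnos. */
static bool seqno_passed(uint32_t cur, uint32_t target)
{
	return (int32_t)(cur - target) >= 0;
}

/*
 * Called from the (simulated) ring interrupt: issue every pending flip
 * on this ring whose target seqno has been reached.
 * Returns the number of flips issued.
 */
static int notify_flips(struct pending_flip *flips, uint32_t ring_id,
			uint32_t cur_seqno)
{
	int issued = 0;

	for (int pipe = 0; pipe < MAX_PIPES; pipe++) {
		struct pending_flip *f = &flips[pipe];

		if (f->seqno == 0 || f->ring_id != ring_id)
			continue;
		if (!seqno_passed(cur_seqno, f->seqno))
			continue;

		f->flips_done++; /* the real code would program the plane here */
		f->seqno = 0;    /* mark the slot as free again */
		issued++;
	}
	return issued;
}
```

In this sketch the flip path only arms a slot and returns; the work of issuing the flip moves to whoever observes the ring's seqno advancing, so nothing has to block while rendering completes.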

* [PATCH] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-07 13:30               ` Ville Syrjälä
@ 2014-03-13  7:21                 ` sourab.gupta
  2014-03-13  9:01                 ` [PATCH v2] " sourab.gupta
  1 sibling, 0 replies; 67+ messages in thread
From: sourab.gupta @ 2014-03-13  7:21 UTC (permalink / raw)
  To: intel-gfx; +Cc: Sourab Gupta, Akash Goel, Daniel Vetter

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on VLV for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

Cc: Daniel Vetter <daniel@ffwl.ch>

Cc: Chris Wilson <chris@chris-wilson.co.uk>

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |    1 +
 drivers/gpu/drm/i915/i915_drv.h      |   13 ++++
 drivers/gpu/drm/i915/i915_irq.c      |    2 +
 drivers/gpu/drm/i915/intel_display.c |  125 ++++++++++++++++++++++++++++++++++
 4 files changed, 141 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e4d2b9f..d6ae334 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1566,6 +1566,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a0d90ef..af35197 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1436,6 +1436,12 @@ struct intel_pipe_crc {
 	wait_queue_head_t wq;
 };
 
+struct i915_flip_data {
+	struct drm_crtc *crtc;
+	u32 seqno;
+	u32 ring_id;
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1643,6 +1649,11 @@ typedef struct drm_i915_private {
 	struct i915_ums_state ums;
 
 	u32 suspend_count;
+
+	/* protects the flip_data */
+	spinlock_t flip_lock;
+
+	struct i915_flip_data	flip_data[I915_MAX_PIPES];
 } drm_i915_private_t;
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2681,6 +2692,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring);
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index be2713f..9b2007e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1062,6 +1062,8 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 static void notify_ring(struct drm_device *dev,
 			struct intel_ring_buffer *ring)
 {
+	intel_notify_mmio_flip(dev, ring);
+
 	if (ring->obj == NULL)
 		return;
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 2bccc68..8bd2f57 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8813,6 +8813,122 @@ err:
 	return ret;
 }
 
+static void intel_do_mmio_flip(struct drm_crtc *crtc)
+{
+	struct intel_crtc *intel_crtc;
+
+	intel_crtc = to_intel_crtc(crtc) ;
+
+	intel_mark_page_flip_active(intel_crtc);
+	i9xx_update_plane(crtc, crtc->fb, 0, 0);
+}
+
+static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+	if(!obj->ring)
+		return false;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
+			      obj->last_write_seqno))
+		return false;
+
+	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
+		ret = i915_add_request(obj->ring, NULL);
+		if(WARN_ON(ret))
+			return false;
+	}
+
+	if(WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return false;
+
+	return true;
+}
+
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	struct i915_flip_data *flip_data;
+	unsigned long irq_flags;
+	u32 seqno, count;
+
+	BUG_ON(!ring);
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+
+	for(count=0;count<I915_MAX_PIPES;count++) {
+		flip_data =  &(dev_priv->flip_data[count]);
+		intel_crtc = to_intel_crtc(flip_data->crtc);
+		if ((flip_data->seqno != 0) &&
+				(ring->id == flip_data->ring_id) &&
+				( seqno >= flip_data->seqno ) ) {
+			/*FIXME: Can move do_mmio_flip out of spinlock protection */
+			intel_do_mmio_flip(flip_data->crtc);
+			flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+}
+
+/* Using MMIO based flips starting from VLV, for Media power well
+ * residency optimization. The other alternative of having Render
+ * ring based flip calls is not being used, as the performance
+ * (FPS) of certain 3D Apps was getting severly affected.
+ */
+static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct i915_flip_data *flip_data = &(dev_priv->flip_data[intel_crtc->pipe]);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	switch (intel_crtc->plane) {
+	case PLANE_A:
+	case PLANE_B:
+	case PLANE_C:
+	break;
+	default:
+		WARN_ONCE(1, "unknown plane in flip command\n");
+		ret = -ENODEV;
+		goto err_unpin;
+	}
+
+	if(!intel_postpone_flip(obj)) {
+		intel_do_mmio_flip(crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+	flip_data->seqno = obj->last_write_seqno;
+	flip_data->ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * flip data was ready
+	 */
+	intel_notify_mmio_flip(dev, obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -10581,6 +10697,12 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
 	dev_priv->pipe_to_crtc_mapping[intel_crtc->pipe] = &intel_crtc->base;
 
 	drm_crtc_helper_add(&intel_crtc->base, &intel_helper_funcs);
+
+	if (IS_VALLEYVIEW(dev)) {
+			dev_priv->flip_data[pipe].crtc =
+				dev_priv->pipe_to_crtc_mapping[pipe];
+			dev_priv->flip_data[pipe].seqno = 0;
+	}
 }
 
 enum pipe intel_get_pipe_from_connector(struct intel_connector *connector)
@@ -11103,6 +11225,9 @@ static void intel_init_display(struct drm_device *dev)
 		dev_priv->display.queue_flip = intel_gen7_queue_flip;
 		break;
 	}
+	if (IS_VALLEYVIEW(dev)) {
+		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
+	}
 
 	intel_panel_init_backlight_funcs(dev);
 }
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 67+ messages in thread
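
The patch above checks the target seqno in two different ways: intel_postpone_flip() goes through i915_seqno_passed(), while intel_notify_mmio_flip() uses a plain `seqno >= flip_data->seqno`. The two only disagree when the 32-bit counter wraps. The stand-alone sketch below illustrates that difference; `seqno_passed()` is written from memory of the usual signed-difference idiom and `seqno_passed_naive()` is a hypothetical name for the plain comparison, so treat both as illustrations rather than as the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe check: true if seq1 is at or after seq2 when the
 * 32-bit counter is treated as modular (signed-difference idiom).
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Plain comparison: gives the wrong answer once the counter has wrapped. */
static bool seqno_passed_naive(uint32_t seq1, uint32_t seq2)
{
	return seq1 >= seq2;
}
```

With a target of 0xfffffff0 and a current value of 5 (that is, 21 increments later, after the wrap), the signed-difference form reports the target as reached while the plain comparison does not, which in the flip path would leave the flip pending until the counter caught up again.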

* [PATCH v2] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-07 13:30               ` Ville Syrjälä
  2014-03-13  7:21                 ` [PATCH] drm/i915: Replaced Blitter ring based flips with MMIO flips " sourab.gupta
@ 2014-03-13  9:01                 ` sourab.gupta
  2014-03-17  4:33                   ` Gupta, Sourab
                                     ` (2 more replies)
  1 sibling, 3 replies; 67+ messages in thread
From: sourab.gupta @ 2014-03-13  9:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: Sourab Gupta, Akash Goel, Daniel Vetter

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on VLV for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

Cc: Daniel Vetter <daniel@ffwl.ch>

Cc: Chris Wilson <chris@chris-wilson.co.uk>

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |  13 ++++
 drivers/gpu/drm/i915/i915_irq.c      |   2 +
 drivers/gpu/drm/i915/intel_display.c | 125 +++++++++++++++++++++++++++++++++++
 4 files changed, 141 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e4d2b9f..d6ae334 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1566,6 +1566,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a0d90ef..af35197 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1436,6 +1436,12 @@ struct intel_pipe_crc {
 	wait_queue_head_t wq;
 };
 
+struct i915_flip_data {
+	struct drm_crtc *crtc;
+	u32 seqno;
+	u32 ring_id;
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1643,6 +1649,11 @@ typedef struct drm_i915_private {
 	struct i915_ums_state ums;
 
 	u32 suspend_count;
+
+	/* protects the flip_data */
+	spinlock_t flip_lock;
+
+	struct i915_flip_data	flip_data[I915_MAX_PIPES];
 } drm_i915_private_t;
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2681,6 +2692,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring);
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index be2713f..9b2007e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1062,6 +1062,8 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 static void notify_ring(struct drm_device *dev,
 			struct intel_ring_buffer *ring)
 {
+	intel_notify_mmio_flip(dev, ring);
+
 	if (ring->obj == NULL)
 		return;
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 2bccc68..8bd2f57 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8813,6 +8813,122 @@ err:
 	return ret;
 }
 
+static void intel_do_mmio_flip(struct drm_crtc *crtc)
+{
+	struct intel_crtc *intel_crtc;
+
+	intel_crtc = to_intel_crtc(crtc) ;
+
+	intel_mark_page_flip_active(intel_crtc);
+	i9xx_update_plane(crtc, crtc->fb, 0, 0);
+}
+
+static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+	if(!obj->ring)
+		return false;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
+			      obj->last_write_seqno))
+		return false;
+
+	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
+		ret = i915_add_request(obj->ring, NULL);
+		if(WARN_ON(ret))
+			return false;
+	}
+
+	if(WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return false;
+
+	return true;
+}
+
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	struct i915_flip_data *flip_data;
+	unsigned long irq_flags;
+	u32 seqno, count;
+
+	BUG_ON(!ring);
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+
+	for(count=0;count<I915_MAX_PIPES;count++) {
+		flip_data =  &(dev_priv->flip_data[count]);
+		intel_crtc = to_intel_crtc(flip_data->crtc);
+		if ((flip_data->seqno != 0) &&
+				(ring->id == flip_data->ring_id) &&
+				( seqno >= flip_data->seqno ) ) {
+			/*FIXME: Can move do_mmio_flip out of spinlock protection */
+			intel_do_mmio_flip(flip_data->crtc);
+			flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+}
+
+/* Using MMIO based flips starting from VLV, for Media power well
+ * residency optimization. The other alternative of having Render
+ * ring based flip calls is not being used, as the performance
+ * (FPS) of certain 3D Apps was getting severly affected.
+ */
+static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct i915_flip_data *flip_data = &(dev_priv->flip_data[intel_crtc->pipe]);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	switch (intel_crtc->plane) {
+	case PLANE_A:
+	case PLANE_B:
+	case PLANE_C:
+	break;
+	default:
+		WARN_ONCE(1, "unknown plane in flip command\n");
+		ret = -ENODEV;
+		goto err_unpin;
+	}
+
+	if(!intel_postpone_flip(obj)) {
+		intel_do_mmio_flip(crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+	flip_data->seqno = obj->last_write_seqno;
+	flip_data->ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * flip data was ready
+	 */
+	intel_notify_mmio_flip(dev, obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -10581,6 +10697,12 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
 	dev_priv->pipe_to_crtc_mapping[intel_crtc->pipe] = &intel_crtc->base;
 
 	drm_crtc_helper_add(&intel_crtc->base, &intel_helper_funcs);
+
+	if (IS_VALLEYVIEW(dev)) {
+			dev_priv->flip_data[pipe].crtc =
+				dev_priv->pipe_to_crtc_mapping[pipe];
+			dev_priv->flip_data[pipe].seqno = 0;
+	}
 }
 
 enum pipe intel_get_pipe_from_connector(struct intel_connector *connector)
@@ -11103,6 +11225,9 @@ static void intel_init_display(struct drm_device *dev)
 		dev_priv->display.queue_flip = intel_gen7_queue_flip;
 		break;
 	}
+	if (IS_VALLEYVIEW(dev)) {
+		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
+	}
 
 	intel_panel_init_backlight_funcs(dev);
 }
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 67+ messages in thread
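
The "Double check" comment in intel_gen7_queue_mmio_flip() above refers to a check-then-arm race: the ring can reach the target seqno, and its interrupt can fire, after the initial check but before the flip data is recorded. The single-threaded sketch below simulates that window. All names (`struct flip_state`, `notify()`, `queue_flip()` and the `irq_in_window` and `recheck` switches) are hypothetical and exist only to make the interleaving reproducible; there is no real interrupt or locking here.

```c
#include <stdbool.h>
#include <stdint.h>

struct flip_state {
	uint32_t hw_seqno;      /* last seqno the simulated ring completed */
	uint32_t pending_seqno; /* 0 when no flip is waiting */
	int flips_done;         /* stands in for the MMIO flip itself */
};

static bool seqno_passed(uint32_t cur, uint32_t target)
{
	return (int32_t)(cur - target) >= 0;
}

/* Interrupt side: issue the pending flip if its target was reached. */
static void notify(struct flip_state *s)
{
	if (s->pending_seqno != 0 &&
	    seqno_passed(s->hw_seqno, s->pending_seqno)) {
		s->flips_done++;
		s->pending_seqno = 0;
	}
}

/*
 * Queue side. irq_in_window simulates the ring completing `target`
 * (and its interrupt firing) after the initial check but before the
 * flip is recorded. recheck enables the final notify() call.
 */
static void queue_flip(struct flip_state *s, uint32_t target,
		       bool irq_in_window, bool recheck)
{
	if (seqno_passed(s->hw_seqno, target)) {
		s->flips_done++; /* rendering already finished: flip now */
		return;
	}

	if (irq_in_window) {
		s->hw_seqno = target;
		notify(s); /* nothing is recorded yet, so this does nothing */
	}

	s->pending_seqno = target; /* arm the flip */

	if (recheck)
		notify(s); /* catches the interrupt that fired too early */
}
```

Without the final recheck, the simulated early interrupt leaves the flip armed with no further interrupt guaranteed to arrive; with it, the flip is issued before queue_flip() returns.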

* Re: [PATCH v2] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-13  9:01                 ` [PATCH v2] " sourab.gupta
@ 2014-03-17  4:33                   ` Gupta, Sourab
  2014-03-21 17:10                   ` Gupta, Sourab
  2014-03-21 18:15                   ` Damien Lespiau
  2 siblings, 0 replies; 67+ messages in thread
From: Gupta, Sourab @ 2014-03-17  4:33 UTC (permalink / raw)
  To: Syrjala, Ville; +Cc: Goel, Akash, intel-gfx, Daniel Vetter

Hi Ville,

Can you please review the below patch.

Regards,
Sourab

-----Original Message-----
From: Gupta, Sourab 
Sent: Thursday, March 13, 2014 2:32 PM
To: intel-gfx@lists.freedesktop.org
Cc: Gupta, Sourab; Ville Syrjälä; Daniel Vetter; Chris Wilson; Goel, Akash
Subject: [PATCH v2] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on VLV for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based flip calls. For pure 3D workloads, with MMIO flips, there will be no use of blitter ring and this will ensure the 100% residency for Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the flips when target seqno is reached. (Incorporating Ville's idea)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

Cc: Daniel Vetter <daniel@ffwl.ch>

Cc: Chris Wilson <chris@chris-wilson.co.uk>

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |  13 ++++
 drivers/gpu/drm/i915/i915_irq.c      |   2 +
 drivers/gpu/drm/i915/intel_display.c | 125 +++++++++++++++++++++++++++++++++++
 4 files changed, 141 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e4d2b9f..d6ae334 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1566,6 +1566,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a0d90ef..af35197 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1436,6 +1436,12 @@ struct intel_pipe_crc {
 	wait_queue_head_t wq;
 };
 
+struct i915_flip_data {
+	struct drm_crtc *crtc;
+	u32 seqno;
+	u32 ring_id;
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1643,6 +1649,11 @@ typedef struct drm_i915_private {
 	struct i915_ums_state ums;
 
 	u32 suspend_count;
+
+	/* protects the flip_data */
+	spinlock_t flip_lock;
+
+	struct i915_flip_data	flip_data[I915_MAX_PIPES];
 } drm_i915_private_t;
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2681,6 +2692,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring);
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index be2713f..9b2007e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1062,6 +1062,8 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 static void notify_ring(struct drm_device *dev,
 			struct intel_ring_buffer *ring)
 {
+	intel_notify_mmio_flip(dev, ring);
+
 	if (ring->obj == NULL)
 		return;
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 2bccc68..8bd2f57 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8813,6 +8813,122 @@ err:
 	return ret;
 }
 
+static void intel_do_mmio_flip(struct drm_crtc *crtc)
+{
+	struct intel_crtc *intel_crtc;
+
+	intel_crtc = to_intel_crtc(crtc) ;
+
+	intel_mark_page_flip_active(intel_crtc);
+	i9xx_update_plane(crtc, crtc->fb, 0, 0);
+}
+
+static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+	if(!obj->ring)
+		return false;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
+			      obj->last_write_seqno))
+		return false;
+
+	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
+		ret = i915_add_request(obj->ring, NULL);
+		if(WARN_ON(ret))
+			return false;
+	}
+
+	if(WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return false;
+
+	return true;
+}
+
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	struct i915_flip_data *flip_data;
+	unsigned long irq_flags;
+	u32 seqno, count;
+
+	BUG_ON(!ring);
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+
+	for(count=0;count<I915_MAX_PIPES;count++) {
+		flip_data =  &(dev_priv->flip_data[count]);
+		intel_crtc = to_intel_crtc(flip_data->crtc);
+		if ((flip_data->seqno != 0) &&
+				(ring->id == flip_data->ring_id) &&
+				( seqno >= flip_data->seqno ) ) {
+			/*FIXME: Can move do_mmio_flip out of spinlock protection */
+			intel_do_mmio_flip(flip_data->crtc);
+			flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+}
+
+/* Using MMIO based flips starting from VLV, for Media power well
+ * residency optimization. The other alternative of having Render
+ * ring based flip calls is not being used, as the performance
+ * (FPS) of certain 3D Apps was getting severly affected.
+ */
+static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct i915_flip_data *flip_data = &(dev_priv->flip_data[intel_crtc->pipe]);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	switch (intel_crtc->plane) {
+	case PLANE_A:
+	case PLANE_B:
+	case PLANE_C:
+	break;
+	default:
+		WARN_ONCE(1, "unknown plane in flip command\n");
+		ret = -ENODEV;
+		goto err_unpin;
+	}
+
+	if(!intel_postpone_flip(obj)) {
+		intel_do_mmio_flip(crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+	flip_data->seqno = obj->last_write_seqno;
+	flip_data->ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * flip data was ready
+	 */
+	intel_notify_mmio_flip(dev, obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -10581,6 +10697,12 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
 	dev_priv->pipe_to_crtc_mapping[intel_crtc->pipe] = &intel_crtc->base;
 
 	drm_crtc_helper_add(&intel_crtc->base, &intel_helper_funcs);
+
+	if (IS_VALLEYVIEW(dev)) {
+			dev_priv->flip_data[pipe].crtc =
+				dev_priv->pipe_to_crtc_mapping[pipe];
+			dev_priv->flip_data[pipe].seqno = 0;
+	}
 }
 
 enum pipe intel_get_pipe_from_connector(struct intel_connector *connector)
@@ -11103,6 +11225,9 @@ static void intel_init_display(struct drm_device *dev)
 		dev_priv->display.queue_flip = intel_gen7_queue_flip;
 		break;
 	}
+	if (IS_VALLEYVIEW(dev)) {
+		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
+	}
 
 	intel_panel_init_backlight_funcs(dev);
 }
--
1.8.5.1

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v2] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-13  9:01                 ` [PATCH v2] " sourab.gupta
  2014-03-17  4:33                   ` Gupta, Sourab
@ 2014-03-21 17:10                   ` Gupta, Sourab
  2014-03-21 18:15                   ` Damien Lespiau
  2 siblings, 0 replies; 67+ messages in thread
From: Gupta, Sourab @ 2014-03-21 17:10 UTC (permalink / raw)
  To: Syrjala, Ville; +Cc: Goel, Akash, intel-gfx, Daniel Vetter

Hi Ville,
Could you please provide your feedback on this patch?
I am waiting for your response.

Regards,
Sourab

-----Original Message-----
From: Gupta, Sourab 
Sent: Monday, March 17, 2014 10:04 AM
To: Syrjala, Ville
Cc: Daniel Vetter; Chris Wilson; Goel, Akash; intel-gfx@lists.freedesktop.org
Subject: RE: [PATCH v2] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV

Hi Ville,

Can you please review the below patch.

Regards,
Sourab

-----Original Message-----
From: Gupta, Sourab 
Sent: Thursday, March 13, 2014 2:32 PM
To: intel-gfx@lists.freedesktop.org
Cc: Gupta, Sourab; Ville Syrjälä; Daniel Vetter; Chris Wilson; Goel, Akash
Subject: [PATCH v2] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on VLV for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based flip calls. For pure 3D workloads, with MMIO flips, there will be no use of blitter ring and this will ensure the 100% residency for Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the flips when target seqno is reached. (Incorporating Ville's idea)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>

Cc: Daniel Vetter <daniel@ffwl.ch>

Cc: Chris Wilson <chris@chris-wilson.co.uk>

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |  13 ++++
 drivers/gpu/drm/i915/i915_irq.c      |   2 +
 drivers/gpu/drm/i915/intel_display.c | 125 +++++++++++++++++++++++++++++++++++
 4 files changed, 141 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e4d2b9f..d6ae334 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1566,6 +1566,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a0d90ef..af35197 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1436,6 +1436,12 @@ struct intel_pipe_crc {
 	wait_queue_head_t wq;
 };
 
+struct i915_flip_data {
+	struct drm_crtc *crtc;
+	u32 seqno;
+	u32 ring_id;
+};
+
 typedef struct drm_i915_private {
 	struct drm_device *dev;
 	struct kmem_cache *slab;
@@ -1643,6 +1649,11 @@ typedef struct drm_i915_private {
 	struct i915_ums_state ums;
 
 	u32 suspend_count;
+
+	/* protects the flip_data */
+	spinlock_t flip_lock;
+
+	struct i915_flip_data	flip_data[I915_MAX_PIPES];
 } drm_i915_private_t;
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2681,6 +2692,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring);
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index be2713f..9b2007e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1062,6 +1062,8 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 static void notify_ring(struct drm_device *dev,
 			struct intel_ring_buffer *ring)
 {
+	intel_notify_mmio_flip(dev, ring);
+
 	if (ring->obj == NULL)
 		return;
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 2bccc68..8bd2f57 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8813,6 +8813,122 @@ err:
 	return ret;
 }
 
+static void intel_do_mmio_flip(struct drm_crtc *crtc)
+{
+	struct intel_crtc *intel_crtc;
+
+	intel_crtc = to_intel_crtc(crtc) ;
+
+	intel_mark_page_flip_active(intel_crtc);
+	i9xx_update_plane(crtc, crtc->fb, 0, 0);
+}
+
+static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+	if(!obj->ring)
+		return false;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
+			      obj->last_write_seqno))
+		return false;
+
+	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
+		ret = i915_add_request(obj->ring, NULL);
+		if(WARN_ON(ret))
+			return false;
+	}
+
+	if(WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return false;
+
+	return true;
+}
+
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	struct i915_flip_data *flip_data;
+	unsigned long irq_flags;
+	u32 seqno, count;
+
+	BUG_ON(!ring);
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+
+	for(count=0;count<I915_MAX_PIPES;count++) {
+		flip_data =  &(dev_priv->flip_data[count]);
+		intel_crtc = to_intel_crtc(flip_data->crtc);
+		if ((flip_data->seqno != 0) &&
+				(ring->id == flip_data->ring_id) &&
+				( seqno >= flip_data->seqno ) ) {
+			/*FIXME: Can move do_mmio_flip out of spinlock protection */
+			intel_do_mmio_flip(flip_data->crtc);
+			flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+}
+
+/* Using MMIO based flips starting from VLV, for Media power well
+ * residency optimization. The other alternative of having Render
+ * ring based flip calls is not being used, as the performance
+ * (FPS) of certain 3D Apps was getting severly affected.
+ */
+static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct i915_flip_data *flip_data = &(dev_priv->flip_data[intel_crtc->pipe]);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	switch (intel_crtc->plane) {
+	case PLANE_A:
+	case PLANE_B:
+	case PLANE_C:
+	break;
+	default:
+		WARN_ONCE(1, "unknown plane in flip command\n");
+		ret = -ENODEV;
+		goto err_unpin;
+	}
+
+	if(!intel_postpone_flip(obj)) {
+		intel_do_mmio_flip(crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
+	flip_data->seqno = obj->last_write_seqno;
+	flip_data->ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * flip data was ready
+	 */
+	intel_notify_mmio_flip(dev, obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -10581,6 +10697,12 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
 	dev_priv->pipe_to_crtc_mapping[intel_crtc->pipe] = &intel_crtc->base;
 
 	drm_crtc_helper_add(&intel_crtc->base, &intel_helper_funcs);
+
+	if (IS_VALLEYVIEW(dev)) {
+			dev_priv->flip_data[pipe].crtc =
+				dev_priv->pipe_to_crtc_mapping[pipe];
+			dev_priv->flip_data[pipe].seqno = 0;
+	}
 }
 
 enum pipe intel_get_pipe_from_connector(struct intel_connector *connector)
@@ -11103,6 +11225,9 @@ static void intel_init_display(struct drm_device *dev)
 		dev_priv->display.queue_flip = intel_gen7_queue_flip;
 		break;
 	}
+	if (IS_VALLEYVIEW(dev)) {
+		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
+	}
 
 	intel_panel_init_backlight_funcs(dev);
 }
--
1.8.5.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v2] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-13  9:01                 ` [PATCH v2] " sourab.gupta
  2014-03-17  4:33                   ` Gupta, Sourab
  2014-03-21 17:10                   ` Gupta, Sourab
@ 2014-03-21 18:15                   ` Damien Lespiau
  2014-03-23  9:01                     ` [PATCH v3] " sourab.gupta
  2 siblings, 1 reply; 67+ messages in thread
From: Damien Lespiau @ 2014-03-21 18:15 UTC (permalink / raw)
  To: sourab.gupta; +Cc: intel-gfx, ville.syrjala, Akash Goel, Daniel Vetter

On Thu, Mar 13, 2014 at 02:31:37PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Using MMIO based flips on VLV for Media power well residency optimization.
> The blitter ring is currently being used just for command streamer based
> flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> of blitter ring and this will ensure the 100% residency for Media well.
> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Cc: Daniel Vetter <daniel@ffwl.ch>
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> 
> Signed-off-by: Akash Goel <akash.goel@intel.com>

A light pass that doesn't actually look much at the correctness.
Submitting patches with obvious trivial issues creates reviewing
overhead. scripts/checkpatch.pl can catch a lot of small issues.

Also note that someone from your team can review patches too; it's an
interesting exercise and would help move the work forward instead of
waiting.

> ---
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |  13 ++++
>  drivers/gpu/drm/i915/i915_irq.c      |   2 +
>  drivers/gpu/drm/i915/intel_display.c | 125 +++++++++++++++++++++++++++++++++++
>  4 files changed, 141 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index e4d2b9f..d6ae334 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1566,6 +1566,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	spin_lock_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
>  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> +	spin_lock_init(&dev_priv->flip_lock);

Let's try to be more descriptive: mmio_flip_lock. One should be able to
understand what a variable does from its name. This is not a generic
"flip lock" at this point.

>  	mutex_init(&dev_priv->dpio_lock);
>  	mutex_init(&dev_priv->modeset_restore_lock);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a0d90ef..af35197 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1436,6 +1436,12 @@ struct intel_pipe_crc {
>  	wait_queue_head_t wq;
>  };
>  
> +struct i915_flip_data {
> +	struct drm_crtc *crtc;
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  typedef struct drm_i915_private {
>  	struct drm_device *dev;
>  	struct kmem_cache *slab;
> @@ -1643,6 +1649,11 @@ typedef struct drm_i915_private {
>  	struct i915_ums_state ums;
>  
>  	u32 suspend_count;
> +
> +	/* protects the flip_data */
> +	spinlock_t flip_lock;
> +
> +	struct i915_flip_data	flip_data[I915_MAX_PIPES];

If we need one of those per-pipe, why not put that structure on the
CRTC? Writing this is a good hint this data belongs to the CRTC object.

We try to reserve the i915 prefix for GT stuff. It's also related to
mmio flip, so how about calling it intel_mmio_flip?
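
To illustrate, here is a rough userspace sketch of the state living on
the CRTC. Only the intel_mmio_flip name comes from the suggestion above;
the mmio_flip field, queue_mmio_flip() and the cut-down structs are made
up for the example and are not the driver's definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for the DRM structs (illustration only). */
struct drm_crtc { int id; };

struct intel_mmio_flip {
	uint32_t seqno;		/* 0 means "no flip pending" */
	uint32_t ring_id;
};

struct intel_crtc {
	struct drm_crtc base;
	int pipe;
	/* Per-CRTC state: no I915_MAX_PIPES array in dev_priv needed. */
	struct intel_mmio_flip mmio_flip;
};

/* container_of-style helper, open-coded so the sketch stands alone. */
#define to_intel_crtc(x) \
	((struct intel_crtc *)((char *)(x) - offsetof(struct intel_crtc, base)))

/* Hypothetical helper: record a pending flip on the CRTC itself. */
static void queue_mmio_flip(struct drm_crtc *crtc, uint32_t seqno,
			    uint32_t ring_id)
{
	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);

	assert(seqno != 0);	/* 0 is reserved for "nothing pending" */
	intel_crtc->mmio_flip.seqno = seqno;
	intel_crtc->mmio_flip.ring_id = ring_id;
}
```

With this layout the crtc back-pointer in the flip data goes away as
well, since the data is reached from the CRTC in the first place.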

>  } drm_i915_private_t;
>  
>  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> @@ -2681,6 +2692,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file);
>  
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring);
>  /* overlay */
>  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
>  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index be2713f..9b2007e 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1062,6 +1062,8 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
>  static void notify_ring(struct drm_device *dev,
>  			struct intel_ring_buffer *ring)
>  {
> +	intel_notify_mmio_flip(dev, ring);
> +
>  	if (ring->obj == NULL)
>  		return;
  
It looks like the wrong place to put it. It should be after that check,
and also after the trace point so one tracing the events doesn't see the
mmio flip complete before the request_complete event.

> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 2bccc68..8bd2f57 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -8813,6 +8813,122 @@ err:
>  	return ret;
>  }
>  
> +static void intel_do_mmio_flip(struct drm_crtc *crtc)
> +{
> +	struct intel_crtc *intel_crtc;
> +
> +	intel_crtc = to_intel_crtc(crtc) ;
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +	i9xx_update_plane(crtc, crtc->fb, 0, 0);

This function has changed name. Could you please rebase your patch
against -nightly?

Also let's try to be more generic here by calling the
->update_primary_plane() vfunc.
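
Roughly the shape I mean, as a standalone sketch rather than driver
code. The display_funcs struct, model_update_plane() and last_flipped_fb
are invented for the example; only the update_primary_plane hook name is
taken from the comment above, and its real signature may differ:

```c
#include <assert.h>
#include <stddef.h>

struct drm_framebuffer { int id; };
struct drm_crtc { struct drm_framebuffer *fb; };

/* Hypothetical cut-down vtable with a single per-platform hook. */
struct display_funcs {
	int (*update_primary_plane)(struct drm_crtc *crtc,
				    struct drm_framebuffer *fb,
				    int x, int y);
};

static int last_flipped_fb = -1;

/* Stand-in for a platform implementation such as the i9xx one. */
static int model_update_plane(struct drm_crtc *crtc,
			      struct drm_framebuffer *fb, int x, int y)
{
	(void)crtc;
	(void)x;
	(void)y;
	if (!fb)
		return -1;
	last_flipped_fb = fb->id;
	return 0;
}

/* The flip path only knows about the hook, not the platform function. */
static int do_mmio_flip(const struct display_funcs *display,
			struct drm_crtc *crtc)
{
	assert(display->update_primary_plane);
	return display->update_primary_plane(crtc, crtc->fb, 0, 0);
}
```

Going through the hook also lets the flip path propagate the return
value instead of dropping it.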

> +}
> +
> +static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	int ret;

Add a blank line between the variable declarations and the code. You have
various whitespace issues you need to fix (use ./scripts/checkpatch.pl to
get the list).

> +	if(!obj->ring)
> +		return false;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
> +			      obj->last_write_seqno))
> +		return false;
> +
> +	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
> +		ret = i915_add_request(obj->ring, NULL);
> +		if(WARN_ON(ret))
> +			return false;
> +	}
> +
> +	if(WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return false;
> +
> +	return true;
> +}
> +
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc;
> +	struct i915_flip_data *flip_data;
> +	unsigned long irq_flags;
> +	u32 seqno, count;
> +
> +	BUG_ON(!ring);
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
> +
> +	for(count=0;count<I915_MAX_PIPES;count++) {

Use for_each_pipe(). I915_MAX_PIPES is the maximum number of pipes across
all platforms, which is not what you want when looping through the pipes
that actually exist on the device.
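
A small userspace model of the difference, in case it helps. The
model_device struct and count_pending_flips() are hypothetical, and the
macro here takes an explicit dev argument as a simplification, so treat
it as a sketch of the idea and not the driver's macro:

```c
#include <assert.h>

#define I915_MAX_PIPES 3	/* upper bound across all platforms (model value) */

struct model_device {
	int num_pipes;		/* pipes this device really has */
	unsigned int flip_seqno[I915_MAX_PIPES];
};

/* Iterate only over the pipes present on this device. */
#define for_each_pipe(dev, p) \
	for ((p) = 0; (p) < (dev)->num_pipes; (p)++)

static int count_pending_flips(const struct model_device *dev)
{
	int pipe, pending = 0;

	assert(dev->num_pipes <= I915_MAX_PIPES);
	for_each_pipe(dev, pipe)
		if (dev->flip_seqno[pipe] != 0)
			pending++;
	return pending;
}
```

On a device with two pipes the third slot is never looked at, which is
the behaviour you want here; looping up to I915_MAX_PIPES would touch a
slot that no CRTC ever initialised.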

> +		flip_data =  &(dev_priv->flip_data[count]);
> +		intel_crtc = to_intel_crtc(flip_data->crtc);
> +		if ((flip_data->seqno != 0) &&
> +				(ring->id == flip_data->ring_id) &&
> +				( seqno >= flip_data->seqno ) ) {
> +			/*FIXME: Can move do_mmio_flip out of spinlock protection */

There's a FIXME here, what is its status?
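
If the answer is "yes, it can move", one possible shape is to claim the
pending flips under the lock and issue them after dropping it. The
following is only a userspace model under my own assumptions: the lock
is a counter, do_flip() stands in for the plane update, every model_*
name is invented, and the wrap-safe seqno_passed() follows the idea of
i915_seqno_passed() instead of the plain >= in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PIPES 3

struct model_flip { uint32_t seqno; uint32_t ring_id; };

static struct model_flip flips[MAX_PIPES];
static int lock_depth;		/* stand-in for the spinlock */
static int flips_done;
static int flips_done_locked;	/* flips issued while the lock was held */

static void model_lock(void) { lock_depth++; }
static void model_unlock(void) { assert(lock_depth > 0); lock_depth--; }

/* Wrap-safe "seq1 is at or after seq2" comparison. */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

static void do_flip(int pipe)
{
	(void)pipe;
	flips_done++;
	if (lock_depth)
		flips_done_locked++;
}

static void notify_mmio_flip(uint32_t ring_id, uint32_t seqno)
{
	bool ready[MAX_PIPES] = { false };
	int pipe;

	model_lock();
	for (pipe = 0; pipe < MAX_PIPES; pipe++) {
		struct model_flip *f = &flips[pipe];

		if (f->seqno != 0 && f->ring_id == ring_id &&
		    seqno_passed(seqno, f->seqno)) {
			f->seqno = 0;	/* claim it while holding the lock */
			ready[pipe] = true;
		}
	}
	model_unlock();

	/* The actual flips happen without the lock held. */
	for (pipe = 0; pipe < MAX_PIPES; pipe++)
		if (ready[pipe])
			do_flip(pipe);
}
```

Clearing seqno under the lock is what keeps a second notification from
issuing the same flip twice. The real code would still have to pair each
claimed flip with its irq_put(), which the model leaves out.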

> +			intel_do_mmio_flip(flip_data->crtc);
> +			flip_data->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
> +}
> +
> +/* Using MMIO based flips starting from VLV, for Media power well
> + * residency optimization. The other alternative of having Render
> + * ring based flip calls is not being used, as the performance
> + * (FPS) of certain 3D Apps was getting severly affected.
> + */
> +static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc,
> +			struct drm_framebuffer *fb,
> +			struct drm_i915_gem_object *obj,
> +			uint32_t flags)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	struct i915_flip_data *flip_data = &(dev_priv->flip_data[intel_crtc->pipe]);

No need for the pair of () here (and elsewhere in the patch).

> +	unsigned long irq_flags;
> +	int ret;
> +
> +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> +	if (ret)
> +		goto err;
> +
> +	switch (intel_crtc->plane) {
> +	case PLANE_A:
> +	case PLANE_B:
> +	case PLANE_C:
> +	break;
> +	default:
> +		WARN_ONCE(1, "unknown plane in flip command\n");
> +		ret = -ENODEV;
> +		goto err_unpin;
> +	}
> +
> +	if(!intel_postpone_flip(obj)) {
> +		intel_do_mmio_flip(crtc);
> +		return 0;
> +	}
> +
> +	spin_lock_irqsave(&dev_priv->flip_lock, irq_flags);
> +	flip_data->seqno = obj->last_write_seqno;
> +	flip_data->ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->flip_lock, irq_flags);
> +
> +	/* Double check to catch cases where irq fired before
> +	 * flip data was ready
> +	 */
> +	intel_notify_mmio_flip(dev, obj->ring);
> +	return 0;
> +
> +err_unpin:
> +	intel_unpin_fb_obj(obj);
> +err:
> +	return ret;
> +}
> +
>  static int intel_gen7_queue_flip(struct drm_device *dev,
>  				 struct drm_crtc *crtc,
>  				 struct drm_framebuffer *fb,
> @@ -10581,6 +10697,12 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
>  	dev_priv->pipe_to_crtc_mapping[intel_crtc->pipe] = &intel_crtc->base;
>  
>  	drm_crtc_helper_add(&intel_crtc->base, &intel_helper_funcs);
> +
> +	if (IS_VALLEYVIEW(dev)) {
> +			dev_priv->flip_data[pipe].crtc =
> +				dev_priv->pipe_to_crtc_mapping[pipe];
> +			dev_priv->flip_data[pipe].seqno = 0;
> +	}
>  }
>  
>  enum pipe intel_get_pipe_from_connector(struct intel_connector *connector)
> @@ -11103,6 +11225,9 @@ static void intel_init_display(struct drm_device *dev)
>  		dev_priv->display.queue_flip = intel_gen7_queue_flip;
>  		break;
>  	}
> +	if (IS_VALLEYVIEW(dev)) {
> +		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
> +	}
>  
>  	intel_panel_init_backlight_funcs(dev);
>  }
> -- 
> 1.8.5.1
> 


* [PATCH v3] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-21 18:15                   ` Damien Lespiau
@ 2014-03-23  9:01                     ` sourab.gupta
  2014-03-26  7:49                       ` Gupta, Sourab
  2014-05-09 11:59                       ` Ville Syrjälä
  0 siblings, 2 replies; 67+ messages in thread
From: sourab.gupta @ 2014-03-23  9:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: Sourab Gupta, Akash Goel

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on VLV for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   7 ++
 drivers/gpu/drm/i915/i915_irq.c      |   2 +
 drivers/gpu/drm/i915/intel_display.c | 124 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   7 ++
 5 files changed, 141 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4e0a26a..bca3c5a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1569,6 +1569,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3f62be0..678d31d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1621,6 +1621,10 @@ typedef struct drm_i915_private {
 	struct i915_dri1_state dri1;
 	/* Old ums support infrastructure, same warning applies. */
 	struct i915_ums_state ums;
+
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
+
 } drm_i915_private_t;
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2657,6 +2661,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4b4aeb3..ad26abe 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1080,6 +1080,8 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	intel_notify_mmio_flip(dev, ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 7e4ea8d..19004bf 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -8782,6 +8782,125 @@ err:
 	return ret;
 }
 
+static int intel_do_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc;
+
+	intel_crtc = to_intel_crtc(crtc);
+
+	intel_mark_page_flip_active(intel_crtc);
+	return dev_priv->display.update_primary_plane(crtc, crtc->fb, 0, 0);
+}
+
+static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+	if (!obj->ring)
+		return false;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
+			      obj->last_write_seqno))
+		return false;
+
+	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
+		ret = i915_add_request(obj->ring, NULL);
+		if (WARN_ON(ret))
+			return false;
+	}
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return false;
+
+	return true;
+}
+
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_crtc *crtc;
+	struct intel_crtc *intel_crtc;
+	struct intel_mmio_flip *mmio_flip_data;
+	unsigned long irq_flags;
+	u32 seqno;
+	enum pipe pipe;
+
+	BUG_ON(!ring);
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_pipe(pipe) {
+		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
+		intel_crtc = to_intel_crtc(crtc);
+		mmio_flip_data = &intel_crtc->mmio_flip_data;
+		if ((mmio_flip_data->seqno != 0) &&
+				(ring->id == mmio_flip_data->ring_id) &&
+				(seqno >= mmio_flip_data->seqno)) {
+			intel_do_mmio_flip(dev, crtc);
+			mmio_flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+/* Using MMIO based flips starting from VLV, for Media power well
+ * residency optimization. The other alternative of having Render
+ * ring based flip calls is not being used, as the performance
+ * (FPS) of certain 3D Apps was getting severly affected.
+ */
+static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	switch (intel_crtc->plane) {
+	case PLANE_A:
+	case PLANE_B:
+	case PLANE_C:
+	break;
+	default:
+		WARN_ONCE(1, "unknown plane in flip command\n");
+		ret = -ENODEV;
+		goto err_unpin;
+	}
+
+	if (!intel_postpone_flip(obj)) {
+		ret = intel_do_mmio_flip(dev, crtc);
+		return ret;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(dev, obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -10577,6 +10696,8 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
 		DRM_DEBUG_KMS("swapping pipes & planes for FBC\n");
 		intel_crtc->plane = !pipe;
 	}
+	if (IS_VALLEYVIEW(dev))
+			intel_crtc->mmio_flip_data.seqno = 0;
 
 	BUG_ON(pipe >= ARRAY_SIZE(dev_priv->plane_to_crtc_mapping) ||
 	       dev_priv->plane_to_crtc_mapping[intel_crtc->plane] != NULL);
@@ -11107,6 +11228,9 @@ static void intel_init_display(struct drm_device *dev)
 	}
 
 	intel_panel_init_backlight_funcs(dev);
+
+	if (IS_VALLEYVIEW(dev))
+		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index fa99104..f0b26a1 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -344,6 +344,11 @@ struct intel_pipe_wm {
 	bool fbc_wm_enabled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -395,6 +400,8 @@ struct intel_crtc {
 		/* watermarks currently being used  */
 		struct intel_pipe_wm active;
 	} wm;
+
+	struct intel_mmio_flip	mmio_flip_data;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1


* Re: [PATCH v3] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-23  9:01                     ` [PATCH v3] " sourab.gupta
@ 2014-03-26  7:49                       ` Gupta, Sourab
  2014-04-03  8:40                         ` Gupta, Sourab
  2014-05-09 11:59                       ` Ville Syrjälä
  1 sibling, 1 reply; 67+ messages in thread
From: Gupta, Sourab @ 2014-03-26  7:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Goel, Akash

Hi Ville/Damien,
Can you please review the patch (v3) below for MMIO flips?
Thanks,
Sourab

On Sun, 2014-03-23 at 09:01 +0000, Gupta, Sourab wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Using MMIO based flips on VLV for Media power well residency optimization.
> The blitter ring is currently being used just for command streamer based
> flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> of blitter ring and this will ensure the 100% residency for Media well.
> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> v3: Rebasing on latest code. Code restructuring after incorporating
> Damien's comments
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
>  drivers/gpu/drm/i915/i915_irq.c      |   2 +
>  drivers/gpu/drm/i915/intel_display.c | 124 +++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h     |   7 ++
>  5 files changed, 141 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 4e0a26a..bca3c5a 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1569,6 +1569,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	spin_lock_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
>  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> +	spin_lock_init(&dev_priv->mmio_flip_lock);
>  	mutex_init(&dev_priv->dpio_lock);
>  	mutex_init(&dev_priv->modeset_restore_lock);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 3f62be0..678d31d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1621,6 +1621,10 @@ typedef struct drm_i915_private {
>  	struct i915_dri1_state dri1;
>  	/* Old ums support infrastructure, same warning applies. */
>  	struct i915_ums_state ums;
> +
> +	/* protects the mmio flip data */
> +	spinlock_t mmio_flip_lock;
> +
>  } drm_i915_private_t;
>  
>  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> @@ -2657,6 +2661,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file);
>  
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring);
> +
>  /* overlay */
>  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
>  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 4b4aeb3..ad26abe 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1080,6 +1080,8 @@ static void notify_ring(struct drm_device *dev,
>  
>  	trace_i915_gem_request_complete(ring);
>  
> +	intel_notify_mmio_flip(dev, ring);
> +
>  	wake_up_all(&ring->irq_queue);
>  	i915_queue_hangcheck(dev);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 7e4ea8d..19004bf 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -8782,6 +8782,125 @@ err:
>  	return ret;
>  }
>  
> +static int intel_do_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc;
> +
> +	intel_crtc = to_intel_crtc(crtc);
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +	return dev_priv->display.update_primary_plane(crtc, crtc->fb, 0, 0);
> +}
> +
> +static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	int ret;
> +	if (!obj->ring)
> +		return false;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
> +			      obj->last_write_seqno))
> +		return false;
> +
> +	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
> +		ret = i915_add_request(obj->ring, NULL);
> +		if (WARN_ON(ret))
> +			return false;
> +	}
> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return false;
> +
> +	return true;
> +}
> +
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_crtc *crtc;
> +	struct intel_crtc *intel_crtc;
> +	struct intel_mmio_flip *mmio_flip_data;
> +	unsigned long irq_flags;
> +	u32 seqno;
> +	enum pipe pipe;
> +
> +	BUG_ON(!ring);
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	for_each_pipe(pipe) {
> +		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> +		intel_crtc = to_intel_crtc(crtc);
> +		mmio_flip_data = &intel_crtc->mmio_flip_data;
> +		if ((mmio_flip_data->seqno != 0) &&
> +				(ring->id == mmio_flip_data->ring_id) &&
> +				(seqno >= mmio_flip_data->seqno)) {
> +			intel_do_mmio_flip(dev, crtc);
> +			mmio_flip_data->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +}
> +
> +/* Using MMIO based flips starting from VLV, for Media power well
> + * residency optimization. The other alternative of having Render
> + * ring based flip calls is not being used, as the performance
> + * (FPS) of certain 3D Apps was getting severly affected.
> + */
> +static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc,
> +			struct drm_framebuffer *fb,
> +			struct drm_i915_gem_object *obj,
> +			uint32_t flags)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned long irq_flags;
> +	int ret;
> +
> +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> +	if (ret)
> +		goto err;
> +
> +	switch (intel_crtc->plane) {
> +	case PLANE_A:
> +	case PLANE_B:
> +	case PLANE_C:
> +	break;
> +	default:
> +		WARN_ONCE(1, "unknown plane in flip command\n");
> +		ret = -ENODEV;
> +		goto err_unpin;
> +	}
> +
> +	if (!intel_postpone_flip(obj)) {
> +		ret = intel_do_mmio_flip(dev, crtc);
> +		return ret;
> +	}
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
> +	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +
> +	/* Double check to catch cases where irq fired before
> +	 * mmio flip data was ready
> +	 */
> +	intel_notify_mmio_flip(dev, obj->ring);
> +	return 0;
> +
> +err_unpin:
> +	intel_unpin_fb_obj(obj);
> +err:
> +	return ret;
> +}
> +
>  static int intel_gen7_queue_flip(struct drm_device *dev,
>  				 struct drm_crtc *crtc,
>  				 struct drm_framebuffer *fb,
> @@ -10577,6 +10696,8 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
>  		DRM_DEBUG_KMS("swapping pipes & planes for FBC\n");
>  		intel_crtc->plane = !pipe;
>  	}
> +	if (IS_VALLEYVIEW(dev))
> +			intel_crtc->mmio_flip_data.seqno = 0;
>  
>  	BUG_ON(pipe >= ARRAY_SIZE(dev_priv->plane_to_crtc_mapping) ||
>  	       dev_priv->plane_to_crtc_mapping[intel_crtc->plane] != NULL);
> @@ -11107,6 +11228,9 @@ static void intel_init_display(struct drm_device *dev)
>  	}
>  
>  	intel_panel_init_backlight_funcs(dev);
> +
> +	if (IS_VALLEYVIEW(dev))
> +		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
>  }
>  
>  /*
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index fa99104..f0b26a1 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -344,6 +344,11 @@ struct intel_pipe_wm {
>  	bool fbc_wm_enabled;
>  };
>  
> +struct intel_mmio_flip {
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  struct intel_crtc {
>  	struct drm_crtc base;
>  	enum pipe pipe;
> @@ -395,6 +400,8 @@ struct intel_crtc {
>  		/* watermarks currently being used  */
>  		struct intel_pipe_wm active;
>  	} wm;
> +
> +	struct intel_mmio_flip	mmio_flip_data;
>  };
>  
>  struct intel_plane_wm_parameters {

* Re: [PATCH v3] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-26  7:49                       ` Gupta, Sourab
@ 2014-04-03  8:40                         ` Gupta, Sourab
  2014-04-07 11:19                           ` Gupta, Sourab
  0 siblings, 1 reply; 67+ messages in thread
From: Gupta, Sourab @ 2014-04-03  8:40 UTC (permalink / raw)
  To: Lespiau, Damien, Ville Syrjala
  Cc: Goel, Akash, intel-gfx, Hiremath, Shashidhar

On Wed, 2014-03-26 at 13:20 +0530, sourab gupta wrote:
> Hi Ville/Damien,
> Can you please review the below patch(v3) for mmio flips.
> Thanks,
> Sourab
> 
> On Sun, 2014-03-23 at 09:01 +0000, Gupta, Sourab wrote:
> > From: Sourab Gupta <sourab.gupta@intel.com>
> > 
> > Using MMIO based flips on VLV for Media power well residency optimization.
> > The blitter ring is currently being used just for command streamer based
> > flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> > of blitter ring and this will ensure the 100% residency for Media well.
> > 
> > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > flips when target seqno is reached. (Incorporating Ville's idea)
> > 
> > v3: Rebasing on latest code. Code restructuring after incorporating
> > Damien's comments
> > 
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c      |   1 +
> >  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
> >  drivers/gpu/drm/i915/i915_irq.c      |   2 +
> >  drivers/gpu/drm/i915/intel_display.c | 124 +++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_drv.h     |   7 ++
> >  5 files changed, 141 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 4e0a26a..bca3c5a 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1569,6 +1569,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  	spin_lock_init(&dev_priv->backlight_lock);
> >  	spin_lock_init(&dev_priv->uncore.lock);
> >  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> > +	spin_lock_init(&dev_priv->mmio_flip_lock);
> >  	mutex_init(&dev_priv->dpio_lock);
> >  	mutex_init(&dev_priv->modeset_restore_lock);
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 3f62be0..678d31d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1621,6 +1621,10 @@ typedef struct drm_i915_private {
> >  	struct i915_dri1_state dri1;
> >  	/* Old ums support infrastructure, same warning applies. */
> >  	struct i915_ums_state ums;
> > +
> > +	/* protects the mmio flip data */
> > +	spinlock_t mmio_flip_lock;
> > +
> >  } drm_i915_private_t;
> >  
> >  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> > @@ -2657,6 +2661,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> >  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> >  			       struct drm_file *file);
> >  
> > +void intel_notify_mmio_flip(struct drm_device *dev,
> > +			struct intel_ring_buffer *ring);
> > +
> >  /* overlay */
> >  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> >  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 4b4aeb3..ad26abe 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1080,6 +1080,8 @@ static void notify_ring(struct drm_device *dev,
> >  
> >  	trace_i915_gem_request_complete(ring);
> >  
> > +	intel_notify_mmio_flip(dev, ring);
> > +
> >  	wake_up_all(&ring->irq_queue);
> >  	i915_queue_hangcheck(dev);
> >  }
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 7e4ea8d..19004bf 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -8782,6 +8782,125 @@ err:
> >  	return ret;
> >  }
> >  
> > +static int intel_do_mmio_flip(struct drm_device *dev,
> > +			struct drm_crtc *crtc)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_crtc *intel_crtc;
> > +
> > +	intel_crtc = to_intel_crtc(crtc);
> > +
> > +	intel_mark_page_flip_active(intel_crtc);
> > +	return dev_priv->display.update_primary_plane(crtc, crtc->fb, 0, 0);
> > +}
> > +
> > +static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
> > +{
> > +	int ret;
> > +	if (!obj->ring)
> > +		return false;
> > +
> > +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
> > +			      obj->last_write_seqno))
> > +		return false;
> > +
> > +	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
> > +		ret = i915_add_request(obj->ring, NULL);
> > +		if (WARN_ON(ret))
> > +			return false;
> > +	}
> > +
> > +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> > +		return false;
> > +
> > +	return true;
> > +}
> > +
> > +void intel_notify_mmio_flip(struct drm_device *dev,
> > +			struct intel_ring_buffer *ring)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_crtc *crtc;
> > +	struct intel_crtc *intel_crtc;
> > +	struct intel_mmio_flip *mmio_flip_data;
> > +	unsigned long irq_flags;
> > +	u32 seqno;
> > +	enum pipe pipe;
> > +
> > +	BUG_ON(!ring);
> > +
> > +	seqno = ring->get_seqno(ring, false);
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	for_each_pipe(pipe) {
> > +		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> > +		intel_crtc = to_intel_crtc(crtc);
> > +		mmio_flip_data = &intel_crtc->mmio_flip_data;
> > +		if ((mmio_flip_data->seqno != 0) &&
> > +				(ring->id == mmio_flip_data->ring_id) &&
> > +				(seqno >= mmio_flip_data->seqno)) {
> > +			intel_do_mmio_flip(dev, crtc);
> > +			mmio_flip_data->seqno = 0;
> > +			ring->irq_put(ring);
> > +		}
> > +	}
> > +
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +}
> > +
> > +/* Using MMIO based flips starting from VLV, for Media power well
> > + * residency optimization. The other alternative of having Render
> > + * ring based flip calls is not being used, as the performance
> > + * (FPS) of certain 3D Apps was getting severly affected.
> > + */
> > +static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
> > +			struct drm_crtc *crtc,
> > +			struct drm_framebuffer *fb,
> > +			struct drm_i915_gem_object *obj,
> > +			uint32_t flags)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> > +	unsigned long irq_flags;
> > +	int ret;
> > +
> > +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> > +	if (ret)
> > +		goto err;
> > +
> > +	switch (intel_crtc->plane) {
> > +	case PLANE_A:
> > +	case PLANE_B:
> > +	case PLANE_C:
> > +	break;
> > +	default:
> > +		WARN_ONCE(1, "unknown plane in flip command\n");
> > +		ret = -ENODEV;
> > +		goto err_unpin;
> > +	}
> > +
> > +	if (!intel_postpone_flip(obj)) {
> > +		ret = intel_do_mmio_flip(dev, crtc);
> > +		return ret;
> > +	}
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
> > +	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +
> > +	/* Double check to catch cases where irq fired before
> > +	 * mmio flip data was ready
> > +	 */
> > +	intel_notify_mmio_flip(dev, obj->ring);
> > +	return 0;
> > +
> > +err_unpin:
> > +	intel_unpin_fb_obj(obj);
> > +err:
> > +	return ret;
> > +}
> > +
> >  static int intel_gen7_queue_flip(struct drm_device *dev,
> >  				 struct drm_crtc *crtc,
> >  				 struct drm_framebuffer *fb,
> > @@ -10577,6 +10696,8 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
> >  		DRM_DEBUG_KMS("swapping pipes & planes for FBC\n");
> >  		intel_crtc->plane = !pipe;
> >  	}
> > +	if (IS_VALLEYVIEW(dev))
> > +			intel_crtc->mmio_flip_data.seqno = 0;
> >  
> >  	BUG_ON(pipe >= ARRAY_SIZE(dev_priv->plane_to_crtc_mapping) ||
> >  	       dev_priv->plane_to_crtc_mapping[intel_crtc->plane] != NULL);
> > @@ -11107,6 +11228,9 @@ static void intel_init_display(struct drm_device *dev)
> >  	}
> >  
> >  	intel_panel_init_backlight_funcs(dev);
> > +
> > +	if (IS_VALLEYVIEW(dev))
> > +		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
> >  }
> >  
> >  /*
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index fa99104..f0b26a1 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -344,6 +344,11 @@ struct intel_pipe_wm {
> >  	bool fbc_wm_enabled;
> >  };
> >  
> > +struct intel_mmio_flip {
> > +	u32 seqno;
> > +	u32 ring_id;
> > +};
> > +
> >  struct intel_crtc {
> >  	struct drm_crtc base;
> >  	enum pipe pipe;
> > @@ -395,6 +400,8 @@ struct intel_crtc {
> >  		/* watermarks currently being used  */
> >  		struct intel_pipe_wm active;
> >  	} wm;
> > +
> > +	struct intel_mmio_flip	mmio_flip_data;
> >  };
> >  
> >  struct intel_plane_wm_parameters {
> 

Hi Ville,
A gentle reminder to review the patch.

Thanks,
Sourab

* Re: [PATCH v3] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-04-03  8:40                         ` Gupta, Sourab
@ 2014-04-07 11:19                           ` Gupta, Sourab
  0 siblings, 0 replies; 67+ messages in thread
From: Gupta, Sourab @ 2014-04-07 11:19 UTC (permalink / raw)
  To: intel-gfx; +Cc: Goel, Akash, Hiremath, Shashidhar

On Thu, 2014-04-03 at 14:11 +0530, sourab gupta wrote:
> On Wed, 2014-03-26 at 13:20 +0530, sourab gupta wrote:
> > Hi Ville/Damien,
> > Can you please review the below patch(v3) for mmio flips.
> > Thanks,
> > Sourab
> > 
> > On Sun, 2014-03-23 at 09:01 +0000, Gupta, Sourab wrote:
> > > From: Sourab Gupta <sourab.gupta@intel.com>
> > > 
> > > Using MMIO based flips on VLV for Media power well residency optimization.
> > > The blitter ring is currently being used just for command streamer based
> > > flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> > > of blitter ring and this will ensure the 100% residency for Media well.
> > > 
> > > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > > flips when target seqno is reached. (Incorporating Ville's idea)
> > > 
> > > v3: Rebasing on latest code. Code restructuring after incorporating
> > > Damien's comments
> > > 
> > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_dma.c      |   1 +
> > >  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
> > >  drivers/gpu/drm/i915/i915_irq.c      |   2 +
> > >  drivers/gpu/drm/i915/intel_display.c | 124 +++++++++++++++++++++++++++++++++++
> > >  drivers/gpu/drm/i915/intel_drv.h     |   7 ++
> > >  5 files changed, 141 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index 4e0a26a..bca3c5a 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -1569,6 +1569,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> > >  	spin_lock_init(&dev_priv->backlight_lock);
> > >  	spin_lock_init(&dev_priv->uncore.lock);
> > >  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> > > +	spin_lock_init(&dev_priv->mmio_flip_lock);
> > >  	mutex_init(&dev_priv->dpio_lock);
> > >  	mutex_init(&dev_priv->modeset_restore_lock);
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 3f62be0..678d31d 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1621,6 +1621,10 @@ typedef struct drm_i915_private {
> > >  	struct i915_dri1_state dri1;
> > >  	/* Old ums support infrastructure, same warning applies. */
> > >  	struct i915_ums_state ums;
> > > +
> > > +	/* protects the mmio flip data */
> > > +	spinlock_t mmio_flip_lock;
> > > +
> > >  } drm_i915_private_t;
> > >  
> > >  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> > > @@ -2657,6 +2661,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> > >  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> > >  			       struct drm_file *file);
> > >  
> > > +void intel_notify_mmio_flip(struct drm_device *dev,
> > > +			struct intel_ring_buffer *ring);
> > > +
> > >  /* overlay */
> > >  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> > >  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index 4b4aeb3..ad26abe 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -1080,6 +1080,8 @@ static void notify_ring(struct drm_device *dev,
> > >  
> > >  	trace_i915_gem_request_complete(ring);
> > >  
> > > +	intel_notify_mmio_flip(dev, ring);
> > > +
> > >  	wake_up_all(&ring->irq_queue);
> > >  	i915_queue_hangcheck(dev);
> > >  }
> > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > > index 7e4ea8d..19004bf 100644
> > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > @@ -8782,6 +8782,125 @@ err:
> > >  	return ret;
> > >  }
> > >  
> > > +static int intel_do_mmio_flip(struct drm_device *dev,
> > > +			struct drm_crtc *crtc)
> > > +{
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct intel_crtc *intel_crtc;
> > > +
> > > +	intel_crtc = to_intel_crtc(crtc);
> > > +
> > > +	intel_mark_page_flip_active(intel_crtc);
> > > +	return dev_priv->display.update_primary_plane(crtc, crtc->fb, 0, 0);
> > > +}
> > > +
> > > +static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
> > > +{
> > > +	int ret;
> > > +	if (!obj->ring)
> > > +		return false;
> > > +
> > > +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
> > > +			      obj->last_write_seqno))
> > > +		return false;
> > > +
> > > +	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
> > > +		ret = i915_add_request(obj->ring, NULL);
> > > +		if (WARN_ON(ret))
> > > +			return false;
> > > +	}
> > > +
> > > +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> > > +		return false;
> > > +
> > > +	return true;
> > > +}
> > > +
> > > +void intel_notify_mmio_flip(struct drm_device *dev,
> > > +			struct intel_ring_buffer *ring)
> > > +{
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct drm_crtc *crtc;
> > > +	struct intel_crtc *intel_crtc;
> > > +	struct intel_mmio_flip *mmio_flip_data;
> > > +	unsigned long irq_flags;
> > > +	u32 seqno;
> > > +	enum pipe pipe;
> > > +
> > > +	BUG_ON(!ring);
> > > +
> > > +	seqno = ring->get_seqno(ring, false);
> > > +
> > > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > > +	for_each_pipe(pipe) {
> > > +		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> > > +		intel_crtc = to_intel_crtc(crtc);
> > > +		mmio_flip_data = &intel_crtc->mmio_flip_data;
> > > +		if ((mmio_flip_data->seqno != 0) &&
> > > +				(ring->id == mmio_flip_data->ring_id) &&
> > > +				(seqno >= mmio_flip_data->seqno)) {
> > > +			intel_do_mmio_flip(dev, crtc);
> > > +			mmio_flip_data->seqno = 0;
> > > +			ring->irq_put(ring);
> > > +		}
> > > +	}
> > > +
> > > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > > +}
> > > +
> > > +/* Using MMIO based flips starting from VLV, for Media power well
> > > + * residency optimization. The other alternative of having Render
> > > + * ring based flip calls is not being used, as the performance
> > > + * (FPS) of certain 3D Apps was getting severly affected.
> > > + */
> > > +static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
> > > +			struct drm_crtc *crtc,
> > > +			struct drm_framebuffer *fb,
> > > +			struct drm_i915_gem_object *obj,
> > > +			uint32_t flags)
> > > +{
> > > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > > +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> > > +	unsigned long irq_flags;
> > > +	int ret;
> > > +
> > > +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> > > +	if (ret)
> > > +		goto err;
> > > +
> > > +	switch (intel_crtc->plane) {
> > > +	case PLANE_A:
> > > +	case PLANE_B:
> > > +	case PLANE_C:
> > > +	break;
> > > +	default:
> > > +		WARN_ONCE(1, "unknown plane in flip command\n");
> > > +		ret = -ENODEV;
> > > +		goto err_unpin;
> > > +	}
> > > +
> > > +	if (!intel_postpone_flip(obj)) {
> > > +		ret = intel_do_mmio_flip(dev, crtc);
> > > +		return ret;
> > > +	}
> > > +
> > > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > > +	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
> > > +	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
> > > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > > +
> > > +	/* Double check to catch cases where irq fired before
> > > +	 * mmio flip data was ready
> > > +	 */
> > > +	intel_notify_mmio_flip(dev, obj->ring);
> > > +	return 0;
> > > +
> > > +err_unpin:
> > > +	intel_unpin_fb_obj(obj);
> > > +err:
> > > +	return ret;
> > > +}
> > > +
> > >  static int intel_gen7_queue_flip(struct drm_device *dev,
> > >  				 struct drm_crtc *crtc,
> > >  				 struct drm_framebuffer *fb,
> > > @@ -10577,6 +10696,8 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
> > >  		DRM_DEBUG_KMS("swapping pipes & planes for FBC\n");
> > >  		intel_crtc->plane = !pipe;
> > >  	}
> > > +	if (IS_VALLEYVIEW(dev))
> > > +			intel_crtc->mmio_flip_data.seqno = 0;
> > >  
> > >  	BUG_ON(pipe >= ARRAY_SIZE(dev_priv->plane_to_crtc_mapping) ||
> > >  	       dev_priv->plane_to_crtc_mapping[intel_crtc->plane] != NULL);
> > > @@ -11107,6 +11228,9 @@ static void intel_init_display(struct drm_device *dev)
> > >  	}
> > >  
> > >  	intel_panel_init_backlight_funcs(dev);
> > > +
> > > +	if (IS_VALLEYVIEW(dev))
> > > +		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;
> > >  }
> > >  
> > >  /*
> > > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > > index fa99104..f0b26a1 100644
> > > --- a/drivers/gpu/drm/i915/intel_drv.h
> > > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > > @@ -344,6 +344,11 @@ struct intel_pipe_wm {
> > >  	bool fbc_wm_enabled;
> > >  };
> > >  
> > > +struct intel_mmio_flip {
> > > +	u32 seqno;
> > > +	u32 ring_id;
> > > +};
> > > +
> > >  struct intel_crtc {
> > >  	struct drm_crtc base;
> > >  	enum pipe pipe;
> > > @@ -395,6 +400,8 @@ struct intel_crtc {
> > >  		/* watermarks currently being used  */
> > >  		struct intel_pipe_wm active;
> > >  	} wm;
> > > +
> > > +	struct intel_mmio_flip	mmio_flip_data;
> > >  };
> > >  
> > >  struct intel_plane_wm_parameters {
> > 
> 
> Hi Ville,
> A gentle reminder to review the patch.
> 
> Thanks,
> Sourab
> 

Gentle reminder to review the patch. (Been more than 2 weeks since last
version was sent.)

Thanks,
Sourab

* Re: [PATCH v3] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-03-23  9:01                     ` [PATCH v3] " sourab.gupta
  2014-03-26  7:49                       ` Gupta, Sourab
@ 2014-05-09 11:59                       ` Ville Syrjälä
  2014-05-09 13:28                         ` Ville Syrjälä
  2014-05-09 17:18                         ` Ville Syrjälä
  1 sibling, 2 replies; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-09 11:59 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, intel-gfx

On Sun, Mar 23, 2014 at 02:31:05PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Using MMIO based flips on VLV for Media power well residency optimization.
> The blitter ring is currently being used just for command streamer based
> flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> of blitter ring and this will ensure the 100% residency for Media well.

Sorry for dragging my feet with reviewing this. I'm hoping this is the
latest version...

> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> v3: Rebasing on latest code. Code restructuring after incorporating
> Damien's comments
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
>  drivers/gpu/drm/i915/i915_irq.c      |   2 +
>  drivers/gpu/drm/i915/intel_display.c | 124 +++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h     |   7 ++
>  5 files changed, 141 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 4e0a26a..bca3c5a 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1569,6 +1569,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	spin_lock_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
>  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> +	spin_lock_init(&dev_priv->mmio_flip_lock);
>  	mutex_init(&dev_priv->dpio_lock);
>  	mutex_init(&dev_priv->modeset_restore_lock);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 3f62be0..678d31d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1621,6 +1621,10 @@ typedef struct drm_i915_private {
>  	struct i915_dri1_state dri1;
>  	/* Old ums support infrastructure, same warning applies. */
>  	struct i915_ums_state ums;
> +
> +	/* protects the mmio flip data */
> +	spinlock_t mmio_flip_lock;
> +
>  } drm_i915_private_t;
>  
>  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> @@ -2657,6 +2661,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file);
>  
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring);
> +
>  /* overlay */
>  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
>  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 4b4aeb3..ad26abe 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1080,6 +1080,8 @@ static void notify_ring(struct drm_device *dev,
>  
>  	trace_i915_gem_request_complete(ring);
>  
> +	intel_notify_mmio_flip(dev, ring);
> +
>  	wake_up_all(&ring->irq_queue);
>  	i915_queue_hangcheck(dev);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 7e4ea8d..19004bf 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -8782,6 +8782,125 @@ err:
>  	return ret;
>  }
>  
> +static int intel_do_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc;
> +
> +	intel_crtc = to_intel_crtc(crtc);

nit: could be part of intel_crtc declaration

> +
> +	intel_mark_page_flip_active(intel_crtc);
> +	return dev_priv->display.update_primary_plane(crtc, crtc->fb, 0, 0);

Needs to pass crtc->{x,y} instead of 0,0.

I was a bit worried crtc->fb might be changed already at this point, but
after thinking a bit it should be fine since the presence of unpin_work
will keep intel_crtc_page_flip() from frobbing with it and we always
call intel_crtc_wait_for_pending_flips() before set_base.

Just need to update to use crtc->primary->fb now.

I'm thinking we also have a small race here with a flip done interrupt
from a previous set_base. Probably we need to sort it out using the 
SURFLIVE and/or flip counter like I did for the mmio vs. cs flip
race. But I need to think on this a bit more. Perhaps you want to also
look at those patches a bit.

> +}
> +
> +static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	int ret;

nit: needs an empty line between declarations and code.

> +	if (!obj->ring)
> +		return false;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
> +			      obj->last_write_seqno))

Maybe this should just do a very light check using lazy_coherency?

> +		return false;
> +
> +	if (obj->last_write_seqno == obj->ring->outstanding_lazy_seqno) {
> +		ret = i915_add_request(obj->ring, NULL);
> +		if (WARN_ON(ret))
> +			return false;
> +	}

Looks like i915_gem_check_olr(), so maybe make it non-static and use
here.

> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return false;

Maybe do this before the checking outstanding_lazy_seqno. Hmm, or
perhaps not. It won't actually eliminate the race entirely so you
still have to do the manual check later. I guess having it in this
order keeps the error paths simpler.

> +
> +	return true;
> +}
> +
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_crtc *crtc;
> +	struct intel_crtc *intel_crtc;
> +	struct intel_mmio_flip *mmio_flip_data;
> +	unsigned long irq_flags;
> +	u32 seqno;
> +	enum pipe pipe;
> +
> +	BUG_ON(!ring);

No need for the BUG. It'll explode on the next line anyway if
ring==NULL.

> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	for_each_pipe(pipe) {

list_for_each_entry() seems more appropriate since you don't use the
pipe variable for anything else besides looking up the crtc.

You could also move some of the variable declarations inside the loop
since they're not needed on the outside.

> +		crtc = dev_priv->pipe_to_crtc_mapping[pipe];
> +		intel_crtc = to_intel_crtc(crtc);
> +		mmio_flip_data = &intel_crtc->mmio_flip_data;
> +		if ((mmio_flip_data->seqno != 0) &&
> +				(ring->id == mmio_flip_data->ring_id) &&
> +				(seqno >= mmio_flip_data->seqno)) {

Weird indentation and a bit too many parens for my taste.

Should also use i915_seqno_passed() here.

Special casing seqno 0 this way seems safe enough since we apparently
skip seqno 0 always.
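
To illustrate why the plain `>=` matters here, the following is a minimal
userspace sketch of a wraparound-safe sequence comparison. The signed-difference
form is what i915_seqno_passed() is understood to do; the function names
seqno_passed() and seqno_ge() below are stand-ins made up for this example,
not driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe test for "has seq1 reached seq2?".
 * The unsigned subtraction wraps modulo 2^32, and interpreting the
 * result as signed treats anything less than 2^31 ahead as "passed".
 * (Illustrative stand-in for i915_seqno_passed().)
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* The plain comparison used in the patch, kept for contrast. */
static bool seqno_ge(uint32_t seq1, uint32_t seq2)
{
	return seq1 >= seq2;
}
```

With a target seqno of 0xfffffff0 and a current seqno of 2 (that is, just
after the counter wrapped), seqno_passed() reports the target as reached
while seqno_ge() does not, so a flip gated on the plain comparison could
stay pending far longer than intended.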

> +			intel_do_mmio_flip(dev, crtc);
> +			mmio_flip_data->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +}
> +
> +/* Using MMIO based flips starting from VLV, for Media power well
> + * residency optimization. The other alternative of having Render
> + * ring based flip calls is not being used, as the performance
> + * (FPS) of certain 3D Apps was getting severly affected.
> + */
> +static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc,
> +			struct drm_framebuffer *fb,
> +			struct drm_i915_gem_object *obj,
> +			uint32_t flags)

There's nothing gen7 specific here. So you could just rename the
function to eg. intel_queue_mmio_flip(). Maybe also move the
comment about VLV to where you set up the function pointer.

> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned long irq_flags;
> +	int ret;
> +
> +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> +	if (ret)
> +		goto err;
> +
> +	switch (intel_crtc->plane) {
> +	case PLANE_A:
> +	case PLANE_B:
> +	case PLANE_C:
> +	break;
> +	default:
> +		WARN_ONCE(1, "unknown plane in flip command\n");
> +		ret = -ENODEV;
> +		goto err_unpin;
> +	}

I think you can drop this plane check. We should hopefully catch such
bugs elsewhere already.

> +
> +	if (!intel_postpone_flip(obj)) {
> +		ret = intel_do_mmio_flip(dev, crtc);
> +		return ret;

Just 'return intel_do_mmio_flip(...'

Actually I think you can just make intel_do_mmio_flip() void since
.update_primary_plane() can't really fail. I think Daniel even has a
patch lined up to make it void as well.

> +	}
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
> +	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +
> +	/* Double check to catch cases where irq fired before
> +	 * mmio flip data was ready
> +	 */
> +	intel_notify_mmio_flip(dev, obj->ring);
> +	return 0;
> +
> +err_unpin:
> +	intel_unpin_fb_obj(obj);
> +err:
> +	return ret;
> +}
> +
>  static int intel_gen7_queue_flip(struct drm_device *dev,
>  				 struct drm_crtc *crtc,
>  				 struct drm_framebuffer *fb,
> @@ -10577,6 +10696,8 @@ static void intel_crtc_init(struct drm_device *dev, int pipe)
>  		DRM_DEBUG_KMS("swapping pipes & planes for FBC\n");
>  		intel_crtc->plane = !pipe;
>  	}
> +	if (IS_VALLEYVIEW(dev))
> +			intel_crtc->mmio_flip_data.seqno = 0;

Not needed. It's kzalloced so everything starts zeroed.

>  
>  	BUG_ON(pipe >= ARRAY_SIZE(dev_priv->plane_to_crtc_mapping) ||
>  	       dev_priv->plane_to_crtc_mapping[intel_crtc->plane] != NULL);
> @@ -11107,6 +11228,9 @@ static void intel_init_display(struct drm_device *dev)
>  	}
>  
>  	intel_panel_init_backlight_funcs(dev);
> +
> +	if (IS_VALLEYVIEW(dev))
> +		dev_priv->display.queue_flip = intel_gen7_queue_mmio_flip;

Would look a bit cleaner to do this before
intel_panel_init_backlight_funcs(). That will keep the .queue_flip
assignments closer together.

I'm also thinking that maybe we want a module parameter to select
between CS vs. mmio flips. It could default to mmio flips on VLV
and CS flips on other platforms if it's generally accepted that mmio
flips are better on VLV. That can at least get us a bit more testing
coverage if people don't need a specific platform to try it out.
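
As a rough sketch of the selection logic such a parameter might drive,
assuming a hypothetical tri-state integer (negative = per-platform default,
0 = CS flips, positive = mmio flips); the name use_mmio_flip and the helper
want_mmio_flip() are invented for this illustration and are not part of the
patch:

```c
#include <stdbool.h>

/*
 * Hypothetical knob semantics (an assumption, not from the patch):
 *   < 0  pick the per-platform default (mmio flips on VLV, CS elsewhere)
 *   == 0 force command streamer (CS) flips
 *   > 0  force mmio flips
 */
static bool want_mmio_flip(int use_mmio_flip, bool is_valleyview)
{
	if (use_mmio_flip < 0)
		return is_valleyview;

	return use_mmio_flip != 0;
}
```

In the driver this decision would sit where .queue_flip is assigned, with
the integer exposed through the usual module parameter machinery; that
wiring is left out here on purpose.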

On the whole I think it looks fairly good, and it should be fairly easy
to extend it more for nuclear flips.

>  }
>  
>  /*
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index fa99104..f0b26a1 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -344,6 +344,11 @@ struct intel_pipe_wm {
>  	bool fbc_wm_enabled;
>  };
>  
> +struct intel_mmio_flip {
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  struct intel_crtc {
>  	struct drm_crtc base;
>  	enum pipe pipe;
> @@ -395,6 +400,8 @@ struct intel_crtc {
>  		/* watermarks currently being used  */
>  		struct intel_pipe_wm active;
>  	} wm;
> +
> +	struct intel_mmio_flip	mmio_flip_data;
>  };
>  
>  struct intel_plane_wm_parameters {
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH v3] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-09 11:59                       ` Ville Syrjälä
@ 2014-05-09 13:28                         ` Ville Syrjälä
  2014-05-09 17:18                         ` Ville Syrjälä
  1 sibling, 0 replies; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-09 13:28 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, intel-gfx

On Fri, May 09, 2014 at 02:59:42PM +0300, Ville Syrjälä wrote:
> On Sun, Mar 23, 2014 at 02:31:05PM +0530, sourab.gupta@intel.com wrote:
> > From: Sourab Gupta <sourab.gupta@intel.com>
> > 
> > Using MMIO based flips on VLV for Media power well residency optimization.
> > The blitter ring is currently being used just for command streamer based
> > flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> > of blitter ring and this will ensure the 100% residency for Media well.
> 
> Sorry for dragging my feet with reviewing this. I'm hoping this is the
> latest version...
> 
> > 
> > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > flips when target seqno is reached. (Incorporating Ville's idea)
> > 
> > v3: Rebasing on latest code. Code restructuring after incorporating
> > Damien's comments
> > 
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c      |   1 +
> >  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
> >  drivers/gpu/drm/i915/i915_irq.c      |   2 +
> >  drivers/gpu/drm/i915/intel_display.c | 124 +++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_drv.h     |   7 ++
> >  5 files changed, 141 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 4e0a26a..bca3c5a 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1569,6 +1569,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  	spin_lock_init(&dev_priv->backlight_lock);
> >  	spin_lock_init(&dev_priv->uncore.lock);
> >  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> > +	spin_lock_init(&dev_priv->mmio_flip_lock);
> >  	mutex_init(&dev_priv->dpio_lock);
> >  	mutex_init(&dev_priv->modeset_restore_lock);
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 3f62be0..678d31d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1621,6 +1621,10 @@ typedef struct drm_i915_private {
> >  	struct i915_dri1_state dri1;
> >  	/* Old ums support infrastructure, same warning applies. */
> >  	struct i915_ums_state ums;
> > +
> > +	/* protects the mmio flip data */
> > +	spinlock_t mmio_flip_lock;
> > +
> >  } drm_i915_private_t;
> >  
> >  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> > @@ -2657,6 +2661,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> >  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> >  			       struct drm_file *file);
> >  
> > +void intel_notify_mmio_flip(struct drm_device *dev,
> > +			struct intel_ring_buffer *ring);
> > +
> >  /* overlay */
> >  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> >  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 4b4aeb3..ad26abe 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1080,6 +1080,8 @@ static void notify_ring(struct drm_device *dev,
> >  
> >  	trace_i915_gem_request_complete(ring);
> >  
> > +	intel_notify_mmio_flip(dev, ring);
> > +
> >  	wake_up_all(&ring->irq_queue);
> >  	i915_queue_hangcheck(dev);
> >  }
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 7e4ea8d..19004bf 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -8782,6 +8782,125 @@ err:
> >  	return ret;
> >  }
> >  
> > +static int intel_do_mmio_flip(struct drm_device *dev,
> > +			struct drm_crtc *crtc)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_crtc *intel_crtc;
> > +
> > +	intel_crtc = to_intel_crtc(crtc);
> 
> nit: could be part of intel_crtc declaration
> 
> > +
> > +	intel_mark_page_flip_active(intel_crtc);
> > +	return dev_priv->display.update_primary_plane(crtc, crtc->fb, 0, 0);
> 
> Needs to pass crtc->{x,y} instead of 0,0.
> 
> I was a bit worried crtc->fb might be changed already at this point, but
> after thinking a bit it should be fine since the presence of unpin_work
> will keep intel_crtc_page_flip() from frobbing with it and we always
> call intel_crtc_wait_for_pending_flips() before set_base.
> 
> Just need to update to use crtc->primary->fb now.
> 
> I'm thinking we also have a small race here with a flip done interrupt
> from a previous set_base. Probably we need to sort it out using the 
> SURFLIVE and/or flip counter like I did for the mmio vs. cs flip
> race. But I need to think on this a bit more. Perhaps you want to also
> look at those patches a bit.

Oh, another thing here is that update_primary_plane() isn't guaranteed
to be atomic. We could start to use the atomic pipe update mechanism here
already but then we need to schedule a work so that we can perform the
vblank evade trick. That's going to be needed for the nuclear flip
anyway.

Another option would be to only update the base address and leave the
offset registers alone just like the CS flip does.

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH v3] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-09 11:59                       ` Ville Syrjälä
  2014-05-09 13:28                         ` Ville Syrjälä
@ 2014-05-09 17:18                         ` Ville Syrjälä
  2014-05-15  6:17                           ` [PATCH v4] " sourab.gupta
  1 sibling, 1 reply; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-09 17:18 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Akash Goel, intel-gfx

On Fri, May 09, 2014 at 02:59:42PM +0300, Ville Syrjälä wrote:
> On Sun, Mar 23, 2014 at 02:31:05PM +0530, sourab.gupta@intel.com wrote:
> > +			intel_do_mmio_flip(dev, crtc);
> > +			mmio_flip_data->seqno = 0;
> > +			ring->irq_put(ring);
> > +		}
> > +	}
> > +
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +}
> > +
> > +/* Using MMIO based flips starting from VLV, for Media power well
> > + * residency optimization. The other alternative of having Render
> > + * ring based flip calls is not being used, as the performance
> > + * (FPS) of certain 3D Apps was getting severely affected.
> > + */
> > +static int intel_gen7_queue_mmio_flip(struct drm_device *dev,
> > +			struct drm_crtc *crtc,
> > +			struct drm_framebuffer *fb,
> > +			struct drm_i915_gem_object *obj,
> > +			uint32_t flags)
> 
> There's nothing gen7 specific here. So you could just rename the
> function to eg. intel_queue_mmio_flip(). Maybe also move the
> comment about VLV to where you set up the function pointer.

Actually this code isn't entirely gen agnostic. It should work on gen5+
since all of those have a flip done interrupt. For older platforms we
use some clever tricks involving the flip_pending status bits and vblank
irqs. That code won't work for mmio flips. We'd need to add another way
to complete the flips. That would involve using the frame counter
to make it accurate. To avoid races there we'd definitely need to use
the vblank evade mechanism to make sure we sample the frame counter
within the same frame as when we write the registers. Also gen2 has
the extra complication that it lacks a hardware frame counter.
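
A minimal sketch of the frame counter check, as standalone C. It assumes
the new base address latches at the next vblank, so the flip is done once
the counter has moved on from the value sampled at register write time.
The function name and the 24-bit counter width are assumptions for
illustration, not driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed counter width; the real width depends on the platform. */
#define FRAME_COUNTER_BITS 24u
#define FRAME_COUNTER_MASK ((1u << FRAME_COUNTER_BITS) - 1u)

/*
 * frame_at_write: counter sampled in the same frame as the register write
 * frame_now:      counter read later, e.g. from the vblank irq
 *
 * The difference is masked to the counter width so the comparison keeps
 * working when the counter wraps between the two samples.
 */
static bool mmio_flip_done(uint32_t frame_at_write, uint32_t frame_now)
{
	assert(frame_at_write <= FRAME_COUNTER_MASK);
	assert(frame_now <= FRAME_COUNTER_MASK);

	return ((frame_now - frame_at_write) & FRAME_COUNTER_MASK) != 0;
}
```

This only holds if the sample and the register write really land in the
same frame, which is what the vblank evade mechanism would be for.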

So I think we can start off with limiting this to gen5+, and later we
can extend it to cover the older platforms since we anyway need to do
that work to get the nuclear flips working.

BTW I gave this code a whirl on my IVB and everything seems to work
fine.

-- 
Ville Syrjälä
Intel OTC

* [PATCH v4] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-09 17:18                         ` Ville Syrjälä
@ 2014-05-15  6:17                           ` sourab.gupta
  2014-05-15 12:27                             ` Ville Syrjälä
  0 siblings, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-05-15  6:17 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on Gen5+ for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

In a subsequent patch, we can make the selection between CS vs MMIO flip
based on a module parameter to give more testing coverage.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   6 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   2 +
 drivers/gpu/drm/i915/intel_display.c | 115 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 6 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 46f1dec..672c28f 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	dev_priv->ring_index = 0;
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4006dfe..38c0820 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1543,6 +1543,8 @@ struct drm_i915_private {
 	struct i915_ums_state ums;
 	/* the indicator for dispatch video commands on two BSD rings */
 	int ring_index;
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
 };
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2209,6 +2211,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2580,6 +2583,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fa5b5ab..5b4e953 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+int
 i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b10fbde..a353693 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1084,6 +1084,8 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	intel_notify_mmio_flip(dev, ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0f8f9bc..9d190c2 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9037,6 +9037,110 @@ err:
 	return ret;
 }
 
+static void intel_do_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj;
+	struct intel_framebuffer *intel_fb;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	int plane = intel_crtc->plane;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	intel_fb = to_intel_framebuffer(crtc->primary->fb);
+	obj = intel_fb->obj;
+
+	/* Update the base address reg for the plane */
+	I915_WRITE(DSPSURF(plane), i915_gem_obj_ggtt_offset(obj) +
+			intel_crtc->dspaddr_offset);
+}
+
+static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	if (!obj->ring)
+		return false;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
+				obj->last_write_seqno))
+		return false;
+
+	i915_gem_check_olr(obj->ring, obj->last_write_seqno);
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return false;
+
+	return true;
+}
+
+void intel_notify_mmio_flip(struct drm_device *dev,
+			struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_crtc *crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_crtc(dev, crtc) {
+		struct intel_crtc *intel_crtc;
+		struct intel_mmio_flip *mmio_flip_data;
+
+		intel_crtc = to_intel_crtc(crtc);
+		mmio_flip_data = &intel_crtc->mmio_flip_data;
+
+		if (mmio_flip_data->seqno == 0)
+			continue;
+		if (ring->id != mmio_flip_data->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip_data->seqno)) {
+			intel_do_mmio_flip(dev, crtc);
+			mmio_flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	if (!intel_postpone_flip(obj)) {
+		intel_do_mmio_flip(dev, crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(dev, obj->ring);
+	return 0;
+
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -11377,7 +11481,18 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
+	/* Using MMIO based flips starting from Gen5, for Media power well
+	 * residency optimization. This is not currently being used for
+	 * older platforms because of non-availability of flip done interrupt.
+	 * The other alternative of having Render ring based flip calls is
+	 * not being used, as the performance(FPS) of certain 3D Apps gets
+	 * severely affected.
+	 */
+	if (INTEL_INFO(dev)->gen >= 5)
+		dev_priv->display.queue_flip = intel_queue_mmio_flip;
+
 	intel_panel_init_backlight_funcs(dev);
+
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 32a74e1..7953ed6 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -351,6 +351,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -403,6 +408,7 @@ struct intel_crtc {
 	} wm;
 
 	wait_queue_head_t vbl_wait;
+	struct intel_mmio_flip	mmio_flip_data;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1

* Re: [PATCH v4] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-15  6:17                           ` [PATCH v4] " sourab.gupta
@ 2014-05-15 12:27                             ` Ville Syrjälä
  2014-05-16 12:34                               ` Gupta, Sourab
  0 siblings, 1 reply; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-15 12:27 UTC (permalink / raw)
  To: sourab.gupta; +Cc: intel-gfx, Akash Goel

On Thu, May 15, 2014 at 11:47:37AM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Using MMIO based flips on Gen5+ for Media power well residency optimization.
> The blitter ring is currently being used just for command streamer based
> flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> of blitter ring and this will ensure the 100% residency for Media well.
> 
> In a subsequent patch, we can make the selection between CS vs MMIO flip
> based on a module parameter to give more testing coverage.
> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> v3: Rebasing on latest code. Code restructuring after incorporating
> Damien's comments
> 
> v4: Addressing Ville's review comments
>     -general cleanup
>     -updating only base addr instead of calling update_primary_plane
>     -extending patch for gen5+ platforms
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |   6 ++
>  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
>  drivers/gpu/drm/i915/i915_irq.c      |   2 +
>  drivers/gpu/drm/i915/intel_display.c | 115 +++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
>  6 files changed, 131 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 46f1dec..672c28f 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	spin_lock_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
>  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> +	spin_lock_init(&dev_priv->mmio_flip_lock);
>  	dev_priv->ring_index = 0;
>  	mutex_init(&dev_priv->dpio_lock);
>  	mutex_init(&dev_priv->modeset_restore_lock);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4006dfe..38c0820 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1543,6 +1543,8 @@ struct drm_i915_private {
>  	struct i915_ums_state ums;
>  	/* the indicator for dispatch video commands on two BSD rings */
>  	int ring_index;
> +	/* protects the mmio flip data */
> +	spinlock_t mmio_flip_lock;
>  };
>  
>  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> @@ -2209,6 +2211,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
>  void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
>  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
>  				      bool interruptible);
> +int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
>  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
>  {
>  	return unlikely(atomic_read(&error->reset_counter)
> @@ -2580,6 +2583,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file);
>  
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring);
> +
>  /* overlay */
>  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
>  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fa5b5ab..5b4e953 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   * Compare seqno against outstanding lazy request. Emit a request if they are
>   * equal.
>   */
> -static int
> +int
>  i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
>  {
>  	int ret;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b10fbde..a353693 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1084,6 +1084,8 @@ static void notify_ring(struct drm_device *dev,
>  
>  	trace_i915_gem_request_complete(ring);
>  
> +	intel_notify_mmio_flip(dev, ring);
> +

Hmm. How badly is this going to explode with UMS?

>  	wake_up_all(&ring->irq_queue);
>  	i915_queue_hangcheck(dev);
>  }
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 0f8f9bc..9d190c2 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9037,6 +9037,110 @@ err:
>  	return ret;
>  }
>  
> +static void intel_do_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc)

nit: no need to pass dev and crtc. You can dig out dev_priv from
the crtc in the function.

Passing intel_crtc instead of drm_crtc could also avoid some extra
variables.

> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_object *obj;
> +	struct intel_framebuffer *intel_fb;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	int plane = intel_crtc->plane;
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +
> +	intel_fb = to_intel_framebuffer(crtc->primary->fb);
> +	obj = intel_fb->obj;

nit: could be done as part of the declaration

> +
> +	/* Update the base address reg for the plane */

Useless comment.

> +	I915_WRITE(DSPSURF(plane), i915_gem_obj_ggtt_offset(obj) +
> +			intel_crtc->dspaddr_offset);

Should probably have a POSTING_READ() here.

> +}
> +
> +static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	if (!obj->ring)
> +		return false;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),

Still not using lazy_coherency.

> +				obj->last_write_seqno))
> +		return false;
> +
> +	i915_gem_check_olr(obj->ring, obj->last_write_seqno);

Error handling is missing.

In fact if we get an error from this I guess we should fail
.queue_flip(). A bool return isn't sufficient to convey both
errors and whether we need to wait for the GPU or not. I guess
you could encode that into an int with >0,==0,<0 meaning
different things, or you may just need to inline this stuff
into intel_queue_mmio_flip().
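
Roughly what the int encoding could look like, as a standalone sketch. The
struct and its fields are hypothetical stand-ins for the object and ring
state, olr_ret simulates the result of emitting the lazy request, and the
irq_get step is left out to keep it short.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bits of object/ring state the check needs. */
struct flip_state {
	bool has_ring;             /* object was last used by some ring */
	uint32_t ring_seqno;       /* seqno the ring has completed so far */
	uint32_t last_write_seqno; /* seqno of the last write to the object */
	int olr_ret;               /* simulated result of emitting the lazy request */
};

/* Wrap-safe "seq1 is at or after seq2". */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/*
 * Returns <0 on error, 0 if the flip can be issued right away,
 * >0 if it has to wait for the GPU.
 */
static int postpone_flip(const struct flip_state *s)
{
	assert(s);

	if (!s->has_ring)
		return 0;
	if (seqno_passed(s->ring_seqno, s->last_write_seqno))
		return 0;
	if (s->olr_ret < 0)
		return s->olr_ret;
	return 1;
}

/* Caller: propagate errors, otherwise report whether the flip was deferred. */
static int queue_flip(const struct flip_state *s, bool *deferred)
{
	int ret = postpone_flip(s);

	if (ret < 0)
		return ret;
	*deferred = ret > 0;
	return 0;
}
```

The caller then has a single place that separates "fail the flip" from
"flip now" and "flip from the irq handler".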

> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return false;
> +
> +	return true;
> +}
> +
> +void intel_notify_mmio_flip(struct drm_device *dev,
> +			struct intel_ring_buffer *ring)

Again could just pass ring and dig out dev from it.

> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_crtc *crtc;
> +	unsigned long irq_flags;
> +	u32 seqno;
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	for_each_crtc(dev, crtc) {
> +		struct intel_crtc *intel_crtc;

Could use for_each_intel_crtc() to avoid some extra variables.

> +		struct intel_mmio_flip *mmio_flip_data;
> +
> +		intel_crtc = to_intel_crtc(crtc);
> +		mmio_flip_data = &intel_crtc->mmio_flip_data;
> +
> +		if (mmio_flip_data->seqno == 0)
> +			continue;
> +		if (ring->id != mmio_flip_data->ring_id)
> +			continue;
> +
> +		if (i915_seqno_passed(seqno, mmio_flip_data->seqno)) {
> +			intel_do_mmio_flip(dev, crtc);
> +			mmio_flip_data->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +}
> +
> +static int intel_queue_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc,
> +			struct drm_framebuffer *fb,
> +			struct drm_i915_gem_object *obj,
> +			uint32_t flags)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned long irq_flags;
> +	int ret;
> +
> +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> +	if (ret)
> +		goto err;
> +
> +	if (!intel_postpone_flip(obj)) {
> +		intel_do_mmio_flip(dev, crtc);
> +		return 0;
> +	}
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
> +	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +
> +	/* Double check to catch cases where irq fired before
> +	 * mmio flip data was ready
> +	 */
> +	intel_notify_mmio_flip(dev, obj->ring);
> +	return 0;
> +
> +err:
> +	return ret;
> +}
> +
>  static int intel_gen7_queue_flip(struct drm_device *dev,
>  				 struct drm_crtc *crtc,
>  				 struct drm_framebuffer *fb,
> @@ -11377,7 +11481,18 @@ static void intel_init_display(struct drm_device *dev)
>  		break;
>  	}
>  
> +	/* Using MMIO based flips starting from Gen5, for Media power well
> +	 * residency optimization. This is not currently being used for
> +	 * older platforms because of non-availability of flip done interrupt.
> +	 * The other alternative of having Render ring based flip calls is
> +	 * not being used, as the performance(FPS) of certain 3D Apps gets
> +	 * severely affected.
> +	 */
> +	if (INTEL_INFO(dev)->gen >= 5)
> +		dev_priv->display.queue_flip = intel_queue_mmio_flip;

I still think we need a module param for this, and default to CS for
now (except maybe for VLV).

> +
>  	intel_panel_init_backlight_funcs(dev);
> +
>  }
>  
>  /*
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 32a74e1..7953ed6 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -351,6 +351,11 @@ struct intel_pipe_wm {
>  	bool sprites_scaled;
>  };
>  
> +struct intel_mmio_flip {
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  struct intel_crtc {
>  	struct drm_crtc base;
>  	enum pipe pipe;
> @@ -403,6 +408,7 @@ struct intel_crtc {
>  	} wm;
>  
>  	wait_queue_head_t vbl_wait;
> +	struct intel_mmio_flip	mmio_flip_data;
                              ^

Should be a space, not tab.

>  };
>  
>  struct intel_plane_wm_parameters {
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH v4] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-15 12:27                             ` Ville Syrjälä
@ 2014-05-16 12:34                               ` Gupta, Sourab
  2014-05-16 12:51                                 ` Ville Syrjälä
  0 siblings, 1 reply; 67+ messages in thread
From: Gupta, Sourab @ 2014-05-16 12:34 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Goel, Akash

On Thu, 2014-05-15 at 12:27 +0000, Ville Syrjälä wrote:
> On Thu, May 15, 2014 at 11:47:37AM +0530, sourab.gupta@intel.com wrote:
> > From: Sourab Gupta <sourab.gupta@intel.com>
> > 
> > Using MMIO based flips on Gen5+ for Media power well residency optimization.
> > The blitter ring is currently being used just for command streamer based
> > flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> > of blitter ring and this will ensure the 100% residency for Media well.
> > 
> > In a subsequent patch, we can make the selection between CS vs MMIO flip
> > based on a module parameter to give more testing coverage.
> > 
> > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > flips when target seqno is reached. (Incorporating Ville's idea)
> > 
> > v3: Rebasing on latest code. Code restructuring after incorporating
> > Damien's comments
> > 
> > v4: Addressing Ville's review comments
> >     -general cleanup
> >     -updating only base addr instead of calling update_primary_plane
> >     -extending patch for gen5+ platforms
> > 
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c      |   1 +
> >  drivers/gpu/drm/i915/i915_drv.h      |   6 ++
> >  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
> >  drivers/gpu/drm/i915/i915_irq.c      |   2 +
> >  drivers/gpu/drm/i915/intel_display.c | 115 +++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
> >  6 files changed, 131 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 46f1dec..672c28f 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  	spin_lock_init(&dev_priv->backlight_lock);
> >  	spin_lock_init(&dev_priv->uncore.lock);
> >  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> > +	spin_lock_init(&dev_priv->mmio_flip_lock);
> >  	dev_priv->ring_index = 0;
> >  	mutex_init(&dev_priv->dpio_lock);
> >  	mutex_init(&dev_priv->modeset_restore_lock);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 4006dfe..38c0820 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1543,6 +1543,8 @@ struct drm_i915_private {
> >  	struct i915_ums_state ums;
> >  	/* the indicator for dispatch video commands on two BSD rings */
> >  	int ring_index;
> > +	/* protects the mmio flip data */
> > +	spinlock_t mmio_flip_lock;
> >  };
> >  
> >  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> > @@ -2209,6 +2211,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
> >  void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
> >  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> >  				      bool interruptible);
> > +int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
> >  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
> >  {
> >  	return unlikely(atomic_read(&error->reset_counter)
> > @@ -2580,6 +2583,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> >  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> >  			       struct drm_file *file);
> >  
> > +void intel_notify_mmio_flip(struct drm_device *dev,
> > +			struct intel_ring_buffer *ring);
> > +
> >  /* overlay */
> >  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> >  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index fa5b5ab..5b4e953 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
> >   * Compare seqno against outstanding lazy request. Emit a request if they are
> >   * equal.
> >   */
> > -static int
> > +int
> >  i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
> >  {
> >  	int ret;
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index b10fbde..a353693 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1084,6 +1084,8 @@ static void notify_ring(struct drm_device *dev,
> >  
> >  	trace_i915_gem_request_complete(ring);
> >  
> > +	intel_notify_mmio_flip(dev, ring);
> > +
> 
> Hmm. How badly is this going to explode with UMS?

Hi Ville,
It seems there would be a small race between the page flip done intr and
the flip done interrupt from previous set base. But it seems to be the
case for CS flips also. In both cases, once we do the
mark_page_flip_active, there may be a window in which page flip intr
from previous set base may arrive.
Have we interpreted the race correctly? Or are we missing something
here?

Also, notify_mmio_flip is being called from notify_ring function.
Not sure of the scenario in which it may explode with UMS. Can you
please elaborate more.

I'll send the next version of the patch with the rest of the comments addressed.

> 
> >  	wake_up_all(&ring->irq_queue);
> >  	i915_queue_hangcheck(dev);
> >  }
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 0f8f9bc..9d190c2 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -9037,6 +9037,110 @@ err:
> >  	return ret;
> >  }
> >  
> > +static void intel_do_mmio_flip(struct drm_device *dev,
> > +			struct drm_crtc *crtc)
> 
> nit: no need to pass dev and crtc. You can dig out dev_priv from
> the crtc in the function.
> 
> Passing intel_crtc instead of drm_crtc could also avoid some extra
> variables.

> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_gem_object *obj;
> > +	struct intel_framebuffer *intel_fb;
> > +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> > +	int plane = intel_crtc->plane;
> > +
> > +	intel_mark_page_flip_active(intel_crtc);
> > +
> > +	intel_fb = to_intel_framebuffer(crtc->primary->fb);
> > +	obj = intel_fb->obj;
> 
> nit: could be done as part of the declaration
> 
> > +
> > +	/* Update the base address reg for the plane */
> 
> Useless comment.
> 
> > +	I915_WRITE(DSPSURF(plane), i915_gem_obj_ggtt_offset(obj) +
> > +			intel_crtc->dspaddr_offset);
> 
> Should probably have a POSTING_READ() here.
> 
> > +}
> > +
> > +static bool intel_postpone_flip(struct drm_i915_gem_object *obj)
> > +{
> > +	if (!obj->ring)
> > +		return false;
> > +
> > +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, false),
> 
> Still not using lazy_coherency.
> 
> > +				obj->last_write_seqno))
> > +		return false;
> > +
> > +	i915_gem_check_olr(obj->ring, obj->last_write_seqno);
> 
> Error handling is missing.
> 
> In fact if we get an error from this I guess we should fail
> .queue_flip(). A bool return isn't sufficient to convey both
> errors and whether we need to wait for the GPU or not. I guess
> you could encode that into an int with >0,==0,<0 meaning
> different things, or you may just need to inline this stuff
> into intel_queue_mmio_flip().
> 
> > +
> > +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> > +		return false;
> > +
> > +	return true;
> > +}
> > +
> > +void intel_notify_mmio_flip(struct drm_device *dev,
> > +			struct intel_ring_buffer *ring)
> 
> Again could just pass ring and dig out dev from it.
> 
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_crtc *crtc;
> > +	unsigned long irq_flags;
> > +	u32 seqno;
> > +
> > +	seqno = ring->get_seqno(ring, false);
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	for_each_crtc(dev, crtc) {
> > +		struct intel_crtc *intel_crtc;
> 
> Could use for_each_intel_crtc() to avoid some extra variables.
> 
> > +		struct intel_mmio_flip *mmio_flip_data;
> > +
> > +		intel_crtc = to_intel_crtc(crtc);
> > +		mmio_flip_data = &intel_crtc->mmio_flip_data;
> > +
> > +		if (mmio_flip_data->seqno == 0)
> > +			continue;
> > +		if (ring->id != mmio_flip_data->ring_id)
> > +			continue;
> > +
> > +		if (i915_seqno_passed(seqno, mmio_flip_data->seqno)) {
> > +			intel_do_mmio_flip(dev, crtc);
> > +			mmio_flip_data->seqno = 0;
> > +			ring->irq_put(ring);
> > +		}
> > +	}
> > +
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +}
> > +
> > +static int intel_queue_mmio_flip(struct drm_device *dev,
> > +			struct drm_crtc *crtc,
> > +			struct drm_framebuffer *fb,
> > +			struct drm_i915_gem_object *obj,
> > +			uint32_t flags)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> > +	unsigned long irq_flags;
> > +	int ret;
> > +
> > +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> > +	if (ret)
> > +		goto err;
> > +
> > +	if (!intel_postpone_flip(obj)) {
> > +		intel_do_mmio_flip(dev, crtc);
> > +		return 0;
> > +	}
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
> > +	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +
> > +	/* Double check to catch cases where irq fired before
> > +	 * mmio flip data was ready
> > +	 */
> > +	intel_notify_mmio_flip(dev, obj->ring);
> > +	return 0;
> > +
> > +err:
> > +	return ret;
> > +}
> > +
> >  static int intel_gen7_queue_flip(struct drm_device *dev,
> >  				 struct drm_crtc *crtc,
> >  				 struct drm_framebuffer *fb,
> > @@ -11377,7 +11481,18 @@ static void intel_init_display(struct drm_device *dev)
> >  		break;
> >  	}
> >  
> > +	/* Using MMIO based flips starting from Gen5, for Media power well
> > +	 * residency optimization. This is not currently being used for
> > +	 * older platforms because of non-availability of flip done interrupt.
> > +	 * The other alternative of having Render ring based flip calls is
> > +	 * not being used, as the performance(FPS) of certain 3D Apps gets
> > +	 * severely affected.
> > +	 */
> > +	if (INTEL_INFO(dev)->gen >= 5)
> > +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> 
> I still think we need a module param for this, and default to CS for
> now (except maybe for VLV).
> 
> > +
> >  	intel_panel_init_backlight_funcs(dev);
> > +
> >  }
> >  
> >  /*
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index 32a74e1..7953ed6 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -351,6 +351,11 @@ struct intel_pipe_wm {
> >  	bool sprites_scaled;
> >  };
> >  
> > +struct intel_mmio_flip {
> > +	u32 seqno;
> > +	u32 ring_id;
> > +};
> > +
> >  struct intel_crtc {
> >  	struct drm_crtc base;
> >  	enum pipe pipe;
> > @@ -403,6 +408,7 @@ struct intel_crtc {
> >  	} wm;
> >  
> >  	wait_queue_head_t vbl_wait;
> > +	struct intel_mmio_flip	mmio_flip_data;
>                               ^
> 
> Should be a space, not tab.
> 
> >  };
> >  
> >  struct intel_plane_wm_parameters {
> > -- 
> > 1.8.5.1
> 

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v4] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-16 12:34                               ` Gupta, Sourab
@ 2014-05-16 12:51                                 ` Ville Syrjälä
  2014-05-19  9:19                                   ` Gupta, Sourab
  2014-05-19 10:58                                   ` [PATCH v5] " sourab.gupta
  0 siblings, 2 replies; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-16 12:51 UTC (permalink / raw)
  To: Gupta, Sourab; +Cc: intel-gfx, Goel, Akash

On Fri, May 16, 2014 at 12:34:08PM +0000, Gupta, Sourab wrote:
> On Thu, 2014-05-15 at 12:27 +0000, Ville Syrjälä wrote:
> > On Thu, May 15, 2014 at 11:47:37AM +0530, sourab.gupta@intel.com wrote:
> > > From: Sourab Gupta <sourab.gupta@intel.com>
> > > 
> > > Using MMIO based flips on Gen5+ for Media power well residency optimization.
> > > The blitter ring is currently being used just for command streamer based
> > > flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> > > of blitter ring and this will ensure the 100% residency for Media well.
> > > 
> > > In a subsequent patch, we can make the selection between CS vs MMIO flip
> > > based on a module parameter to give more testing coverage.
> > > 
> > > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > > flips when target seqno is reached. (Incorporating Ville's idea)
> > > 
> > > v3: Rebasing on latest code. Code restructuring after incorporating
> > > Damien's comments
> > > 
> > > v4: Addressing Ville's review comments
> > >     -general cleanup
> > >     -updating only base addr instead of calling update_primary_plane
> > >     -extending patch for gen5+ platforms
> > > 
> > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_dma.c      |   1 +
> > >  drivers/gpu/drm/i915/i915_drv.h      |   6 ++
> > >  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
> > >  drivers/gpu/drm/i915/i915_irq.c      |   2 +
> > >  drivers/gpu/drm/i915/intel_display.c | 115 +++++++++++++++++++++++++++++++++++
> > >  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
> > >  6 files changed, 131 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index 46f1dec..672c28f 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> > >  	spin_lock_init(&dev_priv->backlight_lock);
> > >  	spin_lock_init(&dev_priv->uncore.lock);
> > >  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> > > +	spin_lock_init(&dev_priv->mmio_flip_lock);
> > >  	dev_priv->ring_index = 0;
> > >  	mutex_init(&dev_priv->dpio_lock);
> > >  	mutex_init(&dev_priv->modeset_restore_lock);
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 4006dfe..38c0820 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1543,6 +1543,8 @@ struct drm_i915_private {
> > >  	struct i915_ums_state ums;
> > >  	/* the indicator for dispatch video commands on two BSD rings */
> > >  	int ring_index;
> > > +	/* protects the mmio flip data */
> > > +	spinlock_t mmio_flip_lock;
> > >  };
> > >  
> > >  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> > > @@ -2209,6 +2211,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
> > >  void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
> > >  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> > >  				      bool interruptible);
> > > +int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
> > >  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
> > >  {
> > >  	return unlikely(atomic_read(&error->reset_counter)
> > > @@ -2580,6 +2583,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> > >  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> > >  			       struct drm_file *file);
> > >  
> > > +void intel_notify_mmio_flip(struct drm_device *dev,
> > > +			struct intel_ring_buffer *ring);
> > > +
> > >  /* overlay */
> > >  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> > >  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index fa5b5ab..5b4e953 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
> > >   * Compare seqno against outstanding lazy request. Emit a request if they are
> > >   * equal.
> > >   */
> > > -static int
> > > +int
> > >  i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
> > >  {
> > >  	int ret;
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index b10fbde..a353693 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -1084,6 +1084,8 @@ static void notify_ring(struct drm_device *dev,
> > >  
> > >  	trace_i915_gem_request_complete(ring);
> > >  
> > > +	intel_notify_mmio_flip(dev, ring);
> > > +
> > 
> > Hmm. How badly is this going to explode with UMS?
> 
> Hi Ville,
> It seems there would be a small race between the page flip done intr and
> the flip done interrupt from previous set base. But it seems to be the
> case for CS flips also. In both cases, once we do the
> mark_page_flip_active, there may be a window in which page flip intr
> from previous set base may arrive.
> Have we interpreted the race correctly? Or are we missing something
> here?

Yes. See here for my patches to fix the mmio vs. CS race:
http://lists.freedesktop.org/archives/intel-gfx/2014-April/043759.html
Feel free to review that stuff if you have a bit of time.

I've not had time to think about the mmio vs. mmio case yet. Perhaps my
patches would fix that too?

> 
> Also, notify_mmio_flip is being called from notify_ring function.
> Not sure of the scenario in which it may explode with UMS. Can you
> please elaborate more.

With UMS we have no modeset structures (drm_crtcs and whatnot). So the
crtc list walk will probably explode.

Hmm. I guess we could just init all the mode_config lists even w/ UMS,
so that the code will just see an empty list. Does anyone see any
problems with that?
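
A minimal sketch of the empty-list idea, assuming a simplified stand-in
for the kernel's struct list_head. The names list_node, fake_crtc and
count_pending are made up for illustration and are not i915 code. The
point is only that walking an initialised but empty circular list is a
no-op, whereas a head that was never initialised has garbage pointers:

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head (assumption:
 * it mimics the idea, it is not the code from include/linux/list.h). */
struct list_node {
	struct list_node *next, *prev;
};

/* Make the head point at itself, i.e. an empty circular list. */
static void list_node_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Link a node in just before the head, i.e. at the tail. */
static void list_node_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical per-CRTC record; only the list linkage matters here. */
struct fake_crtc {
	struct list_node link;
	unsigned int pending_seqno;
};

/* Count entries that have a flip pending. With an initialised, empty
 * head the loop body never runs, so the walk is harmless. */
static int count_pending(struct list_node *head)
{
	int n = 0;
	struct list_node *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct fake_crtc *crtc = (struct fake_crtc *)
			((char *)pos - offsetof(struct fake_crtc, link));

		if (crtc->pending_seqno != 0)
			n++;
	}
	return n;
}
```

Calling count_pending() on a head that has only been through
list_node_init() returns 0 without touching any entry, which is the
behaviour the UMS path would need from the crtc list walk.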

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH v4] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-16 12:51                                 ` Ville Syrjälä
@ 2014-05-19  9:19                                   ` Gupta, Sourab
  2014-05-19 10:58                                   ` [PATCH v5] " sourab.gupta
  1 sibling, 0 replies; 67+ messages in thread
From: Gupta, Sourab @ 2014-05-19  9:19 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx, Goel, Akash

On Fri, 2014-05-16 at 12:51 +0000, Ville Syrjälä wrote:
> On Fri, May 16, 2014 at 12:34:08PM +0000, Gupta, Sourab wrote:
> > On Thu, 2014-05-15 at 12:27 +0000, Ville Syrjälä wrote:
> > > On Thu, May 15, 2014 at 11:47:37AM +0530, sourab.gupta@intel.com wrote:
> > > > From: Sourab Gupta <sourab.gupta@intel.com>
> > > > 
> > > > Using MMIO based flips on Gen5+ for Media power well residency optimization.
> > > > The blitter ring is currently being used just for command streamer based
> > > > flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> > > > of blitter ring and this will ensure the 100% residency for Media well.
> > > > 
> > > > In a subsequent patch, we can make the selection between CS vs MMIO flip
> > > > based on a module parameter to give more testing coverage.
> > > > 
> > > > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > > > flips when target seqno is reached. (Incorporating Ville's idea)
> > > > 
> > > > v3: Rebasing on latest code. Code restructuring after incorporating
> > > > Damien's comments
> > > > 
> > > > v4: Addressing Ville's review comments
> > > >     -general cleanup
> > > >     -updating only base addr instead of calling update_primary_plane
> > > >     -extending patch for gen5+ platforms
> > > > 
> > > > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > > > Signed-off-by: Akash Goel <akash.goel@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_dma.c      |   1 +
> > > >  drivers/gpu/drm/i915/i915_drv.h      |   6 ++
> > > >  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
> > > >  drivers/gpu/drm/i915/i915_irq.c      |   2 +
> > > >  drivers/gpu/drm/i915/intel_display.c | 115 +++++++++++++++++++++++++++++++++++
> > > >  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
> > > >  6 files changed, 131 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > > index 46f1dec..672c28f 100644
> > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > @@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> > > >  	spin_lock_init(&dev_priv->backlight_lock);
> > > >  	spin_lock_init(&dev_priv->uncore.lock);
> > > >  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> > > > +	spin_lock_init(&dev_priv->mmio_flip_lock);
> > > >  	dev_priv->ring_index = 0;
> > > >  	mutex_init(&dev_priv->dpio_lock);
> > > >  	mutex_init(&dev_priv->modeset_restore_lock);
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 4006dfe..38c0820 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -1543,6 +1543,8 @@ struct drm_i915_private {
> > > >  	struct i915_ums_state ums;
> > > >  	/* the indicator for dispatch video commands on two BSD rings */
> > > >  	int ring_index;
> > > > +	/* protects the mmio flip data */
> > > > +	spinlock_t mmio_flip_lock;
> > > >  };
> > > >  
> > > >  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> > > > @@ -2209,6 +2211,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
> > > >  void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
> > > >  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> > > >  				      bool interruptible);
> > > > +int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
> > > >  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
> > > >  {
> > > >  	return unlikely(atomic_read(&error->reset_counter)
> > > > @@ -2580,6 +2583,9 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> > > >  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> > > >  			       struct drm_file *file);
> > > >  
> > > > +void intel_notify_mmio_flip(struct drm_device *dev,
> > > > +			struct intel_ring_buffer *ring);
> > > > +
> > > >  /* overlay */
> > > >  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> > > >  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index fa5b5ab..5b4e953 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
> > > >   * Compare seqno against outstanding lazy request. Emit a request if they are
> > > >   * equal.
> > > >   */
> > > > -static int
> > > > +int
> > > >  i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
> > > >  {
> > > >  	int ret;
> > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > index b10fbde..a353693 100644
> > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > @@ -1084,6 +1084,8 @@ static void notify_ring(struct drm_device *dev,
> > > >  
> > > >  	trace_i915_gem_request_complete(ring);
> > > >  
> > > > +	intel_notify_mmio_flip(dev, ring);
> > > > +
> > > 
> > > Hmm. How badly is this going to explode with UMS?
> > 
> > Hi Ville,
> > It seems there would be a small race between the page flip done intr and
> > the flip done interrupt from previous set base. But it seems to be the
> > case for CS flips also. In both cases, once we do the
> > mark_page_flip_active, there may be a window in which page flip intr
> > from previous set base may arrive.
> > Have we interpreted the race correctly? Or are we missing something
> > here?
> 
> Yes. See here for my patches to fix the mmio vs. CS race:
> http://lists.freedesktop.org/archives/intel-gfx/2014-April/043759.html
> Feel free to review that stuff if you have a bit of time.
> 
> I've not had time to think about the mmio vs. mmio case yet. Perhaps my
> patches would fix that too?
> 
Hi Ville,
We looked at your patch, and it looks like it fixes the mmio vs. mmio
case as well.
We would like to incorporate that idea into our patch too.
So, should we send a second patch to address this race, on top of our
base mmio flip patch? That way the base patch would be independent, and
the second patch would depend on your patch. Is this approach ok? Or
should we fold it into this patch, which would then be built on top of
your patch?
> > 
> > Also, notify_mmio_flip is being called from notify_ring function.
> > Not sure of the scenario in which it may explode with UMS. Can you
> > please elaborate more.
> 
> With UMS we have no modeset structures (drm_crtcs and whatnot). So the
> crtc list walk will probably explode.
> 
> Hmm. I guess we could just init all the mode_config lists even w/ UMS,
> so that the code will just see an empty list. Does anyone see any
> problems with that?
> 

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v5] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-16 12:51                                 ` Ville Syrjälä
  2014-05-19  9:19                                   ` Gupta, Sourab
@ 2014-05-19 10:58                                   ` sourab.gupta
  2014-05-19 11:47                                     ` Ville Syrjälä
  1 sibling, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-05-19 10:58 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on Gen5+ for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

Can we address the condition for race between page flip mmio vs set base mmio
in a separate patch or do we address it in this patch only? In which case, this
patch may be dependent on
http://lists.freedesktop.org/archives/intel-gfx/2014-April/043759.html

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

v5: Addressed Ville's review comments
    -Making mmio flip vs cs flip selection based on module parameter
    -Adding check for DRIVER_MODESET feature in notify_ring before calling
     notify mmio flip.
    -Other changes mostly in function arguments

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   6 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   4 ++
 drivers/gpu/drm/i915/intel_display.c | 113 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 134 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 46f1dec..672c28f 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	dev_priv->ring_index = 0;
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4006dfe..9f1d042 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1543,6 +1543,8 @@ struct drm_i915_private {
 	struct i915_ums_state ums;
 	/* the indicator for dispatch video commands on two BSD rings */
 	int ring_index;
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
 };
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2019,6 +2021,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	bool use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2209,6 +2212,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2580,6 +2584,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_ring_buffer *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fa5b5ab..5b4e953 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+int
 i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b10fbde..31e98e2 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1084,6 +1084,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..e0d44df 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 0,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0f8f9bc..6003068 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9037,6 +9037,108 @@ err:
 	return ret;
 }
 
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_i915_private *dev_priv =
+		intel_crtc->base.dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
+			intel_crtc->dspaddr_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	if (!obj->ring)
+		return 0;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	if (i915_gem_check_olr(obj->ring, obj->last_write_seqno))
+		return -1;
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip_data;
+
+		mmio_flip_data = &intel_crtc->mmio_flip_data;
+
+		if (mmio_flip_data->seqno == 0)
+			continue;
+		if (ring->id != mmio_flip_data->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip_data->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0) {
+		goto err_unpin;
+	} else if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -11377,6 +11479,17 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
+	/* If module parameter is enabled, Use MMIO based flips starting
+	 * from Gen5, for Media power well residency optimization. This is
+	 * not currently being used for older platforms because of
+	 * non-availability of flip done interrupt.
+	 * The other alternative of having Render ring based flip calls is
+	 * not being used, as the performance(FPS) of certain 3D Apps gets
+	 * severely affected.
+	 */
+	if ((i915.use_mmio_flip != 0) && (INTEL_INFO(dev)->gen >= 5))
+		dev_priv->display.queue_flip = intel_queue_mmio_flip;
+
 	intel_panel_init_backlight_funcs(dev);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 32a74e1..08d65a4 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -351,6 +351,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -403,6 +408,7 @@ struct intel_crtc {
 	} wm;
 
 	wait_queue_head_t vbl_wait;
+	struct intel_mmio_flip mmio_flip_data;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1
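
The v5 changelog above turns the postpone check into a three-way result
(<0 error, 0 flip now, >0 wait for the ring). A minimal standalone sketch
of that convention, together with the usual signed-difference idiom for
comparing 32-bit sequence numbers across wraparound. The names
seqno_passed, postpone_flip_decision and the FLIP_* constants are
hypothetical, and the boolean inputs stand in for driver state that the
real code reads from the object and ring:

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "has seq1 reached seq2?" test. The unsigned
 * difference is reinterpreted as signed, so the answer stays correct
 * when the 32-bit counter wraps (assumes two's complement, as on all
 * platforms the kernel supports). */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical names for the three outcomes. */
enum { FLIP_ERROR = -1, FLIP_NOW = 0, FLIP_DEFER = 1 };

/* Pure-function model of the decision (illustrative only):
 *   has_ring         - some ring last wrote to the buffer
 *   current_seqno    - latest seqno the ring has completed
 *   last_write_seqno - seqno of the last write to the buffer
 *   emit_failed      - stand-in for a failure to emit the request
 */
static int postpone_flip_decision(bool has_ring, uint32_t current_seqno,
				  uint32_t last_write_seqno, bool emit_failed)
{
	if (!has_ring)
		return FLIP_NOW;	/* nothing to wait for */

	if (seqno_passed(current_seqno, last_write_seqno))
		return FLIP_NOW;	/* rendering already finished */

	if (emit_failed)
		return FLIP_ERROR;	/* caller should unwind and fail */

	return FLIP_DEFER;		/* flip later, from the ring interrupt */
}
```

A caller following the patch's structure would unpin and return the error
on a negative result, write the surface address immediately on zero, and
otherwise record the target seqno and ring id so that the notify path can
issue the flip once the ring reaches that seqno.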

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH v5] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-19 10:58                                   ` [PATCH v5] " sourab.gupta
@ 2014-05-19 11:47                                     ` Ville Syrjälä
  2014-05-19 12:29                                       ` Daniel Vetter
  0 siblings, 1 reply; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-19 11:47 UTC (permalink / raw)
  To: sourab.gupta; +Cc: intel-gfx, Akash Goel

On Mon, May 19, 2014 at 04:28:58PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Using MMIO based flips on Gen5+ for Media power well residency optimization.
> The blitter ring is currently being used just for command streamer based
> flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> of blitter ring and this will ensure the 100% residency for Media well.
> 
> Can we address the condition for race between page flip mmio vs set base mmio
> in a separate patch or do we address it in this patch only? In which case, this
> patch may be dependent on
> http://lists.freedesktop.org/archives/intel-gfx/2014-April/043759.html
> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> v3: Rebasing on latest code. Code restructuring after incorporating
> Damien's comments
> 
> v4: Addressing Ville's review comments
>     -general cleanup
>     -updating only base addr instead of calling update_primary_plane
>     -extending patch for gen5+ platforms
> 
> v5: Addressed Ville's review comments
>     -Making mmio flip vs cs flip selection based on module parameter
>     -Adding check for DRIVER_MODESET feature in notify_ring before calling
>      notify mmio flip.
>     -Other changes mostly in function arguments
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |   6 ++
>  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
>  drivers/gpu/drm/i915/i915_irq.c      |   3 +
>  drivers/gpu/drm/i915/i915_params.c   |   4 ++
>  drivers/gpu/drm/i915/intel_display.c | 113 +++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
>  7 files changed, 134 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 46f1dec..672c28f 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	spin_lock_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
>  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> +	spin_lock_init(&dev_priv->mmio_flip_lock);
>  	dev_priv->ring_index = 0;
>  	mutex_init(&dev_priv->dpio_lock);
>  	mutex_init(&dev_priv->modeset_restore_lock);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4006dfe..9f1d042 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1543,6 +1543,8 @@ struct drm_i915_private {
>  	struct i915_ums_state ums;
>  	/* the indicator for dispatch video commands on two BSD rings */
>  	int ring_index;
> +	/* protects the mmio flip data */
> +	spinlock_t mmio_flip_lock;
>  };
>  
>  static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
> @@ -2019,6 +2021,7 @@ struct i915_params {
>  	bool reset;
>  	bool disable_display;
>  	bool disable_vtd_wa;
> +	bool use_mmio_flip;
>  };
>  extern struct i915_params i915 __read_mostly;
>  
> @@ -2209,6 +2212,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
>  void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
>  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
>  				      bool interruptible);
> +int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
>  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
>  {
>  	return unlikely(atomic_read(&error->reset_counter)
> @@ -2580,6 +2584,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file);
>  
> +void intel_notify_mmio_flip(struct intel_ring_buffer *ring);
> +
>  /* overlay */
>  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
>  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index fa5b5ab..5b4e953 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   * Compare seqno against outstanding lazy request. Emit a request if they are
>   * equal.
>   */
> -static int
> +int
>  i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
>  {
>  	int ret;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b10fbde..31e98e2 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1084,6 +1084,9 @@ static void notify_ring(struct drm_device *dev,
>  
>  	trace_i915_gem_request_complete(ring);
>  
> +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> +		intel_notify_mmio_flip(ring);
> +

I'm not a fan of such checks.

After a bit of extra thought I got the idea of adding a per-ring
notify_list. So we could have something like this:

struct ring_notify {
	void (*notify)(struct ring_notify *notify);
	struct list_head list;
	u32 seqno;
};

intel_crtc {
	...
	struct ring_notify mmio_flip_notify;
	...
};

I'll probably want something like this for FBC as well, so I guess
we might as well add it from the start. I think you could do this
as two patches; first one adds the ring notify list, second one
implements the mmio flip on top.
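
To make the notify list idea a bit more concrete, here is a rough userspace sketch of how it might behave. It is only an illustration under assumptions: `struct list_node`, `struct ring`, `struct crtc`, `ring_add_notify()` and `ring_run_notifies()` are hypothetical stand-ins (the driver would use `struct list_head`, the real ring structure and proper locking), and the seqno comparison is written out locally rather than calling the driver helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

struct ring_notify {
	void (*notify)(struct ring_notify *notify);
	struct list_node list;
	uint32_t seqno;
};

/* Hypothetical, stripped-down ring: just the per-ring notify list. */
struct ring {
	struct list_node notify_list;
};

/* Hypothetical crtc embedding its notifier, as proposed above. */
struct crtc {
	struct ring_notify mmio_flip_notify;
	int flipped;
};

static void ring_init(struct ring *ring)
{
	ring->notify_list.prev = &ring->notify_list;
	ring->notify_list.next = &ring->notify_list;
}

/* Wraparound-safe check: has seq1 reached (or gone past) seq2? */
static int seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Append a waiter to the tail of the ring's notify list. */
static void ring_add_notify(struct ring *ring, struct ring_notify *n)
{
	struct list_node *head = &ring->notify_list;

	assert(n->notify != NULL);
	n->list.prev = head->prev;
	n->list.next = head;
	head->prev->next = &n->list;
	head->prev = &n->list;
}

/*
 * Would be called from the ring interrupt path: unlink and fire every
 * waiter whose seqno has been reached. Returns how many were fired.
 */
static int ring_run_notifies(struct ring *ring, uint32_t seqno)
{
	struct list_node *pos = ring->notify_list.next;
	int fired = 0;

	while (pos != &ring->notify_list) {
		struct list_node *next = pos->next;
		struct ring_notify *n =
			container_of(pos, struct ring_notify, list);

		if (seqno_passed(seqno, n->seqno)) {
			pos->prev->next = pos->next;
			pos->next->prev = pos->prev;
			n->notify(n);
			fired++;
		}
		pos = next;
	}
	return fired;
}

/* Example callback: this is where the mmio flip would be issued. */
static void crtc_mmio_flip_notify(struct ring_notify *notify)
{
	struct crtc *crtc =
		container_of(notify, struct crtc, mmio_flip_notify);

	crtc->flipped = 1;
}
```

With something along these lines, notify_ring() would only need to walk the list, and neither the flip code nor a future FBC user would have to be named in the interrupt handler.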

>  	wake_up_all(&ring->irq_queue);
>  	i915_queue_hangcheck(dev);
>  }
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index d05a2af..e0d44df 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
>  	.disable_display = 0,
>  	.enable_cmd_parser = 1,
>  	.disable_vtd_wa = 0,
> +	.use_mmio_flip = 0,
>  };
>  
>  module_param_named(modeset, i915.modeset, int, 0400);
> @@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
>  module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
>  MODULE_PARM_DESC(enable_cmd_parser,
>  		 "Enable command parsing (1=enabled [default], 0=disabled)");
> +
> +module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
> +MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");

If we want to enable this by default on VLV, then this should be an int
where -1 would mean per-chip default (i.e. enable on VLV, disable on the
rest).
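
A minimal sketch of how such a tri-state parameter could be resolved, assuming a hypothetical `struct platform` in place of the real INTEL_INFO()/IS_VALLEYVIEW() checks and a hypothetical helper name `use_mmio_flip()`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical platform descriptor, standing in for the device info. */
struct platform {
	int gen;
	bool is_valleyview;
};

/*
 * param follows the convention suggested above:
 *   -1 = per-chip default, 0 = force off, any other value = force on.
 */
static bool use_mmio_flip(const struct platform *p, int param)
{
	assert(p != NULL);

	/* Older platforms have no flip done interrupt (see patch comment). */
	if (p->gen < 5)
		return false;

	/* Per-chip default: only VLV opts in. */
	if (param < 0)
		return p->is_valleyview;

	return param != 0;
}
```

The module parameter would then default to -1, and the init code would only need to ask the helper whether to install the mmio flip hook.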

> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 0f8f9bc..6003068 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9037,6 +9037,108 @@ err:
>  	return ret;
>  }
>  
> +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> +{
> +	struct drm_i915_private *dev_priv =
> +		intel_crtc->base.dev->dev_private;
> +	struct intel_framebuffer *intel_fb =
> +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> +	struct drm_i915_gem_object *obj = intel_fb->obj;
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +
> +	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
> +			intel_crtc->dspaddr_offset);
> +	POSTING_READ(DSPSURF(intel_crtc->plane));
> +}
> +
> +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	if (!obj->ring)
> +		return 0;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> +				obj->last_write_seqno))
> +		return 0;
> +
> +	if (i915_gem_check_olr(obj->ring, obj->last_write_seqno))
> +		return -1;

This should pass the actual error code to the caller.
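
Roughly what I have in mind, as a standalone sketch. The `struct flip_inputs` fields are hypothetical stubs for the real calls (`i915_gem_check_olr()`, `irq_get()` and the seqno test), so treat this as an illustration of the return convention only:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stubs for the values the real function would compute. */
struct flip_inputs {
	int has_ring;       /* obj->ring is set */
	int seqno_passed;   /* rendering has already completed */
	int check_olr_ret;  /* what i915_gem_check_olr() would return */
	int irq_get_ok;     /* ring->irq_get() succeeded */
};

/*
 * Returns a negative errno on failure, 0 if the flip can be issued
 * right away, 1 if it has to wait for the ring interrupt.
 */
static int postpone_flip(const struct flip_inputs *in)
{
	int ret;

	assert(in != NULL);

	if (!in->has_ring || in->seqno_passed)
		return 0;

	ret = in->check_olr_ret;
	if (ret)
		return ret;	/* propagate instead of a bare -1 */

	if (!in->irq_get_ok)
		return 0;

	return 1;
}
```

The caller's existing `ret < 0` check keeps working, and the errno that reaches userspace reflects the real failure instead of -1, which on Linux would read as -EPERM.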

> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +void intel_notify_mmio_flip(struct intel_ring_buffer *ring)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct intel_crtc *intel_crtc;
> +	unsigned long irq_flags;
> +	u32 seqno;
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	for_each_intel_crtc(ring->dev, intel_crtc) {
> +		struct intel_mmio_flip *mmio_flip_data;
> +
> +		mmio_flip_data = &intel_crtc->mmio_flip_data;
> +
> +		if (mmio_flip_data->seqno == 0)
> +			continue;
> +		if (ring->id != mmio_flip_data->ring_id)
> +			continue;
> +
> +		if (i915_seqno_passed(seqno, mmio_flip_data->seqno)) {
> +			intel_do_mmio_flip(intel_crtc);
> +			mmio_flip_data->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +}
> +
> +static int intel_queue_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc,
> +			struct drm_framebuffer *fb,
> +			struct drm_i915_gem_object *obj,
> +			uint32_t flags)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned long irq_flags;
> +	int ret;
> +
> +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> +	if (ret)
> +		goto err;
> +
> +	ret = intel_postpone_flip(obj);
> +	if (ret < 0) {
> +		goto err_unpin;
> +	} else if (ret == 0) {
> +		intel_do_mmio_flip(intel_crtc);
> +		return 0;
> +	}
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
> +	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +
> +	/* Double check to catch cases where irq fired before
> +	 * mmio flip data was ready
> +	 */
> +	intel_notify_mmio_flip(obj->ring);
> +	return 0;
> +
> +err_unpin:
> +	intel_unpin_fb_obj(obj);
> +err:
> +	return ret;
> +}
> +
>  static int intel_gen7_queue_flip(struct drm_device *dev,
>  				 struct drm_crtc *crtc,
>  				 struct drm_framebuffer *fb,
> @@ -11377,6 +11479,17 @@ static void intel_init_display(struct drm_device *dev)
>  		break;
>  	}
>  
> +	/* If module parameter is enabled, Use MMIO based flips starting
> +	 * from Gen5, for Media power well residency optimization. This is
> +	 * not currently being used for older platforms because of
> +	 * non-availability of flip done interrupt.
> +	 * The other alternative of having Render ring based flip calls is
> +	 * not being used, as the performance(FPS) of certain 3D Apps gets
> +	 * severely affected.

The comments here are rather VLV specific. They would be more
appropriate here if you actually enabled this by default on VLV.
Now they just seem a bit misplaced.

So I suggest you add another patch on top that enables the feature
by default on VLV and then add the comments there.

> +	 */
> +	if ((i915.use_mmio_flip != 0) && (INTEL_INFO(dev)->gen >= 5))

This has too many parens for my taste.

Also to reduce clutter it might be nice to move all these checks into
a small function, e.g. 'bool use_mmio_flip()'. Especially if you add the
"default to mmio flip on VLV" patch on top.

> +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> +
>  	intel_panel_init_backlight_funcs(dev);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 32a74e1..08d65a4 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -351,6 +351,11 @@ struct intel_pipe_wm {
>  	bool sprites_scaled;
>  };
>  
> +struct intel_mmio_flip {
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  struct intel_crtc {
>  	struct drm_crtc base;
>  	enum pipe pipe;
> @@ -403,6 +408,7 @@ struct intel_crtc {
>  	} wm;
>  
>  	wait_queue_head_t vbl_wait;
> +	struct intel_mmio_flip mmio_flip_data;
>  };
>  
>  struct intel_plane_wm_parameters {
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH v5] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-19 11:47                                     ` Ville Syrjälä
@ 2014-05-19 12:29                                       ` Daniel Vetter
  2014-05-19 13:06                                         ` Ville Syrjälä
  0 siblings, 1 reply; 67+ messages in thread
From: Daniel Vetter @ 2014-05-19 12:29 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Akash Goel, sourab.gupta, intel-gfx

On Mon, May 19, 2014 at 02:47:20PM +0300, Ville Syrjälä wrote:
> On Mon, May 19, 2014 at 04:28:58PM +0530, sourab.gupta@intel.com wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index b10fbde..31e98e2 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1084,6 +1084,9 @@ static void notify_ring(struct drm_device *dev,
> >  
> >  	trace_i915_gem_request_complete(ring);
> >  
> > +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> > +		intel_notify_mmio_flip(ring);
> > +
> 
> I'm not a fan of such checks.
> 
> After a bit of extra thought I got the idea of adding a per-ring
> notify_list. So we could have something like this:
> 
> struct ring_notify {
> 	void (*notify)(struct ring_notify *notify);
> 	struct list_head list;
> 	u32 seqno;
> };
> 
> intel_crtc {
> 	...
> 	struct ring_notify mmio_flip_notify;
> 	...
> };
> 
> I'll probably want something like this for FBC as well, so I guess
> we might as well add it from the start. I think you could do this
> as two patches; first one adds the ring notify list, second one
> implements the mmio flip on top.

This feels like massive overkill, at least for now. So imo the check is
ok. What I do wonder about is whether we really want to run this in
interrupt context. With atomic sprite updates we need to do the vblank
avoidance, and that really shouldn't happen from interrupt context.

So why can't we just latch a work item which uses all the established
seqno waiting stuff and so avoids all these issues (and a pile of
duplicated code)?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH v5] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-19 12:29                                       ` Daniel Vetter
@ 2014-05-19 13:06                                         ` Ville Syrjälä
  2014-05-19 13:41                                           ` Daniel Vetter
  0 siblings, 1 reply; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-19 13:06 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Akash Goel, sourab.gupta, intel-gfx

On Mon, May 19, 2014 at 02:29:09PM +0200, Daniel Vetter wrote:
> On Mon, May 19, 2014 at 02:47:20PM +0300, Ville Syrjälä wrote:
> > On Mon, May 19, 2014 at 04:28:58PM +0530, sourab.gupta@intel.com wrote:
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index b10fbde..31e98e2 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -1084,6 +1084,9 @@ static void notify_ring(struct drm_device *dev,
> > >  
> > >  	trace_i915_gem_request_complete(ring);
> > >  
> > > +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> > > +		intel_notify_mmio_flip(ring);
> > > +
> > 
> > I'm not a fan of such checks.
> > 
> > After a bit of extra thought I got the idea of adding a per-ring
> > notify_list. So we could have something like this:
> > 
> > struct ring_notify {
> > 	void (*notify)(struct ring_notify *notify);
> > 	struct list_head list;
> > 	u32 seqno;
> > };
> > 
> > intel_crtc {
> > 	...
> > 	struct ring_notify mmio_flip_notify;
> > 	...
> > };
> > 
> > I'll probably want something like this for FBC as well, so I guess
> > we might as well add it from the start. I think you could do this
> > as two patches; first one adds the ring notify list, second one
> > implements the mmio flip on top.
> 
> This feels like massive overkill, at least for now. So imo the check is
> ok.

OK. I guess we can start with the simple check then.

> What I do wonder about is whether we really want to run this in
> interrupt context. With atomic sprite updates we need to do the vblank
> avoidance, and that really shouldn't happen from interrupt context.

That should be a fairly simple change to do at any time. But perhaps we 
want to do it from the start.

> 
> So why can't we just latch a work item which uses all the established
> seqno waiting stuff and so avoids all these issues

I hate blocking and waiting for stuff. It usually means all kinds of lock
dropping tricks, and extra hurdles if you want to cancel the operation.
My brain just can't deal with it. I much prefer a nice and simple event
based mechanism.

> (and a pile of  duplicated code)?

Which code is duplicated?

-- 
Ville Syrjälä
Intel OTC

* Re: [PATCH v5] drm/i915: Replaced Blitter ring based flips with MMIO flips for VLV
  2014-05-19 13:06                                         ` Ville Syrjälä
@ 2014-05-19 13:41                                           ` Daniel Vetter
  2014-05-20 10:49                                             ` [PATCH 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
  0 siblings, 1 reply; 67+ messages in thread
From: Daniel Vetter @ 2014-05-19 13:41 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Akash Goel, Gupta, Sourab, intel-gfx

On Mon, May 19, 2014 at 3:06 PM, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>> So why can't we just latch a work item which uses all the established
>> seqno waiting stuff and so avoids all these issues
>
> I hate blocking and waiting for stuff. It usually means all kinds of lock
> dropping tricks, and extra hurdles if you want to cancel the operation.
> My brain just can't deal with it. I much prefer a nice and simple event
> based mechanism.

Hm, I'm the other way round - I prefer stack based state machines over
continuation passing ...

>> (and a pile of  duplicated code)?
>
> Which code is duplicated?

Handling gpu hangs essentially. If we just latch a work item we can
reuse all the wait and notify logic we already have. And in the past
we've had divergent changes in this area.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* [PATCH 0/3] Replace Blitter ring based flips with MMIO flips
  2014-05-19 13:41                                           ` Daniel Vetter
@ 2014-05-20 10:49                                             ` sourab.gupta
  2014-05-20 10:49                                               ` [PATCH v6 1/3] drm/i915: Replaced " sourab.gupta
                                                                 ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-20 10:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

Sourab Gupta (3):
  drm/i915: Replaced Blitter ring based flips with MMIO flips
  drm/i915: Default to mmio flips on VLV
  drm/i915: Fix mmio page flip vs mmio set base race

 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   6 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   5 ++
 drivers/gpu/drm/i915/intel_display.c | 135 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 157 insertions(+), 1 deletion(-)

-- 
1.8.5.1

* [PATCH v6 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-20 10:49                                             ` [PATCH 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
@ 2014-05-20 10:49                                               ` sourab.gupta
  2014-05-20 11:59                                                 ` Chris Wilson
  2014-05-20 10:49                                               ` [PATCH 2/3] drm/i915: Default to mmio flips on VLV sourab.gupta
  2014-05-20 10:49                                               ` [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race sourab.gupta
  2 siblings, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-05-20 10:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on Gen5+ for Media power well residency optimization.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

v5: Addressed Ville's review comments
    -Making mmio flip vs cs flip selection based on module parameter
    -Adding check for DRIVER_MODESET feature in notify_ring before calling
     notify mmio flip.
    -Other changes mostly in function arguments

v6: -Having a separate function to check condition for using mmio flips (Ville)
    -propagating error code from i915_gem_check_olr (Ville)

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   6 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   4 ++
 drivers/gpu/drm/i915/intel_display.c | 123 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 144 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 46f1dec..672c28f 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1570,6 +1570,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	dev_priv->ring_index = 0;
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4006dfe..9f1d042 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1543,6 +1543,8 @@ struct drm_i915_private {
 	struct i915_ums_state ums;
 	/* the indicator for dispatch video commands on two BSD rings */
 	int ring_index;
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
 };
 
 static inline struct drm_i915_private *to_i915(const struct drm_device *dev)
@@ -2019,6 +2021,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	bool use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2209,6 +2212,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2580,6 +2584,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_ring_buffer *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fa5b5ab..5b4e953 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -975,7 +975,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+int
 i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b10fbde..31e98e2 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1084,6 +1084,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..e0d44df 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 0,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0f8f9bc..d8bc30b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9037,6 +9037,126 @@ err:
 	return ret;
 }
 
+static bool intel_use_mmio_flip(struct drm_device *dev)
+{
+	/* If module parameter is disabled, use CS flips.
+	 * Otherwise, use MMIO flips starting from Gen5.
+	 * This is not being used for older platforms because of
+	 * non-availability of flip done interrupt.
+	 */
+	if (i915.use_mmio_flip == 0)
+		return false;
+
+	if (INTEL_INFO(dev)->gen >= 5)
+		return true;
+	else
+		return false;
+}
+
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_i915_private *dev_priv =
+		intel_crtc->base.dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
+			intel_crtc->dspaddr_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	if (!obj->ring)
+		return 0;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	if (ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno))
+		return ret;
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip_data;
+
+		mmio_flip_data = &intel_crtc->mmio_flip_data;
+
+		if (mmio_flip_data->seqno == 0)
+			continue;
+		if (ring->id != mmio_flip_data->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip_data->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip_data->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0) {
+		goto err_unpin;
+	} else if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip_data.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip_data.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_gen7_queue_flip(struct drm_device *dev,
 				 struct drm_crtc *crtc,
 				 struct drm_framebuffer *fb,
@@ -11377,6 +11497,9 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
+	if(intel_use_mmio_flip(dev))
+		dev_priv->display.queue_flip = intel_queue_mmio_flip;
+
 	intel_panel_init_backlight_funcs(dev);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 32a74e1..08d65a4 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -351,6 +351,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -403,6 +408,7 @@ struct intel_crtc {
 	} wm;
 
 	wait_queue_head_t vbl_wait;
+	struct intel_mmio_flip mmio_flip_data;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1


* [PATCH 2/3] drm/i915: Default to mmio flips on VLV
  2014-05-20 10:49                                             ` [PATCH 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
  2014-05-20 10:49                                               ` [PATCH v6 1/3] drm/i915: Replaced " sourab.gupta
@ 2014-05-20 10:49                                               ` sourab.gupta
  2014-05-20 10:49                                               ` [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race sourab.gupta
  2 siblings, 0 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-20 10:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch is for using mmio flips by default on VLV.
The module parameter controlling use of MMIO flips allows us to
control the default behaviour, which is set true for VLV and false
elsewhere.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_params.c   |  5 +++--
 drivers/gpu/drm/i915/intel_display.c | 12 +++++++++++-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index e0d44df..a99accc 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,7 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
-	.use_mmio_flip = 0,
+	.use_mmio_flip = -1,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -159,4 +159,5 @@ MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
 
 module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
-MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO page flips "
+		"(default: -1 (use per-chip default))");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index d8bc30b..21f1fa5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9043,10 +9043,20 @@ static bool intel_use_mmio_flip(struct drm_device *dev)
 	 * Otherwise, use MMIO flips starting from Gen5.
 	 * This is not being used for older platforms because of
 	 * non-availability of flip done interrupt.
+	 * On Valleyview, use MMIO flips by default, for Media Power Well
+	 * residency optimization. The other alternative of having Render
+	 * ring based flip calls is not being used, as the performance(FPS)
+	 * of certain 3D Apps gets severly affected.
 	 */
+
 	if (i915.use_mmio_flip == 0)
 		return false;
-
+	if (i915.use_mmio_flip == -1) {
+		if(IS_VALLEYVIEW(dev))
+			return true;
+		else
+			return false;
+	}
 	if (INTEL_INFO(dev)->gen >= 5)
 		return true;
 	else
-- 
1.8.5.1


* [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race
  2014-05-20 10:49                                             ` [PATCH 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
  2014-05-20 10:49                                               ` [PATCH v6 1/3] drm/i915: Replaced " sourab.gupta
  2014-05-20 10:49                                               ` [PATCH 2/3] drm/i915: Default to mmio flips on VLV sourab.gupta
@ 2014-05-20 10:49                                               ` sourab.gupta
  2 siblings, 0 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-20 10:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch fixes the race condition between flip done interrupt
from set base and mmio based page flip.

This patch is dependent on
http://lists.freedesktop.org/archives/intel-gfx/2014-April/043761.html

Also, for the details of the race condition please refer to the mentioned
patch.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 21f1fa5..8e85d6c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9073,8 +9073,7 @@ static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
 
 	intel_mark_page_flip_active(intel_crtc);
 
-	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
-			intel_crtc->dspaddr_offset);
+	I915_WRITE(DSPSURF(intel_crtc->plane), intel_crtc->unpin_work->gtt_offset);
 	POSTING_READ(DSPSURF(intel_crtc->plane));
 }
 
@@ -9142,6 +9141,9 @@ static int intel_queue_mmio_flip(struct drm_device *dev,
 	if (ret)
 		goto err;
 
+	intel_crtc->unpin_work->gtt_offset =
+		i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
+
 	ret = intel_postpone_flip(obj);
 	if (ret < 0) {
 		goto err_unpin;
-- 
1.8.5.1


* Re: [PATCH v6 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-20 10:49                                               ` [PATCH v6 1/3] drm/i915: Replaced " sourab.gupta
@ 2014-05-20 11:59                                                 ` Chris Wilson
  2014-05-20 18:01                                                   ` Gupta, Sourab
  0 siblings, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-05-20 11:59 UTC (permalink / raw)
  To: sourab.gupta; +Cc: intel-gfx, Akash Goel

On Tue, May 20, 2014 at 04:19:46PM +0530, sourab.gupta@intel.com wrote:
> +int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);

Be strict and add __must_check

> +static bool intel_use_mmio_flip(struct drm_device *dev)
> +{
> +	/* If module parameter is disabled, use CS flips.
> +	 * Otherwise, use MMIO flips starting from Gen5.
> +	 * This is not being used for older platforms because of
> +	 * non-availability of flip done interrupt.
> +	 */

What? Where is the dependence on flip-done?

> +	if (i915.use_mmio_flip == 0)
> +		return false;
> +
> +	if (INTEL_INFO(dev)->gen >= 5)
> +		return true;
> +	else
> +		return false;

You have not justified the change in default settings for existing hw.
Your argument is based on media power wells which does not support the
general change. It would seem that we may want to mix mmio / CS flips
depending on workload based on your vague statements.

I quite fancy a tristate here for force-CS flips, force-MMIO flips, at
driver discretion. Then enabling it on an architecture as a separate
patch with justification - it is then easier to do each architecture on
a case-by-case basis and revert if need be.
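
A minimal sketch of such a tristate, under stated assumptions: the enum names,
the helper use_mmio_flip() and its arguments are hypothetical and do not come
from the patch. The parameter is modeled as an int, since a C bool cannot hold
-1 (any nonzero value converts to true). The gen >= 5 floor follows the
restriction discussed in this thread.

```c
#include <stdbool.h>

/* Hypothetical tristate values for a use_mmio_flip-style parameter. */
enum flip_mode {
	FLIP_AUTO = -1,      /* driver decides per platform */
	FLIP_FORCE_CS = 0,   /* always use command streamer flips */
	FLIP_FORCE_MMIO = 1, /* always use MMIO flips */
};

/*
 * param:                 value of the module parameter (an int, not a bool)
 * gen:                   hardware generation
 * platform_prefers_mmio: per-platform default, enabled one platform at a time
 */
static bool use_mmio_flip(int param, int gen, bool platform_prefers_mmio)
{
	/* MMIO flips are only considered on gen5 and later. */
	if (gen < 5)
		return false;

	if (param == FLIP_FORCE_CS)
		return false;
	if (param == FLIP_FORCE_MMIO)
		return true;

	/* FLIP_AUTO: leave the choice to the driver. */
	return platform_prefers_mmio;
}
```

Keeping the per-platform default as a separate input means each architecture
can be switched over, or reverted, without touching the parameter handling.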

> +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	int ret;

if (WARN_ON(crtc->mmio_flip_data.seqno)) return -EBUSY;

You need a tiling check here as you do not update dspcntr. Or fix
mmio_done.

> +	if (!obj->ring)
> +		return 0;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> +				obj->last_write_seqno))
> +		return 0;
> +
> +	if (ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno))
> +		return ret;

Please don't anger gcc.
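
The remark presumably refers to the assignment used directly as the if
condition, which GCC reports under -Wparentheses (part of -Wall). A minimal
sketch of the usual form, with check_request() and postpone() as hypothetical
stand-ins rather than the driver's functions:

```c
/* Hypothetical stand-in for a call that returns 0 or a negative error code. */
static int check_request(int seqno)
{
	return seqno < 0 ? -22 : 0;
}

static int postpone(int seqno)
{
	int ret;

	/* Assign first, then test, instead of "if (ret = check_request(...))". */
	ret = check_request(seqno);
	if (ret)
		return ret;

	return 1;
}
```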

> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return 0;
> +
> +	return 1;
> +}

> @@ -11377,6 +11497,9 @@ static void intel_init_display(struct drm_device *dev)
>  		break;
>  	}
>  
> +	if(intel_use_mmio_flip(dev))

Please don't anger checkpatch.

> +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> +
>  	intel_panel_init_backlight_funcs(dev);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 32a74e1..08d65a4 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -351,6 +351,11 @@ struct intel_pipe_wm {
>  	bool sprites_scaled;
>  };
>  
> +struct intel_mmio_flip {
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  struct intel_crtc {
>  	struct drm_crtc base;
>  	enum pipe pipe;
> @@ -403,6 +408,7 @@ struct intel_crtc {
>  	} wm;
>  
>  	wait_queue_head_t vbl_wait;
> +	struct intel_mmio_flip mmio_flip_data;

Does _data add anything meaningful here to the description of mmio_flip?
Just mmio_flip will suffice, as pending_mmio_flip is overkill but would
make a useful comment.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH v6 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-20 11:59                                                 ` Chris Wilson
@ 2014-05-20 18:01                                                   ` Gupta, Sourab
  2014-05-22 14:36                                                     ` [PATCH v2 0/3] Replace " sourab.gupta
  0 siblings, 1 reply; 67+ messages in thread
From: Gupta, Sourab @ 2014-05-20 18:01 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Goel, Akash

On Tue, 2014-05-20 at 11:59 +0000, Chris Wilson wrote:
> On Tue, May 20, 2014 at 04:19:46PM +0530, sourab.gupta@intel.com wrote:
> > +int i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
> 
> Be strict and add __must_check
> 
We'll add this.

> > +static bool intel_use_mmio_flip(struct drm_device *dev)
> > +{
> > +	/* If module parameter is disabled, use CS flips.
> > +	 * Otherwise, use MMIO flips starting from Gen5.
> > +	 * This is not being used for older platforms because of
> > +	 * non-availability of flip done interrupt.
> > +	 */
> 
> What? Where is the dependence on flip-done?

Hi Chris,
From an earlier mail by Ville:
"It should work on gen5+ since all of those have a flip done interrupt.
For older platforms we use some clever tricks involving the flip_pending
status bits and vblank irqs. That code won't work for mmio flips. We'd
need to add another way to complete the flips based. That would involve
using the frame counter to make it accurate. To avoid races there we'd
definitely need to use the vblank evade mechanism to make sure we sample
the frame counter within the same frame as when we write the registers.
Also gen2 has the extra complication that it lacks a hardware frame
counter."
So, we had put the Gen5+ check here.

> 
> > +	if (i915.use_mmio_flip == 0)
> > +		return false;
> > +
> > +	if (INTEL_INFO(dev)->gen >= 5)
> > +		return true;
> > +	else
> > +		return false;
> 
> You have not justified the change in default settings for existing hw.
> Your argument is based on media power wells which does not support the
> general change. It would seem that we may want to mix mmio / CS flips
> depending on workload based on your vague statements.
> 

> I quite fancy a tristate here for force-CS flips, force-MMIO flips, at
> driver discretion. Then enabling it on an architecture as a separate
> patch with justification - it is then easier to do each architecture on
> a case-by-case basis and revert if need be.
> 
We agree that using mmio flips gives better residency in cases where
render and blitter engines reside in different power wells. This is
helpful in case of pure 3D workloads on valleyview. We have enabled it
in the second patch of the series for valleyview.
This patch has put forth 2 states - 0 for force-CS flips and 1 for
force-MMIO flips. The second patch in this series enables it for the
Valleyview architecture.

> > +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> > +{
> > +	int ret;
> 
> if (WARN_ON(crtc->mmio_flip_data.seqno)) return -EBUSY;
> 
> You need a tiling check here as you do not update dspcntr. Or fix
> mmio_done.
> 
We were not updating dspcntr here because of atomicity concerns. We could
add a tiling update as well if it's ok in that regard.
Ville, what's your opinion here?
> > +	if (!obj->ring)
> > +		return 0;
> > +
> > +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> > +				obj->last_write_seqno))
> > +		return 0;
> > +
> > +	if (ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno))
> > +		return ret;
> 
> Please don't anger gcc.
> 
> > +
> > +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> > +		return 0;
> > +
> > +	return 1;
> > +}
> 
> > @@ -11377,6 +11497,9 @@ static void intel_init_display(struct drm_device *dev)
> >  		break;
> >  	}
> >  
> > +	if(intel_use_mmio_flip(dev))
> 
> Please don't anger checkpatch.
> 
> > +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> > +
> >  	intel_panel_init_backlight_funcs(dev);
> >  }
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index 32a74e1..08d65a4 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -351,6 +351,11 @@ struct intel_pipe_wm {
> >  	bool sprites_scaled;
> >  };
> >  
> > +struct intel_mmio_flip {
> > +	u32 seqno;
> > +	u32 ring_id;
> > +};
> > +
> >  struct intel_crtc {
> >  	struct drm_crtc base;
> >  	enum pipe pipe;
> > @@ -403,6 +408,7 @@ struct intel_crtc {
> >  	} wm;
> >  
> >  	wait_queue_head_t vbl_wait;
> > +	struct intel_mmio_flip mmio_flip_data;
> 
> Does _data add anything meaningful here to the description of mmio_flip?
> Just mmio_flip will suffice, as pending_mmio_flip is overkill but would
> make a useful comment.
> -Chris


* [PATCH v2 0/3] Replace Blitter ring based flips with MMIO flips
  2014-05-20 18:01                                                   ` Gupta, Sourab
@ 2014-05-22 14:36                                                     ` sourab.gupta
  2014-05-22 14:36                                                       ` [PATCH v7 1/3] drm/i915: Replaced " sourab.gupta
                                                                         ` (3 more replies)
  0 siblings, 4 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-22 14:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch series replaces Blitter ring based flips with MMIO based flips.
This is useful for Media power well residency optimization. These may be
enabled on architectures where Render and Blitter engines reside in different
power wells.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads, with MMIO flips, there will be no use
of blitter ring and this will ensure the 100% residency for Media well.

Sourab Gupta (3):
  drm/i915: Replaced Blitter ring based flips with MMIO flips
  drm/i915: Default to mmio flips on VLV
  drm/i915: Fix mmio page flip vs mmio set base race

 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   6 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   5 ++
 drivers/gpu/drm/i915/intel_display.c | 135 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 157 insertions(+), 1 deletion(-)

-- 
1.8.5.1


* [PATCH v7 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-22 14:36                                                     ` [PATCH v2 0/3] Replace " sourab.gupta
@ 2014-05-22 14:36                                                       ` sourab.gupta
  2014-05-27 12:52                                                         ` Ville Syrjälä
  2014-05-22 14:36                                                       ` [PATCH 2/3] drm/i915: Default to mmio flips on VLV sourab.gupta
                                                                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-05-22 14:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Using MMIO based flips on Gen5+. The MMIO flips are useful for the Media power
well residency optimization. These may be enabled on architectures where
Render and Blitter engines reside in different power wells.
The blitter ring is currently being used just for command streamer based
flip calls. For pure 3D workloads in such cases, with MMIO flips, there will
be no use of blitter ring and this will ensure the 100% residency for Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

v5: Addressed Ville's review comments
    -Making mmio flip vs cs flip selection based on module parameter
    -Adding check for DRIVER_MODESET feature in notify_ring before calling
     notify mmio flip.
    -Other changes mostly in function arguments

v6: -Having a separate function to check condition for using mmio flips (Ville)
    -propagating error code from i915_gem_check_olr (Ville)

v7: -Adding __must_check with i915_gem_check_olr (Chris)
    -Renaming mmio_flip_data to mmio_flip (Chris)
    -Rebasing on latest nightly

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   7 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   4 ++
 drivers/gpu/drm/i915/intel_display.c | 133 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 155 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 20df7c72..266c9a6 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1571,6 +1571,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 13495a4..ced6e58 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1366,6 +1366,9 @@ struct drm_i915_private {
 	/* protects the irq masks */
 	spinlock_t irq_lock;
 
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
+
 	bool display_irqs_enabled;
 
 	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
@@ -2036,6 +2039,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	bool use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2230,6 +2234,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int __must_check i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2598,6 +2603,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_ring_buffer *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f2713b9..bc6fe4e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+__must_check int
 i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2043276..f244b23 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1155,6 +1155,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..e0d44df 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 0,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 19b92c1..a29552d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9179,6 +9179,136 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	return 0;
 }
 
+static bool intel_use_mmio_flip(struct drm_device *dev)
+{
+	/* If module parameter is disabled, use CS flips.
+	 * Otherwise, use MMIO flips starting from Gen5.
+	 * This is not being used for older platforms, because
+	 * non-availability of flip done interrupt forces us to use
+	 * CS flips. Older platforms derive flip done using some clever
+	 * tricks involving the flip_pending status bits and vblank irqs.
+	 * So using MMIO flips there would disrupt this mechanism.
+	 */
+
+	if (i915.use_mmio_flip == 0)
+		return false;
+
+	if (INTEL_INFO(dev)->gen >= 5)
+		return true;
+	else
+		return false;
+}
+
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_i915_private *dev_priv =
+		intel_crtc->base.dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
+			intel_crtc->dspaddr_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	if (!obj->ring)
+		return 0;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
+	if (ret)
+		return ret;
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_ring_buffer *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip;
+
+		mmio_flip = &intel_crtc->mmio_flip;
+
+		if (mmio_flip->seqno == 0)
+			continue;
+		if (ring->id != mmio_flip->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
+	if (ret)
+		goto err;
+
+	if (WARN_ON(intel_crtc->mmio_flip.seqno)) {
+		ret = -EBUSY;
+		goto err_unpin;
+	}
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0) {
+		goto err_unpin;
+	} else if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+
+err_unpin:
+	intel_unpin_fb_obj(obj);
+err:
+	return ret;
+}
+
 static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
@@ -11470,6 +11600,9 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
+	if (intel_use_mmio_flip(dev))
+		dev_priv->display.queue_flip = intel_queue_mmio_flip;
+
 	intel_panel_init_backlight_funcs(dev);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 287b89e..5a4f60c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -358,6 +358,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -409,6 +414,7 @@ struct intel_crtc {
 	} wm;
 
 	wait_queue_head_t vbl_wait;
+	struct intel_mmio_flip mmio_flip;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1
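
The seqno handling in intel_notify_mmio_flip() above can be sketched outside the driver roughly as follows. This is only an illustration: struct pending_flip and flip_ready() are hypothetical names, and the signed-difference comparison is an assumption about how a helper like i915_seqno_passed() stays correct across 32-bit wraparound, since its body is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "has seq1 reached seq2?" check (assumed idiom). */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical per-CRTC slot; seqno == 0 means "no flip pending". */
struct pending_flip {
	uint32_t seqno;
	uint32_t ring_id;
};

/*
 * Returns true when the pending flip should be issued now, and clears
 * the slot so the same flip is never issued twice.
 */
static bool flip_ready(struct pending_flip *f, uint32_t ring_id,
		       uint32_t ring_seqno)
{
	if (f->seqno == 0)
		return false;	/* nothing queued on this CRTC */
	if (f->ring_id != ring_id)
		return false;	/* interrupt came from another ring */
	if (!seqno_passed(ring_seqno, f->seqno))
		return false;	/* rendering not finished yet */

	f->seqno = 0;
	return true;
}
```

In the patch the equivalent checks run under mmio_flip_lock, and the queue path calls the notify function once more after filling the slot so that an interrupt which fired early is not lost.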


* [PATCH 2/3] drm/i915: Default to mmio flips on VLV
  2014-05-22 14:36                                                     ` [PATCH v2 0/3] Replace " sourab.gupta
  2014-05-22 14:36                                                       ` [PATCH v7 1/3] drm/i915: Replaced " sourab.gupta
@ 2014-05-22 14:36                                                       ` sourab.gupta
  2014-05-22 14:36                                                       ` [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race sourab.gupta
  2014-05-26  8:51                                                       ` [PATCH v2 0/3] Replace Blitter ring based flips with MMIO flips Gupta, Sourab
  3 siblings, 0 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-22 14:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Use MMIO flips by default on VLV.
The use_mmio_flip module parameter now defaults to -1 (use the per-chip
default), which resolves to true on VLV and false elsewhere. Setting the
parameter explicitly still overrides the per-chip default.
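
A minimal standalone sketch of the selection logic described above. The struct platform type and its fields are hypothetical stand-ins for the driver's IS_VALLEYVIEW() and gen checks. The sketch assumes the parameter is a signed int, because a C bool cannot hold -1 (it would convert to true) and the per-chip branch would then never be taken.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's platform information. */
struct platform {
	int gen;	/* hardware generation */
	bool is_vlv;	/* true on Valleyview */
};

/*
 * param: -1 = per-chip default, 0 = force CS flips,
 *        any other value = request MMIO flips where supported.
 */
static bool use_mmio_flip(int param, const struct platform *p)
{
	if (param == 0)
		return false;

	/* Per-chip default: only Valleyview opts in. */
	if (param == -1)
		return p->is_vlv;

	/* Explicitly requested: supported from Gen5 onwards. */
	return p->gen >= 5;
}
```

With this shape, use_mmio_flip(-1, ...) is true only on Valleyview, while an explicit non-zero value enables MMIO flips on any Gen5+ part.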

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_params.c   |  5 +++--
 drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index e0d44df..a99accc 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,7 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
-	.use_mmio_flip = 0,
+	.use_mmio_flip = -1,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -159,4 +159,5 @@ MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
 
 module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
-MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO page flips "
+		"(default: -1 (use per-chip default))");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a29552d..b3e7fc6 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9193,6 +9193,18 @@ static bool intel_use_mmio_flip(struct drm_device *dev)
 	if (i915.use_mmio_flip == 0)
 		return false;
 
+	/* On Valleyview, use MMIO flips by default, for Media Power Well
+	 * residency optimization. The other alternative of having Render
+	 * ring based flip calls is not being used, as the performance(FPS)
+	 * of certain 3D Apps gets severely affected.
+	 */
+	if (i915.use_mmio_flip == -1) {
+		if (IS_VALLEYVIEW(dev))
+			return true;
+		else
+			return false;
+	}
+
 	if (INTEL_INFO(dev)->gen >= 5)
 		return true;
 	else
-- 
1.8.5.1


* [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race
  2014-05-22 14:36                                                     ` [PATCH v2 0/3] Replace " sourab.gupta
  2014-05-22 14:36                                                       ` [PATCH v7 1/3] drm/i915: Replaced " sourab.gupta
  2014-05-22 14:36                                                       ` [PATCH 2/3] drm/i915: Default to mmio flips on VLV sourab.gupta
@ 2014-05-22 14:36                                                       ` sourab.gupta
  2014-05-26  8:51                                                       ` [PATCH v2 0/3] Replace Blitter ring based flips with MMIO flips Gupta, Sourab
  3 siblings, 0 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-22 14:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Fix the race between the flip done interrupt raised by a set base
operation and an MMIO based page flip.

This patch depends on
http://lists.freedesktop.org/archives/intel-gfx/2014-April/043761.html

Refer to that patch for the details of the race condition.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b3e7fc6..0099c56 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9221,8 +9221,8 @@ static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
 
 	intel_mark_page_flip_active(intel_crtc);
 
-	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
-			intel_crtc->dspaddr_offset);
+	I915_WRITE(DSPSURF(intel_crtc->plane),
+			intel_crtc->unpin_work->gtt_offset);
 	POSTING_READ(DSPSURF(intel_crtc->plane));
 }
 
@@ -9296,6 +9296,10 @@ static int intel_queue_mmio_flip(struct drm_device *dev,
 		goto err_unpin;
 	}
 
+	intel_crtc->unpin_work->gtt_offset =
+		i915_gem_obj_ggtt_offset(obj) +
+		intel_crtc->dspaddr_offset;
+
 	ret = intel_postpone_flip(obj);
 	if (ret < 0) {
 		goto err_unpin;
-- 
1.8.5.1


* Re: [PATCH v2 0/3] Replace Blitter ring based flips with MMIO flips
  2014-05-22 14:36                                                     ` [PATCH v2 0/3] Replace " sourab.gupta
                                                                         ` (2 preceding siblings ...)
  2014-05-22 14:36                                                       ` [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race sourab.gupta
@ 2014-05-26  8:51                                                       ` Gupta, Sourab
  3 siblings, 0 replies; 67+ messages in thread
From: Gupta, Sourab @ 2014-05-26  8:51 UTC (permalink / raw)
  To: intel-gfx; +Cc: Goel, Akash

On Thu, 2014-05-22 at 14:36 +0000, Gupta, Sourab wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> This patch series replaces Blitter ring based flips with MMIO based flips.
> This is useful for Media power well residency optimization. These may be
> enabled on architectures where Render and Blitter engines reside in different
> power wells.
> The blitter ring is currently being used just for command streamer based
> flip calls. For pure 3D workloads, with MMIO flips, there will be no use
> of blitter ring and this will ensure the 100% residency for Media well.
> 
> Sourab Gupta (3):
>   drm/i915: Replaced Blitter ring based flips with MMIO flips
>   drm/i915: Default to mmio flips on VLV
>   drm/i915: Fix mmio page flip vs mmio set base race
> 
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |   6 ++
>  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
>  drivers/gpu/drm/i915/i915_irq.c      |   3 +
>  drivers/gpu/drm/i915/i915_params.c   |   5 ++
>  drivers/gpu/drm/i915/intel_display.c | 135 +++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
>  7 files changed, 157 insertions(+), 1 deletion(-)
> 
Hi Ville/Chris,
I have tried to address your comments in this series of patches. Could
you please review them?
Thanks,
Sourab


* Re: [PATCH v7 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-22 14:36                                                       ` [PATCH v7 1/3] drm/i915: Replaced " sourab.gupta
@ 2014-05-27 12:52                                                         ` Ville Syrjälä
  2014-05-27 13:09                                                           ` Daniel Vetter
  0 siblings, 1 reply; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-27 12:52 UTC (permalink / raw)
  To: sourab.gupta; +Cc: intel-gfx, Akash Goel

On Thu, May 22, 2014 at 08:06:31PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Using MMIO based flips on Gen5+. The MMIO flips are useful for the Media power
> well residency optimization. These may be enabled on architectures where
> Render and Blitter engines reside in different power wells.
> The blitter ring is currently being used just for command streamer based
> flip calls. For pure 3D workloads in such cases, with MMIO flips, there will
> be no use of blitter ring and this will ensure the 100% residency for Media well.
> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> v3: Rebasing on latest code. Code restructuring after incorporating
> Damien's comments
> 
> v4: Addressing Ville's review comments
>     -general cleanup
>     -updating only base addr instead of calling update_primary_plane
>     -extending patch for gen5+ platforms
> 
> v5: Addressed Ville's review comments
>     -Making mmio flip vs cs flip selection based on module parameter
>     -Adding check for DRIVER_MODESET feature in notify_ring before calling
>      notify mmio flip.
>     -Other changes mostly in function arguments
> 
> v6: -Having a separate function to check condition for using mmio flips (Ville)
>     -propagating error code from i915_gem_check_olr (Ville)
> 
> v7: -Adding __must_check with i915_gem_check_olr (Chris)
>     -Renaming mmio_flip_data to mmio_flip (Chris)
>     -Rebasing on latest nightly
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>

Seems the mmio vs cs flip race patches landed finally, so this needs a
rebase to kill the pin stuff from intel_queue_mmio_flip(), and that
also means you can squash patch 3/3 with this one. Also needs a
s/ring_buffer/engine_cs/ since that stuff also got changed.

Oh and Chris was right about the tiling thing. I totally forgot about
that when I said you shouldn't use .update_primary_plane(). I think
the easiest solution would be to stick the new tiling mode into the
mmio flip data, and just update DSPCNTR with a RMW in the flip notify
function. IIRC we already have a test for tiling change during page
flip in igt.
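
Something like the following is what I mean by the RMW, sketched against a fake register so it can be compiled standalone. TILED_BIT, reg_read(), reg_write() and update_tiling() are made-up names for illustration; the bit position is an assumption and should be checked against the real DSPCNTR definition.

```c
#include <stdbool.h>
#include <stdint.h>

#define TILED_BIT (1u << 10)	/* assumed position of the tiling bit */

static uint32_t fake_dspcntr;	/* stand-in for the MMIO register */

static uint32_t reg_read(void)
{
	return fake_dspcntr;
}

static void reg_write(uint32_t val)
{
	fake_dspcntr = val;
}

/* Read-modify-write: change only the tiling bit, keep everything else. */
static void update_tiling(bool tiled)
{
	uint32_t val = reg_read();

	if (tiled)
		val |= TILED_BIT;
	else
		val &= ~TILED_BIT;

	reg_write(val);
}
```

The tiling mode would be stored in the mmio flip data at queue time and applied this way right before the surface address write.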

> ---
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
>  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
>  drivers/gpu/drm/i915/i915_irq.c      |   3 +
>  drivers/gpu/drm/i915/i915_params.c   |   4 ++
>  drivers/gpu/drm/i915/intel_display.c | 133 +++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
>  7 files changed, 155 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 20df7c72..266c9a6 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1571,6 +1571,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	spin_lock_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
>  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> +	spin_lock_init(&dev_priv->mmio_flip_lock);
>  	mutex_init(&dev_priv->dpio_lock);
>  	mutex_init(&dev_priv->modeset_restore_lock);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 13495a4..ced6e58 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1366,6 +1366,9 @@ struct drm_i915_private {
>  	/* protects the irq masks */
>  	spinlock_t irq_lock;
>  
> +	/* protects the mmio flip data */
> +	spinlock_t mmio_flip_lock;
> +
>  	bool display_irqs_enabled;
>  
>  	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
> @@ -2036,6 +2039,7 @@ struct i915_params {
>  	bool reset;
>  	bool disable_display;
>  	bool disable_vtd_wa;
> +	bool use_mmio_flip;
>  };
>  extern struct i915_params i915 __read_mostly;
>  
> @@ -2230,6 +2234,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
>  void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
>  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
>  				      bool interruptible);
> +int __must_check i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
>  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
>  {
>  	return unlikely(atomic_read(&error->reset_counter)
> @@ -2598,6 +2603,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file);
>  
> +void intel_notify_mmio_flip(struct intel_ring_buffer *ring);
> +
>  /* overlay */
>  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
>  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f2713b9..bc6fe4e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   * Compare seqno against outstanding lazy request. Emit a request if they are
>   * equal.
>   */
> -static int
> +__must_check int
>  i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
>  {
>  	int ret;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 2043276..f244b23 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1155,6 +1155,9 @@ static void notify_ring(struct drm_device *dev,
>  
>  	trace_i915_gem_request_complete(ring);
>  
> +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> +		intel_notify_mmio_flip(ring);
> +
>  	wake_up_all(&ring->irq_queue);
>  	i915_queue_hangcheck(dev);
>  }
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index d05a2af..e0d44df 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
>  	.disable_display = 0,
>  	.enable_cmd_parser = 1,
>  	.disable_vtd_wa = 0,
> +	.use_mmio_flip = 0,
>  };
>  
>  module_param_named(modeset, i915.modeset, int, 0400);
> @@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
>  module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
>  MODULE_PARM_DESC(enable_cmd_parser,
>  		 "Enable command parsing (1=enabled [default], 0=disabled)");
> +
> +module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
> +MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 19b92c1..a29552d 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9179,6 +9179,136 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
>  	return 0;
>  }
>  
> +static bool intel_use_mmio_flip(struct drm_device *dev)
> +{
> +	/* If module parameter is disabled, use CS flips.
> +	 * Otherwise, use MMIO flips starting from Gen5.
> +	 * This is not being used for older platforms, because
> +	 * non-availability of flip done interrupt forces us to use
> +	 * CS flips. Older platforms derive flip done using some clever
> +	 * tricks involving the flip_pending status bits and vblank irqs.
> +	 * So using MMIO flips there would disrupt this mechanism.
> +	 */
> +
> +	if (i915.use_mmio_flip == 0)
> +		return false;
> +
> +	if (INTEL_INFO(dev)->gen >= 5)
> +		return true;
> +	else
> +		return false;
> +}
> +
> +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> +{
> +	struct drm_i915_private *dev_priv =
> +		intel_crtc->base.dev->dev_private;
> +	struct intel_framebuffer *intel_fb =
> +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> +	struct drm_i915_gem_object *obj = intel_fb->obj;
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +
> +	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
> +			intel_crtc->dspaddr_offset);
> +	POSTING_READ(DSPSURF(intel_crtc->plane));
> +}
> +
> +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	int ret;
> +
> +	if (!obj->ring)
> +		return 0;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> +				obj->last_write_seqno))
> +		return 0;
> +
> +	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
> +	if (ret)
> +		return ret;
> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +void intel_notify_mmio_flip(struct intel_ring_buffer *ring)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> +	struct intel_crtc *intel_crtc;
> +	unsigned long irq_flags;
> +	u32 seqno;
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	for_each_intel_crtc(ring->dev, intel_crtc) {
> +		struct intel_mmio_flip *mmio_flip;
> +
> +		mmio_flip = &intel_crtc->mmio_flip;
> +
> +		if (mmio_flip->seqno == 0)
> +			continue;
> +		if (ring->id != mmio_flip->ring_id)
> +			continue;
> +
> +		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
> +			intel_do_mmio_flip(intel_crtc);
> +			mmio_flip->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +}
> +
> +static int intel_queue_mmio_flip(struct drm_device *dev,
> +			struct drm_crtc *crtc,
> +			struct drm_framebuffer *fb,
> +			struct drm_i915_gem_object *obj,
> +			uint32_t flags)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned long irq_flags;
> +	int ret;
> +
> +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> +	if (ret)
> +		goto err;
> +
> +	if (WARN_ON(intel_crtc->mmio_flip.seqno)) {
> +		ret = -EBUSY;
> +		goto err_unpin;
> +	}
> +
> +	ret = intel_postpone_flip(obj);
> +	if (ret < 0) {
> +		goto err_unpin;
> +	} else if (ret == 0) {
> +		intel_do_mmio_flip(intel_crtc);
> +		return 0;
> +	}
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
> +	intel_crtc->mmio_flip.ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +
> +	/* Double check to catch cases where irq fired before
> +	 * mmio flip data was ready
> +	 */
> +	intel_notify_mmio_flip(obj->ring);
> +	return 0;
> +
> +err_unpin:
> +	intel_unpin_fb_obj(obj);
> +err:
> +	return ret;
> +}
> +
>  static int intel_default_queue_flip(struct drm_device *dev,
>  				    struct drm_crtc *crtc,
>  				    struct drm_framebuffer *fb,
> @@ -11470,6 +11600,9 @@ static void intel_init_display(struct drm_device *dev)
>  		break;
>  	}
>  
> +	if (intel_use_mmio_flip(dev))
> +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> +
>  	intel_panel_init_backlight_funcs(dev);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 287b89e..5a4f60c 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -358,6 +358,11 @@ struct intel_pipe_wm {
>  	bool sprites_scaled;
>  };
>  
> +struct intel_mmio_flip {
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  struct intel_crtc {
>  	struct drm_crtc base;
>  	enum pipe pipe;
> @@ -409,6 +414,7 @@ struct intel_crtc {
>  	} wm;
>  
>  	wait_queue_head_t vbl_wait;
> +	struct intel_mmio_flip mmio_flip;
>  };
>  
>  struct intel_plane_wm_parameters {
> -- 
> 1.8.5.1

-- 
Ville Syrjälä
Intel OTC


* Re: [PATCH v7 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-27 12:52                                                         ` Ville Syrjälä
@ 2014-05-27 13:09                                                           ` Daniel Vetter
  2014-05-28  7:12                                                             ` [PATCH v3 0/2] Replace " sourab.gupta
  0 siblings, 1 reply; 67+ messages in thread
From: Daniel Vetter @ 2014-05-27 13:09 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Akash Goel, sourab.gupta, intel-gfx

On Tue, May 27, 2014 at 03:52:55PM +0300, Ville Syrjälä wrote:
> On Thu, May 22, 2014 at 08:06:31PM +0530, sourab.gupta@intel.com wrote:
> > From: Sourab Gupta <sourab.gupta@intel.com>
> > 
> > Using MMIO based flips on Gen5+. The MMIO flips are useful for the Media power
> > well residency optimization. These may be enabled on architectures where
> > Render and Blitter engines reside in different power wells.
> > The blitter ring is currently being used just for command streamer based
> > flip calls. For pure 3D workloads in such cases, with MMIO flips, there will
> > be no use of blitter ring and this will ensure the 100% residency for Media well.
> > 
> > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > flips when target seqno is reached. (Incorporating Ville's idea)
> > 
> > v3: Rebasing on latest code. Code restructuring after incorporating
> > Damien's comments
> > 
> > v4: Addressing Ville's review comments
> >     -general cleanup
> >     -updating only base addr instead of calling update_primary_plane
> >     -extending patch for gen5+ platforms
> > 
> > v5: Addressed Ville's review comments
> >     -Making mmio flip vs cs flip selection based on module parameter
> >     -Adding check for DRIVER_MODESET feature in notify_ring before calling
> >      notify mmio flip.
> >     -Other changes mostly in function arguments
> > 
> > v6: -Having a separate function to check condition for using mmio flips (Ville)
> >     -propagating error code from i915_gem_check_olr (Ville)
> > 
> > v7: -Adding __must_check with i915_gem_check_olr (Chris)
> >     -Renaming mmio_flip_data to mmio_flip (Chris)
> >     -Rebasing on latest nightly
> > 
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> 
> Seems the mmio vs cs flip race patches landed finally, so this needs a
> rebase to kill the pin stuff from intel_queue_mmio_flip(), and that
> also means you can squash patch 3/3 with this one. Also needs a
> s/ring_buffer/engine_cs/ since that stuff also got changed.
> 
> Oh and Chris was right about the tiling thing. I totally forgot about
> that when I said you shouldn't use .update_primary_plane(). I think
> the easiest solution would be to stick the new tiling mode into the
> mmio flip data, and just update DSPCNTR with a RMW in the flip notify
> > function. IIRC we already have a test for tiling change during page
> > flip in igt.

Yeah, we have a test. But it hangs on a lot of platforms :( It's
kms_flip_tiling.

-Daniel
> 
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c      |   1 +
> >  drivers/gpu/drm/i915/i915_drv.h      |   7 ++
> >  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
> >  drivers/gpu/drm/i915/i915_irq.c      |   3 +
> >  drivers/gpu/drm/i915/i915_params.c   |   4 ++
> >  drivers/gpu/drm/i915/intel_display.c | 133 +++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
> >  7 files changed, 155 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 20df7c72..266c9a6 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -1571,6 +1571,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
> >  	spin_lock_init(&dev_priv->backlight_lock);
> >  	spin_lock_init(&dev_priv->uncore.lock);
> >  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> > +	spin_lock_init(&dev_priv->mmio_flip_lock);
> >  	mutex_init(&dev_priv->dpio_lock);
> >  	mutex_init(&dev_priv->modeset_restore_lock);
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 13495a4..ced6e58 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1366,6 +1366,9 @@ struct drm_i915_private {
> >  	/* protects the irq masks */
> >  	spinlock_t irq_lock;
> >  
> > +	/* protects the mmio flip data */
> > +	spinlock_t mmio_flip_lock;
> > +
> >  	bool display_irqs_enabled;
> >  
> >  	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
> > @@ -2036,6 +2039,7 @@ struct i915_params {
> >  	bool reset;
> >  	bool disable_display;
> >  	bool disable_vtd_wa;
> > +	bool use_mmio_flip;
> >  };
> >  extern struct i915_params i915 __read_mostly;
> >  
> > @@ -2230,6 +2234,7 @@ bool i915_gem_retire_requests(struct drm_device *dev);
> >  void i915_gem_retire_requests_ring(struct intel_ring_buffer *ring);
> >  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> >  				      bool interruptible);
> > +int __must_check i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno);
> >  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
> >  {
> >  	return unlikely(atomic_read(&error->reset_counter)
> > @@ -2598,6 +2603,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
> >  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
> >  			       struct drm_file *file);
> >  
> > +void intel_notify_mmio_flip(struct intel_ring_buffer *ring);
> > +
> >  /* overlay */
> >  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
> >  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index f2713b9..bc6fe4e 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
> >   * Compare seqno against outstanding lazy request. Emit a request if they are
> >   * equal.
> >   */
> > -static int
> > +__must_check int
> >  i915_gem_check_olr(struct intel_ring_buffer *ring, u32 seqno)
> >  {
> >  	int ret;
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 2043276..f244b23 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1155,6 +1155,9 @@ static void notify_ring(struct drm_device *dev,
> >  
> >  	trace_i915_gem_request_complete(ring);
> >  
> > +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> > +		intel_notify_mmio_flip(ring);
> > +
> >  	wake_up_all(&ring->irq_queue);
> >  	i915_queue_hangcheck(dev);
> >  }
> > diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> > index d05a2af..e0d44df 100644
> > --- a/drivers/gpu/drm/i915/i915_params.c
> > +++ b/drivers/gpu/drm/i915/i915_params.c
> > @@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
> >  	.disable_display = 0,
> >  	.enable_cmd_parser = 1,
> >  	.disable_vtd_wa = 0,
> > +	.use_mmio_flip = 0,
> >  };
> >  
> >  module_param_named(modeset, i915.modeset, int, 0400);
> > @@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
> >  module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
> >  MODULE_PARM_DESC(enable_cmd_parser,
> >  		 "Enable command parsing (1=enabled [default], 0=disabled)");
> > +
> > +module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
> > +MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 19b92c1..a29552d 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -9179,6 +9179,136 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
> >  	return 0;
> >  }
> >  
> > +static bool intel_use_mmio_flip(struct drm_device *dev)
> > +{
> > +	/* If module parameter is disabled, use CS flips.
> > +	 * Otherwise, use MMIO flips starting from Gen5.
> > +	 * This is not being used for older platforms, because
> > +	 * non-availability of flip done interrupt forces us to use
> > +	 * CS flips. Older platforms derive flip done using some clever
> > +	 * tricks involving the flip_pending status bits and vblank irqs.
> > +	 * So using MMIO flips there would disrupt this mechanism.
> > +	 */
> > +
> > +	if (i915.use_mmio_flip == 0)
> > +		return false;
> > +
> > +	if (INTEL_INFO(dev)->gen >= 5)
> > +		return true;
> > +	else
> > +		return false;
> > +}
> > +
> > +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> > +{
> > +	struct drm_i915_private *dev_priv =
> > +		intel_crtc->base.dev->dev_private;
> > +	struct intel_framebuffer *intel_fb =
> > +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> > +	struct drm_i915_gem_object *obj = intel_fb->obj;
> > +
> > +	intel_mark_page_flip_active(intel_crtc);
> > +
> > +	I915_WRITE(DSPSURF(intel_crtc->plane), i915_gem_obj_ggtt_offset(obj) +
> > +			intel_crtc->dspaddr_offset);
> > +	POSTING_READ(DSPSURF(intel_crtc->plane));
> > +}
> > +
> > +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> > +{
> > +	int ret;
> > +
> > +	if (!obj->ring)
> > +		return 0;
> > +
> > +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> > +				obj->last_write_seqno))
> > +		return 0;
> > +
> > +	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> > +		return 0;
> > +
> > +	return 1;
> > +}
> > +
> > +void intel_notify_mmio_flip(struct intel_ring_buffer *ring)
> > +{
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> > +	struct intel_crtc *intel_crtc;
> > +	unsigned long irq_flags;
> > +	u32 seqno;
> > +
> > +	seqno = ring->get_seqno(ring, false);
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	for_each_intel_crtc(ring->dev, intel_crtc) {
> > +		struct intel_mmio_flip *mmio_flip;
> > +
> > +		mmio_flip = &intel_crtc->mmio_flip;
> > +
> > +		if (mmio_flip->seqno == 0)
> > +			continue;
> > +		if (ring->id != mmio_flip->ring_id)
> > +			continue;
> > +
> > +		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
> > +			intel_do_mmio_flip(intel_crtc);
> > +			mmio_flip->seqno = 0;
> > +			ring->irq_put(ring);
> > +		}
> > +	}
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +}
> > +
> > +static int intel_queue_mmio_flip(struct drm_device *dev,
> > +			struct drm_crtc *crtc,
> > +			struct drm_framebuffer *fb,
> > +			struct drm_i915_gem_object *obj,
> > +			uint32_t flags)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> > +	unsigned long irq_flags;
> > +	int ret;
> > +
> > +	ret = intel_pin_and_fence_fb_obj(dev, obj, obj->ring);
> > +	if (ret)
> > +		goto err;
> > +
> > +	if (WARN_ON(intel_crtc->mmio_flip.seqno)) {
> > +		ret = -EBUSY;
> > +		goto err_unpin;
> > +	}
> > +
> > +	ret = intel_postpone_flip(obj);
> > +	if (ret < 0) {
> > +		goto err_unpin;
> > +	} else if (ret == 0) {
> > +		intel_do_mmio_flip(intel_crtc);
> > +		return 0;
> > +	}
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
> > +	intel_crtc->mmio_flip.ring_id = obj->ring->id;
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +
> > +	/* Double check to catch cases where irq fired before
> > +	 * mmio flip data was ready
> > +	 */
> > +	intel_notify_mmio_flip(obj->ring);
> > +	return 0;
> > +
> > +err_unpin:
> > +	intel_unpin_fb_obj(obj);
> > +err:
> > +	return ret;
> > +}
> > +
> >  static int intel_default_queue_flip(struct drm_device *dev,
> >  				    struct drm_crtc *crtc,
> >  				    struct drm_framebuffer *fb,
> > @@ -11470,6 +11600,9 @@ static void intel_init_display(struct drm_device *dev)
> >  		break;
> >  	}
> >  
> > +	if (intel_use_mmio_flip(dev))
> > +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> > +
> >  	intel_panel_init_backlight_funcs(dev);
> >  }
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index 287b89e..5a4f60c 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -358,6 +358,11 @@ struct intel_pipe_wm {
> >  	bool sprites_scaled;
> >  };
> >  
> > +struct intel_mmio_flip {
> > +	u32 seqno;
> > +	u32 ring_id;
> > +};
> > +
> >  struct intel_crtc {
> >  	struct drm_crtc base;
> >  	enum pipe pipe;
> > @@ -409,6 +414,7 @@ struct intel_crtc {
> >  	} wm;
> >  
> >  	wait_queue_head_t vbl_wait;
> > +	struct intel_mmio_flip mmio_flip;
> >  };
> >  
> >  struct intel_plane_wm_parameters {
> > -- 
> > 1.8.5.1
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v3 0/2] Replace Blitter ring based flips with MMIO flips
  2014-05-27 13:09                                                           ` Daniel Vetter
@ 2014-05-28  7:12                                                             ` sourab.gupta
  2014-05-28  7:12                                                               ` [PATCH 1/2] drm/i915: Replaced " sourab.gupta
  2014-05-28  7:12                                                               ` [PATCH 2/2] drm/i915: Default to mmio flips on VLV sourab.gupta
  0 siblings, 2 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-28  7:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch series replaces Blitter ring based flips with MMIO based flips.
This is useful for Media power well residency optimization. MMIO flips may be
enabled on architectures where the Render and Blitter engines reside in
different power wells.
The blitter ring is currently used only for command streamer based
flip calls. For pure 3D workloads, MMIO flips leave the blitter ring
unused, which ensures 100% residency for the Media well.

Sourab Gupta (2):
  drm/i915: Replaced Blitter ring based flips with MMIO flips
  drm/i915: Default to mmio flips on VLV

 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   8 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   5 ++
 drivers/gpu/drm/i915/intel_display.c | 152 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 176 insertions(+), 1 deletion(-)

-- 
1.8.5.1

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH 1/2] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-28  7:12                                                             ` [PATCH v3 0/2] Replace " sourab.gupta
@ 2014-05-28  7:12                                                               ` sourab.gupta
  2014-05-28  7:30                                                                 ` Chris Wilson
  2014-05-28  7:31                                                                 ` Chris Wilson
  2014-05-28  7:12                                                               ` [PATCH 2/2] drm/i915: Default to mmio flips on VLV sourab.gupta
  1 sibling, 2 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-28  7:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

Use MMIO based flips on Gen5+. MMIO flips are useful for Media power
well residency optimization. They may be enabled on architectures where
the Render and Blitter engines reside in different power wells.
The blitter ring is currently used only for command streamer based
flip calls. For pure 3D workloads in such cases, MMIO flips leave the
blitter ring unused, which ensures 100% residency for the Media well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

v5: Addressed Ville's review comments
    -Making mmio flip vs cs flip selection based on module parameter
    -Adding check for DRIVER_MODESET feature in notify_ring before calling
     notify mmio flip.
    -Other changes mostly in function arguments

v6: -Having a separate function to check condition for using mmio flips (Ville)
    -propagating error code from i915_gem_check_olr (Ville)

v7: -Adding __must_check with i915_gem_check_olr (Chris)
    -Renaming mmio_flip_data to mmio_flip (Chris)
    -Rebasing on latest nightly

v8: -Rebasing on latest code
    -squash 3rd patch in series(mmio setbase vs page flip race) with this patch
    -Added new tiling mode update in intel_do_mmio_flip (Chris)
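
For illustration, the seqno-driven handshake introduced in v2 (record a
target seqno, issue the flip from the ring interrupt once it is reached, and
re-check right after queueing) can be modelled roughly as below. This is a
standalone sketch, not the driver code: struct model_flip, model_notify() and
model_queue() are hypothetical names, and the locking (mmio_flip_lock) and
irq_get()/irq_put() references used by the real patch are deliberately left
out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe comparison, same idea as i915_seqno_passed(). */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical per-CRTC state: seqno == 0 means "no flip pending". */
struct model_flip {
	uint32_t seqno;		/* target seqno the flip is waiting for */
	unsigned int issued;	/* number of flips issued so far */
};

/* Models the notify path: called on every ring interrupt. */
static void model_notify(struct model_flip *flip, uint32_t ring_seqno)
{
	if (flip->seqno == 0)
		return;

	if (seqno_passed(ring_seqno, flip->seqno)) {
		flip->issued++;		/* stands in for the DSPSURF write */
		flip->seqno = 0;
	}
}

/* Models the queue path for a flip that has to be postponed. */
static void model_queue(struct model_flip *flip, uint32_t target,
			uint32_t ring_seqno)
{
	flip->seqno = target;

	/* Double check: the interrupt may have fired before seqno was set. */
	model_notify(flip, ring_seqno);
}
```

The second call to the notify path is what covers the window in which the
interrupt fires before the pending seqno has been recorded.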

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   8 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   4 +
 drivers/gpu/drm/i915/intel_display.c | 140 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index b9159ad..532733a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1572,6 +1572,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bea9ab40..36e1b2a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1367,6 +1367,9 @@ struct drm_i915_private {
 	/* protects the irq masks */
 	spinlock_t irq_lock;
 
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
+
 	bool display_irqs_enabled;
 
 	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
@@ -2036,6 +2039,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	bool use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2231,6 +2235,8 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int __must_check i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno);
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2601,6 +2607,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_engine_cs *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 70b4f41..ab663ca 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+__must_check int
 i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4ef6423..e0edb1f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..e0d44df 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 0,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 731cd01..f11abfb 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9183,6 +9183,143 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	return 0;
 }
 
+static bool intel_use_mmio_flip(struct drm_device *dev)
+{
+	/* If module parameter is disabled, use CS flips.
+	 * Otherwise, use MMIO flips starting from Gen5.
+	 * This is not being used for older platforms, because
+	 * non-availability of flip done interrupt forces us to use
+	 * CS flips. Older platforms derive flip done using some clever
+	 * tricks involving the flip_pending status bits and vblank irqs.
+	 * So using MMIO flips there would disrupt this mechanism.
+	 */
+
+	if (i915.use_mmio_flip == 0)
+		return false;
+
+	if (INTEL_INFO(dev)->gen >= 5)
+		return true;
+	else
+		return false;
+}
+
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_device *dev = intel_crtc->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+	u32 dspcntr;
+	u32 reg;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	reg = DSPCNTR(intel_crtc->plane);
+	dspcntr = I915_READ(reg);
+
+	if (INTEL_INFO(dev)->gen >= 4) {
+		if (obj->tiling_mode != I915_TILING_NONE)
+			dspcntr |= DISPPLANE_TILED;
+		else
+			dspcntr &= ~DISPPLANE_TILED;
+	}
+	I915_WRITE(reg, dspcntr);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane),
+			intel_crtc->unpin_work->gtt_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	if (!obj->ring)
+		return 0;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
+	if (ret)
+		return ret;
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip;
+
+		mmio_flip = &intel_crtc->mmio_flip;
+
+		if (mmio_flip->seqno == 0)
+			continue;
+		if (ring->id != mmio_flip->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			struct intel_engine_cs *ring,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	if (WARN_ON(intel_crtc->mmio_flip.seqno)) {
+		ret = -EBUSY;
+		goto err;
+	}
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0) {
+		goto err;
+	} else if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+err:
+	return ret;
+}
+
 static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
@@ -11515,6 +11652,9 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
+	if (intel_use_mmio_flip(dev))
+		dev_priv->display.queue_flip = intel_queue_mmio_flip;
+
 	intel_panel_init_backlight_funcs(dev);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 9bb70dc..206b577 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -358,6 +358,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -411,6 +416,7 @@ struct intel_crtc {
 	wait_queue_head_t vbl_wait;
 
 	int scanline_offset;
+	struct intel_mmio_flip mmio_flip;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 2/2] drm/i915: Default to mmio flips on VLV
  2014-05-28  7:12                                                             ` [PATCH v3 0/2] Replace " sourab.gupta
  2014-05-28  7:12                                                               ` [PATCH 1/2] drm/i915: Replaced " sourab.gupta
@ 2014-05-28  7:12                                                               ` sourab.gupta
  2014-05-28  9:56                                                                 ` Chris Wilson
  1 sibling, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-05-28  7:12 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch is for using mmio flips by default on VLV.
The module parameter controlling use of MMIO flips allows us to
control the default behaviour, which is set true for VLV and false
elsewhere.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_params.c   |  5 +++--
 drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index e0d44df..a99accc 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,7 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
-	.use_mmio_flip = 0,
+	.use_mmio_flip = -1,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -159,4 +159,5 @@ MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
 
 module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
-MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO page flips "
+		"(default: -1 (use per-chip default))");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f11abfb..312a9a1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9197,6 +9197,18 @@ static bool intel_use_mmio_flip(struct drm_device *dev)
 	if (i915.use_mmio_flip == 0)
 		return false;
 
+	/* On Valleyview, use MMIO flips by default, for Media Power Well
+	 * residency optimization. The other alternative of having Render
+	 * ring based flip calls is not being used, as the performance(FPS)
+	 * of certain 3D Apps gets severely affected.
+	 */
+	if (i915.use_mmio_flip == -1) {
+		if (IS_VALLEYVIEW(dev))
+			return true;
+		else
+			return false;
+	}
+
 	if (INTEL_INFO(dev)->gen >= 5)
 		return true;
 	else
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/2] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-28  7:12                                                               ` [PATCH 1/2] drm/i915: Replaced " sourab.gupta
@ 2014-05-28  7:30                                                                 ` Chris Wilson
  2014-05-28  9:42                                                                   ` Gupta, Sourab
  2014-05-28  7:31                                                                 ` Chris Wilson
  1 sibling, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-05-28  7:30 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Deepak S, intel-gfx, Akash Goel

On Wed, May 28, 2014 at 12:42:01PM +0530, sourab.gupta@intel.com wrote:
> +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	int ret;
> +
> +	if (!obj->ring)
> +		return 0;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> +				obj->last_write_seqno))
> +		return 0;

obj->last_write_seqno could be 0 here. To be correct, test against
obj->last_write_seqno == 0 instead of obj->ring == NULL first.
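
A minimal sketch of that ordering, written as a standalone model rather than
driver code. The names model_obj and flip_must_wait() are hypothetical;
seqno_passed() mirrors the wraparound-safe comparison that i915_seqno_passed()
performs. It assumes that a last_write_seqno of 0 means there is no
outstanding write:

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "has seq1 reached seq2?" test. */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical stand-in for the object state that matters here. */
struct model_obj {
	uint32_t last_write_seqno;	/* assumed: 0 == no outstanding write */
};

/* Returns true if the flip has to be postponed until the write retires. */
static bool flip_must_wait(const struct model_obj *obj, uint32_t ring_seqno)
{
	/*
	 * Test the seqno first. Without this early return, a ring seqno of
	 * 0x80000000 or above would compare as "not yet passed" against 0.
	 */
	if (obj->last_write_seqno == 0)
		return false;

	return !seqno_passed(ring_seqno, obj->last_write_seqno);
}
```

With the zero check first, an object with no outstanding write never defers
the flip, whatever the ring's current seqno is.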

> +	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
> +	if (ret)
> +		return ret;
> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return 0;
> +
> +	return 1;
> +}
> +

>  static int intel_default_queue_flip(struct drm_device *dev,
>  				    struct drm_crtc *crtc,
>  				    struct drm_framebuffer *fb,
> @@ -11515,6 +11652,9 @@ static void intel_init_display(struct drm_device *dev)
>  		break;
>  	}
>  
> +	if (intel_use_mmio_flip(dev))
> +		dev_priv->display.queue_flip = intel_queue_mmio_flip;

I'd still like to see this as a tristate, i.e. use_mmio_flip(dev) > 1
here. E.g. we will want to use mmio flips for !BCS, but GPU flips when
BCS is active.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/2] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-28  7:12                                                               ` [PATCH 1/2] drm/i915: Replaced " sourab.gupta
  2014-05-28  7:30                                                                 ` Chris Wilson
@ 2014-05-28  7:31                                                                 ` Chris Wilson
  2014-05-28  8:12                                                                   ` Ville Syrjälä
  1 sibling, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-05-28  7:31 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Deepak S, intel-gfx, Akash Goel

On Wed, May 28, 2014 at 12:42:01PM +0530, sourab.gupta@intel.com wrote:
> +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> +{
> +	struct drm_device *dev = intel_crtc->base.dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_framebuffer *intel_fb =
> +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> +	struct drm_i915_gem_object *obj = intel_fb->obj;
> +	u32 dspcntr;
> +	u32 reg;
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +
> +	reg = DSPCNTR(intel_crtc->plane);
> +	dspcntr = I915_READ(reg);
> +
> +	if (INTEL_INFO(dev)->gen >= 4) {
> +		if (obj->tiling_mode != I915_TILING_NONE)
> +			dspcntr |= DISPPLANE_TILED;
> +		else
> +			dspcntr &= ~DISPPLANE_TILED;
> +	}
> +	I915_WRITE(reg, dspcntr);
> +
> +	I915_WRITE(DSPSURF(intel_crtc->plane),
> +			intel_crtc->unpin_work->gtt_offset);
> +	POSTING_READ(DSPSURF(intel_crtc->plane));
> +}

So other than byt, why would we not use LRI here and avoid waking the
CPU up?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/2] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-28  7:31                                                                 ` Chris Wilson
@ 2014-05-28  8:12                                                                   ` Ville Syrjälä
  0 siblings, 0 replies; 67+ messages in thread
From: Ville Syrjälä @ 2014-05-28  8:12 UTC (permalink / raw)
  To: Chris Wilson, sourab.gupta, intel-gfx, Deepak S, Akash Goel

On Wed, May 28, 2014 at 08:31:52AM +0100, Chris Wilson wrote:
> On Wed, May 28, 2014 at 12:42:01PM +0530, sourab.gupta@intel.com wrote:
> > +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> > +{
> > +	struct drm_device *dev = intel_crtc->base.dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_framebuffer *intel_fb =
> > +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> > +	struct drm_i915_gem_object *obj = intel_fb->obj;
> > +	u32 dspcntr;
> > +	u32 reg;
> > +
> > +	intel_mark_page_flip_active(intel_crtc);
> > +
> > +	reg = DSPCNTR(intel_crtc->plane);
> > +	dspcntr = I915_READ(reg);
> > +
> > +	if (INTEL_INFO(dev)->gen >= 4) {
> > +		if (obj->tiling_mode != I915_TILING_NONE)
> > +			dspcntr |= DISPPLANE_TILED;
> > +		else
> > +			dspcntr &= ~DISPPLANE_TILED;
> > +	}
> > +	I915_WRITE(reg, dspcntr);
> > +
> > +	I915_WRITE(DSPSURF(intel_crtc->plane),
> > +			intel_crtc->unpin_work->gtt_offset);
> > +	POSTING_READ(DSPSURF(intel_crtc->plane));
> > +}
> 
> So other than byt, why would we not use LRI here and avoid waking the
> CPU up?

The plan is to eventually expand this thing to handle the nuclear flip
and we're going to use mmio for that. So going for mmio from the start
seems fine to me. Especially since we need the mmio path for byt anyway.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 1/2] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-28  7:30                                                                 ` Chris Wilson
@ 2014-05-28  9:42                                                                   ` Gupta, Sourab
  0 siblings, 0 replies; 67+ messages in thread
From: Gupta, Sourab @ 2014-05-28  9:42 UTC (permalink / raw)
  To: Chris Wilson; +Cc: S, Deepak, intel-gfx, Goel, Akash

On Wed, 2014-05-28 at 07:30 +0000, Chris Wilson wrote:
> On Wed, May 28, 2014 at 12:42:01PM +0530, sourab.gupta@intel.com wrote:
> > +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> > +{
> > +	int ret;
> > +
> > +	if (!obj->ring)
> > +		return 0;
> > +
> > +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> > +				obj->last_write_seqno))
> > +		return 0;
> 
> obj->last_write_seqno could be 0 here. To be correct, test against
> obj->last_write_seqno == 0 instead of obj->ring == NULL first.
> 
> > +	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> > +		return 0;
> > +
> > +	return 1;
> > +}
> > +
> 
> >  static int intel_default_queue_flip(struct drm_device *dev,
> >  				    struct drm_crtc *crtc,
> >  				    struct drm_framebuffer *fb,
> > @@ -11515,6 +11652,9 @@ static void intel_init_display(struct drm_device *dev)
> >  		break;
> >  	}
> >  
> > +	if (intel_use_mmio_flip(dev))
> > +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> 
> I'd still like to see this as a tristate, i.e. use_mmio_flip(dev) > 1
> here. E.g. we will want to use mmio flips for !BCS, but GPU flips when
> BCS is active.
> -Chris
> 
Hi Chris,
We can extend the module param to be a tristate here, but the current
design assigns the .queue_flip hook at init time, not at page flip
time, so leaving the choice of CS vs MMIO flip to the driver's
discretion can't be done in this patch.

If required, we can have a subsequent patch to make a flip-time decision
of MMIO vs CS flip based on the module parameter.
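
As a rough sketch of what such a flip-time decision could look like (a
standalone model, not the proposed patch): the value meanings below, with a
negative value leaving the choice to the driver, 0 forcing CS flips and a
positive value forcing MMIO flips, are assumptions for illustration, as are
the names choose_flip_method() and blitter_busy.

```c
#include <stdbool.h>

enum flip_method {
	FLIP_CS,	/* command streamer based flip */
	FLIP_MMIO,	/* flip by writing the plane registers directly */
};

/*
 * Hypothetical per-flip selection. use_mmio_flip models a tristate module
 * parameter; blitter_busy models "BCS is already active for other work".
 */
static enum flip_method choose_flip_method(int use_mmio_flip,
					   bool blitter_busy)
{
	if (use_mmio_flip > 0)
		return FLIP_MMIO;
	if (use_mmio_flip == 0)
		return FLIP_CS;

	/* Driver discretion: avoid waking the blitter just for a flip. */
	return blitter_busy ? FLIP_CS : FLIP_MMIO;
}
```

Calling this from the page flip path instead of fixing .queue_flip at init
time is what would allow the choice to follow the workload.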

Regards,
Sourab

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 2/2] drm/i915: Default to mmio flips on VLV
  2014-05-28  7:12                                                               ` [PATCH 2/2] drm/i915: Default to mmio flips on VLV sourab.gupta
@ 2014-05-28  9:56                                                                 ` Chris Wilson
  2014-05-29  9:40                                                                   ` [PATCH v4 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
  0 siblings, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-05-28  9:56 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Deepak S, intel-gfx, Akash Goel

On Wed, May 28, 2014 at 12:42:02PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> This patch is for using mmio flips by default on VLV.
> The module parameter controlling use of MMIO flips allows us to
> control the default behaviour, which is set true for VLV and false
> elsewhere.
> 
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_params.c   |  5 +++--
>  drivers/gpu/drm/i915/intel_display.c | 12 ++++++++++++
>  2 files changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index e0d44df..a99accc 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -48,7 +48,7 @@ struct i915_params i915 __read_mostly = {
>  	.disable_display = 0,
>  	.enable_cmd_parser = 1,
>  	.disable_vtd_wa = 0,
> -	.use_mmio_flip = 0,
> +	.use_mmio_flip = -1,
>  };
>  
>  module_param_named(modeset, i915.modeset, int, 0400);
> @@ -159,4 +159,5 @@ MODULE_PARM_DESC(enable_cmd_parser,
>  		 "Enable command parsing (1=enabled [default], 0=disabled)");
>  
>  module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
> -MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
> +MODULE_PARM_DESC(use_mmio_flip, "use MMIO page flips "
> +		"(default: -1 (use per-chip default))");
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f11abfb..312a9a1 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9197,6 +9197,18 @@ static bool intel_use_mmio_flip(struct drm_device *dev)
>  	if (i915.use_mmio_flip == 0)
>  		return false;
>  
> +	/* On Valleyview, use MMIO flips by default, for Media Power Well
> +	 * residency optimization. The other alternative of having Render
> +	 * ring based flip calls is not being used, as the performance(FPS)
> +	 * of certain 3D Apps gets severely affected.
> +	 */

Pray tell, what RCS flips?

This seems to only tell one half of the story and would seem to imply
that only using mmio for the BSD ring would be preferable. The current
override doesn't look useful enough to do an independent study of BCS vs
mmio flips for all workloads.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v4 0/3] Replace Blitter ring based flips with MMIO flips
  2014-05-28  9:56                                                                 ` Chris Wilson
@ 2014-05-29  9:40                                                                   ` sourab.gupta
  2014-05-29  9:40                                                                     ` [PATCH v9 1/3] drm/i915: Replaced " sourab.gupta
                                                                                       ` (2 more replies)
  0 siblings, 3 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-29  9:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch series enables the framework for using MMIO flips in place of
Blitter ring based flips.
This is useful for Media power well residency optimization. These may be
enabled on architectures where Render and Blitter engines reside in different
power wells.
The blitter ring is currently being used just for command streamer based
flip calls. The decision to use MMIO flips can be made based on workloads to give
100% residency for Media power well.


Sourab Gupta (3):
  drm/i915: Replaced Blitter ring based flips with MMIO flips
  drm/i915: Selection of MMIO vs CS flip at page flip time
  drm/i915: Make module param for MMIO flip selection as tristate

 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   8 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   6 ++
 drivers/gpu/drm/i915/intel_display.c | 173 ++++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 197 insertions(+), 2 deletions(-)

-- 
1.8.5.1

^ permalink raw reply	[flat|nested] 67+ messages in thread

* [PATCH v9 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-29  9:40                                                                   ` [PATCH v4 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
@ 2014-05-29  9:40                                                                     ` sourab.gupta
  2014-05-30 10:31                                                                       ` Chris Wilson
  2014-05-29  9:40                                                                     ` [PATCH 2/3] drm/i915: Selection of MMIO vs CS flip at page flip time sourab.gupta
  2014-05-29  9:40                                                                     ` [PATCH 3/3] drm/i915: Make module param for MMIO flip selection as tristate sourab.gupta
  2 siblings, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-05-29  9:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch enables the framework for using MMIO based flip calls,
in contrast with the CS based flip calls which are being used currently.

MMIO based flip calls can be enabled on architectures where
Render and Blitter engines reside in different power wells. The
decision to use MMIO flips can be made based on workloads to give
100% residency for Media power well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

v5: Addressed Ville's review comments
    -Making mmio flip vs cs flip selection based on module parameter
    -Adding check for DRIVER_MODESET feature in notify_ring before calling
     notify mmio flip.
    -Other changes mostly in function arguments

v6: -Having a separate function to check condition for using mmio flips (Ville)
    -propagating error code from i915_gem_check_olr (Ville)

v7: -Adding __must_check with i915_gem_check_olr (Chris)
    -Renaming mmio_flip_data to mmio_flip (Chris)
    -Rebasing on latest nightly

v8: -Rebasing on latest code
    -squash 3rd patch in series(mmio setbase vs page flip race) with this patch
    -Added new tiling mode update in intel_do_mmio_flip (Chris)

v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
intel_postpone_flip, as this is a more restrictive condition (Chris)

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   8 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   4 +
 drivers/gpu/drm/i915/intel_display.c | 140 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index b9159ad..532733a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1572,6 +1572,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bea9ab40..36e1b2a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1367,6 +1367,9 @@ struct drm_i915_private {
 	/* protects the irq masks */
 	spinlock_t irq_lock;
 
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
+
 	bool display_irqs_enabled;
 
 	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
@@ -2036,6 +2039,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	bool use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2231,6 +2235,8 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int __must_check i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno);
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2601,6 +2607,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_engine_cs *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 70b4f41..ab663ca 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+__must_check int
 i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4ef6423..e0edb1f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..e0d44df 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 0,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -156,3 +157,6 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 731cd01..c75a925 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9183,6 +9183,143 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	return 0;
 }
 
+static bool intel_use_mmio_flip(struct drm_device *dev)
+{
+	/* If module parameter is disabled, use CS flips.
+	 * Otherwise, use MMIO flips starting from Gen5.
+	 * This is not being used for older platforms, because
+	 * non-availability of flip done interrupt forces us to use
+	 * CS flips. Older platforms derive flip done using some clever
+	 * tricks involving the flip_pending status bits and vblank irqs.
+	 * So using MMIO flips there would disrupt this mechanism.
+	 */
+
+	if (i915.use_mmio_flip == 0)
+		return false;
+
+	if (INTEL_INFO(dev)->gen >= 5)
+		return true;
+	else
+		return false;
+}
+
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_device *dev = intel_crtc->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+	u32 dspcntr;
+	u32 reg;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	reg = DSPCNTR(intel_crtc->plane);
+	dspcntr = I915_READ(reg);
+
+	if (INTEL_INFO(dev)->gen >= 4) {
+		if (obj->tiling_mode != I915_TILING_NONE)
+			dspcntr |= DISPPLANE_TILED;
+		else
+			dspcntr &= ~DISPPLANE_TILED;
+	}
+	I915_WRITE(reg, dspcntr);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane),
+			intel_crtc->unpin_work->gtt_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	if (!obj->last_write_seqno)
+		return 0;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
+	if (ret)
+		return ret;
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip;
+
+		mmio_flip = &intel_crtc->mmio_flip;
+
+		if (mmio_flip->seqno == 0)
+			continue;
+		if (ring->id != mmio_flip->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+			struct drm_crtc *crtc,
+			struct drm_framebuffer *fb,
+			struct drm_i915_gem_object *obj,
+			struct intel_engine_cs *ring,
+			uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	if (WARN_ON(intel_crtc->mmio_flip.seqno)) {
+		ret = -EBUSY;
+		goto err;
+	}
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0) {
+		goto err;
+	} else if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+err:
+	return ret;
+}
+
 static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
@@ -11515,6 +11652,9 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
+	if (intel_use_mmio_flip(dev))
+		dev_priv->display.queue_flip = intel_queue_mmio_flip;
+
 	intel_panel_init_backlight_funcs(dev);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 9bb70dc..206b577 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -358,6 +358,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -411,6 +416,7 @@ struct intel_crtc {
 	wait_queue_head_t vbl_wait;
 
 	int scanline_offset;
+	struct intel_mmio_flip mmio_flip;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 67+ messages in thread
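
The deferred-flip mechanism in the patch above can be modelled in user space. The following is a minimal sketch, not the driver code: `seqno_passed` mirrors the wrap-safe comparison that `i915_seqno_passed()` performs, while `struct model_flip`, `model_queue_flip` and `model_notify` are hypothetical names that stand in for `intel_mmio_flip`, `intel_queue_mmio_flip` and `intel_notify_mmio_flip`. Locking, the irq_get/irq_put reference and the outstanding-lazy-request check are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe test for "seq1 has reached seq2"; the unsigned difference is
 * reinterpreted as signed so that the comparison survives seqno wraparound. */
bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical per-CRTC state; seqno == 0 means "no flip pending". */
struct model_flip {
	uint32_t seqno;
	uint32_t ring_id;
	bool flipped;
};

/* Returns 0 if the flip was issued immediately, 1 if it was deferred. */
int model_queue_flip(struct model_flip *f, uint32_t ring_id,
		     uint32_t ring_seqno, uint32_t last_write_seqno)
{
	/* No outstanding write, or the write already completed: flip now. */
	if (last_write_seqno == 0 ||
	    seqno_passed(ring_seqno, last_write_seqno)) {
		f->flipped = true;
		return 0;
	}

	/* Otherwise remember what to wait for and let the notifier flip. */
	f->seqno = last_write_seqno;
	f->ring_id = ring_id;
	return 1;
}

/* Modelled interrupt hook: flip once the awaited seqno has passed. */
void model_notify(struct model_flip *f, uint32_t ring_id, uint32_t ring_seqno)
{
	if (f->seqno == 0 || f->ring_id != ring_id)
		return;

	if (seqno_passed(ring_seqno, f->seqno)) {
		f->flipped = true;
		f->seqno = 0;
	}
}
```

In the patch, the queue path calls the notifier once more after storing the seqno, to catch an interrupt that fired before the flip data was published; the model would do the same by calling `model_notify` right after a deferred `model_queue_flip`.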

* [PATCH 2/3] drm/i915: Selection of MMIO vs CS flip at page flip time
  2014-05-29  9:40                                                                   ` [PATCH v4 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
  2014-05-29  9:40                                                                     ` [PATCH v9 1/3] drm/i915: Replaced " sourab.gupta
@ 2014-05-29  9:40                                                                     ` sourab.gupta
  2014-05-29  9:40                                                                     ` [PATCH 3/3] drm/i915: Make module param for MMIO flip selection as tristate sourab.gupta
  2 siblings, 0 replies; 67+ messages in thread
From: sourab.gupta @ 2014-05-29  9:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch enables the selection of MMIO flip vs CS flip at
page flip time. Earlier, this selection was done only at init
time, so once .queue_flip was set, it was used forever.
With this patch, the flip mechanism is selected each time
a page flip is issued.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index c75a925..9dda965 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9427,7 +9427,13 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	work->gtt_offset =
 		i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
 
-	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
+	if (intel_use_mmio_flip(dev))
+		ret = intel_queue_mmio_flip(dev, crtc,
+				fb, obj, ring, page_flip_flags);
+	else
+		ret = dev_priv->display.queue_flip(dev, crtc,
+				fb, obj, ring, page_flip_flags);
+
 	if (ret)
 		goto cleanup_unpin;
 
@@ -11652,9 +11658,6 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
-	if (intel_use_mmio_flip(dev))
-		dev_priv->display.queue_flip = intel_queue_mmio_flip;
-
 	intel_panel_init_backlight_funcs(dev);
 }
 
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 67+ messages in thread

* [PATCH 3/3] drm/i915: Make module param for MMIO flip selection as tristate
  2014-05-29  9:40                                                                   ` [PATCH v4 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
  2014-05-29  9:40                                                                     ` [PATCH v9 1/3] drm/i915: Replaced " sourab.gupta
  2014-05-29  9:40                                                                     ` [PATCH 2/3] drm/i915: Selection of MMIO vs CS flip at page flip time sourab.gupta
@ 2014-05-29  9:40                                                                     ` sourab.gupta
  2014-05-30 10:49                                                                       ` Chris Wilson
  2 siblings, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-05-29  9:40 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch enhances the module parameter 'use_mmio_flip', which
enables MMIO flips, to make it tristate. The values are:
0: Force CS flip
1: Force MMIO flip (Gen5+)
>1: Driver discretion is applied while selecting CS vs MMIO flip.

For Valleyview, this driver selection happens based on the idleness
of the Blitter and Video engines. The Blitter and Video engines are in
the same power well, so if both are idle we can use MMIO flips;
otherwise, we can use the BCS flips.
This use case can be modified and/or enhanced to cover more platforms.

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_params.c   |  4 +++-
 drivers/gpu/drm/i915/intel_display.c | 42 ++++++++++++++++++++++++++++++------
 2 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index e0d44df..9becd1e 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -159,4 +159,6 @@ MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
 
 module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
-MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO page flips "
+	"(0: force CS, 1: force mmio, >1: driver selection) "
+	"(default: 0)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 9dda965..a3d38d2 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9185,22 +9185,50 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 
 static bool intel_use_mmio_flip(struct drm_device *dev)
 {
-	/* If module parameter is disabled, use CS flips.
-	 * Otherwise, use MMIO flips starting from Gen5.
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	bool use_mmio_flip = false;
+
+	/* If module parameter is 0, force CS flip.
+	 * If module parameter is 1, force MMIO flip starting from Gen5.
 	 * This is not being used for older platforms, because
 	 * non-availability of flip done interrupt forces us to use
 	 * CS flips. Older platforms derive flip done using some clever
 	 * tricks involving the flip_pending status bits and vblank irqs.
 	 * So using MMIO flips there would disrupt this mechanism.
+	 * If module parameter is > 1, driver discretion is applied for
+	 * selection of CS vs MMIO flip.
 	 */
 
 	if (i915.use_mmio_flip == 0)
-		return false;
+		use_mmio_flip = false;
 
-	if (INTEL_INFO(dev)->gen >= 5)
-		return true;
-	else
-		return false;
+	if (i915.use_mmio_flip == 1) {
+		if (INTEL_INFO(dev)->gen >= 5)
+			use_mmio_flip = true;
+		else
+			use_mmio_flip = false;
+	}
+
+	if (i915.use_mmio_flip > 1) {
+		/* For Valleyview, Blitter and Video engines are in the same
+		 * power well. So, if both are idle, we can use MMIO flips;
+		 * otherwise, we can use the BCS flips.
+		 * We use the parameter 'request_list' to determine the idleness
+		 * of the engine.
+		 */
+		if (IS_VALLEYVIEW(dev)) {
+			struct intel_engine_cs *bcs_ring = &dev_priv->ring[BCS];
+			struct intel_engine_cs *vcs_ring = &dev_priv->ring[VCS];
+
+			if (list_empty(&bcs_ring->request_list) &&
+					list_empty(&vcs_ring->request_list))
+				use_mmio_flip = true;
+			else
+				use_mmio_flip = false;
+		} else
+			use_mmio_flip = false;
+	}
+	return use_mmio_flip;
 }
 
 static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 67+ messages in thread
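
The selection rule described in patch 3/3 can be written as a small pure function. This is only a sketch under stated assumptions: `model_use_mmio_flip_v9` and all of its parameters are hypothetical, the pending-request counts stand in for the `list_empty(&ring->request_list)` checks, and the parameter is taken as an `int` because a `bool` cannot carry a value greater than 1.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the selection in patch 3/3:
 *   param == 0 : force CS flips
 *   param == 1 : force MMIO flips on Gen5+
 *   param  > 1 : driver discretion; on Valleyview use MMIO flips only
 *                when both the Blitter and Video engines are idle
 * Any other value falls back to CS flips, as in the patch.
 */
bool model_use_mmio_flip_v9(int param, int gen, bool is_valleyview,
			    unsigned int bcs_pending, unsigned int vcs_pending)
{
	if (param <= 0)
		return false;

	if (param == 1)
		return gen >= 5;

	/* param > 1: both engines share a power well on Valleyview. */
	return is_valleyview && bcs_pending == 0 && vcs_pending == 0;
}
```

For example, `model_use_mmio_flip_v9(2, 7, true, 0, 0)` selects MMIO flips, while a single pending Video request makes the same call fall back to CS flips.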

* Re: [PATCH v9 1/3] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-29  9:40                                                                     ` [PATCH v9 1/3] drm/i915: Replaced " sourab.gupta
@ 2014-05-30 10:31                                                                       ` Chris Wilson
  0 siblings, 0 replies; 67+ messages in thread
From: Chris Wilson @ 2014-05-30 10:31 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Deepak S, intel-gfx, Akash Goel

On Thu, May 29, 2014 at 03:10:13PM +0530, sourab.gupta@intel.com wrote:
> +	if (intel_use_mmio_flip(dev))
> +		dev_priv->display.queue_flip = intel_queue_mmio_flip;
> +

Note that this patch creates the i915.use_mmio_flip as 0600 so this
cannot be a static assignment anyway.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 67+ messages in thread

* Re: [PATCH 3/3] drm/i915: Make module param for MMIO flip selection as tristate
  2014-05-29  9:40                                                                     ` [PATCH 3/3] drm/i915: Make module param for MMIO flip selection as tristate sourab.gupta
@ 2014-05-30 10:49                                                                       ` Chris Wilson
  2014-06-01 11:13                                                                         ` [PATCH v10] drm/i915: Replaced Blitter ring based flips with MMIO flips sourab.gupta
  0 siblings, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-05-30 10:49 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Deepak S, intel-gfx, Akash Goel

I was thinking this patch should be more like

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3201495..ab9b5f7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2060,7 +2060,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
-	bool use_mmio_flip;
+	int use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index e0d44df..6d7c580 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -158,5 +158,5 @@ module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
 
-module_param_named(use_mmio_flip, i915.use_mmio_flip, bool, 0600);
-MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (default: false)");
+module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (-1=never, 0=driver discretion [default], 1=always)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ac93ae4..b6c8fce 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9207,24 +9207,24 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	return 0;
 }
 
-static bool intel_use_mmio_flip(struct drm_device *dev)
+static bool use_mmio_flip(struct intel_engine_cs *ring,
+			  struct drm_i915_gem_object *obj)
 {
-	/* If module parameter is disabled, use CS flips.
-	 * Otherwise, use MMIO flips starting from Gen5.
-	 * This is not being used for older platforms, because
+	/* This is not being used for older platforms, because
 	 * non-availability of flip done interrupt forces us to use
 	 * CS flips. Older platforms derive flip done using some clever
 	 * tricks involving the flip_pending status bits and vblank irqs.
 	 * So using MMIO flips there would disrupt this mechanism.
 	 */
-
-	if (i915.use_mmio_flip == 0)
+	if (INTEL_INFO(ring->dev)->gen < 5)
 		return false;
 
-	if (INTEL_INFO(dev)->gen >= 5)
+	if (i915.use_mmio_flip < 0)
+		return false;
+	else if (i915.use_mmio_flip > 0)
 		return true;
 	else
-		return false;
+		return ring != obj->ring;
 }
 
 static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
@@ -9290,9 +9290,9 @@ void intel_notify_mmio_flip(struct intel_engine_cs *ring)
 		struct intel_mmio_flip *mmio_flip;
 
 		mmio_flip = &intel_crtc->mmio_flip;
-
 		if (mmio_flip->seqno == 0)
 			continue;
+
 		if (ring->id != mmio_flip->ring_id)
 			continue;
 
@@ -9306,26 +9306,25 @@ void intel_notify_mmio_flip(struct intel_engine_cs *ring)
 }
 
 static int intel_queue_mmio_flip(struct drm_device *dev,
-			struct drm_crtc *crtc,
-			struct drm_framebuffer *fb,
-			struct drm_i915_gem_object *obj,
-			struct intel_engine_cs *ring,
-			uint32_t flags)
+				 struct drm_crtc *crtc,
+				 struct drm_framebuffer *fb,
+				 struct drm_i915_gem_object *obj,
+				 struct intel_engine_cs *ring,
+				 uint32_t flags)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	unsigned long irq_flags;
 	int ret;
 
-	if (WARN_ON(intel_crtc->mmio_flip.seqno)) {
-		ret = -EBUSY;
-		goto err;
-	}
+	if (WARN_ON(intel_crtc->mmio_flip.seqno))
+		return -EBUSY;
 
 	ret = intel_postpone_flip(obj);
-	if (ret < 0) {
-		goto err;
-	} else if (ret == 0) {
+	if (ret < 0)
+		return ret;
+
+	if (ret == 0) {
 		intel_do_mmio_flip(intel_crtc);
 		return 0;
 	}
@@ -9340,8 +9339,6 @@ static int intel_queue_mmio_flip(struct drm_device *dev,
 	 */
 	intel_notify_mmio_flip(obj->ring);
 	return 0;
-err:
-	return ret;
 }
 
 static int intel_default_queue_flip(struct drm_device *dev,
@@ -9529,7 +9526,10 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 		i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
 	work->enable_stall_check = true;
 
-	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
+	if (use_mmio_flip(ring, obj))
+		ret = intel_queue_mmio_flip(dev, crtc, fb, obj, ring, page_flip_flags);
+	else
+		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
 	if (ret)
 		goto cleanup_unpin;
 
@@ -11775,9 +11775,6 @@ static void intel_init_display(struct drm_device *dev)
 		break;
 	}
 
-	if (intel_use_mmio_flip(dev))
-		dev_priv->display.queue_flip = intel_queue_mmio_flip;
-
 	intel_panel_init_backlight_funcs(dev);
 }
 
and squashed into the first. You can have a reviewed-by after that and 
I'll try and get a t-b.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply related	[flat|nested] 67+ messages in thread
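
The tristate proposed in the reply above (-1=never, 0=driver discretion, 1=always) reduces to a short decision function. The following is a minimal sketch with hypothetical stand-in types (`struct model_ring`, `struct model_obj`), not the real i915 structures; in discretion mode it picks an MMIO flip only when the ring chosen for the flip differs from the ring that last used the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for intel_engine_cs and drm_i915_gem_object. */
struct model_ring {
	int id;
	int gen;	/* hardware generation the ring belongs to */
};

struct model_obj {
	const struct model_ring *last_ring;	/* ring that last used the object */
};

/* param: -1 = never, 0 = driver discretion, 1 = always. */
bool model_use_mmio_flip(int param, const struct model_ring *flip_ring,
			 const struct model_obj *obj)
{
	assert(flip_ring != NULL && obj != NULL);

	/* Pre-Gen5 platforms have no flip-done interrupt; always use CS. */
	if (flip_ring->gen < 5)
		return false;

	if (param < 0)
		return false;
	if (param > 0)
		return true;

	/* Discretion: avoid waking a second ring just to issue the flip. */
	return flip_ring != obj->last_ring;
}
```

Putting the generation check first means that forcing the parameter to 1 still cannot enable MMIO flips on platforms older than Gen5.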

* [PATCH v10] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-05-30 10:49                                                                       ` Chris Wilson
@ 2014-06-01 11:13                                                                         ` sourab.gupta
  2014-06-02  6:56                                                                           ` Chris Wilson
  0 siblings, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-06-01 11:13 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch enables the framework for using MMIO based flip calls,
in contrast with the CS based flip calls which are being used currently.

MMIO based flip calls can be enabled on architectures where
Render and Blitter engines reside in different power wells. The
decision to use MMIO flips can be made based on workloads to give
100% residency for Media power well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

v5: Addressed Ville's review comments
    -Making mmio flip vs cs flip selection based on module parameter
    -Adding check for DRIVER_MODESET feature in notify_ring before calling
     notify mmio flip.
    -Other changes mostly in function arguments

v6: -Having a separate function to check condition for using mmio flips (Ville)
    -propagating error code from i915_gem_check_olr (Ville)

v7: -Adding __must_check with i915_gem_check_olr (Chris)
    -Renaming mmio_flip_data to mmio_flip (Chris)
    -Rebasing on latest nightly

v8: -Rebasing on latest code
    -squash 3rd patch in series(mmio setbase vs page flip race) with this patch
    -Added new tiling mode update in intel_do_mmio_flip (Chris)

v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
intel_postpone_flip, as this is a more restrictive condition (Chris)

v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
These patches make the selection of CS vs MMIO flip at the page flip time, and
make the module parameter for using mmio flips as tristate, the states being
'force CS flips', 'force mmio flips', 'driver discretion'.
Changed the logic for driver discretion (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   8 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   5 ++
 drivers/gpu/drm/i915/intel_display.c | 141 ++++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 164 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index b9159ad..532733a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1572,6 +1572,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bea9ab40..4d5dbec 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1367,6 +1367,9 @@ struct drm_i915_private {
 	/* protects the irq masks */
 	spinlock_t irq_lock;
 
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
+
 	bool display_irqs_enabled;
 
 	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
@@ -2036,6 +2039,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	int use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2231,6 +2235,8 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int __must_check i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno);
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2601,6 +2607,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_engine_cs *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 70b4f41..ab663ca 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+__must_check int
 i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4ef6423..e0edb1f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..6885de0 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 0,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -156,3 +157,7 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (-1=never, 0=driver "
+	"discretion [default], 1=always)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 731cd01..ab59f3a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9183,6 +9183,140 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	return 0;
 }
 
+static bool use_mmio_flip(struct intel_engine_cs *ring,
+			  struct drm_i915_gem_object *obj)
+{
+	 /* This is not being used for older platforms, because
+	 * non-availability of flip done interrupt forces us to use
+	 * CS flips. Older platforms derive flip done using some clever
+	 * tricks involving the flip_pending status bits and vblank irqs.
+	 * So using MMIO flips there would disrupt this mechanism.
+	 */
+
+	if (INTEL_INFO(ring->dev)->gen < 5)
+		return false;
+
+	if (i915.use_mmio_flip < 0)
+		return false;
+	else if (i915.use_mmio_flip > 0)
+		return true;
+	else
+		return ring != obj->ring;
+}
+
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_device *dev = intel_crtc->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+	u32 dspcntr;
+	u32 reg;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	reg = DSPCNTR(intel_crtc->plane);
+	dspcntr = I915_READ(reg);
+
+	if (INTEL_INFO(dev)->gen >= 4) {
+		if (obj->tiling_mode != I915_TILING_NONE)
+			dspcntr |= DISPPLANE_TILED;
+		else
+			dspcntr &= ~DISPPLANE_TILED;
+	}
+	I915_WRITE(reg, dspcntr);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane),
+			intel_crtc->unpin_work->gtt_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	int ret;
+
+	if (!obj->last_write_seqno)
+		return 0;
+
+	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
+	if (ret)
+		return ret;
+
+	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip;
+
+		mmio_flip = &intel_crtc->mmio_flip;
+		if (mmio_flip->seqno == 0)
+			continue;
+
+		if (ring->id != mmio_flip->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+		struct drm_crtc *crtc,
+		struct drm_framebuffer *fb,
+		struct drm_i915_gem_object *obj,
+		struct intel_engine_cs *ring,
+		uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	if (WARN_ON(intel_crtc->mmio_flip.seqno))
+		return -EBUSY;
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0)
+		return ret;
+	 if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/* Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+}
+
 static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
@@ -9290,7 +9424,12 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	work->gtt_offset =
 		i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
 
-	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
+	if (use_mmio_flip(ring, obj))
+		ret = intel_queue_mmio_flip(dev, crtc, fb, obj, ring,
+				page_flip_flags);
+	else
+		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring,
+				page_flip_flags);
 	if (ret)
 		goto cleanup_unpin;
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 9bb70dc..206b577 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -358,6 +358,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -411,6 +416,7 @@ struct intel_crtc {
 	wait_queue_head_t vbl_wait;
 
 	int scanline_offset;
+	struct intel_mmio_flip mmio_flip;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1


* Re: [PATCH v10] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-06-01 11:13                                                                         ` [PATCH v10] drm/i915: Replaced Blitter ring based flips with MMIO flips sourab.gupta
@ 2014-06-02  6:56                                                                           ` Chris Wilson
  2014-06-02 10:38                                                                             ` Gupta, Sourab
  0 siblings, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-06-02  6:56 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Deepak S, intel-gfx, Akash Goel

On Sun, Jun 01, 2014 at 04:43:13PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> This patch enables the framework for using MMIO based flip calls,
> in contrast with the CS based flip calls which are being used currently.
> 
> MMIO based flip calls can be enabled on architectures where
> Render and Blitter engines reside in different power wells. The
> decision to use MMIO flips can be made based on workloads to give
> 100% residency for Media power well.
> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> v3: Rebasing on latest code. Code restructuring after incorporating
> Damien's comments
> 
> v4: Addressing Ville's review comments
>     -general cleanup
>     -updating only base addr instead of calling update_primary_plane
>     -extending patch for gen5+ platforms
> 
> v5: Addressed Ville's review comments
>     -Making mmio flip vs cs flip selection based on module parameter
>     -Adding check for DRIVER_MODESET feature in notify_ring before calling
>      notify mmio flip.
>     -Other changes mostly in function arguments
> 
> v6: -Having a separate function to check condition for using mmio flips (Ville)
>     -propagating error code from i915_gem_check_olr (Ville)
> 
> v7: -Adding __must_check with i915_gem_check_olr (Chris)
>     -Renaming mmio_flip_data to mmio_flip (Chris)
>     -Rebasing on latest nightly
> 
> v8: -Rebasing on latest code
>     -squash 3rd patch in series(mmio setbase vs page flip race) with this patch
>     -Added new tiling mode update in intel_do_mmio_flip (Chris)
> 
> v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
> intel_postpone_flip, as this is a more restrictive condition (Chris)
> 
> v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
> These patches make the selection of CS vs MMIO flip at the page flip time, and
> make the module parameter for using mmio flips as tristate, the states being
> 'force CS flips', 'force mmio flips', 'driver discretion'.
> Changed the logic for driver discretion (Chris)
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # snb, ivb

Really happy with this now, just a few irrelevant bikesheds.

> -static int
> +__must_check int
>  i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
>  {

Only the declaration requires the __must_check attribute, we don't need
it here as well.
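
A minimal sketch of that split, assuming GCC or Clang. The macro and
function names below are hypothetical stand-ins, not the kernel's own
definitions; the attribute sits on the prototype only and the definition
stays plain:

```c
#include <assert.h>

/* Assumption: local stand-in for the kernel's __must_check macro. */
#define must_check_sketch __attribute__((warn_unused_result))

/*
 * Declaration, as it would appear in a header. Callers that include it
 * get a warning if they ignore the return value.
 */
int must_check_sketch check_olr_sketch(unsigned int outstanding,
				       unsigned int seqno);

/*
 * Definition. The attribute is not repeated; the earlier declaration
 * already attached it to the function.
 * Hypothetical behaviour: report 1 when seqno matches the outstanding
 * request, 0 otherwise.
 */
int check_olr_sketch(unsigned int outstanding, unsigned int seqno)
{
	return outstanding == seqno ? 1 : 0;
}
```

A caller that writes check_olr_sketch(5, 5); as a bare statement would be
warned about; assigning or testing the result is fine.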

>  	int ret;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 4ef6423..e0edb1f 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
>  
>  	trace_i915_gem_request_complete(ring);
>  
> +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> +		intel_notify_mmio_flip(ring);
> +

I wish Ville had suggested making UMS do the extra work of setting up
the spinlock instead.

> +static bool use_mmio_flip(struct intel_engine_cs *ring,
> +			  struct drm_i915_gem_object *obj)
> +{
> +	 /* This is not being used for older platforms, because
> +	 * non-availability of flip done interrupt forces us to use
> +	 * CS flips. Older platforms derive flip done using some clever
> +	 * tricks involving the flip_pending status bits and vblank irqs.
> +	 * So using MMIO flips there would disrupt this mechanism.
> +	 */
> +
> +	if (INTEL_INFO(ring->dev)->gen < 5)
> +		return false;
> +
> +	if (i915.use_mmio_flip < 0)
> +		return false;
> +	else if (i915.use_mmio_flip > 0)
> +		return true;
> +	else
> +		return ring != obj->ring;
> +}

Check whitespace.

> +
> +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> +{
> +	struct drm_device *dev = intel_crtc->base.dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_framebuffer *intel_fb =
> +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> +	struct drm_i915_gem_object *obj = intel_fb->obj;
> +	u32 dspcntr;
> +	u32 reg;
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +
> +	reg = DSPCNTR(intel_crtc->plane);
> +	dspcntr = I915_READ(reg);
> +
> +	if (INTEL_INFO(dev)->gen >= 4) {
> +		if (obj->tiling_mode != I915_TILING_NONE)
> +			dspcntr |= DISPPLANE_TILED;
> +		else
> +			dspcntr &= ~DISPPLANE_TILED;
> +	}
> +	I915_WRITE(reg, dspcntr);
> +
> +	I915_WRITE(DSPSURF(intel_crtc->plane),
> +			intel_crtc->unpin_work->gtt_offset);
> +	POSTING_READ(DSPSURF(intel_crtc->plane));
> +}
> +
> +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
        struct intel_engine_cs *ring = obj->ring; // will save our eyes
> +	int ret;
> +

        // I had to go back and check the locking, so save the
	// next person the same task.
        lockdep_assert_held(&obj->base.dev->struct_mutex);

> +	if (!obj->last_write_seqno)
> +		return 0;
> +
> +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> +				obj->last_write_seqno))
> +		return 0;
> +
> +	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
> +	if (ret)
> +		return ret;
> +
> +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +void intel_notify_mmio_flip(struct intel_engine_cs *ring)
> +{
> +	struct drm_i915_private *dev_priv = ring->dev->dev_private;

	struct drm_i915_private *dev_priv = to_i915(ring->dev);

> +	struct intel_crtc *intel_crtc;
> +	unsigned long irq_flags;
> +	u32 seqno;
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	for_each_intel_crtc(ring->dev, intel_crtc) {
> +		struct intel_mmio_flip *mmio_flip;
> +
> +		mmio_flip = &intel_crtc->mmio_flip;
> +		if (mmio_flip->seqno == 0)
> +			continue;
> +
> +		if (ring->id != mmio_flip->ring_id)
> +			continue;
> +
> +		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
> +			intel_do_mmio_flip(intel_crtc);
> +			mmio_flip->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +}
> +
> +static int intel_queue_mmio_flip(struct drm_device *dev,
> +		struct drm_crtc *crtc,
> +		struct drm_framebuffer *fb,
> +		struct drm_i915_gem_object *obj,
> +		struct intel_engine_cs *ring,
> +		uint32_t flags)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned long irq_flags;
> +	int ret;
> +
> +	if (WARN_ON(intel_crtc->mmio_flip.seqno))
> +		return -EBUSY;
> +
> +	ret = intel_postpone_flip(obj);
> +	if (ret < 0)
> +		return ret;
> +	 if (ret == 0) {
> +		intel_do_mmio_flip(intel_crtc);
> +		return 0;
> +	}
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
> +	intel_crtc->mmio_flip.ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +
> +	/* Double check to catch cases where irq fired before
> +	 * mmio flip data was ready
> +	 */
> +	intel_notify_mmio_flip(obj->ring);
> +	return 0;
> +}

Check whitespace.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH v10] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-06-02  6:56                                                                           ` Chris Wilson
@ 2014-06-02 10:38                                                                             ` Gupta, Sourab
  2014-06-02 10:56                                                                               ` Chris Wilson
  0 siblings, 1 reply; 67+ messages in thread
From: Gupta, Sourab @ 2014-06-02 10:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: S, Deepak, intel-gfx, Goel, Akash

On Mon, 2014-06-02 at 06:56 +0000, Chris Wilson wrote:
> On Sun, Jun 01, 2014 at 04:43:13PM +0530, sourab.gupta@intel.com wrote:
> > From: Sourab Gupta <sourab.gupta@intel.com>
> > 
> > This patch enables the framework for using MMIO based flip calls,
> > in contrast with the CS based flip calls which are being used currently.
> > 
> > MMIO based flip calls can be enabled on architectures where
> > Render and Blitter engines reside in different power wells. The
> > decision to use MMIO flips can be made based on workloads to give
> > 100% residency for Media power well.
> > 
> > v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> > flips when target seqno is reached. (Incorporating Ville's idea)
> > 
> > v3: Rebasing on latest code. Code restructuring after incorporating
> > Damien's comments
> > 
> > v4: Addressing Ville's review comments
> >     -general cleanup
> >     -updating only base addr instead of calling update_primary_plane
> >     -extending patch for gen5+ platforms
> > 
> > v5: Addressed Ville's review comments
> >     -Making mmio flip vs cs flip selection based on module parameter
> >     -Adding check for DRIVER_MODESET feature in notify_ring before calling
> >      notify mmio flip.
> >     -Other changes mostly in function arguments
> > 
> > v6: -Having a separate function to check condition for using mmio flips (Ville)
> >     -propagating error code from i915_gem_check_olr (Ville)
> > 
> > v7: -Adding __must_check with i915_gem_check_olr (Chris)
> >     -Renaming mmio_flip_data to mmio_flip (Chris)
> >     -Rebasing on latest nightly
> > 
> > v8: -Rebasing on latest code
> >     -squash 3rd patch in series(mmio setbase vs page flip race) with this patch
> >     -Added new tiling mode update in intel_do_mmio_flip (Chris)
> > 
> > v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
> > intel_postpone_flip, as this is a more restrictive condition (Chris)
> > 
> > v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
> > These patches make the selection of CS vs MMIO flip at the page flip time, and
> > make the module parameter for using mmio flips as tristate, the states being
> > 'force CS flips', 'force mmio flips', 'driver discretion'.
> > Changed the logic for driver discretion (Chris)
> > 
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> > Signed-off-by: Akash Goel <akash.goel@intel.com>
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # snb, ivb
> 
> Really happy with this now, just a few irrelevant bikesheds.
> 
> > -static int
> > +__must_check int
> >  i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
> >  {
> 
> Only the declaration requires the __must_check attribute, we don't need
> it here as well.
> 
> >  	int ret;
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 4ef6423..e0edb1f 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
> >  
> >  	trace_i915_gem_request_complete(ring);
> >  
> > +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> > +		intel_notify_mmio_flip(ring);
> > +
> 
> I wish Ville had suggested making UMS do the extra work of setting up
> the spinlock instead.
> 
> > +static bool use_mmio_flip(struct intel_engine_cs *ring,
> > +			  struct drm_i915_gem_object *obj)
> > +{
> > +	 /* This is not being used for older platforms, because
> > +	 * non-availability of flip done interrupt forces us to use
> > +	 * CS flips. Older platforms derive flip done using some clever
> > +	 * tricks involving the flip_pending status bits and vblank irqs.
> > +	 * So using MMIO flips there would disrupt this mechanism.
> > +	 */
> > +
> > +	if (INTEL_INFO(ring->dev)->gen < 5)
> > +		return false;
> > +
> > +	if (i915.use_mmio_flip < 0)
> > +		return false;
> > +	else if (i915.use_mmio_flip > 0)
> > +		return true;
> > +	else
> > +		return ring != obj->ring;
> > +}
> 
> Check whitespace.
> 
Hi Chris,
I couldn't find the whitespace error here or at the other places, and
checkpatch.pl doesn't report any. Could you please point it out?
Thanks,
Sourab
> > +
> > +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> > +{
> > +	struct drm_device *dev = intel_crtc->base.dev;
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_framebuffer *intel_fb =
> > +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> > +	struct drm_i915_gem_object *obj = intel_fb->obj;
> > +	u32 dspcntr;
> > +	u32 reg;
> > +
> > +	intel_mark_page_flip_active(intel_crtc);
> > +
> > +	reg = DSPCNTR(intel_crtc->plane);
> > +	dspcntr = I915_READ(reg);
> > +
> > +	if (INTEL_INFO(dev)->gen >= 4) {
> > +		if (obj->tiling_mode != I915_TILING_NONE)
> > +			dspcntr |= DISPPLANE_TILED;
> > +		else
> > +			dspcntr &= ~DISPPLANE_TILED;
> > +	}
> > +	I915_WRITE(reg, dspcntr);
> > +
> > +	I915_WRITE(DSPSURF(intel_crtc->plane),
> > +			intel_crtc->unpin_work->gtt_offset);
> > +	POSTING_READ(DSPSURF(intel_crtc->plane));
> > +}
> > +
> > +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> > +{
>         struct intel_engine_cs *ring = obj->ring; // will save our eyes
> > +	int ret;
> > +
> 
>         // I had to go back and check the locking, so save the
> 	// next person the same task.
>         lockdep_assert_held(&obj->base.dev->struct_mutex);
> 
> > +	if (!obj->last_write_seqno)
> > +		return 0;
> > +
> > +	if (i915_seqno_passed(obj->ring->get_seqno(obj->ring, true),
> > +				obj->last_write_seqno))
> > +		return 0;
> > +
> > +	ret = i915_gem_check_olr(obj->ring, obj->last_write_seqno);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (WARN_ON(!obj->ring->irq_get(obj->ring)))
> > +		return 0;
> > +
> > +	return 1;
> > +}
> > +
> > +void intel_notify_mmio_flip(struct intel_engine_cs *ring)
> > +{
> > +	struct drm_i915_private *dev_priv = ring->dev->dev_private;
> 
> 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
> 
> > +	struct intel_crtc *intel_crtc;
> > +	unsigned long irq_flags;
> > +	u32 seqno;
> > +
> > +	seqno = ring->get_seqno(ring, false);
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	for_each_intel_crtc(ring->dev, intel_crtc) {
> > +		struct intel_mmio_flip *mmio_flip;
> > +
> > +		mmio_flip = &intel_crtc->mmio_flip;
> > +		if (mmio_flip->seqno == 0)
> > +			continue;
> > +
> > +		if (ring->id != mmio_flip->ring_id)
> > +			continue;
> > +
> > +		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
> > +			intel_do_mmio_flip(intel_crtc);
> > +			mmio_flip->seqno = 0;
> > +			ring->irq_put(ring);
> > +		}
> > +	}
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +}
> > +
> > +static int intel_queue_mmio_flip(struct drm_device *dev,
> > +		struct drm_crtc *crtc,
> > +		struct drm_framebuffer *fb,
> > +		struct drm_i915_gem_object *obj,
> > +		struct intel_engine_cs *ring,
> > +		uint32_t flags)
> > +{
> > +	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> > +	unsigned long irq_flags;
> > +	int ret;
> > +
> > +	if (WARN_ON(intel_crtc->mmio_flip.seqno))
> > +		return -EBUSY;
> > +
> > +	ret = intel_postpone_flip(obj);
> > +	if (ret < 0)
> > +		return ret;
> > +	 if (ret == 0) {
> > +		intel_do_mmio_flip(intel_crtc);
> > +		return 0;
> > +	}
> > +
> > +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> > +	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
> > +	intel_crtc->mmio_flip.ring_id = obj->ring->id;
> > +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> > +
> > +	/* Double check to catch cases where irq fired before
> > +	 * mmio flip data was ready
> > +	 */
> > +	intel_notify_mmio_flip(obj->ring);
> > +	return 0;
> > +}
> 
> Check whitespace.
> -Chris
> 


* Re: [PATCH v10] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-06-02 10:38                                                                             ` Gupta, Sourab
@ 2014-06-02 10:56                                                                               ` Chris Wilson
  2014-06-02 11:17                                                                                 ` [PATCH v11] " sourab.gupta
  0 siblings, 1 reply; 67+ messages in thread
From: Chris Wilson @ 2014-06-02 10:56 UTC (permalink / raw)
  To: Gupta, Sourab; +Cc: S, Deepak, intel-gfx, Goel, Akash

On Mon, Jun 02, 2014 at 10:38:13AM +0000, Gupta, Sourab wrote:
> On Mon, 2014-06-02 at 06:56 +0000, Chris Wilson wrote:
> > On Sun, Jun 01, 2014 at 04:43:13PM +0530, sourab.gupta@intel.com wrote:
> > > +static bool use_mmio_flip(struct intel_engine_cs *ring,
> > > +			  struct drm_i915_gem_object *obj)
> > > +{
> > > +	 /* This is not being used for older platforms, because
> > > +	 * non-availability of flip done interrupt forces us to use
> > > +	 * CS flips. Older platforms derive flip done using some clever
> > > +	 * tricks involving the flip_pending status bits and vblank irqs.
> > > +	 * So using MMIO flips there would disrupt this mechanism.
> > > +	 */
> > > +
> > > +	if (INTEL_INFO(ring->dev)->gen < 5)
> > > +		return false;
> > > +
> > > +	if (i915.use_mmio_flip < 0)
> > > +		return false;
> > > +	else if (i915.use_mmio_flip > 0)
> > > +		return true;
> > > +	else
> > > +		return ring != obj->ring;
> > > +}
> > 
> > Check whitespace.
> > 
> Hi Chris,
> I couldn't find the whitespace error here or at the other places, and
> checkpatch.pl doesn't report any. Could you please point it out?

It's the alignment of
/*
 *
 */
that needs to be checked, and later the if (ret == 0) looks to be using
a mixture of tabs and spaces.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* [PATCH v11] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-06-02 10:56                                                                               ` Chris Wilson
@ 2014-06-02 11:17                                                                                 ` sourab.gupta
  2014-06-17 14:14                                                                                   ` Daniel Vetter
  0 siblings, 1 reply; 67+ messages in thread
From: sourab.gupta @ 2014-06-02 11:17 UTC (permalink / raw)
  To: intel-gfx; +Cc: Deepak S, Akash Goel, Sourab Gupta

From: Sourab Gupta <sourab.gupta@intel.com>

This patch enables the framework for using MMIO based flip calls,
in contrast with the CS based flip calls which are being used currently.

MMIO based flip calls can be enabled on architectures where
Render and Blitter engines reside in different power wells. The
decision to use MMIO flips can be made based on workloads to give
100% residency for Media power well.

v2: The MMIO flips now use the interrupt driven mechanism for issuing the
flips when target seqno is reached. (Incorporating Ville's idea)
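
The "target seqno is reached" test has to survive 32-bit wraparound of the
sequence counter. A minimal standalone sketch of such a comparison follows;
the function name is hypothetical, and the signed-difference idiom shown is
the conventional one for this kind of check rather than a copy of the
driver code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when seq1 is at or after seq2, treating the 32-bit
 * counter as circular. The unsigned subtraction wraps modulo 2^32;
 * reinterpreting the difference as signed gives the right answer while
 * the two values are less than 2^31 apart.
 * Note: converting an out-of-range value to int32_t is
 * implementation-defined in C; mainstream compilers use two's
 * complement truncation, which this sketch relies on.
 */
static bool seqno_passed_sketch(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}
```

With this, a counter that has just wrapped (for example 3) still counts as
having passed a target issued shortly before the wrap (for example
0xfffffff0), which a plain seq1 >= seq2 test would get wrong.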

v3: Rebasing on latest code. Code restructuring after incorporating
Damien's comments

v4: Addressing Ville's review comments
    -general cleanup
    -updating only base addr instead of calling update_primary_plane
    -extending patch for gen5+ platforms

v5: Addressed Ville's review comments
    -Making mmio flip vs cs flip selection based on module parameter
    -Adding check for DRIVER_MODESET feature in notify_ring before calling
     notify mmio flip.
    -Other changes mostly in function arguments

v6: -Having a separate function to check condition for using mmio flips (Ville)
    -propagating error code from i915_gem_check_olr (Ville)

v7: -Adding __must_check with i915_gem_check_olr (Chris)
    -Renaming mmio_flip_data to mmio_flip (Chris)
    -Rebasing on latest nightly

v8: -Rebasing on latest code
    -squash 3rd patch in series(mmio setbase vs page flip race) with this patch
    -Added new tiling mode update in intel_do_mmio_flip (Chris)

v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
intel_postpone_flip, as this is a more restrictive condition (Chris)

v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
These patches make the selection of CS vs MMIO flip at the page flip time, and
make the module parameter for using mmio flips as tristate, the states being
'force CS flips', 'force mmio flips', 'driver discretion'.
Changed the logic for driver discretion (Chris)
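
The tristate selection can be sketched in isolation as below. This is a
simplified standalone model, not the driver code: the struct and its field
names are hypothetical, and rings are reduced to integer ids. It mirrors
the order of the checks in use_mmio_flip(): generation gate first, then the
forced states, then driver discretion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs to the decision, flattened for illustration. */
struct flip_request_sketch {
	int gen;           /* hardware generation */
	int use_mmio_flip; /* module parameter: <0 never, 0 discretion, >0 always */
	int flip_ring_id;  /* ring a CS flip would be queued on */
	int obj_ring_id;   /* ring that last used the framebuffer object */
};

static bool use_mmio_flip_sketch(const struct flip_request_sketch *req)
{
	/* Platforms before gen5 always use CS flips. */
	if (req->gen < 5)
		return false;

	if (req->use_mmio_flip < 0)
		return false;	/* force CS flips */
	if (req->use_mmio_flip > 0)
		return true;	/* force MMIO flips */

	/* Driver discretion: MMIO only when the rings differ. */
	return req->flip_ring_id != req->obj_ring_id;
}
```

Under driver discretion, a flip whose object was last used on the same
ring stays a CS flip, so MMIO is only chosen when a CS flip would involve a
second ring.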

v11: Minor code cleanup (better readability, fixing whitespace errors, using
lockdep to check mutex locked status in postpone_flip, removal of __must_check
in function definition) (Chris)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # snb, ivb
---
 drivers/gpu/drm/i915/i915_dma.c      |   1 +
 drivers/gpu/drm/i915/i915_drv.h      |   8 ++
 drivers/gpu/drm/i915/i915_gem.c      |   2 +-
 drivers/gpu/drm/i915/i915_irq.c      |   3 +
 drivers/gpu/drm/i915/i915_params.c   |   5 ++
 drivers/gpu/drm/i915/intel_display.c | 148 ++++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_drv.h     |   6 ++
 7 files changed, 171 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index b9159ad..532733a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1572,6 +1572,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	spin_lock_init(&dev_priv->backlight_lock);
 	spin_lock_init(&dev_priv->uncore.lock);
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
+	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->dpio_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bea9ab40..4d5dbec 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1367,6 +1367,9 @@ struct drm_i915_private {
 	/* protects the irq masks */
 	spinlock_t irq_lock;
 
+	/* protects the mmio flip data */
+	spinlock_t mmio_flip_lock;
+
 	bool display_irqs_enabled;
 
 	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
@@ -2036,6 +2039,7 @@ struct i915_params {
 	bool reset;
 	bool disable_display;
 	bool disable_vtd_wa;
+	int use_mmio_flip;
 };
 extern struct i915_params i915 __read_mostly;
 
@@ -2231,6 +2235,8 @@ bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
+int __must_check i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno);
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
 	return unlikely(atomic_read(&error->reset_counter)
@@ -2601,6 +2607,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
 int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file);
 
+void intel_notify_mmio_flip(struct intel_engine_cs *ring);
+
 /* overlay */
 extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
 extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 70b4f41..19e56b7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
  * Compare seqno against outstanding lazy request. Emit a request if they are
  * equal.
  */
-static int
+int
 i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4ef6423..e0edb1f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
 
 	trace_i915_gem_request_complete(ring);
 
+	if (drm_core_check_feature(dev, DRIVER_MODESET))
+		intel_notify_mmio_flip(ring);
+
 	wake_up_all(&ring->irq_queue);
 	i915_queue_hangcheck(dev);
 }
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index d05a2af..6885de0 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
 	.disable_display = 0,
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
+	.use_mmio_flip = 0,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -156,3 +157,7 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
 module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
 MODULE_PARM_DESC(enable_cmd_parser,
 		 "Enable command parsing (1=enabled [default], 0=disabled)");
+
+module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
+MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (-1=never, 0=driver "
+	"discretion [default], 1=always)");
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 731cd01..638f4b1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -9183,6 +9183,147 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	return 0;
 }
 
+static bool use_mmio_flip(struct intel_engine_cs *ring,
+			  struct drm_i915_gem_object *obj)
+{
+	/*
+	 * This is not being used for older platforms, because
+	 * non-availability of flip done interrupt forces us to use
+	 * CS flips. Older platforms derive flip done using some clever
+	 * tricks involving the flip_pending status bits and vblank irqs.
+	 * So using MMIO flips there would disrupt this mechanism.
+	 */
+
+	if (INTEL_INFO(ring->dev)->gen < 5)
+		return false;
+
+	if (i915.use_mmio_flip < 0)
+		return false;
+	else if (i915.use_mmio_flip > 0)
+		return true;
+	else
+		return ring != obj->ring;
+}
+
+static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
+{
+	struct drm_device *dev = intel_crtc->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_framebuffer *intel_fb =
+		to_intel_framebuffer(intel_crtc->base.primary->fb);
+	struct drm_i915_gem_object *obj = intel_fb->obj;
+	u32 dspcntr;
+	u32 reg;
+
+	intel_mark_page_flip_active(intel_crtc);
+
+	reg = DSPCNTR(intel_crtc->plane);
+	dspcntr = I915_READ(reg);
+
+	if (INTEL_INFO(dev)->gen >= 4) {
+		if (obj->tiling_mode != I915_TILING_NONE)
+			dspcntr |= DISPPLANE_TILED;
+		else
+			dspcntr &= ~DISPPLANE_TILED;
+	}
+	I915_WRITE(reg, dspcntr);
+
+	I915_WRITE(DSPSURF(intel_crtc->plane),
+			intel_crtc->unpin_work->gtt_offset);
+	POSTING_READ(DSPSURF(intel_crtc->plane));
+}
+
+static int intel_postpone_flip(struct drm_i915_gem_object *obj)
+{
+	struct intel_engine_cs *ring;
+	int ret;
+
+	lockdep_assert_held(&obj->base.dev->struct_mutex);
+
+	if (!obj->last_write_seqno)
+		return 0;
+
+	ring = obj->ring;
+
+	if (i915_seqno_passed(ring->get_seqno(ring, true),
+				obj->last_write_seqno))
+		return 0;
+
+	ret = i915_gem_check_olr(ring, obj->last_write_seqno);
+	if (ret)
+		return ret;
+
+	if (WARN_ON(!ring->irq_get(ring)))
+		return 0;
+
+	return 1;
+}
+
+void intel_notify_mmio_flip(struct intel_engine_cs *ring)
+{
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	struct intel_crtc *intel_crtc;
+	unsigned long irq_flags;
+	u32 seqno;
+
+	seqno = ring->get_seqno(ring, false);
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	for_each_intel_crtc(ring->dev, intel_crtc) {
+		struct intel_mmio_flip *mmio_flip;
+
+		mmio_flip = &intel_crtc->mmio_flip;
+		if (mmio_flip->seqno == 0)
+			continue;
+
+		if (ring->id != mmio_flip->ring_id)
+			continue;
+
+		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
+			intel_do_mmio_flip(intel_crtc);
+			mmio_flip->seqno = 0;
+			ring->irq_put(ring);
+		}
+	}
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+}
+
+static int intel_queue_mmio_flip(struct drm_device *dev,
+		struct drm_crtc *crtc,
+		struct drm_framebuffer *fb,
+		struct drm_i915_gem_object *obj,
+		struct intel_engine_cs *ring,
+		uint32_t flags)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned long irq_flags;
+	int ret;
+
+	if (WARN_ON(intel_crtc->mmio_flip.seqno))
+		return -EBUSY;
+
+	ret = intel_postpone_flip(obj);
+	if (ret < 0)
+		return ret;
+	if (ret == 0) {
+		intel_do_mmio_flip(intel_crtc);
+		return 0;
+	}
+
+	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
+	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
+	intel_crtc->mmio_flip.ring_id = obj->ring->id;
+	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
+
+	/*
+	 * Double check to catch cases where irq fired before
+	 * mmio flip data was ready
+	 */
+	intel_notify_mmio_flip(obj->ring);
+	return 0;
+}
+
 static int intel_default_queue_flip(struct drm_device *dev,
 				    struct drm_crtc *crtc,
 				    struct drm_framebuffer *fb,
@@ -9290,7 +9431,12 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	work->gtt_offset =
 		i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
 
-	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
+	if (use_mmio_flip(ring, obj))
+		ret = intel_queue_mmio_flip(dev, crtc, fb, obj, ring,
+				page_flip_flags);
+	else
+		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring,
+				page_flip_flags);
 	if (ret)
 		goto cleanup_unpin;
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 9bb70dc..206b577 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -358,6 +358,11 @@ struct intel_pipe_wm {
 	bool sprites_scaled;
 };
 
+struct intel_mmio_flip {
+	u32 seqno;
+	u32 ring_id;
+};
+
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
@@ -411,6 +416,7 @@ struct intel_crtc {
 	wait_queue_head_t vbl_wait;
 
 	int scanline_offset;
+	struct intel_mmio_flip mmio_flip;
 };
 
 struct intel_plane_wm_parameters {
-- 
1.8.5.1
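
For readers following the selection logic in use_mmio_flip() above, here is
a minimal user-space sketch of the same decision order: generation check
first, then the tristate module parameter, then driver discretion. The
model_* types and the function name are hypothetical stand-ins introduced
only for illustration; they are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver types; only the fields used here. */
struct model_ring {
	int id;
};

struct model_obj {
	const struct model_ring *ring;	/* ring that last used the buffer */
};

/*
 * Follows the decision order of use_mmio_flip() in the patch.
 * gen and param are passed in explicitly instead of being read from
 * INTEL_INFO() and i915.use_mmio_flip.
 */
static bool model_use_mmio_flip(int gen, int param,
				const struct model_ring *flip_ring,
				const struct model_obj *obj)
{
	assert(flip_ring && obj);

	if (gen < 5)
		return false;	/* older platforms stay on CS flips */
	if (param < 0)
		return false;	/* -1: never use MMIO flips */
	if (param > 0)
		return true;	/* 1: always use MMIO flips */

	/* 0: driver discretion, MMIO only if the flip ring differs */
	return flip_ring != obj->ring;
}
```

With the default parameter value of 0, this model picks an MMIO flip only
when the ring chosen for the flip differs from the ring that last used the
object, which is the case where a CS flip would need the other ring.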


* Re: [PATCH v11] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-06-02 11:17                                                                                 ` [PATCH v11] " sourab.gupta
@ 2014-06-17 14:14                                                                                   ` Daniel Vetter
  2014-06-17 14:17                                                                                     ` Chris Wilson
  0 siblings, 1 reply; 67+ messages in thread
From: Daniel Vetter @ 2014-06-17 14:14 UTC (permalink / raw)
  To: sourab.gupta; +Cc: Deepak S, intel-gfx, Akash Goel

On Mon, Jun 02, 2014 at 04:47:17PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> This patch enables the framework for using MMIO based flip calls,
> in contrast with the CS based flip calls which are being used currently.
> 
> MMIO based flip calls can be enabled on architectures where
> Render and Blitter engines reside in different power wells. The
> decision to use MMIO flips can be made based on workloads to give
> 100% residency for Media power well.
> 
> v2: The MMIO flips now use the interrupt driven mechanism for issuing the
> flips when target seqno is reached. (Incorporating Ville's idea)
> 
> v3: Rebasing on latest code. Code restructuring after incorporating
> Damien's comments
> 
> v4: Addressing Ville's review comments
>     -general cleanup
>     -updating only base addr instead of calling update_primary_plane
>     -extending patch for gen5+ platforms
> 
> v5: Addressed Ville's review comments
>     -Making mmio flip vs cs flip selection based on module parameter
>     -Adding check for DRIVER_MODESET feature in notify_ring before calling
>      notify mmio flip.
>     -Other changes mostly in function arguments
> 
> v6: -Having a separate function to check condition for using mmio flips (Ville)
>     -propagating error code from i915_gem_check_olr (Ville)
> 
> v7: -Adding __must_check with i915_gem_check_olr (Chris)
>     -Renaming mmio_flip_data to mmio_flip (Chris)
>     -Rebasing on latest nightly
> 
> v8: -Rebasing on latest code
>     -squash 3rd patch in series (mmio setbase vs page flip race) with this patch
>     -Added new tiling mode update in intel_do_mmio_flip (Chris)
> 
> v9: -check for obj->last_write_seqno being 0 instead of obj->ring being NULL in
> intel_postpone_flip, as this is a more restrictive condition (Chris)
> 
> v10: -Applied Chris's suggestions for squashing patches 2,3 into this patch.
> These patches make the selection of CS vs MMIO flip at the page flip time, and
> make the module parameter for using mmio flips as tristate, the states being
> 'force CS flips', 'force mmio flips', 'driver discretion'.
> Changed the logic for driver discretion (Chris)
> 
> v11: Minor code cleanup (better readability, fixing whitespace errors, using
> lockdep to check mutex locked status in postpone_flip, removal of __must_check
> in function definition) (Chris)
> 
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # snb, ivb

Queued for -next, thanks for the patch. Aside: Checkpatch complained about
some unaligned function parameters. Please try to get that right for the
next patch since it really helps with readability. Fixed up while
applying.
-Daniel
> ---
>  drivers/gpu/drm/i915/i915_dma.c      |   1 +
>  drivers/gpu/drm/i915/i915_drv.h      |   8 ++
>  drivers/gpu/drm/i915/i915_gem.c      |   2 +-
>  drivers/gpu/drm/i915/i915_irq.c      |   3 +
>  drivers/gpu/drm/i915/i915_params.c   |   5 ++
>  drivers/gpu/drm/i915/intel_display.c | 148 ++++++++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_drv.h     |   6 ++
>  7 files changed, 171 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index b9159ad..532733a 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -1572,6 +1572,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	spin_lock_init(&dev_priv->backlight_lock);
>  	spin_lock_init(&dev_priv->uncore.lock);
>  	spin_lock_init(&dev_priv->mm.object_stat_lock);
> +	spin_lock_init(&dev_priv->mmio_flip_lock);
>  	mutex_init(&dev_priv->dpio_lock);
>  	mutex_init(&dev_priv->modeset_restore_lock);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bea9ab40..4d5dbec 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1367,6 +1367,9 @@ struct drm_i915_private {
>  	/* protects the irq masks */
>  	spinlock_t irq_lock;
>  
> +	/* protects the mmio flip data */
> +	spinlock_t mmio_flip_lock;
> +
>  	bool display_irqs_enabled;
>  
>  	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
> @@ -2036,6 +2039,7 @@ struct i915_params {
>  	bool reset;
>  	bool disable_display;
>  	bool disable_vtd_wa;
> +	int use_mmio_flip;
>  };
>  extern struct i915_params i915 __read_mostly;
>  
> @@ -2231,6 +2235,8 @@ bool i915_gem_retire_requests(struct drm_device *dev);
>  void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
>  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
>  				      bool interruptible);
> +int __must_check i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno);
> +
>  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
>  {
>  	return unlikely(atomic_read(&error->reset_counter)
> @@ -2601,6 +2607,8 @@ int i915_reg_read_ioctl(struct drm_device *dev, void *data,
>  int i915_get_reset_stats_ioctl(struct drm_device *dev, void *data,
>  			       struct drm_file *file);
>  
> +void intel_notify_mmio_flip(struct intel_engine_cs *ring);
> +
>  /* overlay */
>  extern struct intel_overlay_error_state *intel_overlay_capture_error_state(struct drm_device *dev);
>  extern void intel_overlay_print_error_state(struct drm_i915_error_state_buf *e,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 70b4f41..19e56b7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1096,7 +1096,7 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
>   * Compare seqno against outstanding lazy request. Emit a request if they are
>   * equal.
>   */
> -static int
> +int
>  i915_gem_check_olr(struct intel_engine_cs *ring, u32 seqno)
>  {
>  	int ret;
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 4ef6423..e0edb1f 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1218,6 +1218,9 @@ static void notify_ring(struct drm_device *dev,
>  
>  	trace_i915_gem_request_complete(ring);
>  
> +	if (drm_core_check_feature(dev, DRIVER_MODESET))
> +		intel_notify_mmio_flip(ring);
> +
>  	wake_up_all(&ring->irq_queue);
>  	i915_queue_hangcheck(dev);
>  }
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index d05a2af..6885de0 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -48,6 +48,7 @@ struct i915_params i915 __read_mostly = {
>  	.disable_display = 0,
>  	.enable_cmd_parser = 1,
>  	.disable_vtd_wa = 0,
> +	.use_mmio_flip = 0,
>  };
>  
>  module_param_named(modeset, i915.modeset, int, 0400);
> @@ -156,3 +157,7 @@ MODULE_PARM_DESC(disable_vtd_wa, "Disable all VT-d workarounds (default: false)"
>  module_param_named(enable_cmd_parser, i915.enable_cmd_parser, int, 0600);
>  MODULE_PARM_DESC(enable_cmd_parser,
>  		 "Enable command parsing (1=enabled [default], 0=disabled)");
> +
> +module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
> +MODULE_PARM_DESC(use_mmio_flip, "use MMIO flips (-1=never, 0=driver "
> +	"discretion [default], 1=always)");
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 731cd01..638f4b1 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -9183,6 +9183,147 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
>  	return 0;
>  }
>  
> +static bool use_mmio_flip(struct intel_engine_cs *ring,
> +			  struct drm_i915_gem_object *obj)
> +{
> +	/*
> +	 * This is not being used for older platforms, because
> +	 * non-availability of flip done interrupt forces us to use
> +	 * CS flips. Older platforms derive flip done using some clever
> +	 * tricks involving the flip_pending status bits and vblank irqs.
> +	 * So using MMIO flips there would disrupt this mechanism.
> +	 */
> +
> +	if (INTEL_INFO(ring->dev)->gen < 5)
> +		return false;
> +
> +	if (i915.use_mmio_flip < 0)
> +		return false;
> +	else if (i915.use_mmio_flip > 0)
> +		return true;
> +	else
> +		return ring != obj->ring;
> +}
> +
> +static void intel_do_mmio_flip(struct intel_crtc *intel_crtc)
> +{
> +	struct drm_device *dev = intel_crtc->base.dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_framebuffer *intel_fb =
> +		to_intel_framebuffer(intel_crtc->base.primary->fb);
> +	struct drm_i915_gem_object *obj = intel_fb->obj;
> +	u32 dspcntr;
> +	u32 reg;
> +
> +	intel_mark_page_flip_active(intel_crtc);
> +
> +	reg = DSPCNTR(intel_crtc->plane);
> +	dspcntr = I915_READ(reg);
> +
> +	if (INTEL_INFO(dev)->gen >= 4) {
> +		if (obj->tiling_mode != I915_TILING_NONE)
> +			dspcntr |= DISPPLANE_TILED;
> +		else
> +			dspcntr &= ~DISPPLANE_TILED;
> +	}
> +	I915_WRITE(reg, dspcntr);
> +
> +	I915_WRITE(DSPSURF(intel_crtc->plane),
> +			intel_crtc->unpin_work->gtt_offset);
> +	POSTING_READ(DSPSURF(intel_crtc->plane));
> +}
> +
> +static int intel_postpone_flip(struct drm_i915_gem_object *obj)
> +{
> +	struct intel_engine_cs *ring;
> +	int ret;
> +
> +	lockdep_assert_held(&obj->base.dev->struct_mutex);
> +
> +	if (!obj->last_write_seqno)
> +		return 0;
> +
> +	ring = obj->ring;
> +
> +	if (i915_seqno_passed(ring->get_seqno(ring, true),
> +				obj->last_write_seqno))
> +		return 0;
> +
> +	ret = i915_gem_check_olr(ring, obj->last_write_seqno);
> +	if (ret)
> +		return ret;
> +
> +	if (WARN_ON(!ring->irq_get(ring)))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +void intel_notify_mmio_flip(struct intel_engine_cs *ring)
> +{
> +	struct drm_i915_private *dev_priv = to_i915(ring->dev);
> +	struct intel_crtc *intel_crtc;
> +	unsigned long irq_flags;
> +	u32 seqno;
> +
> +	seqno = ring->get_seqno(ring, false);
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	for_each_intel_crtc(ring->dev, intel_crtc) {
> +		struct intel_mmio_flip *mmio_flip;
> +
> +		mmio_flip = &intel_crtc->mmio_flip;
> +		if (mmio_flip->seqno == 0)
> +			continue;
> +
> +		if (ring->id != mmio_flip->ring_id)
> +			continue;
> +
> +		if (i915_seqno_passed(seqno, mmio_flip->seqno)) {
> +			intel_do_mmio_flip(intel_crtc);
> +			mmio_flip->seqno = 0;
> +			ring->irq_put(ring);
> +		}
> +	}
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +}
> +
> +static int intel_queue_mmio_flip(struct drm_device *dev,
> +		struct drm_crtc *crtc,
> +		struct drm_framebuffer *fb,
> +		struct drm_i915_gem_object *obj,
> +		struct intel_engine_cs *ring,
> +		uint32_t flags)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned long irq_flags;
> +	int ret;
> +
> +	if (WARN_ON(intel_crtc->mmio_flip.seqno))
> +		return -EBUSY;
> +
> +	ret = intel_postpone_flip(obj);
> +	if (ret < 0)
> +		return ret;
> +	if (ret == 0) {
> +		intel_do_mmio_flip(intel_crtc);
> +		return 0;
> +	}
> +
> +	spin_lock_irqsave(&dev_priv->mmio_flip_lock, irq_flags);
> +	intel_crtc->mmio_flip.seqno = obj->last_write_seqno;
> +	intel_crtc->mmio_flip.ring_id = obj->ring->id;
> +	spin_unlock_irqrestore(&dev_priv->mmio_flip_lock, irq_flags);
> +
> +	/*
> +	 * Double check to catch cases where irq fired before
> +	 * mmio flip data was ready
> +	 */
> +	intel_notify_mmio_flip(obj->ring);
> +	return 0;
> +}
> +
>  static int intel_default_queue_flip(struct drm_device *dev,
>  				    struct drm_crtc *crtc,
>  				    struct drm_framebuffer *fb,
> @@ -9290,7 +9431,12 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>  	work->gtt_offset =
>  		i915_gem_obj_ggtt_offset(obj) + intel_crtc->dspaddr_offset;
>  
> -	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
> +	if (use_mmio_flip(ring, obj))
> +		ret = intel_queue_mmio_flip(dev, crtc, fb, obj, ring,
> +				page_flip_flags);
> +	else
> +		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring,
> +				page_flip_flags);
>  	if (ret)
>  		goto cleanup_unpin;
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 9bb70dc..206b577 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -358,6 +358,11 @@ struct intel_pipe_wm {
>  	bool sprites_scaled;
>  };
>  
> +struct intel_mmio_flip {
> +	u32 seqno;
> +	u32 ring_id;
> +};
> +
>  struct intel_crtc {
>  	struct drm_crtc base;
>  	enum pipe pipe;
> @@ -411,6 +416,7 @@ struct intel_crtc {
>  	wait_queue_head_t vbl_wait;
>  
>  	int scanline_offset;
> +	struct intel_mmio_flip mmio_flip;
>  };
>  
>  struct intel_plane_wm_parameters {
> -- 
> 1.8.5.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
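
The quoted intel_postpone_flip() and intel_notify_mmio_flip() both depend on
a wraparound-safe seqno comparison and on a per-CRTC pending record in which
seqno == 0 means "nothing queued". A minimal user-space sketch of those two
pieces follows. seqno_passed() here is assumed to behave like the driver's
i915_seqno_passed(); struct model_flip and model_notify() are hypothetical
names, and the locking, the irq_put() call and the register writes are
deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if seq1 is at or after seq2, tolerating 32-bit wraparound.
 * The unsigned subtraction wraps; the cast to int32_t then reads the
 * distance as signed (two's complement, as on all common compilers).
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical model of the per-CRTC record; seqno == 0 means idle. */
struct model_flip {
	uint32_t seqno;
	uint32_t ring_id;
};

/*
 * Models one iteration of the loop in intel_notify_mmio_flip():
 * returns true (and clears the record) when the flip should be issued.
 */
static bool model_notify(struct model_flip *flip, uint32_t ring_id,
			 uint32_t current_seqno)
{
	if (flip->seqno == 0)
		return false;		/* nothing pending on this CRTC */
	if (flip->ring_id != ring_id)
		return false;		/* interrupt came from another ring */
	if (!seqno_passed(current_seqno, flip->seqno))
		return false;		/* rendering not finished yet */

	flip->seqno = 0;		/* consume the pending flip */
	return true;
}
```

Because the record is cleared as soon as the flip is issued, calling the
notify step a second time (as the "double check" at the end of
intel_queue_mmio_flip() may do) is harmless in this model.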


* Re: [PATCH v11] drm/i915: Replaced Blitter ring based flips with MMIO flips
  2014-06-17 14:14                                                                                   ` Daniel Vetter
@ 2014-06-17 14:17                                                                                     ` Chris Wilson
  0 siblings, 0 replies; 67+ messages in thread
From: Chris Wilson @ 2014-06-17 14:17 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Deepak S, Akash Goel, sourab.gupta, intel-gfx

On Tue, Jun 17, 2014 at 04:14:37PM +0200, Daniel Vetter wrote:
> Queued for -next, thanks for the patch. Aside: Checkpatch complained about
> some unaligned function parameters. Please try to get that right for the
> next patch since it really helps with readability. Fixed up while
> applying.

Oh well, I have a slight refinement because I grew tired of seeing
spurious syncs being reported:

> > -	ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring, page_flip_flags);
> > +	if (use_mmio_flip(ring, obj))

Just before this point we call pin_and_fence, but really we want to do
so here instead with

pin_and_fence(obj, obj->ring);
work->gtt_offset = blah;

> > +		ret = intel_queue_mmio_flip(dev, crtc, fb, obj, ring,
> > +				page_flip_flags);
> > +	else

and

pin_and_fence(obj, ring);
work->gtt_offset = blah;

here
> > +		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, ring,
> > +				page_flip_flags);
> >  	if (ret)
> >  		goto cleanup_unpin;

-- 
Chris Wilson, Intel Open Source Technology Centre
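
One possible reading of the refinement suggested above, as a user-space
sketch: choose the ring handed to the pin step per flip path, so that an
MMIO flip pins against the object's own ring and no cross-ring sync is
requested. All names below (model_ring, model_obj, model_pin_ring,
model_needs_sync) are hypothetical, and the rule that a sync is needed only
when the pin ring differs from the object's last ring is an assumption of
this model, not something taken from the driver source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types. */
struct model_ring {
	int id;
};

struct model_obj {
	const struct model_ring *ring;	/* ring that last used the buffer */
};

/*
 * Ring to pass to the pin step: the object's own ring for an MMIO flip,
 * the flip ring for a CS flip.
 */
static const struct model_ring *
model_pin_ring(bool mmio_flip, const struct model_ring *flip_ring,
	       const struct model_obj *obj)
{
	return mmio_flip ? obj->ring : flip_ring;
}

/* Assumed rule: a sync is needed only when the pin ring differs. */
static bool model_needs_sync(const struct model_ring *pin_ring,
			     const struct model_obj *obj)
{
	return obj->ring != NULL && pin_ring != obj->ring;
}
```

Under these assumptions the MMIO path never asks for a sync, which would
match the stated goal of no longer seeing spurious syncs reported.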


Thread overview: 67+ messages
2014-01-09 11:26 [PATCH 0/2] Using MMIO based flips on VLV akash.goel
2014-01-09 11:26 ` [PATCH 1/2] drm/i915: Creating a new workqueue to handle MMIO flip work items akash.goel
2014-01-09 11:26 ` [PATCH 2/2] drm/i915/vlv: Replaced Blitter ring based flips with MMIO Flips for VLV akash.goel
2014-01-09 11:31   ` Chris Wilson
2014-01-13  9:47     ` Goel, Akash
2014-02-07 11:59       ` Goel, Akash
2014-02-07 14:47         ` Daniel Vetter
2014-02-07 17:17           ` Goel, Akash
     [not found]           ` <8BF5CF93467D8C498F250C96583BC09CC718E3@BGSMSX103.gar.corp.intel.com>
2014-03-07 13:17             ` Gupta, Sourab
2014-03-07 13:30               ` Ville Syrjälä
2014-03-13  7:21                 ` [PATCH] drm/i915: Replaced Blitter ring based flips with MMIO flips " sourab.gupta
2014-03-13  9:01                 ` [PATCH v2] " sourab.gupta
2014-03-17  4:33                   ` Gupta, Sourab
2014-03-21 17:10                   ` Gupta, Sourab
2014-03-21 18:15                   ` Damien Lespiau
2014-03-23  9:01                     ` [PATCH v3] " sourab.gupta
2014-03-26  7:49                       ` Gupta, Sourab
2014-04-03  8:40                         ` Gupta, Sourab
2014-04-07 11:19                           ` Gupta, Sourab
2014-05-09 11:59                       ` Ville Syrjälä
2014-05-09 13:28                         ` Ville Syrjälä
2014-05-09 17:18                         ` Ville Syrjälä
2014-05-15  6:17                           ` [PATCH v4] " sourab.gupta
2014-05-15 12:27                             ` Ville Syrjälä
2014-05-16 12:34                               ` Gupta, Sourab
2014-05-16 12:51                                 ` Ville Syrjälä
2014-05-19  9:19                                   ` Gupta, Sourab
2014-05-19 10:58                                   ` [PATCH v5] " sourab.gupta
2014-05-19 11:47                                     ` Ville Syrjälä
2014-05-19 12:29                                       ` Daniel Vetter
2014-05-19 13:06                                         ` Ville Syrjälä
2014-05-19 13:41                                           ` Daniel Vetter
2014-05-20 10:49                                             ` [PATCH 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
2014-05-20 10:49                                               ` [PATCH v6 1/3] drm/i915: Replaced " sourab.gupta
2014-05-20 11:59                                                 ` Chris Wilson
2014-05-20 18:01                                                   ` Gupta, Sourab
2014-05-22 14:36                                                     ` [PATCH v2 0/3] Replace " sourab.gupta
2014-05-22 14:36                                                       ` [PATCH v7 1/3] drm/i915: Replaced " sourab.gupta
2014-05-27 12:52                                                         ` Ville Syrjälä
2014-05-27 13:09                                                           ` Daniel Vetter
2014-05-28  7:12                                                             ` [PATCH v3 0/2] Replace " sourab.gupta
2014-05-28  7:12                                                               ` [PATCH 1/2] drm/i915: Replaced " sourab.gupta
2014-05-28  7:30                                                                 ` Chris Wilson
2014-05-28  9:42                                                                   ` Gupta, Sourab
2014-05-28  7:31                                                                 ` Chris Wilson
2014-05-28  8:12                                                                   ` Ville Syrjälä
2014-05-28  7:12                                                               ` [PATCH 2/2] drm/i915: Default to mmio flips on VLV sourab.gupta
2014-05-28  9:56                                                                 ` Chris Wilson
2014-05-29  9:40                                                                   ` [PATCH v4 0/3] Replace Blitter ring based flips with MMIO flips sourab.gupta
2014-05-29  9:40                                                                     ` [PATCH v9 1/3] drm/i915: Replaced " sourab.gupta
2014-05-30 10:31                                                                       ` Chris Wilson
2014-05-29  9:40                                                                     ` [PATCH 2/3] drm/i915: Selection of MMIO vs CS flip at page flip time sourab.gupta
2014-05-29  9:40                                                                     ` [PATCH 3/3] drm/i915: Make module param for MMIO flip selection as tristate sourab.gupta
2014-05-30 10:49                                                                       ` Chris Wilson
2014-06-01 11:13                                                                         ` [PATCH v10] drm/i915: Replaced Blitter ring based flips with MMIO flips sourab.gupta
2014-06-02  6:56                                                                           ` Chris Wilson
2014-06-02 10:38                                                                             ` Gupta, Sourab
2014-06-02 10:56                                                                               ` Chris Wilson
2014-06-02 11:17                                                                                 ` [PATCH v11] " sourab.gupta
2014-06-17 14:14                                                                                   ` Daniel Vetter
2014-06-17 14:17                                                                                     ` Chris Wilson
2014-05-22 14:36                                                       ` [PATCH 2/3] drm/i915: Default to mmio flips on VLV sourab.gupta
2014-05-22 14:36                                                       ` [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race sourab.gupta
2014-05-26  8:51                                                       ` [PATCH v2 0/3] Replace Blitter ring based flips with MMIO flips Gupta, Sourab
2014-05-20 10:49                                               ` [PATCH 2/3] drm/i915: Default to mmio flips on VLV sourab.gupta
2014-05-20 10:49                                               ` [PATCH 3/3] drm/i915: Fix mmio page flip vs mmio set base race sourab.gupta
2014-01-09 11:29 ` [PATCH 0/2] Using MMIO based flips on VLV Chris Wilson
