All of lore.kernel.org
 help / color / mirror / Atom feed
* [RESEND 0/9] Reviewed FBC patches
@ 2015-02-09 16:46 Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 1/9] drm/i915: don't try to find crtcs for FBC if it's disabled Paulo Zanoni
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

As requested by Daniel, here is the series with the FBC patches that were
already reviewed. Some of them were rebased after the review, but the changes
are minor.

Paulo Zanoni (9):
  drm/i915: don't try to find crtcs for FBC if it's disabled
  drm/i915: don't keep reassigning FBC_UNSUPPORTED
  drm/i915: change dev_priv->fbc.plane to dev_priv->fbc.crtc
  drm/i915: pass which operation triggered the frontbuffer tracking
  drm/i915: also do frontbuffer tracking on pwrites
  drm/i915: add frontbuffer tracking to FBC
  drm/i915: extract intel_fbc_find_crtc()
  drm/i915: HSW+ FBC is tied to pipe A
  drm/i915: gen5+ can have FBC with multiple pipes

 drivers/gpu/drm/i915/i915_drv.h            |  18 +--
 drivers/gpu/drm/i915/i915_gem.c            |  14 ++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +-
 drivers/gpu/drm/i915/intel_display.c       |   5 +-
 drivers/gpu/drm/i915/intel_drv.h           |  17 ++-
 drivers/gpu/drm/i915/intel_fbc.c           | 196 ++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_frontbuffer.c   |  29 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  41 +-----
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   1 -
 drivers/gpu/drm/i915/intel_sprite.c        |   2 +-
 10 files changed, 173 insertions(+), 152 deletions(-)

-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/9] drm/i915: don't try to find crtcs for FBC if it's disabled
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 2/9] drm/i915: don't keep reassigning FBC_UNSUPPORTED Paulo Zanoni
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

.. because it would be a waste of time, so move the place where the
check is done. Also, with this we won't risk printing "more than one
pipe active, disabling compression" or "no output, disabling" when FBC
is actually disabled.

This patch also represents a small behavior difference when using
i915.powersave=0: it is now exactly the same as i915.enable_fbc=0 on
this part of the code.

V2: Rebase.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_fbc.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index b572bb6e..c9a470f 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -507,10 +507,16 @@ void intel_fbc_update(struct drm_device *dev)
 		return;
 	}
 
-	if (!i915.powersave) {
+	if (i915.enable_fbc < 0) {
+		if (set_no_fbc_reason(dev_priv, FBC_CHIP_DEFAULT))
+			DRM_DEBUG_KMS("disabled per chip default\n");
+		goto out_disable;
+	}
+
+	if (!i915.enable_fbc || !i915.powersave) {
 		if (set_no_fbc_reason(dev_priv, FBC_MODULE_PARAM))
 			DRM_DEBUG_KMS("fbc disabled per module param\n");
-		return;
+		goto out_disable;
 	}
 
 	/*
@@ -545,16 +551,6 @@ void intel_fbc_update(struct drm_device *dev)
 	obj = intel_fb_obj(fb);
 	adjusted_mode = &intel_crtc->config->base.adjusted_mode;
 
-	if (i915.enable_fbc < 0) {
-		if (set_no_fbc_reason(dev_priv, FBC_CHIP_DEFAULT))
-			DRM_DEBUG_KMS("disabled per chip default\n");
-		goto out_disable;
-	}
-	if (!i915.enable_fbc) {
-		if (set_no_fbc_reason(dev_priv, FBC_MODULE_PARAM))
-			DRM_DEBUG_KMS("fbc disabled per module param\n");
-		goto out_disable;
-	}
 	if ((adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE) ||
 	    (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN)) {
 		if (set_no_fbc_reason(dev_priv, FBC_UNSUPPORTED_MODE))
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/9] drm/i915: don't keep reassigning FBC_UNSUPPORTED
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 1/9] drm/i915: don't try to find crtcs for FBC if it's disabled Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 3/9] drm/i915: change dev_priv->fbc.plane to dev_priv->fbc.crtc Paulo Zanoni
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

This may save a few picoseconds on !HAS_FBC platforms. And it also
satisfies my OCD.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_fbc.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index c9a470f..ce7774d 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -502,10 +502,8 @@ void intel_fbc_update(struct drm_device *dev)
 	const struct drm_display_mode *adjusted_mode;
 	unsigned int max_width, max_height;
 
-	if (!HAS_FBC(dev)) {
-		set_no_fbc_reason(dev_priv, FBC_UNSUPPORTED);
+	if (!HAS_FBC(dev))
 		return;
-	}
 
 	if (i915.enable_fbc < 0) {
 		if (set_no_fbc_reason(dev_priv, FBC_CHIP_DEFAULT))
@@ -670,6 +668,7 @@ void intel_fbc_init(struct drm_i915_private *dev_priv)
 {
 	if (!HAS_FBC(dev_priv)) {
 		dev_priv->fbc.enabled = false;
+		dev_priv->fbc.no_fbc_reason = FBC_UNSUPPORTED;
 		return;
 	}
 
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/9] drm/i915: change dev_priv->fbc.plane to dev_priv->fbc.crtc
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 1/9] drm/i915: don't try to find crtcs for FBC if it's disabled Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 2/9] drm/i915: don't keep reassigning FBC_UNSUPPORTED Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 4/9] drm/i915: pass which operation triggered the frontbuffer tracking Paulo Zanoni
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Since the mapping from CRTCs to planes is fixed, looking at the CRTC
is essentially the same as looking at the plane. Also, the next
patches wil start using the frontbuffer_bits macros, and they take the
pipe as the parameter instead of the plane, and this could differ on
gens 2 and 3.

Another nice thing is that we don't risk accidentally initializing
things to PLANE_A if we don't set the value before it is used for the
first time. But this shouldn't be a problem with the current code.

V2: Rebase.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      | 2 +-
 drivers/gpu/drm/i915/intel_display.c | 5 ++---
 drivers/gpu/drm/i915/intel_fbc.c     | 6 +++---
 drivers/gpu/drm/i915/intel_sprite.c  | 2 +-
 4 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 26ffe8b..c0b8644 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -775,7 +775,7 @@ struct i915_fbc {
 	unsigned long uncompressed_size;
 	unsigned threshold;
 	unsigned int fb_id;
-	enum plane plane;
+	struct intel_crtc *crtc;
 	int y;
 
 	struct drm_mm_node compressed_fb;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 3fe9598..2655b63 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -4277,11 +4277,10 @@ static void intel_crtc_disable_planes(struct drm_crtc *crtc)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	int pipe = intel_crtc->pipe;
-	int plane = intel_crtc->plane;
 
 	intel_crtc_wait_for_pending_flips(crtc);
 
-	if (dev_priv->fbc.plane == plane)
+	if (dev_priv->fbc.crtc == intel_crtc)
 		intel_fbc_disable(dev);
 
 	hsw_disable_ips(intel_crtc);
@@ -11932,7 +11931,7 @@ intel_check_primary_plane(struct drm_plane *plane,
 		 */
 		if (intel_crtc->primary_enabled &&
 		    INTEL_INFO(dev)->gen <= 4 && !IS_G4X(dev) &&
-		    dev_priv->fbc.plane == intel_crtc->plane &&
+		    dev_priv->fbc.crtc == intel_crtc &&
 		    state->base.rotation != BIT(DRM_ROTATE_0)) {
 			intel_crtc->atomic.disable_fbc = true;
 		}
diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index ce7774d..7341e87 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -369,7 +369,7 @@ static void intel_fbc_work_fn(struct work_struct *__work)
 		if (work->crtc->primary->fb == work->fb) {
 			dev_priv->display.enable_fbc(work->crtc);
 
-			dev_priv->fbc.plane = to_intel_crtc(work->crtc)->plane;
+			dev_priv->fbc.crtc = to_intel_crtc(work->crtc);
 			dev_priv->fbc.fb_id = work->crtc->primary->fb->base.id;
 			dev_priv->fbc.y = work->crtc->y;
 		}
@@ -460,7 +460,7 @@ void intel_fbc_disable(struct drm_device *dev)
 		return;
 
 	dev_priv->display.disable_fbc(dev);
-	dev_priv->fbc.plane = -1;
+	dev_priv->fbc.crtc = NULL;
 }
 
 static bool set_no_fbc_reason(struct drm_i915_private *dev_priv,
@@ -612,7 +612,7 @@ void intel_fbc_update(struct drm_device *dev)
 	 * cannot be unpinned (and have its GTT offset and fence revoked)
 	 * without first being decoupled from the scanout and FBC disabled.
 	 */
-	if (dev_priv->fbc.plane == intel_crtc->plane &&
+	if (dev_priv->fbc.crtc == intel_crtc &&
 	    dev_priv->fbc.fb_id == fb->base.id &&
 	    dev_priv->fbc.y == crtc->y)
 		return;
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 0a52c44..5afc89e 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -993,7 +993,7 @@ intel_pre_disable_primary(struct drm_crtc *crtc)
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
 	mutex_lock(&dev->struct_mutex);
-	if (dev_priv->fbc.plane == intel_crtc->plane)
+	if (dev_priv->fbc.crtc == intel_crtc)
 		intel_fbc_disable(dev);
 	mutex_unlock(&dev->struct_mutex);
 
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/9] drm/i915: pass which operation triggered the frontbuffer tracking
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
                   ` (2 preceding siblings ...)
  2015-02-09 16:46 ` [PATCH 3/9] drm/i915: change dev_priv->fbc.plane to dev_priv->fbc.crtc Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 18:29   ` Daniel Vetter
  2015-02-09 16:46 ` [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites Paulo Zanoni
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

We want to port FBC to the frontbuffer tracking infrastructure, but
for that we need to know what caused the object invalidation/flush so
we can react accordingly: CPU mmaps need manual, GTT mmaps and
flips don't need handling and ring rendering needs nukes.

v2: - s/ORIGIN_RENDER/ORIGIN_CS/ (Daniel, Rodrigo)
    - Fix copy/pasted wrong documentation
    - Rebase
v3: - Rebase

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v2)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h            |  7 +++++++
 drivers/gpu/drm/i915/i915_gem.c            | 10 +++++-----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/intel_drv.h           | 11 +++++++----
 drivers/gpu/drm/i915/intel_frontbuffer.c   | 15 ++++++++++-----
 5 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c0b8644..d578211 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -771,6 +771,13 @@ struct intel_context {
 	struct list_head link;
 };
 
+enum fb_op_origin {
+	ORIGIN_GTT,
+	ORIGIN_CPU,
+	ORIGIN_CS,
+	ORIGIN_FLIP,
+};
+
 struct i915_fbc {
 	unsigned long uncompressed_size;
 	unsigned threshold;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c26d36c..3d198f8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2310,7 +2310,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
 	}
 
-	intel_fb_obj_flush(obj, true);
+	intel_fb_obj_flush(obj, true, ORIGIN_CS);
 
 	list_del_init(&obj->ring_list);
 
@@ -3677,7 +3677,7 @@ i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj)
 	old_write_domain = obj->base.write_domain;
 	obj->base.write_domain = 0;
 
-	intel_fb_obj_flush(obj, false);
+	intel_fb_obj_flush(obj, false, ORIGIN_GTT);
 
 	trace_i915_gem_object_change_domain(obj,
 					    obj->base.read_domains,
@@ -3699,7 +3699,7 @@ i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj)
 	old_write_domain = obj->base.write_domain;
 	obj->base.write_domain = 0;
 
-	intel_fb_obj_flush(obj, false);
+	intel_fb_obj_flush(obj, false, ORIGIN_CPU);
 
 	trace_i915_gem_object_change_domain(obj,
 					    obj->base.read_domains,
@@ -3764,7 +3764,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	}
 
 	if (write)
-		intel_fb_obj_invalidate(obj, NULL);
+		intel_fb_obj_invalidate(obj, NULL, ORIGIN_GTT);
 
 	trace_i915_gem_object_change_domain(obj,
 					    old_read_domains,
@@ -4079,7 +4079,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 	}
 
 	if (write)
-		intel_fb_obj_invalidate(obj, NULL);
+		intel_fb_obj_invalidate(obj, NULL, ORIGIN_CPU);
 
 	trace_i915_gem_object_change_domain(obj,
 					    old_read_domains,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b773368..f4342f0 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -971,7 +971,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 			obj->dirty = 1;
 			i915_gem_request_assign(&obj->last_write_req, req);
 
-			intel_fb_obj_invalidate(obj, ring);
+			intel_fb_obj_invalidate(obj, ring, ORIGIN_CS);
 
 			/* update for the implicit flush after a batch */
 			obj->base.write_domain &= ~I915_GEM_GPU_DOMAINS;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 76b3c20..a73d091 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -853,13 +853,15 @@ void intel_ddi_set_vc_payload_alloc(struct drm_crtc *crtc, bool state);
 
 /* intel_frontbuffer.c */
 void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
-			     struct intel_engine_cs *ring);
+			     struct intel_engine_cs *ring,
+			     enum fb_op_origin origin);
 void intel_frontbuffer_flip_prepare(struct drm_device *dev,
 				    unsigned frontbuffer_bits);
 void intel_frontbuffer_flip_complete(struct drm_device *dev,
 				     unsigned frontbuffer_bits);
 void intel_frontbuffer_flush(struct drm_device *dev,
-			     unsigned frontbuffer_bits);
+			     unsigned frontbuffer_bits,
+			     enum fb_op_origin origin);
 /**
  * intel_frontbuffer_flip - synchronous frontbuffer flip
  * @dev: DRM device
@@ -875,12 +877,13 @@ static inline
 void intel_frontbuffer_flip(struct drm_device *dev,
 			    unsigned frontbuffer_bits)
 {
-	intel_frontbuffer_flush(dev, frontbuffer_bits);
+	intel_frontbuffer_flush(dev, frontbuffer_bits, ORIGIN_FLIP);
 }
 
 int intel_fb_align_height(struct drm_device *dev, int height,
 			  unsigned int tiling);
-void intel_fb_obj_flush(struct drm_i915_gem_object *obj, bool retire);
+void intel_fb_obj_flush(struct drm_i915_gem_object *obj, bool retire,
+			enum fb_op_origin origin);
 
 
 /* intel_audio.c */
diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
index 73cb6e0..0a773981 100644
--- a/drivers/gpu/drm/i915/intel_frontbuffer.c
+++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
@@ -127,6 +127,7 @@ static void intel_mark_fb_busy(struct drm_device *dev,
  * intel_fb_obj_invalidate - invalidate frontbuffer object
  * @obj: GEM object to invalidate
  * @ring: set for asynchronous rendering
+ * @origin: which operation caused the invalidation
  *
  * This function gets called every time rendering on the given object starts and
  * frontbuffer caching (fbc, low refresh rate for DRRS, panel self refresh) must
@@ -135,7 +136,8 @@ static void intel_mark_fb_busy(struct drm_device *dev,
  * scheduled.
  */
 void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
-			     struct intel_engine_cs *ring)
+			     struct intel_engine_cs *ring,
+			     enum fb_op_origin origin)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -164,6 +166,7 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
  * intel_frontbuffer_flush - flush frontbuffer
  * @dev: DRM device
  * @frontbuffer_bits: frontbuffer plane tracking bits
+ * @origin: which operation caused the flush
  *
  * This function gets called every time rendering on the given planes has
  * completed and frontbuffer caching can be started again. Flushes will get
@@ -172,7 +175,8 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
  * Can be called without any locks held.
  */
 void intel_frontbuffer_flush(struct drm_device *dev,
-			     unsigned frontbuffer_bits)
+			     unsigned frontbuffer_bits,
+			     enum fb_op_origin origin)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
@@ -201,13 +205,14 @@ void intel_frontbuffer_flush(struct drm_device *dev,
  * intel_fb_obj_flush - flush frontbuffer object
  * @obj: GEM object to flush
  * @retire: set when retiring asynchronous rendering
+ * @origin: which operation caused the flush
  *
  * This function gets called every time rendering on the given object has
  * completed and frontbuffer caching can be started again. If @retire is true
  * then any delayed flushes will be unblocked.
  */
 void intel_fb_obj_flush(struct drm_i915_gem_object *obj,
-			bool retire)
+			bool retire, enum fb_op_origin origin)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
@@ -229,7 +234,7 @@ void intel_fb_obj_flush(struct drm_i915_gem_object *obj,
 		mutex_unlock(&dev_priv->fb_tracking.lock);
 	}
 
-	intel_frontbuffer_flush(dev, frontbuffer_bits);
+	intel_frontbuffer_flush(dev, frontbuffer_bits, origin);
 }
 
 /**
@@ -277,5 +282,5 @@ void intel_frontbuffer_flip_complete(struct drm_device *dev,
 	dev_priv->fb_tracking.flip_bits &= ~frontbuffer_bits;
 	mutex_unlock(&dev_priv->fb_tracking.lock);
 
-	intel_frontbuffer_flush(dev, frontbuffer_bits);
+	intel_frontbuffer_flush(dev, frontbuffer_bits, ORIGIN_FLIP);
 }
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
                   ` (3 preceding siblings ...)
  2015-02-09 16:46 ` [PATCH 4/9] drm/i915: pass which operation triggered the frontbuffer tracking Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 18:41   ` Daniel Vetter
  2015-02-09 16:46 ` [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC Paulo Zanoni
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

We need this for FBC, and possibly for PSR too.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3d198f8..15910fa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1111,6 +1111,10 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 			ret = i915_gem_phys_pwrite(obj, args, file);
 		else
 			ret = i915_gem_shmem_pwrite(dev, obj, args, file);
+
+		intel_fb_obj_flush(obj, false, ORIGIN_CPU);
+	} else {
+		intel_fb_obj_flush(obj, false, ORIGIN_GTT);
 	}
 
 out:
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
                   ` (4 preceding siblings ...)
  2015-02-09 16:46 ` [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 19:05   ` Daniel Vetter
  2015-02-09 16:46 ` [PATCH 7/9] drm/i915: extract intel_fbc_find_crtc() Paulo Zanoni
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni, Rodrigo Vivi

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Kill the blt/render tracking we currently have and use the frontbuffer
tracking infrastructure.

Don't enable things by default yet.

v2: (Rodrigo) Fix small conflict on rebase and typo at subject.
v3: (Paulo) Rebase on RENDER_CS change.
v4: (Paulo) Rebase.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |   9 +--
 drivers/gpu/drm/i915/intel_drv.h         |   6 +-
 drivers/gpu/drm/i915/intel_fbc.c         | 107 ++++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_frontbuffer.c |  14 +---
 drivers/gpu/drm/i915/intel_ringbuffer.c  |  41 +-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   1 -
 6 files changed, 80 insertions(+), 98 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d578211..dc002df 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -782,6 +782,7 @@ struct i915_fbc {
 	unsigned long uncompressed_size;
 	unsigned threshold;
 	unsigned int fb_id;
+	unsigned int possible_framebuffer_bits;
 	struct intel_crtc *crtc;
 	int y;
 
@@ -794,14 +795,6 @@ struct i915_fbc {
 	 * possible. */
 	bool enabled;
 
-	/* On gen8 some rings cannont perform fbc clean operation so for now
-	 * we are doing this on SW with mmio.
-	 * This variable works in the opposite information direction
-	 * of ring->fbc_dirty telling software on frontbuffer tracking
-	 * to perform the cache clean on sw side.
-	 */
-	bool need_sw_cache_clean;
-
 	struct intel_fbc_work {
 		struct delayed_work work;
 		struct drm_crtc *crtc;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index a73d091..ed85df8 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1101,7 +1101,11 @@ bool intel_fbc_enabled(struct drm_device *dev);
 void intel_fbc_update(struct drm_device *dev);
 void intel_fbc_init(struct drm_i915_private *dev_priv);
 void intel_fbc_disable(struct drm_device *dev);
-void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
+void intel_fbc_invalidate(struct drm_i915_private *dev_priv,
+			  unsigned int frontbuffer_bits,
+			  enum fb_op_origin origin);
+void intel_fbc_flush(struct drm_i915_private *dev_priv,
+		     unsigned int frontbuffer_bits, enum fb_op_origin origin);
 
 /* intel_hdmi.c */
 void intel_hdmi_init(struct drm_device *dev, int hdmi_reg, enum port port);
diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index 7341e87..26c4f9c 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -174,29 +174,10 @@ static bool g4x_fbc_enabled(struct drm_device *dev)
 	return I915_READ(DPFC_CONTROL) & DPFC_CTL_EN;
 }
 
-static void snb_fbc_blit_update(struct drm_device *dev)
+static void intel_fbc_nuke(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 blt_ecoskpd;
-
-	/* Make sure blitter notifies FBC of writes */
-
-	/* Blitter is part of Media powerwell on VLV. No impact of
-	 * his param in other platforms for now */
-	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_MEDIA);
-
-	blt_ecoskpd = I915_READ(GEN6_BLITTER_ECOSKPD);
-	blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY <<
-		GEN6_BLITTER_LOCK_SHIFT;
-	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
-	blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY;
-	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
-	blt_ecoskpd &= ~(GEN6_BLITTER_FBC_NOTIFY <<
-			 GEN6_BLITTER_LOCK_SHIFT);
-	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
-	POSTING_READ(GEN6_BLITTER_ECOSKPD);
-
-	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_MEDIA);
+	I915_WRITE(MSG_FBC_REND_STATE, FBC_REND_NUKE);
+	POSTING_READ(MSG_FBC_REND_STATE);
 }
 
 static void ilk_fbc_enable(struct drm_crtc *crtc)
@@ -239,9 +220,10 @@ static void ilk_fbc_enable(struct drm_crtc *crtc)
 		I915_WRITE(SNB_DPFC_CTL_SA,
 			   SNB_CPU_FENCE_ENABLE | obj->fence_reg);
 		I915_WRITE(DPFC_CPU_FENCE_OFFSET, crtc->y);
-		snb_fbc_blit_update(dev);
 	}
 
+	intel_fbc_nuke(dev_priv);
+
 	DRM_DEBUG_KMS("enabled fbc on plane %c\n", plane_name(intel_crtc->plane));
 }
 
@@ -320,7 +302,7 @@ static void gen7_fbc_enable(struct drm_crtc *crtc)
 		   SNB_CPU_FENCE_ENABLE | obj->fence_reg);
 	I915_WRITE(DPFC_CPU_FENCE_OFFSET, crtc->y);
 
-	snb_fbc_blit_update(dev);
+	intel_fbc_nuke(dev_priv);
 
 	DRM_DEBUG_KMS("enabled fbc on plane %c\n", plane_name(intel_crtc->plane));
 }
@@ -340,19 +322,6 @@ bool intel_fbc_enabled(struct drm_device *dev)
 	return dev_priv->fbc.enabled;
 }
 
-void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
-{
-	struct drm_i915_private *dev_priv = dev->dev_private;
-
-	if (!IS_GEN8(dev))
-		return;
-
-	if (!intel_fbc_enabled(dev))
-		return;
-
-	I915_WRITE(MSG_FBC_REND_STATE, value);
-}
-
 static void intel_fbc_work_fn(struct work_struct *__work)
 {
 	struct intel_fbc_work *work =
@@ -658,6 +627,63 @@ out_disable:
 	i915_gem_stolen_cleanup_compression(dev);
 }
 
+void intel_fbc_invalidate(struct drm_i915_private *dev_priv,
+			  unsigned int frontbuffer_bits,
+			  enum fb_op_origin origin)
+{
+	struct drm_device *dev = dev_priv->dev;
+	unsigned int fbc_bits;
+
+	if (origin == ORIGIN_GTT)
+		return;
+
+	if (dev_priv->fbc.enabled)
+		fbc_bits = INTEL_FRONTBUFFER_PRIMARY(dev_priv->fbc.crtc->pipe);
+	else if (dev_priv->fbc.fbc_work)
+		fbc_bits = INTEL_FRONTBUFFER_PRIMARY(
+			to_intel_crtc(dev_priv->fbc.fbc_work->crtc)->pipe);
+	else
+		fbc_bits = dev_priv->fbc.possible_framebuffer_bits;
+
+	if ((fbc_bits & frontbuffer_bits) == 0)
+		return;
+
+	intel_fbc_disable(dev);
+}
+
+void intel_fbc_flush(struct drm_i915_private *dev_priv,
+		     unsigned int frontbuffer_bits,
+		     enum fb_op_origin origin)
+{
+	struct drm_device *dev = dev_priv->dev;
+	unsigned int fbc_bits = 0;
+	bool fbc_enabled;
+
+	if (dev_priv->fbc.enabled)
+		fbc_bits = INTEL_FRONTBUFFER_PRIMARY(dev_priv->fbc.crtc->pipe);
+	else
+		fbc_bits = dev_priv->fbc.possible_framebuffer_bits;
+
+	/* These are not the planes you are looking for. */
+	if ((frontbuffer_bits & fbc_bits) == 0)
+		return;
+
+	fbc_enabled = intel_fbc_enabled(dev);
+
+	if ((dev_priv->fb_tracking.busy_bits & frontbuffer_bits &
+	     fbc_bits) == 0) {
+		if (fbc_enabled) {
+			if (origin == ORIGIN_CS || origin == ORIGIN_CPU)
+				intel_fbc_nuke(dev_priv);
+		} else {
+			intel_fbc_update(dev);
+		}
+	} else {
+		if (fbc_enabled)
+			intel_fbc_disable(dev);
+	}
+}
+
 /**
  * intel_fbc_init - Initialize FBC
  * @dev_priv: the i915 device
@@ -666,12 +692,19 @@ out_disable:
  */
 void intel_fbc_init(struct drm_i915_private *dev_priv)
 {
+	enum pipe pipe;
+
 	if (!HAS_FBC(dev_priv)) {
 		dev_priv->fbc.enabled = false;
 		dev_priv->fbc.no_fbc_reason = FBC_UNSUPPORTED;
 		return;
 	}
 
+	/* TODO: some platforms have FBC tied to a specific plane! */
+	for_each_pipe(dev_priv, pipe)
+		dev_priv->fbc.possible_framebuffer_bits |=
+				INTEL_FRONTBUFFER_PRIMARY(pipe);
+
 	if (INTEL_INFO(dev_priv)->gen >= 7) {
 		dev_priv->display.fbc_enabled = ilk_fbc_enabled;
 		dev_priv->display.enable_fbc = gen7_fbc_enable;
diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
index 0a773981..2fc059c 100644
--- a/drivers/gpu/drm/i915/intel_frontbuffer.c
+++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
@@ -118,8 +118,6 @@ static void intel_mark_fb_busy(struct drm_device *dev,
 			continue;
 
 		intel_increase_pllclock(dev, pipe);
-		if (ring && intel_fbc_enabled(dev))
-			ring->fbc_dirty = true;
 	}
 }
 
@@ -160,6 +158,7 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
 
 	intel_psr_invalidate(dev, obj->frontbuffer_bits);
 	intel_edp_drrs_invalidate(dev, obj->frontbuffer_bits);
+	intel_fbc_invalidate(dev_priv, obj->frontbuffer_bits, origin);
 }
 
 /**
@@ -189,16 +188,7 @@ void intel_frontbuffer_flush(struct drm_device *dev,
 
 	intel_edp_drrs_flush(dev, frontbuffer_bits);
 	intel_psr_flush(dev, frontbuffer_bits);
-
-	/*
-	 * FIXME: Unconditional fbc flushing here is a rather gross hack and
-	 * needs to be reworked into a proper frontbuffer tracking scheme like
-	 * psr employs.
-	 */
-	if (dev_priv->fbc.need_sw_cache_clean) {
-		dev_priv->fbc.need_sw_cache_clean = false;
-		bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
-	}
+	intel_fbc_flush(dev_priv, frontbuffer_bits, origin);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 573b80f..48dd8a2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -317,29 +317,6 @@ gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring)
 	return 0;
 }
 
-static int gen7_ring_fbc_flush(struct intel_engine_cs *ring, u32 value)
-{
-	int ret;
-
-	if (!ring->fbc_dirty)
-		return 0;
-
-	ret = intel_ring_begin(ring, 6);
-	if (ret)
-		return ret;
-	/* WaFbcNukeOn3DBlt:ivb/hsw */
-	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
-	intel_ring_emit(ring, MSG_FBC_REND_STATE);
-	intel_ring_emit(ring, value);
-	intel_ring_emit(ring, MI_STORE_REGISTER_MEM(1) | MI_SRM_LRM_GLOBAL_GTT);
-	intel_ring_emit(ring, MSG_FBC_REND_STATE);
-	intel_ring_emit(ring, ring->scratch.gtt_offset + 256);
-	intel_ring_advance(ring);
-
-	ring->fbc_dirty = false;
-	return 0;
-}
-
 static int
 gen7_render_ring_flush(struct intel_engine_cs *ring,
 		       u32 invalidate_domains, u32 flush_domains)
@@ -398,9 +375,6 @@ gen7_render_ring_flush(struct intel_engine_cs *ring,
 	intel_ring_emit(ring, 0);
 	intel_ring_advance(ring);
 
-	if (!invalidate_domains && flush_domains)
-		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
-
 	return 0;
 }
 
@@ -462,9 +436,6 @@ gen8_render_ring_flush(struct intel_engine_cs *ring,
 	if (ret)
 		return ret;
 
-	if (!invalidate_domains && flush_domains)
-		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
-
 	return 0;
 }
 
@@ -2382,7 +2353,6 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 			   u32 invalidate, u32 flush)
 {
 	struct drm_device *dev = ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t cmd;
 	int ret;
 
@@ -2391,7 +2361,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 		return ret;
 
 	cmd = MI_FLUSH_DW;
-	if (INTEL_INFO(ring->dev)->gen >= 8)
+	if (INTEL_INFO(dev)->gen >= 8)
 		cmd += 1;
 	/*
 	 * Bspec vol 1c.3 - blitter engine command streamer:
@@ -2404,7 +2374,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 			MI_FLUSH_DW_OP_STOREDW;
 	intel_ring_emit(ring, cmd);
 	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
-	if (INTEL_INFO(ring->dev)->gen >= 8) {
+	if (INTEL_INFO(dev)->gen >= 8) {
 		intel_ring_emit(ring, 0); /* upper addr */
 		intel_ring_emit(ring, 0); /* value */
 	} else  {
@@ -2413,13 +2383,6 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
 	}
 	intel_ring_advance(ring);
 
-	if (!invalidate && flush) {
-		if (IS_GEN7(dev))
-			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
-		else if (IS_BROADWELL(dev))
-			dev_priv->fbc.need_sw_cache_clean = true;
-	}
-
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 714f3fd..f8dc0c6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -267,7 +267,6 @@ struct  intel_engine_cs {
 	 */
 	struct drm_i915_gem_request *outstanding_lazy_request;
 	bool gpu_caches_dirty;
-	bool fbc_dirty;
 
 	wait_queue_head_t irq_queue;
 
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 7/9] drm/i915: extract intel_fbc_find_crtc()
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
                   ` (5 preceding siblings ...)
  2015-02-09 16:46 ` [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 8/9] drm/i915: HSW+ FBC is tied to pipe A Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 9/9] drm/i915: gen5+ can have FBC with multiple pipes Paulo Zanoni
  8 siblings, 0 replies; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

I want to make this code a little more complicated, so let's extract
the function first.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_fbc.c | 46 +++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index 26c4f9c..7829c21 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -442,6 +442,32 @@ static bool set_no_fbc_reason(struct drm_i915_private *dev_priv,
 	return true;
 }
 
+static struct drm_crtc *intel_fbc_find_crtc(struct drm_i915_private *dev_priv)
+{
+	struct drm_device *dev = dev_priv->dev;
+	struct drm_crtc *crtc = NULL, *tmp_crtc;
+
+	for_each_crtc(dev, tmp_crtc) {
+		if (intel_crtc_active(tmp_crtc) &&
+		    to_intel_crtc(tmp_crtc)->primary_enabled) {
+			if (crtc) {
+				if (set_no_fbc_reason(dev_priv, FBC_MULTIPLE_PIPES))
+					DRM_DEBUG_KMS("more than one pipe active, disabling compression\n");
+				return NULL;
+			}
+			crtc = tmp_crtc;
+		}
+	}
+
+	if (!crtc || crtc->primary->fb == NULL) {
+		if (set_no_fbc_reason(dev_priv, FBC_NO_OUTPUT))
+			DRM_DEBUG_KMS("no output, disabling\n");
+		return NULL;
+	}
+
+	return crtc;
+}
+
 /**
  * intel_fbc_update - enable/disable FBC as needed
  * @dev: the drm_device
@@ -464,7 +490,7 @@ static bool set_no_fbc_reason(struct drm_i915_private *dev_priv,
 void intel_fbc_update(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_crtc *crtc = NULL, *tmp_crtc;
+	struct drm_crtc *crtc = NULL;
 	struct intel_crtc *intel_crtc;
 	struct drm_framebuffer *fb;
 	struct drm_i915_gem_object *obj;
@@ -495,23 +521,9 @@ void intel_fbc_update(struct drm_device *dev)
 	 *   - new fb is too large to fit in compressed buffer
 	 *   - going to an unsupported config (interlace, pixel multiply, etc.)
 	 */
-	for_each_crtc(dev, tmp_crtc) {
-		if (intel_crtc_active(tmp_crtc) &&
-		    to_intel_crtc(tmp_crtc)->primary_enabled) {
-			if (crtc) {
-				if (set_no_fbc_reason(dev_priv, FBC_MULTIPLE_PIPES))
-					DRM_DEBUG_KMS("more than one pipe active, disabling compression\n");
-				goto out_disable;
-			}
-			crtc = tmp_crtc;
-		}
-	}
-
-	if (!crtc || crtc->primary->fb == NULL) {
-		if (set_no_fbc_reason(dev_priv, FBC_NO_OUTPUT))
-			DRM_DEBUG_KMS("no output, disabling\n");
+	crtc = intel_fbc_find_crtc(dev_priv);
+	if (!crtc)
 		goto out_disable;
-	}
 
 	intel_crtc = to_intel_crtc(crtc);
 	fb = crtc->primary->fb;
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 8/9] drm/i915: HSW+ FBC is tied to pipe A
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
                   ` (6 preceding siblings ...)
  2015-02-09 16:46 ` [PATCH 7/9] drm/i915: extract intel_fbc_find_crtc() Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-09 16:46 ` [PATCH 9/9] drm/i915: gen5+ can have FBC with multiple pipes Paulo Zanoni
  8 siblings, 0 replies; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

So add code to consider this case.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_fbc.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index 7829c21..0d1c75d 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -444,10 +444,16 @@ static bool set_no_fbc_reason(struct drm_i915_private *dev_priv,
 
 static struct drm_crtc *intel_fbc_find_crtc(struct drm_i915_private *dev_priv)
 {
-	struct drm_device *dev = dev_priv->dev;
 	struct drm_crtc *crtc = NULL, *tmp_crtc;
+	enum pipe pipe;
+	bool pipe_a_only = false;
+
+	if (IS_HASWELL(dev_priv) || INTEL_INFO(dev_priv)->gen >= 8)
+		pipe_a_only = true;
+
+	for_each_pipe(dev_priv, pipe) {
+		tmp_crtc = dev_priv->pipe_to_crtc_mapping[pipe];
 
-	for_each_crtc(dev, tmp_crtc) {
 		if (intel_crtc_active(tmp_crtc) &&
 		    to_intel_crtc(tmp_crtc)->primary_enabled) {
 			if (crtc) {
@@ -457,6 +463,9 @@ static struct drm_crtc *intel_fbc_find_crtc(struct drm_i915_private *dev_priv)
 			}
 			crtc = tmp_crtc;
 		}
+
+		if (pipe_a_only)
+			break;
 	}
 
 	if (!crtc || crtc->primary->fb == NULL) {
@@ -712,11 +721,14 @@ void intel_fbc_init(struct drm_i915_private *dev_priv)
 		return;
 	}
 
-	/* TODO: some platforms have FBC tied to a specific plane! */
-	for_each_pipe(dev_priv, pipe)
+	for_each_pipe(dev_priv, pipe) {
 		dev_priv->fbc.possible_framebuffer_bits |=
 				INTEL_FRONTBUFFER_PRIMARY(pipe);
 
+		if (IS_HASWELL(dev_priv) || INTEL_INFO(dev_priv)->gen >= 8)
+			break;
+	}
+
 	if (INTEL_INFO(dev_priv)->gen >= 7) {
 		dev_priv->display.fbc_enabled = ilk_fbc_enabled;
 		dev_priv->display.enable_fbc = gen7_fbc_enable;
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 9/9] drm/i915: gen5+ can have FBC with multiple pipes
  2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
                   ` (7 preceding siblings ...)
  2015-02-09 16:46 ` [PATCH 8/9] drm/i915: HSW+ FBC is tied to pipe A Paulo Zanoni
@ 2015-02-09 16:46 ` Paulo Zanoni
  2015-02-10 11:20   ` shuang.he
  8 siblings, 1 reply; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-09 16:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

So allow it.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_fbc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index 0d1c75d..6c84fb2 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -446,17 +446,19 @@ static struct drm_crtc *intel_fbc_find_crtc(struct drm_i915_private *dev_priv)
 {
 	struct drm_crtc *crtc = NULL, *tmp_crtc;
 	enum pipe pipe;
-	bool pipe_a_only = false;
+	bool pipe_a_only = false, one_pipe_only = false;
 
 	if (IS_HASWELL(dev_priv) || INTEL_INFO(dev_priv)->gen >= 8)
 		pipe_a_only = true;
+	else if (INTEL_INFO(dev_priv)->gen <= 4)
+		one_pipe_only = true;
 
 	for_each_pipe(dev_priv, pipe) {
 		tmp_crtc = dev_priv->pipe_to_crtc_mapping[pipe];
 
 		if (intel_crtc_active(tmp_crtc) &&
 		    to_intel_crtc(tmp_crtc)->primary_enabled) {
-			if (crtc) {
+			if (one_pipe_only && crtc) {
 				if (set_no_fbc_reason(dev_priv, FBC_MULTIPLE_PIPES))
 					DRM_DEBUG_KMS("more than one pipe active, disabling compression\n");
 				return NULL;
-- 
2.1.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/9] drm/i915: pass which operation triggered the frontbuffer tracking
  2015-02-09 16:46 ` [PATCH 4/9] drm/i915: pass which operation triggered the frontbuffer tracking Paulo Zanoni
@ 2015-02-09 18:29   ` Daniel Vetter
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2015-02-09 18:29 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Mon, Feb 09, 2015 at 02:46:30PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> We want to port FBC to the frontbuffer tracking infrastructure, but
> for that we need to know what caused the object invalidation/flush so
> we can react accordingly: CPU mmaps need manual, GTT mmaps and
> flips don't need handling and ring rendering needs nukes.
> 
> v2: - s/ORIGIN_RENDER/ORIGIN_CS/ (Daniel, Rodrigo)
>     - Fix copy/pasted wrong documentation
>     - Rebase
> v3: - Rebase
> 
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v2)
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

Sorry for being really late with my review here. I fully agree that origin
for invalidate makes a lot of sense, but for flush it kinda doesn't.
There's no origin for a flush, it just means that any previous rendering
(from _any_ origin) has now completed.

I've looked ahead a bit and I'll comment on the patch that uses the flush
origin what should imo be used instead.

Merged the first 3 patches from this series meanwhile, thanks.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  7 +++++++
>  drivers/gpu/drm/i915/i915_gem.c            | 10 +++++-----
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
>  drivers/gpu/drm/i915/intel_drv.h           | 11 +++++++----
>  drivers/gpu/drm/i915/intel_frontbuffer.c   | 15 ++++++++++-----
>  5 files changed, 30 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c0b8644..d578211 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -771,6 +771,13 @@ struct intel_context {
>  	struct list_head link;
>  };
>  
> +enum fb_op_origin {
> +	ORIGIN_GTT,
> +	ORIGIN_CPU,
> +	ORIGIN_CS,
> +	ORIGIN_FLIP,
> +};
> +
>  struct i915_fbc {
>  	unsigned long uncompressed_size;
>  	unsigned threshold;
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c26d36c..3d198f8 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2310,7 +2310,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  			list_move_tail(&vma->mm_list, &vma->vm->inactive_list);
>  	}
>  
> -	intel_fb_obj_flush(obj, true);
> +	intel_fb_obj_flush(obj, true, ORIGIN_CS);
>  
>  	list_del_init(&obj->ring_list);
>  
> @@ -3677,7 +3677,7 @@ i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj)
>  	old_write_domain = obj->base.write_domain;
>  	obj->base.write_domain = 0;
>  
> -	intel_fb_obj_flush(obj, false);
> +	intel_fb_obj_flush(obj, false, ORIGIN_GTT);
>  
>  	trace_i915_gem_object_change_domain(obj,
>  					    obj->base.read_domains,
> @@ -3699,7 +3699,7 @@ i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj)
>  	old_write_domain = obj->base.write_domain;
>  	obj->base.write_domain = 0;
>  
> -	intel_fb_obj_flush(obj, false);
> +	intel_fb_obj_flush(obj, false, ORIGIN_CPU);
>  
>  	trace_i915_gem_object_change_domain(obj,
>  					    obj->base.read_domains,
> @@ -3764,7 +3764,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  	}
>  
>  	if (write)
> -		intel_fb_obj_invalidate(obj, NULL);
> +		intel_fb_obj_invalidate(obj, NULL, ORIGIN_GTT);
>  
>  	trace_i915_gem_object_change_domain(obj,
>  					    old_read_domains,
> @@ -4079,7 +4079,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
>  	}
>  
>  	if (write)
> -		intel_fb_obj_invalidate(obj, NULL);
> +		intel_fb_obj_invalidate(obj, NULL, ORIGIN_CPU);
>  
>  	trace_i915_gem_object_change_domain(obj,
>  					    old_read_domains,
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index b773368..f4342f0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -971,7 +971,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  			obj->dirty = 1;
>  			i915_gem_request_assign(&obj->last_write_req, req);
>  
> -			intel_fb_obj_invalidate(obj, ring);
> +			intel_fb_obj_invalidate(obj, ring, ORIGIN_CS);
>  
>  			/* update for the implicit flush after a batch */
>  			obj->base.write_domain &= ~I915_GEM_GPU_DOMAINS;
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 76b3c20..a73d091 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -853,13 +853,15 @@ void intel_ddi_set_vc_payload_alloc(struct drm_crtc *crtc, bool state);
>  
>  /* intel_frontbuffer.c */
>  void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
> -			     struct intel_engine_cs *ring);
> +			     struct intel_engine_cs *ring,
> +			     enum fb_op_origin origin);
>  void intel_frontbuffer_flip_prepare(struct drm_device *dev,
>  				    unsigned frontbuffer_bits);
>  void intel_frontbuffer_flip_complete(struct drm_device *dev,
>  				     unsigned frontbuffer_bits);
>  void intel_frontbuffer_flush(struct drm_device *dev,
> -			     unsigned frontbuffer_bits);
> +			     unsigned frontbuffer_bits,
> +			     enum fb_op_origin origin);
>  /**
>   * intel_frontbuffer_flip - synchronous frontbuffer flip
>   * @dev: DRM device
> @@ -875,12 +877,13 @@ static inline
>  void intel_frontbuffer_flip(struct drm_device *dev,
>  			    unsigned frontbuffer_bits)
>  {
> -	intel_frontbuffer_flush(dev, frontbuffer_bits);
> +	intel_frontbuffer_flush(dev, frontbuffer_bits, ORIGIN_FLIP);
>  }
>  
>  int intel_fb_align_height(struct drm_device *dev, int height,
>  			  unsigned int tiling);
> -void intel_fb_obj_flush(struct drm_i915_gem_object *obj, bool retire);
> +void intel_fb_obj_flush(struct drm_i915_gem_object *obj, bool retire,
> +			enum fb_op_origin origin);
>  
>  
>  /* intel_audio.c */
> diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
> index 73cb6e0..0a773981 100644
> --- a/drivers/gpu/drm/i915/intel_frontbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
> @@ -127,6 +127,7 @@ static void intel_mark_fb_busy(struct drm_device *dev,
>   * intel_fb_obj_invalidate - invalidate frontbuffer object
>   * @obj: GEM object to invalidate
>   * @ring: set for asynchronous rendering
> + * @origin: which operation caused the invalidation
>   *
>   * This function gets called every time rendering on the given object starts and
>   * frontbuffer caching (fbc, low refresh rate for DRRS, panel self refresh) must
> @@ -135,7 +136,8 @@ static void intel_mark_fb_busy(struct drm_device *dev,
>   * scheduled.
>   */
>  void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
> -			     struct intel_engine_cs *ring)
> +			     struct intel_engine_cs *ring,
> +			     enum fb_op_origin origin)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -164,6 +166,7 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
>   * intel_frontbuffer_flush - flush frontbuffer
>   * @dev: DRM device
>   * @frontbuffer_bits: frontbuffer plane tracking bits
> + * @origin: which operation caused the flush
>   *
>   * This function gets called every time rendering on the given planes has
>   * completed and frontbuffer caching can be started again. Flushes will get
> @@ -172,7 +175,8 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
>   * Can be called without any locks held.
>   */
>  void intel_frontbuffer_flush(struct drm_device *dev,
> -			     unsigned frontbuffer_bits)
> +			     unsigned frontbuffer_bits,
> +			     enum fb_op_origin origin)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> @@ -201,13 +205,14 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>   * intel_fb_obj_flush - flush frontbuffer object
>   * @obj: GEM object to flush
>   * @retire: set when retiring asynchronous rendering
> + * @origin: which operation caused the flush
>   *
>   * This function gets called every time rendering on the given object has
>   * completed and frontbuffer caching can be started again. If @retire is true
>   * then any delayed flushes will be unblocked.
>   */
>  void intel_fb_obj_flush(struct drm_i915_gem_object *obj,
> -			bool retire)
> +			bool retire, enum fb_op_origin origin)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> @@ -229,7 +234,7 @@ void intel_fb_obj_flush(struct drm_i915_gem_object *obj,
>  		mutex_unlock(&dev_priv->fb_tracking.lock);
>  	}
>  
> -	intel_frontbuffer_flush(dev, frontbuffer_bits);
> +	intel_frontbuffer_flush(dev, frontbuffer_bits, origin);
>  }
>  
>  /**
> @@ -277,5 +282,5 @@ void intel_frontbuffer_flip_complete(struct drm_device *dev,
>  	dev_priv->fb_tracking.flip_bits &= ~frontbuffer_bits;
>  	mutex_unlock(&dev_priv->fb_tracking.lock);
>  
> -	intel_frontbuffer_flush(dev, frontbuffer_bits);
> +	intel_frontbuffer_flush(dev, frontbuffer_bits, ORIGIN_FLIP);
>  }
> -- 
> 2.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites
  2015-02-09 16:46 ` [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites Paulo Zanoni
@ 2015-02-09 18:41   ` Daniel Vetter
  2015-02-11  8:30     ` Daniel Vetter
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2015-02-09 18:41 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Mon, Feb 09, 2015 at 02:46:31PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> We need this for FBC, and possibly for PSR too.
> 
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3d198f8..15910fa 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1111,6 +1111,10 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
>  			ret = i915_gem_phys_pwrite(obj, args, file);
>  		else
>  			ret = i915_gem_shmem_pwrite(dev, obj, args, file);
> +
> +		intel_fb_obj_flush(obj, false, ORIGIN_CPU);
> +	} else {
> +		intel_fb_obj_flush(obj, false, ORIGIN_GTT);

A flush alone does nothing. Well should, but you're kinda not quite using
it correctly in the next patch to convert fbc over to frontbuffer
tracking.

I guess the docs aren't perfect, so let me try again. There are two kinds
of events the frontbuffer tracking code supplies to tell its consumers
that screen changes are happening:
- invalidate/flush: Invalidate denotes the start of the frontbuffer
  rendering, from that point on psr/fbc/drrs must update the screen with
  the usual refresh rate and not cache anything anywhere. When the flush
  happens (which could easily be after a _very_ long time, e.g. fbcon)
  then only can caching recomence. Caching = enable fbc, allow psr or
  reduce refresh rate.
- flip events: That's an instantenous event (well at least for consumer,
  internally we need to track it as prepare/complete for async flips), and
  mostly just interesting when the hw doesn't notice flips (some psr modes
  and drrs).

So if you want to add frontbuffer tracking to pwrite then we need both an
invalidate (before the actual pwrite) and a flush (after the pwrite, like
you've added here).

The other issue is that there's a bug with the origin assignemnt:
phys_pwrite also goes through the gtt. I think it would be best if we push
the fb_obj_invalidate/flush into the relevant pwrite functions. That
should make it easier to review, since the invalidate/flush will be next
to the write op.
-Daniel

>  	}
>  
>  out:
> -- 
> 2.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC
  2015-02-09 16:46 ` [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC Paulo Zanoni
@ 2015-02-09 19:05   ` Daniel Vetter
  2015-02-10 12:19     ` Paulo Zanoni
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2015-02-09 19:05 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni, Rodrigo Vivi

On Mon, Feb 09, 2015 at 02:46:32PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> Kill the blt/render tracking we currently have and use the frontbuffer
> tracking infrastructure.
> 
> Don't enable things by default yet.
> 
> v2: (Rodrigo) Fix small conflict on rebase and typo at subject.
> v3: (Paulo) Rebase on RENDER_CS change.
> v4: (Paulo) Rebase.
> 
> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v2)
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h          |   9 +--
>  drivers/gpu/drm/i915/intel_drv.h         |   6 +-
>  drivers/gpu/drm/i915/intel_fbc.c         | 107 ++++++++++++++++++++-----------
>  drivers/gpu/drm/i915/intel_frontbuffer.c |  14 +---
>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  41 +-----------
>  drivers/gpu/drm/i915/intel_ringbuffer.h  |   1 -
>  6 files changed, 80 insertions(+), 98 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d578211..dc002df 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -782,6 +782,7 @@ struct i915_fbc {
>  	unsigned long uncompressed_size;
>  	unsigned threshold;
>  	unsigned int fb_id;
> +	unsigned int possible_framebuffer_bits;
>  	struct intel_crtc *crtc;
>  	int y;
>  
> @@ -794,14 +795,6 @@ struct i915_fbc {
>  	 * possible. */
>  	bool enabled;
>  
> -	/* On gen8 some rings cannont perform fbc clean operation so for now
> -	 * we are doing this on SW with mmio.
> -	 * This variable works in the opposite information direction
> -	 * of ring->fbc_dirty telling software on frontbuffer tracking
> -	 * to perform the cache clean on sw side.
> -	 */
> -	bool need_sw_cache_clean;
> -
>  	struct intel_fbc_work {
>  		struct delayed_work work;
>  		struct drm_crtc *crtc;
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index a73d091..ed85df8 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1101,7 +1101,11 @@ bool intel_fbc_enabled(struct drm_device *dev);
>  void intel_fbc_update(struct drm_device *dev);
>  void intel_fbc_init(struct drm_i915_private *dev_priv);
>  void intel_fbc_disable(struct drm_device *dev);
> -void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
> +void intel_fbc_invalidate(struct drm_i915_private *dev_priv,
> +			  unsigned int frontbuffer_bits,
> +			  enum fb_op_origin origin);
> +void intel_fbc_flush(struct drm_i915_private *dev_priv,
> +		     unsigned int frontbuffer_bits, enum fb_op_origin origin);
>  
>  /* intel_hdmi.c */
>  void intel_hdmi_init(struct drm_device *dev, int hdmi_reg, enum port port);
> diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
> index 7341e87..26c4f9c 100644
> --- a/drivers/gpu/drm/i915/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/intel_fbc.c
> @@ -174,29 +174,10 @@ static bool g4x_fbc_enabled(struct drm_device *dev)
>  	return I915_READ(DPFC_CONTROL) & DPFC_CTL_EN;
>  }
>  
> -static void snb_fbc_blit_update(struct drm_device *dev)
> +static void intel_fbc_nuke(struct drm_i915_private *dev_priv)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	u32 blt_ecoskpd;
> -
> -	/* Make sure blitter notifies FBC of writes */
> -
> -	/* Blitter is part of Media powerwell on VLV. No impact of
> -	 * his param in other platforms for now */
> -	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_MEDIA);
> -
> -	blt_ecoskpd = I915_READ(GEN6_BLITTER_ECOSKPD);
> -	blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY <<
> -		GEN6_BLITTER_LOCK_SHIFT;
> -	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
> -	blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY;
> -	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
> -	blt_ecoskpd &= ~(GEN6_BLITTER_FBC_NOTIFY <<
> -			 GEN6_BLITTER_LOCK_SHIFT);
> -	I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
> -	POSTING_READ(GEN6_BLITTER_ECOSKPD);
> -
> -	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_MEDIA);
> +	I915_WRITE(MSG_FBC_REND_STATE, FBC_REND_NUKE);
> +	POSTING_READ(MSG_FBC_REND_STATE);
>  }
>  
>  static void ilk_fbc_enable(struct drm_crtc *crtc)
> @@ -239,9 +220,10 @@ static void ilk_fbc_enable(struct drm_crtc *crtc)
>  		I915_WRITE(SNB_DPFC_CTL_SA,
>  			   SNB_CPU_FENCE_ENABLE | obj->fence_reg);
>  		I915_WRITE(DPFC_CPU_FENCE_OFFSET, crtc->y);
> -		snb_fbc_blit_update(dev);
>  	}
>  
> +	intel_fbc_nuke(dev_priv);
> +
>  	DRM_DEBUG_KMS("enabled fbc on plane %c\n", plane_name(intel_crtc->plane));
>  }
>  
> @@ -320,7 +302,7 @@ static void gen7_fbc_enable(struct drm_crtc *crtc)
>  		   SNB_CPU_FENCE_ENABLE | obj->fence_reg);
>  	I915_WRITE(DPFC_CPU_FENCE_OFFSET, crtc->y);
>  
> -	snb_fbc_blit_update(dev);
> +	intel_fbc_nuke(dev_priv);
>  
>  	DRM_DEBUG_KMS("enabled fbc on plane %c\n", plane_name(intel_crtc->plane));
>  }
> @@ -340,19 +322,6 @@ bool intel_fbc_enabled(struct drm_device *dev)
>  	return dev_priv->fbc.enabled;
>  }
>  
> -void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
> -{
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -
> -	if (!IS_GEN8(dev))
> -		return;
> -
> -	if (!intel_fbc_enabled(dev))
> -		return;
> -
> -	I915_WRITE(MSG_FBC_REND_STATE, value);
> -}
> -
>  static void intel_fbc_work_fn(struct work_struct *__work)
>  {
>  	struct intel_fbc_work *work =
> @@ -658,6 +627,63 @@ out_disable:
>  	i915_gem_stolen_cleanup_compression(dev);
>  }
>  
> +void intel_fbc_invalidate(struct drm_i915_private *dev_priv,
> +			  unsigned int frontbuffer_bits,
> +			  enum fb_op_origin origin)
> +{
> +	struct drm_device *dev = dev_priv->dev;
> +	unsigned int fbc_bits;
> +
> +	if (origin == ORIGIN_GTT)
> +		return;
> +
> +	if (dev_priv->fbc.enabled)
> +		fbc_bits = INTEL_FRONTBUFFER_PRIMARY(dev_priv->fbc.crtc->pipe);
> +	else if (dev_priv->fbc.fbc_work)
> +		fbc_bits = INTEL_FRONTBUFFER_PRIMARY(
> +			to_intel_crtc(dev_priv->fbc.fbc_work->crtc)->pipe);
> +	else
> +		fbc_bits = dev_priv->fbc.possible_framebuffer_bits;
> +
> +	if ((fbc_bits & frontbuffer_bits) == 0)
> +		return;

The fbc code should store all the currently invalidate fb_bits somewhere,
so that it knows when it can reenable fbc. Copypasting from the psr code:

dev_priv->fbc.busy_frontbuffer_bits |= fbc_bits & frontbuffer_bits;

Of course there should only ever be one bit (for the primary plane on the
crtc fbc is enabled on), so a boolean should be enough. But we could use
the additional bits for a bit of consistency checks.

> +
> +	intel_fbc_disable(dev);
> +}
> +
> +void intel_fbc_flush(struct drm_i915_private *dev_priv,
> +		     unsigned int frontbuffer_bits,
> +		     enum fb_op_origin origin)
> +{
> +	struct drm_device *dev = dev_priv->dev;
> +	unsigned int fbc_bits = 0;
> +	bool fbc_enabled;
> +
> +	if (dev_priv->fbc.enabled)
> +		fbc_bits = INTEL_FRONTBUFFER_PRIMARY(dev_priv->fbc.crtc->pipe);
> +	else
> +		fbc_bits = dev_priv->fbc.possible_framebuffer_bits;
> +
> +	/* These are not the planes you are looking for. */
> +	if ((frontbuffer_bits & fbc_bits) == 0)
> +		return;
> +
> +	fbc_enabled = intel_fbc_enabled(dev);
> +
> +	if ((dev_priv->fb_tracking.busy_bits & frontbuffer_bits &
> +	     fbc_bits) == 0) {
> +		if (fbc_enabled) {
> +			if (origin == ORIGIN_CS || origin == ORIGIN_CPU)
> +				intel_fbc_nuke(dev_priv);
> +		} else {
> +			intel_fbc_update(dev);
> +		}

Ok, here comes the gist of what needs to be changed.
- You need to check the fbc-private copy of busy bits, otherwise the
  filtering for the GTT origin won't work. We want to keep fbc enabled (or
  renable) even when gtt is still invalidated somehow.
- That means when you decide to reenable fbc (i.e.
  dev_priv->fbc.busy_frontbuffer_bits == 0) then it's guranteed that fbc
  is already disabled, and a plain intel_fbc_enable is good enough.
- nuke in flush is not enough and too late - the frontbuffer rendering
  could have been going on for a while since you've seen the invalidate,
  the flush only happens when everything is quiet again. A nuke at the end
  is hence too late and not enough.

This way you don't need the origin information in the flush helper.

One bit I'm not yet sure off is where to put intel_fbc_update to
reallocate buffers over pageflips. The complication since you've written
these patches is the atomic flips, which can now asynchronously update the
primary plane (and change the pixel format), but different from the
pageflip code the atomic paths don't disable/enable fbc over a flip.

Which means that doing the intel_fbc_update just somewhere in between the
disable and enable won't be enough to keep everything in sync.

I think what we ultimately need is a bit of logic in the atomic_check part
which decides whether fbc must be disabled/enabled (e.g. when we need to
realloc the cfb) around the update. And then we'll place explicit
intel_fbc_disable/enable calls in the atomic_begin/flush callbacks.

The other case is just updating the gtt fence data. At least on g4x+ we
should try not to disable fbc but keep it going over pageflips. Which
means there could be interesting races between the flip and gtt writes
right afterwards (which aren't synchronized in any way).

I see two options:
- We update the fence register whenever we want, but while a pageflip is
  pending we don't filter out gtt invalidates (since we kinda don't know
  whether the hw invalidate will hit before/after the flip happens). This
  one is a bit tricky, we'd need to add fbc callbacks to
  intel_frontbuffer_flip_prepare/complete.
- We do a manual nuke after each flip by adding calls to intel_fbc_flip to
  intel_frontbuffer_flip and flip_complete. This one has the downside that
  proper userspace which doesn't do gtt writes will get hurt by nuking the
  fbc twice: Once for the flip and once right afterwards.

Cheers, Daniel
> +	} else {
> +		if (fbc_enabled)
> +			intel_fbc_disable(dev);
> +	}
> +}
> +
>  /**
>   * intel_fbc_init - Initialize FBC
>   * @dev_priv: the i915 device
> @@ -666,12 +692,19 @@ out_disable:
>   */
>  void intel_fbc_init(struct drm_i915_private *dev_priv)
>  {
> +	enum pipe pipe;
> +
>  	if (!HAS_FBC(dev_priv)) {
>  		dev_priv->fbc.enabled = false;
>  		dev_priv->fbc.no_fbc_reason = FBC_UNSUPPORTED;
>  		return;
>  	}
>  
> +	/* TODO: some platforms have FBC tied to a specific plane! */
> +	for_each_pipe(dev_priv, pipe)
> +		dev_priv->fbc.possible_framebuffer_bits |=
> +				INTEL_FRONTBUFFER_PRIMARY(pipe);
> +
>  	if (INTEL_INFO(dev_priv)->gen >= 7) {
>  		dev_priv->display.fbc_enabled = ilk_fbc_enabled;
>  		dev_priv->display.enable_fbc = gen7_fbc_enable;
> diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
> index 0a773981..2fc059c 100644
> --- a/drivers/gpu/drm/i915/intel_frontbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
> @@ -118,8 +118,6 @@ static void intel_mark_fb_busy(struct drm_device *dev,
>  			continue;
>  
>  		intel_increase_pllclock(dev, pipe);
> -		if (ring && intel_fbc_enabled(dev))
> -			ring->fbc_dirty = true;
>  	}
>  }
>  
> @@ -160,6 +158,7 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
>  
>  	intel_psr_invalidate(dev, obj->frontbuffer_bits);
>  	intel_edp_drrs_invalidate(dev, obj->frontbuffer_bits);
> +	intel_fbc_invalidate(dev_priv, obj->frontbuffer_bits, origin);
>  }
>  
>  /**
> @@ -189,16 +188,7 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>  
>  	intel_edp_drrs_flush(dev, frontbuffer_bits);
>  	intel_psr_flush(dev, frontbuffer_bits);
> -
> -	/*
> -	 * FIXME: Unconditional fbc flushing here is a rather gross hack and
> -	 * needs to be reworked into a proper frontbuffer tracking scheme like
> -	 * psr employs.
> -	 */
> -	if (dev_priv->fbc.need_sw_cache_clean) {
> -		dev_priv->fbc.need_sw_cache_clean = false;
> -		bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
> -	}
> +	intel_fbc_flush(dev_priv, frontbuffer_bits, origin);
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 573b80f..48dd8a2 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -317,29 +317,6 @@ gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring)
>  	return 0;
>  }
>  
> -static int gen7_ring_fbc_flush(struct intel_engine_cs *ring, u32 value)
> -{
> -	int ret;
> -
> -	if (!ring->fbc_dirty)
> -		return 0;
> -
> -	ret = intel_ring_begin(ring, 6);
> -	if (ret)
> -		return ret;
> -	/* WaFbcNukeOn3DBlt:ivb/hsw */
> -	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> -	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> -	intel_ring_emit(ring, value);
> -	intel_ring_emit(ring, MI_STORE_REGISTER_MEM(1) | MI_SRM_LRM_GLOBAL_GTT);
> -	intel_ring_emit(ring, MSG_FBC_REND_STATE);
> -	intel_ring_emit(ring, ring->scratch.gtt_offset + 256);
> -	intel_ring_advance(ring);
> -
> -	ring->fbc_dirty = false;
> -	return 0;
> -}
> -
>  static int
>  gen7_render_ring_flush(struct intel_engine_cs *ring,
>  		       u32 invalidate_domains, u32 flush_domains)
> @@ -398,9 +375,6 @@ gen7_render_ring_flush(struct intel_engine_cs *ring,
>  	intel_ring_emit(ring, 0);
>  	intel_ring_advance(ring);
>  
> -	if (!invalidate_domains && flush_domains)
> -		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
> -
>  	return 0;
>  }
>  
> @@ -462,9 +436,6 @@ gen8_render_ring_flush(struct intel_engine_cs *ring,
>  	if (ret)
>  		return ret;
>  
> -	if (!invalidate_domains && flush_domains)
> -		return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
> -
>  	return 0;
>  }
>  
> @@ -2382,7 +2353,6 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>  			   u32 invalidate, u32 flush)
>  {
>  	struct drm_device *dev = ring->dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
>  	uint32_t cmd;
>  	int ret;
>  
> @@ -2391,7 +2361,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>  		return ret;
>  
>  	cmd = MI_FLUSH_DW;
> -	if (INTEL_INFO(ring->dev)->gen >= 8)
> +	if (INTEL_INFO(dev)->gen >= 8)
>  		cmd += 1;
>  	/*
>  	 * Bspec vol 1c.3 - blitter engine command streamer:
> @@ -2404,7 +2374,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>  			MI_FLUSH_DW_OP_STOREDW;
>  	intel_ring_emit(ring, cmd);
>  	intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
> -	if (INTEL_INFO(ring->dev)->gen >= 8) {
> +	if (INTEL_INFO(dev)->gen >= 8) {
>  		intel_ring_emit(ring, 0); /* upper addr */
>  		intel_ring_emit(ring, 0); /* value */
>  	} else  {
> @@ -2413,13 +2383,6 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>  	}
>  	intel_ring_advance(ring);
>  
> -	if (!invalidate && flush) {
> -		if (IS_GEN7(dev))
> -			return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
> -		else if (IS_BROADWELL(dev))
> -			dev_priv->fbc.need_sw_cache_clean = true;
> -	}
> -
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 714f3fd..f8dc0c6 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -267,7 +267,6 @@ struct  intel_engine_cs {
>  	 */
>  	struct drm_i915_gem_request *outstanding_lazy_request;
>  	bool gpu_caches_dirty;
> -	bool fbc_dirty;
>  
>  	wait_queue_head_t irq_queue;
>  
> -- 
> 2.1.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 9/9] drm/i915: gen5+ can have FBC with multiple pipes
  2015-02-09 16:46 ` [PATCH 9/9] drm/i915: gen5+ can have FBC with multiple pipes Paulo Zanoni
@ 2015-02-10 11:20   ` shuang.he
  0 siblings, 0 replies; 19+ messages in thread
From: shuang.he @ 2015-02-10 11:20 UTC (permalink / raw)
  To: shuang.he, ethan.gao, intel-gfx, przanoni

Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
Task id: 5734
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                                  282/283              282/283
ILK                                  308/315              308/315
SNB              +1-1              340/346              340/346
IVB                 -1              378/384              377/384
BYT                                  296/296              296/296
HSW              +1                 421/428              422/428
BDW                                  318/333              318/333
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
 SNB  igt_kms_flip_dpms-vs-vblank-race-interruptible      DMESG_WARN(3, M22)PASS(2, M22)      DMESG_WARN(1, M22)
 SNB  igt_kms_pipe_crc_basic_read-crc-pipe-A      DMESG_WARN(1, M22)PASS(6, M22)      PASS(1, M22)
 IVB  igt_gem_pwrite_pread_snooped-copy-performance      DMESG_WARN(1, M34)PASS(5, M34)      DMESG_WARN(1, M34)
 HSW  igt_gem_storedw_loop_blt      DMESG_WARN(2, M20)PASS(3, M20)      PASS(1, M20)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC
  2015-02-09 19:05   ` Daniel Vetter
@ 2015-02-10 12:19     ` Paulo Zanoni
  2015-02-11  8:25       ` Daniel Vetter
  0 siblings, 1 reply; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-10 12:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel Graphics Development, Paulo Zanoni, Rodrigo Vivi

2015-02-09 17:05 GMT-02:00 Daniel Vetter <daniel@ffwll.ch>:
> On Mon, Feb 09, 2015 at 02:46:32PM -0200, Paulo Zanoni wrote:
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> Kill the blt/render tracking we currently have and use the frontbuffer
>> tracking infrastructure.
>>
>> Don't enable things by default yet.
>>
>> v2: (Rodrigo) Fix small conflict on rebase and typo at subject.
>> v3: (Paulo) Rebase on RENDER_CS change.
>> v4: (Paulo) Rebase.
>>
>> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v2)
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h          |   9 +--
>>  drivers/gpu/drm/i915/intel_drv.h         |   6 +-
>>  drivers/gpu/drm/i915/intel_fbc.c         | 107 ++++++++++++++++++++-----------
>>  drivers/gpu/drm/i915/intel_frontbuffer.c |  14 +---
>>  drivers/gpu/drm/i915/intel_ringbuffer.c  |  41 +-----------
>>  drivers/gpu/drm/i915/intel_ringbuffer.h  |   1 -
>>  6 files changed, 80 insertions(+), 98 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index d578211..dc002df 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -782,6 +782,7 @@ struct i915_fbc {
>>       unsigned long uncompressed_size;
>>       unsigned threshold;
>>       unsigned int fb_id;
>> +     unsigned int possible_framebuffer_bits;
>>       struct intel_crtc *crtc;
>>       int y;
>>
>> @@ -794,14 +795,6 @@ struct i915_fbc {
>>        * possible. */
>>       bool enabled;
>>
>> -     /* On gen8 some rings cannont perform fbc clean operation so for now
>> -      * we are doing this on SW with mmio.
>> -      * This variable works in the opposite information direction
>> -      * of ring->fbc_dirty telling software on frontbuffer tracking
>> -      * to perform the cache clean on sw side.
>> -      */
>> -     bool need_sw_cache_clean;
>> -
>>       struct intel_fbc_work {
>>               struct delayed_work work;
>>               struct drm_crtc *crtc;
>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>> index a73d091..ed85df8 100644
>> --- a/drivers/gpu/drm/i915/intel_drv.h
>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>> @@ -1101,7 +1101,11 @@ bool intel_fbc_enabled(struct drm_device *dev);
>>  void intel_fbc_update(struct drm_device *dev);
>>  void intel_fbc_init(struct drm_i915_private *dev_priv);
>>  void intel_fbc_disable(struct drm_device *dev);
>> -void bdw_fbc_sw_flush(struct drm_device *dev, u32 value);
>> +void intel_fbc_invalidate(struct drm_i915_private *dev_priv,
>> +                       unsigned int frontbuffer_bits,
>> +                       enum fb_op_origin origin);
>> +void intel_fbc_flush(struct drm_i915_private *dev_priv,
>> +                  unsigned int frontbuffer_bits, enum fb_op_origin origin);
>>
>>  /* intel_hdmi.c */
>>  void intel_hdmi_init(struct drm_device *dev, int hdmi_reg, enum port port);
>> diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
>> index 7341e87..26c4f9c 100644
>> --- a/drivers/gpu/drm/i915/intel_fbc.c
>> +++ b/drivers/gpu/drm/i915/intel_fbc.c
>> @@ -174,29 +174,10 @@ static bool g4x_fbc_enabled(struct drm_device *dev)
>>       return I915_READ(DPFC_CONTROL) & DPFC_CTL_EN;
>>  }
>>
>> -static void snb_fbc_blit_update(struct drm_device *dev)
>> +static void intel_fbc_nuke(struct drm_i915_private *dev_priv)
>>  {
>> -     struct drm_i915_private *dev_priv = dev->dev_private;
>> -     u32 blt_ecoskpd;
>> -
>> -     /* Make sure blitter notifies FBC of writes */
>> -
>> -     /* Blitter is part of Media powerwell on VLV. No impact of
>> -      * his param in other platforms for now */
>> -     intel_uncore_forcewake_get(dev_priv, FORCEWAKE_MEDIA);
>> -
>> -     blt_ecoskpd = I915_READ(GEN6_BLITTER_ECOSKPD);
>> -     blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY <<
>> -             GEN6_BLITTER_LOCK_SHIFT;
>> -     I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
>> -     blt_ecoskpd |= GEN6_BLITTER_FBC_NOTIFY;
>> -     I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
>> -     blt_ecoskpd &= ~(GEN6_BLITTER_FBC_NOTIFY <<
>> -                      GEN6_BLITTER_LOCK_SHIFT);
>> -     I915_WRITE(GEN6_BLITTER_ECOSKPD, blt_ecoskpd);
>> -     POSTING_READ(GEN6_BLITTER_ECOSKPD);
>> -
>> -     intel_uncore_forcewake_put(dev_priv, FORCEWAKE_MEDIA);
>> +     I915_WRITE(MSG_FBC_REND_STATE, FBC_REND_NUKE);
>> +     POSTING_READ(MSG_FBC_REND_STATE);
>>  }
>>
>>  static void ilk_fbc_enable(struct drm_crtc *crtc)
>> @@ -239,9 +220,10 @@ static void ilk_fbc_enable(struct drm_crtc *crtc)
>>               I915_WRITE(SNB_DPFC_CTL_SA,
>>                          SNB_CPU_FENCE_ENABLE | obj->fence_reg);
>>               I915_WRITE(DPFC_CPU_FENCE_OFFSET, crtc->y);
>> -             snb_fbc_blit_update(dev);
>>       }
>>
>> +     intel_fbc_nuke(dev_priv);
>> +
>>       DRM_DEBUG_KMS("enabled fbc on plane %c\n", plane_name(intel_crtc->plane));
>>  }
>>
>> @@ -320,7 +302,7 @@ static void gen7_fbc_enable(struct drm_crtc *crtc)
>>                  SNB_CPU_FENCE_ENABLE | obj->fence_reg);
>>       I915_WRITE(DPFC_CPU_FENCE_OFFSET, crtc->y);
>>
>> -     snb_fbc_blit_update(dev);
>> +     intel_fbc_nuke(dev_priv);
>>
>>       DRM_DEBUG_KMS("enabled fbc on plane %c\n", plane_name(intel_crtc->plane));
>>  }
>> @@ -340,19 +322,6 @@ bool intel_fbc_enabled(struct drm_device *dev)
>>       return dev_priv->fbc.enabled;
>>  }
>>
>> -void bdw_fbc_sw_flush(struct drm_device *dev, u32 value)
>> -{
>> -     struct drm_i915_private *dev_priv = dev->dev_private;
>> -
>> -     if (!IS_GEN8(dev))
>> -             return;
>> -
>> -     if (!intel_fbc_enabled(dev))
>> -             return;
>> -
>> -     I915_WRITE(MSG_FBC_REND_STATE, value);
>> -}
>> -
>>  static void intel_fbc_work_fn(struct work_struct *__work)
>>  {
>>       struct intel_fbc_work *work =
>> @@ -658,6 +627,63 @@ out_disable:
>>       i915_gem_stolen_cleanup_compression(dev);
>>  }
>>
>> +void intel_fbc_invalidate(struct drm_i915_private *dev_priv,
>> +                       unsigned int frontbuffer_bits,
>> +                       enum fb_op_origin origin)
>> +{
>> +     struct drm_device *dev = dev_priv->dev;
>> +     unsigned int fbc_bits;
>> +
>> +     if (origin == ORIGIN_GTT)
>> +             return;
>> +
>> +     if (dev_priv->fbc.enabled)
>> +             fbc_bits = INTEL_FRONTBUFFER_PRIMARY(dev_priv->fbc.crtc->pipe);
>> +     else if (dev_priv->fbc.fbc_work)
>> +             fbc_bits = INTEL_FRONTBUFFER_PRIMARY(
>> +                     to_intel_crtc(dev_priv->fbc.fbc_work->crtc)->pipe);
>> +     else
>> +             fbc_bits = dev_priv->fbc.possible_framebuffer_bits;
>> +
>> +     if ((fbc_bits & frontbuffer_bits) == 0)
>> +             return;
>
> The fbc code should store all the currently invalidate fb_bits somewhere,
> so that it knows when it can reenable fbc. Copypasting from the psr code:
>
> dev_priv->fbc.busy_frontbuffer_bits |= fbc_bits & frontbuffer_bits;
>
> Of course there should only ever be one bit (for the primary plane on the
> crtc fbc is enabled on), so a boolean should be enough. But we could use
> the additional bits for a bit of consistency checks.
>
>> +
>> +     intel_fbc_disable(dev);
>> +}
>> +
>> +void intel_fbc_flush(struct drm_i915_private *dev_priv,
>> +                  unsigned int frontbuffer_bits,
>> +                  enum fb_op_origin origin)
>> +{
>> +     struct drm_device *dev = dev_priv->dev;
>> +     unsigned int fbc_bits = 0;
>> +     bool fbc_enabled;
>> +
>> +     if (dev_priv->fbc.enabled)
>> +             fbc_bits = INTEL_FRONTBUFFER_PRIMARY(dev_priv->fbc.crtc->pipe);
>> +     else
>> +             fbc_bits = dev_priv->fbc.possible_framebuffer_bits;
>> +
>> +     /* These are not the planes you are looking for. */
>> +     if ((frontbuffer_bits & fbc_bits) == 0)
>> +             return;
>> +
>> +     fbc_enabled = intel_fbc_enabled(dev);
>> +
>> +     if ((dev_priv->fb_tracking.busy_bits & frontbuffer_bits &
>> +          fbc_bits) == 0) {
>> +             if (fbc_enabled) {
>> +                     if (origin == ORIGIN_CS || origin == ORIGIN_CPU)
>> +                             intel_fbc_nuke(dev_priv);
>> +             } else {
>> +                     intel_fbc_update(dev);
>> +             }
>
> Ok, here comes the gist of what needs to be changed.
> - You need to check the fbc-private copy of busy bits, otherwise the
>   filtering for the GTT origin won't work. We want to keep fbc enabled (or
>   renable) even when gtt is still invalidated somehow.
> - That means when you decide to reenable fbc (i.e.
>   dev_priv->fbc.busy_frontbuffer_bits == 0) then it's guranteed that fbc
>   is already disabled, and a plain intel_fbc_enable is good enough.
> - nuke in flush is not enough and too late - the frontbuffer rendering
>   could have been going on for a while since you've seen the invalidate,
>   the flush only happens when everything is quiet again. A nuke at the end
>   is hence too late and not enough.
>
> This way you don't need the origin information in the flush helper.
>
> One bit I'm not yet sure off is where to put intel_fbc_update to
> reallocate buffers over pageflips. The complication since you've written
> these patches is the atomic flips, which can now asynchronously update the
> primary plane (and change the pixel format), but different from the
> pageflip code the atomic paths don't disable/enable fbc over a flip.
>
> Which means that doing the intel_fbc_update just somewhere in between the
> disable and enable won't be enough to keep everything in sync.
>
> I think what we ultimately need is a bit of logic in the atomic_check part
> which decides whether fbc must be disabled/enabled (e.g. when we need to
> realloc the cfb) around the update. And then we'll place explicit
> intel_fbc_disable/enable calls in the atomic_begin/flush callbacks.
>
> The other case is just updating the gtt fence data. At least on g4x+ we
> should try not to disable fbc but keep it going over pageflips. Which
> means there could be interesting races between the flip and gtt writes
> right afterwards (which aren't synchronized in any way).
>
> I see two options:
> - We update the fence register whenever we want, but while a pageflip is
>   pending we don't filter out gtt invalidates (since we kinda don't know
>   whether the hw invalidate will hit before/after the flip happens). This
>   one is a bit tricky, we'd need to add fbc callbacks to
>   intel_frontbuffer_flip_prepare/complete.
> - We do a manual nuke after each flip by adding calls to intel_fbc_flip to
>   intel_frontbuffer_flip and flip_complete. This one has the downside that
>   proper userspace which doesn't do gtt writes will get hurt by nuking the
>   fbc twice: Once for the flip and once right afterwards.

Thanks for the review!

If you can think of any way to write a test case to reproduce the
problems you pointed above, please give me ideas! I'll try to do this
while rewriting the patches, but maybe you already have something in
your mind? I guess we need to try to find a way to get an invalidate
but only get the corresponding flush a long time (> 1 frame) after
that? Maybe keep the frame busy with tons of operations that don't
change the CRC?


>
> Cheers, Daniel
>> +     } else {
>> +             if (fbc_enabled)
>> +                     intel_fbc_disable(dev);
>> +     }
>> +}
>> +
>>  /**
>>   * intel_fbc_init - Initialize FBC
>>   * @dev_priv: the i915 device
>> @@ -666,12 +692,19 @@ out_disable:
>>   */
>>  void intel_fbc_init(struct drm_i915_private *dev_priv)
>>  {
>> +     enum pipe pipe;
>> +
>>       if (!HAS_FBC(dev_priv)) {
>>               dev_priv->fbc.enabled = false;
>>               dev_priv->fbc.no_fbc_reason = FBC_UNSUPPORTED;
>>               return;
>>       }
>>
>> +     /* TODO: some platforms have FBC tied to a specific plane! */
>> +     for_each_pipe(dev_priv, pipe)
>> +             dev_priv->fbc.possible_framebuffer_bits |=
>> +                             INTEL_FRONTBUFFER_PRIMARY(pipe);
>> +
>>       if (INTEL_INFO(dev_priv)->gen >= 7) {
>>               dev_priv->display.fbc_enabled = ilk_fbc_enabled;
>>               dev_priv->display.enable_fbc = gen7_fbc_enable;
>> diff --git a/drivers/gpu/drm/i915/intel_frontbuffer.c b/drivers/gpu/drm/i915/intel_frontbuffer.c
>> index 0a773981..2fc059c 100644
>> --- a/drivers/gpu/drm/i915/intel_frontbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_frontbuffer.c
>> @@ -118,8 +118,6 @@ static void intel_mark_fb_busy(struct drm_device *dev,
>>                       continue;
>>
>>               intel_increase_pllclock(dev, pipe);
>> -             if (ring && intel_fbc_enabled(dev))
>> -                     ring->fbc_dirty = true;
>>       }
>>  }
>>
>> @@ -160,6 +158,7 @@ void intel_fb_obj_invalidate(struct drm_i915_gem_object *obj,
>>
>>       intel_psr_invalidate(dev, obj->frontbuffer_bits);
>>       intel_edp_drrs_invalidate(dev, obj->frontbuffer_bits);
>> +     intel_fbc_invalidate(dev_priv, obj->frontbuffer_bits, origin);
>>  }
>>
>>  /**
>> @@ -189,16 +188,7 @@ void intel_frontbuffer_flush(struct drm_device *dev,
>>
>>       intel_edp_drrs_flush(dev, frontbuffer_bits);
>>       intel_psr_flush(dev, frontbuffer_bits);
>> -
>> -     /*
>> -      * FIXME: Unconditional fbc flushing here is a rather gross hack and
>> -      * needs to be reworked into a proper frontbuffer tracking scheme like
>> -      * psr employs.
>> -      */
>> -     if (dev_priv->fbc.need_sw_cache_clean) {
>> -             dev_priv->fbc.need_sw_cache_clean = false;
>> -             bdw_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
>> -     }
>> +     intel_fbc_flush(dev_priv, frontbuffer_bits, origin);
>>  }
>>
>>  /**
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 573b80f..48dd8a2 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -317,29 +317,6 @@ gen7_render_ring_cs_stall_wa(struct intel_engine_cs *ring)
>>       return 0;
>>  }
>>
>> -static int gen7_ring_fbc_flush(struct intel_engine_cs *ring, u32 value)
>> -{
>> -     int ret;
>> -
>> -     if (!ring->fbc_dirty)
>> -             return 0;
>> -
>> -     ret = intel_ring_begin(ring, 6);
>> -     if (ret)
>> -             return ret;
>> -     /* WaFbcNukeOn3DBlt:ivb/hsw */
>> -     intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
>> -     intel_ring_emit(ring, MSG_FBC_REND_STATE);
>> -     intel_ring_emit(ring, value);
>> -     intel_ring_emit(ring, MI_STORE_REGISTER_MEM(1) | MI_SRM_LRM_GLOBAL_GTT);
>> -     intel_ring_emit(ring, MSG_FBC_REND_STATE);
>> -     intel_ring_emit(ring, ring->scratch.gtt_offset + 256);
>> -     intel_ring_advance(ring);
>> -
>> -     ring->fbc_dirty = false;
>> -     return 0;
>> -}
>> -
>>  static int
>>  gen7_render_ring_flush(struct intel_engine_cs *ring,
>>                      u32 invalidate_domains, u32 flush_domains)
>> @@ -398,9 +375,6 @@ gen7_render_ring_flush(struct intel_engine_cs *ring,
>>       intel_ring_emit(ring, 0);
>>       intel_ring_advance(ring);
>>
>> -     if (!invalidate_domains && flush_domains)
>> -             return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
>> -
>>       return 0;
>>  }
>>
>> @@ -462,9 +436,6 @@ gen8_render_ring_flush(struct intel_engine_cs *ring,
>>       if (ret)
>>               return ret;
>>
>> -     if (!invalidate_domains && flush_domains)
>> -             return gen7_ring_fbc_flush(ring, FBC_REND_NUKE);
>> -
>>       return 0;
>>  }
>>
>> @@ -2382,7 +2353,6 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>>                          u32 invalidate, u32 flush)
>>  {
>>       struct drm_device *dev = ring->dev;
>> -     struct drm_i915_private *dev_priv = dev->dev_private;
>>       uint32_t cmd;
>>       int ret;
>>
>> @@ -2391,7 +2361,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>>               return ret;
>>
>>       cmd = MI_FLUSH_DW;
>> -     if (INTEL_INFO(ring->dev)->gen >= 8)
>> +     if (INTEL_INFO(dev)->gen >= 8)
>>               cmd += 1;
>>       /*
>>        * Bspec vol 1c.3 - blitter engine command streamer:
>> @@ -2404,7 +2374,7 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>>                       MI_FLUSH_DW_OP_STOREDW;
>>       intel_ring_emit(ring, cmd);
>>       intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
>> -     if (INTEL_INFO(ring->dev)->gen >= 8) {
>> +     if (INTEL_INFO(dev)->gen >= 8) {
>>               intel_ring_emit(ring, 0); /* upper addr */
>>               intel_ring_emit(ring, 0); /* value */
>>       } else  {
>> @@ -2413,13 +2383,6 @@ static int gen6_ring_flush(struct intel_engine_cs *ring,
>>       }
>>       intel_ring_advance(ring);
>>
>> -     if (!invalidate && flush) {
>> -             if (IS_GEN7(dev))
>> -                     return gen7_ring_fbc_flush(ring, FBC_REND_CACHE_CLEAN);
>> -             else if (IS_BROADWELL(dev))
>> -                     dev_priv->fbc.need_sw_cache_clean = true;
>> -     }
>> -
>>       return 0;
>>  }
>>
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 714f3fd..f8dc0c6 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -267,7 +267,6 @@ struct  intel_engine_cs {
>>        */
>>       struct drm_i915_gem_request *outstanding_lazy_request;
>>       bool gpu_caches_dirty;
>> -     bool fbc_dirty;
>>
>>       wait_queue_head_t irq_queue;
>>
>> --
>> 2.1.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Paulo Zanoni
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC
  2015-02-10 12:19     ` Paulo Zanoni
@ 2015-02-11  8:25       ` Daniel Vetter
  0 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2015-02-11  8:25 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: Intel Graphics Development, Paulo Zanoni, Rodrigo Vivi

On Tue, Feb 10, 2015 at 10:19:10AM -0200, Paulo Zanoni wrote:
> 2015-02-09 17:05 GMT-02:00 Daniel Vetter <daniel@ffwll.ch>:
> > Ok, here comes the gist of what needs to be changed.
> > - You need to check the fbc-private copy of busy bits, otherwise the
> >   filtering for the GTT origin won't work. We want to keep fbc enabled (or
> >   renable) even when gtt is still invalidated somehow.
> > - That means when you decide to reenable fbc (i.e.
> >   dev_priv->fbc.busy_frontbuffer_bits == 0) then it's guranteed that fbc
> >   is already disabled, and a plain intel_fbc_enable is good enough.
> > - nuke in flush is not enough and too late - the frontbuffer rendering
> >   could have been going on for a while since you've seen the invalidate,
> >   the flush only happens when everything is quiet again. A nuke at the end
> >   is hence too late and not enough.
> >
> > This way you don't need the origin information in the flush helper.
> >
> > One bit I'm not yet sure off is where to put intel_fbc_update to
> > reallocate buffers over pageflips. The complication since you've written
> > these patches is the atomic flips, which can now asynchronously update the
> > primary plane (and change the pixel format), but different from the
> > pageflip code the atomic paths don't disable/enable fbc over a flip.
> >
> > Which means that doing the intel_fbc_update just somewhere in between the
> > disable and enable won't be enough to keep everything in sync.
> >
> > I think what we ultimately need is a bit of logic in the atomic_check part
> > which decides whether fbc must be disabled/enabled (e.g. when we need to
> > realloc the cfb) around the update. And then we'll place explicit
> > intel_fbc_disable/enable calls in the atomic_begin/flush callbacks.
> >
> > The other case is just updating the gtt fence data. At least on g4x+ we
> > should try not to disable fbc but keep it going over pageflips. Which
> > means there could be interesting races between the flip and gtt writes
> > right afterwards (which aren't synchronized in any way).
> >
> > I see two options:
> > - We update the fence register whenever we want, but while a pageflip is
> >   pending we don't filter out gtt invalidates (since we kinda don't know
> >   whether the hw invalidate will hit before/after the flip happens). This
> >   one is a bit tricky, we'd need to add fbc callbacks to
> >   intel_frontbuffer_flip_prepare/complete.
> > - We do a manual nuke after each flip by adding calls to intel_fbc_flip to
> >   intel_frontbuffer_flip and flip_complete. This one has the downside that
> >   proper userspace which doesn't do gtt writes will get hurt by nuking the
> >   fbc twice: Once for the flip and once right afterwards.
> 
> Thanks for the review!
> 
> If you can think of any way to write a test case to reproduce the
> problems you pointed above, please give me ideas! I'll try to do this
> while rewriting the patches, but maybe you already have something in
> your mind? I guess we need to try to find a way to get an invalidate
> but only get the corresponding flush a long time (> 1 frame) after
> that? Maybe keep the frame busy with tons of operations that don't
> change the CRC?

Well that's going to be fairly hard since you need to race pageflips with
gtt writes and then try to sneak in gtt writes for the new buffer in
before the fence and everything is set up correctly.

The time between invalidate and flush isn't a problem since for GTT mmaps
it's forever by default. So as long as you don't do any set_domain/execbuf
or anything else GTT mmaps will stay invalidated. But since we want to
filter out GTT invalidates (the hw actually works for that case!) the
problem creeps in when we update the gtt fbc fence in a racy way. The
frontbuffer tracking stuff is race-free (avoiding that race is why there's
the flip_prepare/complete split) itself.

If you want to try the following might work to reproduce this case:

Prerequisites: We don't disable fbc any more over pageflips, which we
  can do on g4x as long as the pixel format stays the same.

1. Take reference crc for black, pain test buffer white.
2. Pageflip to that white buffer.
3. Immediately afterwards (i.e. before the pageflip completed) draw the
buffer black through gtt mmaps with all the set_domain stuff that's
required.
4. Wait a few frames, grab crc.

And now that I've describe in detail there's actually no race here as long
as we update the fbc fence _before_ we do the pageflip. Some invalidates
will hit the old buffer still, but the pageflip will invalidate everything
anyway. And any gtt writes after the flip will do line invalidation
correctly.

So I don't think we'll have a problem, and I also don't think we need
another testcase really.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites
  2015-02-09 18:41   ` Daniel Vetter
@ 2015-02-11  8:30     ` Daniel Vetter
  2015-02-12 14:12       ` Jani Nikula
  2015-02-12 17:58       ` Paulo Zanoni
  0 siblings, 2 replies; 19+ messages in thread
From: Daniel Vetter @ 2015-02-11  8:30 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Mon, Feb 09, 2015 at 07:41:17PM +0100, Daniel Vetter wrote:
> On Mon, Feb 09, 2015 at 02:46:31PM -0200, Paulo Zanoni wrote:
> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > 
> > We need this for FBC, and possibly for PSR too.
> > 
> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 3d198f8..15910fa 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1111,6 +1111,10 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
> >  			ret = i915_gem_phys_pwrite(obj, args, file);
> >  		else
> >  			ret = i915_gem_shmem_pwrite(dev, obj, args, file);
> > +
> > +		intel_fb_obj_flush(obj, false, ORIGIN_CPU);
> > +	} else {
> > +		intel_fb_obj_flush(obj, false, ORIGIN_GTT);
> 
> A flush alone does nothing. Well should, but you're kinda not quite using
> it correctly in the next patch to convert fbc over to frontbuffer
> tracking.
> 
> I guess the docs aren't perfect, so let me try again. There are two kinds
> of events the frontbuffer tracking code supplies to tell its consumers
> that screen changes are happening:
> - invalidate/flush: Invalidate denotes the start of the frontbuffer
>   rendering, from that point on psr/fbc/drrs must update the screen with
>   the usual refresh rate and not cache anything anywhere. When the flush
>   happens (which could easily be after a _very_ long time, e.g. fbcon)
>   then only can caching recomence. Caching = enable fbc, allow psr or
>   reduce refresh rate.
> - flip events: That's an instantenous event (well at least for consumer,
>   internally we need to track it as prepare/complete for async flips), and
>   mostly just interesting when the hw doesn't notice flips (some psr modes
>   and drrs).
> 
> So if you want to add frontbuffer tracking to pwrite then we need both an
> invalidate (before the actual pwrite) and a flush (after the pwrite, like
> you've added here).
> 
> The other issue is that there's a bug with the origin assignemnt:
> phys_pwrite also goes through the gtt. I think it would be best if we push
> the fb_obj_invalidate/flush into the relevant pwrite functions. That
> should make it easier to review, since the invalidate/flush will be next
> to the write op.

btw what's the use-case here? We don't upload stuff to X-tiled buffers
with pwrite, so this isn't really relevant for fbc I think.

It is a real gap for psr though, since cursor updates are done with
pwrite. But that probably gets papered over by X also updating the
position when the image changes, which means we'll get another (hw)
flip combo. But that's just X, and we indeed don't have a pwrite cursor
case yet in the psr testcase.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites
  2015-02-11  8:30     ` Daniel Vetter
@ 2015-02-12 14:12       ` Jani Nikula
  2015-02-12 17:58       ` Paulo Zanoni
  1 sibling, 0 replies; 19+ messages in thread
From: Jani Nikula @ 2015-02-12 14:12 UTC (permalink / raw)
  To: Daniel Vetter, Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Wed, 11 Feb 2015, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Mon, Feb 09, 2015 at 07:41:17PM +0100, Daniel Vetter wrote:
>> On Mon, Feb 09, 2015 at 02:46:31PM -0200, Paulo Zanoni wrote:
>> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> > 
>> > We need this for FBC, and possibly for PSR too.
>> > 
>> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem.c | 4 ++++
>> >  1 file changed, 4 insertions(+)
>> > 
>> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> > index 3d198f8..15910fa 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem.c
>> > @@ -1111,6 +1111,10 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
>> >  			ret = i915_gem_phys_pwrite(obj, args, file);
>> >  		else
>> >  			ret = i915_gem_shmem_pwrite(dev, obj, args, file);
>> > +
>> > +		intel_fb_obj_flush(obj, false, ORIGIN_CPU);
>> > +	} else {
>> > +		intel_fb_obj_flush(obj, false, ORIGIN_GTT);
>> 
>> A flush alone does nothing. Well should, but you're kinda not quite using
>> it correctly in the next patch to convert fbc over to frontbuffer
>> tracking.
>> 
>> I guess the docs aren't perfect, so let me try again. There are two kinds
>> of events the frontbuffer tracking code supplies to tell its consumers
>> that screen changes are happening:
>> - invalidate/flush: Invalidate denotes the start of the frontbuffer
>>   rendering, from that point on psr/fbc/drrs must update the screen with
>>   the usual refresh rate and not cache anything anywhere. When the flush
>>   happens (which could easily be after a _very_ long time, e.g. fbcon)
>>   then only can caching recomence. Caching = enable fbc, allow psr or
>>   reduce refresh rate.
>> - flip events: That's an instantenous event (well at least for consumer,
>>   internally we need to track it as prepare/complete for async flips), and
>>   mostly just interesting when the hw doesn't notice flips (some psr modes
>>   and drrs).
>> 
>> So if you want to add frontbuffer tracking to pwrite then we need both an
>> invalidate (before the actual pwrite) and a flush (after the pwrite, like
>> you've added here).
>> 
>> The other issue is that there's a bug with the origin assignemnt:
>> phys_pwrite also goes through the gtt. I think it would be best if we push
>> the fb_obj_invalidate/flush into the relevant pwrite functions. That
>> should make it easier to review, since the invalidate/flush will be next
>> to the write op.
>
> btw what's the use-case here? We don't upload stuff to X-tiled buffers
> with pwrite, so this isn't really relevant for fbc I think.
>
> It is a real gap for psr though, since cursor updates are done with
> pwrite. But that probably gets papered over by X also updating the
> position when the image changes, which means we'll get another (hw)
> flip combo. But that's just X, and we indeed don't have a pwrite cursor
> case yet in the psr testcase.

FYI, you asked to test an earlier version of this patch in
https://bugs.freedesktop.org/show_bug.cgi?id=87143

BR,
Jani


> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites
  2015-02-11  8:30     ` Daniel Vetter
  2015-02-12 14:12       ` Jani Nikula
@ 2015-02-12 17:58       ` Paulo Zanoni
  1 sibling, 0 replies; 19+ messages in thread
From: Paulo Zanoni @ 2015-02-12 17:58 UTC (permalink / raw)
  To: Daniel Vetter, Rodrigo Vivi; +Cc: Intel Graphics Development, Paulo Zanoni

2015-02-11 6:30 GMT-02:00 Daniel Vetter <daniel@ffwll.ch>:
> On Mon, Feb 09, 2015 at 07:41:17PM +0100, Daniel Vetter wrote:
>> On Mon, Feb 09, 2015 at 02:46:31PM -0200, Paulo Zanoni wrote:
>> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> >
>> > We need this for FBC, and possibly for PSR too.
>> >
>> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> > ---
>> >  drivers/gpu/drm/i915/i915_gem.c | 4 ++++
>> >  1 file changed, 4 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> > index 3d198f8..15910fa 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem.c
>> > @@ -1111,6 +1111,10 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
>> >                     ret = i915_gem_phys_pwrite(obj, args, file);
>> >             else
>> >                     ret = i915_gem_shmem_pwrite(dev, obj, args, file);
>> > +
>> > +           intel_fb_obj_flush(obj, false, ORIGIN_CPU);
>> > +   } else {
>> > +           intel_fb_obj_flush(obj, false, ORIGIN_GTT);
>>
>> A flush alone does nothing. Well should, but you're kinda not quite using
>> it correctly in the next patch to convert fbc over to frontbuffer
>> tracking.
>>
>> I guess the docs aren't perfect, so let me try again. There are two kinds
>> of events the frontbuffer tracking code supplies to tell its consumers
>> that screen changes are happening:
>> - invalidate/flush: Invalidate denotes the start of the frontbuffer
>>   rendering, from that point on psr/fbc/drrs must update the screen with
>>   the usual refresh rate and not cache anything anywhere. When the flush
>>   happens (which could easily be after a _very_ long time, e.g. fbcon)
>>   then only can caching recomence. Caching = enable fbc, allow psr or
>>   reduce refresh rate.
>> - flip events: That's an instantenous event (well at least for consumer,
>>   internally we need to track it as prepare/complete for async flips), and
>>   mostly just interesting when the hw doesn't notice flips (some psr modes
>>   and drrs).
>>
>> So if you want to add frontbuffer tracking to pwrite then we need both an
>> invalidate (before the actual pwrite) and a flush (after the pwrite, like
>> you've added here).
>>
>> The other issue is that there's a bug with the origin assignemnt:
>> phys_pwrite also goes through the gtt. I think it would be best if we push
>> the fb_obj_invalidate/flush into the relevant pwrite functions. That
>> should make it easier to review, since the invalidate/flush will be next
>> to the write op.
>
> btw what's the use-case here? We don't upload stuff to X-tiled buffers
> with pwrite, so this isn't really relevant for fbc I think.

We can do this, I wrote a testcase for that. Of course, user space
needs to be aware of the tiling, but that's documented on the PRM.

>
> It is a real gap for psr though, since cursor updates are done with
> pwrite.

A lot of the tests I wrote can be used by PSR, I should plan on doing
this later.

> But that probably gets papered over by X also updating the
> position when the image changes, which means we'll get another (hw)
> flip combo. But that's just X, and we indeed don't have a pwrite cursor
> case yet in the psr testcase.
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Paulo Zanoni
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-02-12 17:58 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-09 16:46 [RESEND 0/9] Reviewed FBC patches Paulo Zanoni
2015-02-09 16:46 ` [PATCH 1/9] drm/i915: don't try to find crtcs for FBC if it's disabled Paulo Zanoni
2015-02-09 16:46 ` [PATCH 2/9] drm/i915: don't keep reassigning FBC_UNSUPPORTED Paulo Zanoni
2015-02-09 16:46 ` [PATCH 3/9] drm/i915: change dev_priv->fbc.plane to dev_priv->fbc.crtc Paulo Zanoni
2015-02-09 16:46 ` [PATCH 4/9] drm/i915: pass which operation triggered the frontbuffer tracking Paulo Zanoni
2015-02-09 18:29   ` Daniel Vetter
2015-02-09 16:46 ` [PATCH 5/9] drm/i915: also do frontbuffer tracking on pwrites Paulo Zanoni
2015-02-09 18:41   ` Daniel Vetter
2015-02-11  8:30     ` Daniel Vetter
2015-02-12 14:12       ` Jani Nikula
2015-02-12 17:58       ` Paulo Zanoni
2015-02-09 16:46 ` [PATCH 6/9] drm/i915: add frontbuffer tracking to FBC Paulo Zanoni
2015-02-09 19:05   ` Daniel Vetter
2015-02-10 12:19     ` Paulo Zanoni
2015-02-11  8:25       ` Daniel Vetter
2015-02-09 16:46 ` [PATCH 7/9] drm/i915: extract intel_fbc_find_crtc() Paulo Zanoni
2015-02-09 16:46 ` [PATCH 8/9] drm/i915: HSW+ FBC is tied to pipe A Paulo Zanoni
2015-02-09 16:46 ` [PATCH 9/9] drm/i915: gen5+ can have FBC with multiple pipes Paulo Zanoni
2015-02-10 11:20   ` shuang.he

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.