All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/9] New DDB Algo and WM fixes
@ 2016-09-09  8:00 Kumar, Mahesh
  2016-09-09  8:00 ` [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
                   ` (8 more replies)
  0 siblings, 9 replies; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This series implements new DDB allocation algorithm to solve the cases,
where we have sufficient DDB available to enable multiple planes, But
due to the current algorithm not dividing it properly among planes, we
end-up failing the flip.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.
This series also implements Transition Watermarks and Gen-9 related
arbitrated display bandwidth Workarounds.

There are two steps in current WM programming.

1. Calculate minimum number of blocks required  for a WM level to be
enabled. For 1440x2560 panel we need 41 blocks as minimum number of
blocks to enable WM0. This is the step which doesn't use vertical size.
It only depends on Pipe drain rate and plane horizontal size as per the
current Bspec algorithm.
So all the plane below have minimum  number of blocks required to enable
WM0 as 41
    Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
    Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41

2. Number of blocks allotted by the driver
    Driver allocates  12 for Plane 3   &  16 for plane 4

    Total Dbuf Available = 508
    Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476
    allocate minimum blocks for each plane 8 * 4 = 32
    remaining blocks = 476 - 32 = 444
    Relative Data Rate for Planes
       Plane 1  =  1440 * 2560 * 3  =  11059200
       Plane 2  =  1440 * 2560 * 3  =  11059200
       Plane 3  =  1440 * 48   * 3  =  207360
       Plane 4  =  1440 * 96   * 3  =  414720
       Total Relative BW            =  22740480

-   Allocate Buffer
    buffer allocation = (Plane relative data rate / total data rate)
		    * total remaming DDB + minimum plane DDB
     Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
     Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16

In this case it forced driver to disable Plane 3 & 4. Driver need to use
more efficient way to allocate buffer that is optimum for power.

New Algorithm suggested by HW team is:

1. Calculate minimum buffer allocations for each plane and for each
    watermark level

2. Add minimum buffer allocations required for enabling WM7
    for all the planes

Level 0 =  41 + 41 + 41 + 41  = 164
Level 1 =  42 + 42 + 42 + 42  = 168
Level 2 =  42 + 42 + 42 + 42  = 168
Level 3 =  94 + 94 + 94 + 94 =  376
Level 4 =  94 + 94 + 94 + 94 =  376
Level 5 =  94 + 94 + 94 + 94 =  376
Level 6 =  94 + 94 + 94 + 94 =  376
Level 7 =  94 + 94 + 94 + 94 =  376

3. Check to see how many buffer allocation are left and enable
the best case. In this case since we have 476 blocks we can enable
WM0-7 on all 4 planes.
Let's say if we have only 200 block available then the best cases
allocation is to enable Level2 which requires 168 blocks

Changes since v1:
 - Rebased the series on top of Paulo's patches (under review)
    https://patchwork.freedesktop.org/series/12082/
 - Add changes to calculate WM in fixed point 16.16
 - Address review comments for Transition WM
Changes since v2:
 - Added Paulo's WM fixes series URL in cover letter

Mahesh Kumar (9):
  drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm)
    functions
  drm/i915/skl+: use linetime latency instead of ddb size
  drm/i915/skl: New ddb allocation algorithm
  drm/i915: Decode system memory bandwidth
  drm/i915/gen9: WM memory bandwidth related workaround
  drm/i915/skl+: change WM calc to fixed point 16.16
  drm/i915/bxt: Enable IPC support
  drm/i915/bxt: set chicken bit as IPC y-tile WA
  drm/i915/bxt: Implement Transition WM

 drivers/gpu/drm/i915/i915_drv.c      | 101 +++++++++
 drivers/gpu/drm/i915/i915_drv.h      |  29 +++
 drivers/gpu/drm/i915/i915_params.c   |   5 +
 drivers/gpu/drm/i915/i915_params.h   |   1 +
 drivers/gpu/drm/i915/i915_reg.h      |  29 +++
 drivers/gpu/drm/i915/intel_display.c |  50 +++++
 drivers/gpu/drm/i915/intel_drv.h     |  12 +
 drivers/gpu/drm/i915/intel_pm.c      | 412 +++++++++++++++++++++++++++--------
 8 files changed, 551 insertions(+), 88 deletions(-)

-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
@ 2016-09-09  8:00 ` Kumar, Mahesh
  2016-09-20 12:17   ` Paulo Zanoni
  2016-09-09  8:00 ` [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch make use of plane_wm variable directly instead of passing
skl_plane_wm struct. this way reduces number of argument requirement
in watermark calculation functions.

It also gives more freedom of decision making to implement Bspec WM
workarounds.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 86c6d66..3fdec4d 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3542,9 +3542,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_plane_state *intel_pstate,
 				uint16_t ddb_allocation,
 				int level,
-				uint16_t *out_blocks, /* out */
-				uint8_t *out_lines, /* out */
-				bool *enabled /* out */)
+				struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
@@ -3557,6 +3555,11 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint32_t width = 0, height = 0;
 	uint32_t plane_pixel_rate;
 	uint32_t y_tile_minimum, y_min_scanlines;
+	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
+	struct skl_wm_level *result = &pipe_wm->wm[level];
+	uint16_t *out_blocks = &result->plane_res_b[id];
+	uint8_t *out_lines = &result->plane_res_l[id];
+	bool *enabled = &result->plane_en[id];
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
 		*enabled = false;
@@ -3673,7 +3676,7 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
-		     struct skl_wm_level *result)
+		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
@@ -3684,12 +3687,6 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
-	/*
-	 * We'll only calculate watermarks for planes that are actually
-	 * enabled, so make sure all other planes are set as disabled.
-	 */
-	memset(result, 0, sizeof(*result));
-
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
@@ -3727,9 +3724,7 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 					   intel_pstate,
 					   ddb_blocks,
 					   level,
-					   &result->plane_res_b[i],
-					   &result->plane_res_l[i],
-					   &result->plane_en[i]);
+					   pipe_wm);
 		if (ret)
 			return ret;
 	}
@@ -3777,9 +3772,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	int level, max_level = ilk_wm_max_level(dev);
 	int ret;
 
+	/*
+	 * We'll only calculate watermarks for planes that are actually
+	 * enabled, so make sure all other planes are set as disabled.
+	 */
+	memset(pipe_wm, 0, sizeof(*pipe_wm));
+
 	for (level = 0; level <= max_level; level++) {
 		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-					   level, &pipe_wm->wm[level]);
+					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
  2016-09-09  8:00 ` [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
@ 2016-09-09  8:00 ` Kumar, Mahesh
  2016-09-12  8:56   ` Maarten Lankhorst
  2016-09-19 18:19   ` Paulo Zanoni
  2016-09-09  8:01 ` [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch make changes to use linetime latency instead of allocated
DDB size during plane watermark calculation in switch case, This is
required to implement new DDB allocation algorithm.

In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to use during
switch-case for WM blocks selection.

Changes since v1:
 - Rebase on top of Paulo's patch series

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3fdec4d..cfd9b7d1 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3622,10 +3622,15 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
 		selected_result = max(method2, y_tile_minimum);
 	} else {
+		uint32_t linetime_us = 0;
+
+		linetime_us = DIV_ROUND_UP(width * 1000,
+				skl_pipe_pixel_rate(cstate));
+
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation / plane_blocks_per_line) >= 1)
+		if (latency >= linetime_us)
 			selected_result = min(method1, method2);
 		else
 			selected_result = method1;
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
  2016-09-09  8:00 ` [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
  2016-09-09  8:00 ` [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
@ 2016-09-09  8:01 ` Kumar, Mahesh
  2016-09-12 10:50   ` Maarten Lankhorst
  2016-09-12 13:11   ` Maarten Lankhorst
  2016-09-09  8:01 ` [PATCH v3 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch implements new DDB allocation algorithm as per HW team
suggestion. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.

Changes since v1:
 - Rebase on top of Paulo's patch series

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 129 +++++++++++++++++++++-------------------
 1 file changed, 67 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index cfd9b7d1..7c70e07 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3348,6 +3348,7 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
+		      struct skl_pipe_wm *pipe_wm,
 		      struct skl_ddb_allocation *ddb /* out */)
 {
 	struct drm_atomic_state *state = cstate->base.state;
@@ -3363,8 +3364,11 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	uint16_t *minimum = cstate->wm.skl.minimum_blocks;
 	uint16_t *y_minimum = cstate->wm.skl.minimum_y_blocks;
 	unsigned int total_data_rate;
+	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb = 0;
 	int num_active;
-	int id, i;
+	int max_level, level;
+	int id, i, ret = 0;
 
 	if (WARN_ON(!state))
 		return 0;
@@ -3380,6 +3384,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	alloc_size = skl_ddb_entry_size(alloc);
 	if (alloc_size == 0) {
 		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+		memset(ddb->y_plane[pipe], 0, sizeof(ddb->y_plane[pipe]));
 		return 0;
 	}
 
@@ -3413,19 +3418,42 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	}
 
 	for (i = 0; i < PLANE_CURSOR; i++) {
-		alloc_size -= minimum[i];
-		alloc_size -= y_minimum[i];
+		total_min_blocks += minimum[i];
+		total_min_blocks += y_minimum[i];
 	}
 
-	/*
-	 * 2. Distribute the remaining space in proportion to the amount of
-	 * data each plane needs to fetch from memory.
-	 *
-	 * FIXME: we may not allocate every single block here.
-	 */
+	for (level = ilk_wm_max_level(dev); level >= 0; level--) {
+		total_level_ddb = 0;
+		for (i = 0; i < PLANE_CURSOR; i++) {
+			/*
+			 * TODO: We should calculate watermark values for Y/UV
+			 * plane both in case of NV12 format and use both values
+			 * for ddb calculation, As NV12 is disabled as of now.
+			 * using only single plane value here.
+			 */
+			uint16_t min = minimum[i] + y_minimum[i];
+			uint16_t plane_level_ddb_wm =
+				max(pipe_wm->wm[level].plane_res_b[i], min);
+			total_level_ddb += plane_level_ddb_wm;
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		ret = -EINVAL;
+		goto exit;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	total_data_rate = skl_get_total_relative_data_rate(cstate);
 	if (total_data_rate == 0)
-		return 0;
+		goto exit;
 
 	start = alloc->start;
 	for_each_intel_plane_on_crtc(dev, intel_crtc, intel_plane) {
@@ -3440,7 +3468,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[id];
+		plane_blocks = max(pipe_wm->wm[max_level].plane_res_b[id],
+					minimum[id]);
 		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
 					total_data_rate);
 
@@ -3454,6 +3483,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we start calculating watermark values for Y/UV
+		 * plane both consider it for initial allowed wm blocks.
 		 */
 		y_data_rate = cstate->wm.skl.plane_y_data_rate[id];
 
@@ -3467,9 +3498,22 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		}
 
 		start += y_plane_blocks;
+
+		/*
+		 * Now enable all levels in WM structure which can be enabled
+		 * using current DDB allocation
+		 */
+		for (i = ilk_wm_max_level(dev); i > 0; i--) {
+			if (i > max_level || pipe_wm->wm[i].plane_res_l[id] > 31
+					|| pipe_wm->wm[i].plane_res_b[id] == 0)
+				pipe_wm->wm[i].plane_en[id] = false;
+			else
+				pipe_wm->wm[i].plane_en[id] = true;
+		}
 	}
 
-	return 0;
+exit:
+	return ret;
 }
 
 static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
@@ -3540,7 +3584,6 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				struct skl_pipe_wm *pipe_wm)
 {
@@ -3559,12 +3602,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
-	bool *enabled = &result->plane_en[id];
 
-	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
-		*enabled = false;
+	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
-	}
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
@@ -3649,54 +3689,27 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("Plane %d.%d: blocks required = %u/%u, lines required = %u/31\n",
-				      to_intel_crtc(cstate->base.crtc)->pipe,
-				      skl_wm_plane_id(to_intel_plane(pstate->plane)),
-				      res_blocks, ddb_allocation, res_lines);
-
-			return -EINVAL;
-		}
-	}
-
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
 		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
 	struct drm_plane *plane;
 	struct intel_plane *intel_plane;
 	struct intel_plane_state *intel_pstate;
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
-		int i = skl_wm_plane_id(intel_plane);
-
 		plane = &intel_plane->base;
 		intel_pstate = NULL;
 		if (state)
@@ -3722,12 +3735,9 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 
 		WARN_ON(!intel_pstate->base.fb);
 
-		ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][i]);
-
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   pipe_wm);
 		if (ret)
@@ -3784,11 +3794,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	memset(pipe_wm, 0, sizeof(*pipe_wm));
 
 	for (level = 0; level <= max_level; level++) {
-		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
+		ret = skl_compute_wm_level(dev_priv, cstate,
 					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
+	ret = skl_allocate_pipe_ddb(cstate, pipe_wm, ddb);
+	if (ret)
+		return ret;
+
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
 	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
@@ -3976,13 +3990,12 @@ pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_pipes(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct intel_crtc *intel_crtc;
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
 	uint32_t realloc_pipes = pipes_modified(state);
 	int ret;
 
@@ -4035,10 +4048,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
 
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
 		ret = drm_atomic_add_affected_planes(state, &intel_crtc->base);
 		if (ret)
 			return ret;
@@ -4092,19 +4101,15 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
 
 	/*
 	 * Calculate WM's for all pipes that are part of this transaction.
-	 * Note that the DDB allocation above may have added more CRTC's that
-	 * weren't otherwise being modified (and set bits in dirty_pipes) if
-	 * pipe allocations had to change.
-	 *
-	 * FIXME:  Now that we're doing this in the atomic check phase, we
-	 * should allow skl_update_pipe_wm() to return failure in cases where
-	 * no suitable watermark values can be found.
+	 * Note that affected pipe calculation above may have added more
+	 * CRTC's that weren't otherwise being modified (and set bits in
+	 * dirty_pipes) if pipe allocations had to change.
 	 */
 	for_each_crtc_in_state(state, crtc, cstate, i) {
 		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 4/9] drm/i915: Decode system memory bandwidth
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                   ` (2 preceding siblings ...)
  2016-09-09  8:01 ` [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
@ 2016-09-09  8:01 ` Kumar, Mahesh
  2016-09-16  8:02   ` Pandiyan, Dhinakaran
  2016-09-19 20:41   ` Paulo Zanoni
  2016-09-09  8:01 ` [PATCH v3 5/9] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch adds support to decode system memory bandwidth
which will be used for arbitrated display memory percentage
calculation in GEN9 based system.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 96 +++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h | 18 ++++++++
 drivers/gpu/drm/i915/i915_reg.h | 25 +++++++++++
 3 files changed, 139 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 02c34d6..0a4f18d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -973,6 +973,96 @@ static void intel_sanitize_options(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n", yesno(i915.semaphores));
 }
 
+static void
+intel_get_memdev_info(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	uint32_t val = 0;
+	uint32_t mem_speed = 0;
+	uint8_t dram_type;
+	uint32_t dram_channel;
+	uint8_t num_channel;
+	bool rank_valid = false;
+
+	if (!IS_GEN9(dev_priv))
+		goto exit;
+
+	val = I915_READ(P_CR_MC_BIOS_REQ_0_0_0);
+	mem_speed = div_u64((uint64_t) (val & REQ_DATA_MASK) *
+			MEMORY_FREQ_MULTIPLIER, 1000);
+
+	if (mem_speed == 0)
+		goto exit;
+
+	dev_priv->memdev_info.valid = true;
+	dev_priv->memdev_info.mem_speed = mem_speed;
+	dram_type = (val >> DRAM_TYPE_SHIFT) & DRAM_TYPE_MASK;
+	dram_channel = (val >> DRAM_CHANNEL_SHIFT) & DRAM_CHANNEL_MASK;
+	num_channel = hweight32(dram_channel);
+
+	/*
+	 * The lpddr3 and lpddr4 technologies can have 1-4 channels and the
+	 * channels are 32bits wide; while ddr3l technologies can have 1-2
+	 * channels and the channels are 64 bits wide. But SV team found that in
+	 * case of single 64 bit wide DDR3L dimms two bits were set and system
+	 * with two DDR3L 64bit dimm all four bits were set.
+	 */
+
+	switch (dram_type) {
+	case DRAM_TYPE_LPDDR3:
+	case DRAM_TYPE_LPDDR4:
+		dev_priv->memdev_info.data_width = 4;
+		dev_priv->memdev_info.num_channel = num_channel;
+		break;
+	case DRAM_TYPE_DDR3L:
+		dev_priv->memdev_info.data_width = 8;
+		dev_priv->memdev_info.num_channel = num_channel / 2;
+		break;
+	default:
+		dev_priv->memdev_info.data_width = 4;
+		dev_priv->memdev_info.num_channel = num_channel;
+	}
+
+	/*
+	 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
+	 * all the dimms should have same rank as in first valid Dimm
+	 */
+#define D_CR_DRP0_DUNIT_INVALID	    0xFFFFFFFF
+
+	dev_priv->memdev_info.rank_valid = true;
+	if (I915_READ(D_CR_DRP0_DUNIT8) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT8);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT9) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT9);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT10) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT10);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT11) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT11);
+		rank_valid = true;
+	}
+#undef D_CR_DRP0_DUNIT_INVALID
+
+	if (rank_valid) {
+		dev_priv->memdev_info.rank_valid = true;
+		dev_priv->memdev_info.rank = (val & DRAM_RANK_MASK);
+	}
+
+	DRM_DEBUG_DRIVER("valid:%s speed-%d width-%d num_channel-%d\n",
+		dev_priv->memdev_info.valid ? "true" : "false",
+		dev_priv->memdev_info.mem_speed,
+		dev_priv->memdev_info.data_width,
+		dev_priv->memdev_info.num_channel);
+	DRM_DEBUG_DRIVER("rank_valid:%s rank-%d\n",
+		dev_priv->memdev_info.rank_valid ? "true" : "false",
+		dev_priv->memdev_info.rank);
+	return;
+exit:
+	dev_priv->memdev_info.valid = false;
+}
+
 /**
  * i915_driver_init_hw - setup state requiring device access
  * @dev_priv: device private
@@ -1076,6 +1166,12 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 			DRM_DEBUG_DRIVER("can't enable MSI");
 	}
 
+	/*
+	 * Fill the memdev structure to get the system raw bandwidth
+	 * This will be used by WM algorithm, to implement GEN9 based WA
+	 */
+	intel_get_memdev_info(dev);
+
 	return 0;
 
 out_ggtt:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4ec23e5..4313992 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2036,6 +2036,24 @@ struct drm_i915_private {
 		bool distrust_bios_wm;
 	} wm;
 
+	struct {
+		/*
+		 * memory device info
+		 * valid: memory info is valid or not
+		 * mem_speed: memory freq in KHz
+		 * channel_width: Channel width in bytes
+		 * num_channel: total number of channels
+		 * rank: 0-rank disable, 1-Single rank, 2-dual rank
+		 */
+		bool valid;
+		uint32_t mem_speed;
+		uint8_t data_width;
+		uint8_t num_channel;
+		bool rank_valid;
+		uint8_t rank;
+	} memdev_info;
+
+
 	struct i915_runtime_pm pm;
 
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a29d707..b38445c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7716,6 +7716,31 @@ enum {
 #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
 #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
 
+#define P_CR_MC_BIOS_REQ_0_0_0		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
+#define REQ_DATA_MASK			(0x3F << 0)
+#define DRAM_TYPE_SHIFT			24
+#define DRAM_TYPE_MASK			0x7
+#define DRAM_CHANNEL_SHIFT		12
+#define DRAM_CHANNEL_MASK		0xF
+
+#define DRAM_TYPE_LPDDR3		0x1
+#define DRAM_TYPE_LPDDR4		0x2
+#define DRAM_TYPE_DDR3L			0x4
+/*
+ * BIOS programs this field of REQ_DATA [5:0] in integer
+ * multiple of 133330 KHz (133.33MHz)
+ */
+#define MEMORY_FREQ_MULTIPLIER		0x208D2
+#define D_CR_DRP0_DUNIT8		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1000)
+#define D_CR_DRP0_DUNIT9		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1200)
+#define D_CR_DRP0_DUNIT10		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1400)
+#define D_CR_DRP0_DUNIT11		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1600)
+#define D_CR_DRP0_RKEN0			(1 << 0)
+#define D_CR_DRP0_RKEN1			(1 << 1)
+#define DRAM_RANK_MASK			0x3
+#define DRAM_SINGLE_RANK		0x1
+#define DRAM_DUAL_RANK			0x3
+
 /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register,
  * since on HSW we can't write to it using I915_WRITE. */
 #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5F0C)
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 5/9] drm/i915/gen9: WM memory bandwidth related workaround
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                   ` (3 preceding siblings ...)
  2016-09-09  8:01 ` [PATCH v3 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
@ 2016-09-09  8:01 ` Kumar, Mahesh
  2016-09-12 11:02   ` Maarten Lankhorst
  2016-09-09  8:01 ` [PATCH v3 6/9] drm/i915/skl+: change WM calc to fixed point 16.16 Kumar, Mahesh
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch implemnets Workarounds related to display arbitrated memory
bandwidth. These WA are applicabe for all gen-9 based platforms.

Changes since v1:
 - Rebase on top of Paulo's patch series

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |   9 +++
 drivers/gpu/drm/i915/intel_drv.h |  11 +++
 drivers/gpu/drm/i915/intel_pm.c  | 145 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 165 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4313992..4737a0e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1074,6 +1074,13 @@ enum intel_sbi_destination {
 	SBI_MPHY,
 };
 
+/* SKL+ Watermark arbitrated display bandwidth Workarounds */
+enum watermark_memory_wa {
+	WATERMARK_WA_NONE,
+	WATERMARK_WA_X_TILED,
+	WATERMARK_WA_Y_TILED,
+};
+
 #define QUIRK_PIPEA_FORCE (1<<0)
 #define QUIRK_LVDS_SSC_DISABLE (1<<1)
 #define QUIRK_INVERT_BRIGHTNESS (1<<2)
@@ -1623,6 +1630,8 @@ struct skl_ddb_allocation {
 
 struct skl_wm_values {
 	unsigned dirty_pipes;
+	/* any WaterMark memory workaround Required */
+	enum watermark_memory_wa mem_wa;
 	struct skl_ddb_allocation ddb;
 	uint32_t wm_linetime[I915_MAX_PIPES];
 	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 6cd7e8a..66cb46c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1800,6 +1800,17 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
 	return to_intel_crtc_state(crtc_state);
 }
 
+static inline struct intel_crtc_state *
+intel_atomic_get_existing_crtc_state(struct drm_atomic_state *state,
+				      struct intel_crtc *crtc)
+{
+	struct drm_crtc_state *crtc_state;
+
+	crtc_state = drm_atomic_get_existing_crtc_state(state, &crtc->base);
+
+	return to_intel_crtc_state(crtc_state);
+}
+
 static inline struct intel_plane_state *
 intel_atomic_get_existing_plane_state(struct drm_atomic_state *state,
 				      struct intel_plane *plane)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 7c70e07..0ec328b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3589,6 +3589,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 {
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
+	struct intel_atomic_state *intel_state =
+			to_intel_atomic_state(cstate->base.state);
 	uint32_t latency = dev_priv->wm.skl_latency[level];
 	uint32_t method1, method2;
 	uint32_t plane_bytes_per_line, plane_blocks_per_line;
@@ -3602,10 +3604,17 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
+	enum watermark_memory_wa mem_wa;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
 
+	mem_wa = intel_state ? intel_state->wm_results.mem_wa : WATERMARK_WA_NONE;
+	if (mem_wa != WATERMARK_WA_NONE) {
+		if (fb->modifier[0] == I915_FORMAT_MOD_X_TILED)
+			latency += 15;
+	}
+
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
 
@@ -3637,6 +3646,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		y_min_scanlines = 4;
 	}
 
+	if (mem_wa == WATERMARK_WA_Y_TILED)
+		y_min_scanlines *= 2;
+
 	plane_bytes_per_line = width * cpp;
 	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
 	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
@@ -4041,6 +4053,15 @@ skl_include_affected_pipes(struct drm_atomic_state *state)
 		intel_state->wm_results.dirty_pipes = ~0;
 	}
 
+	/*
+	 * If Watermark workaround is changed we need to recalculate
+	 * watermark values for all active pipes
+	 */
+	if (intel_state->wm_results.mem_wa != dev_priv->wm.skl_hw.mem_wa) {
+		realloc_pipes = ~0;
+		intel_state->wm_results.dirty_pipes = ~0;
+	}
+
 	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
 		struct intel_crtc_state *cstate;
 
@@ -4057,6 +4078,128 @@ skl_include_affected_pipes(struct drm_atomic_state *state)
 }
 
 static void
+skl_set_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
+{
+	struct drm_device *dev = state->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_crtc *intel_crtc;
+	struct intel_plane_state *intel_pstate;
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	int num_active_plane, num_active_pipe;
+	uint32_t plane_bw, max_plane_bw, pipe_bw, max_pipe_bw;
+	uint32_t total_pipe_bw;
+	uint32_t system_bw = 0;
+	uint8_t num_channel, data_width, rank;
+	int x_tile_per;
+	int display_bw_per;
+	bool y_tile_enabled = false;
+
+	if (!dev_priv->memdev_info.valid)
+		goto exit;
+
+	num_channel = dev_priv->memdev_info.num_channel;
+	data_width = dev_priv->memdev_info.data_width;
+	system_bw = dev_priv->memdev_info.mem_speed * num_channel * data_width;
+
+	if (!system_bw)
+		goto exit;
+
+	max_pipe_bw = 0;
+	for_each_intel_crtc(dev, intel_crtc) {
+		struct intel_crtc_state *cstate;
+		struct intel_plane *plane;
+
+		/*
+		 * If CRTC is part of current atomic commit, get crtc state from
+		 * existing CRTC state. else take the cached CRTC state
+		 */
+		cstate = NULL;
+		if (state)
+			cstate = intel_atomic_get_existing_crtc_state(state,
+					intel_crtc);
+		if (!cstate)
+			cstate = to_intel_crtc_state(intel_crtc->base.state);
+
+		if (!cstate->base.active)
+			continue;
+
+		num_active_plane = 0;
+		max_plane_bw = 0;
+		for_each_intel_plane_mask(dev, plane, cstate->base.plane_mask) {
+			struct drm_framebuffer *fb = NULL;
+
+			intel_pstate = NULL;
+			if (state)
+				intel_pstate =
+				intel_atomic_get_existing_plane_state(state,
+									plane);
+			if (!intel_pstate)
+				intel_pstate =
+					to_intel_plane_state(plane->base.state);
+
+			WARN_ON(!intel_pstate->base.fb);
+
+			if (!intel_pstate->base.visible)
+				continue;
+
+			fb = intel_pstate->base.fb;
+			if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
+				fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED))
+				y_tile_enabled = true;
+
+			plane_bw = skl_adjusted_plane_pixel_rate(cstate,
+								intel_pstate);
+			max_plane_bw = max(plane_bw, max_plane_bw);
+			num_active_plane++;
+		}
+		pipe_bw = max_plane_bw * num_active_plane;
+		max_pipe_bw = max(pipe_bw, max_pipe_bw);
+	}
+
+	if (intel_state->active_pipe_changes)
+		num_active_pipe = hweight32(intel_state->active_crtcs);
+	else
+		num_active_pipe = hweight32(dev_priv->active_crtcs);
+
+	total_pipe_bw = max_pipe_bw * num_active_pipe;
+
+	display_bw_per = DIV_ROUND_UP_ULL(total_pipe_bw * 100, system_bw * 1000);
+
+	/*
+	 * If there is any Ytile plane enabled and arbitrated display
+	 * bandwidth > 20% of raw system memory bandwidth
+	 * Enale Y-tile related WA
+	 *
+	 * If memory is dual channel single rank, Xtile limit = 35%, else Xtile
+	 * limit = 60%
+	 * If there is no Ytile plane enabled and
+	 * arbitrated display bandwidth > Xtile limit
+	 * Enable X-tile realated WA
+	 */
+	if (y_tile_enabled && (display_bw_per > 20))
+		intel_state->wm_results.mem_wa = WATERMARK_WA_Y_TILED;
+	else {
+
+		if (dev_priv->memdev_info.rank_valid)
+			rank = dev_priv->memdev_info.rank;
+		else
+			rank = DRAM_DUAL_RANK; /* Assume we are dual rank */
+
+		if ((rank == DRAM_SINGLE_RANK) && (num_channel == 2))
+			x_tile_per = 35;
+		else
+			x_tile_per = 60;
+
+		if (display_bw_per > x_tile_per)
+			intel_state->wm_results.mem_wa = WATERMARK_WA_X_TILED;
+	}
+	return;
+
+exit:
+	intel_state->wm_results.mem_wa = WATERMARK_WA_NONE;
+}
+
+static void
 skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 		     struct skl_wm_values *src,
 		     enum pipe pipe)
@@ -4101,6 +4244,8 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
+	skl_set_memory_bandwidth_wm_wa(state);
+
 	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 6/9] drm/i915/skl+: change WM calc to fixed point 16.16
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                   ` (4 preceding siblings ...)
  2016-09-09  8:01 ` [PATCH v3 5/9] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
@ 2016-09-09  8:01 ` Kumar, Mahesh
  2016-09-21 18:32   ` Paulo Zanoni
  2016-09-09  8:01 ` [PATCH v3 7/9] drm/i915/bxt: Enable IPC support Kumar, Mahesh
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch changes Watermak calculation to fixed point calculation.
Problem with current calculation is during plane_blocks_per_line
calculation we divide intermediate blocks with min_scanlines and
takes floor of the result because of integer operation.
hence we end-up assigning less blocks than required. Which leads to
flickers.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0ec328b..d4390e4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3530,13 +3530,15 @@ static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
 */
 static uint32_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp, uint32_t latency)
 {
-	uint32_t wm_intermediate_val, ret;
+	uint64_t wm_intermediate_val;
+	uint32_t ret;
 
 	if (latency == 0)
 		return UINT_MAX;
 
-	wm_intermediate_val = latency * pixel_rate * cpp / 512;
-	ret = DIV_ROUND_UP(wm_intermediate_val, 1000);
+	wm_intermediate_val = latency * pixel_rate * cpp;
+	wm_intermediate_val <<= 16;
+	ret = DIV_ROUND_UP_ULL(wm_intermediate_val, 1000 * 512);
 
 	return ret;
 }
@@ -3605,6 +3607,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
 	enum watermark_memory_wa mem_wa;
+	bool y_tiled = false;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
@@ -3652,14 +3655,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	plane_bytes_per_line = width * cpp;
 	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
 	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
+		y_tiled = true;
 		plane_blocks_per_line =
 		      DIV_ROUND_UP(plane_bytes_per_line * y_min_scanlines, 512);
-		plane_blocks_per_line /= y_min_scanlines;
+		plane_blocks_per_line = (plane_blocks_per_line << 16) /
+								y_min_scanlines;
 	} else if (fb->modifier[0] == DRM_FORMAT_MOD_NONE) {
 		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512)
 					+ 1;
+		plane_blocks_per_line <<= 16;
 	} else {
 		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
+		plane_blocks_per_line <<= 16;
 	}
 
 	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
@@ -3670,8 +3677,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
 
-	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
-	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
+	if (y_tiled) {
 		selected_result = max(method2, y_tile_minimum);
 	} else {
 		uint32_t linetime_us = 0;
@@ -3688,12 +3694,11 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 			selected_result = method1;
 	}
 
-	res_blocks = selected_result + 1;
+	res_blocks = DIV_ROUND_UP(selected_result, 1 << 16) + 1;
 	res_lines = DIV_ROUND_UP(selected_result, plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
-		if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
-		    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
+		if (y_tiled) {
 			res_blocks += y_tile_minimum;
 			res_lines += y_min_scanlines;
 		} else {
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 7/9] drm/i915/bxt: Enable IPC support
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                   ` (5 preceding siblings ...)
  2016-09-09  8:01 ` [PATCH v3 6/9] drm/i915/skl+: change WM calc to fixed point 16.16 Kumar, Mahesh
@ 2016-09-09  8:01 ` Kumar, Mahesh
  2016-09-21 20:06   ` Paulo Zanoni
  2016-09-09  8:01 ` [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA Kumar, Mahesh
  2016-09-09  8:01 ` [PATCH v3 9/9] drm/i915/bxt: Implement Transition WM Kumar, Mahesh
  8 siblings, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch adds IPC support for platforms. This patch enables IPC
only for BXT platform as for SKL recommendation is to keep is disabled
This patch also adds kernel command-line option to enable/disable
the IPC if required.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
 drivers/gpu/drm/i915/i915_params.c |  5 +++++
 drivers/gpu/drm/i915/i915_params.h |  1 +
 drivers/gpu/drm/i915/i915_reg.h    |  1 +
 drivers/gpu/drm/i915/intel_drv.h   |  1 +
 drivers/gpu/drm/i915/intel_pm.c    | 21 +++++++++++++++++++++
 6 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0a4f18d..22d84e6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1335,6 +1335,11 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	intel_runtime_pm_enable(dev_priv);
 
+	/*
+	 * Now enable the IPC for supported platforms
+	 */
+	intel_enable_ipc(dev_priv);
+
 	/* Everything is in place, we can now relax! */
 	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
 		 driver.name, driver.major, driver.minor, driver.patchlevel,
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 768ad89..cc41b8d 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -62,6 +62,7 @@ struct i915_params i915 __read_mostly = {
 	.inject_load_failure = 0,
 	.enable_dpcd_backlight = false,
 	.enable_gvt = false,
+	.enable_ipc = true,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -233,3 +234,7 @@ MODULE_PARM_DESC(enable_dpcd_backlight,
 module_param_named(enable_gvt, i915.enable_gvt, bool, 0400);
 MODULE_PARM_DESC(enable_gvt,
 	"Enable support for Intel GVT-g graphics virtualization host support(default:false)");
+
+module_param_named(enable_ipc, i915.enable_ipc, bool, 0400);
+MODULE_PARM_DESC(enable_ipc,
+		"Enable Isochronous Priority Control (default:true)");
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 3a0dd78..f99b9b9 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -65,6 +65,7 @@ struct i915_params {
 	bool enable_dp_mst;
 	bool enable_dpcd_backlight;
 	bool enable_gvt;
+	bool enable_ipc;
 };
 
 extern struct i915_params i915 __read_mostly;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b38445c..75b5b52 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6139,6 +6139,7 @@ enum {
 #define  DISP_FBC_WM_DIS		(1<<15)
 #define DISP_ARB_CTL2	_MMIO(0x45004)
 #define  DISP_DATA_PARTITION_5_6	(1<<6)
+#define  DISP_IPC_ENABLE		(1<<3)
 #define DBUF_CTL	_MMIO(0x45008)
 #define  DBUF_POWER_REQUEST		(1<<31)
 #define  DBUF_POWER_STATE		(1<<30)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 66cb46c..56c8ac8 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1753,6 +1753,7 @@ void skl_write_plane_wm(struct intel_crtc *intel_crtc,
 uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+void intel_enable_ipc(struct drm_i915_private *dev_priv);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d4390e4..8d0037c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4793,6 +4793,27 @@ void intel_update_watermarks(struct drm_crtc *crtc)
 }
 
 /*
+ * enable IPC for Supported Platforms
+ */
+void intel_enable_ipc(struct drm_i915_private *dev_priv)
+{
+	u32 val;
+
+	/* enable IPC only for Broxton for now*/
+	if (!IS_BROXTON(dev_priv))
+		return;
+
+	val = I915_READ(DISP_ARB_CTL2);
+
+	if (i915.enable_ipc)
+		val |= DISP_IPC_ENABLE;
+	else
+		val &= ~DISP_IPC_ENABLE;
+
+	I915_WRITE(DISP_ARB_CTL2, val);
+}
+
+/*
  * Lock protecting IPS related data structures
  */
 DEFINE_SPINLOCK(mchdev_lock);
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                   ` (6 preceding siblings ...)
  2016-09-09  8:01 ` [PATCH v3 7/9] drm/i915/bxt: Enable IPC support Kumar, Mahesh
@ 2016-09-09  8:01 ` Kumar, Mahesh
  2016-09-21 20:23   ` Paulo Zanoni
  2016-09-09  8:01 ` [PATCH v3 9/9] drm/i915/bxt: Implement Transition WM Kumar, Mahesh
  8 siblings, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

It implements the WA to enable IDLE_WAKEMEM bit of CHICKEN_DCPR_1
register for Broxton platform. When IPC is enabled & Y-tile is
enabled in any of the enabled plane, above bit should be set.
Without this WA system observes random hang.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |  2 ++
 drivers/gpu/drm/i915/i915_reg.h      |  3 +++
 drivers/gpu/drm/i915/intel_display.c | 50 ++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4737a0e..79b9305 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1632,6 +1632,8 @@ struct skl_wm_values {
 	unsigned dirty_pipes;
 	/* any WaterMark memory workaround Required */
 	enum watermark_memory_wa mem_wa;
+	/* IPC Y-tiled WA related member */
+	u32 y_plane_mask;
 	struct skl_ddb_allocation ddb;
 	uint32_t wm_linetime[I915_MAX_PIPES];
 	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 75b5b52..210d9b0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5658,6 +5658,9 @@ enum {
 #define PLANE_NV12_BUF_CFG(pipe, plane)	\
 	_MMIO_PLANE(plane, _PLANE_NV12_BUF_CFG_1(pipe), _PLANE_NV12_BUF_CFG_2(pipe))
 
+#define CHICKEN_DCPR_1             _MMIO(0x46430)
+#define IDLE_WAKEMEM_MASK          (1 << 13)
+
 /* SKL new cursor registers */
 #define _CUR_BUF_CFG_A				0x7017c
 #define _CUR_BUF_CFG_B				0x7117c
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 5f50f53..ee7f88e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12415,6 +12415,8 @@ int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
 	bool is_crtc_enabled = crtc_state->active;
 	bool turn_off, turn_on, visible, was_visible;
 	struct drm_framebuffer *fb = plane_state->fb;
+	struct intel_atomic_state *intel_state =
+				to_intel_atomic_state(plane_state->state);
 	int ret;
 
 	if (INTEL_GEN(dev) >= 9 && plane->type != DRM_PLANE_TYPE_CURSOR) {
@@ -12501,6 +12503,15 @@ int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
 	    !needs_scaling(old_plane_state))
 		pipe_config->disable_lp_wm = true;
 
+	if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
+			fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED)) {
+		intel_state->wm_results.y_plane_mask |=
+						(1 << drm_plane_index(plane));
+	} else {
+		intel_state->wm_results.y_plane_mask &=
+						~(1 << drm_plane_index(plane));
+	}
+
 	return 0;
 }
 
@@ -13963,6 +13974,10 @@ static int intel_atomic_check(struct drm_device *dev,
 	if (ret)
 		return ret;
 
+	/* Copy the Y-tile WA related states */
+	intel_state->wm_results.y_plane_mask =
+				dev_priv->wm.skl_results.y_plane_mask;
+
 	for_each_crtc_in_state(state, crtc, crtc_state, i) {
 		struct intel_crtc_state *pipe_config =
 			to_intel_crtc_state(crtc_state);
@@ -14204,6 +14219,32 @@ static void intel_update_crtcs(struct drm_atomic_state *state,
 	}
 }
 
+/*
+ * GEN9 IPC WA for Y-tiled
+ */
+void bxt_set_ipc_wa(struct drm_i915_private *dev_priv, bool enable)
+{
+	u32 val;
+
+	if (!IS_BROXTON(dev_priv) || !i915.enable_ipc)
+		return;
+
+	val = I915_READ(CHICKEN_DCPR_1);
+	/*
+	 * If WA is already enabled or disabled
+	 * no need to re-enable or disable it.
+	 */
+	if ((enable && (val & IDLE_WAKEMEM_MASK)) ||
+		(!enable && !(val & IDLE_WAKEMEM_MASK)))
+		return;
+
+	if (enable)
+		val |= IDLE_WAKEMEM_MASK;
+	else
+		val &= ~IDLE_WAKEMEM_MASK;
+	I915_WRITE(CHICKEN_DCPR_1, val);
+}
+
 static void skl_update_crtcs(struct drm_atomic_state *state,
 			     unsigned int *crtc_vblank_mask)
 {
@@ -14219,6 +14260,12 @@ static void skl_update_crtcs(struct drm_atomic_state *state,
 	enum pipe pipe;
 
 	/*
+	 * If IPC WA need to be enabled, enable it now
+	 */
+	if (intel_state->wm_results.y_plane_mask)
+		bxt_set_ipc_wa(dev_priv, true);
+
+	/*
 	 * Whenever the number of active pipes changes, we need to make sure we
 	 * update the pipes in the right order so that their ddb allocations
 	 * never overlap with eachother inbetween CRTC updates. Otherwise we'll
@@ -14261,6 +14308,9 @@ static void skl_update_crtcs(struct drm_atomic_state *state,
 			progress = true;
 		}
 	} while (progress);
+
+	if (!intel_state->wm_results.y_plane_mask)
+		bxt_set_ipc_wa(dev_priv, false);
 }
 
 static void intel_atomic_commit_tail(struct drm_atomic_state *state)
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 9/9] drm/i915/bxt: Implement Transition WM
  2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                   ` (7 preceding siblings ...)
  2016-09-09  8:01 ` [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA Kumar, Mahesh
@ 2016-09-09  8:01 ` Kumar, Mahesh
  2016-09-14 11:54   ` [PATCH v4] " Kumar, Mahesh
  8 siblings, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-09  8:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch enables Transition WM for SKL+ platforms.

Transition WM are used if IPC is enabled, to decide, number of blocks
to be fetched before reducing the priority of display to fetch from
memory.

Changes since v1:
 - Don't enable transition WM for GEN9 (Art/Paulo)
 - Rebase to use fixed point 16.16 calculation
 - Fix the use of selected result from latency level-0

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 60 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8d0037c..f6a3fab 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2862,6 +2862,8 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 
 #define SKL_DDB_SIZE		896	/* in blocks */
 #define BXT_DDB_SIZE		512
+#define SKL_TRANS_WM_AMOUNT	10	/* tunable value */
+#define SKL_TRANS_WM_MIN	14
 #define SKL_SAGV_BLOCK_TIME	30 /* µs */
 
 /*
@@ -3583,6 +3585,48 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	return pixel_rate;
 }
 
+static void skl_compute_plane_trans_wm(const struct drm_i915_private *dev_priv,
+					struct skl_pipe_wm *pipe_wm,
+					struct intel_plane_state *intel_pstate,
+					uint32_t selected_result,
+					uint32_t y_tile_min,
+					bool y_tile)
+{
+	struct drm_plane_state *pstate = &intel_pstate->base;
+	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
+	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
+	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
+	uint32_t trans_min = 0, trans_offset_blocks;
+	uint32_t trans_y_tile_min = 0, res_blocks = 0;
+	uint16_t trans_res_blocks = 0;
+
+	/* Keep Trans WM disabled for GEN9 */
+	if (IS_GEN9(dev_priv))
+		goto exit;
+
+	trans_min = SKL_TRANS_WM_MIN;
+
+	trans_offset_blocks = (trans_min + SKL_TRANS_WM_AMOUNT) << 16;
+
+	if (y_tile) {
+		trans_y_tile_min = 2 * y_tile_min;
+		res_blocks = max(trans_y_tile_min, selected_result) +
+			trans_offset_blocks;
+	} else {
+		res_blocks = selected_result + trans_offset_blocks;
+	}
+
+	trans_res_blocks = DIV_ROUND_UP(res_blocks, 1 << 16) + 1;
+
+	/* WA BUG:1938466 add one block for non y-tile planes */
+	if (!y_tile)
+		trans_res_blocks += 1;
+exit:
+	*out_blocks = trans_res_blocks;
+	*out_lines = 0;
+}
+
+
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
@@ -3709,6 +3753,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
 
+	if (level == 0)
+		skl_compute_plane_trans_wm(dev_priv, pipe_wm, intel_pstate,
+				selected_result, y_tile_minimum, y_tiled);
 	return 0;
 }
 
@@ -3778,11 +3825,13 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 }
 
 static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
-				      struct skl_wm_level *trans_wm /* out */)
+				      struct skl_wm_level *trans_wm, /* out */
+				      struct skl_ddb_allocation *ddb)
 {
 	struct drm_crtc *crtc = cstate->base.crtc;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct intel_plane *intel_plane;
+	enum pipe pipe = to_intel_crtc(crtc)->pipe;
 
 	if (!cstate->base.active)
 		return;
@@ -3790,8 +3839,13 @@ static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
 	/* Until we know more, just disable transition WMs */
 	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc, intel_plane) {
 		int i = skl_wm_plane_id(intel_plane);
+		uint16_t plane_ddb = skl_ddb_entry_size(&ddb->plane[pipe][i]);
+		uint16_t trans_res_blocks = trans_wm->plane_res_b[i];
 
-		trans_wm->plane_en[i] = false;
+		if ((plane_ddb > 0) && (trans_res_blocks > plane_ddb))
+			trans_wm->plane_en[i] = false;
+		else
+			trans_wm->plane_en[i] = true;
 	}
 }
 
@@ -3822,7 +3876,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
-	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
+	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm, ddb);
 
 	return 0;
 }
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size
  2016-09-09  8:00 ` [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
@ 2016-09-12  8:56   ` Maarten Lankhorst
  2016-09-19 18:19   ` Paulo Zanoni
  1 sibling, 0 replies; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-12  8:56 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: paulo.r.zanoni

Op 09-09-16 om 10:00 schreef Kumar, Mahesh:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>
> This patch make changes to use linetime latency instead of allocated
> DDB size during plane watermark calculation in switch case, This is
> required to implement new DDB allocation algorithm.
>
> In New Algorithm DDB is allocated based on WM values, because of which
> number of DDB blocks will not be available during WM calculation,
> So this "linetime latency" is suggested by SV/HW team to use during
> switch-case for WM blocks selection.
>
> Changes since v1:
>  - Rebase on top of Paulo's patch series
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 3fdec4d..cfd9b7d1 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3622,10 +3622,15 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>  		selected_result = max(method2, y_tile_minimum);
>  	} else {
> +		uint32_t linetime_us = 0;
> +
> +		linetime_us = DIV_ROUND_UP(width * 1000,
> +				skl_pipe_pixel_rate(cstate));
> +
>  		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
>  		    (plane_bytes_per_line / 512 < 1))
>  			selected_result = method2;
> -		else if ((ddb_allocation / plane_blocks_per_line) >= 1)
> +		if (latency >= linetime_us)
>  			selected_result = min(method1, method2);
>  		else
>  			selected_result = method1;

You removed a 'else' here. Also the prerequisite patch is not part of this series.

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm
  2016-09-09  8:01 ` [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
@ 2016-09-12 10:50   ` Maarten Lankhorst
  2016-09-12 13:11   ` Maarten Lankhorst
  1 sibling, 0 replies; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-12 10:50 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: paulo.r.zanoni

Op 09-09-16 om 10:01 schreef Kumar, Mahesh:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>
> This patch implements new DDB allocation algorithm as per HW team
> suggestion. This algo takecare of scenario where we allocate less DDB
> for the planes with lower relative pixel rate, but they require more DDB
> to work.
> It also takes care of enabling same watermark level for each
> plane, for efficient power saving.
>
> Changes since v1:
>  - Rebase on top of Paulo's patch series
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 129 +++++++++++++++++++++-------------------
>  1 file changed, 67 insertions(+), 62 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index cfd9b7d1..7c70e07 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3348,6 +3348,7 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
>  
>  static int
>  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
> +		      struct skl_pipe_wm *pipe_wm,
>  		      struct skl_ddb_allocation *ddb /* out */)
>  {
>  	struct drm_atomic_state *state = cstate->base.state;
> @@ -3363,8 +3364,11 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	uint16_t *minimum = cstate->wm.skl.minimum_blocks;
>  	uint16_t *y_minimum = cstate->wm.skl.minimum_y_blocks;
>  	unsigned int total_data_rate;
> +	uint16_t total_min_blocks = 0;
> +	uint16_t total_level_ddb = 0;
>  	int num_active;
> -	int id, i;
> +	int max_level, level;
> +	int id, i, ret = 0;
>  
>  	if (WARN_ON(!state))
>  		return 0;
> @@ -3380,6 +3384,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	alloc_size = skl_ddb_entry_size(alloc);
>  	if (alloc_size == 0) {
>  		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> +		memset(ddb->y_plane[pipe], 0, sizeof(ddb->y_plane[pipe]));
>  		return 0;
>  	}
This hunk looks like a bugfix, and should be a separate commit?
> @@ -3413,19 +3418,42 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	}
>  
>  	for (i = 0; i < PLANE_CURSOR; i++) {
> -		alloc_size -= minimum[i];
> -		alloc_size -= y_minimum[i];
> +		total_min_blocks += minimum[i];
> +		total_min_blocks += y_minimum[i];
>  	}
>  
> -	/*
> -	 * 2. Distribute the remaining space in proportion to the amount of
> -	 * data each plane needs to fetch from memory.
> -	 *
> -	 * FIXME: we may not allocate every single block here.
> -	 */
> +	for (level = ilk_wm_max_level(dev); level >= 0; level--) {
> +		total_level_ddb = 0;
> +		for (i = 0; i < PLANE_CURSOR; i++) {
> +			/*
> +			 * TODO: We should calculate watermark values for Y/UV
> +			 * plane both in case of NV12 format and use both values
> +			 * for ddb calculation, As NV12 is disabled as of now.
> +			 * using only single plane value here.
> +			 */
> +			uint16_t min = minimum[i] + y_minimum[i];
> +			uint16_t plane_level_ddb_wm =
> +				max(pipe_wm->wm[level].plane_res_b[i], min);
> +			total_level_ddb += plane_level_ddb_wm;
> +		}
> +
> +		if (total_level_ddb <= alloc_size)
> +			break;
> +	}
> +
> +	if ((level < 0) || (total_min_blocks > alloc_size)) {
> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
> +		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
> +				total_level_ddb : total_min_blocks, alloc_size);
> +		ret = -EINVAL;
> +		goto exit;
> +	}
> +	max_level = level;
> +	alloc_size -= total_level_ddb;
> +
>  	total_data_rate = skl_get_total_relative_data_rate(cstate);
>  	if (total_data_rate == 0)
> -		return 0;
> +		goto exit;
>  
>  	start = alloc->start;
>  	for_each_intel_plane_on_crtc(dev, intel_crtc, intel_plane) {
> @@ -3440,7 +3468,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  		 * promote the expression to 64 bits to avoid overflowing, the
>  		 * result is < available as data_rate / total_data_rate < 1
>  		 */
> -		plane_blocks = minimum[id];
> +		plane_blocks = max(pipe_wm->wm[max_level].plane_res_b[id],
> +					minimum[id]);
>  		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
>  					total_data_rate);
>  
> @@ -3454,6 +3483,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  
>  		/*
>  		 * allocation for y_plane part of planar format:
> +		 * TODO: Once we start calculating watermark values for Y/UV
> +		 * plane both consider it for initial allowed wm blocks.
>  		 */
>  		y_data_rate = cstate->wm.skl.plane_y_data_rate[id];
>  
> @@ -3467,9 +3498,22 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  		}
>  
>  		start += y_plane_blocks;
> +
> +		/*
> +		 * Now enable all levels in WM structure which can be enabled
> +		 * using current DDB allocation
> +		 */
> +		for (i = ilk_wm_max_level(dev); i > 0; i--) {
> +			if (i > max_level || pipe_wm->wm[i].plane_res_l[id] > 31
> +					|| pipe_wm->wm[i].plane_res_b[id] == 0)
> +				pipe_wm->wm[i].plane_en[id] = false;
> +			else
> +				pipe_wm->wm[i].plane_en[id] = true;
> +		}
>  	}
>  
> -	return 0;
> +exit:
> +	return ret;
>  }
>  
>  static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
> @@ -3540,7 +3584,6 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  				struct intel_crtc_state *cstate,
>  				struct intel_plane_state *intel_pstate,
> -				uint16_t ddb_allocation,
>  				int level,
>  				struct skl_pipe_wm *pipe_wm)
>  {
> @@ -3559,12 +3602,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	struct skl_wm_level *result = &pipe_wm->wm[level];
>  	uint16_t *out_blocks = &result->plane_res_b[id];
>  	uint8_t *out_lines = &result->plane_res_l[id];
> -	bool *enabled = &result->plane_en[id];
>  
> -	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
> -		*enabled = false;
> +	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
>  		return 0;
> -	}
>  
>  	width = drm_rect_width(&intel_pstate->base.src) >> 16;
>  	height = drm_rect_height(&intel_pstate->base.src) >> 16;
> @@ -3649,54 +3689,27 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  		}
>  	}
>  
> -	if (res_blocks >= ddb_allocation || res_lines > 31) {
> -		*enabled = false;
> -
> -		/*
> -		 * If there are no valid level 0 watermarks, then we can't
> -		 * support this display configuration.
> -		 */
> -		if (level) {
> -			return 0;
> -		} else {
> -			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
> -			DRM_DEBUG_KMS("Plane %d.%d: blocks required = %u/%u, lines required = %u/31\n",
> -				      to_intel_crtc(cstate->base.crtc)->pipe,
> -				      skl_wm_plane_id(to_intel_plane(pstate->plane)),
> -				      res_blocks, ddb_allocation, res_lines);
> -
> -			return -EINVAL;
> -		}
> -	}
> -
>  	*out_blocks = res_blocks;
>  	*out_lines = res_lines;
> -	*enabled = true;
>  
>  	return 0;
>  }
>  
>  static int
>  skl_compute_wm_level(const struct drm_i915_private *dev_priv,
> -		     struct skl_ddb_allocation *ddb,
>  		     struct intel_crtc_state *cstate,
>  		     int level,
>  		     struct skl_pipe_wm *pipe_wm)
>  {
>  	struct drm_atomic_state *state = cstate->base.state;
> -	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
>  	struct drm_plane *plane;
>  	struct intel_plane *intel_plane;
>  	struct intel_plane_state *intel_pstate;
> -	uint16_t ddb_blocks;
> -	enum pipe pipe = intel_crtc->pipe;
>  	int ret;
>  
>  	for_each_intel_plane_mask(&dev_priv->drm,
>  				  intel_plane,
>  				  cstate->base.plane_mask) {
> -		int i = skl_wm_plane_id(intel_plane);
> -
>  		plane = &intel_plane->base;
>  		intel_pstate = NULL;
>  		if (state)
> @@ -3722,12 +3735,9 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
>  
>  		WARN_ON(!intel_pstate->base.fb);
>  
> -		ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][i]);
> -
>  		ret = skl_compute_plane_wm(dev_priv,
>  					   cstate,
>  					   intel_pstate,
> -					   ddb_blocks,
>  					   level,
>  					   pipe_wm);
>  		if (ret)
> @@ -3784,11 +3794,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  	memset(pipe_wm, 0, sizeof(*pipe_wm));
>  
>  	for (level = 0; level <= max_level; level++) {
> -		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> +		ret = skl_compute_wm_level(dev_priv, cstate,
>  					   level, pipe_wm);
>  		if (ret)
>  			return ret;
>  	}
> +	ret = skl_allocate_pipe_ddb(cstate, pipe_wm, ddb);
> +	if (ret)
> +		return ret;
> +
>  	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
>  
>  	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
> @@ -3976,13 +3990,12 @@ pipes_modified(struct drm_atomic_state *state)
>  }
>  
>  static int
> -skl_compute_ddb(struct drm_atomic_state *state)
> +skl_include_affected_pipes(struct drm_atomic_state *state)
>  {
>  	struct drm_device *dev = state->dev;
>  	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
>  	struct intel_crtc *intel_crtc;
> -	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
>  	uint32_t realloc_pipes = pipes_modified(state);
>  	int ret;
>  
> @@ -4035,10 +4048,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
>  		if (IS_ERR(cstate))
>  			return PTR_ERR(cstate);
>  
> -		ret = skl_allocate_pipe_ddb(cstate, ddb);
> -		if (ret)
> -			return ret;
> -
>  		ret = drm_atomic_add_affected_planes(state, &intel_crtc->base);
>  		if (ret)
>  			return ret;
> @@ -4092,19 +4101,15 @@ skl_compute_wm(struct drm_atomic_state *state)
>  	/* Clear all dirty flags */
>  	results->dirty_pipes = 0;
>  
> -	ret = skl_compute_ddb(state);
> +	ret = skl_include_affected_pipes(state);
>  	if (ret)
>  		return ret;
>  
>  	/*
>  	 * Calculate WM's for all pipes that are part of this transaction.
> -	 * Note that the DDB allocation above may have added more CRTC's that
> -	 * weren't otherwise being modified (and set bits in dirty_pipes) if
> -	 * pipe allocations had to change.
> -	 *
> -	 * FIXME:  Now that we're doing this in the atomic check phase, we
> -	 * should allow skl_update_pipe_wm() to return failure in cases where
> -	 * no suitable watermark values can be found.
> +	 * Note that affected pipe calculation above may have added more
> +	 * CRTC's that weren't otherwise being modified (and set bits in
> +	 * dirty_pipes) if pipe allocations had to change.
>  	 */
>  	for_each_crtc_in_state(state, crtc, cstate, i) {
>  		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);

Rest looks ok on first glance. I didn't check yet if compute_pipe_ddb is called for included pipes too, but that's probably the case.
I'll probably throw the whole series through some tests to be sure it works right. :)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 5/9] drm/i915/gen9: WM memory bandwidth related workaround
  2016-09-09  8:01 ` [PATCH v3 5/9] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
@ 2016-09-12 11:02   ` Maarten Lankhorst
  2016-09-12 11:12     ` Maarten Lankhorst
  0 siblings, 1 reply; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-12 11:02 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: paulo.r.zanoni

Op 09-09-16 om 10:01 schreef Kumar, Mahesh:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>
> This patch implemnets Workarounds related to display arbitrated memory
> bandwidth. These WA are applicabe for all gen-9 based platforms.
>
> Changes since v1:
>  - Rebase on top of Paulo's patch series
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h  |   9 +++
>  drivers/gpu/drm/i915/intel_drv.h |  11 +++
>  drivers/gpu/drm/i915/intel_pm.c  | 145 +++++++++++++++++++++++++++++++++++++++
>  3 files changed, 165 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4313992..4737a0e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1074,6 +1074,13 @@ enum intel_sbi_destination {
>  	SBI_MPHY,
>  };
>  
> +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
> +enum watermark_memory_wa {
> +	WATERMARK_WA_NONE,
> +	WATERMARK_WA_X_TILED,
> +	WATERMARK_WA_Y_TILED,
> +};
> +
>  #define QUIRK_PIPEA_FORCE (1<<0)
>  #define QUIRK_LVDS_SSC_DISABLE (1<<1)
>  #define QUIRK_INVERT_BRIGHTNESS (1<<2)
> @@ -1623,6 +1630,8 @@ struct skl_ddb_allocation {
>  
>  struct skl_wm_values {
>  	unsigned dirty_pipes;
> +	/* any WaterMark memory workaround Required */
> +	enum watermark_memory_wa mem_wa;
>  	struct skl_ddb_allocation ddb;
>  	uint32_t wm_linetime[I915_MAX_PIPES];
>  	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 6cd7e8a..66cb46c 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1800,6 +1800,17 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
>  	return to_intel_crtc_state(crtc_state);
>  }
>  
> +static inline struct intel_crtc_state *
> +intel_atomic_get_existing_crtc_state(struct drm_atomic_state *state,
> +				      struct intel_crtc *crtc)
> +{
> +	struct drm_crtc_state *crtc_state;
> +
> +	crtc_state = drm_atomic_get_existing_crtc_state(state, &crtc->base);
> +
> +	return to_intel_crtc_state(crtc_state);
> +}
> +
>  static inline struct intel_plane_state *
>  intel_atomic_get_existing_plane_state(struct drm_atomic_state *state,
>  				      struct intel_plane *plane)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 7c70e07..0ec328b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3589,6 +3589,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  {
>  	struct drm_plane_state *pstate = &intel_pstate->base;
>  	struct drm_framebuffer *fb = pstate->fb;
> +	struct intel_atomic_state *intel_state =
> +			to_intel_atomic_state(cstate->base.state);
>  	uint32_t latency = dev_priv->wm.skl_latency[level];
>  	uint32_t method1, method2;
>  	uint32_t plane_bytes_per_line, plane_blocks_per_line;
> @@ -3602,10 +3604,17 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	struct skl_wm_level *result = &pipe_wm->wm[level];
>  	uint16_t *out_blocks = &result->plane_res_b[id];
>  	uint8_t *out_lines = &result->plane_res_l[id];
> +	enum watermark_memory_wa mem_wa;
>  
>  	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
>  		return 0;
>  
> +	mem_wa = intel_state ? intel_state->wm_results.mem_wa : WATERMARK_WA_NONE;
> +	if (mem_wa != WATERMARK_WA_NONE) {
> +		if (fb->modifier[0] == I915_FORMAT_MOD_X_TILED)
> +			latency += 15;
> +	}
> +
>  	width = drm_rect_width(&intel_pstate->base.src) >> 16;
>  	height = drm_rect_height(&intel_pstate->base.src) >> 16;
>  
> @@ -3637,6 +3646,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  		y_min_scanlines = 4;
>  	}
>  
> +	if (mem_wa == WATERMARK_WA_Y_TILED)
> +		y_min_scanlines *= 2;
> +
>  	plane_bytes_per_line = width * cpp;
>  	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
I don't have y_min_scanlines in nightly? What is this series based on?
It doesn't apply cleanly at least..

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 5/9] drm/i915/gen9: WM memory bandwidth related workaround
  2016-09-12 11:02   ` Maarten Lankhorst
@ 2016-09-12 11:12     ` Maarten Lankhorst
  0 siblings, 0 replies; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-12 11:12 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: paulo.r.zanoni

Op 12-09-16 om 13:02 schreef Maarten Lankhorst:
> Op 09-09-16 om 10:01 schreef Kumar, Mahesh:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>
>> This patch implemnets Workarounds related to display arbitrated memory
>> bandwidth. These WA are applicabe for all gen-9 based platforms.
>>
>> Changes since v1:
>>  - Rebase on top of Paulo's patch series
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h  |   9 +++
>>  drivers/gpu/drm/i915/intel_drv.h |  11 +++
>>  drivers/gpu/drm/i915/intel_pm.c  | 145 +++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 165 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 4313992..4737a0e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1074,6 +1074,13 @@ enum intel_sbi_destination {
>>  	SBI_MPHY,
>>  };
>>  
>> +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
>> +enum watermark_memory_wa {
>> +	WATERMARK_WA_NONE,
>> +	WATERMARK_WA_X_TILED,
>> +	WATERMARK_WA_Y_TILED,
>> +};
>> +
>>  #define QUIRK_PIPEA_FORCE (1<<0)
>>  #define QUIRK_LVDS_SSC_DISABLE (1<<1)
>>  #define QUIRK_INVERT_BRIGHTNESS (1<<2)
>> @@ -1623,6 +1630,8 @@ struct skl_ddb_allocation {
>>  
>>  struct skl_wm_values {
>>  	unsigned dirty_pipes;
>> +	/* any WaterMark memory workaround Required */
>> +	enum watermark_memory_wa mem_wa;
>>  	struct skl_ddb_allocation ddb;
>>  	uint32_t wm_linetime[I915_MAX_PIPES];
>>  	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>> index 6cd7e8a..66cb46c 100644
>> --- a/drivers/gpu/drm/i915/intel_drv.h
>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>> @@ -1800,6 +1800,17 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
>>  	return to_intel_crtc_state(crtc_state);
>>  }
>>  
>> +static inline struct intel_crtc_state *
>> +intel_atomic_get_existing_crtc_state(struct drm_atomic_state *state,
>> +				      struct intel_crtc *crtc)
>> +{
>> +	struct drm_crtc_state *crtc_state;
>> +
>> +	crtc_state = drm_atomic_get_existing_crtc_state(state, &crtc->base);
>> +
>> +	return to_intel_crtc_state(crtc_state);
>> +}
>> +
>>  static inline struct intel_plane_state *
>>  intel_atomic_get_existing_plane_state(struct drm_atomic_state *state,
>>  				      struct intel_plane *plane)
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 7c70e07..0ec328b 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3589,6 +3589,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>  {
>>  	struct drm_plane_state *pstate = &intel_pstate->base;
>>  	struct drm_framebuffer *fb = pstate->fb;
>> +	struct intel_atomic_state *intel_state =
>> +			to_intel_atomic_state(cstate->base.state);
>>  	uint32_t latency = dev_priv->wm.skl_latency[level];
>>  	uint32_t method1, method2;
>>  	uint32_t plane_bytes_per_line, plane_blocks_per_line;
>> @@ -3602,10 +3604,17 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>  	struct skl_wm_level *result = &pipe_wm->wm[level];
>>  	uint16_t *out_blocks = &result->plane_res_b[id];
>>  	uint8_t *out_lines = &result->plane_res_l[id];
>> +	enum watermark_memory_wa mem_wa;
>>  
>>  	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
>>  		return 0;
>>  
>> +	mem_wa = intel_state ? intel_state->wm_results.mem_wa : WATERMARK_WA_NONE;
>> +	if (mem_wa != WATERMARK_WA_NONE) {
>> +		if (fb->modifier[0] == I915_FORMAT_MOD_X_TILED)
>> +			latency += 15;
>> +	}
>> +
>>  	width = drm_rect_width(&intel_pstate->base.src) >> 16;
>>  	height = drm_rect_height(&intel_pstate->base.src) >> 16;
>>  
>> @@ -3637,6 +3646,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>  		y_min_scanlines = 4;
>>  	}
>>  
>> +	if (mem_wa == WATERMARK_WA_Y_TILED)
>> +		y_min_scanlines *= 2;
>> +
>>  	plane_bytes_per_line = width * cpp;
>>  	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>>  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
> I don't have y_min_scanlines in nightly? What is this series based on?
> It doesn't apply cleanly at least..
Ah nevermind, applies on top of Paulo's series.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm
  2016-09-09  8:01 ` [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
  2016-09-12 10:50   ` Maarten Lankhorst
@ 2016-09-12 13:11   ` Maarten Lankhorst
  2016-09-13  6:21     ` Mahesh Kumar
  2016-09-13 12:15     ` [PATCH v4] " Kumar, Mahesh
  1 sibling, 2 replies; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-12 13:11 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: paulo.r.zanoni

Hey,

Op 09-09-16 om 10:01 schreef Kumar, Mahesh:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>
> This patch implements new DDB allocation algorithm as per HW team
> suggestion. This algo takecare of scenario where we allocate less DDB
> for the planes with lower relative pixel rate, but they require more DDB
> to work.
> It also takes care of enabling same watermark level for each
> plane, for efficient power saving.
>
> Changes since v1:
>  - Rebase on top of Paulo's patch series
This breaks the kms_atomic_transition testcase. I'm getting a lot of CPU pipe (A,B,C) FIFO underrun error messages,
and my DP-MST display won't turn on at all.

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm
  2016-09-12 13:11   ` Maarten Lankhorst
@ 2016-09-13  6:21     ` Mahesh Kumar
  2016-09-13 12:15     ` [PATCH v4] " Kumar, Mahesh
  1 sibling, 0 replies; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-13  6:21 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx; +Cc: paulo.r.zanoni

Hi,

It seems I missed one condition while setting enable bit for WM's, 
because of which It's not enabling WM level-0 & you are observing flickers.
I'll upload updated patch.


On Monday 12 September 2016 06:41 PM, Maarten Lankhorst wrote:
> Hey,
>
> Op 09-09-16 om 10:01 schreef Kumar, Mahesh:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>
>> This patch implements new DDB allocation algorithm as per HW team
>> suggestion. This algo takecare of scenario where we allocate less DDB
>> for the planes with lower relative pixel rate, but they require more DDB
>> to work.
>> It also takes care of enabling same watermark level for each
>> plane, for efficient power saving.
>>
>> Changes since v1:
>>   - Rebase on top of Paulo's patch series
> This breaks the kms_atomic_transition testcase. I'm getting a lot of CPU pipe (A,B,C) FIFO underrun error messages,
> and my DP-MST display won't turn on at all.
>
> ~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4] drm/i915/skl: New ddb allocation algorithm
  2016-09-12 13:11   ` Maarten Lankhorst
  2016-09-13  6:21     ` Mahesh Kumar
@ 2016-09-13 12:15     ` Kumar, Mahesh
  2016-09-13 12:40       ` Maarten Lankhorst
  1 sibling, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-13 12:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch implements new DDB allocation algorithm as per HW team
suggestion. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.

Changes since v1:
 - Rebase on top of Paulo's patch series

Changes since v2:
 - Fix the for loop condition to enable WM

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 129 +++++++++++++++++++++-------------------
 1 file changed, 67 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3370fc8..c52cc57 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3348,6 +3348,7 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
+		      struct skl_pipe_wm *pipe_wm,
 		      struct skl_ddb_allocation *ddb /* out */)
 {
 	struct drm_atomic_state *state = cstate->base.state;
@@ -3363,8 +3364,11 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	uint16_t *minimum = cstate->wm.skl.minimum_blocks;
 	uint16_t *y_minimum = cstate->wm.skl.minimum_y_blocks;
 	unsigned int total_data_rate;
+	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb = 0;
 	int num_active;
-	int id, i;
+	int max_level, level;
+	int id, i, ret = 0;
 
 	if (WARN_ON(!state))
 		return 0;
@@ -3380,6 +3384,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	alloc_size = skl_ddb_entry_size(alloc);
 	if (alloc_size == 0) {
 		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+		memset(ddb->y_plane[pipe], 0, sizeof(ddb->y_plane[pipe]));
 		return 0;
 	}
 
@@ -3413,19 +3418,42 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	}
 
 	for (i = 0; i < PLANE_CURSOR; i++) {
-		alloc_size -= minimum[i];
-		alloc_size -= y_minimum[i];
+		total_min_blocks += minimum[i];
+		total_min_blocks += y_minimum[i];
 	}
 
-	/*
-	 * 2. Distribute the remaining space in proportion to the amount of
-	 * data each plane needs to fetch from memory.
-	 *
-	 * FIXME: we may not allocate every single block here.
-	 */
+	for (level = ilk_wm_max_level(dev); level >= 0; level--) {
+		total_level_ddb = 0;
+		for (i = 0; i < PLANE_CURSOR; i++) {
+			/*
+			 * TODO: We should calculate watermark values for Y/UV
+			 * plane both in case of NV12 format and use both values
+			 * for ddb calculation, As NV12 is disabled as of now.
+			 * using only single plane value here.
+			 */
+			uint16_t min = minimum[i] + y_minimum[i];
+			uint16_t plane_level_ddb_wm =
+				max(pipe_wm->wm[level].plane_res_b[i], min);
+			total_level_ddb += plane_level_ddb_wm;
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		ret = -EINVAL;
+		goto exit;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	total_data_rate = skl_get_total_relative_data_rate(cstate);
 	if (total_data_rate == 0)
-		return 0;
+		goto exit;
 
 	start = alloc->start;
 	for_each_intel_plane_on_crtc(dev, intel_crtc, intel_plane) {
@@ -3440,7 +3468,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[id];
+		plane_blocks = max(pipe_wm->wm[max_level].plane_res_b[id],
+					minimum[id]);
 		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
 					total_data_rate);
 
@@ -3454,6 +3483,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we start calculating watermark values for Y/UV
+		 * plane both consider it for initial allowed wm blocks.
 		 */
 		y_data_rate = cstate->wm.skl.plane_y_data_rate[id];
 
@@ -3467,9 +3498,22 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		}
 
 		start += y_plane_blocks;
+
+		/*
+		 * Now enable all levels in WM structure which can be enabled
+		 * using current DDB allocation
+		 */
+		for (i = ilk_wm_max_level(dev); i >= 0; i--) {
+			if (i > max_level || pipe_wm->wm[i].plane_res_l[id] > 31
+					|| pipe_wm->wm[i].plane_res_b[id] == 0)
+				pipe_wm->wm[i].plane_en[id] = false;
+			else
+				pipe_wm->wm[i].plane_en[id] = true;
+		}
 	}
 
-	return 0;
+exit:
+	return ret;
 }
 
 static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
@@ -3540,7 +3584,6 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				struct skl_pipe_wm *pipe_wm)
 {
@@ -3559,12 +3602,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
-	bool *enabled = &result->plane_en[id];
 
-	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
-		*enabled = false;
+	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
-	}
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
@@ -3649,54 +3689,27 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("Plane %d.%d: blocks required = %u/%u, lines required = %u/31\n",
-				      to_intel_crtc(cstate->base.crtc)->pipe,
-				      skl_wm_plane_id(to_intel_plane(pstate->plane)),
-				      res_blocks, ddb_allocation, res_lines);
-
-			return -EINVAL;
-		}
-	}
-
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
 		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
 	struct drm_plane *plane;
 	struct intel_plane *intel_plane;
 	struct intel_plane_state *intel_pstate;
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
-		int i = skl_wm_plane_id(intel_plane);
-
 		plane = &intel_plane->base;
 		intel_pstate = NULL;
 		if (state)
@@ -3722,12 +3735,9 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 
 		WARN_ON(!intel_pstate->base.fb);
 
-		ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][i]);
-
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   pipe_wm);
 		if (ret)
@@ -3784,11 +3794,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	memset(pipe_wm, 0, sizeof(*pipe_wm));
 
 	for (level = 0; level <= max_level; level++) {
-		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
+		ret = skl_compute_wm_level(dev_priv, cstate,
 					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
+	ret = skl_allocate_pipe_ddb(cstate, pipe_wm, ddb);
+	if (ret)
+		return ret;
+
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
 	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
@@ -3976,13 +3990,12 @@ pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_pipes(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct intel_crtc *intel_crtc;
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
 	uint32_t realloc_pipes = pipes_modified(state);
 	int ret;
 
@@ -4035,10 +4048,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
 
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
 		ret = drm_atomic_add_affected_planes(state, &intel_crtc->base);
 		if (ret)
 			return ret;
@@ -4092,19 +4101,15 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
 
 	/*
 	 * Calculate WM's for all pipes that are part of this transaction.
-	 * Note that the DDB allocation above may have added more CRTC's that
-	 * weren't otherwise being modified (and set bits in dirty_pipes) if
-	 * pipe allocations had to change.
-	 *
-	 * FIXME:  Now that we're doing this in the atomic check phase, we
-	 * should allow skl_update_pipe_wm() to return failure in cases where
-	 * no suitable watermark values can be found.
+	 * Note that affected pipe calculation above may have added more
+	 * CRTC's that weren't otherwise being modified (and set bits in
+	 * dirty_pipes) if pipe allocations had to change.
 	 */
 	for_each_crtc_in_state(state, crtc, cstate, i) {
 		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4] drm/i915/skl: New ddb allocation algorithm
  2016-09-13 12:15     ` [PATCH v4] " Kumar, Mahesh
@ 2016-09-13 12:40       ` Maarten Lankhorst
  2016-09-14 12:36         ` Mahesh Kumar
  0 siblings, 1 reply; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-13 12:40 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: paulo.r.zanoni

Op 13-09-16 om 14:15 schreef Kumar, Mahesh:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>
> This patch implements new DDB allocation algorithm as per HW team
> suggestion. This algo takecare of scenario where we allocate less DDB
> for the planes with lower relative pixel rate, but they require more DDB
> to work.
> It also takes care of enabling same watermark level for each
> plane, for efficient power saving.
>
> Changes since v1:
>  - Rebase on top of Paulo's patch series
>
> Changes since v2:
>  - Fix the for loop condition to enable WM
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
I'm still getting underrun issues when running the entire patch series against kms_atomic_transition and kms_plane.
Can you confirm?

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4] drm/i915/bxt: Implement Transition WM
  2016-09-09  8:01 ` [PATCH v3 9/9] drm/i915/bxt: Implement Transition WM Kumar, Mahesh
@ 2016-09-14 11:54   ` Kumar, Mahesh
  2016-09-21 20:27     ` Paulo Zanoni
  0 siblings, 1 reply; 40+ messages in thread
From: Kumar, Mahesh @ 2016-09-14 11:54 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch enables Transition WM for SKL+ platforms.

Transition WM are used if IPC is enabled, to decide, number of blocks
to be fetched before reducing the priority of display to fetch from
memory.

Changes since v1:
 - Don't enable transition WM for GEN9 (Art/Paulo)
 - Rebase to use fixed point 16.16 calculation
 - Fix the use of selected result from latency level-0

Changes since v2:
 - Fix transition WM enable condition

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 60 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index de2e738..4814a1a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2862,6 +2862,8 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 
 #define SKL_DDB_SIZE		896	/* in blocks */
 #define BXT_DDB_SIZE		512
+#define SKL_TRANS_WM_AMOUNT	10	/* tunable value */
+#define SKL_TRANS_WM_MIN	14
 #define SKL_SAGV_BLOCK_TIME	30 /* µs */
 
 /*
@@ -3583,6 +3585,48 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	return pixel_rate;
 }
 
+static void skl_compute_plane_trans_wm(const struct drm_i915_private *dev_priv,
+					struct skl_pipe_wm *pipe_wm,
+					struct intel_plane_state *intel_pstate,
+					uint32_t selected_result,
+					uint32_t y_tile_min,
+					bool y_tile)
+{
+	struct drm_plane_state *pstate = &intel_pstate->base;
+	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
+	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
+	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
+	uint32_t trans_min = 0, trans_offset_blocks;
+	uint32_t trans_y_tile_min = 0, res_blocks = 0;
+	uint16_t trans_res_blocks = 0;
+
+	/* Keep Trans WM disabled for GEN9 */
+	if (IS_GEN9(dev_priv))
+		goto exit;
+
+	trans_min = SKL_TRANS_WM_MIN;
+
+	trans_offset_blocks = (trans_min + SKL_TRANS_WM_AMOUNT) << 16;
+
+	if (y_tile) {
+		trans_y_tile_min = 2 * y_tile_min;
+		res_blocks = max(trans_y_tile_min, selected_result) +
+			trans_offset_blocks;
+	} else {
+		res_blocks = selected_result + trans_offset_blocks;
+	}
+
+	trans_res_blocks = DIV_ROUND_UP(res_blocks, 1 << 16) + 1;
+
+	/* WA BUG:1938466 add one block for non y-tile planes */
+	if (!y_tile)
+		trans_res_blocks += 1;
+exit:
+	*out_blocks = trans_res_blocks;
+	*out_lines = 0;
+}
+
+
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
@@ -3709,6 +3753,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
 
+	if (level == 0)
+		skl_compute_plane_trans_wm(dev_priv, pipe_wm, intel_pstate,
+				selected_result, y_tile_minimum, y_tiled);
 	return 0;
 }
 
@@ -3778,11 +3825,13 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 }
 
 static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
-				      struct skl_wm_level *trans_wm /* out */)
+				      struct skl_wm_level *trans_wm, /* out */
+				      struct skl_ddb_allocation *ddb)
 {
 	struct drm_crtc *crtc = cstate->base.crtc;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct intel_plane *intel_plane;
+	enum pipe pipe = to_intel_crtc(crtc)->pipe;
 
 	if (!cstate->base.active)
 		return;
@@ -3790,8 +3839,13 @@ static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
 	/* Until we know more, just disable transition WMs */
 	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc, intel_plane) {
 		int i = skl_wm_plane_id(intel_plane);
+		uint16_t plane_ddb = skl_ddb_entry_size(&ddb->plane[pipe][i]);
+		uint16_t trans_res_blocks = trans_wm->plane_res_b[i];
 
-		trans_wm->plane_en[i] = false;
+		if ((trans_res_blocks > 0) && (trans_res_blocks <= plane_ddb))
+			trans_wm->plane_en[i] = true;
+		else
+			trans_wm->plane_en[i] = false;
 	}
 }
 
@@ -3822,7 +3876,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
-	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
+	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm, ddb);
 
 	return 0;
 }
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4] drm/i915/skl: New ddb allocation algorithm
  2016-09-13 12:40       ` Maarten Lankhorst
@ 2016-09-14 12:36         ` Mahesh Kumar
  2016-09-19  8:27           ` Maarten Lankhorst
  2016-09-19  9:55           ` Maarten Lankhorst
  0 siblings, 2 replies; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-14 12:36 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx; +Cc: paulo.r.zanoni

Hi,
There was an issue with transition WM, it was getting enabled & causing 
fifo underrun.
I fixed the condition, After that tested kms_plane & not getting any 
underrun.
Please let me know if you see any other issue.

Regards,
-Mahesh

On Tuesday 13 September 2016 06:10 PM, Maarten Lankhorst wrote:
> Op 13-09-16 om 14:15 schreef Kumar, Mahesh:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>
>> This patch implements new DDB allocation algorithm as per HW team
>> suggestion. This algo takecare of scenario where we allocate less DDB
>> for the planes with lower relative pixel rate, but they require more DDB
>> to work.
>> It also takes care of enabling same watermark level for each
>> plane, for efficient power saving.
>>
>> Changes since v1:
>>   - Rebase on top of Paulo's patch series
>>
>> Changes since v2:
>>   - Fix the for loop condition to enable WM
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> I'm still getting underrun issues when running the entire patch series against kms_atomic_transition and kms_plane.
> Can you confirm?
>
> ~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/9] drm/i915: Decode system memory bandwidth
  2016-09-09  8:01 ` [PATCH v3 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
@ 2016-09-16  8:02   ` Pandiyan, Dhinakaran
  2016-09-16 11:35     ` Mahesh Kumar
  2016-09-19 20:41   ` Paulo Zanoni
  1 sibling, 1 reply; 40+ messages in thread
From: Pandiyan, Dhinakaran @ 2016-09-16  8:02 UTC (permalink / raw)
  To: Kumar, Mahesh1; +Cc: intel-gfx, Zanoni, Paulo R

On Fri, 2016-09-09 at 13:31 +0530, Kumar, Mahesh wrote:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> 
> This patch adds support to decode system memory bandwidth
> which will be used for arbitrated display memory percentage
> calculation in GEN9 based system.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 96 +++++++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h | 18 ++++++++
>  drivers/gpu/drm/i915/i915_reg.h | 25 +++++++++++
>  3 files changed, 139 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 02c34d6..0a4f18d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -973,6 +973,96 @@ static void intel_sanitize_options(struct drm_i915_private *dev_priv)
>  	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n", yesno(i915.semaphores));
>  }
>  
> +static void
> +intel_get_memdev_info(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	uint32_t val = 0;
> +	uint32_t mem_speed = 0;
> +	uint8_t dram_type;
> +	uint32_t dram_channel;
> +	uint8_t num_channel;
> +	bool rank_valid = false;
> +
> +	if (!IS_GEN9(dev_priv))
> +		goto exit;
> +
> +	val = I915_READ(P_CR_MC_BIOS_REQ_0_0_0);
> +	mem_speed = div_u64((uint64_t) (val & REQ_DATA_MASK) *
> +			MEMORY_FREQ_MULTIPLIER, 1000);
> +
> +	if (mem_speed == 0)
> +		goto exit;
> +
> +	dev_priv->memdev_info.valid = true;
> +	dev_priv->memdev_info.mem_speed = mem_speed;
> +	dram_type = (val >> DRAM_TYPE_SHIFT) & DRAM_TYPE_MASK;
> +	dram_channel = (val >> DRAM_CHANNEL_SHIFT) & DRAM_CHANNEL_MASK;
> +	num_channel = hweight32(dram_channel);
> +
> +	/*
> +	 * The lpddr3 and lpddr4 technologies can have 1-4 channels and the
> +	 * channels are 32bits wide; while ddr3l technologies can have 1-2
> +	 * channels and the channels are 64 bits wide. But SV team found that in
> +	 * case of single 64 bit wide DDR3L dimms two bits were set and system
> +	 * with two DDR3L 64bit dimm all four bits were set.

What bits are set? It would be good to clarify the register.

> +	 */
> +
> +	switch (dram_type) {
> +	case DRAM_TYPE_LPDDR3:
> +	case DRAM_TYPE_LPDDR4:
> +		dev_priv->memdev_info.data_width = 4;
> +		dev_priv->memdev_info.num_channel = num_channel;
> +		break;
> +	case DRAM_TYPE_DDR3L:
> +		dev_priv->memdev_info.data_width = 8;
> +		dev_priv->memdev_info.num_channel = num_channel / 2;

Why is this /2 done here?

> +		break;
> +	default:
> +		dev_priv->memdev_info.data_width = 4;
> +		dev_priv->memdev_info.num_channel = num_channel;
> +	}
> +
> +	/*
> +	 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
> +	 * all the dimms should have same rank as in first valid Dimm
> +	 */
> +#define D_CR_DRP0_DUNIT_INVALID	    0xFFFFFFFF
> +
> +	dev_priv->memdev_info.rank_valid = true;
> +	if (I915_READ(D_CR_DRP0_DUNIT8) != D_CR_DRP0_DUNIT_INVALID) {
> +		val = I915_READ(D_CR_DRP0_DUNIT8);
> +		rank_valid = true;
> +	} else if (I915_READ(D_CR_DRP0_DUNIT9) != D_CR_DRP0_DUNIT_INVALID) {
> +		val = I915_READ(D_CR_DRP0_DUNIT9);
> +		rank_valid = true;
> +	} else if (I915_READ(D_CR_DRP0_DUNIT10) != D_CR_DRP0_DUNIT_INVALID) {
> +		val = I915_READ(D_CR_DRP0_DUNIT10);
> +		rank_valid = true;
> +	} else if (I915_READ(D_CR_DRP0_DUNIT11) != D_CR_DRP0_DUNIT_INVALID) {
> +		val = I915_READ(D_CR_DRP0_DUNIT11);
> +		rank_valid = true;
> +	}
> +#undef D_CR_DRP0_DUNIT_INVALID
> +
> +	if (rank_valid) {
> +		dev_priv->memdev_info.rank_valid = true;
> +		dev_priv->memdev_info.rank = (val & DRAM_RANK_MASK);
> +	}
> +
> +	DRM_DEBUG_DRIVER("valid:%s speed-%d width-%d num_channel-%d\n",
> +		dev_priv->memdev_info.valid ? "true" : "false",
> +		dev_priv->memdev_info.mem_speed,
> +		dev_priv->memdev_info.data_width,
> +		dev_priv->memdev_info.num_channel);
> +	DRM_DEBUG_DRIVER("rank_valid:%s rank-%d\n",
> +		dev_priv->memdev_info.rank_valid ? "true" : "false",
> +		dev_priv->memdev_info.rank);
> +	return;
> +exit:
> +	dev_priv->memdev_info.valid = false;
> +}
> +
>  /**
>   * i915_driver_init_hw - setup state requiring device access
>   * @dev_priv: device private
> @@ -1076,6 +1166,12 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
>  			DRM_DEBUG_DRIVER("can't enable MSI");
>  	}
>  
> +	/*
> +	 * Fill the memdev structure to get the system raw bandwidth
> +	 * This will be used by WM algorithm, to implement GEN9 based WA
> +	 */
> +	intel_get_memdev_info(dev);
> +
>  	return 0;
>  
>  out_ggtt:
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4ec23e5..4313992 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2036,6 +2036,24 @@ struct drm_i915_private {
>  		bool distrust_bios_wm;
>  	} wm;
>  
> +	struct {
> +		/*
> +		 * memory device info
> +		 * valid: memory info is valid or not
> +		 * mem_speed: memory freq in KHz
> +		 * channel_width: Channel width in bytes
> +		 * num_channel: total number of channels
> +		 * rank: 0-rank disable, 1-Single rank, 2-dual rank
> +		 */
> +		bool valid;
> +		uint32_t mem_speed;
> +		uint8_t data_width;
> +		uint8_t num_channel;
> +		bool rank_valid;
> +		uint8_t rank;
> +	} memdev_info;
> +
> +
>  	struct i915_runtime_pm pm;
>  
>  	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a29d707..b38445c 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7716,6 +7716,31 @@ enum {
>  #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
>  #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
>  
> +#define P_CR_MC_BIOS_REQ_0_0_0		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
> +#define REQ_DATA_MASK			(0x3F << 0)
> +#define DRAM_TYPE_SHIFT			24
> +#define DRAM_TYPE_MASK			0x7
> +#define DRAM_CHANNEL_SHIFT		12
> +#define DRAM_CHANNEL_MASK		0xF
> +
> +#define DRAM_TYPE_LPDDR3		0x1
> +#define DRAM_TYPE_LPDDR4		0x2
> +#define DRAM_TYPE_DDR3L			0x4
> +/*
> + * BIOS programs this field of REQ_DATA [5:0] in integer
> + * multiple of 133330 KHz (133.33MHz)
> + */
> +#define MEMORY_FREQ_MULTIPLIER		0x208D2
> +#define D_CR_DRP0_DUNIT8		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1000)
> +#define D_CR_DRP0_DUNIT9		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1200)
> +#define D_CR_DRP0_DUNIT10		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1400)
> +#define D_CR_DRP0_DUNIT11		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1600)
> +#define D_CR_DRP0_RKEN0			(1 << 0)
> +#define D_CR_DRP0_RKEN1			(1 << 1)
> +#define DRAM_RANK_MASK			0x3
> +#define DRAM_SINGLE_RANK		0x1
> +#define DRAM_DUAL_RANK			0x3

Should this be Ox2 ?

> +
>  /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register,
>   * since on HSW we can't write to it using I915_WRITE. */
>  #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5F0C)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/9] drm/i915: Decode system memory bandwidth
  2016-09-16  8:02   ` Pandiyan, Dhinakaran
@ 2016-09-16 11:35     ` Mahesh Kumar
  0 siblings, 0 replies; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-16 11:35 UTC (permalink / raw)
  To: Pandiyan, Dhinakaran; +Cc: intel-gfx, Zanoni, Paulo R



On Friday 16 September 2016 01:32 PM, Pandiyan, Dhinakaran wrote:
> On Fri, 2016-09-09 at 13:31 +0530, Kumar, Mahesh wrote:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>
>> This patch adds support to decode system memory bandwidth
>> which will be used for arbitrated display memory percentage
>> calculation in GEN9 based system.
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c | 96 +++++++++++++++++++++++++++++++++++++++++
>>   drivers/gpu/drm/i915/i915_drv.h | 18 ++++++++
>>   drivers/gpu/drm/i915/i915_reg.h | 25 +++++++++++
>>   3 files changed, 139 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 02c34d6..0a4f18d 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -973,6 +973,96 @@ static void intel_sanitize_options(struct drm_i915_private *dev_priv)
>>   	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n", yesno(i915.semaphores));
>>   }
>>   
>> +static void
>> +intel_get_memdev_info(struct drm_device *dev)
>> +{
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>> +	uint32_t val = 0;
>> +	uint32_t mem_speed = 0;
>> +	uint8_t dram_type;
>> +	uint32_t dram_channel;
>> +	uint8_t num_channel;
>> +	bool rank_valid = false;
>> +
>> +	if (!IS_GEN9(dev_priv))
>> +		goto exit;
>> +
>> +	val = I915_READ(P_CR_MC_BIOS_REQ_0_0_0);
>> +	mem_speed = div_u64((uint64_t) (val & REQ_DATA_MASK) *
>> +			MEMORY_FREQ_MULTIPLIER, 1000);
>> +
>> +	if (mem_speed == 0)
>> +		goto exit;
>> +
>> +	dev_priv->memdev_info.valid = true;
>> +	dev_priv->memdev_info.mem_speed = mem_speed;
>> +	dram_type = (val >> DRAM_TYPE_SHIFT) & DRAM_TYPE_MASK;
>> +	dram_channel = (val >> DRAM_CHANNEL_SHIFT) & DRAM_CHANNEL_MASK;
>> +	num_channel = hweight32(dram_channel);
>> +
>> +	/*
>> +	 * The lpddr3 and lpddr4 technologies can have 1-4 channels and the
>> +	 * channels are 32bits wide; while ddr3l technologies can have 1-2
>> +	 * channels and the channels are 64 bits wide. But SV team found that in
>> +	 * case of single 64 bit wide DDR3L dimms two bits were set and system
>> +	 * with two DDR3L 64bit dimm all four bits were set.
> What bits are set? It would be good to clarify the register.
number of channel-active bits in  P_CR_MC_BIOS_REQ_0_0_0_MCHBAR register,
Will update the comment.
>
>> +	 */
>> +
>> +	switch (dram_type) {
>> +	case DRAM_TYPE_LPDDR3:
>> +	case DRAM_TYPE_LPDDR4:
>> +		dev_priv->memdev_info.data_width = 4;
>> +		dev_priv->memdev_info.num_channel = num_channel;
>> +		break;
>> +	case DRAM_TYPE_DDR3L:
>> +		dev_priv->memdev_info.data_width = 8;
>> +		dev_priv->memdev_info.num_channel = num_channel / 2;
> Why is this /2 done here?
This is the same reason as in comment section above, 2 bits per channel 
are set in P_CR_MC_BIOS_REQ_0_0_0 register in-case of DDR3L DRAM,
So we need to divide "number of bit set" by 2 to get actual number of 
channel for DDR3L.
>
>> +		break;
>> +	default:
>> +		dev_priv->memdev_info.data_width = 4;
>> +		dev_priv->memdev_info.num_channel = num_channel;
>> +	}
>> +
>> +	/*
>> +	 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
>> +	 * all the dimms should have same rank as in first valid Dimm
>> +	 */
>> +#define D_CR_DRP0_DUNIT_INVALID	    0xFFFFFFFF
>> +
>> +	dev_priv->memdev_info.rank_valid = true;
>> +	if (I915_READ(D_CR_DRP0_DUNIT8) != D_CR_DRP0_DUNIT_INVALID) {
>> +		val = I915_READ(D_CR_DRP0_DUNIT8);
>> +		rank_valid = true;
>> +	} else if (I915_READ(D_CR_DRP0_DUNIT9) != D_CR_DRP0_DUNIT_INVALID) {
>> +		val = I915_READ(D_CR_DRP0_DUNIT9);
>> +		rank_valid = true;
>> +	} else if (I915_READ(D_CR_DRP0_DUNIT10) != D_CR_DRP0_DUNIT_INVALID) {
>> +		val = I915_READ(D_CR_DRP0_DUNIT10);
>> +		rank_valid = true;
>> +	} else if (I915_READ(D_CR_DRP0_DUNIT11) != D_CR_DRP0_DUNIT_INVALID) {
>> +		val = I915_READ(D_CR_DRP0_DUNIT11);
>> +		rank_valid = true;
>> +	}
>> +#undef D_CR_DRP0_DUNIT_INVALID
>> +
>> +	if (rank_valid) {
>> +		dev_priv->memdev_info.rank_valid = true;
>> +		dev_priv->memdev_info.rank = (val & DRAM_RANK_MASK);
>> +	}
>> +
>> +	DRM_DEBUG_DRIVER("valid:%s speed-%d width-%d num_channel-%d\n",
>> +		dev_priv->memdev_info.valid ? "true" : "false",
>> +		dev_priv->memdev_info.mem_speed,
>> +		dev_priv->memdev_info.data_width,
>> +		dev_priv->memdev_info.num_channel);
>> +	DRM_DEBUG_DRIVER("rank_valid:%s rank-%d\n",
>> +		dev_priv->memdev_info.rank_valid ? "true" : "false",
>> +		dev_priv->memdev_info.rank);
>> +	return;
>> +exit:
>> +	dev_priv->memdev_info.valid = false;
>> +}
>> +
>>   /**
>>    * i915_driver_init_hw - setup state requiring device access
>>    * @dev_priv: device private
>> @@ -1076,6 +1166,12 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
>>   			DRM_DEBUG_DRIVER("can't enable MSI");
>>   	}
>>   
>> +	/*
>> +	 * Fill the memdev structure to get the system raw bandwidth
>> +	 * This will be used by WM algorithm, to implement GEN9 based WA
>> +	 */
>> +	intel_get_memdev_info(dev);
>> +
>>   	return 0;
>>   
>>   out_ggtt:
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 4ec23e5..4313992 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2036,6 +2036,24 @@ struct drm_i915_private {
>>   		bool distrust_bios_wm;
>>   	} wm;
>>   
>> +	struct {
>> +		/*
>> +		 * memory device info
>> +		 * valid: memory info is valid or not
>> +		 * mem_speed: memory freq in KHz
>> +		 * channel_width: Channel width in bytes
>> +		 * num_channel: total number of channels
>> +		 * rank: 0-rank disable, 1-Single rank, 2-dual rank
>> +		 */
>> +		bool valid;
>> +		uint32_t mem_speed;
>> +		uint8_t data_width;
>> +		uint8_t num_channel;
>> +		bool rank_valid;
>> +		uint8_t rank;
>> +	} memdev_info;
>> +
>> +
>>   	struct i915_runtime_pm pm;
>>   
>>   	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index a29d707..b38445c 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -7716,6 +7716,31 @@ enum {
>>   #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
>>   #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
>>   
>> +#define P_CR_MC_BIOS_REQ_0_0_0		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
>> +#define REQ_DATA_MASK			(0x3F << 0)
>> +#define DRAM_TYPE_SHIFT			24
>> +#define DRAM_TYPE_MASK			0x7
>> +#define DRAM_CHANNEL_SHIFT		12
>> +#define DRAM_CHANNEL_MASK		0xF
>> +
>> +#define DRAM_TYPE_LPDDR3		0x1
>> +#define DRAM_TYPE_LPDDR4		0x2
>> +#define DRAM_TYPE_DDR3L			0x4
>> +/*
>> + * BIOS programs this field of REQ_DATA [5:0] in integer
>> + * multiple of 133330 KHz (133.33MHz)
>> + */
>> +#define MEMORY_FREQ_MULTIPLIER		0x208D2
>> +#define D_CR_DRP0_DUNIT8		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1000)
>> +#define D_CR_DRP0_DUNIT9		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1200)
>> +#define D_CR_DRP0_DUNIT10		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1400)
>> +#define D_CR_DRP0_DUNIT11		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1600)
>> +#define D_CR_DRP0_RKEN0			(1 << 0)
>> +#define D_CR_DRP0_RKEN1			(1 << 1)
>> +#define DRAM_RANK_MASK			0x3
>> +#define DRAM_SINGLE_RANK		0x1
>> +#define DRAM_DUAL_RANK			0x3
> Should this be Ox2 ?
No, In case of dual-rank both the bit (rank-0 & rank-1) will be set.

Thanks & Regards,
-Mahesh
>
>> +
>>   /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register,
>>    * since on HSW we can't write to it using I915_WRITE. */
>>   #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5F0C)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4] drm/i915/skl: New ddb allocation algorithm
  2016-09-14 12:36         ` Mahesh Kumar
@ 2016-09-19  8:27           ` Maarten Lankhorst
  2016-09-19  9:55           ` Maarten Lankhorst
  1 sibling, 0 replies; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-19  8:27 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni

Hey,

Op 14-09-16 om 14:36 schreef Mahesh Kumar:
> Hi,
> There was an issue with transition WM, it was getting enabled & causing fifo underrun.
> I fixed the condition, After that tested kms_plane & not getting any underrun.
> Please let me know if you see any other issue. 

kms_cursor_legacy.cursorA-vs-flipA-atomic-transitions-varying-size is broken by this patch.

It's easy to overlook, since cursorA-vs-flipB of the same test was already broken,
but this is definitely something new introduced by this patch.

~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4] drm/i915/skl: New ddb allocation algorithm
  2016-09-14 12:36         ` Mahesh Kumar
  2016-09-19  8:27           ` Maarten Lankhorst
@ 2016-09-19  9:55           ` Maarten Lankhorst
  2016-09-21 13:03             ` Mahesh Kumar
  1 sibling, 1 reply; 40+ messages in thread
From: Maarten Lankhorst @ 2016-09-19  9:55 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni

Op 14-09-16 om 14:36 schreef Mahesh Kumar:
> Hi,
> There was an issue with transition WM, it was getting enabled & causing fifo underrun.
> I fixed the condition, After that tested kms_plane & not getting any underrun.
> Please let me know if you see any other issue.
>
> Regards,
> -Mahesh
>
> On Tuesday 13 September 2016 06:10 PM, Maarten Lankhorst wrote:
>> Op 13-09-16 om 14:15 schreef Kumar, Mahesh:
>>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>>
>>> This patch implements new DDB allocation algorithm as per HW team
>>> suggestion. This algo takecare of scenario where we allocate less DDB
>>> for the planes with lower relative pixel rate, but they require more DDB
>>> to work.
>>> It also takes care of enabling same watermark level for each
>>> plane, for efficient power saving.
>>>
>>> Changes since v1:
>>>   - Rebase on top of Paulo's patch series
>>>
>>> Changes since v2:
>>>   - Fix the for loop condition to enable WM
>>>
>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> I'm still getting underrun issues when running the entire patch series against kms_atomic_transition and kms_plane.
>> Can you confirm?
>>
>> ~Maarten
>
Found it..

During the test run:
# cat /sys/kernel/debug/dri/0/i915_ddb_info 
                  Start     End    Size
Pipe A
  Plane1              0       0       0
  Plane2             30     890     860
  Cursor            860     892      32

Pretty sure the start value here is bogus, and plane2 wm's end up overlapping with cursor.

~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size
  2016-09-09  8:00 ` [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
  2016-09-12  8:56   ` Maarten Lankhorst
@ 2016-09-19 18:19   ` Paulo Zanoni
  2016-09-19 18:24     ` Zanoni, Paulo R
  1 sibling, 1 reply; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-19 18:19 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx

Em Sex, 2016-09-09 às 13:30 +0530, Kumar, Mahesh escreveu:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> 
> This patch make changes to use linetime latency instead of allocated
> DDB size during plane watermark calculation in switch case, This is
> required to implement new DDB allocation algorithm.
> 
> In New Algorithm DDB is allocated based on WM values, because of
> which
> number of DDB blocks will not be available during WM calculation,
> So this "linetime latency" is suggested by SV/HW team to use during
> switch-case for WM blocks selection.

Why is this not part of BSpec? If there's some problem with the current
algorithm and we need a new one, why is it not part of our spec?

> 
> Changes since v1:
>  - Rebase on top of Paulo's patch series
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 3fdec4d..cfd9b7d1 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3622,10 +3622,15 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>  		selected_result = max(method2, y_tile_minimum);
>  	} else {
> +		uint32_t linetime_us = 0;
> +
> +		linetime_us = DIV_ROUND_UP(width * 1000,
> +				skl_pipe_pixel_rate(cstate));
> +
>  		if ((cpp * cstate->base.adjusted_mode.crtc_htotal /
> 512 < 1) &&
>  		    (plane_bytes_per_line / 512 < 1))
>  			selected_result = method2;
> -		else if ((ddb_allocation / plane_blocks_per_line) >=
> 1)
> +		if (latency >= linetime_us)
>  			selected_result = min(method1, method2);
>  		else
>  			selected_result = method1;
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size
  2016-09-19 18:19   ` Paulo Zanoni
@ 2016-09-19 18:24     ` Zanoni, Paulo R
  2016-09-22  8:02       ` Mahesh Kumar
  0 siblings, 1 reply; 40+ messages in thread
From: Zanoni, Paulo R @ 2016-09-19 18:24 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1

Em Seg, 2016-09-19 às 15:19 -0300, Paulo Zanoni escreveu:
> Em Sex, 2016-09-09 às 13:30 +0530, Kumar, Mahesh escreveu:
> > 
> > From: Mahesh Kumar <mahesh1.kumar@intel.com>
> > 
> > This patch make changes to use linetime latency instead of
> > allocated
> > DDB size during plane watermark calculation in switch case, This is
> > required to implement new DDB allocation algorithm.
> > 
> > In New Algorithm DDB is allocated based on WM values, because of
> > which
> > number of DDB blocks will not be available during WM calculation,
> > So this "linetime latency" is suggested by SV/HW team to use during
> > switch-case for WM blocks selection.
> 
> Why is this not part of BSpec? If there's some problem with the
> current
> algorithm and we need a new one, why is it not part of our spec?
> 
> > 
> > 
> > Changes since v1:
> >  - Rebase on top of Paulo's patch series
> > 
> > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_pm.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 3fdec4d..cfd9b7d1 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3622,10 +3622,15 @@ static int skl_compute_plane_wm(const
> > struct
> > drm_i915_private *dev_priv,
> >  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
> >  		selected_result = max(method2, y_tile_minimum);
> >  	} else {
> > +		uint32_t linetime_us = 0;
> > +
> > +		linetime_us = DIV_ROUND_UP(width * 1000,
> > +				skl_pipe_pixel_rate(cstate));

Also, shouldn't this be width * 8 * 1000?

> > +
> >  		if ((cpp * cstate->base.adjusted_mode.crtc_htotal
> > /
> > 512 < 1) &&
> >  		    (plane_bytes_per_line / 512 < 1))
> >  			selected_result = method2;
> > -		else if ((ddb_allocation / plane_blocks_per_line)
> > >=
> > 1)
> > +		if (latency >= linetime_us)
> >  			selected_result = min(method1, method2);
> >  		else
> >  			selected_result = method1;
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/9] drm/i915: Decode system memory bandwidth
  2016-09-09  8:01 ` [PATCH v3 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
  2016-09-16  8:02   ` Pandiyan, Dhinakaran
@ 2016-09-19 20:41   ` Paulo Zanoni
  1 sibling, 0 replies; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-19 20:41 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx

Hi

Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> 
> This patch adds support to decode system memory bandwidth
> which will be used for arbitrated display memory percentage
> calculation in GEN9 based system.

This is not a complete review of this patch since I can't find the
documentation for the registers used by the patch, but I'll try to
provide some early feedback. Most of it is about styling, so feel free
to provide counter arguments if you disagree.

> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 96
> +++++++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_drv.h | 18 ++++++++
>  drivers/gpu/drm/i915/i915_reg.h | 25 +++++++++++
>  3 files changed, 139 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c
> b/drivers/gpu/drm/i915/i915_drv.c
> index 02c34d6..0a4f18d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -973,6 +973,96 @@ static void intel_sanitize_options(struct
> drm_i915_private *dev_priv)
>  	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n",
> yesno(i915.semaphores));
>  }
>  
> +static void
> +intel_get_memdev_info(struct drm_device *dev)

In our current standards we prefer that you use drm_i915_private as the
parameter here. And the new function doesn't even use drm_device for
anything, so that reinforces the argument.


> +{
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	uint32_t val = 0;
> +	uint32_t mem_speed = 0;
> +	uint8_t dram_type;
> +	uint32_t dram_channel;
> +	uint8_t num_channel;
> +	bool rank_valid = false;

We generally don't zero-initialize things that don't need to be zero-
initialized, like these 3 variables. I know this is an eternal
discussion and each side has pros and cons, but following the already
used coding style usually wins.


Also, please just add this:

struct memdev_info *memdev_info = &dev_priv->memdev_info;

All those "dev_priv->memdev_info" statements below are already making
my fingers hurt, and I didn't even type them :)


> +
> +	if (!IS_GEN9(dev_priv))
> +		goto exit;

If you just set memdev_info.valid to false right at the beginning of
the function then you can just return here and below without using the
goto. IMHO it will look better. Or you could even rely on the fact that
we used kzalloc anyway.


> +
> +	val = I915_READ(P_CR_MC_BIOS_REQ_0_0_0);
> +	mem_speed = div_u64((uint64_t) (val & REQ_DATA_MASK) *
> +			MEMORY_FREQ_MULTIPLIER, 1000);
> +
> +	if (mem_speed == 0)
> +		goto exit;

Perhaps a DRM_DEBUG_KMS("something something memory data is not valid")
would be useful here too.


> +
> +	dev_priv->memdev_info.valid = true;
> +	dev_priv->memdev_info.mem_speed = mem_speed;
> +	dram_type = (val >> DRAM_TYPE_SHIFT) & DRAM_TYPE_MASK;
> +	dram_channel = (val >> DRAM_CHANNEL_SHIFT) &
> DRAM_CHANNEL_MASK;
> +	num_channel = hweight32(dram_channel);
> +
> +	/*
> +	 * The lpddr3 and lpddr4 technologies can have 1-4 channels
> and the
> +	 * channels are 32bits wide; while ddr3l technologies can
> have 1-2
> +	 * channels and the channels are 64 bits wide. But SV team
> found that in
> +	 * case of single 64 bit wide DDR3L dimms two bits were set
> and system
> +	 * with two DDR3L 64bit dimm all four bits were set.
> +	 */

I still don't have access to the spec, but this comment suggests the
spec doesn't match the SV team findings. So can't the SV team request
the specs to reflect their findings? As an outsider, I see this as the
SV team's word against the HW team's words, and I don't know who made
the mistake: the SV engineer or the HW engineer. So I expect a
discussion between the two and the conclusions being reflected on the
specification :). Errata knowledge shouldn't go straight to the
drivers, that's a recipe for eternal doubt to the future people
maintaining this code.


> +
> +	switch (dram_type) {
> +	case DRAM_TYPE_LPDDR3:
> +	case DRAM_TYPE_LPDDR4:
> +		dev_priv->memdev_info.data_width = 4;
> +		dev_priv->memdev_info.num_channel = num_channel;
> +		break;
> +	case DRAM_TYPE_DDR3L:
> +		dev_priv->memdev_info.data_width = 8;
> +		dev_priv->memdev_info.num_channel = num_channel / 2;
> +		break;
> +	default:

Again, no access to the spec here, but shouldn't this case reset
memdev_info.valid to false and then return (possibly with a
DRM_DEBUG_KMS)?


> +		dev_priv->memdev_info.data_width = 4;
> +		dev_priv->memdev_info.num_channel = num_channel;
> +	}
> +
> +	/*
> +	 * Now read each DUNIT8/9/10/11 to check the rank of each
> dimms.
> +	 * all the dimms should have same rank as in first valid
> Dimm
> +	 */
> 
> +#define D_CR_DRP0_DUNIT_INVALID	    0xFFFFFFFF

I'm not a huge fan of these mid-file defines with later undefines.
Can't we just move this to the .h file, or just use a variable, or just
go with the plain 0xFFFFFFFF?


> +
> +	dev_priv->memdev_info.rank_valid = true;

We never set this to false. Bug?


> +	if (I915_READ(D_CR_DRP0_DUNIT8) != D_CR_DRP0_DUNIT_INVALID)
> {
> +		val = I915_READ(D_CR_DRP0_DUNIT8);
> +		rank_valid = true;
> +	} else if (I915_READ(D_CR_DRP0_DUNIT9) !=
> D_CR_DRP0_DUNIT_INVALID) {
> +		val = I915_READ(D_CR_DRP0_DUNIT9);
> +		rank_valid = true;
> +	} else if (I915_READ(D_CR_DRP0_DUNIT10) !=
> D_CR_DRP0_DUNIT_INVALID) {
> +		val = I915_READ(D_CR_DRP0_DUNIT10);
> +		rank_valid = true;
> +	} else if (I915_READ(D_CR_DRP0_DUNIT11) !=
> D_CR_DRP0_DUNIT_INVALID) {
> +		val = I915_READ(D_CR_DRP0_DUNIT11);
> +		rank_valid = true;
> +	}

You can restructure this to avoid the double I915_READ for the valid
case. There are many ways to do this.

Also, you could get rid of the rank_valid variable if you reorganize
things a little bit and then just add an extra "else" for the
"rank_valid is false" case.


> +#undef D_CR_DRP0_DUNIT_INVALID
> +
> +	if (rank_valid) {
> +		dev_priv->memdev_info.rank_valid = true;

But we just set dev_priv->memdev_info.rank_valid to true above, so no
need to redo it here.


> +		dev_priv->memdev_info.rank = (val & DRAM_RANK_MASK);
> +	}
> +
> +	DRM_DEBUG_DRIVER("valid:%s speed-%d width-%d num_channel-
> %d\n",

Speed, width and num_channel are unsigned, and not all of them are 32
bits. Please replace the %d with the more appropriate modifiers and
specifiers.


> +		dev_priv->memdev_info.valid ? "true" : "false",

Perhaps you could use yesno() here and below.

Also, you could prettify the text a little bit so people would be able
to read dmesg and figure out interesting information about their DRAM
without needing to look at the code.


> +		dev_priv->memdev_info.mem_speed,
> +		dev_priv->memdev_info.data_width,
> +		dev_priv->memdev_info.num_channel);
> +	DRM_DEBUG_DRIVER("rank_valid:%s rank-%d\n",
> +		dev_priv->memdev_info.rank_valid ? "true" : "false",
> +		dev_priv->memdev_info.rank);
> +	return;
> +exit:
> +	dev_priv->memdev_info.valid = false;
> +}
> +
>  /**
>   * i915_driver_init_hw - setup state requiring device access
>   * @dev_priv: device private
> @@ -1076,6 +1166,12 @@ static int i915_driver_init_hw(struct
> drm_i915_private *dev_priv)
>  			DRM_DEBUG_DRIVER("can't enable MSI");
>  	}
>  
> +	/*
> +	 * Fill the memdev structure to get the system raw bandwidth
> +	 * This will be used by WM algorithm, to implement GEN9
> based WA
> +	 */
> +	intel_get_memdev_info(dev);
> +
>  	return 0;
>  
>  out_ggtt:
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index 4ec23e5..4313992 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2036,6 +2036,24 @@ struct drm_i915_private {
>  		bool distrust_bios_wm;
>  	} wm;
>  
> +	struct {
> +		/*
> +		 * memory device info
> +		 * valid: memory info is valid or not
> +		 * mem_speed: memory freq in KHz
> +		 * channel_width: Channel width in bytes

channel_width or data_width?


> +		 * num_channel: total number of channels
> +		 * rank: 0-rank disable, 1-Single rank, 2-dual rank
> +		 */
> +		bool valid;
> +		uint32_t mem_speed;
> +		uint8_t data_width;
> +		uint8_t num_channel;
> +		bool rank_valid;
> +		uint8_t rank;

You can remove the comments if you do this:

struct memdev_info {
	bool valid;
	uint32_t mem_speed_khz;
	uint8_t channel/data_width_bytes;
	uint8_t num_channels; (note the plural)
	enum {
		SOMETHING_RANK_DISABLE = 0,
		SOMETHING_RANK_SINGLE,
		SOMETHING_RANK_DUAL
	} rank;
} memdev_info;
	
I *really* like documenting units in the variable names, since it
avoids constant doc checking.

> +	} memdev_info;
> +
> +
>  	struct i915_runtime_pm pm;
>  
>  	/* Abstract the submission mechanism (legacy ringbuffer or
> execlists) away */
> diff --git a/drivers/gpu/drm/i915/i915_reg.h
> b/drivers/gpu/drm/i915/i915_reg.h
> index a29d707..b38445c 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7716,6 +7716,31 @@ enum {
>  #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
>  #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
>  
> +#define P_CR_MC_BIOS_REQ_0_0_0		_MMIO(MCHBAR_MIRROR_BA
> SE_SNB + 0x7114)
> +#define REQ_DATA_MASK			(0x3F << 0)
> +#define DRAM_TYPE_SHIFT			24
> +#define DRAM_TYPE_MASK			0x7
> +#define DRAM_CHANNEL_SHIFT		12
> +#define DRAM_CHANNEL_MASK		0xF
> +
> +#define DRAM_TYPE_LPDDR3		0x1
> +#define DRAM_TYPE_LPDDR4		0x2
> +#define DRAM_TYPE_DDR3L			0x4
> +/*
> + * BIOS programs this field of REQ_DATA [5:0] in integer
> + * multiple of 133330 KHz (133.33MHz)
> + */
> +#define MEMORY_FREQ_MULTIPLIER		0x208D2
> +#define D_CR_DRP0_DUNIT8		_MMIO(MCHBAR_MIRROR_BASE_SNB
> + 0x1000)
> +#define D_CR_DRP0_DUNIT9		_MMIO(MCHBAR_MIRROR_BASE_SNB
> + 0x1200)
> +#define D_CR_DRP0_DUNIT10		_MMIO(MCHBAR_MIRROR_BASE_SN
> B + 0x1400)
> +#define D_CR_DRP0_DUNIT11		_MMIO(MCHBAR_MIRROR_BASE_SN
> B + 0x1600)
> +#define D_CR_DRP0_RKEN0			(1 << 0)
> +#define D_CR_DRP0_RKEN1			(1 << 1)
> +#define DRAM_RANK_MASK			0x3
> +#define DRAM_SINGLE_RANK		0x1
> +#define DRAM_DUAL_RANK			0x3

I'd prefix all/most of these definitions with something to group them
all. Generic names such as MEMORY_FREQ_MULTIPLIER or DRAM_TYPE_LPDD3
may be confusing and/or cause problems later.


Thanks,
Paulo

> +
>  /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using
> this register,
>   * since on HSW we can't write to it using I915_WRITE. */
>  #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_S
> NB + 0x5F0C)
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
  2016-09-09  8:00 ` [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
@ 2016-09-20 12:17   ` Paulo Zanoni
  2016-09-21 13:48     ` Mahesh Kumar
  0 siblings, 1 reply; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-20 12:17 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx

Em Sex, 2016-09-09 às 13:30 +0530, Kumar, Mahesh escreveu:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> 
> This patch make use of plane_wm variable directly instead of passing
> skl_plane_wm struct. this way reduces number of argument requirement
> in watermark calculation functions.
> 
> It also gives more freedom of decision making to implement Bspec WM
> workarounds.

This is just my personal opinion, but I don't think this patch is an
improvement to the codebase. The way things are currently organized,
there's no risk that we may end up reading some variable that's still
not set/computed, so there's less risk that some later patch may end up
using information it shouldn't. Also, by having these explicit out
parameters, we clearly highlight what's the goal of the function:
output those 3 values, instead of writing I-don't-know-what to a huge
struct.

Besides, the patch even keeps the out_ variables as local variables
instead of parameters, which makes things even more confusing IMHO,
since in_ and out_ variables are usually function parameters.

I also see that this patch is not necessary for the series. We kinda
use the new pipe_wm variable at patch 9, but we could just go with the
variables we have.

So, unless some new arguments are given, I'd suggest to just drop this
patch.

> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 29 +++++++++++++++--------------
>  1 file changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 86c6d66..3fdec4d 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3542,9 +3542,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  				struct intel_plane_state
> *intel_pstate,
>  				uint16_t ddb_allocation,
>  				int level,
> -				uint16_t *out_blocks, /* out */
> -				uint8_t *out_lines, /* out */
> -				bool *enabled /* out */)
> +				struct skl_pipe_wm *pipe_wm)
>  {
>  	struct drm_plane_state *pstate = &intel_pstate->base;
>  	struct drm_framebuffer *fb = pstate->fb;
> @@ -3557,6 +3555,11 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	uint32_t width = 0, height = 0;
>  	uint32_t plane_pixel_rate;
>  	uint32_t y_tile_minimum, y_min_scanlines;
> +	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
> +	struct skl_wm_level *result = &pipe_wm->wm[level];
> +	uint16_t *out_blocks = &result->plane_res_b[id];
> +	uint8_t *out_lines = &result->plane_res_l[id];
> +	bool *enabled = &result->plane_en[id];
>  
>  	if (latency == 0 || !cstate->base.active || !intel_pstate-
> >base.visible) {
>  		*enabled = false;
> @@ -3673,7 +3676,7 @@ skl_compute_wm_level(const struct
> drm_i915_private *dev_priv,
>  		     struct skl_ddb_allocation *ddb,
>  		     struct intel_crtc_state *cstate,
>  		     int level,
> -		     struct skl_wm_level *result)
> +		     struct skl_pipe_wm *pipe_wm)
>  {
>  	struct drm_atomic_state *state = cstate->base.state;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(cstate-
> >base.crtc);
> @@ -3684,12 +3687,6 @@ skl_compute_wm_level(const struct
> drm_i915_private *dev_priv,
>  	enum pipe pipe = intel_crtc->pipe;
>  	int ret;
>  
> -	/*
> -	 * We'll only calculate watermarks for planes that are
> actually
> -	 * enabled, so make sure all other planes are set as
> disabled.
> -	 */
> -	memset(result, 0, sizeof(*result));
> -
>  	for_each_intel_plane_mask(&dev_priv->drm,
>  				  intel_plane,
>  				  cstate->base.plane_mask) {
> @@ -3727,9 +3724,7 @@ skl_compute_wm_level(const struct
> drm_i915_private *dev_priv,
>  					   intel_pstate,
>  					   ddb_blocks,
>  					   level,
> -					   &result->plane_res_b[i],
> -					   &result->plane_res_l[i],
> -					   &result->plane_en[i]);
> +					   pipe_wm);
>  		if (ret)
>  			return ret;
>  	}
> @@ -3777,9 +3772,15 @@ static int skl_build_pipe_wm(struct
> intel_crtc_state *cstate,
>  	int level, max_level = ilk_wm_max_level(dev);
>  	int ret;
>  
> +	/*
> +	 * We'll only calculate watermarks for planes that are
> actually
> +	 * enabled, so make sure all other planes are set as
> disabled.
> +	 */
> +	memset(pipe_wm, 0, sizeof(*pipe_wm));
> +
>  	for (level = 0; level <= max_level; level++) {
>  		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> -					   level, &pipe_wm-
> >wm[level]);
> +					   level, pipe_wm);
>  		if (ret)
>  			return ret;
>  	}
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4] drm/i915/skl: New ddb allocation algorithm
  2016-09-19  9:55           ` Maarten Lankhorst
@ 2016-09-21 13:03             ` Mahesh Kumar
  0 siblings, 0 replies; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-21 13:03 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx; +Cc: paulo.r.zanoni

Hi Maarten,

thanks for pointing out the issue,

not only start value, end value is also incorrect. I got the root-cause 
for both.
but this end value issue seems to be always there. end value should be 
-1 (because start block should also be counted)
Will fix both & upload the reworked patches.

-Mahesh


On Monday 19 September 2016 03:25 PM, Maarten Lankhorst wrote:
> Op 14-09-16 om 14:36 schreef Mahesh Kumar:
>> Hi,
>> There was an issue with transition WM, it was getting enabled & causing fifo underrun.
>> I fixed the condition, After that tested kms_plane & not getting any underrun.
>> Please let me know if you see any other issue.
>>
>> Regards,
>> -Mahesh
>>
>> On Tuesday 13 September 2016 06:10 PM, Maarten Lankhorst wrote:
>>> Op 13-09-16 om 14:15 schreef Kumar, Mahesh:
>>>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>>>
>>>> This patch implements new DDB allocation algorithm as per HW team
>>>> suggestion. This algo takecare of scenario where we allocate less DDB
>>>> for the planes with lower relative pixel rate, but they require more DDB
>>>> to work.
>>>> It also takes care of enabling same watermark level for each
>>>> plane, for efficient power saving.
>>>>
>>>> Changes since v1:
>>>>    - Rebase on top of Paulo's patch series
>>>>
>>>> Changes since v2:
>>>>    - Fix the for loop condition to enable WM
>>>>
>>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>>> I'm still getting underrun issues when running the entire patch series against kms_atomic_transition and kms_plane.
>>> Can you confirm?
>>>
>>> ~Maarten
> Found it..
>
> During the test run:
> # cat /sys/kernel/debug/dri/0/i915_ddb_info
>                    Start     End    Size
> Pipe A
>    Plane1              0       0       0
>    Plane2             30     890     860
>    Cursor            860     892      32
>
> Pretty sure the start value here is bogus, and plane2 wm's end up overlapping with cursor.
>
> ~Maarten
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
  2016-09-20 12:17   ` Paulo Zanoni
@ 2016-09-21 13:48     ` Mahesh Kumar
  2016-09-21 13:59       ` Paulo Zanoni
  0 siblings, 1 reply; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-21 13:48 UTC (permalink / raw)
  To: Paulo Zanoni, intel-gfx

there are two main reason I want to pass complete pipe_wm to compute 
function

1. for transition WM calculation, else we need to pass 
trans_wm.plane_res_b &  trans_wm.plane_res_l also as a parameter,

     that will increase the parameter-list. And trans_wm will be used 
(calculation will happen) only during level-0. All other levels (1-7) 
these will be extra unused variables.

     If we want to calculate transition WM separately (after plane WM 
calculation) there we will have to duplicate some of the code for WM 
level-0 calculation.

2. This will be used in Render compression WA for (Gen9 & CNL) where we 
have to make sure WM level-(1-7) result blocks/lines are >= WM level-0 
result blocks/lines.


agree, out_ variable should not be local variables.
If you think above points can be addressed, let me know, will update 
patches accordingly.

thanks,
-Mahesh

On Tuesday 20 September 2016 05:47 PM, Paulo Zanoni wrote:
> Em Sex, 2016-09-09 às 13:30 +0530, Kumar, Mahesh escreveu:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>
>> This patch make use of plane_wm variable directly instead of passing
>> skl_plane_wm struct. this way reduces number of argument requirement
>> in watermark calculation functions.
>>
>> It also gives more freedom of decision making to implement Bspec WM
>> workarounds.
> This is just my personal opinion, but I don't think this patch is an
> improvement to the codebase. The way things are currently organized,
> there's no risk that we may end up reading some variable that's still
> not set/computed, so there's less risk that some later patch may end up
> using information it shouldn't. Also, by having these explicit out
> parameters, we clearly highlight what's the goal of the function:
> output those 3 values, instead of writing I-don't-know-what to a huge
> struct.
>
> Besides, the patch even keeps the out_ variables as local variables
> instead of parameters, which makes things even more confusing IMHO,
> since in_ and out_ variables are usually function parameters.
>
> I also see that this patch is not necessary for the series. We kinda
> use the new pipe_wm variable at patch 9, but we could just go with the
> variables we have.
>
> So, unless some new arguments are given, I'd suggest to just drop this
> patch.
>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_pm.c | 29 +++++++++++++++--------------
>>   1 file changed, 15 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>> b/drivers/gpu/drm/i915/intel_pm.c
>> index 86c6d66..3fdec4d 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3542,9 +3542,7 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   				struct intel_plane_state
>> *intel_pstate,
>>   				uint16_t ddb_allocation,
>>   				int level,
>> -				uint16_t *out_blocks, /* out */
>> -				uint8_t *out_lines, /* out */
>> -				bool *enabled /* out */)
>> +				struct skl_pipe_wm *pipe_wm)
>>   {
>>   	struct drm_plane_state *pstate = &intel_pstate->base;
>>   	struct drm_framebuffer *fb = pstate->fb;
>> @@ -3557,6 +3555,11 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   	uint32_t width = 0, height = 0;
>>   	uint32_t plane_pixel_rate;
>>   	uint32_t y_tile_minimum, y_min_scanlines;
>> +	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
>> +	struct skl_wm_level *result = &pipe_wm->wm[level];
>> +	uint16_t *out_blocks = &result->plane_res_b[id];
>> +	uint8_t *out_lines = &result->plane_res_l[id];
>> +	bool *enabled = &result->plane_en[id];
>>   
>>   	if (latency == 0 || !cstate->base.active || !intel_pstate-
>>> base.visible) {
>>   		*enabled = false;
>> @@ -3673,7 +3676,7 @@ skl_compute_wm_level(const struct
>> drm_i915_private *dev_priv,
>>   		     struct skl_ddb_allocation *ddb,
>>   		     struct intel_crtc_state *cstate,
>>   		     int level,
>> -		     struct skl_wm_level *result)
>> +		     struct skl_pipe_wm *pipe_wm)
>>   {
>>   	struct drm_atomic_state *state = cstate->base.state;
>>   	struct intel_crtc *intel_crtc = to_intel_crtc(cstate-
>>> base.crtc);
>> @@ -3684,12 +3687,6 @@ skl_compute_wm_level(const struct
>> drm_i915_private *dev_priv,
>>   	enum pipe pipe = intel_crtc->pipe;
>>   	int ret;
>>   
>> -	/*
>> -	 * We'll only calculate watermarks for planes that are
>> actually
>> -	 * enabled, so make sure all other planes are set as
>> disabled.
>> -	 */
>> -	memset(result, 0, sizeof(*result));
>> -
>>   	for_each_intel_plane_mask(&dev_priv->drm,
>>   				  intel_plane,
>>   				  cstate->base.plane_mask) {
>> @@ -3727,9 +3724,7 @@ skl_compute_wm_level(const struct
>> drm_i915_private *dev_priv,
>>   					   intel_pstate,
>>   					   ddb_blocks,
>>   					   level,
>> -					   &result->plane_res_b[i],
>> -					   &result->plane_res_l[i],
>> -					   &result->plane_en[i]);
>> +					   pipe_wm);
>>   		if (ret)
>>   			return ret;
>>   	}
>> @@ -3777,9 +3772,15 @@ static int skl_build_pipe_wm(struct
>> intel_crtc_state *cstate,
>>   	int level, max_level = ilk_wm_max_level(dev);
>>   	int ret;
>>   
>> +	/*
>> +	 * We'll only calculate watermarks for planes that are
>> actually
>> +	 * enabled, so make sure all other planes are set as
>> disabled.
>> +	 */
>> +	memset(pipe_wm, 0, sizeof(*pipe_wm));
>> +
>>   	for (level = 0; level <= max_level; level++) {
>>   		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
>> -					   level, &pipe_wm-
>>> wm[level]);
>> +					   level, pipe_wm);
>>   		if (ret)
>>   			return ret;
>>   	}

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
  2016-09-21 13:48     ` Mahesh Kumar
@ 2016-09-21 13:59       ` Paulo Zanoni
  0 siblings, 0 replies; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-21 13:59 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx

Em Qua, 2016-09-21 às 19:18 +0530, Mahesh Kumar escreveu:
> there are two main reason I want to pass complete pipe_wm to compute 
> function
> 
> 1. for transition WM calculation, else we need to pass 
> trans_wm.plane_res_b &  trans_wm.plane_res_l also as a parameter,
> 
>      that will increase the parameter-list. And trans_wm will be
> used 
> (calculation will happen) only during level-0. All other levels (1-
> 7) 
> these will be extra unused variables.
> 
>      If we want to calculate transition WM separately (after plane
> WM 
> calculation) there we will have to duplicate some of the code for WM 
> level-0 calculation.
> 
> 2. This will be used in Render compression WA for (Gen9 & CNL) where
> we 
> have to make sure WM level-(1-7) result blocks/lines are >= WM level-
> 0 
> result blocks/lines.

Well, I tried to check against the series and I really thought they
were not necessary.

Considering there's also some discussion whether we want transition WMs
or not (i.e., the HW guys say we don't want), can you please at least
move this patch to be right before the point where it is needed? This
way we'll be able to discuss & merge the others before.

Thanks,
Paulo

> 
> 
> agree, out_ variable should not be local variables.
> If you think above points can be addressed, let me know, will update 
> patches accordingly.
> 
> thanks,
> -Mahesh
> 
> On Tuesday 20 September 2016 05:47 PM, Paulo Zanoni wrote:
> > 
> > Em Sex, 2016-09-09 às 13:30 +0530, Kumar, Mahesh escreveu:
> > > 
> > > From: Mahesh Kumar <mahesh1.kumar@intel.com>
> > > 
> > > This patch make use of plane_wm variable directly instead of
> > > passing
> > > skl_plane_wm struct. this way reduces number of argument
> > > requirement
> > > in watermark calculation functions.
> > > 
> > > It also gives more freedom of decision making to implement Bspec
> > > WM
> > > workarounds.
> > This is just my personal opinion, but I don't think this patch is
> > an
> > improvement to the codebase. The way things are currently
> > organized,
> > there's no risk that we may end up reading some variable that's
> > still
> > not set/computed, so there's less risk that some later patch may
> > end up
> > using information it shouldn't. Also, by having these explicit out
> > parameters, we clearly highlight what's the goal of the function:
> > output those 3 values, instead of writing I-don't-know-what to a
> > huge
> > struct.
> > 
> > Besides, the patch even keeps the out_ variables as local variables
> > instead of parameters, which makes things even more confusing IMHO,
> > since in_ and out_ variables are usually function parameters.
> > 
> > I also see that this patch is not necessary for the series. We
> > kinda
> > use the new pipe_wm variable at patch 9, but we could just go with
> > the
> > variables we have.
> > 
> > So, unless some new arguments are given, I'd suggest to just drop
> > this
> > patch.
> > 
> > > 
> > > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/intel_pm.c | 29 +++++++++++++++-----------
> > > ---
> > >   1 file changed, 15 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > b/drivers/gpu/drm/i915/intel_pm.c
> > > index 86c6d66..3fdec4d 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > @@ -3542,9 +3542,7 @@ static int skl_compute_plane_wm(const
> > > struct
> > > drm_i915_private *dev_priv,
> > >   				struct intel_plane_state
> > > *intel_pstate,
> > >   				uint16_t ddb_allocation,
> > >   				int level,
> > > -				uint16_t *out_blocks, /* out */
> > > -				uint8_t *out_lines, /* out */
> > > -				bool *enabled /* out */)
> > > +				struct skl_pipe_wm *pipe_wm)
> > >   {
> > >   	struct drm_plane_state *pstate = &intel_pstate->base;
> > >   	struct drm_framebuffer *fb = pstate->fb;
> > > @@ -3557,6 +3555,11 @@ static int skl_compute_plane_wm(const
> > > struct
> > > drm_i915_private *dev_priv,
> > >   	uint32_t width = 0, height = 0;
> > >   	uint32_t plane_pixel_rate;
> > >   	uint32_t y_tile_minimum, y_min_scanlines;
> > > +	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
> > > +	struct skl_wm_level *result = &pipe_wm->wm[level];
> > > +	uint16_t *out_blocks = &result->plane_res_b[id];
> > > +	uint8_t *out_lines = &result->plane_res_l[id];
> > > +	bool *enabled = &result->plane_en[id];
> > >   
> > >   	if (latency == 0 || !cstate->base.active ||
> > > !intel_pstate-
> > > > 
> > > > base.visible) {
> > >   		*enabled = false;
> > > @@ -3673,7 +3676,7 @@ skl_compute_wm_level(const struct
> > > drm_i915_private *dev_priv,
> > >   		     struct skl_ddb_allocation *ddb,
> > >   		     struct intel_crtc_state *cstate,
> > >   		     int level,
> > > -		     struct skl_wm_level *result)
> > > +		     struct skl_pipe_wm *pipe_wm)
> > >   {
> > >   	struct drm_atomic_state *state = cstate->base.state;
> > >   	struct intel_crtc *intel_crtc = to_intel_crtc(cstate-
> > > > 
> > > > base.crtc);
> > > @@ -3684,12 +3687,6 @@ skl_compute_wm_level(const struct
> > > drm_i915_private *dev_priv,
> > >   	enum pipe pipe = intel_crtc->pipe;
> > >   	int ret;
> > >   
> > > -	/*
> > > -	 * We'll only calculate watermarks for planes that are
> > > actually
> > > -	 * enabled, so make sure all other planes are set as
> > > disabled.
> > > -	 */
> > > -	memset(result, 0, sizeof(*result));
> > > -
> > >   	for_each_intel_plane_mask(&dev_priv->drm,
> > >   				  intel_plane,
> > >   				  cstate->base.plane_mask) {
> > > @@ -3727,9 +3724,7 @@ skl_compute_wm_level(const struct
> > > drm_i915_private *dev_priv,
> > >   					   intel_pstate,
> > >   					   ddb_blocks,
> > >   					   level,
> > > -					   &result-
> > > >plane_res_b[i],
> > > -					   &result-
> > > >plane_res_l[i],
> > > -					   &result-
> > > >plane_en[i]);
> > > +					   pipe_wm);
> > >   		if (ret)
> > >   			return ret;
> > >   	}
> > > @@ -3777,9 +3772,15 @@ static int skl_build_pipe_wm(struct
> > > intel_crtc_state *cstate,
> > >   	int level, max_level = ilk_wm_max_level(dev);
> > >   	int ret;
> > >   
> > > +	/*
> > > +	 * We'll only calculate watermarks for planes that are
> > > actually
> > > +	 * enabled, so make sure all other planes are set as
> > > disabled.
> > > +	 */
> > > +	memset(pipe_wm, 0, sizeof(*pipe_wm));
> > > +
> > >   	for (level = 0; level <= max_level; level++) {
> > >   		ret = skl_compute_wm_level(dev_priv, ddb,
> > > cstate,
> > > -					   level, &pipe_wm-
> > > > 
> > > > wm[level]);
> > > +					   level, pipe_wm);
> > >   		if (ret)
> > >   			return ret;
> > >   	}
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/9] drm/i915/skl+: change WM calc to fixed point 16.16
  2016-09-09  8:01 ` [PATCH v3 6/9] drm/i915/skl+: change WM calc to fixed point 16.16 Kumar, Mahesh
@ 2016-09-21 18:32   ` Paulo Zanoni
  2016-09-22  9:25     ` Mahesh Kumar
  0 siblings, 1 reply; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-21 18:32 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx

Hi

Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>

First of all, good catch with this patch!

> 
> This patch changes Watermak calculation to fixed point calculation.
> Problem with current calculation is during plane_blocks_per_line
> calculation we divide intermediate blocks with min_scanlines and
> takes floor of the result because of integer operation.
> hence we end-up assigning less blocks than required. Which leads to
> flickers.

This problem is that the patch doesn't only do this. It also tries to
make the method1 result be a fixed point 16.16, and it also does a
completely unrelated change with the addition of the y_tiled variable.
I'd really like of the 3 changes were 3 different patches.


> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 0ec328b..d4390e4 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3530,13 +3530,15 @@ static uint32_t skl_pipe_pixel_rate(const
> struct intel_crtc_state *config)
>  */
>  static uint32_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
> uint32_t latency)
>  {
> -	uint32_t wm_intermediate_val, ret;
> +	uint64_t wm_intermediate_val;
> +	uint32_t ret;
>  
>  	if (latency == 0)
>  		return UINT_MAX;
>  
> -	wm_intermediate_val = latency * pixel_rate * cpp / 512;
> -	ret = DIV_ROUND_UP(wm_intermediate_val, 1000);
> +	wm_intermediate_val = latency * pixel_rate * cpp;
> +	wm_intermediate_val <<= 16;
> +	ret = DIV_ROUND_UP_ULL(wm_intermediate_val, 1000 * 512);
>  
>  	return ret;
>  }

So shouldn't we fix the callers of skl_wm_method1 to take into
consideration that the value returned is now a fixed point 16.16?
Sounds like we have a bug here.

Also, having method1 be 16.16 and method2 be a normal integer adds a
lot to the confusion.


> @@ -3605,6 +3607,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	uint16_t *out_blocks = &result->plane_res_b[id];
>  	uint8_t *out_lines = &result->plane_res_l[id];
>  	enum watermark_memory_wa mem_wa;
> +	bool y_tiled = false;

Please either discard these y_tiled chunks or move them to a separate
patch. And since we have the y_tiled variable, maybe we'd also like to
have an x_tiled variable for consistency between the checks?


>  
>  	if (latency == 0 || !cstate->base.active || !intel_pstate-
> >base.visible)
>  		return 0;
> @@ -3652,14 +3655,18 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	plane_bytes_per_line = width * cpp;
>  	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
> +		y_tiled = true;
>  		plane_blocks_per_line =
>  		      DIV_ROUND_UP(plane_bytes_per_line *
> y_min_scanlines, 512);
> -		plane_blocks_per_line /= y_min_scanlines;
> +		plane_blocks_per_line = (plane_blocks_per_line <<
> 16) /
> +								y_mi
> n_scanlines;
>  	} else if (fb->modifier[0] == DRM_FORMAT_MOD_NONE) {
>  		plane_blocks_per_line =
> DIV_ROUND_UP(plane_bytes_per_line, 512)
>  					+ 1;
> +		plane_blocks_per_line <<= 16;
>  	} else {
>  		plane_blocks_per_line =
> DIV_ROUND_UP(plane_bytes_per_line, 512);
> +		plane_blocks_per_line <<= 16;
>  	}

This is probably a rebase problem, but not all places where
plane_blocks_per_line are used were fixed to take into consideration
the new 16.16 format.


Now, what I've been thinking is that we completely fail at type-safety
when we mix these normal integers with the 16.16 integers. Ideally, we
should find a way to make the compiler complain about this to us, since
it's too easy to make such mistakes. Can't we try to make the compiler
help us, such as by defining something like

typedef struct {
	uint32_t val;
} uint_fixed_16_16_t;

and then making the appropriate functions/macros for maniuplating them
and mixing them with integers? This would help catching problems such
as the one we have here. Also, other pieces of i915.ko and drm.ko could
benefit from this.

>  
>  	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
> @@ -3670,8 +3677,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  
>  	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
>  
> -	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
> -	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
> +	if (y_tiled) {
>  		selected_result = max(method2, y_tile_minimum);
>  	} else {
>  		uint32_t linetime_us = 0;
> @@ -3688,12 +3694,11 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  			selected_result = method1;
>  	}
>  
> -	res_blocks = selected_result + 1;
> +	res_blocks = DIV_ROUND_UP(selected_result, 1 << 16) + 1;
>  	res_lines = DIV_ROUND_UP(selected_result,
> plane_blocks_per_line);
>  
>  	if (level >= 1 && level <= 7) {
> -		if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
> -		    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
> +		if (y_tiled) {
>  			res_blocks += y_tile_minimum;
>  			res_lines += y_min_scanlines;
>  		} else {
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 7/9] drm/i915/bxt: Enable IPC support
  2016-09-09  8:01 ` [PATCH v3 7/9] drm/i915/bxt: Enable IPC support Kumar, Mahesh
@ 2016-09-21 20:06   ` Paulo Zanoni
  2016-09-22 10:14     ` Mahesh Kumar
  0 siblings, 1 reply; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-21 20:06 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx

Hi

Lots of nitpicking in my review. Feel free to disagree with them.


Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> 
> This patch adds IPC support for platforms. This patch enables IPC
> only for BXT platform as for SKL recommendation is to keep is
> disabled

But don't we want it for KBL too?

Also, can you please elaborate the commit message a little bit more?
What exactly does this feature do? What do we gain with it? Are there
any downsides?

> This patch also adds kernel command-line option to enable/disable
> the IPC if required.

Any reason on why someone would want to do this besides for debugging?
I'm not sure if every single little feature like this one-bit one-
platform feature requires an i915 option, so unless there's a strong
reason, we can just omit the option.

> 

> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
>  drivers/gpu/drm/i915/i915_params.c |  5 +++++
>  drivers/gpu/drm/i915/i915_params.h |  1 +
>  drivers/gpu/drm/i915/i915_reg.h    |  1 +
>  drivers/gpu/drm/i915/intel_drv.h   |  1 +
>  drivers/gpu/drm/i915/intel_pm.c    | 21 +++++++++++++++++++++
>  6 files changed, 34 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c
> b/drivers/gpu/drm/i915/i915_drv.c
> index 0a4f18d..22d84e6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1335,6 +1335,11 @@ int i915_driver_load(struct pci_dev *pdev,
> const struct pci_device_id *ent)
>  
>  	intel_runtime_pm_enable(dev_priv);
>  
> +	/*
> +	 * Now enable the IPC for supported platforms
> +	 */
> +	intel_enable_ipc(dev_priv);

The comment is unnecessary: it basically only says what the function
name already says.

Also, I think it makes more sense to move this to intel_init_pm(). As a
bonus, you'll be able to make the function static. Or even just make
the IPC code be part of init_clock_gating() since the whole feature is
just "enable this bit".


> +
>  	/* Everything is in place, we can now relax! */
>  	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
>  		 driver.name, driver.major, driver.minor,
> driver.patchlevel,
> diff --git a/drivers/gpu/drm/i915/i915_params.c
> b/drivers/gpu/drm/i915/i915_params.c
> index 768ad89..cc41b8d 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -62,6 +62,7 @@ struct i915_params i915 __read_mostly = {
>  	.inject_load_failure = 0,
>  	.enable_dpcd_backlight = false,
>  	.enable_gvt = false,
> +	.enable_ipc = true,
>  };
>  
>  module_param_named(modeset, i915.modeset, int, 0400);
> @@ -233,3 +234,7 @@ MODULE_PARM_DESC(enable_dpcd_backlight,
>  module_param_named(enable_gvt, i915.enable_gvt, bool, 0400);
>  MODULE_PARM_DESC(enable_gvt,
>  	"Enable support for Intel GVT-g graphics virtualization host
> support(default:false)");
> +
> +module_param_named(enable_ipc, i915.enable_ipc, bool, 0400);
> +MODULE_PARM_DESC(enable_ipc,
> +		"Enable Isochronous Priority Control
> (default:true)");
> diff --git a/drivers/gpu/drm/i915/i915_params.h
> b/drivers/gpu/drm/i915/i915_params.h
> index 3a0dd78..f99b9b9 100644
> --- a/drivers/gpu/drm/i915/i915_params.h
> +++ b/drivers/gpu/drm/i915/i915_params.h
> @@ -65,6 +65,7 @@ struct i915_params {
>  	bool enable_dp_mst;
>  	bool enable_dpcd_backlight;
>  	bool enable_gvt;
> +	bool enable_ipc;
>  };
>  
>  extern struct i915_params i915 __read_mostly;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h
> b/drivers/gpu/drm/i915/i915_reg.h
> index b38445c..75b5b52 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -6139,6 +6139,7 @@ enum {
>  #define  DISP_FBC_WM_DIS		(1<<15)
>  #define DISP_ARB_CTL2	_MMIO(0x45004)
>  #define  DISP_DATA_PARTITION_5_6	(1<<6)
> +#define  DISP_IPC_ENABLE		(1<<3)
>  #define DBUF_CTL	_MMIO(0x45008)
>  #define  DBUF_POWER_REQUEST		(1<<31)
>  #define  DBUF_POWER_STATE		(1<<30)
> diff --git a/drivers/gpu/drm/i915/intel_drv.h
> b/drivers/gpu/drm/i915/intel_drv.h
> index 66cb46c..56c8ac8 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1753,6 +1753,7 @@ void skl_write_plane_wm(struct intel_crtc
> *intel_crtc,
>  uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state
> *pipe_config);
>  bool ilk_disable_lp_wm(struct drm_device *dev);
>  int sanitize_rc6_option(struct drm_i915_private *dev_priv, int
> enable_rc6);
> +void intel_enable_ipc(struct drm_i915_private *dev_priv);
>  static inline int intel_enable_rc6(void)
>  {
>  	return i915.enable_rc6;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index d4390e4..8d0037c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4793,6 +4793,27 @@ void intel_update_watermarks(struct drm_crtc
> *crtc)
>  }
>  
>  /*
> + * enable IPC for Supported Platforms
> + */

This comment also doesn't say very much. Also, s/enable/Enable/ if you
keep it.

> +void intel_enable_ipc(struct drm_i915_private *dev_priv)
> +{
> +	u32 val;
> +
> +	/* enable IPC only for Broxton for now*/

Also not a useful comment, unless you explain why.

> +	if (!IS_BROXTON(dev_priv))
> +		return;
> +
> +	val = I915_READ(DISP_ARB_CTL2);
> +
> +	if (i915.enable_ipc)
> +		val |= DISP_IPC_ENABLE;
> +	else
> +		val &= ~DISP_IPC_ENABLE;
> +
> +	I915_WRITE(DISP_ARB_CTL2, val);

> +}
> +
> +/*
>   * Lock protecting IPS related data structures
>   */
>  DEFINE_SPINLOCK(mchdev_lock);
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA
  2016-09-09  8:01 ` [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA Kumar, Mahesh
@ 2016-09-21 20:23   ` Paulo Zanoni
  2016-09-22  9:43     ` Mahesh Kumar
  0 siblings, 1 reply; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-21 20:23 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx

Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> 
> It implements the WA to enable IDLE_WAKEMEM bit of CHICKEN_DCPR_1
> register for Broxton platform. When IPC is enabled & Y-tile is
> enabled in any of the enabled plane, above bit should be set.
> Without this WA system observes random hang.

The previous patch enabled the feature, and now this patch implements a
missing workaround for the feature. This is not the appropriate order,
since it can mean that the previous patch introduced a bug that was
fixed by this patch, and this potentially breaks bisectability. We only
enable the feature once we know all its workarounds are enabled and the
feature will Just Work*. So, normally, my suggestion would be to either
invert the patch order, or just make patches 7 and 8 be a single patch.

But we have another problem, please see
commit 590e8ff04bc0182dce97228e5e352d6413d80456. What's the problem
with the current implementation? Why can't we just keep the bit as 1
forever?


> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h      |  2 ++
>  drivers/gpu/drm/i915/i915_reg.h      |  3 +++
>  drivers/gpu/drm/i915/intel_display.c | 50
> ++++++++++++++++++++++++++++++++++++
>  3 files changed, 55 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index 4737a0e..79b9305 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1632,6 +1632,8 @@ struct skl_wm_values {
>  	unsigned dirty_pipes;
>  	/* any WaterMark memory workaround Required */
>  	enum watermark_memory_wa mem_wa;
> +	/* IPC Y-tiled WA related member */
> +	u32 y_plane_mask;
>  	struct skl_ddb_allocation ddb;
>  	uint32_t wm_linetime[I915_MAX_PIPES];
>  	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
> diff --git a/drivers/gpu/drm/i915/i915_reg.h
> b/drivers/gpu/drm/i915/i915_reg.h
> index 75b5b52..210d9b0 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -5658,6 +5658,9 @@ enum {
>  #define PLANE_NV12_BUF_CFG(pipe, plane)	\
>  	_MMIO_PLANE(plane, _PLANE_NV12_BUF_CFG_1(pipe),
> _PLANE_NV12_BUF_CFG_2(pipe))
>  
> +#define CHICKEN_DCPR_1             _MMIO(0x46430)
> +#define IDLE_WAKEMEM_MASK          (1 << 13)
> +
>  /* SKL new cursor registers */
>  #define _CUR_BUF_CFG_A				0x7017c
>  #define _CUR_BUF_CFG_B				0x7117c
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index 5f50f53..ee7f88e 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -12415,6 +12415,8 @@ int intel_plane_atomic_calc_changes(struct
> drm_crtc_state *crtc_state,
>  	bool is_crtc_enabled = crtc_state->active;
>  	bool turn_off, turn_on, visible, was_visible;
>  	struct drm_framebuffer *fb = plane_state->fb;
> +	struct intel_atomic_state *intel_state =
> +				to_intel_atomic_state(plane_state-
> >state);
>  	int ret;
>  
>  	if (INTEL_GEN(dev) >= 9 && plane->type !=
> DRM_PLANE_TYPE_CURSOR) {
> @@ -12501,6 +12503,15 @@ int intel_plane_atomic_calc_changes(struct
> drm_crtc_state *crtc_state,
>  	    !needs_scaling(old_plane_state))
>  		pipe_config->disable_lp_wm = true;
>  
> +	if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
> +			fb->modifier[0] ==
> I915_FORMAT_MOD_Yf_TILED)) {
> +		intel_state->wm_results.y_plane_mask |=
> +						(1 <<
> drm_plane_index(plane));
> +	} else {
> +		intel_state->wm_results.y_plane_mask &=
> +						~(1 <<
> drm_plane_index(plane));
> +	}
> +
>  	return 0;
>  }
>  
> @@ -13963,6 +13974,10 @@ static int intel_atomic_check(struct
> drm_device *dev,
>  	if (ret)
>  		return ret;
>  
> +	/* Copy the Y-tile WA related states */
> +	intel_state->wm_results.y_plane_mask =
> +				dev_priv-
> >wm.skl_results.y_plane_mask;
> +
>  	for_each_crtc_in_state(state, crtc, crtc_state, i) {
>  		struct intel_crtc_state *pipe_config =
>  			to_intel_crtc_state(crtc_state);
> @@ -14204,6 +14219,32 @@ static void intel_update_crtcs(struct
> drm_atomic_state *state,
>  	}
>  }
>  
> +/*
> + * GEN9 IPC WA for Y-tiled
> + */
> +void bxt_set_ipc_wa(struct drm_i915_private *dev_priv, bool enable)
> +{
> +	u32 val;
> +
> +	if (!IS_BROXTON(dev_priv) || !i915.enable_ipc)
> +		return;
> +
> +	val = I915_READ(CHICKEN_DCPR_1);
> +	/*
> +	 * If WA is already enabled or disabled
> +	 * no need to re-enable or disable it.
> +	 */
> +	if ((enable && (val & IDLE_WAKEMEM_MASK)) ||
> +		(!enable && !(val & IDLE_WAKEMEM_MASK)))
> +		return;
> +
> +	if (enable)
> +		val |= IDLE_WAKEMEM_MASK;
> +	else
> +		val &= ~IDLE_WAKEMEM_MASK;
> +	I915_WRITE(CHICKEN_DCPR_1, val);
> +}
> +
>  static void skl_update_crtcs(struct drm_atomic_state *state,
>  			     unsigned int *crtc_vblank_mask)
>  {
> @@ -14219,6 +14260,12 @@ static void skl_update_crtcs(struct
> drm_atomic_state *state,
>  	enum pipe pipe;
>  
>  	/*
> +	 * If IPC WA need to be enabled, enable it now
> +	 */
> +	if (intel_state->wm_results.y_plane_mask)
> +		bxt_set_ipc_wa(dev_priv, true);
> +
> +	/*
>  	 * Whenever the number of active pipes changes, we need to
> make sure we
>  	 * update the pipes in the right order so that their ddb
> allocations
>  	 * never overlap with eachother inbetween CRTC updates.
> Otherwise we'll
> @@ -14261,6 +14308,9 @@ static void skl_update_crtcs(struct
> drm_atomic_state *state,
>  			progress = true;
>  		}
>  	} while (progress);
> +
> +	if (!intel_state->wm_results.y_plane_mask)
> +		bxt_set_ipc_wa(dev_priv, false);
>  }
>  
>  static void intel_atomic_commit_tail(struct drm_atomic_state *state)
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4] drm/i915/bxt: Implement Transition WM
  2016-09-14 11:54   ` [PATCH v4] " Kumar, Mahesh
@ 2016-09-21 20:27     ` Paulo Zanoni
  0 siblings, 0 replies; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-21 20:27 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx

Em Qua, 2016-09-14 às 17:24 +0530, Kumar, Mahesh escreveu:
> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> 
> This patch enables Transition WM for SKL+ platforms.
> 
> Transition WM are used if IPC is enabled, to decide, number of blocks
> to be fetched before reducing the priority of display to fetch from
> memory.
> 
> Changes since v1:
>  - Don't enable transition WM for GEN9 (Art/Paulo)
>  - Rebase to use fixed point 16.16 calculation
>  - Fix the use of selected result from latency level-0
> 
> Changes since v2:
>  - Fix transition WM enable condition
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 60
> ++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 57 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index de2e738..4814a1a 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2862,6 +2862,8 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>  
>  #define SKL_DDB_SIZE		896	/* in blocks */
>  #define BXT_DDB_SIZE		512
> +#define SKL_TRANS_WM_AMOUNT	10	/* tunable value */
> +#define SKL_TRANS_WM_MIN	14
>  #define SKL_SAGV_BLOCK_TIME	30 /* µs */
>  
>  /*
> @@ -3583,6 +3585,48 @@ static uint32_t
> skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  	return pixel_rate;
>  }
>  
> +static void skl_compute_plane_trans_wm(const struct drm_i915_private
> *dev_priv,
> +					struct skl_pipe_wm *pipe_wm,
> +					struct intel_plane_state
> *intel_pstate,
> +					uint32_t selected_result,
> +					uint32_t y_tile_min,
> +					bool y_tile)
> +{
> +	struct drm_plane_state *pstate = &intel_pstate->base;
> +	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
> +	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
> +	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
> +	uint32_t trans_min = 0, trans_offset_blocks;
> +	uint32_t trans_y_tile_min = 0, res_blocks = 0;
> +	uint16_t trans_res_blocks = 0;
> +
> +	/* Keep Trans WM disabled for GEN9 */
> +	if (IS_GEN9(dev_priv))
> +		goto exit;

But the only platforms that call this function are gen9...


> +
> +	trans_min = SKL_TRANS_WM_MIN;
> +
> +	trans_offset_blocks = (trans_min + SKL_TRANS_WM_AMOUNT) <<
> 16;
> +
> +	if (y_tile) {
> +		trans_y_tile_min = 2 * y_tile_min;
> +		res_blocks = max(trans_y_tile_min, selected_result)
> +
> +			trans_offset_blocks;
> +	} else {
> +		res_blocks = selected_result + trans_offset_blocks;
> +	}
> +
> +	trans_res_blocks = DIV_ROUND_UP(res_blocks, 1 << 16) + 1;
> +
> +	/* WA BUG:1938466 add one block for non y-tile planes */
> +	if (!y_tile)
> +		trans_res_blocks += 1;
> +exit:
> +	*out_blocks = trans_res_blocks;
> +	*out_lines = 0;
> +}
> +
> +
>  static int skl_compute_plane_wm(const struct drm_i915_private
> *dev_priv,
>  				struct intel_crtc_state *cstate,
>  				struct intel_plane_state
> *intel_pstate,
> @@ -3709,6 +3753,9 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	*out_blocks = res_blocks;
>  	*out_lines = res_lines;
>  
> +	if (level == 0)
> +		skl_compute_plane_trans_wm(dev_priv, pipe_wm,
> intel_pstate,
> +				selected_result, y_tile_minimum,
> y_tiled);
>  	return 0;
>  }
>  
> @@ -3778,11 +3825,13 @@ skl_compute_linetime_wm(struct
> intel_crtc_state *cstate)
>  }
>  
>  static void skl_compute_transition_wm(struct intel_crtc_state
> *cstate,
> -				      struct skl_wm_level *trans_wm
> /* out */)
> +				      struct skl_wm_level *trans_wm,
> /* out */
> +				      struct skl_ddb_allocation
> *ddb)
>  {
>  	struct drm_crtc *crtc = cstate->base.crtc;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	struct intel_plane *intel_plane;
> +	enum pipe pipe = to_intel_crtc(crtc)->pipe;
>  
>  	if (!cstate->base.active)
>  		return;
> @@ -3790,8 +3839,13 @@ static void skl_compute_transition_wm(struct
> intel_crtc_state *cstate,
>  	/* Until we know more, just disable transition WMs */
>  	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc,
> intel_plane) {
>  		int i = skl_wm_plane_id(intel_plane);
> +		uint16_t plane_ddb = skl_ddb_entry_size(&ddb-
> >plane[pipe][i]);
> +		uint16_t trans_res_blocks = trans_wm-
> >plane_res_b[i];
>  
> -		trans_wm->plane_en[i] = false;
> +		if ((trans_res_blocks > 0) && (trans_res_blocks <=
> plane_ddb))
> +			trans_wm->plane_en[i] = true;
> +		else
> +			trans_wm->plane_en[i] = false;
>  	}
>  }
>  
> @@ -3822,7 +3876,7 @@ static int skl_build_pipe_wm(struct
> intel_crtc_state *cstate,
>  
>  	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
>  
> -	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
> +	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm, ddb);
>  
>  	return 0;
>  }
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size
  2016-09-19 18:24     ` Zanoni, Paulo R
@ 2016-09-22  8:02       ` Mahesh Kumar
  0 siblings, 0 replies; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-22  8:02 UTC (permalink / raw)
  To: Zanoni, Paulo R, intel-gfx

Hi,


On Monday 19 September 2016 11:54 PM, Zanoni, Paulo R wrote:
> Em Seg, 2016-09-19 às 15:19 -0300, Paulo Zanoni escreveu:
>> Em Sex, 2016-09-09 às 13:30 +0530, Kumar, Mahesh escreveu:
>>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>>
>>> This patch make changes to use linetime latency instead of
>>> allocated
>>> DDB size during plane watermark calculation in switch case, This is
>>> required to implement new DDB allocation algorithm.
>>>
>>> In New Algorithm DDB is allocated based on WM values, because of
>>> which
>>> number of DDB blocks will not be available during WM calculation,
>>> So this "linetime latency" is suggested by SV/HW team to use during
>>> switch-case for WM blocks selection.
>> Why is this not part of BSpec? If there's some problem with the
>> current
>> algorithm and we need a new one, why is it not part of our spec?
>>
This is now included in BSpec.... :)
>>>
>>> Changes since v1:
>>>   - Rebase on top of Paulo's patch series
>>>
>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_pm.c | 7 ++++++-
>>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>>> b/drivers/gpu/drm/i915/intel_pm.c
>>> index 3fdec4d..cfd9b7d1 100644
>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>> @@ -3622,10 +3622,15 @@ static int skl_compute_plane_wm(const
>>> struct
>>> drm_i915_private *dev_priv,
>>>   	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>>>   		selected_result = max(method2, y_tile_minimum);
>>>   	} else {
>>> +		uint32_t linetime_us = 0;
>>> +
>>> +		linetime_us = DIV_ROUND_UP(width * 1000,
>>> +				skl_pipe_pixel_rate(cstate));
> Also, shouldn't this be width * 8 * 1000?
we don't need to multiply by 8 here, as this is the time taken for 
single line to be fetched.

thanks,
-Mahesh
>
>>> +
>>>   		if ((cpp * cstate->base.adjusted_mode.crtc_htotal
>>> /
>>> 512 < 1) &&
>>>   		    (plane_bytes_per_line / 512 < 1))
>>>   			selected_result = method2;
>>> -		else if ((ddb_allocation / plane_blocks_per_line)
>>>> =
>>> 1)
>>> +		if (latency >= linetime_us)
>>>   			selected_result = min(method1, method2);
>>>   		else
>>>   			selected_result = method1;
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 6/9] drm/i915/skl+: change WM calc to fixed point 16.16
  2016-09-21 18:32   ` Paulo Zanoni
@ 2016-09-22  9:25     ` Mahesh Kumar
  0 siblings, 0 replies; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-22  9:25 UTC (permalink / raw)
  To: Paulo Zanoni, intel-gfx

Hi,


On Thursday 22 September 2016 12:02 AM, Paulo Zanoni wrote:
> Hi
>
> Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
> First of all, good catch with this patch!
>
>> This patch changes Watermak calculation to fixed point calculation.
>> Problem with current calculation is during plane_blocks_per_line
>> calculation we divide intermediate blocks with min_scanlines and
>> takes floor of the result because of integer operation.
>> hence we end-up assigning less blocks than required. Which leads to
>> flickers.
> This problem is that the patch doesn't only do this. It also tries to
> make the method1 result be a fixed point 16.16, and it also does a
> completely unrelated change with the addition of the y_tiled variable.
> I'd really like of the 3 changes were 3 different patches.
>
agree, y_tiled, change can go separately, this was just, I don't wanted 
to write complete fb->modifier name always :).
But imo making method-1 & method-2 output 16.16 fixed point should go in 
single patch. (plane_blocks_per_line, method_1 & method_2 to 16.16)
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_pm.c | 23 ++++++++++++++---------
>>   1 file changed, 14 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>> b/drivers/gpu/drm/i915/intel_pm.c
>> index 0ec328b..d4390e4 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3530,13 +3530,15 @@ static uint32_t skl_pipe_pixel_rate(const
>> struct intel_crtc_state *config)
>>   */
>>   static uint32_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
>> uint32_t latency)
>>   {
>> -	uint32_t wm_intermediate_val, ret;
>> +	uint64_t wm_intermediate_val;
>> +	uint32_t ret;
>>   
>>   	if (latency == 0)
>>   		return UINT_MAX;
>>   
>> -	wm_intermediate_val = latency * pixel_rate * cpp / 512;
>> -	ret = DIV_ROUND_UP(wm_intermediate_val, 1000);
>> +	wm_intermediate_val = latency * pixel_rate * cpp;
>> +	wm_intermediate_val <<= 16;
>> +	ret = DIV_ROUND_UP_ULL(wm_intermediate_val, 1000 * 512);
>>   
>>   	return ret;
>>   }
> So shouldn't we fix the callers of skl_wm_method1 to take into
> consideration that the value returned is now a fixed point 16.16?
> Sounds like we have a bug here.
caller is now considering it as 16.16 only
>
> Also, having method1 be 16.16 and method2 be a normal integer adds a
> lot to the confusion.
Actually method-2 output is also 16.16 fixed point. Because we are 
multiplying it with plane_blocks_per_line, which is 16.16.
>
>
>> @@ -3605,6 +3607,7 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   	uint16_t *out_blocks = &result->plane_res_b[id];
>>   	uint8_t *out_lines = &result->plane_res_l[id];
>>   	enum watermark_memory_wa mem_wa;
>> +	bool y_tiled = false;
> Please either discard these y_tiled chunks or move them to a separate
> patch. And since we have the y_tiled variable, maybe we'd also like to
> have an x_tiled variable for consistency between the checks?
will move y_tiled to other patch, if we have another variable for 
x_tiled then we need one for none_tiled also.
I'm not sure how useful it would be.
>
>
>>   
>>   	if (latency == 0 || !cstate->base.active || !intel_pstate-
>>> base.visible)
>>   		return 0;
>> @@ -3652,14 +3655,18 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   	plane_bytes_per_line = width * cpp;
>>   	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>>   	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>> +		y_tiled = true;
>>   		plane_blocks_per_line =
>>   		      DIV_ROUND_UP(plane_bytes_per_line *
>> y_min_scanlines, 512);
>> -		plane_blocks_per_line /= y_min_scanlines;
>> +		plane_blocks_per_line = (plane_blocks_per_line <<
>> 16) /
>> +								y_mi
>> n_scanlines;
>>   	} else if (fb->modifier[0] == DRM_FORMAT_MOD_NONE) {
>>   		plane_blocks_per_line =
>> DIV_ROUND_UP(plane_bytes_per_line, 512)
>>   					+ 1;
>> +		plane_blocks_per_line <<= 16;
>>   	} else {
>>   		plane_blocks_per_line =
>> DIV_ROUND_UP(plane_bytes_per_line, 512);
>> +		plane_blocks_per_line <<= 16;
>>   	}
> This is probably a rebase problem, but not all places where
> plane_blocks_per_line are used were fixed to take into consideration
> the new 16.16 format.
>
>
> Now, what I've been thinking is that we completely fail at type-safety
> when we mix these normal integers with the 16.16 integers. Ideally, we
> should find a way to make the compiler complain about this to us, since
> it's too easy to make such mistakes. Can't we try to make the compiler
> help us, such as by defining something like
>
> typedef struct {
> 	uint32_t val;
> } uint_fixed_16_16_t;
sounds good to have different type for fixed_16_16.

Regards,
-Mahesh
>
> and then making the appropriate functions/macros for maniuplating them
> and mixing them with integers? This would help catching problems such
> as the one we have here. Also, other pieces of i915.ko and drm.ko could
> benefit from this.
>
>>   
>>   	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
>> @@ -3670,8 +3677,7 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   
>>   	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
>>   
>> -	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>> -	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>> +	if (y_tiled) {
>>   		selected_result = max(method2, y_tile_minimum);
>>   	} else {
>>   		uint32_t linetime_us = 0;
>> @@ -3688,12 +3694,11 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   			selected_result = method1;
>>   	}
>>   
>> -	res_blocks = selected_result + 1;
>> +	res_blocks = DIV_ROUND_UP(selected_result, 1 << 16) + 1;
>>   	res_lines = DIV_ROUND_UP(selected_result,
>> plane_blocks_per_line);
>>   
>>   	if (level >= 1 && level <= 7) {
>> -		if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>> -		    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>> +		if (y_tiled) {
>>   			res_blocks += y_tile_minimum;
>>   			res_lines += y_min_scanlines;
>>   		} else {

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA
  2016-09-21 20:23   ` Paulo Zanoni
@ 2016-09-22  9:43     ` Mahesh Kumar
  2016-09-22 11:53       ` Paulo Zanoni
  0 siblings, 1 reply; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-22  9:43 UTC (permalink / raw)
  To: Paulo Zanoni, intel-gfx

Hi,


On Thursday 22 September 2016 01:53 AM, Paulo Zanoni wrote:
> Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>
>> It implements the WA to enable IDLE_WAKEMEM bit of CHICKEN_DCPR_1
>> register for Broxton platform. When IPC is enabled & Y-tile is
>> enabled in any of the enabled plane, above bit should be set.
>> Without this WA system observes random hang.
> The previous patch enabled the feature, and now this patch implements a
> missing workaround for the feature. This is not the appropriate order,
> since it can mean that the previous patch introduced a bug that was
> fixed by this patch, and this potentially breaks bisectability. We only
> enable the feature once we know all its workarounds are enabled and the
> feature will Just Work*. So, normally, my suggestion would be to either
> invert the patch order, or just make patches 7 and 8 be a single patch.
I can invert the order of patches, If below point seems OK.
>
> But we have another problem, please see
> commit 590e8ff04bc0182dce97228e5e352d6413d80456. What's the problem
> with the current implementation? Why can't we just keep the bit as 1
> forever?
mentioned commit is making IDLE_WAKEMEM bit always 1b (masking), but it 
is require only in case of y/yf tiling.
need to check for consequences in keeping this bit always Masked.
what is your opinion on this? Should not we implement it only in case of 
y/yf tiling, when we can do it in our framework?

Regards,
-Mahesh
>
>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h      |  2 ++
>>   drivers/gpu/drm/i915/i915_reg.h      |  3 +++
>>   drivers/gpu/drm/i915/intel_display.c | 50
>> ++++++++++++++++++++++++++++++++++++
>>   3 files changed, 55 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 4737a0e..79b9305 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1632,6 +1632,8 @@ struct skl_wm_values {
>>   	unsigned dirty_pipes;
>>   	/* any WaterMark memory workaround Required */
>>   	enum watermark_memory_wa mem_wa;
>> +	/* IPC Y-tiled WA related member */
>> +	u32 y_plane_mask;
>>   	struct skl_ddb_allocation ddb;
>>   	uint32_t wm_linetime[I915_MAX_PIPES];
>>   	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index 75b5b52..210d9b0 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -5658,6 +5658,9 @@ enum {
>>   #define PLANE_NV12_BUF_CFG(pipe, plane)	\
>>   	_MMIO_PLANE(plane, _PLANE_NV12_BUF_CFG_1(pipe),
>> _PLANE_NV12_BUF_CFG_2(pipe))
>>   
>> +#define CHICKEN_DCPR_1             _MMIO(0x46430)
>> +#define IDLE_WAKEMEM_MASK          (1 << 13)
>> +
>>   /* SKL new cursor registers */
>>   #define _CUR_BUF_CFG_A				0x7017c
>>   #define _CUR_BUF_CFG_B				0x7117c
>> diff --git a/drivers/gpu/drm/i915/intel_display.c
>> b/drivers/gpu/drm/i915/intel_display.c
>> index 5f50f53..ee7f88e 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -12415,6 +12415,8 @@ int intel_plane_atomic_calc_changes(struct
>> drm_crtc_state *crtc_state,
>>   	bool is_crtc_enabled = crtc_state->active;
>>   	bool turn_off, turn_on, visible, was_visible;
>>   	struct drm_framebuffer *fb = plane_state->fb;
>> +	struct intel_atomic_state *intel_state =
>> +				to_intel_atomic_state(plane_state-
>>> state);
>>   	int ret;
>>   
>>   	if (INTEL_GEN(dev) >= 9 && plane->type !=
>> DRM_PLANE_TYPE_CURSOR) {
>> @@ -12501,6 +12503,15 @@ int intel_plane_atomic_calc_changes(struct
>> drm_crtc_state *crtc_state,
>>   	    !needs_scaling(old_plane_state))
>>   		pipe_config->disable_lp_wm = true;
>>   
>> +	if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>> +			fb->modifier[0] ==
>> I915_FORMAT_MOD_Yf_TILED)) {
>> +		intel_state->wm_results.y_plane_mask |=
>> +						(1 <<
>> drm_plane_index(plane));
>> +	} else {
>> +		intel_state->wm_results.y_plane_mask &=
>> +						~(1 <<
>> drm_plane_index(plane));
>> +	}
>> +
>>   	return 0;
>>   }
>>   
>> @@ -13963,6 +13974,10 @@ static int intel_atomic_check(struct
>> drm_device *dev,
>>   	if (ret)
>>   		return ret;
>>   
>> +	/* Copy the Y-tile WA related states */
>> +	intel_state->wm_results.y_plane_mask =
>> +				dev_priv-
>>> wm.skl_results.y_plane_mask;
>> +
>>   	for_each_crtc_in_state(state, crtc, crtc_state, i) {
>>   		struct intel_crtc_state *pipe_config =
>>   			to_intel_crtc_state(crtc_state);
>> @@ -14204,6 +14219,32 @@ static void intel_update_crtcs(struct
>> drm_atomic_state *state,
>>   	}
>>   }
>>   
>> +/*
>> + * GEN9 IPC WA for Y-tiled
>> + */
>> +void bxt_set_ipc_wa(struct drm_i915_private *dev_priv, bool enable)
>> +{
>> +	u32 val;
>> +
>> +	if (!IS_BROXTON(dev_priv) || !i915.enable_ipc)
>> +		return;
>> +
>> +	val = I915_READ(CHICKEN_DCPR_1);
>> +	/*
>> +	 * If WA is already enabled or disabled
>> +	 * no need to re-enable or disable it.
>> +	 */
>> +	if ((enable && (val & IDLE_WAKEMEM_MASK)) ||
>> +		(!enable && !(val & IDLE_WAKEMEM_MASK)))
>> +		return;
>> +
>> +	if (enable)
>> +		val |= IDLE_WAKEMEM_MASK;
>> +	else
>> +		val &= ~IDLE_WAKEMEM_MASK;
>> +	I915_WRITE(CHICKEN_DCPR_1, val);
>> +}
>> +
>>   static void skl_update_crtcs(struct drm_atomic_state *state,
>>   			     unsigned int *crtc_vblank_mask)
>>   {
>> @@ -14219,6 +14260,12 @@ static void skl_update_crtcs(struct
>> drm_atomic_state *state,
>>   	enum pipe pipe;
>>   
>>   	/*
>> +	 * If IPC WA need to be enabled, enable it now
>> +	 */
>> +	if (intel_state->wm_results.y_plane_mask)
>> +		bxt_set_ipc_wa(dev_priv, true);
>> +
>> +	/*
>>   	 * Whenever the number of active pipes changes, we need to
>> make sure we
>>   	 * update the pipes in the right order so that their ddb
>> allocations
>>   	 * never overlap with eachother inbetween CRTC updates.
>> Otherwise we'll
>> @@ -14261,6 +14308,9 @@ static void skl_update_crtcs(struct
>> drm_atomic_state *state,
>>   			progress = true;
>>   		}
>>   	} while (progress);
>> +
>> +	if (!intel_state->wm_results.y_plane_mask)
>> +		bxt_set_ipc_wa(dev_priv, false);
>>   }
>>   
>>   static void intel_atomic_commit_tail(struct drm_atomic_state *state)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 7/9] drm/i915/bxt: Enable IPC support
  2016-09-21 20:06   ` Paulo Zanoni
@ 2016-09-22 10:14     ` Mahesh Kumar
  0 siblings, 0 replies; 40+ messages in thread
From: Mahesh Kumar @ 2016-09-22 10:14 UTC (permalink / raw)
  To: Paulo Zanoni, intel-gfx

Hi,


On Thursday 22 September 2016 01:36 AM, Paulo Zanoni wrote:
> Hi
>
> Lots of nitpicking in my review. Feel free to disagree with them.
>
>
> Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
>> From: Mahesh Kumar <mahesh1.kumar@intel.com>
>>
>> This patch adds IPC support for platforms. This patch enables IPC
>> only for BXT platform as for SKL recommendation is to keep is
>> disabled
> But don't we want it for KBL too?
I didn't check for KBL, it may be require for KBL also.
>
> Also, can you please elaborate the commit message a little bit more?
> What exactly does this feature do? What do we gain with it? Are there
> any downsides?
Will add more description about the feature in the commit.
>> This patch also adds kernel command-line option to enable/disable
>> the IPC if required.
> Any reason on why someone would want to do this besides for debugging?
> I'm not sure if every single little feature like this one-bit one-
> platform feature requires an i915 option, so unless there's a strong
> reason, we can just omit the option.\
Main intention of giving one command line option was to keep IPC disable 
for debugging only.
I can remove the command-line option, But for debugging we need some 
option to control the enable/disable.
What would you suggest, which I can use instead of command-line option? 
OR we can totally ignore that?
>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
>>   drivers/gpu/drm/i915/i915_params.c |  5 +++++
>>   drivers/gpu/drm/i915/i915_params.h |  1 +
>>   drivers/gpu/drm/i915/i915_reg.h    |  1 +
>>   drivers/gpu/drm/i915/intel_drv.h   |  1 +
>>   drivers/gpu/drm/i915/intel_pm.c    | 21 +++++++++++++++++++++
>>   6 files changed, 34 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index 0a4f18d..22d84e6 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -1335,6 +1335,11 @@ int i915_driver_load(struct pci_dev *pdev,
>> const struct pci_device_id *ent)
>>   
>>   	intel_runtime_pm_enable(dev_priv);
>>   
>> +	/*
>> +	 * Now enable the IPC for supported platforms
>> +	 */
>> +	intel_enable_ipc(dev_priv);
> The comment is unnecessary: it basically only says what the function
> name already says.
>
> Also, I think it makes more sense to move this to intel_init_pm(). As a
> bonus, you'll be able to make the function static. Or even just make
> the IPC code be part of init_clock_gating() since the whole feature is
> just "enable this bit".
agree, keeping this as part of intel_init_pm seems more relevant,
But we have faced some issue because of enabling it early in intel_init_pm,
If BIOS is not enabling the IPC & we only are going to enable first, 
then BIOS was not taking care of programming transition WM
& since we were enabling it early before sanitizing WMs (during 
pm_init), this was causing flickers/underrun.
Didn't tried it recently, but as transition WM are disabled now we 
should not face that issue for now*.

Thanks,
-Mahesh
>
>
>> +
>>   	/* Everything is in place, we can now relax! */
>>   	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
>>   		 driver.name, driver.major, driver.minor,
>> driver.patchlevel,
>> diff --git a/drivers/gpu/drm/i915/i915_params.c
>> b/drivers/gpu/drm/i915/i915_params.c
>> index 768ad89..cc41b8d 100644
>> --- a/drivers/gpu/drm/i915/i915_params.c
>> +++ b/drivers/gpu/drm/i915/i915_params.c
>> @@ -62,6 +62,7 @@ struct i915_params i915 __read_mostly = {
>>   	.inject_load_failure = 0,
>>   	.enable_dpcd_backlight = false,
>>   	.enable_gvt = false,
>> +	.enable_ipc = true,
>>   };
>>   
>>   module_param_named(modeset, i915.modeset, int, 0400);
>> @@ -233,3 +234,7 @@ MODULE_PARM_DESC(enable_dpcd_backlight,
>>   module_param_named(enable_gvt, i915.enable_gvt, bool, 0400);
>>   MODULE_PARM_DESC(enable_gvt,
>>   	"Enable support for Intel GVT-g graphics virtualization host
>> support(default:false)");
>> +
>> +module_param_named(enable_ipc, i915.enable_ipc, bool, 0400);
>> +MODULE_PARM_DESC(enable_ipc,
>> +		"Enable Isochronous Priority Control
>> (default:true)");
>> diff --git a/drivers/gpu/drm/i915/i915_params.h
>> b/drivers/gpu/drm/i915/i915_params.h
>> index 3a0dd78..f99b9b9 100644
>> --- a/drivers/gpu/drm/i915/i915_params.h
>> +++ b/drivers/gpu/drm/i915/i915_params.h
>> @@ -65,6 +65,7 @@ struct i915_params {
>>   	bool enable_dp_mst;
>>   	bool enable_dpcd_backlight;
>>   	bool enable_gvt;
>> +	bool enable_ipc;
>>   };
>>   
>>   extern struct i915_params i915 __read_mostly;
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> b/drivers/gpu/drm/i915/i915_reg.h
>> index b38445c..75b5b52 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -6139,6 +6139,7 @@ enum {
>>   #define  DISP_FBC_WM_DIS		(1<<15)
>>   #define DISP_ARB_CTL2	_MMIO(0x45004)
>>   #define  DISP_DATA_PARTITION_5_6	(1<<6)
>> +#define  DISP_IPC_ENABLE		(1<<3)
>>   #define DBUF_CTL	_MMIO(0x45008)
>>   #define  DBUF_POWER_REQUEST		(1<<31)
>>   #define  DBUF_POWER_STATE		(1<<30)
>> diff --git a/drivers/gpu/drm/i915/intel_drv.h
>> b/drivers/gpu/drm/i915/intel_drv.h
>> index 66cb46c..56c8ac8 100644
>> --- a/drivers/gpu/drm/i915/intel_drv.h
>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>> @@ -1753,6 +1753,7 @@ void skl_write_plane_wm(struct intel_crtc
>> *intel_crtc,
>>   uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state
>> *pipe_config);
>>   bool ilk_disable_lp_wm(struct drm_device *dev);
>>   int sanitize_rc6_option(struct drm_i915_private *dev_priv, int
>> enable_rc6);
>> +void intel_enable_ipc(struct drm_i915_private *dev_priv);
>>   static inline int intel_enable_rc6(void)
>>   {
>>   	return i915.enable_rc6;
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>> b/drivers/gpu/drm/i915/intel_pm.c
>> index d4390e4..8d0037c 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -4793,6 +4793,27 @@ void intel_update_watermarks(struct drm_crtc
>> *crtc)
>>   }
>>   
>>   /*
>> + * enable IPC for Supported Platforms
>> + */
> This comment also doesn't say very much. Also, s/enable/Enable/ if you
> keep it.
>
>> +void intel_enable_ipc(struct drm_i915_private *dev_priv)
>> +{
>> +	u32 val;
>> +
>> +	/* enable IPC only for Broxton for now*/
> Also not a useful comment, unless you explain why.
>
>> +	if (!IS_BROXTON(dev_priv))
>> +		return;
>> +
>> +	val = I915_READ(DISP_ARB_CTL2);
>> +
>> +	if (i915.enable_ipc)
>> +		val |= DISP_IPC_ENABLE;
>> +	else
>> +		val &= ~DISP_IPC_ENABLE;
>> +
>> +	I915_WRITE(DISP_ARB_CTL2, val);
>> +}
>> +
>> +/*
>>    * Lock protecting IPS related data structures
>>    */
>>   DEFINE_SPINLOCK(mchdev_lock);

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA
  2016-09-22  9:43     ` Mahesh Kumar
@ 2016-09-22 11:53       ` Paulo Zanoni
  0 siblings, 0 replies; 40+ messages in thread
From: Paulo Zanoni @ 2016-09-22 11:53 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx

Em Qui, 2016-09-22 às 15:13 +0530, Mahesh Kumar escreveu:
> Hi,
> 
> 
> On Thursday 22 September 2016 01:53 AM, Paulo Zanoni wrote:
> > 
> > Em Sex, 2016-09-09 às 13:31 +0530, Kumar, Mahesh escreveu:
> > > 
> > > From: Mahesh Kumar <mahesh1.kumar@intel.com>
> > > 
> > > It implements the WA to enable IDLE_WAKEMEM bit of CHICKEN_DCPR_1
> > > register for Broxton platform. When IPC is enabled & Y-tile is
> > > enabled in any of the enabled plane, above bit should be set.
> > > Without this WA system observes random hang.
> > The previous patch enabled the feature, and now this patch
> > implements a
> > missing workaround for the feature. This is not the appropriate
> > order,
> > since it can mean that the previous patch introduced a bug that was
> > fixed by this patch, and this potentially breaks bisectability. We
> > only
> > enable the feature once we know all its workarounds are enabled and
> > the
> > feature will Just Work*. So, normally, my suggestion would be to
> > either
> > invert the patch order, or just make patches 7 and 8 be a single
> > patch.
> I can invert the order of patches, If below point seems OK.
> > 
> > 
> > But we have another problem, please see
> > commit 590e8ff04bc0182dce97228e5e352d6413d80456. What's the problem
> > with the current implementation? Why can't we just keep the bit as
> > 1
> > forever?
> mentioned commit is making IDLE_WAKEMEM bit always 1b (masking), but
> it 
> is require only in case of y/yf tiling.
> need to check for consequences in keeping this bit always Masked.
> what is your opinion on this? Should not we implement it only in case
> of 
> y/yf tiling, when we can do it in our framework?

If there are no bad consequences for keeping the bit as 1 even when
we're not using y/yf tiling, then the simpler solution wins IMHO.

> 
> Regards,
> -Mahesh
> > 
> > 
> > 
> > > 
> > > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/i915_drv.h      |  2 ++
> > >   drivers/gpu/drm/i915/i915_reg.h      |  3 +++
> > >   drivers/gpu/drm/i915/intel_display.c | 50
> > > ++++++++++++++++++++++++++++++++++++
> > >   3 files changed, 55 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 4737a0e..79b9305 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1632,6 +1632,8 @@ struct skl_wm_values {
> > >   	unsigned dirty_pipes;
> > >   	/* any WaterMark memory workaround Required */
> > >   	enum watermark_memory_wa mem_wa;
> > > +	/* IPC Y-tiled WA related member */
> > > +	u32 y_plane_mask;
> > >   	struct skl_ddb_allocation ddb;
> > >   	uint32_t wm_linetime[I915_MAX_PIPES];
> > >   	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
> > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > b/drivers/gpu/drm/i915/i915_reg.h
> > > index 75b5b52..210d9b0 100644
> > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > @@ -5658,6 +5658,9 @@ enum {
> > >   #define PLANE_NV12_BUF_CFG(pipe, plane)	\
> > >   	_MMIO_PLANE(plane, _PLANE_NV12_BUF_CFG_1(pipe),
> > > _PLANE_NV12_BUF_CFG_2(pipe))
> > >   
> > > +#define CHICKEN_DCPR_1             _MMIO(0x46430)
> > > +#define IDLE_WAKEMEM_MASK          (1 << 13)
> > > +
> > >   /* SKL new cursor registers */
> > >   #define _CUR_BUF_CFG_A				0x7017c
> > >   #define _CUR_BUF_CFG_B				0x7117c
> > > diff --git a/drivers/gpu/drm/i915/intel_display.c
> > > b/drivers/gpu/drm/i915/intel_display.c
> > > index 5f50f53..ee7f88e 100644
> > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > @@ -12415,6 +12415,8 @@ int
> > > intel_plane_atomic_calc_changes(struct
> > > drm_crtc_state *crtc_state,
> > >   	bool is_crtc_enabled = crtc_state->active;
> > >   	bool turn_off, turn_on, visible, was_visible;
> > >   	struct drm_framebuffer *fb = plane_state->fb;
> > > +	struct intel_atomic_state *intel_state =
> > > +				to_intel_atomic_state(plane_stat
> > > e-
> > > > 
> > > > state);
> > >   	int ret;
> > >   
> > >   	if (INTEL_GEN(dev) >= 9 && plane->type !=
> > > DRM_PLANE_TYPE_CURSOR) {
> > > @@ -12501,6 +12503,15 @@ int
> > > intel_plane_atomic_calc_changes(struct
> > > drm_crtc_state *crtc_state,
> > >   	    !needs_scaling(old_plane_state))
> > >   		pipe_config->disable_lp_wm = true;
> > >   
> > > +	if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
> > > +			fb->modifier[0] ==
> > > I915_FORMAT_MOD_Yf_TILED)) {
> > > +		intel_state->wm_results.y_plane_mask |=
> > > +						(1 <<
> > > drm_plane_index(plane));
> > > +	} else {
> > > +		intel_state->wm_results.y_plane_mask &=
> > > +						~(1 <<
> > > drm_plane_index(plane));
> > > +	}
> > > +
> > >   	return 0;
> > >   }
> > >   
> > > @@ -13963,6 +13974,10 @@ static int intel_atomic_check(struct
> > > drm_device *dev,
> > >   	if (ret)
> > >   		return ret;
> > >   
> > > +	/* Copy the Y-tile WA related states */
> > > +	intel_state->wm_results.y_plane_mask =
> > > +				dev_priv-
> > > > 
> > > > wm.skl_results.y_plane_mask;
> > > +
> > >   	for_each_crtc_in_state(state, crtc, crtc_state, i) {
> > >   		struct intel_crtc_state *pipe_config =
> > >   			to_intel_crtc_state(crtc_state);
> > > @@ -14204,6 +14219,32 @@ static void intel_update_crtcs(struct
> > > drm_atomic_state *state,
> > >   	}
> > >   }
> > >   
> > > +/*
> > > + * GEN9 IPC WA for Y-tiled
> > > + */
> > > +void bxt_set_ipc_wa(struct drm_i915_private *dev_priv, bool
> > > enable)
> > > +{
> > > +	u32 val;
> > > +
> > > +	if (!IS_BROXTON(dev_priv) || !i915.enable_ipc)
> > > +		return;
> > > +
> > > +	val = I915_READ(CHICKEN_DCPR_1);
> > > +	/*
> > > +	 * If WA is already enabled or disabled
> > > +	 * no need to re-enable or disable it.
> > > +	 */
> > > +	if ((enable && (val & IDLE_WAKEMEM_MASK)) ||
> > > +		(!enable && !(val & IDLE_WAKEMEM_MASK)))
> > > +		return;
> > > +
> > > +	if (enable)
> > > +		val |= IDLE_WAKEMEM_MASK;
> > > +	else
> > > +		val &= ~IDLE_WAKEMEM_MASK;
> > > +	I915_WRITE(CHICKEN_DCPR_1, val);
> > > +}
> > > +
> > >   static void skl_update_crtcs(struct drm_atomic_state *state,
> > >   			     unsigned int *crtc_vblank_mask)
> > >   {
> > > @@ -14219,6 +14260,12 @@ static void skl_update_crtcs(struct
> > > drm_atomic_state *state,
> > >   	enum pipe pipe;
> > >   
> > >   	/*
> > > +	 * If IPC WA need to be enabled, enable it now
> > > +	 */
> > > +	if (intel_state->wm_results.y_plane_mask)
> > > +		bxt_set_ipc_wa(dev_priv, true);
> > > +
> > > +	/*
> > >   	 * Whenever the number of active pipes changes, we need
> > > to
> > > make sure we
> > >   	 * update the pipes in the right order so that their
> > > ddb
> > > allocations
> > >   	 * never overlap with eachother inbetween CRTC updates.
> > > Otherwise we'll
> > > @@ -14261,6 +14308,9 @@ static void skl_update_crtcs(struct
> > > drm_atomic_state *state,
> > >   			progress = true;
> > >   		}
> > >   	} while (progress);
> > > +
> > > +	if (!intel_state->wm_results.y_plane_mask)
> > > +		bxt_set_ipc_wa(dev_priv, false);
> > >   }
> > >   
> > >   static void intel_atomic_commit_tail(struct drm_atomic_state
> > > *state)
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2016-09-22 14:12 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-09  8:00 [PATCH v3 0/9] New DDB Algo and WM fixes Kumar, Mahesh
2016-09-09  8:00 ` [PATCH v3 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
2016-09-20 12:17   ` Paulo Zanoni
2016-09-21 13:48     ` Mahesh Kumar
2016-09-21 13:59       ` Paulo Zanoni
2016-09-09  8:00 ` [PATCH v3 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
2016-09-12  8:56   ` Maarten Lankhorst
2016-09-19 18:19   ` Paulo Zanoni
2016-09-19 18:24     ` Zanoni, Paulo R
2016-09-22  8:02       ` Mahesh Kumar
2016-09-09  8:01 ` [PATCH v3 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
2016-09-12 10:50   ` Maarten Lankhorst
2016-09-12 13:11   ` Maarten Lankhorst
2016-09-13  6:21     ` Mahesh Kumar
2016-09-13 12:15     ` [PATCH v4] " Kumar, Mahesh
2016-09-13 12:40       ` Maarten Lankhorst
2016-09-14 12:36         ` Mahesh Kumar
2016-09-19  8:27           ` Maarten Lankhorst
2016-09-19  9:55           ` Maarten Lankhorst
2016-09-21 13:03             ` Mahesh Kumar
2016-09-09  8:01 ` [PATCH v3 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
2016-09-16  8:02   ` Pandiyan, Dhinakaran
2016-09-16 11:35     ` Mahesh Kumar
2016-09-19 20:41   ` Paulo Zanoni
2016-09-09  8:01 ` [PATCH v3 5/9] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
2016-09-12 11:02   ` Maarten Lankhorst
2016-09-12 11:12     ` Maarten Lankhorst
2016-09-09  8:01 ` [PATCH v3 6/9] drm/i915/skl+: change WM calc to fixed point 16.16 Kumar, Mahesh
2016-09-21 18:32   ` Paulo Zanoni
2016-09-22  9:25     ` Mahesh Kumar
2016-09-09  8:01 ` [PATCH v3 7/9] drm/i915/bxt: Enable IPC support Kumar, Mahesh
2016-09-21 20:06   ` Paulo Zanoni
2016-09-22 10:14     ` Mahesh Kumar
2016-09-09  8:01 ` [PATCH v3 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA Kumar, Mahesh
2016-09-21 20:23   ` Paulo Zanoni
2016-09-22  9:43     ` Mahesh Kumar
2016-09-22 11:53       ` Paulo Zanoni
2016-09-09  8:01 ` [PATCH v3 9/9] drm/i915/bxt: Implement Transition WM Kumar, Mahesh
2016-09-14 11:54   ` [PATCH v4] " Kumar, Mahesh
2016-09-21 20:27     ` Paulo Zanoni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.