All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] Implement New DDB allocation algorithm
@ 2016-08-29 12:35 Kumar, Mahesh
  2016-08-29 12:35 ` [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active Kumar, Mahesh
                   ` (8 more replies)
  0 siblings, 9 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx

This series implements new DDB allocation algorithm to solve the cases,
where we have sufficient DDB available to enable multiple planes, But
due to the current algorithm not dividing it properly among planes, we
end-up failing the flip.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.
This series also implements Transition Watermarks and Gen-9 related
arbitrated display bandwidth Workarounds.

There are two steps in current WM programming.

1. Calculate minimum number of blocks required  for a WM level to be
enabled. For 1440x2560 panel we need 41 blocks as minimum number of
blocks to enable WM0. This is the step which doesn't use vertical size.
It only depends on Pipe drain rate and plane horizontal size as per the
current Bspec algorithm.
So all the plane below have minimum  number of blocks required to enable
WM0 as 41
    Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
    Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41

2. Number of blocks allotted by the driver
    Driver allocates  12 for Plane 3   &  16 for plane 4

    Total Dbuf Available = 508
    Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476
    allocate minimum blocks for each plane 8 * 4 = 32
    remaining blocks = 476 - 32 = 444
    Relative Data Rate for Planes
       Plane 1  =  1440 * 2560 * 3  =  11059200
       Plane 2  =  1440 * 2560 * 3  =  11059200
       Plane 3  =  1440 * 48   * 3  =  207360
       Plane 4  =  1440 * 96   * 3  =  414720
       Total Relative BW            =  22740480

-   Allocate Buffer
    buffer allocation = (Plane relative data rate / total data rate)
		    * total remaming DDB + minimum plane DDB
     Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
     Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16

In this case it forced driver to disable Plane 3 & 4. Driver need to use
more efficient way to allocate buffer that is optimum for power.

New Algorithm suggested by HW team is:

1. Calculate minimum buffer allocations for each plane and for each
    watermark level

2. Add minimum buffer allocations required for enabling WM7
    for all the planes

Level 0 =  41 + 41 + 41 + 41  = 164
Level 1 =  42 + 42 + 42 + 42  = 168
Level 2 =  42 + 42 + 42 + 42  = 168
Level 3 =  94 + 94 + 94 + 94 =  376
Level 4 =  94 + 94 + 94 + 94 =  376
Level 5 =  94 + 94 + 94 + 94 =  376
Level 6 =  94 + 94 + 94 + 94 =  376
Level 7 =  94 + 94 + 94 + 94 =  376

3. Check to see how many buffer allocation are left and enable
the best case. In this case since we have 476 blocks we can enable
WM0-7 on all 4 planes.
Let's say if we have only 200 block available then the best cases
allocation is to enable Level2 which requires 168 blocks

Kumar, Mahesh (7):
  drm/i915/hsw+: set intel_crtc active once pipe is active
  drm/i915/skl+: use linetime latency instead of ddb size
  drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm)
    functions
  drm/i915/skl: New ddb allocation algorithm
  drm/i915: Decode system memory bandwidth
  FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM
  drm/i915/gen9: WM memory bandwidth related workaround

 drivers/gpu/drm/i915/i915_drv.c      |  96 +++++++++
 drivers/gpu/drm/i915/i915_drv.h      |  27 +++
 drivers/gpu/drm/i915/i915_reg.h      |  25 +++
 drivers/gpu/drm/i915/intel_display.c |   4 +-
 drivers/gpu/drm/i915/intel_drv.h     |  11 ++
 drivers/gpu/drm/i915/intel_pm.c      | 366 +++++++++++++++++++++++++++--------
 6 files changed, 447 insertions(+), 82 deletions(-)

-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
@ 2016-08-29 12:35 ` Kumar, Mahesh
  2016-08-30 19:18   ` Zanoni, Paulo R
  2016-08-31 10:51   ` Maarten Lankhorst
  2016-08-29 12:35 ` [PATCH 2/7] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kumar

Set the intel_crtc->active flag after pipe/crtc is actually active in
haswell_crtc_enable function.

Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index e4e6141..7258883 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -5427,8 +5427,6 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 
 	intel_color_set_csc(&pipe_config->base);
 
-	intel_crtc->active = true;
-
 	if (intel_crtc->config->has_pch_encoder)
 		intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false);
 	else
@@ -5475,6 +5473,8 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 	assert_vblank_disabled(crtc);
 	drm_crtc_vblank_on(crtc);
 
+	intel_crtc->active = true;
+
 	intel_encoders_enable(crtc, pipe_config, old_state);
 
 	if (intel_crtc->config->has_pch_encoder) {
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/7] drm/i915/skl+: use linetime latency instead of ddb size
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
  2016-08-29 12:35 ` [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active Kumar, Mahesh
@ 2016-08-29 12:35 ` Kumar, Mahesh
  2016-08-29 12:35 ` [PATCH 3/7] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kumar

This patch make changes to use linetime latency instead of allocated
DDB size during plane watermark calculation in switch case, This is
required to implement new DDB allocation algorithm.

In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to use during
switch-case for WM blocks selection.

Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index aba6fd0..b2f17eb 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3554,6 +3554,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint8_t cpp;
 	uint32_t width = 0, height = 0;
 	uint32_t plane_pixel_rate;
+	uint32_t linetime_us = 0;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
 		*enabled = false;
@@ -3603,7 +3604,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		y_tile_minimum = plane_blocks_per_line * min_scanlines;
 		selected_result = max(method2, y_tile_minimum);
 	} else {
-		if ((ddb_allocation / plane_blocks_per_line) >= 1)
+		linetime_us = DIV_ROUND_UP(width * 1000,
+				skl_pipe_pixel_rate(cstate));
+		if (latency >= linetime_us)
 			selected_result = min(method1, method2);
 		else
 			selected_result = method1;
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/7] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
  2016-08-29 12:35 ` [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active Kumar, Mahesh
  2016-08-29 12:35 ` [PATCH 2/7] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
@ 2016-08-29 12:35 ` Kumar, Mahesh
  2016-08-29 12:35 ` [PATCH 4/7] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kumar

This patch make use of plane_wm variable directly instead of passing
skl_plane_wm struct. this way reduces number of argument requirement
in watermark calculation functions.

It also gives more freedom of decision making to implement Bspec WM
workarounds.

Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b2f17eb..5fa02cb 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3540,9 +3540,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_plane_state *intel_pstate,
 				uint16_t ddb_allocation,
 				int level,
-				uint16_t *out_blocks, /* out */
-				uint8_t *out_lines, /* out */
-				bool *enabled /* out */)
+				struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
@@ -3555,6 +3553,11 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint32_t width = 0, height = 0;
 	uint32_t plane_pixel_rate;
 	uint32_t linetime_us = 0;
+	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
+	struct skl_wm_level *result = &pipe_wm->wm[level];
+	uint16_t *out_blocks = &result->plane_res_b[id];
+	uint8_t *out_lines = &result->plane_res_l[id];
+	bool *enabled = &result->plane_en[id];
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
 		*enabled = false;
@@ -3655,7 +3658,7 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
-		     struct skl_wm_level *result)
+		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
@@ -3666,12 +3669,6 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
-	/*
-	 * We'll only calculate watermarks for planes that are actually
-	 * enabled, so make sure all other planes are set as disabled.
-	 */
-	memset(result, 0, sizeof(*result));
-
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
@@ -3709,9 +3706,7 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 					   intel_pstate,
 					   ddb_blocks,
 					   level,
-					   &result->plane_res_b[i],
-					   &result->plane_res_l[i],
-					   &result->plane_en[i]);
+					   pipe_wm);
 		if (ret)
 			return ret;
 	}
@@ -3759,9 +3754,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	int level, max_level = ilk_wm_max_level(dev);
 	int ret;
 
+	/*
+	 * We'll only calculate watermarks for planes that are actually
+	 * enabled, so make sure all other planes are set as disabled.
+	 */
+	memset(pipe_wm, 0, sizeof(*pipe_wm));
+
 	for (level = 0; level <= max_level; level++) {
 		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-					   level, &pipe_wm->wm[level]);
+					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/7] drm/i915/skl: New ddb allocation algorithm
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
                   ` (2 preceding siblings ...)
  2016-08-29 12:35 ` [PATCH 3/7] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
@ 2016-08-29 12:35 ` Kumar, Mahesh
  2016-08-31 13:38   ` Maarten Lankhorst
  2016-08-29 12:35 ` [PATCH 5/7] drm/i915: Decode system memory bandwidth Kumar, Mahesh
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kumar

This patch implements new DDB allocation algorithm as per HW team
suggestion. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.

Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 129 +++++++++++++++++++++-------------------
 1 file changed, 67 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 5fa02cb..9f69050 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3331,6 +3331,7 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
+		      struct skl_pipe_wm *pipe_wm,
 		      struct skl_ddb_allocation *ddb /* out */)
 {
 	struct drm_atomic_state *state = cstate->base.state;
@@ -3346,8 +3347,11 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	uint16_t *minimum = cstate->wm.skl.minimum_blocks;
 	uint16_t *y_minimum = cstate->wm.skl.minimum_y_blocks;
 	unsigned int total_data_rate;
+	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb = 0;
 	int num_active;
-	int id, i;
+	int max_level, level;
+	int id, i, ret = 0;
 
 	if (WARN_ON(!state))
 		return 0;
@@ -3363,6 +3367,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	alloc_size = skl_ddb_entry_size(alloc);
 	if (alloc_size == 0) {
 		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+		memset(ddb->y_plane[pipe], 0, sizeof(ddb->y_plane[pipe]));
 		return 0;
 	}
 
@@ -3396,19 +3401,42 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	}
 
 	for (i = 0; i < PLANE_CURSOR; i++) {
-		alloc_size -= minimum[i];
-		alloc_size -= y_minimum[i];
+		total_min_blocks += minimum[i];
+		total_min_blocks += y_minimum[i];
 	}
 
-	/*
-	 * 2. Distribute the remaining space in proportion to the amount of
-	 * data each plane needs to fetch from memory.
-	 *
-	 * FIXME: we may not allocate every single block here.
-	 */
+	for (level = ilk_wm_max_level(dev); level >= 0; level--) {
+		total_level_ddb = 0;
+		for (i = 0; i < PLANE_CURSOR; i++) {
+			/*
+			 * TODO: We should calculate watermark values for Y/UV
+			 * plane both in case of NV12 format and use both values
+			 * for ddb calculation, As NV12 is disabled as of now.
+			 * using only single plane value here.
+			 */
+			uint16_t min = minimum[i] + y_minimum[i];
+			uint16_t plane_level_ddb_wm =
+				max(pipe_wm->wm[level].plane_res_b[i], min);
+			total_level_ddb += plane_level_ddb_wm;
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		ret = -EINVAL;
+		goto exit;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	total_data_rate = skl_get_total_relative_data_rate(cstate);
 	if (total_data_rate == 0)
-		return 0;
+		goto exit;
 
 	start = alloc->start;
 	for_each_intel_plane_on_crtc(dev, intel_crtc, intel_plane) {
@@ -3423,7 +3451,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[id];
+		plane_blocks = max(pipe_wm->wm[max_level].plane_res_b[id],
+					minimum[id]);
 		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
 					total_data_rate);
 
@@ -3437,6 +3466,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we start calculating watermark values for Y/UV
+		 * plane both consider it for initial allowed wm blocks.
 		 */
 		y_data_rate = cstate->wm.skl.plane_y_data_rate[id];
 
@@ -3450,9 +3481,22 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		}
 
 		start += y_plane_blocks;
+
+		/*
+		 * Now enable all levels in WM structure which can be enabled
+		 * using current DDB allocation
+		 */
+		for (i = ilk_wm_max_level(dev); i > 0; i--) {
+			if (i > max_level || pipe_wm->wm[i].plane_res_l[id] > 31
+					|| pipe_wm->wm[i].plane_res_b[id] == 0)
+				pipe_wm->wm[i].plane_en[id] = false;
+			else
+				pipe_wm->wm[i].plane_en[id] = true;
+		}
 	}
 
-	return 0;
+exit:
+	return ret;
 }
 
 static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
@@ -3538,7 +3582,6 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				struct skl_pipe_wm *pipe_wm)
 {
@@ -3557,12 +3600,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
-	bool *enabled = &result->plane_en[id];
 
-	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
-		*enabled = false;
+	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
-	}
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
@@ -3626,54 +3666,27 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 			res_blocks++;
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("Plane %d.%d: blocks required = %u/%u, lines required = %u/31\n",
-				      to_intel_crtc(cstate->base.crtc)->pipe,
-				      skl_wm_plane_id(to_intel_plane(pstate->plane)),
-				      res_blocks, ddb_allocation, res_lines);
-
-			return -EINVAL;
-		}
-	}
-
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
 		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
 	struct drm_plane *plane;
 	struct intel_plane *intel_plane;
 	struct intel_plane_state *intel_pstate;
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
-		int i = skl_wm_plane_id(intel_plane);
-
 		plane = &intel_plane->base;
 		intel_pstate = NULL;
 		if (state)
@@ -3699,12 +3712,9 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 
 		WARN_ON(!intel_pstate->base.fb);
 
-		ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][i]);
-
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   pipe_wm);
 		if (ret)
@@ -3761,11 +3771,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	memset(pipe_wm, 0, sizeof(*pipe_wm));
 
 	for (level = 0; level <= max_level; level++) {
-		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
+		ret = skl_compute_wm_level(dev_priv, cstate,
 					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
+	ret = skl_allocate_pipe_ddb(cstate, pipe_wm, ddb);
+	if (ret)
+		return ret;
+
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
 	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
@@ -3953,13 +3967,12 @@ pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_pipes(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct intel_crtc *intel_crtc;
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
 	uint32_t realloc_pipes = pipes_modified(state);
 	int ret;
 
@@ -4012,10 +4025,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
 
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
 		ret = drm_atomic_add_affected_planes(state, &intel_crtc->base);
 		if (ret)
 			return ret;
@@ -4069,19 +4078,15 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
 
 	/*
 	 * Calculate WM's for all pipes that are part of this transaction.
-	 * Note that the DDB allocation above may have added more CRTC's that
-	 * weren't otherwise being modified (and set bits in dirty_pipes) if
-	 * pipe allocations had to change.
-	 *
-	 * FIXME:  Now that we're doing this in the atomic check phase, we
-	 * should allow skl_update_pipe_wm() to return failure in cases where
-	 * no suitable watermark values can be found.
+	 * Note that affected pipe calculation above may have added more
+	 * CRTC's that weren't otherwise being modified (and set bits in
+	 * dirty_pipes) if pipe allocations had to change.
 	 */
 	for_each_crtc_in_state(state, crtc, cstate, i) {
 		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/7] drm/i915: Decode system memory bandwidth
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
                   ` (3 preceding siblings ...)
  2016-08-29 12:35 ` [PATCH 4/7] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
@ 2016-08-29 12:35 ` Kumar, Mahesh
  2016-08-29 12:35 ` [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM Kumar, Mahesh
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kumar

This patch adds support to decode system memory bandwidth
which will be used for arbitrated display memory percentage
calculation in GEN9 based system.

Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 96 +++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h | 18 ++++++++
 drivers/gpu/drm/i915/i915_reg.h | 25 +++++++++++
 3 files changed, 139 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 47fe072..5ece036 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -996,6 +996,96 @@ static void intel_sanitize_options(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n", yesno(i915.semaphores));
 }
 
+static void
+intel_get_memdev_info(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	uint32_t val = 0;
+	uint32_t mem_speed = 0;
+	uint8_t dram_type;
+	uint32_t dram_channel;
+	uint8_t num_channel;
+	bool rank_valid = false;
+
+	if (!IS_GEN9(dev_priv))
+		goto exit;
+
+	val = I915_READ(P_CR_MC_BIOS_REQ_0_0_0);
+	mem_speed = div_u64((uint64_t) (val & REQ_DATA_MASK) *
+			MEMORY_FREQ_MULTIPLIER, 1000);
+
+	if (mem_speed == 0)
+		goto exit;
+
+	dev_priv->memdev_info.valid = true;
+	dev_priv->memdev_info.mem_speed = mem_speed;
+	dram_type = (val >> DRAM_TYPE_SHIFT) & DRAM_TYPE_MASK;
+	dram_channel = (val >> DRAM_CHANNEL_SHIFT) & DRAM_CHANNEL_MASK;
+	num_channel = hweight32(dram_channel);
+
+	/*
+	 * The lpddr3 and lpddr4 technologies can have 1-4 channels and the
+	 * channels are 32bits wide; while ddr3l technologies can have 1-2
+	 * channels and the channels are 64 bits wide. But SV team found that in
+	 * case of single 64 bit wide DDR3L dimms two bits were set and system
+	 * with two DDR3L 64bit dimm all four bits were set.
+	 */
+
+	switch (dram_type) {
+	case DRAM_TYPE_LPDDR3:
+	case DRAM_TYPE_LPDDR4:
+		dev_priv->memdev_info.data_width = 4;
+		dev_priv->memdev_info.num_channel = num_channel;
+		break;
+	case DRAM_TYPE_DDR3L:
+		dev_priv->memdev_info.data_width = 8;
+		dev_priv->memdev_info.num_channel = num_channel / 2;
+		break;
+	default:
+		dev_priv->memdev_info.data_width = 4;
+		dev_priv->memdev_info.num_channel = num_channel;
+	}
+
+	/*
+	 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
+	 * all the dimms should have same rank as in first valid Dimm
+	 */
+#define D_CR_DRP0_DUNIT_INVALID	    0xFFFFFFFF
+
+	dev_priv->memdev_info.rank_valid = true;
+	if (I915_READ(D_CR_DRP0_DUNIT8) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT8);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT9) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT9);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT10) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT10);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT11) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT11);
+		rank_valid = true;
+	}
+#undef D_CR_DRP0_DUNIT_INVALID
+
+	if (rank_valid) {
+		dev_priv->memdev_info.rank_valid = true;
+		dev_priv->memdev_info.rank = (val & DRAM_RANK_MASK);
+	}
+
+	DRM_DEBUG_DRIVER("valid:%s speed-%d width-%d num_channel-%d\n",
+		dev_priv->memdev_info.valid ? "true" : "false",
+		dev_priv->memdev_info.mem_speed,
+		dev_priv->memdev_info.data_width,
+		dev_priv->memdev_info.num_channel);
+	DRM_DEBUG_DRIVER("rank_valid:%s rank-%d\n",
+		dev_priv->memdev_info.rank_valid ? "true" : "false",
+		dev_priv->memdev_info.rank);
+	return;
+exit:
+	dev_priv->memdev_info.valid = false;
+}
+
 /**
  * i915_driver_init_hw - setup state requiring device access
  * @dev_priv: device private
@@ -1099,6 +1189,12 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 			DRM_DEBUG_DRIVER("can't enable MSI");
 	}
 
+	/*
+	 * Fill the memdev structure to get the system raw bandwidth
+	 * This will be used by WM algorithm, to implement GEN9 based WA
+	 */
+	intel_get_memdev_info(dev);
+
 	return 0;
 
 out_ggtt:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 77a5478..d0123f8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2017,6 +2017,24 @@ struct drm_i915_private {
 		bool distrust_bios_wm;
 	} wm;
 
+	struct {
+		/*
+		 * memory device info
+		 * valid: memory info is valid or not
+		 * mem_speed: memory freq in KHz
+		 * channel_width: Channel width in bytes
+		 * num_channel: total number of channels
+		 * rank: 0-rank disable, 1-Single rank, 2-dual rank
+		 */
+		bool valid;
+		uint32_t mem_speed;
+		uint8_t data_width;
+		uint8_t num_channel;
+		bool rank_valid;
+		uint8_t rank;
+	} memdev_info;
+
+
 	struct i915_runtime_pm pm;
 
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a29d707..b38445c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7716,6 +7716,31 @@ enum {
 #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
 #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
 
+#define P_CR_MC_BIOS_REQ_0_0_0		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
+#define REQ_DATA_MASK			(0x3F << 0)
+#define DRAM_TYPE_SHIFT			24
+#define DRAM_TYPE_MASK			0x7
+#define DRAM_CHANNEL_SHIFT		12
+#define DRAM_CHANNEL_MASK		0xF
+
+#define DRAM_TYPE_LPDDR3		0x1
+#define DRAM_TYPE_LPDDR4		0x2
+#define DRAM_TYPE_DDR3L			0x4
+/*
+ * BIOS programs this field of REQ_DATA [5:0] in integer
+ * multiple of 133330 KHz (133.33MHz)
+ */
+#define MEMORY_FREQ_MULTIPLIER		0x208D2
+#define D_CR_DRP0_DUNIT8		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1000)
+#define D_CR_DRP0_DUNIT9		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1200)
+#define D_CR_DRP0_DUNIT10		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1400)
+#define D_CR_DRP0_DUNIT11		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1600)
+#define D_CR_DRP0_RKEN0			(1 << 0)
+#define D_CR_DRP0_RKEN1			(1 << 1)
+#define DRAM_RANK_MASK			0x3
+#define DRAM_SINGLE_RANK		0x1
+#define DRAM_DUAL_RANK			0x3
+
 /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register,
  * since on HSW we can't write to it using I915_WRITE. */
 #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5F0C)
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
                   ` (4 preceding siblings ...)
  2016-08-29 12:35 ` [PATCH 5/7] drm/i915: Decode system memory bandwidth Kumar, Mahesh
@ 2016-08-29 12:35 ` Kumar, Mahesh
  2016-08-30 19:32   ` Zanoni, Paulo R
  2016-08-29 12:35 ` [PATCH] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kumar

This patch enables Transition WM for SKL+ platforms.

Transition WM are used if IPC is enabled, to decide, number of blocks
to be fetched before reducing the priority of display to fetch from
memory.

Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 57 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 53 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9f69050..f4f387f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2861,6 +2861,8 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 #define SKL_DDB_SIZE		896	/* in blocks */
 #define BXT_DDB_SIZE		512
 #define SKL_SAGV_BLOCK_TIME	30 /* µs */
+#define SKL_TRANS_WM_AMOUNT	10
+#define SKL_TRANS_WM_MIN	14
 
 /*
  * Return the index of a plane in the SKL DDB and wm result arrays.  Primary
@@ -3579,6 +3581,41 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	return pixel_rate;
 }
 
+static void skl_compute_plane_trans_wm(const struct drm_i915_private *dev_priv,
+					struct skl_pipe_wm *pipe_wm,
+					struct intel_plane_state *intel_pstate,
+					uint32_t y_tile_min,
+					bool y_tile)
+{
+	struct drm_plane_state *pstate = &intel_pstate->base;
+	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
+	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
+	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
+	uint16_t res_blocks = pipe_wm->wm[0].plane_res_b[id];
+	uint32_t trans_min = 0, trans_offset_blocks;
+	uint16_t trans_y_tile_min = 0;
+	uint16_t trans_res_blocks;
+
+
+	if (IS_GEN9(dev_priv))
+		trans_min = SKL_TRANS_WM_MIN;
+
+	trans_offset_blocks = trans_min + SKL_TRANS_WM_AMOUNT;
+
+	if (y_tile) {
+		trans_y_tile_min = 2 * y_tile_min;
+		trans_res_blocks = max(trans_y_tile_min, res_blocks) +
+			trans_offset_blocks;
+	} else {
+		trans_res_blocks = res_blocks + trans_offset_blocks;
+		/* WA BUG:1938466 add one block for non y-tile planes */
+		trans_res_blocks += 1;
+	}
+
+	*out_blocks = trans_res_blocks;
+	*out_lines = 0;
+}
+
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
@@ -3600,6 +3637,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
+	uint32_t y_tile_minimum = 0;
+	bool y_tile = false;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
@@ -3627,7 +3666,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
 	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
 		uint32_t min_scanlines = 4;
-		uint32_t y_tile_minimum;
 		if (intel_rotation_90_or_270(pstate->rotation)) {
 			int cpp = (fb->pixel_format == DRM_FORMAT_NV12) ?
 				drm_format_plane_cpp(fb->pixel_format, 1) :
@@ -3646,6 +3684,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 		y_tile_minimum = plane_blocks_per_line * min_scanlines;
 		selected_result = max(method2, y_tile_minimum);
+		y_tile = true;
 	} else {
 		linetime_us = DIV_ROUND_UP(width * 1000,
 				skl_pipe_pixel_rate(cstate));
@@ -3669,6 +3708,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
 
+	if (level == 0)
+		skl_compute_plane_trans_wm(dev_priv, pipe_wm, intel_pstate,
+						y_tile_minimum, y_tile);
 	return 0;
 }
 
@@ -3738,11 +3780,13 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 }
 
 static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
-				      struct skl_wm_level *trans_wm /* out */)
+				      struct skl_wm_level *trans_wm,
+				      struct skl_ddb_allocation *ddb)
 {
 	struct drm_crtc *crtc = cstate->base.crtc;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct intel_plane *intel_plane;
+	enum pipe pipe = to_intel_crtc(crtc)->pipe;
 
 	if (!cstate->base.active)
 		return;
@@ -3750,8 +3794,13 @@ static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
 	/* Until we know more, just disable transition WMs */
 	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc, intel_plane) {
 		int i = skl_wm_plane_id(intel_plane);
+		uint16_t plane_ddb = skl_ddb_entry_size(&ddb->plane[pipe][i]);
+		uint16_t trans_res_blocks = trans_wm->plane_res_b[i];
 
-		trans_wm->plane_en[i] = false;
+		if ((plane_ddb == 0) || (trans_res_blocks > plane_ddb))
+			trans_wm->plane_en[i] = false;
+		else
+			trans_wm->plane_en[i] = true;
 	}
 }
 
@@ -3782,7 +3831,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
-	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
+	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm, ddb);
 
 	return 0;
 }
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH] drm/i915/gen9: WM memory bandwidth related workaround
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
                   ` (5 preceding siblings ...)
  2016-08-29 12:35 ` [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM Kumar, Mahesh
@ 2016-08-29 12:35 ` Kumar, Mahesh
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
  2016-09-08 11:55 ` ✗ Fi.CI.BAT: failure for Implement New DDB allocation algorithm Patchwork
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-08-29 12:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Kumar

This patch implemnets Workariunds related to display arbitrated memory
bandwidth. These WA are applicabe for all gen-9 based platforms.

Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |   9 +++
 drivers/gpu/drm/i915/intel_drv.h |  11 +++
 drivers/gpu/drm/i915/intel_pm.c  | 145 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 165 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d0123f8..095af6c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1055,6 +1055,13 @@ enum intel_sbi_destination {
 	SBI_MPHY,
 };
 
+/* SKL+ Watermark arbitrated display bandwidth Workarounds */
+enum watermark_memory_wa {
+	WATERMARK_WA_NONE,
+	WATERMARK_WA_X_TILED,
+	WATERMARK_WA_Y_TILED,
+};
+
 #define QUIRK_PIPEA_FORCE (1<<0)
 #define QUIRK_LVDS_SSC_DISABLE (1<<1)
 #define QUIRK_INVERT_BRIGHTNESS (1<<2)
@@ -1604,6 +1611,8 @@ struct skl_ddb_allocation {
 
 struct skl_wm_values {
 	unsigned dirty_pipes;
+	/* any WaterMark memory workaround Required */
+	enum watermark_memory_wa mem_wa;
 	struct skl_ddb_allocation ddb;
 	uint32_t wm_linetime[I915_MAX_PIPES];
 	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 570a7ca..f2c38cc 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1807,6 +1807,17 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
 	return to_intel_crtc_state(crtc_state);
 }
 
+static inline struct intel_crtc_state *
+intel_atomic_get_existing_crtc_state(struct drm_atomic_state *state,
+				      struct intel_crtc *crtc)
+{
+	struct drm_crtc_state *crtc_state;
+
+	crtc_state = drm_atomic_get_existing_crtc_state(state, &crtc->base);
+
+	return to_intel_crtc_state(crtc_state);
+}
+
 static inline struct intel_plane_state *
 intel_atomic_get_existing_plane_state(struct drm_atomic_state *state,
 				      struct intel_plane *plane)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ac0d0ca..e9cd6ac 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3624,6 +3624,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 {
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
+	struct intel_atomic_state *intel_state =
+			to_intel_atomic_state(cstate->base.state);
 	uint32_t latency = dev_priv->wm.skl_latency[level];
 	uint32_t method1, method2;
 	uint32_t plane_bytes_per_line, plane_blocks_per_line;
@@ -3639,10 +3641,17 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint8_t *out_lines = &result->plane_res_l[id];
 	uint32_t y_tile_minimum = 0;
 	bool y_tile = false;
+	enum watermark_memory_wa mem_wa;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
 
+	mem_wa = intel_state ? intel_state->wm_results.mem_wa : WATERMARK_WA_NONE;
+	if (mem_wa != WATERMARK_WA_NONE) {
+		if (fb->modifier[0] == I915_FORMAT_MOD_X_TILED)
+			latency += 15;
+	}
+
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
 
@@ -3682,6 +3691,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				WARN(1, "Unsupported pixel depth for rotation");
 			}
 		}
+		if (mem_wa == WATERMARK_WA_Y_TILED)
+			min_scanlines *= 2;
+
 		y_tile_minimum = plane_blocks_per_line * min_scanlines;
 		selected_result = max(method2, y_tile_minimum);
 		y_tile = true;
@@ -4065,6 +4077,15 @@ skl_include_affected_pipes(struct drm_atomic_state *state)
 		intel_state->wm_results.dirty_pipes = ~0;
 	}
 
+	/*
+	 * If Watermark workaround is changed we need to recalculate
+	 * watermark values for all active pipes
+	 */
+	if (intel_state->wm_results.mem_wa != dev_priv->wm.skl_hw.mem_wa) {
+		realloc_pipes = ~0;
+		intel_state->wm_results.dirty_pipes = ~0;
+	}
+
 	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
 		struct intel_crtc_state *cstate;
 
@@ -4081,6 +4102,128 @@ skl_include_affected_pipes(struct drm_atomic_state *state)
 }
 
 static void
+skl_set_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
+{
+	struct drm_device *dev = state->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_crtc *intel_crtc;
+	struct intel_plane_state *intel_pstate;
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	int num_active_plane, num_active_pipe;
+	uint32_t plane_bw, max_plane_bw, pipe_bw, max_pipe_bw;
+	uint32_t total_pipe_bw;
+	uint32_t system_bw = 0;
+	uint8_t num_channel, data_width, rank;
+	int x_tile_per;
+	int display_bw_per;
+	bool y_tile_enabled = false;
+
+	if (!dev_priv->memdev_info.valid)
+		goto exit;
+
+	num_channel = dev_priv->memdev_info.num_channel;
+	data_width = dev_priv->memdev_info.data_width;
+	system_bw = dev_priv->memdev_info.mem_speed * num_channel * data_width;
+
+	if (!system_bw)
+		goto exit;
+
+	max_pipe_bw = 0;
+	for_each_intel_crtc(dev, intel_crtc) {
+		struct intel_crtc_state *cstate;
+		struct intel_plane *plane;
+
+		/*
+		 * If CRTC is part of current atomic commit, get crtc state from
+		 * existing CRTC state. else take the cached CRTC state
+		 */
+		cstate = NULL;
+		if (state)
+			cstate = intel_atomic_get_existing_crtc_state(state,
+					intel_crtc);
+		if (!cstate)
+			cstate = to_intel_crtc_state(intel_crtc->base.state);
+
+		if (!cstate->base.active)
+			continue;
+
+		num_active_plane = 0;
+		max_plane_bw = 0;
+		for_each_intel_plane_mask(dev, plane, cstate->base.plane_mask) {
+			struct drm_framebuffer *fb = NULL;
+
+			intel_pstate = NULL;
+			if (state)
+				intel_pstate =
+				intel_atomic_get_existing_plane_state(state,
+									plane);
+			if (!intel_pstate)
+				intel_pstate =
+					to_intel_plane_state(plane->base.state);
+
+			WARN_ON(!intel_pstate->base.fb);
+
+			if (!intel_pstate->base.visible)
+				continue;
+
+			fb = intel_pstate->base.fb;
+			if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
+				fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED))
+				y_tile_enabled = true;
+
+			plane_bw = skl_adjusted_plane_pixel_rate(cstate,
+								intel_pstate);
+			max_plane_bw = max(plane_bw, max_plane_bw);
+			num_active_plane++;
+		}
+		pipe_bw = max_plane_bw * num_active_plane;
+		max_pipe_bw = max(pipe_bw, max_pipe_bw);
+	}
+
+	if (intel_state->active_pipe_changes)
+		num_active_pipe = hweight32(intel_state->active_crtcs);
+	else
+		num_active_pipe = hweight32(dev_priv->active_crtcs);
+
+	total_pipe_bw = max_pipe_bw * num_active_pipe;
+
+	display_bw_per = DIV_ROUND_UP_ULL(total_pipe_bw * 100, system_bw * 1000);
+
+	/*
+	 * If there is any Ytile plane enabled and arbitrated display
+	 * bandwidth > 20% of raw system memory bandwidth
+	 * Enale Y-tile related WA
+	 *
+	 * If memory is dual channel single rank, Xtile limit = 35%, else Xtile
+	 * limit = 60%
+	 * If there is no Ytile plane enabled and
+	 * arbitrated display bandwidth > Xtile limit
+	 * Enable X-tile realated WA
+	 */
+	if (y_tile_enabled && (display_bw_per > 20))
+		intel_state->wm_results.mem_wa = WATERMARK_WA_Y_TILED;
+	else {
+
+		if (dev_priv->memdev_info.rank_valid)
+			rank = dev_priv->memdev_info.rank;
+		else
+			rank = DRAM_DUAL_RANK; /* Assume we are dual rank */
+
+		if ((rank == DRAM_SINGLE_RANK) && (num_channel == 2))
+			x_tile_per = 35;
+		else
+			x_tile_per = 60;
+
+		if (display_bw_per > x_tile_per)
+			intel_state->wm_results.mem_wa = WATERMARK_WA_X_TILED;
+	}
+	return;
+
+exit:
+	intel_state->wm_results.mem_wa = WATERMARK_WA_NONE;
+}
+
+static void
 skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 		     struct skl_wm_values *src,
 		     enum pipe pipe)
@@ -4125,6 +4268,8 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
+	skl_set_memory_bandwidth_wm_wa(state);
+
 	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active
  2016-08-29 12:35 ` [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active Kumar, Mahesh
@ 2016-08-30 19:18   ` Zanoni, Paulo R
  2016-08-31 10:51   ` Maarten Lankhorst
  1 sibling, 0 replies; 28+ messages in thread
From: Zanoni, Paulo R @ 2016-08-30 19:18 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Kumar

Em Seg, 2016-08-29 às 18:05 +0530, Kumar, Mahesh escreveu:
> Set the intel_crtc->active flag after pipe/crtc is actually active in
> haswell_crtc_enable function.

Why?

Can you please elaborate more on why this change is needed, what are
the benefits it brings, what are the problems it solves and why is the
current code bad or wrong? Please explain all this in the commit
message, not just as an email reply.

In other words: if I'm bisecting a theoretical bug and then suddenly
conclude that this patch is the problem, how will I know what's going
to break once I revert this patch?

Thanks,
Paulo

> 
> Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index e4e6141..7258883 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -5427,8 +5427,6 @@ static void haswell_crtc_enable(struct
> intel_crtc_state *pipe_config,
>  
>  	intel_color_set_csc(&pipe_config->base);
>  
> -	intel_crtc->active = true;
> -
>  	if (intel_crtc->config->has_pch_encoder)
>  		intel_set_cpu_fifo_underrun_reporting(dev_priv,
> pipe, false);
>  	else
> @@ -5475,6 +5473,8 @@ static void haswell_crtc_enable(struct
> intel_crtc_state *pipe_config,
>  	assert_vblank_disabled(crtc);
>  	drm_crtc_vblank_on(crtc);
>  
> +	intel_crtc->active = true;
> +
>  	intel_encoders_enable(crtc, pipe_config, old_state);
>  
>  	if (intel_crtc->config->has_pch_encoder) {
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM
  2016-08-29 12:35 ` [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM Kumar, Mahesh
@ 2016-08-30 19:32   ` Zanoni, Paulo R
  2016-08-31 13:47     ` Zanoni, Paulo R
  0 siblings, 1 reply; 28+ messages in thread
From: Zanoni, Paulo R @ 2016-08-30 19:32 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Kumar

Hi

Em Seg, 2016-08-29 às 18:05 +0530, Kumar, Mahesh escreveu:
> This patch enables Transition WM for SKL+ platforms.
> 
> Transition WM are used if IPC is enabled, to decide, number of blocks
> to be fetched before reducing the priority of display to fetch from
> memory.
> 
> Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 57
> ++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 53 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 9f69050..f4f387f 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2861,6 +2861,8 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>  #define SKL_DDB_SIZE		896	/* in blocks */
>  #define BXT_DDB_SIZE		512
>  #define SKL_SAGV_BLOCK_TIME	30 /* µs */
> +#define SKL_TRANS_WM_AMOUNT	10
> +#define SKL_TRANS_WM_MIN	14

As far as I understood (and maybe I'm wrong), SKL_TRANS_WM_AMOUNT is a
tunable value, so it depends on a lot of variable factors. Why does
this patch use 10 instead of every other possible value?

Thanks,
Paulo


>  
>  /*
>   * Return the index of a plane in the SKL DDB and wm result
> arrays.  Primary
> @@ -3579,6 +3581,41 @@ static uint32_t
> skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  	return pixel_rate;
>  }
>  
> +static void skl_compute_plane_trans_wm(const struct drm_i915_private
> *dev_priv,
> +					struct skl_pipe_wm *pipe_wm,
> +					struct intel_plane_state
> *intel_pstate,
> +					uint32_t y_tile_min,
> +					bool y_tile)
> +{
> +	struct drm_plane_state *pstate = &intel_pstate->base;
> +	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
> +	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
> +	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
> +	uint16_t res_blocks = pipe_wm->wm[0].plane_res_b[id];
> +	uint32_t trans_min = 0, trans_offset_blocks;
> +	uint16_t trans_y_tile_min = 0;
> +	uint16_t trans_res_blocks;
> +
> +
> +	if (IS_GEN9(dev_priv))
> +		trans_min = SKL_TRANS_WM_MIN;
> +
> +	trans_offset_blocks = trans_min + SKL_TRANS_WM_AMOUNT;
> +
> +	if (y_tile) {
> +		trans_y_tile_min = 2 * y_tile_min;
> +		trans_res_blocks = max(trans_y_tile_min, res_blocks)
> +
> +			trans_offset_blocks;
> +	} else {
> +		trans_res_blocks = res_blocks + trans_offset_blocks;
> +		/* WA BUG:1938466 add one block for non y-tile
> planes */
> +		trans_res_blocks += 1;
> +	}
> +
> +	*out_blocks = trans_res_blocks;
> +	*out_lines = 0;
> +}
> +
>  static int skl_compute_plane_wm(const struct drm_i915_private
> *dev_priv,
>  				struct intel_crtc_state *cstate,
>  				struct intel_plane_state
> *intel_pstate,
> @@ -3600,6 +3637,8 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	struct skl_wm_level *result = &pipe_wm->wm[level];
>  	uint16_t *out_blocks = &result->plane_res_b[id];
>  	uint8_t *out_lines = &result->plane_res_l[id];
> +	uint32_t y_tile_minimum = 0;
> +	bool y_tile = false;
>  
>  	if (latency == 0 || !cstate->base.active || !intel_pstate-
> >base.visible)
>  		return 0;
> @@ -3627,7 +3666,6 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>  		uint32_t min_scanlines = 4;
> -		uint32_t y_tile_minimum;
>  		if (intel_rotation_90_or_270(pstate->rotation)) {
>  			int cpp = (fb->pixel_format ==
> DRM_FORMAT_NV12) ?
>  				drm_format_plane_cpp(fb-
> >pixel_format, 1) :
> @@ -3646,6 +3684,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  		}
>  		y_tile_minimum = plane_blocks_per_line *
> min_scanlines;
>  		selected_result = max(method2, y_tile_minimum);
> +		y_tile = true;
>  	} else {
>  		linetime_us = DIV_ROUND_UP(width * 1000,
>  				skl_pipe_pixel_rate(cstate));
> @@ -3669,6 +3708,9 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	*out_blocks = res_blocks;
>  	*out_lines = res_lines;
>  
> +	if (level == 0)
> +		skl_compute_plane_trans_wm(dev_priv, pipe_wm,
> intel_pstate,
> +						y_tile_minimum,
> y_tile);
>  	return 0;
>  }
>  
> @@ -3738,11 +3780,13 @@ skl_compute_linetime_wm(struct
> intel_crtc_state *cstate)
>  }
>  
>  static void skl_compute_transition_wm(struct intel_crtc_state
> *cstate,
> -				      struct skl_wm_level *trans_wm
> /* out */)
> +				      struct skl_wm_level *trans_wm,
> +				      struct skl_ddb_allocation
> *ddb)
>  {
>  	struct drm_crtc *crtc = cstate->base.crtc;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	struct intel_plane *intel_plane;
> +	enum pipe pipe = to_intel_crtc(crtc)->pipe;
>  
>  	if (!cstate->base.active)
>  		return;
> @@ -3750,8 +3794,13 @@ static void skl_compute_transition_wm(struct
> intel_crtc_state *cstate,
>  	/* Until we know more, just disable transition WMs */
>  	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc,
> intel_plane) {
>  		int i = skl_wm_plane_id(intel_plane);
> +		uint16_t plane_ddb = skl_ddb_entry_size(&ddb-
> >plane[pipe][i]);
> +		uint16_t trans_res_blocks = trans_wm-
> >plane_res_b[i];
>  
> -		trans_wm->plane_en[i] = false;
> +		if ((plane_ddb == 0) || (trans_res_blocks >
> plane_ddb))
> +			trans_wm->plane_en[i] = false;
> +		else
> +			trans_wm->plane_en[i] = true;
>  	}
>  }
>  
> @@ -3782,7 +3831,7 @@ static int skl_build_pipe_wm(struct
> intel_crtc_state *cstate,
>  
>  	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
>  
> -	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
> +	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm, ddb);
>  
>  	return 0;
>  }
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active
  2016-08-29 12:35 ` [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active Kumar, Mahesh
  2016-08-30 19:18   ` Zanoni, Paulo R
@ 2016-08-31 10:51   ` Maarten Lankhorst
  2016-08-31 14:04     ` Mahesh Kumar
  1 sibling, 1 reply; 28+ messages in thread
From: Maarten Lankhorst @ 2016-08-31 10:51 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: Kumar

Op 29-08-16 om 14:35 schreef Kumar, Mahesh:
> Set the intel_crtc->active flag after pipe/crtc is actually active in
> haswell_crtc_enable function.
>
> Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
I think more state needs to be pre-calculated, rather than touching intel_crtc->active.
Something like using the .initial_watermarks callback, we're so close to removing
that non-atomic flag altogether. :-)

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] drm/i915/skl: New ddb allocation algorithm
  2016-08-29 12:35 ` [PATCH 4/7] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
@ 2016-08-31 13:38   ` Maarten Lankhorst
  2016-08-31 14:18     ` Mahesh Kumar
  0 siblings, 1 reply; 28+ messages in thread
From: Maarten Lankhorst @ 2016-08-31 13:38 UTC (permalink / raw)
  To: Kumar, Mahesh, intel-gfx; +Cc: Kumar

Hey,

Op 29-08-16 om 14:35 schreef Kumar, Mahesh:
> This patch implements new DDB allocation algorithm as per HW team
> suggestion. This algo takecare of scenario where we allocate less DDB
> for the planes with lower relative pixel rate, but they require more DDB
> to work.
> It also takes care of enabling same watermark level for each
> plane, for efficient power saving.
Patch 2 and 3 look ok, I'm not a watermark expert though. :)

To be honest, skylake watermarks are very complicated with all the interdepencies. I can understand this helps,
but if you do separate per crtc atomic commits, it doesn't help a update for crtc1 may introduce a vblank wait on crtc2 and the other way around. This will drop the framerate to half.

Is there no way around it?

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM
  2016-08-30 19:32   ` Zanoni, Paulo R
@ 2016-08-31 13:47     ` Zanoni, Paulo R
  2016-09-02 11:46       ` Mahesh Kumar
  0 siblings, 1 reply; 28+ messages in thread
From: Zanoni, Paulo R @ 2016-08-31 13:47 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1

Em Ter, 2016-08-30 às 19:32 +0000, Zanoni, Paulo R escreveu:
> Hi
> 
> Em Seg, 2016-08-29 às 18:05 +0530, Kumar, Mahesh escreveu:
> > 
> > This patch enables Transition WM for SKL+ platforms.
> > 
> > Transition WM are used if IPC is enabled, to decide, number of
> > blocks
> > to be fetched before reducing the priority of display to fetch from
> > memory.
> > 
> > Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_pm.c | 57
> > ++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 53 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 9f69050..f4f387f 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -2861,6 +2861,8 @@ bool ilk_disable_lp_wm(struct drm_device
> > *dev)
> >  #define SKL_DDB_SIZE		896	/* in blocks */
> >  #define BXT_DDB_SIZE		512
> >  #define SKL_SAGV_BLOCK_TIME	30 /* µs */
> > +#define SKL_TRANS_WM_AMOUNT	10
> > +#define SKL_TRANS_WM_MIN	14
> 
> As far as I understood (and maybe I'm wrong), SKL_TRANS_WM_AMOUNT is
> a
> tunable value, so it depends on a lot of variable factors. Why does
> this patch use 10 instead of every other possible value?

I kept investigating and it seems that the current recommendation is
still that transition watermarks stay disabled for SKL and KBL.

> 
> Thanks,
> Paulo
> 
> 
> > 
> >  
> >  /*
> >   * Return the index of a plane in the SKL DDB and wm result
> > arrays.  Primary
> > @@ -3579,6 +3581,41 @@ static uint32_t
> > skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
> >  	return pixel_rate;
> >  }
> >  
> > +static void skl_compute_plane_trans_wm(const struct
> > drm_i915_private
> > *dev_priv,
> > +					struct skl_pipe_wm
> > *pipe_wm,
> > +					struct intel_plane_state
> > *intel_pstate,
> > +					uint32_t y_tile_min,
> > +					bool y_tile)
> > +{
> > +	struct drm_plane_state *pstate = &intel_pstate->base;
> > +	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
> > +	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
> > +	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
> > +	uint16_t res_blocks = pipe_wm->wm[0].plane_res_b[id];
> > +	uint32_t trans_min = 0, trans_offset_blocks;
> > +	uint16_t trans_y_tile_min = 0;
> > +	uint16_t trans_res_blocks;
> > +
> > +
> > +	if (IS_GEN9(dev_priv))
> > +		trans_min = SKL_TRANS_WM_MIN;
> > +
> > +	trans_offset_blocks = trans_min + SKL_TRANS_WM_AMOUNT;
> > +
> > +	if (y_tile) {
> > +		trans_y_tile_min = 2 * y_tile_min;
> > +		trans_res_blocks = max(trans_y_tile_min,
> > res_blocks)
> > +
> > +			trans_offset_blocks;
> > +	} else {
> > +		trans_res_blocks = res_blocks +
> > trans_offset_blocks;
> > +		/* WA BUG:1938466 add one block for non y-tile
> > planes */
> > +		trans_res_blocks += 1;
> > +	}
> > +
> > +	*out_blocks = trans_res_blocks;
> > +	*out_lines = 0;
> > +}
> > +
> >  static int skl_compute_plane_wm(const struct drm_i915_private
> > *dev_priv,
> >  				struct intel_crtc_state *cstate,
> >  				struct intel_plane_state
> > *intel_pstate,
> > @@ -3600,6 +3637,8 @@ static int skl_compute_plane_wm(const struct
> > drm_i915_private *dev_priv,
> >  	struct skl_wm_level *result = &pipe_wm->wm[level];
> >  	uint16_t *out_blocks = &result->plane_res_b[id];
> >  	uint8_t *out_lines = &result->plane_res_l[id];
> > +	uint32_t y_tile_minimum = 0;
> > +	bool y_tile = false;
> >  
> >  	if (latency == 0 || !cstate->base.active || !intel_pstate-
> > > 
> > > base.visible)
> >  		return 0;
> > @@ -3627,7 +3666,6 @@ static int skl_compute_plane_wm(const struct
> > drm_i915_private *dev_priv,
> >  	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
> >  	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
> >  		uint32_t min_scanlines = 4;
> > -		uint32_t y_tile_minimum;
> >  		if (intel_rotation_90_or_270(pstate->rotation)) {
> >  			int cpp = (fb->pixel_format ==
> > DRM_FORMAT_NV12) ?
> >  				drm_format_plane_cpp(fb-
> > > 
> > > pixel_format, 1) :
> > @@ -3646,6 +3684,7 @@ static int skl_compute_plane_wm(const struct
> > drm_i915_private *dev_priv,
> >  		}
> >  		y_tile_minimum = plane_blocks_per_line *
> > min_scanlines;
> >  		selected_result = max(method2, y_tile_minimum);
> > +		y_tile = true;
> >  	} else {
> >  		linetime_us = DIV_ROUND_UP(width * 1000,
> >  				skl_pipe_pixel_rate(cstate));
> > @@ -3669,6 +3708,9 @@ static int skl_compute_plane_wm(const struct
> > drm_i915_private *dev_priv,
> >  	*out_blocks = res_blocks;
> >  	*out_lines = res_lines;
> >  
> > +	if (level == 0)
> > +		skl_compute_plane_trans_wm(dev_priv, pipe_wm,
> > intel_pstate,
> > +						y_tile_minimum,
> > y_tile);
> >  	return 0;
> >  }
> >  
> > @@ -3738,11 +3780,13 @@ skl_compute_linetime_wm(struct
> > intel_crtc_state *cstate)
> >  }
> >  
> >  static void skl_compute_transition_wm(struct intel_crtc_state
> > *cstate,
> > -				      struct skl_wm_level
> > *trans_wm
> > /* out */)
> > +				      struct skl_wm_level
> > *trans_wm,
> > +				      struct skl_ddb_allocation
> > *ddb)
> >  {
> >  	struct drm_crtc *crtc = cstate->base.crtc;
> >  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> >  	struct intel_plane *intel_plane;
> > +	enum pipe pipe = to_intel_crtc(crtc)->pipe;
> >  
> >  	if (!cstate->base.active)
> >  		return;
> > @@ -3750,8 +3794,13 @@ static void skl_compute_transition_wm(struct
> > intel_crtc_state *cstate,
> >  	/* Until we know more, just disable transition WMs */
> >  	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc,
> > intel_plane) {
> >  		int i = skl_wm_plane_id(intel_plane);
> > +		uint16_t plane_ddb = skl_ddb_entry_size(&ddb-
> > > 
> > > plane[pipe][i]);
> > +		uint16_t trans_res_blocks = trans_wm-
> > > 
> > > plane_res_b[i];
> >  
> > -		trans_wm->plane_en[i] = false;
> > +		if ((plane_ddb == 0) || (trans_res_blocks >
> > plane_ddb))
> > +			trans_wm->plane_en[i] = false;
> > +		else
> > +			trans_wm->plane_en[i] = true;
> >  	}
> >  }
> >  
> > @@ -3782,7 +3831,7 @@ static int skl_build_pipe_wm(struct
> > intel_crtc_state *cstate,
> >  
> >  	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
> >  
> > -	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
> > +	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm,
> > ddb);
> >  
> >  	return 0;
> >  }
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active
  2016-08-31 10:51   ` Maarten Lankhorst
@ 2016-08-31 14:04     ` Mahesh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Mahesh Kumar @ 2016-08-31 14:04 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx; +Cc: Kumar

earlier implementation of "DDB writing" was using this flag before 
wait_for_vblank if we need to expand the DDB & it overlap with other 
pipe's DDB.
In that case, I observed on corner case requiring a cd-clock change, 
intel_update_watermarks was waiting for vblank, when it was called from 
haswell_crtc_enable function. This was leading to "vblank not available 
on crtc " warning (since vblank int was not yet enabled on that CRTC).
But as I can see we now no longer use wait for vblank during DDB writing 
if it's called from crtc_enable function after Paul's patch series, I 
can drop this patch all together.
thanks for review & input

Regards,
-Mahesh

On Wednesday 31 August 2016 04:21 PM, Maarten Lankhorst wrote:
> more state needs to be pre-calculated, rather than touching intel_crtc->active.
> Something like using the .initial_watermarks callback, we're so close to removing
> that non-atomic flag altogether.:-)
>
> ~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] drm/i915/skl: New ddb allocation algorithm
  2016-08-31 13:38   ` Maarten Lankhorst
@ 2016-08-31 14:18     ` Mahesh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Mahesh Kumar @ 2016-08-31 14:18 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx; +Cc: Kumar

Hi,
In this series, I'm not equalizing watermarks among planes of all 
CRTC's, these patches equalizing it only among Planes of single CRTC 
from DDB pool allocated for that CRTC, which doesn't require wait_for_vblank
So it'll not affect the FPS.

On Wednesday 31 August 2016 07:08 PM, Maarten Lankhorst wrote:
> Hey,
>
> Op 29-08-16 om 14:35 schreef Kumar, Mahesh:
>> This patch implements new DDB allocation algorithm as per HW team
>> suggestion. This algo takecare of scenario where we allocate less DDB
>> for the planes with lower relative pixel rate, but they require more DDB
>> to work.
>> It also takes care of enabling same watermark level for each
>> plane, for efficient power saving.
> Patch 2 and 3 look ok, I'm not a watermark expert though. :)
>
> To be honest, skylake watermarks are very complicated with all the interdepencies. I can understand this helps,
> but if you do separate per crtc atomic commits, it doesn't help a update for crtc1 may introduce a vblank wait on crtc2 and the other way around. This will drop the framerate to half.
>
> Is there no way around it?
>
> ~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM
  2016-08-31 13:47     ` Zanoni, Paulo R
@ 2016-09-02 11:46       ` Mahesh Kumar
  2016-09-02 12:21         ` Zanoni, Paulo R
  0 siblings, 1 reply; 28+ messages in thread
From: Mahesh Kumar @ 2016-09-02 11:46 UTC (permalink / raw)
  To: Zanoni, Paulo R, intel-gfx

Hi,

On Wednesday 31 August 2016 07:17 PM, Zanoni, Paulo R wrote:
> Em Ter, 2016-08-30 às 19:32 +0000, Zanoni, Paulo R escreveu:
>> Hi
>>
>> Em Seg, 2016-08-29 às 18:05 +0530, Kumar, Mahesh escreveu:
>>> This patch enables Transition WM for SKL+ platforms.
>>>
>>> Transition WM are used if IPC is enabled, to decide, number of
>>> blocks
>>> to be fetched before reducing the priority of display to fetch from
>>> memory.
>>>
>>> Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_pm.c | 57
>>> ++++++++++++++++++++++++++++++++++++++---
>>>   1 file changed, 53 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>>> b/drivers/gpu/drm/i915/intel_pm.c
>>> index 9f69050..f4f387f 100644
>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>> @@ -2861,6 +2861,8 @@ bool ilk_disable_lp_wm(struct drm_device
>>> *dev)
>>>   #define SKL_DDB_SIZE		896	/* in blocks */
>>>   #define BXT_DDB_SIZE		512
>>>   #define SKL_SAGV_BLOCK_TIME	30 /* µs */
>>> +#define SKL_TRANS_WM_AMOUNT	10
>>> +#define SKL_TRANS_WM_MIN	14
>> As far as I understood (and maybe I'm wrong), SKL_TRANS_WM_AMOUNT is
>> a
>> tunable value, so it depends on a lot of variable factors. Why does
>> this patch use 10 instead of every other possible value?
This value 10 was driven by experiments in other OS with the help of SV 
team.
I'm trying to get more information regarding this from HW/SV team.
we can update this value later if required, what is your thought on this?
> I kept investigating and it seems that the current recommendation is
> still that transition watermarks stay disabled for SKL and KBL.

My earlier impression from Bspec was transition WM's should be disabled 
in SKL A/B stepping only. But from the latest discussion with HW team,
it seems transition WM must be disabled for all SKL stepping.
SV team recommendation is, It should be enabled with IPC for BXT/APL 
systems, So will update the patch with IS_BXT() check.

thanks,

Regards,
-Mahesh
>
>> Thanks,
>> Paulo
>>
>>
>>>   
>>>   /*
>>>    * Return the index of a plane in the SKL DDB and wm result
>>> arrays.  Primary
>>> @@ -3579,6 +3581,41 @@ static uint32_t
>>> skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>>>   	return pixel_rate;
>>>   }
>>>   
>>> +static void skl_compute_plane_trans_wm(const struct
>>> drm_i915_private
>>> *dev_priv,
>>> +					struct skl_pipe_wm
>>> *pipe_wm,
>>> +					struct intel_plane_state
>>> *intel_pstate,
>>> +					uint32_t y_tile_min,
>>> +					bool y_tile)
>>> +{
>>> +	struct drm_plane_state *pstate = &intel_pstate->base;
>>> +	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
>>> +	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
>>> +	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
>>> +	uint16_t res_blocks = pipe_wm->wm[0].plane_res_b[id];
>>> +	uint32_t trans_min = 0, trans_offset_blocks;
>>> +	uint16_t trans_y_tile_min = 0;
>>> +	uint16_t trans_res_blocks;
>>> +
>>> +
>>> +	if (IS_GEN9(dev_priv))
>>> +		trans_min = SKL_TRANS_WM_MIN;
>>> +
>>> +	trans_offset_blocks = trans_min + SKL_TRANS_WM_AMOUNT;
>>> +
>>> +	if (y_tile) {
>>> +		trans_y_tile_min = 2 * y_tile_min;
>>> +		trans_res_blocks = max(trans_y_tile_min,
>>> res_blocks)
>>> +
>>> +			trans_offset_blocks;
>>> +	} else {
>>> +		trans_res_blocks = res_blocks +
>>> trans_offset_blocks;
>>> +		/* WA BUG:1938466 add one block for non y-tile
>>> planes */
>>> +		trans_res_blocks += 1;
>>> +	}
>>> +
>>> +	*out_blocks = trans_res_blocks;
>>> +	*out_lines = 0;
>>> +}
>>> +
>>>   static int skl_compute_plane_wm(const struct drm_i915_private
>>> *dev_priv,
>>>   				struct intel_crtc_state *cstate,
>>>   				struct intel_plane_state
>>> *intel_pstate,
>>> @@ -3600,6 +3637,8 @@ static int skl_compute_plane_wm(const struct
>>> drm_i915_private *dev_priv,
>>>   	struct skl_wm_level *result = &pipe_wm->wm[level];
>>>   	uint16_t *out_blocks = &result->plane_res_b[id];
>>>   	uint8_t *out_lines = &result->plane_res_l[id];
>>> +	uint32_t y_tile_minimum = 0;
>>> +	bool y_tile = false;
>>>   
>>>   	if (latency == 0 || !cstate->base.active || !intel_pstate-
>>>> base.visible)
>>>   		return 0;
>>> @@ -3627,7 +3666,6 @@ static int skl_compute_plane_wm(const struct
>>> drm_i915_private *dev_priv,
>>>   	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
>>>   	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
>>>   		uint32_t min_scanlines = 4;
>>> -		uint32_t y_tile_minimum;
>>>   		if (intel_rotation_90_or_270(pstate->rotation)) {
>>>   			int cpp = (fb->pixel_format ==
>>> DRM_FORMAT_NV12) ?
>>>   				drm_format_plane_cpp(fb-
>>>> pixel_format, 1) :
>>> @@ -3646,6 +3684,7 @@ static int skl_compute_plane_wm(const struct
>>> drm_i915_private *dev_priv,
>>>   		}
>>>   		y_tile_minimum = plane_blocks_per_line *
>>> min_scanlines;
>>>   		selected_result = max(method2, y_tile_minimum);
>>> +		y_tile = true;
>>>   	} else {
>>>   		linetime_us = DIV_ROUND_UP(width * 1000,
>>>   				skl_pipe_pixel_rate(cstate));
>>> @@ -3669,6 +3708,9 @@ static int skl_compute_plane_wm(const struct
>>> drm_i915_private *dev_priv,
>>>   	*out_blocks = res_blocks;
>>>   	*out_lines = res_lines;
>>>   
>>> +	if (level == 0)
>>> +		skl_compute_plane_trans_wm(dev_priv, pipe_wm,
>>> intel_pstate,
>>> +						y_tile_minimum,
>>> y_tile);
>>>   	return 0;
>>>   }
>>>   
>>> @@ -3738,11 +3780,13 @@ skl_compute_linetime_wm(struct
>>> intel_crtc_state *cstate)
>>>   }
>>>   
>>>   static void skl_compute_transition_wm(struct intel_crtc_state
>>> *cstate,
>>> -				      struct skl_wm_level
>>> *trans_wm
>>> /* out */)
>>> +				      struct skl_wm_level
>>> *trans_wm,
>>> +				      struct skl_ddb_allocation
>>> *ddb)
>>>   {
>>>   	struct drm_crtc *crtc = cstate->base.crtc;
>>>   	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>>>   	struct intel_plane *intel_plane;
>>> +	enum pipe pipe = to_intel_crtc(crtc)->pipe;
>>>   
>>>   	if (!cstate->base.active)
>>>   		return;
>>> @@ -3750,8 +3794,13 @@ static void skl_compute_transition_wm(struct
>>> intel_crtc_state *cstate,
>>>   	/* Until we know more, just disable transition WMs */
>>>   	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc,
>>> intel_plane) {
>>>   		int i = skl_wm_plane_id(intel_plane);
>>> +		uint16_t plane_ddb = skl_ddb_entry_size(&ddb-
>>>> plane[pipe][i]);
>>> +		uint16_t trans_res_blocks = trans_wm-
>>>> plane_res_b[i];
>>>   
>>> -		trans_wm->plane_en[i] = false;
>>> +		if ((plane_ddb == 0) || (trans_res_blocks >
>>> plane_ddb))
>>> +			trans_wm->plane_en[i] = false;
>>> +		else
>>> +			trans_wm->plane_en[i] = true;
>>>   	}
>>>   }
>>>   
>>> @@ -3782,7 +3831,7 @@ static int skl_build_pipe_wm(struct
>>> intel_crtc_state *cstate,
>>>   
>>>   	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
>>>   
>>> -	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
>>> +	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm,
>>> ddb);
>>>   
>>>   	return 0;
>>>   }
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM
  2016-09-02 11:46       ` Mahesh Kumar
@ 2016-09-02 12:21         ` Zanoni, Paulo R
  0 siblings, 0 replies; 28+ messages in thread
From: Zanoni, Paulo R @ 2016-09-02 12:21 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Runyan, Arthur J

Em Sex, 2016-09-02 às 17:16 +0530, Mahesh Kumar escreveu:
> Hi,
> 
> On Wednesday 31 August 2016 07:17 PM, Zanoni, Paulo R wrote:
> > 
> > Em Ter, 2016-08-30 às 19:32 +0000, Zanoni, Paulo R escreveu:
> > > 
> > > Hi
> > > 
> > > Em Seg, 2016-08-29 às 18:05 +0530, Kumar, Mahesh escreveu:
> > > > 
> > > > This patch enables Transition WM for SKL+ platforms.
> > > > 
> > > > Transition WM are used if IPC is enabled, to decide, number of
> > > > blocks
> > > > to be fetched before reducing the priority of display to fetch
> > > > from
> > > > memory.
> > > > 
> > > > Signed-off-by: Kumar, Mahesh <mahesh1.kumar@intel.com>
> > > > ---
> > > >   drivers/gpu/drm/i915/intel_pm.c | 57
> > > > ++++++++++++++++++++++++++++++++++++++---
> > > >   1 file changed, 53 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > index 9f69050..f4f387f 100644
> > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > @@ -2861,6 +2861,8 @@ bool ilk_disable_lp_wm(struct drm_device
> > > > *dev)
> > > >   #define SKL_DDB_SIZE		896	/* in blocks
> > > > */
> > > >   #define BXT_DDB_SIZE		512
> > > >   #define SKL_SAGV_BLOCK_TIME	30 /* µs */
> > > > +#define SKL_TRANS_WM_AMOUNT	10
> > > > +#define SKL_TRANS_WM_MIN	14
> > > As far as I understood (and maybe I'm wrong), SKL_TRANS_WM_AMOUNT
> > > is
> > > a
> > > tunable value, so it depends on a lot of variable factors. Why
> > > does
> > > this patch use 10 instead of every other possible value?
> This value 10 was driven by experiments in other OS with the help of
> SV 
> team.
> I'm trying to get more information regarding this from HW/SV team.
> we can update this value later if required, what is your thought on
> this?
> > 
> > I kept investigating and it seems that the current recommendation
> > is
> > still that transition watermarks stay disabled for SKL and KBL.
> 
> My earlier impression from Bspec was transition WM's should be
> disabled 
> in SKL A/B stepping only. But from the latest discussion with HW
> team,
> it seems transition WM must be disabled for all SKL stepping.
> SV team recommendation is, It should be enabled with IPC for BXT/APL 
> systems, So will update the patch with IS_BXT() check.

BSpec says transition watermarks should be disabled for GEN9:ALL.

Arthur, can you please comment on this?

> 
> thanks,
> 
> Regards,
> -Mahesh
> > 
> > 
> > > 
> > > Thanks,
> > > Paulo
> > > 
> > > 
> > > > 
> > > >   
> > > >   /*
> > > >    * Return the index of a plane in the SKL DDB and wm result
> > > > arrays.  Primary
> > > > @@ -3579,6 +3581,41 @@ static uint32_t
> > > > skl_adjusted_plane_pixel_rate(const struct intel_crtc_state
> > > > *cst
> > > >   	return pixel_rate;
> > > >   }
> > > >   
> > > > +static void skl_compute_plane_trans_wm(const struct
> > > > drm_i915_private
> > > > *dev_priv,
> > > > +					struct skl_pipe_wm
> > > > *pipe_wm,
> > > > +					struct
> > > > intel_plane_state
> > > > *intel_pstate,
> > > > +					uint32_t y_tile_min,
> > > > +					bool y_tile)
> > > > +{
> > > > +	struct drm_plane_state *pstate = &intel_pstate->base;
> > > > +	int id = skl_wm_plane_id(to_intel_plane(pstate-
> > > > >plane));
> > > > +	uint16_t *out_blocks = &pipe_wm-
> > > > >trans_wm.plane_res_b[id];
> > > > +	uint8_t *out_lines = &pipe_wm-
> > > > >trans_wm.plane_res_l[id];
> > > > +	uint16_t res_blocks = pipe_wm->wm[0].plane_res_b[id];
> > > > +	uint32_t trans_min = 0, trans_offset_blocks;
> > > > +	uint16_t trans_y_tile_min = 0;
> > > > +	uint16_t trans_res_blocks;
> > > > +
> > > > +
> > > > +	if (IS_GEN9(dev_priv))
> > > > +		trans_min = SKL_TRANS_WM_MIN;
> > > > +
> > > > +	trans_offset_blocks = trans_min + SKL_TRANS_WM_AMOUNT;
> > > > +
> > > > +	if (y_tile) {
> > > > +		trans_y_tile_min = 2 * y_tile_min;
> > > > +		trans_res_blocks = max(trans_y_tile_min,
> > > > res_blocks)
> > > > +
> > > > +			trans_offset_blocks;
> > > > +	} else {
> > > > +		trans_res_blocks = res_blocks +
> > > > trans_offset_blocks;
> > > > +		/* WA BUG:1938466 add one block for non y-tile
> > > > planes */
> > > > +		trans_res_blocks += 1;
> > > > +	}
> > > > +
> > > > +	*out_blocks = trans_res_blocks;
> > > > +	*out_lines = 0;
> > > > +}
> > > > +
> > > >   static int skl_compute_plane_wm(const struct drm_i915_private
> > > > *dev_priv,
> > > >   				struct intel_crtc_state
> > > > *cstate,
> > > >   				struct intel_plane_state
> > > > *intel_pstate,
> > > > @@ -3600,6 +3637,8 @@ static int skl_compute_plane_wm(const
> > > > struct
> > > > drm_i915_private *dev_priv,
> > > >   	struct skl_wm_level *result = &pipe_wm->wm[level];
> > > >   	uint16_t *out_blocks = &result->plane_res_b[id];
> > > >   	uint8_t *out_lines = &result->plane_res_l[id];
> > > > +	uint32_t y_tile_minimum = 0;
> > > > +	bool y_tile = false;
> > > >   
> > > >   	if (latency == 0 || !cstate->base.active ||
> > > > !intel_pstate-
> > > > > 
> > > > > base.visible)
> > > >   		return 0;
> > > > @@ -3627,7 +3666,6 @@ static int skl_compute_plane_wm(const
> > > > struct
> > > > drm_i915_private *dev_priv,
> > > >   	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
> > > >   	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
> > > >   		uint32_t min_scanlines = 4;
> > > > -		uint32_t y_tile_minimum;
> > > >   		if (intel_rotation_90_or_270(pstate-
> > > > >rotation)) {
> > > >   			int cpp = (fb->pixel_format ==
> > > > DRM_FORMAT_NV12) ?
> > > >   				drm_format_plane_cpp(fb-
> > > > > 
> > > > > pixel_format, 1) :
> > > > @@ -3646,6 +3684,7 @@ static int skl_compute_plane_wm(const
> > > > struct
> > > > drm_i915_private *dev_priv,
> > > >   		}
> > > >   		y_tile_minimum = plane_blocks_per_line *
> > > > min_scanlines;
> > > >   		selected_result = max(method2,
> > > > y_tile_minimum);
> > > > +		y_tile = true;
> > > >   	} else {
> > > >   		linetime_us = DIV_ROUND_UP(width * 1000,
> > > >   				skl_pipe_pixel_rate(cstate));
> > > > @@ -3669,6 +3708,9 @@ static int skl_compute_plane_wm(const
> > > > struct
> > > > drm_i915_private *dev_priv,
> > > >   	*out_blocks = res_blocks;
> > > >   	*out_lines = res_lines;
> > > >   
> > > > +	if (level == 0)
> > > > +		skl_compute_plane_trans_wm(dev_priv, pipe_wm,
> > > > intel_pstate,
> > > > +						y_tile_minimum
> > > > ,
> > > > y_tile);
> > > >   	return 0;
> > > >   }
> > > >   
> > > > @@ -3738,11 +3780,13 @@ skl_compute_linetime_wm(struct
> > > > intel_crtc_state *cstate)
> > > >   }
> > > >   
> > > >   static void skl_compute_transition_wm(struct intel_crtc_state
> > > > *cstate,
> > > > -				      struct skl_wm_level
> > > > *trans_wm
> > > > /* out */)
> > > > +				      struct skl_wm_level
> > > > *trans_wm,
> > > > +				      struct
> > > > skl_ddb_allocation
> > > > *ddb)
> > > >   {
> > > >   	struct drm_crtc *crtc = cstate->base.crtc;
> > > >   	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> > > >   	struct intel_plane *intel_plane;
> > > > +	enum pipe pipe = to_intel_crtc(crtc)->pipe;
> > > >   
> > > >   	if (!cstate->base.active)
> > > >   		return;
> > > > @@ -3750,8 +3794,13 @@ static void
> > > > skl_compute_transition_wm(struct
> > > > intel_crtc_state *cstate,
> > > >   	/* Until we know more, just disable transition WMs */
> > > >   	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc,
> > > > intel_plane) {
> > > >   		int i = skl_wm_plane_id(intel_plane);
> > > > +		uint16_t plane_ddb = skl_ddb_entry_size(&ddb-
> > > > > 
> > > > > plane[pipe][i]);
> > > > +		uint16_t trans_res_blocks = trans_wm-
> > > > > 
> > > > > plane_res_b[i];
> > > >   
> > > > -		trans_wm->plane_en[i] = false;
> > > > +		if ((plane_ddb == 0) || (trans_res_blocks >
> > > > plane_ddb))
> > > > +			trans_wm->plane_en[i] = false;
> > > > +		else
> > > > +			trans_wm->plane_en[i] = true;
> > > >   	}
> > > >   }
> > > >   
> > > > @@ -3782,7 +3831,7 @@ static int skl_build_pipe_wm(struct
> > > > intel_crtc_state *cstate,
> > > >   
> > > >   	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
> > > >   
> > > > -	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
> > > > +	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm,
> > > > ddb);
> > > >   
> > > >   	return 0;
> > > >   }
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 0/9] New DDB Algo and WM fixes
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
                   ` (6 preceding siblings ...)
  2016-08-29 12:35 ` [PATCH] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
@ 2016-09-08 11:26 ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
                     ` (8 more replies)
  2016-09-08 11:55 ` ✗ Fi.CI.BAT: failure for Implement New DDB allocation algorithm Patchwork
  8 siblings, 9 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This series implements new DDB allocation algorithm to solve the cases,
where we have sufficient DDB available to enable multiple planes, But
due to the current algorithm not dividing it properly among planes, we
end-up failing the flip.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.
This series also implements Transition Watermarks and Gen-9 related
arbitrated display bandwidth Workarounds.

There are two steps in current WM programming.

1. Calculate minimum number of blocks required  for a WM level to be
enabled. For 1440x2560 panel we need 41 blocks as minimum number of
blocks to enable WM0. This is the step which doesn't use vertical size.
It only depends on Pipe drain rate and plane horizontal size as per the
current Bspec algorithm.
So all the plane below have minimum  number of blocks required to enable
WM0 as 41
    Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
    Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41

2. Number of blocks allotted by the driver
    Driver allocates  12 for Plane 3   &  16 for plane 4

    Total Dbuf Available = 508
    Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476
    allocate minimum blocks for each plane 8 * 4 = 32
    remaining blocks = 476 - 32 = 444
    Relative Data Rate for Planes
       Plane 1  =  1440 * 2560 * 3  =  11059200
       Plane 2  =  1440 * 2560 * 3  =  11059200
       Plane 3  =  1440 * 48   * 3  =  207360
       Plane 4  =  1440 * 96   * 3  =  414720
       Total Relative BW            =  22740480

-   Allocate Buffer
    buffer allocation = (Plane relative data rate / total data rate)
		    * total remaming DDB + minimum plane DDB
     Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
     Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16

In this case it forced driver to disable Plane 3 & 4. Driver need to use
more efficient way to allocate buffer that is optimum for power.

New Algorithm suggested by HW team is:

1. Calculate minimum buffer allocations for each plane and for each
    watermark level

2. Add minimum buffer allocations required for enabling WM7
    for all the planes

Level 0 =  41 + 41 + 41 + 41  = 164
Level 1 =  42 + 42 + 42 + 42  = 168
Level 2 =  42 + 42 + 42 + 42  = 168
Level 3 =  94 + 94 + 94 + 94 =  376
Level 4 =  94 + 94 + 94 + 94 =  376
Level 5 =  94 + 94 + 94 + 94 =  376
Level 6 =  94 + 94 + 94 + 94 =  376
Level 7 =  94 + 94 + 94 + 94 =  376

3. Check to see how many buffer allocation are left and enable
the best case. In this case since we have 476 blocks we can enable
WM0-7 on all 4 planes.
Let's say if we have only 200 block available then the best cases
allocation is to enable Level2 which requires 168 blocks

Changes since v1:
 - Rebased the series on top of Paulo's patches
 - Add changes to calculate WM in fixed point 16.16
 - Address review comments for Transition WM

Mahesh Kumar (9):
  drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm)
    functions
  drm/i915/skl+: use linetime latency instead of ddb size
  drm/i915/skl: New ddb allocation algorithm
  drm/i915: Decode system memory bandwidth
  drm/i915/gen9: WM memory bandwidth related workaround
  drm/i915/skl+: change WM calc to fixed point 16.16
  drm/i915/bxt: Enable IPC support
  drm/i915/bxt: set chicken bit as IPC y-tile WA
  drm/i915/bxt: Implement Transition WM

 drivers/gpu/drm/i915/i915_drv.c      | 101 +++++++++
 drivers/gpu/drm/i915/i915_drv.h      |  29 +++
 drivers/gpu/drm/i915/i915_params.c   |   5 +
 drivers/gpu/drm/i915/i915_params.h   |   1 +
 drivers/gpu/drm/i915/i915_reg.h      |  29 +++
 drivers/gpu/drm/i915/intel_display.c |  50 +++++
 drivers/gpu/drm/i915/intel_drv.h     |  12 +
 drivers/gpu/drm/i915/intel_pm.c      | 412 +++++++++++++++++++++++++++--------
 8 files changed, 551 insertions(+), 88 deletions(-)

-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch make use of plane_wm variable directly instead of passing
skl_plane_wm struct. this way reduces number of argument requirement
in watermark calculation functions.

It also gives more freedom of decision making to implement Bspec WM
workarounds.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 86c6d66..3fdec4d 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3542,9 +3542,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_plane_state *intel_pstate,
 				uint16_t ddb_allocation,
 				int level,
-				uint16_t *out_blocks, /* out */
-				uint8_t *out_lines, /* out */
-				bool *enabled /* out */)
+				struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
@@ -3557,6 +3555,11 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint32_t width = 0, height = 0;
 	uint32_t plane_pixel_rate;
 	uint32_t y_tile_minimum, y_min_scanlines;
+	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
+	struct skl_wm_level *result = &pipe_wm->wm[level];
+	uint16_t *out_blocks = &result->plane_res_b[id];
+	uint8_t *out_lines = &result->plane_res_l[id];
+	bool *enabled = &result->plane_en[id];
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
 		*enabled = false;
@@ -3673,7 +3676,7 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
-		     struct skl_wm_level *result)
+		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
@@ -3684,12 +3687,6 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
-	/*
-	 * We'll only calculate watermarks for planes that are actually
-	 * enabled, so make sure all other planes are set as disabled.
-	 */
-	memset(result, 0, sizeof(*result));
-
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
@@ -3727,9 +3724,7 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 					   intel_pstate,
 					   ddb_blocks,
 					   level,
-					   &result->plane_res_b[i],
-					   &result->plane_res_l[i],
-					   &result->plane_en[i]);
+					   pipe_wm);
 		if (ret)
 			return ret;
 	}
@@ -3777,9 +3772,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	int level, max_level = ilk_wm_max_level(dev);
 	int ret;
 
+	/*
+	 * We'll only calculate watermarks for planes that are actually
+	 * enabled, so make sure all other planes are set as disabled.
+	 */
+	memset(pipe_wm, 0, sizeof(*pipe_wm));
+
 	for (level = 0; level <= max_level; level++) {
 		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-					   level, &pipe_wm->wm[level]);
+					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/9] drm/i915/skl+: use linetime latency instead of ddb size
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch make changes to use linetime latency instead of allocated
DDB size during plane watermark calculation in switch case, This is
required to implement new DDB allocation algorithm.

In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to use during
switch-case for WM blocks selection.

Changes since v1:
 - Rebase on top of Paulo's patch series

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3fdec4d..cfd9b7d1 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3622,10 +3622,15 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
 		selected_result = max(method2, y_tile_minimum);
 	} else {
+		uint32_t linetime_us = 0;
+
+		linetime_us = DIV_ROUND_UP(width * 1000,
+				skl_pipe_pixel_rate(cstate));
+
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation / plane_blocks_per_line) >= 1)
+		if (latency >= linetime_us)
 			selected_result = min(method1, method2);
 		else
 			selected_result = method1;
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 3/9] drm/i915/skl: New ddb allocation algorithm
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch implements new DDB allocation algorithm as per HW team
suggestion. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.

Changes since v1:
 - Rebase on top of Paulo's patch series

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 129 +++++++++++++++++++++-------------------
 1 file changed, 67 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index cfd9b7d1..7c70e07 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3348,6 +3348,7 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
+		      struct skl_pipe_wm *pipe_wm,
 		      struct skl_ddb_allocation *ddb /* out */)
 {
 	struct drm_atomic_state *state = cstate->base.state;
@@ -3363,8 +3364,11 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	uint16_t *minimum = cstate->wm.skl.minimum_blocks;
 	uint16_t *y_minimum = cstate->wm.skl.minimum_y_blocks;
 	unsigned int total_data_rate;
+	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb = 0;
 	int num_active;
-	int id, i;
+	int max_level, level;
+	int id, i, ret = 0;
 
 	if (WARN_ON(!state))
 		return 0;
@@ -3380,6 +3384,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	alloc_size = skl_ddb_entry_size(alloc);
 	if (alloc_size == 0) {
 		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+		memset(ddb->y_plane[pipe], 0, sizeof(ddb->y_plane[pipe]));
 		return 0;
 	}
 
@@ -3413,19 +3418,42 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	}
 
 	for (i = 0; i < PLANE_CURSOR; i++) {
-		alloc_size -= minimum[i];
-		alloc_size -= y_minimum[i];
+		total_min_blocks += minimum[i];
+		total_min_blocks += y_minimum[i];
 	}
 
-	/*
-	 * 2. Distribute the remaining space in proportion to the amount of
-	 * data each plane needs to fetch from memory.
-	 *
-	 * FIXME: we may not allocate every single block here.
-	 */
+	for (level = ilk_wm_max_level(dev); level >= 0; level--) {
+		total_level_ddb = 0;
+		for (i = 0; i < PLANE_CURSOR; i++) {
+			/*
+			 * TODO: We should calculate watermark values for Y/UV
+			 * plane both in case of NV12 format and use both values
+			 * for ddb calculation, As NV12 is disabled as of now.
+			 * using only single plane value here.
+			 */
+			uint16_t min = minimum[i] + y_minimum[i];
+			uint16_t plane_level_ddb_wm =
+				max(pipe_wm->wm[level].plane_res_b[i], min);
+			total_level_ddb += plane_level_ddb_wm;
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		ret = -EINVAL;
+		goto exit;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	total_data_rate = skl_get_total_relative_data_rate(cstate);
 	if (total_data_rate == 0)
-		return 0;
+		goto exit;
 
 	start = alloc->start;
 	for_each_intel_plane_on_crtc(dev, intel_crtc, intel_plane) {
@@ -3440,7 +3468,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[id];
+		plane_blocks = max(pipe_wm->wm[max_level].plane_res_b[id],
+					minimum[id]);
 		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
 					total_data_rate);
 
@@ -3454,6 +3483,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we start calculating watermark values for Y/UV
+		 * plane both consider it for initial allowed wm blocks.
 		 */
 		y_data_rate = cstate->wm.skl.plane_y_data_rate[id];
 
@@ -3467,9 +3498,22 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		}
 
 		start += y_plane_blocks;
+
+		/*
+		 * Now enable all levels in WM structure which can be enabled
+		 * using current DDB allocation
+		 */
+		for (i = ilk_wm_max_level(dev); i > 0; i--) {
+			if (i > max_level || pipe_wm->wm[i].plane_res_l[id] > 31
+					|| pipe_wm->wm[i].plane_res_b[id] == 0)
+				pipe_wm->wm[i].plane_en[id] = false;
+			else
+				pipe_wm->wm[i].plane_en[id] = true;
+		}
 	}
 
-	return 0;
+exit:
+	return ret;
 }
 
 static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
@@ -3540,7 +3584,6 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				struct skl_pipe_wm *pipe_wm)
 {
@@ -3559,12 +3602,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
-	bool *enabled = &result->plane_en[id];
 
-	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
-		*enabled = false;
+	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
-	}
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
@@ -3649,54 +3689,27 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("Plane %d.%d: blocks required = %u/%u, lines required = %u/31\n",
-				      to_intel_crtc(cstate->base.crtc)->pipe,
-				      skl_wm_plane_id(to_intel_plane(pstate->plane)),
-				      res_blocks, ddb_allocation, res_lines);
-
-			return -EINVAL;
-		}
-	}
-
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     int level,
 		     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
 	struct drm_plane *plane;
 	struct intel_plane *intel_plane;
 	struct intel_plane_state *intel_pstate;
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
 	for_each_intel_plane_mask(&dev_priv->drm,
 				  intel_plane,
 				  cstate->base.plane_mask) {
-		int i = skl_wm_plane_id(intel_plane);
-
 		plane = &intel_plane->base;
 		intel_pstate = NULL;
 		if (state)
@@ -3722,12 +3735,9 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 
 		WARN_ON(!intel_pstate->base.fb);
 
-		ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][i]);
-
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   pipe_wm);
 		if (ret)
@@ -3784,11 +3794,15 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	memset(pipe_wm, 0, sizeof(*pipe_wm));
 
 	for (level = 0; level <= max_level; level++) {
-		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
+		ret = skl_compute_wm_level(dev_priv, cstate,
 					   level, pipe_wm);
 		if (ret)
 			return ret;
 	}
+	ret = skl_allocate_pipe_ddb(cstate, pipe_wm, ddb);
+	if (ret)
+		return ret;
+
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
 	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
@@ -3976,13 +3990,12 @@ pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_pipes(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct intel_crtc *intel_crtc;
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
 	uint32_t realloc_pipes = pipes_modified(state);
 	int ret;
 
@@ -4035,10 +4048,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
 
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
 		ret = drm_atomic_add_affected_planes(state, &intel_crtc->base);
 		if (ret)
 			return ret;
@@ -4092,19 +4101,15 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
 
 	/*
 	 * Calculate WM's for all pipes that are part of this transaction.
-	 * Note that the DDB allocation above may have added more CRTC's that
-	 * weren't otherwise being modified (and set bits in dirty_pipes) if
-	 * pipe allocations had to change.
-	 *
-	 * FIXME:  Now that we're doing this in the atomic check phase, we
-	 * should allow skl_update_pipe_wm() to return failure in cases where
-	 * no suitable watermark values can be found.
+	 * Note that affected pipe calculation above may have added more
+	 * CRTC's that weren't otherwise being modified (and set bits in
+	 * dirty_pipes) if pipe allocations had to change.
 	 */
 	for_each_crtc_in_state(state, crtc, cstate, i) {
 		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 4/9] drm/i915: Decode system memory bandwidth
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                     ` (2 preceding siblings ...)
  2016-09-08 11:26   ` [PATCH v2 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 5/9] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch adds support to decode system memory bandwidth
which will be used for arbitrated display memory percentage
calculation in GEN9 based system.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 96 +++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h | 18 ++++++++
 drivers/gpu/drm/i915/i915_reg.h | 25 +++++++++++
 3 files changed, 139 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 02c34d6..0a4f18d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -973,6 +973,96 @@ static void intel_sanitize_options(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n", yesno(i915.semaphores));
 }
 
+static void
+intel_get_memdev_info(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	uint32_t val = 0;
+	uint32_t mem_speed = 0;
+	uint8_t dram_type;
+	uint32_t dram_channel;
+	uint8_t num_channel;
+	bool rank_valid = false;
+
+	if (!IS_GEN9(dev_priv))
+		goto exit;
+
+	val = I915_READ(P_CR_MC_BIOS_REQ_0_0_0);
+	mem_speed = div_u64((uint64_t) (val & REQ_DATA_MASK) *
+			MEMORY_FREQ_MULTIPLIER, 1000);
+
+	if (mem_speed == 0)
+		goto exit;
+
+	dev_priv->memdev_info.valid = true;
+	dev_priv->memdev_info.mem_speed = mem_speed;
+	dram_type = (val >> DRAM_TYPE_SHIFT) & DRAM_TYPE_MASK;
+	dram_channel = (val >> DRAM_CHANNEL_SHIFT) & DRAM_CHANNEL_MASK;
+	num_channel = hweight32(dram_channel);
+
+	/*
+	 * The lpddr3 and lpddr4 technologies can have 1-4 channels and the
+	 * channels are 32bits wide; while ddr3l technologies can have 1-2
+	 * channels and the channels are 64 bits wide. But SV team found that in
+	 * case of single 64 bit wide DDR3L dimms two bits were set and system
+	 * with two DDR3L 64bit dimm all four bits were set.
+	 */
+
+	switch (dram_type) {
+	case DRAM_TYPE_LPDDR3:
+	case DRAM_TYPE_LPDDR4:
+		dev_priv->memdev_info.data_width = 4;
+		dev_priv->memdev_info.num_channel = num_channel;
+		break;
+	case DRAM_TYPE_DDR3L:
+		dev_priv->memdev_info.data_width = 8;
+		dev_priv->memdev_info.num_channel = num_channel / 2;
+		break;
+	default:
+		dev_priv->memdev_info.data_width = 4;
+		dev_priv->memdev_info.num_channel = num_channel;
+	}
+
+	/*
+	 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
+	 * all the dimms should have same rank as in first valid Dimm
+	 */
+#define D_CR_DRP0_DUNIT_INVALID	    0xFFFFFFFF
+
+	dev_priv->memdev_info.rank_valid = true;
+	if (I915_READ(D_CR_DRP0_DUNIT8) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT8);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT9) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT9);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT10) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT10);
+		rank_valid = true;
+	} else if (I915_READ(D_CR_DRP0_DUNIT11) != D_CR_DRP0_DUNIT_INVALID) {
+		val = I915_READ(D_CR_DRP0_DUNIT11);
+		rank_valid = true;
+	}
+#undef D_CR_DRP0_DUNIT_INVALID
+
+	if (rank_valid) {
+		dev_priv->memdev_info.rank_valid = true;
+		dev_priv->memdev_info.rank = (val & DRAM_RANK_MASK);
+	}
+
+	DRM_DEBUG_DRIVER("valid:%s speed-%d width-%d num_channel-%d\n",
+		dev_priv->memdev_info.valid ? "true" : "false",
+		dev_priv->memdev_info.mem_speed,
+		dev_priv->memdev_info.data_width,
+		dev_priv->memdev_info.num_channel);
+	DRM_DEBUG_DRIVER("rank_valid:%s rank-%d\n",
+		dev_priv->memdev_info.rank_valid ? "true" : "false",
+		dev_priv->memdev_info.rank);
+	return;
+exit:
+	dev_priv->memdev_info.valid = false;
+}
+
 /**
  * i915_driver_init_hw - setup state requiring device access
  * @dev_priv: device private
@@ -1076,6 +1166,12 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 			DRM_DEBUG_DRIVER("can't enable MSI");
 	}
 
+	/*
+	 * Fill the memdev structure to get the system raw bandwidth
+	 * This will be used by WM algorithm, to implement GEN9 based WA
+	 */
+	intel_get_memdev_info(dev);
+
 	return 0;
 
 out_ggtt:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f05869b..8236927 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2023,6 +2023,24 @@ struct drm_i915_private {
 		bool distrust_bios_wm;
 	} wm;
 
+	struct {
+		/*
+		 * memory device info
+		 * valid: memory info is valid or not
+		 * mem_speed: memory freq in KHz
+		 * channel_width: Channel width in bytes
+		 * num_channel: total number of channels
+		 * rank: 0-rank disable, 1-Single rank, 2-dual rank
+		 */
+		bool valid;
+		uint32_t mem_speed;
+		uint8_t data_width;
+		uint8_t num_channel;
+		bool rank_valid;
+		uint8_t rank;
+	} memdev_info;
+
+
 	struct i915_runtime_pm pm;
 
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a29d707..b38445c 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7716,6 +7716,31 @@ enum {
 #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
 #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
 
+#define P_CR_MC_BIOS_REQ_0_0_0		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
+#define REQ_DATA_MASK			(0x3F << 0)
+#define DRAM_TYPE_SHIFT			24
+#define DRAM_TYPE_MASK			0x7
+#define DRAM_CHANNEL_SHIFT		12
+#define DRAM_CHANNEL_MASK		0xF
+
+#define DRAM_TYPE_LPDDR3		0x1
+#define DRAM_TYPE_LPDDR4		0x2
+#define DRAM_TYPE_DDR3L			0x4
+/*
+ * BIOS programs this field of REQ_DATA [5:0] in integer
+ * multiple of 133330 KHz (133.33MHz)
+ */
+#define MEMORY_FREQ_MULTIPLIER		0x208D2
+#define D_CR_DRP0_DUNIT8		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1000)
+#define D_CR_DRP0_DUNIT9		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1200)
+#define D_CR_DRP0_DUNIT10		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1400)
+#define D_CR_DRP0_DUNIT11		_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x1600)
+#define D_CR_DRP0_RKEN0			(1 << 0)
+#define D_CR_DRP0_RKEN1			(1 << 1)
+#define DRAM_RANK_MASK			0x3
+#define DRAM_SINGLE_RANK		0x1
+#define DRAM_DUAL_RANK			0x3
+
 /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register,
  * since on HSW we can't write to it using I915_WRITE. */
 #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5F0C)
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 5/9] drm/i915/gen9: WM memory bandwidth related workaround
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                     ` (3 preceding siblings ...)
  2016-09-08 11:26   ` [PATCH v2 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 6/9] drm/i915/skl+: change WM calc to fixed point 16.16 Kumar, Mahesh
                     ` (3 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch implemnets Workariunds related to display arbitrated memory
bandwidth. These WA are applicabe for all gen-9 based platforms.

Changes since v1:
 - Rebase on top of Paulo's patch series

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  |   9 +++
 drivers/gpu/drm/i915/intel_drv.h |  11 +++
 drivers/gpu/drm/i915/intel_pm.c  | 145 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 165 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8236927..7964c0f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1061,6 +1061,13 @@ enum intel_sbi_destination {
 	SBI_MPHY,
 };
 
+/* SKL+ Watermark arbitrated display bandwidth Workarounds */
+enum watermark_memory_wa {
+	WATERMARK_WA_NONE,
+	WATERMARK_WA_X_TILED,
+	WATERMARK_WA_Y_TILED,
+};
+
 #define QUIRK_PIPEA_FORCE (1<<0)
 #define QUIRK_LVDS_SSC_DISABLE (1<<1)
 #define QUIRK_INVERT_BRIGHTNESS (1<<2)
@@ -1610,6 +1617,8 @@ struct skl_ddb_allocation {
 
 struct skl_wm_values {
 	unsigned dirty_pipes;
+	/* any WaterMark memory workaround Required */
+	enum watermark_memory_wa mem_wa;
 	struct skl_ddb_allocation ddb;
 	uint32_t wm_linetime[I915_MAX_PIPES];
 	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index ed6b53f..bd45a2c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1807,6 +1807,17 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
 	return to_intel_crtc_state(crtc_state);
 }
 
+static inline struct intel_crtc_state *
+intel_atomic_get_existing_crtc_state(struct drm_atomic_state *state,
+				      struct intel_crtc *crtc)
+{
+	struct drm_crtc_state *crtc_state;
+
+	crtc_state = drm_atomic_get_existing_crtc_state(state, &crtc->base);
+
+	return to_intel_crtc_state(crtc_state);
+}
+
 static inline struct intel_plane_state *
 intel_atomic_get_existing_plane_state(struct drm_atomic_state *state,
 				      struct intel_plane *plane)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 7c70e07..0ec328b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3589,6 +3589,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 {
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
+	struct intel_atomic_state *intel_state =
+			to_intel_atomic_state(cstate->base.state);
 	uint32_t latency = dev_priv->wm.skl_latency[level];
 	uint32_t method1, method2;
 	uint32_t plane_bytes_per_line, plane_blocks_per_line;
@@ -3602,10 +3604,17 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct skl_wm_level *result = &pipe_wm->wm[level];
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
+	enum watermark_memory_wa mem_wa;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
 
+	mem_wa = intel_state ? intel_state->wm_results.mem_wa : WATERMARK_WA_NONE;
+	if (mem_wa != WATERMARK_WA_NONE) {
+		if (fb->modifier[0] == I915_FORMAT_MOD_X_TILED)
+			latency += 15;
+	}
+
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
 	height = drm_rect_height(&intel_pstate->base.src) >> 16;
 
@@ -3637,6 +3646,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		y_min_scanlines = 4;
 	}
 
+	if (mem_wa == WATERMARK_WA_Y_TILED)
+		y_min_scanlines *= 2;
+
 	plane_bytes_per_line = width * cpp;
 	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
 	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
@@ -4041,6 +4053,15 @@ skl_include_affected_pipes(struct drm_atomic_state *state)
 		intel_state->wm_results.dirty_pipes = ~0;
 	}
 
+	/*
+	 * If Watermark workaround is changed we need to recalculate
+	 * watermark values for all active pipes
+	 */
+	if (intel_state->wm_results.mem_wa != dev_priv->wm.skl_hw.mem_wa) {
+		realloc_pipes = ~0;
+		intel_state->wm_results.dirty_pipes = ~0;
+	}
+
 	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
 		struct intel_crtc_state *cstate;
 
@@ -4057,6 +4078,128 @@ skl_include_affected_pipes(struct drm_atomic_state *state)
 }
 
 static void
+skl_set_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
+{
+	struct drm_device *dev = state->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_crtc *intel_crtc;
+	struct intel_plane_state *intel_pstate;
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	int num_active_plane, num_active_pipe;
+	uint32_t plane_bw, max_plane_bw, pipe_bw, max_pipe_bw;
+	uint32_t total_pipe_bw;
+	uint32_t system_bw = 0;
+	uint8_t num_channel, data_width, rank;
+	int x_tile_per;
+	int display_bw_per;
+	bool y_tile_enabled = false;
+
+	if (!dev_priv->memdev_info.valid)
+		goto exit;
+
+	num_channel = dev_priv->memdev_info.num_channel;
+	data_width = dev_priv->memdev_info.data_width;
+	system_bw = dev_priv->memdev_info.mem_speed * num_channel * data_width;
+
+	if (!system_bw)
+		goto exit;
+
+	max_pipe_bw = 0;
+	for_each_intel_crtc(dev, intel_crtc) {
+		struct intel_crtc_state *cstate;
+		struct intel_plane *plane;
+
+		/*
+		 * If CRTC is part of current atomic commit, get crtc state from
+		 * existing CRTC state. else take the cached CRTC state
+		 */
+		cstate = NULL;
+		if (state)
+			cstate = intel_atomic_get_existing_crtc_state(state,
+					intel_crtc);
+		if (!cstate)
+			cstate = to_intel_crtc_state(intel_crtc->base.state);
+
+		if (!cstate->base.active)
+			continue;
+
+		num_active_plane = 0;
+		max_plane_bw = 0;
+		for_each_intel_plane_mask(dev, plane, cstate->base.plane_mask) {
+			struct drm_framebuffer *fb = NULL;
+
+			intel_pstate = NULL;
+			if (state)
+				intel_pstate =
+				intel_atomic_get_existing_plane_state(state,
+									plane);
+			if (!intel_pstate)
+				intel_pstate =
+					to_intel_plane_state(plane->base.state);
+
+			WARN_ON(!intel_pstate->base.fb);
+
+			if (!intel_pstate->base.visible)
+				continue;
+
+			fb = intel_pstate->base.fb;
+			if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
+				fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED))
+				y_tile_enabled = true;
+
+			plane_bw = skl_adjusted_plane_pixel_rate(cstate,
+								intel_pstate);
+			max_plane_bw = max(plane_bw, max_plane_bw);
+			num_active_plane++;
+		}
+		pipe_bw = max_plane_bw * num_active_plane;
+		max_pipe_bw = max(pipe_bw, max_pipe_bw);
+	}
+
+	if (intel_state->active_pipe_changes)
+		num_active_pipe = hweight32(intel_state->active_crtcs);
+	else
+		num_active_pipe = hweight32(dev_priv->active_crtcs);
+
+	total_pipe_bw = max_pipe_bw * num_active_pipe;
+
+	display_bw_per = DIV_ROUND_UP_ULL(total_pipe_bw * 100, system_bw * 1000);
+
+	/*
+	 * If there is any Ytile plane enabled and arbitrated display
+	 * bandwidth > 20% of raw system memory bandwidth
+	 * Enale Y-tile related WA
+	 *
+	 * If memory is dual channel single rank, Xtile limit = 35%, else Xtile
+	 * limit = 60%
+	 * If there is no Ytile plane enabled and
+	 * arbitrated display bandwidth > Xtile limit
+	 * Enable X-tile realated WA
+	 */
+	if (y_tile_enabled && (display_bw_per > 20))
+		intel_state->wm_results.mem_wa = WATERMARK_WA_Y_TILED;
+	else {
+
+		if (dev_priv->memdev_info.rank_valid)
+			rank = dev_priv->memdev_info.rank;
+		else
+			rank = DRAM_DUAL_RANK; /* Assume we are dual rank */
+
+		if ((rank == DRAM_SINGLE_RANK) && (num_channel == 2))
+			x_tile_per = 35;
+		else
+			x_tile_per = 60;
+
+		if (display_bw_per > x_tile_per)
+			intel_state->wm_results.mem_wa = WATERMARK_WA_X_TILED;
+	}
+	return;
+
+exit:
+	intel_state->wm_results.mem_wa = WATERMARK_WA_NONE;
+}
+
+static void
 skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 		     struct skl_wm_values *src,
 		     enum pipe pipe)
@@ -4101,6 +4244,8 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
+	skl_set_memory_bandwidth_wm_wa(state);
+
 	ret = skl_include_affected_pipes(state);
 	if (ret)
 		return ret;
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 6/9] drm/i915/skl+: change WM calc to fixed point 16.16
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                     ` (4 preceding siblings ...)
  2016-09-08 11:26   ` [PATCH v2 5/9] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 7/9] drm/i915/bxt: Enable IPC support Kumar, Mahesh
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch changes Watermak calculation to fixed point calculation.
Problem with current calculation is during plane_blocks_per_line
calculation we divide intermediate blocks with min_scanlines and
takes floor of the result because of integer operation.
hence we end-up assigning less blocks than required. Which leads to
flickers.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0ec328b..d4390e4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3530,13 +3530,15 @@ static uint32_t skl_pipe_pixel_rate(const struct intel_crtc_state *config)
 */
 static uint32_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp, uint32_t latency)
 {
-	uint32_t wm_intermediate_val, ret;
+	uint64_t wm_intermediate_val;
+	uint32_t ret;
 
 	if (latency == 0)
 		return UINT_MAX;
 
-	wm_intermediate_val = latency * pixel_rate * cpp / 512;
-	ret = DIV_ROUND_UP(wm_intermediate_val, 1000);
+	wm_intermediate_val = latency * pixel_rate * cpp;
+	wm_intermediate_val <<= 16;
+	ret = DIV_ROUND_UP_ULL(wm_intermediate_val, 1000 * 512);
 
 	return ret;
 }
@@ -3605,6 +3607,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint16_t *out_blocks = &result->plane_res_b[id];
 	uint8_t *out_lines = &result->plane_res_l[id];
 	enum watermark_memory_wa mem_wa;
+	bool y_tiled = false;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
@@ -3652,14 +3655,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	plane_bytes_per_line = width * cpp;
 	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
 	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
+		y_tiled = true;
 		plane_blocks_per_line =
 		      DIV_ROUND_UP(plane_bytes_per_line * y_min_scanlines, 512);
-		plane_blocks_per_line /= y_min_scanlines;
+		plane_blocks_per_line = (plane_blocks_per_line << 16) /
+								y_min_scanlines;
 	} else if (fb->modifier[0] == DRM_FORMAT_MOD_NONE) {
 		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512)
 					+ 1;
+		plane_blocks_per_line <<= 16;
 	} else {
 		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
+		plane_blocks_per_line <<= 16;
 	}
 
 	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
@@ -3670,8 +3677,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
 
-	if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
-	    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
+	if (y_tiled) {
 		selected_result = max(method2, y_tile_minimum);
 	} else {
 		uint32_t linetime_us = 0;
@@ -3688,12 +3694,11 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 			selected_result = method1;
 	}
 
-	res_blocks = selected_result + 1;
+	res_blocks = DIV_ROUND_UP(selected_result, 1 << 16) + 1;
 	res_lines = DIV_ROUND_UP(selected_result, plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
-		if (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
-		    fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED) {
+		if (y_tiled) {
 			res_blocks += y_tile_minimum;
 			res_lines += y_min_scanlines;
 		} else {
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 7/9] drm/i915/bxt: Enable IPC support
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                     ` (5 preceding siblings ...)
  2016-09-08 11:26   ` [PATCH v2 6/9] drm/i915/skl+: change WM calc to fixed point 16.16 Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 9/9] drm/i915/bxt: Implement Transition WM Kumar, Mahesh
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch adds IPC support for platforms. This patch enables IPC
only for BXT platform as for SKL recommendation is to keep is disabled
This patch also adds kernel command-line option to enable/disable
the IPC if required.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c    |  5 +++++
 drivers/gpu/drm/i915/i915_params.c |  5 +++++
 drivers/gpu/drm/i915/i915_params.h |  1 +
 drivers/gpu/drm/i915/i915_reg.h    |  1 +
 drivers/gpu/drm/i915/intel_drv.h   |  1 +
 drivers/gpu/drm/i915/intel_pm.c    | 21 +++++++++++++++++++++
 6 files changed, 34 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0a4f18d..22d84e6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1335,6 +1335,11 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	intel_runtime_pm_enable(dev_priv);
 
+	/*
+	 * Now enable the IPC for supported platforms
+	 */
+	intel_enable_ipc(dev_priv);
+
 	/* Everything is in place, we can now relax! */
 	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
 		 driver.name, driver.major, driver.minor, driver.patchlevel,
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 768ad89..cc41b8d 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -62,6 +62,7 @@ struct i915_params i915 __read_mostly = {
 	.inject_load_failure = 0,
 	.enable_dpcd_backlight = false,
 	.enable_gvt = false,
+	.enable_ipc = true,
 };
 
 module_param_named(modeset, i915.modeset, int, 0400);
@@ -233,3 +234,7 @@ MODULE_PARM_DESC(enable_dpcd_backlight,
 module_param_named(enable_gvt, i915.enable_gvt, bool, 0400);
 MODULE_PARM_DESC(enable_gvt,
 	"Enable support for Intel GVT-g graphics virtualization host support(default:false)");
+
+module_param_named(enable_ipc, i915.enable_ipc, bool, 0400);
+MODULE_PARM_DESC(enable_ipc,
+		"Enable Isochronous Priority Control (default:true)");
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 3a0dd78..f99b9b9 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -65,6 +65,7 @@ struct i915_params {
 	bool enable_dp_mst;
 	bool enable_dpcd_backlight;
 	bool enable_gvt;
+	bool enable_ipc;
 };
 
 extern struct i915_params i915 __read_mostly;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b38445c..75b5b52 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6139,6 +6139,7 @@ enum {
 #define  DISP_FBC_WM_DIS		(1<<15)
 #define DISP_ARB_CTL2	_MMIO(0x45004)
 #define  DISP_DATA_PARTITION_5_6	(1<<6)
+#define  DISP_IPC_ENABLE		(1<<3)
 #define DBUF_CTL	_MMIO(0x45008)
 #define  DBUF_POWER_REQUEST		(1<<31)
 #define  DBUF_POWER_STATE		(1<<30)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index bd45a2c..4645d6e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1760,6 +1760,7 @@ void skl_write_plane_wm(struct intel_crtc *intel_crtc,
 uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+void intel_enable_ipc(struct drm_i915_private *dev_priv);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d4390e4..8d0037c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4793,6 +4793,27 @@ void intel_update_watermarks(struct drm_crtc *crtc)
 }
 
 /*
+ * enable IPC for Supported Platforms
+ */
+void intel_enable_ipc(struct drm_i915_private *dev_priv)
+{
+	u32 val;
+
+	/* enable IPC only for Broxton for now*/
+	if (!IS_BROXTON(dev_priv))
+		return;
+
+	val = I915_READ(DISP_ARB_CTL2);
+
+	if (i915.enable_ipc)
+		val |= DISP_IPC_ENABLE;
+	else
+		val &= ~DISP_IPC_ENABLE;
+
+	I915_WRITE(DISP_ARB_CTL2, val);
+}
+
+/*
  * Lock protecting IPS related data structures
  */
 DEFINE_SPINLOCK(mchdev_lock);
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                     ` (6 preceding siblings ...)
  2016-09-08 11:26   ` [PATCH v2 7/9] drm/i915/bxt: Enable IPC support Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  2016-09-08 11:26   ` [PATCH v2 9/9] drm/i915/bxt: Implement Transition WM Kumar, Mahesh
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

It implements the WA to enable IDLE_WAKEMEM bit of CHICKEN_DCPR_1
register for Broxton platform. When IPC is enabled & Y-tile is
enabled in any of the enabled plane, above bit should be set.
Without this WA system observes random hang.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |  2 ++
 drivers/gpu/drm/i915/i915_reg.h      |  3 +++
 drivers/gpu/drm/i915/intel_display.c | 50 ++++++++++++++++++++++++++++++++++++
 3 files changed, 55 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7964c0f..c94eedb 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1619,6 +1619,8 @@ struct skl_wm_values {
 	unsigned dirty_pipes;
 	/* any WaterMark memory workaround Required */
 	enum watermark_memory_wa mem_wa;
+	/* IPC Y-tiled WA related member */
+	u32 y_plane_mask;
 	struct skl_ddb_allocation ddb;
 	uint32_t wm_linetime[I915_MAX_PIPES];
 	uint32_t plane[I915_MAX_PIPES][I915_MAX_PLANES][8];
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 75b5b52..210d9b0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5658,6 +5658,9 @@ enum {
 #define PLANE_NV12_BUF_CFG(pipe, plane)	\
 	_MMIO_PLANE(plane, _PLANE_NV12_BUF_CFG_1(pipe), _PLANE_NV12_BUF_CFG_2(pipe))
 
+#define CHICKEN_DCPR_1             _MMIO(0x46430)
+#define IDLE_WAKEMEM_MASK          (1 << 13)
+
 /* SKL new cursor registers */
 #define _CUR_BUF_CFG_A				0x7017c
 #define _CUR_BUF_CFG_B				0x7117c
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a6c3caa..c75685d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12429,6 +12429,8 @@ int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
 	bool is_crtc_enabled = crtc_state->active;
 	bool turn_off, turn_on, visible, was_visible;
 	struct drm_framebuffer *fb = plane_state->fb;
+	struct intel_atomic_state *intel_state =
+				to_intel_atomic_state(plane_state->state);
 	int ret;
 
 	if (INTEL_GEN(dev) >= 9 && plane->type != DRM_PLANE_TYPE_CURSOR) {
@@ -12515,6 +12517,15 @@ int intel_plane_atomic_calc_changes(struct drm_crtc_state *crtc_state,
 	    !needs_scaling(old_plane_state))
 		pipe_config->disable_lp_wm = true;
 
+	if (fb && (fb->modifier[0] == I915_FORMAT_MOD_Y_TILED ||
+			fb->modifier[0] == I915_FORMAT_MOD_Yf_TILED)) {
+		intel_state->wm_results.y_plane_mask |=
+						(1 << drm_plane_index(plane));
+	} else {
+		intel_state->wm_results.y_plane_mask &=
+						~(1 << drm_plane_index(plane));
+	}
+
 	return 0;
 }
 
@@ -13985,6 +13996,10 @@ static int intel_atomic_check(struct drm_device *dev,
 	if (ret)
 		return ret;
 
+	/* Copy the Y-tile WA related states */
+	intel_state->wm_results.y_plane_mask =
+				dev_priv->wm.skl_results.y_plane_mask;
+
 	for_each_crtc_in_state(state, crtc, crtc_state, i) {
 		struct intel_crtc_state *pipe_config =
 			to_intel_crtc_state(crtc_state);
@@ -14226,6 +14241,32 @@ static void intel_update_crtcs(struct drm_atomic_state *state,
 	}
 }
 
+/*
+ * GEN9 IPC WA for Y-tiled
+ */
+void bxt_set_ipc_wa(struct drm_i915_private *dev_priv, bool enable)
+{
+	u32 val;
+
+	if (!IS_BROXTON(dev_priv) || !i915.enable_ipc)
+		return;
+
+	val = I915_READ(CHICKEN_DCPR_1);
+	/*
+	 * If WA is already enabled or disabled
+	 * no need to re-enable or disable it.
+	 */
+	if ((enable && (val & IDLE_WAKEMEM_MASK)) ||
+		(!enable && !(val & IDLE_WAKEMEM_MASK)))
+		return;
+
+	if (enable)
+		val |= IDLE_WAKEMEM_MASK;
+	else
+		val &= ~IDLE_WAKEMEM_MASK;
+	I915_WRITE(CHICKEN_DCPR_1, val);
+}
+
 static void skl_update_crtcs(struct drm_atomic_state *state,
 			     unsigned int *crtc_vblank_mask)
 {
@@ -14241,6 +14282,12 @@ static void skl_update_crtcs(struct drm_atomic_state *state,
 	enum pipe pipe;
 
 	/*
+	 * If IPC WA need to be enabled, enable it now
+	 */
+	if (intel_state->wm_results.y_plane_mask)
+		bxt_set_ipc_wa(dev_priv, true);
+
+	/*
 	 * Whenever the number of active pipes changes, we need to make sure we
 	 * update the pipes in the right order so that their ddb allocations
 	 * never overlap with eachother inbetween CRTC updates. Otherwise we'll
@@ -14283,6 +14330,9 @@ static void skl_update_crtcs(struct drm_atomic_state *state,
 			progress = true;
 		}
 	} while (progress);
+
+	if (!intel_state->wm_results.y_plane_mask)
+		bxt_set_ipc_wa(dev_priv, false);
 }
 
 static void intel_atomic_commit_tail(struct drm_atomic_state *state)
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 9/9] drm/i915/bxt: Implement Transition WM
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
                     ` (7 preceding siblings ...)
  2016-09-08 11:26   ` [PATCH v2 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA Kumar, Mahesh
@ 2016-09-08 11:26   ` Kumar, Mahesh
  8 siblings, 0 replies; 28+ messages in thread
From: Kumar, Mahesh @ 2016-09-08 11:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni

From: Mahesh Kumar <mahesh1.kumar@intel.com>

This patch enables Transition WM for SKL+ platforms.

Transition WM are used if IPC is enabled, to decide, number of blocks
to be fetched before reducing the priority of display to fetch from
memory.

Changes since v1:
 - Don't enable transition WM for GEN9 (Art/Paulo)
 - Rebase to use fixed point 16.16 calculation
 - Fix the use of selected result from latency level-0

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 60 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8d0037c..f6a3fab 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2862,6 +2862,8 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 
 #define SKL_DDB_SIZE		896	/* in blocks */
 #define BXT_DDB_SIZE		512
+#define SKL_TRANS_WM_AMOUNT	10	/* tunable value */
+#define SKL_TRANS_WM_MIN	14
 #define SKL_SAGV_BLOCK_TIME	30 /* µs */
 
 /*
@@ -3583,6 +3585,48 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	return pixel_rate;
 }
 
+static void skl_compute_plane_trans_wm(const struct drm_i915_private *dev_priv,
+					struct skl_pipe_wm *pipe_wm,
+					struct intel_plane_state *intel_pstate,
+					uint32_t selected_result,
+					uint32_t y_tile_min,
+					bool y_tile)
+{
+	struct drm_plane_state *pstate = &intel_pstate->base;
+	int id = skl_wm_plane_id(to_intel_plane(pstate->plane));
+	uint16_t *out_blocks = &pipe_wm->trans_wm.plane_res_b[id];
+	uint8_t *out_lines = &pipe_wm->trans_wm.plane_res_l[id];
+	uint32_t trans_min = 0, trans_offset_blocks;
+	uint32_t trans_y_tile_min = 0, res_blocks = 0;
+	uint16_t trans_res_blocks = 0;
+
+	/* Keep Trans WM disabled for GEN9 */
+	if (IS_GEN9(dev_priv))
+		goto exit;
+
+	trans_min = SKL_TRANS_WM_MIN;
+
+	trans_offset_blocks = (trans_min + SKL_TRANS_WM_AMOUNT) << 16;
+
+	if (y_tile) {
+		trans_y_tile_min = 2 * y_tile_min;
+		res_blocks = max(trans_y_tile_min, selected_result) +
+			trans_offset_blocks;
+	} else {
+		res_blocks = selected_result + trans_offset_blocks;
+	}
+
+	trans_res_blocks = DIV_ROUND_UP(res_blocks, 1 << 16) + 1;
+
+	/* WA BUG:1938466 add one block for non y-tile planes */
+	if (!y_tile)
+		trans_res_blocks += 1;
+exit:
+	*out_blocks = trans_res_blocks;
+	*out_lines = 0;
+}
+
+
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				struct intel_plane_state *intel_pstate,
@@ -3709,6 +3753,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
 
+	if (level == 0)
+		skl_compute_plane_trans_wm(dev_priv, pipe_wm, intel_pstate,
+				selected_result, y_tile_minimum, y_tiled);
 	return 0;
 }
 
@@ -3778,11 +3825,13 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 }
 
 static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
-				      struct skl_wm_level *trans_wm /* out */)
+				      struct skl_wm_level *trans_wm, /* out */
+				      struct skl_ddb_allocation *ddb)
 {
 	struct drm_crtc *crtc = cstate->base.crtc;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct intel_plane *intel_plane;
+	enum pipe pipe = to_intel_crtc(crtc)->pipe;
 
 	if (!cstate->base.active)
 		return;
@@ -3790,8 +3839,13 @@ static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
 	/* Until we know more, just disable transition WMs */
 	for_each_intel_plane_on_crtc(crtc->dev, intel_crtc, intel_plane) {
 		int i = skl_wm_plane_id(intel_plane);
+		uint16_t plane_ddb = skl_ddb_entry_size(&ddb->plane[pipe][i]);
+		uint16_t trans_res_blocks = trans_wm->plane_res_b[i];
 
-		trans_wm->plane_en[i] = false;
+		if ((plane_ddb > 0) && (trans_res_blocks > plane_ddb))
+			trans_wm->plane_en[i] = false;
+		else
+			trans_wm->plane_en[i] = true;
 	}
 }
 
@@ -3822,7 +3876,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
 
-	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm);
+	skl_compute_transition_wm(cstate, &pipe_wm->trans_wm, ddb);
 
 	return 0;
 }
-- 
2.8.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* ✗ Fi.CI.BAT: failure for Implement New DDB allocation algorithm
  2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
                   ` (7 preceding siblings ...)
  2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
@ 2016-09-08 11:55 ` Patchwork
  8 siblings, 0 replies; 28+ messages in thread
From: Patchwork @ 2016-09-08 11:55 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Implement New DDB allocation algorithm
URL   : https://patchwork.freedesktop.org/series/11720/
State : failure

== Summary ==

Series 11720v1 Implement New DDB allocation algorithm
http://patchwork.freedesktop.org/api/1.0/series/11720/revisions/1/mbox/

Test drv_module_reload_basic:
                skip       -> PASS       (fi-skl-6260u)
Test gem_exec_suspend:
        Subgroup basic-s3:
                pass       -> INCOMPLETE (fi-hsw-4770k)
Test kms_cursor_legacy:
        Subgroup basic-cursor-vs-flip-legacy:
                pass       -> FAIL       (fi-bsw-n3050)

fi-bdw-5557u     total:252  pass:236  dwarn:0   dfail:0   fail:1   skip:15 
fi-bsw-n3050     total:252  pass:204  dwarn:0   dfail:0   fail:2   skip:46 
fi-byt-n2820     total:252  pass:209  dwarn:0   dfail:0   fail:2   skip:41 
fi-hsw-4770k     total:107  pass:91   dwarn:0   dfail:0   fail:0   skip:15 
fi-hsw-4770r     total:252  pass:225  dwarn:0   dfail:0   fail:1   skip:26 
fi-ilk-650       total:252  pass:182  dwarn:0   dfail:0   fail:3   skip:67 
fi-ivb-3520m     total:252  pass:220  dwarn:0   dfail:0   fail:1   skip:31 
fi-ivb-3770      total:252  pass:220  dwarn:0   dfail:0   fail:1   skip:31 
fi-skl-6260u     total:252  pass:237  dwarn:0   dfail:0   fail:1   skip:14 
fi-skl-6700k     total:252  pass:222  dwarn:1   dfail:0   fail:1   skip:28 
fi-snb-2520m     total:252  pass:206  dwarn:0   dfail:0   fail:1   skip:45 
fi-snb-2600      total:252  pass:206  dwarn:0   dfail:0   fail:1   skip:45 

Results at /archive/results/CI_IGT_test/Patchwork_2492/

5986f290e25f42d3d5df390411cc43683deb1301 drm-intel-nightly: 2016y-09m-08d-09h-11m-50s UTC integration manifest
e9d28fd drm/i915: Decode system memory bandwidth
c792093 drm/i915/skl: New ddb allocation algorithm
b4b85c4 drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions
0967750 drm/i915/skl+: use linetime latency instead of ddb size
d35a2a1 drm/i915/hsw+: set intel_crtc active once pipe is active

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2016-09-08 11:55 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-29 12:35 [PATCH 0/7] Implement New DDB allocation algorithm Kumar, Mahesh
2016-08-29 12:35 ` [PATCH 1/7] drm/i915/hsw+: set intel_crtc active once pipe is active Kumar, Mahesh
2016-08-30 19:18   ` Zanoni, Paulo R
2016-08-31 10:51   ` Maarten Lankhorst
2016-08-31 14:04     ` Mahesh Kumar
2016-08-29 12:35 ` [PATCH 2/7] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
2016-08-29 12:35 ` [PATCH 3/7] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
2016-08-29 12:35 ` [PATCH 4/7] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
2016-08-31 13:38   ` Maarten Lankhorst
2016-08-31 14:18     ` Mahesh Kumar
2016-08-29 12:35 ` [PATCH 5/7] drm/i915: Decode system memory bandwidth Kumar, Mahesh
2016-08-29 12:35 ` [PATCH] FOR_UPSTREAM [VPG]: drm/i915/skl+: Implement Transition WM Kumar, Mahesh
2016-08-30 19:32   ` Zanoni, Paulo R
2016-08-31 13:47     ` Zanoni, Paulo R
2016-09-02 11:46       ` Mahesh Kumar
2016-09-02 12:21         ` Zanoni, Paulo R
2016-08-29 12:35 ` [PATCH] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
2016-09-08 11:26 ` [PATCH v2 0/9] New DDB Algo and WM fixes Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 1/9] drm/i915/skl: pass pipe_wm in skl_compute_(wm_level/plane_wm) functions Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 2/9] drm/i915/skl+: use linetime latency instead of ddb size Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 3/9] drm/i915/skl: New ddb allocation algorithm Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 4/9] drm/i915: Decode system memory bandwidth Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 5/9] drm/i915/gen9: WM memory bandwidth related workaround Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 6/9] drm/i915/skl+: change WM calc to fixed point 16.16 Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 7/9] drm/i915/bxt: Enable IPC support Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 8/9] drm/i915/bxt: set chicken bit as IPC y-tile WA Kumar, Mahesh
2016-09-08 11:26   ` [PATCH v2 9/9] drm/i915/bxt: Implement Transition WM Kumar, Mahesh
2016-09-08 11:55 ` ✗ Fi.CI.BAT: failure for Implement New DDB allocation algorithm Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.