All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Implement DDB algorithm and WM cleanup
@ 2017-02-28 11:31 Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 1/8] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
                   ` (8 more replies)
  0 siblings, 9 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This series implements new DDB allocation algorithm to solve the cases,
where we have sufficient DDB available to enable multiple planes, But
due to the current algorithm not dividing it properly among planes, we
end-up failing the flip.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.

There are two steps in current WM programming.

1. Calculate minimum number of blocks required  for a WM level to be
enabled. For 1440x2560 panel we need 41 blocks as minimum number of
blocks to enable WM0. This is the step which doesn't use vertical size.
It only depends on Pipe drain rate and plane horizontal size as per the
current Bspec algorithm.
So all the plane below have minimum  number of blocks required to enable
WM0 as 41
    Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
    Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41

2. Number of blocks allotted by the driver
    Driver allocates  12 for Plane 3   &  16 for plane 4

    Total Dbuf Available = 508
    Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476
    allocate minimum blocks for each plane 8 * 4 = 32
    remaining blocks = 476 - 32 = 444
    Relative Data Rate for Planes
       Plane 1  =  1440 * 2560 * 3  =  11059200
       Plane 2  =  1440 * 2560 * 3  =  11059200
       Plane 3  =  1440 * 48   * 3  =  207360
       Plane 4  =  1440 * 96   * 3  =  414720
       Total Relative BW            =  22740480

-   Allocate Buffer
    buffer allocation = (Plane relative data rate / total data rate)
		    * total remaming DDB + minimum plane DDB
     Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
     Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16

In this case it forced driver to disable Plane 3 & 4. Driver need to use
more efficient way to allocate buffer that is optimum for power.

New Algorithm suggested by HW team is:

1. Calculate minimum buffer allocations for each plane and for each
    watermark level

2. Add minimum buffer allocations required for enabling WM7
    for all the planes

Level 0 =  41 + 41 + 41 + 41  = 164
Level 1 =  42 + 42 + 42 + 42  = 168
Level 2 =  42 + 42 + 42 + 42  = 168
Level 3 =  94 + 94 + 94 + 94 =  376
Level 4 =  94 + 94 + 94 + 94 =  376
Level 5 =  94 + 94 + 94 + 94 =  376
Level 6 =  94 + 94 + 94 + 94 =  376
Level 7 =  94 + 94 + 94 + 94 =  376

3. Check to see how many buffer allocation are left and enable
the best case. In this case since we have 476 blocks we can enable
WM0-7 on all 4 planes.
Let's say if we have only 200 block available then the best cases
allocation is to enable Level2 which requires 168 blocks

Mahesh Kumar (8):
  drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed
    point
  drm/i915/skl+: use linetime latency if ddb size is not available
  drm/i915/skl: Fail the flip if no FB for WM calculation
  drm/i915/skl+: no need to memset again
  drm/i915/skl+: ddb min requirement may exceed allocation
  drm/i915/skl+: Watermark calculation cleanup
  drm/i915/skl: New ddb allocation algorithm
  drm/i915/skl+: consider max supported plane pixel rate while scaling

 drivers/gpu/drm/i915/i915_drv.h      |  52 +++-
 drivers/gpu/drm/i915/intel_display.c |   3 +
 drivers/gpu/drm/i915/intel_drv.h     |   2 +
 drivers/gpu/drm/i915/intel_pm.c      | 511 +++++++++++++++++++++++------------
 4 files changed, 384 insertions(+), 184 deletions(-)

-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/8] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-03-16 18:48   ` Paulo Zanoni
  2017-02-28 11:31 ` [PATCH 2/8] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch make changes to calculate adjusted plane pixel rate &
plane downscale amount using fixed_point functions available. This also
adds few fixed point function to facilitate calculation.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 34 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++---------------------
 2 files changed, 51 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index eed9ead1b592..f26f61b0e7c8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -170,8 +170,8 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
 	return max;
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
-							  uint32_t d)
+static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val,
+						 uint32_t d)
 {
 	uint_fixed_16_16_t fp, res;
 
@@ -180,8 +180,8 @@ static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
 	return res;
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div_round_up_u64(uint32_t val,
-							      uint32_t d)
+static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val,
+						     uint32_t d)
 {
 	uint_fixed_16_16_t res;
 	uint64_t interm_val;
@@ -206,6 +206,32 @@ static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
 	return fp;
 }
 
+static inline uint32_t mul_u32_fixed_16_16_round_up(uint32_t val,
+						    uint_fixed_16_16_t mul)
+{
+	uint64_t intermediate_val;
+	uint32_t result;
+
+	intermediate_val = (uint64_t) val * mul.val;
+	intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
+	WARN_ON(intermediate_val >> 32);
+	result = clamp_t(uint32_t, intermediate_val, 0, ~0);
+	return result;
+}
+
+static inline uint_fixed_16_16_t mul_fixed_16_16(uint_fixed_16_16_t val,
+						 uint_fixed_16_16_t mul)
+{
+	uint64_t intermediate_val;
+	uint_fixed_16_16_t fp;
+
+	intermediate_val = (uint64_t) val.val * mul.val;
+	intermediate_val = intermediate_val >> 16;
+	WARN_ON(intermediate_val >> 32);
+	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
+	return fp;
+}
+
 static inline const char *yesno(bool v)
 {
 	return v ? "yes" : "no";
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 169c4908ad5b..09562d86c0fb 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3161,28 +3161,30 @@ void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
  * Return value is provided in 16.16 fixed point form to retain fractional part.
  * Caller should take care of dividing & rounding off the value.
  */
-static uint32_t
+static uint_fixed_16_16_t
 skl_plane_downscale_amount(const struct intel_plane_state *pstate)
 {
-	uint32_t downscale_h, downscale_w;
 	uint32_t src_w, src_h, dst_w, dst_h;
+	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+	uint_fixed_16_16_t downscale_h, downscale_w;
 
 	if (WARN_ON(!pstate->base.visible))
-		return DRM_PLANE_HELPER_NO_SCALING;
+		return u32_to_fixed_16_16(0);
 
-	/* n.b., src is 16.16 fixed point, dst is whole integer */
-	src_w = drm_rect_width(&pstate->base.src);
-	src_h = drm_rect_height(&pstate->base.src);
+	src_w = drm_rect_width(&pstate->base.src) >> 16;
+	src_h = drm_rect_height(&pstate->base.src) >> 16;
 	dst_w = drm_rect_width(&pstate->base.dst);
 	dst_h = drm_rect_height(&pstate->base.dst);
 	if (drm_rotation_90_or_270(pstate->base.rotation))
 		swap(dst_w, dst_h);
 
-	downscale_h = max(src_h / dst_h, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
-	downscale_w = max(src_w / dst_w, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
+	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+	downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+	downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
 
 	/* Provide result in 16.16 fixed point */
-	return (uint64_t)downscale_w * downscale_h >> 16;
+	return mul_fixed_16_16(downscale_w, downscale_h);
 }
 
 static unsigned int
@@ -3191,10 +3193,11 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 			     int y)
 {
 	struct intel_plane_state *intel_pstate = to_intel_plane_state(pstate);
-	uint32_t down_scale_amount, data_rate;
+	uint32_t data_rate;
 	uint32_t width = 0, height = 0;
 	struct drm_framebuffer *fb;
 	u32 format;
+	uint_fixed_16_16_t downscale_amount;
 
 	if (!intel_pstate->base.visible)
 		return 0;
@@ -3226,9 +3229,9 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 		data_rate = width * height * fb->format->cpp[0];
 	}
 
-	down_scale_amount = skl_plane_downscale_amount(intel_pstate);
+	downscale_amount = skl_plane_downscale_amount(intel_pstate);
 
-	return (uint64_t)data_rate * down_scale_amount >> 16;
+	return mul_u32_fixed_16_16_round_up(data_rate, downscale_amount);
 }
 
 /*
@@ -3488,7 +3491,7 @@ static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
 		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate * cpp;
-	ret = fixed_16_16_div_round_up_u64(wm_intermediate_val, 1000 * 512);
+	ret = fixed_16_16_div_u64(wm_intermediate_val, 1000 * 512);
 	return ret;
 }
 
@@ -3514,8 +3517,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 					      struct intel_plane_state *pstate)
 {
 	uint64_t adjusted_pixel_rate;
-	uint64_t downscale_amount;
-	uint64_t pixel_rate;
+	uint_fixed_16_16_t downscale_amount;
 
 	/* Shouldn't reach here on disabled planes... */
 	if (WARN_ON(!pstate->base.visible))
@@ -3528,10 +3530,8 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	adjusted_pixel_rate = cstate->pixel_rate;
 	downscale_amount = skl_plane_downscale_amount(pstate);
 
-	pixel_rate = adjusted_pixel_rate * downscale_amount >> 16;
-	WARN_ON(pixel_rate != clamp_t(uint32_t, pixel_rate, 0, ~0));
-
-	return pixel_rate;
+	return mul_u32_fixed_16_16_round_up(adjusted_pixel_rate,
+					    downscale_amount);
 }
 
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
@@ -3617,8 +3617,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
 					   y_min_scanlines, 512);
-		plane_blocks_per_line =
-		      fixed_16_16_div_round_up(interm_pbpl, y_min_scanlines);
+		plane_blocks_per_line = fixed_16_16_div(interm_pbpl,
+							y_min_scanlines);
 	} else if (x_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
 		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/8] drm/i915/skl+: use linetime latency if ddb size is not available
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 1/8] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 3/8] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch make changes to use linetime latency if allocated
DDB size during plane watermark calculation in switch case not available,
This is required to implement new DDB allocation algorithm.

In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to use during
switch-case for WM blocks selection.

Changes since v1:
 - Rebase on top of Paulo's patch series
Changes since v2:
 - Fix if-else condition (pointed by Maarten)
Changes since v3:
 - Use common function for timetime_us calculation (Paulo)
 - rebase on drm-tip

Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  7 +++++++
 drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++++++++++++++---------
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f26f61b0e7c8..e2bb1d0435f1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -132,6 +132,13 @@ typedef struct {
 	fp; \
 })
 
+static inline bool is_fixed_16_16_zero(uint_fixed_16_16_t val)
+{
+	if (val.val == 0)
+		return true;
+	return false;
+}
+
 static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
 {
 	uint_fixed_16_16_t fp;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 09562d86c0fb..55080d0baa49 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3513,6 +3513,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 	return ret;
 }
 
+static uint_fixed_16_16_t
+skl_get_linetime_us(struct intel_crtc_state *cstate)
+{
+	uint32_t pixel_rate;
+	uint32_t crtc_htotal;
+	uint_fixed_16_16_t linetime_us;
+
+	if (!cstate->base.active)
+		return u32_to_fixed_16_16(0);
+
+	pixel_rate = cstate->pixel_rate;
+
+	if (WARN_ON(pixel_rate == 0))
+		return u32_to_fixed_16_16(0);
+
+	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
+	linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate);
+
+	return linetime_us;
+}
+
 static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
 					      struct intel_plane_state *pstate)
 {
@@ -3639,12 +3660,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		selected_result = max_fixed_16_16(method2, y_tile_minimum);
 	} else {
+		uint32_t linetime_us;
+
+		linetime_us = fixed_16_16_to_u32_round_up(
+				skl_get_linetime_us(cstate));
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation /
+		else if ((ddb_allocation && ddb_allocation /
 			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
 			selected_result = min_fixed_16_16(method1, method2);
+		else if (latency >= linetime_us)
+			selected_result = method2;
 		else
 			selected_result = method1;
 	}
@@ -3747,19 +3774,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_i915_private *dev_priv = to_i915(state->dev);
-	uint32_t pixel_rate;
+	uint_fixed_16_16_t linetime_us;
 	uint32_t linetime_wm;
 
-	if (!cstate->base.active)
-		return 0;
+	linetime_us = skl_get_linetime_us(cstate);
 
-	pixel_rate = cstate->pixel_rate;
-
-	if (WARN_ON(pixel_rate == 0))
+	if (is_fixed_16_16_zero(linetime_us))
 		return 0;
 
-	linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal *
-				   1000, pixel_rate);
+	linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8,
+				linetime_us));
 
 	/* Display WA #1135: bxt. */
 	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/8] drm/i915/skl: Fail the flip if no FB for WM calculation
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 1/8] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 2/8] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 4/8] drm/i915/skl+: no need to memset again Mahesh Kumar
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Fail the flip if no FB is present but plane_state is set as visible,
that is not a valid combination so instead of continue fail the flip.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 55080d0baa49..7cf6f5702cbe 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3751,7 +3751,8 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 	if (!intel_pstate)
 		intel_pstate = to_intel_plane_state(plane->state);
 
-	WARN_ON(!intel_pstate->base.fb);
+	if (WARN_ON(!intel_pstate->base.fb))
+		return -EINVAL;
 
 	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/8] drm/i915/skl+: no need to memset again
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (2 preceding siblings ...)
  2017-02-28 11:31 ` [PATCH 3/8] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 5/8] drm/i915/skl+: ddb min requirement may exceed allocation Mahesh Kumar
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

We are already doing memset of ddb structure at the begining of skl_allocatE_pipe_ddb
function, No need to again do a memset.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 7cf6f5702cbe..76c986166664 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3395,10 +3395,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 	skl_ddb_get_pipe_allocation_limits(dev, cstate, alloc, &num_active);
 	alloc_size = skl_ddb_entry_size(alloc);
-	if (alloc_size == 0) {
-		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+	if (alloc_size == 0)
 		return 0;
-	}
 
 	skl_ddb_calc_min(cstate, num_active, minimum, y_minimum);
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/8] drm/i915/skl+: ddb min requirement may exceed allocation
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (3 preceding siblings ...)
  2017-02-28 11:31 ` [PATCH 4/8] drm/i915/skl+: no need to memset again Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-04-12  9:17   ` Ander Conselvan De Oliveira
  2017-02-28 11:31 ` [PATCH 6/8] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

DDB minimum requirement may also exceed the allocated DDB for CRTC.
Instead of directly deducting from alloc_size, check against
total_min_ddb requirement. if exceeding fail the flip.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 76c986166664..24e9e5d69833 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3380,6 +3380,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	int num_active;
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
+	uint16_t total_min_blocks = 0;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -3407,10 +3408,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	 */
 
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
-		alloc_size -= minimum[plane_id];
-		alloc_size -= y_minimum[plane_id];
+		total_min_blocks += minimum[plane_id];
+		total_min_blocks += y_minimum[plane_id];
 	}
 
+	if ((total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
+							alloc_size);
+		return-EINVAL;
+	}
+
+	alloc_size -= total_min_blocks;
 	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/8] drm/i915/skl+: Watermark calculation cleanup
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (4 preceding siblings ...)
  2017-02-28 11:31 ` [PATCH 5/8] drm/i915/skl+: ddb min requirement may exceed allocation Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 7/8] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch cleans the watermark calculation functions. This patch make
use of already available macro to walk through plane_state list instead
of calculating plane_state in function itself.
Now we iterate over WM levels in skl_compute_wm_level function.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 89 +++++++++++++++++------------------------
 1 file changed, 37 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 24e9e5d69833..51a483c68521 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3542,7 +3542,7 @@ skl_get_linetime_us(struct intel_crtc_state *cstate)
 }
 
 static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
-					      struct intel_plane_state *pstate)
+					      const struct intel_plane_state *pstate)
 {
 	uint64_t adjusted_pixel_rate;
 	uint_fixed_16_16_t downscale_amount;
@@ -3564,15 +3564,15 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
-				struct intel_plane_state *intel_pstate,
+				const struct intel_plane_state *intel_pstate,
 				uint16_t ddb_allocation,
 				int level,
 				uint16_t *out_blocks, /* out */
 				uint8_t *out_lines, /* out */
 				bool *enabled /* out */)
 {
-	struct drm_plane_state *pstate = &intel_pstate->base;
-	struct drm_framebuffer *fb = pstate->fb;
+	const struct drm_plane_state *pstate = &intel_pstate->base;
+	const struct drm_framebuffer *fb = pstate->fb;
 	uint32_t latency = dev_priv->wm.skl_latency[level];
 	uint_fixed_16_16_t method1, method2;
 	uint_fixed_16_16_t plane_blocks_per_line;
@@ -3727,52 +3727,36 @@ static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
-		     struct intel_plane *intel_plane,
-		     int level,
-		     struct skl_wm_level *result)
+		     const struct intel_plane_state *intel_pstate,
+		     struct skl_plane_wm *wm)
 {
-	struct drm_atomic_state *state = cstate->base.state;
 	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
-	struct drm_plane *plane = &intel_plane->base;
-	struct intel_plane_state *intel_pstate = NULL;
+	struct drm_plane *plane = intel_pstate->base.plane;
+	struct intel_plane *intel_plane = to_intel_plane(plane);
 	uint16_t ddb_blocks;
 	enum pipe pipe = intel_crtc->pipe;
+	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
-	if (state)
-		intel_pstate =
-			intel_atomic_get_existing_plane_state(state,
-							      intel_plane);
-
-	/*
-	 * Note: If we start supporting multiple pending atomic commits against
-	 * the same planes/CRTC's in the future, plane->state will no longer be
-	 * the correct pre-state to use for the calculations here and we'll
-	 * need to change where we get the 'unchanged' plane data from.
-	 *
-	 * For now this is fine because we only allow one queued commit against
-	 * a CRTC.  Even if the plane isn't modified by this transaction and we
-	 * don't have a plane lock, we still have the CRTC's lock, so we know
-	 * that no other transactions are racing with us to update it.
-	 */
-	if (!intel_pstate)
-		intel_pstate = to_intel_plane_state(plane->state);
-
 	if (WARN_ON(!intel_pstate->base.fb))
 		return -EINVAL;
 
 	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
 
-	ret = skl_compute_plane_wm(dev_priv,
-				   cstate,
-				   intel_pstate,
-				   ddb_blocks,
-				   level,
-				   &result->plane_res_b,
-				   &result->plane_res_l,
-				   &result->plane_en);
-	if (ret)
-		return ret;
+	for (level = 0; level <= max_level; level++) {
+		struct skl_wm_level *result = &wm->wm[level];
+
+		ret = skl_compute_plane_wm(dev_priv,
+					   cstate,
+					   intel_pstate,
+					   ddb_blocks,
+					   level,
+					   &result->plane_res_b,
+					   &result->plane_res_l,
+					   &result->plane_en);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
@@ -3815,10 +3799,11 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 			     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_device *dev = cstate->base.crtc->dev;
+	struct drm_crtc_state *crtc_state = &cstate->base;
 	const struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_plane *intel_plane;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
 	struct skl_plane_wm *wm;
-	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
 	/*
@@ -3827,18 +3812,18 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	 */
 	memset(pipe_wm->planes, 0, sizeof(pipe_wm->planes));
 
-	for_each_intel_plane_mask(&dev_priv->drm,
-				  intel_plane,
-				  cstate->base.plane_mask) {
-		wm = &pipe_wm->planes[intel_plane->id];
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		const struct intel_plane_state *intel_pstate =
+						to_intel_plane_state(pstate);
+		enum plane_id plane_id = to_intel_plane(plane)->id;
+
+		wm = &pipe_wm->planes[plane_id];
+
+		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
+					   intel_pstate, wm);
+		if (ret)
+			return ret;
 
-		for (level = 0; level <= max_level; level++) {
-			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-						   intel_plane, level,
-						   &wm->wm[level]);
-			if (ret)
-				return ret;
-		}
 		skl_compute_transition_wm(cstate, &wm->trans_wm);
 	}
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/8] drm/i915/skl: New ddb allocation algorithm
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (5 preceding siblings ...)
  2017-02-28 11:31 ` [PATCH 6/8] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-02-28 11:31 ` [PATCH 8/8] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
  2017-02-28 15:52 ` ✗ Fi.CI.BAT: failure for Implement DDB algorithm and WM cleanup Patchwork
  8 siblings, 0 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch implements new DDB allocation algorithm as per HW team
recommendation. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.

Changes since v1:
 - Rebase on top of Paulo's patch series

Changes since v2:
 - Fix the for loop condition to enable WM

Changes since v3:
 - Fix crash in cursor i-g-t reported by Maarten
 - Rebase after addressing Paulo's comments
 - Few other ULT fixes
Changes since v4:
 - Rebase on drm-tip
 - Added separate function to enable WM levels

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 251 ++++++++++++++++++++++++----------------
 1 file changed, 150 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 51a483c68521..84979b0af684 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3362,13 +3362,41 @@ skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
 	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
 }
 
+static void
+skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
+			   uint16_t plane_ddb,
+			   uint16_t max_level,
+			   struct skl_plane_wm *wm)
+{
+	int level;
+	/*
+	 * Now enable all levels in WM structure which can be enabled
+	 * using current DDB allocation
+	 */
+	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
+		struct skl_wm_level *level_wm = &wm->wm[level];
+
+		if (level > max_level || level_wm->plane_res_b == 0
+				      || level_wm->plane_res_l >= 31
+				      || level_wm->plane_res_b >= plane_ddb) {
+			level_wm->plane_en = false;
+			level_wm->plane_res_b = 0;
+			level_wm->plane_res_l = 0;
+		} else {
+			level_wm->plane_en = true;
+		}
+	}
+}
+
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
-		      struct skl_ddb_allocation *ddb /* out */)
+		      struct skl_ddb_allocation *ddb /* out */,
+		      struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_crtc *crtc = cstate->base.crtc;
 	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	enum pipe pipe = intel_crtc->pipe;
 	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
@@ -3381,6 +3409,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
 	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb;
+	int max_level, level;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -3419,10 +3449,43 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		return-EINVAL;
 	}
 
-	alloc_size -= total_min_blocks;
-	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
+	alloc_size -= minimum[PLANE_CURSOR];
+	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end -
+							minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
+	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
+		total_level_ddb = 0;
+		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+			/*
+			 * TODO: We should calculate watermark values for Y/UV
+			 * plane both in case of NV12 format and use both values
+			 * for ddb calculation. NV12 is disabled as of now, So
+			 * using only single/UV plane value here.
+			 */
+			struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+			uint16_t plane_res_b = wm->wm[level].plane_res_b;
+			uint16_t min = minimum[plane_id] + y_minimum[plane_id];
+
+			if (plane_id == PLANE_CURSOR)
+				continue;
+
+			total_level_ddb += max(plane_res_b, min);
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		return -EINVAL;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	/*
 	 * 2. Distribute the remaining space in proportion to the amount of
 	 * data each plane needs to fetch from memory.
@@ -3438,10 +3501,16 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	start = alloc->start;
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
 		unsigned int data_rate, y_data_rate;
-		uint16_t plane_blocks, y_plane_blocks = 0;
-
-		if (plane_id == PLANE_CURSOR)
+		uint16_t plane_blocks = 0, y_plane_blocks;
+		struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+
+		if (plane_id == PLANE_CURSOR) {
+			plane_blocks =
+				skl_ddb_entry_size(&ddb->plane[pipe][plane_id]);
+			skl_enable_plane_wm_levels(dev_priv, plane_blocks,
+						   max_level, wm);
 			continue;
+		}
 
 		data_rate = plane_data_rate[plane_id];
 
@@ -3450,33 +3519,35 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[plane_id];
-		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
-					total_data_rate);
 
 		/* Leave disabled planes at (0,0) */
 		if (data_rate) {
+			plane_blocks = minimum[plane_id];
+			plane_blocks += div_u64((uint64_t)alloc_size *
+					data_rate, total_data_rate);
 			ddb->plane[pipe][plane_id].start = start;
 			ddb->plane[pipe][plane_id].end = start + plane_blocks;
+			start += plane_blocks;
 		}
 
-		start += plane_blocks;
-
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we start calculating watermark values for Y/UV
+		 * plane both consider it for initial allowed wm blocks.
 		 */
 		y_data_rate = plane_y_data_rate[plane_id];
 
-		y_plane_blocks = y_minimum[plane_id];
-		y_plane_blocks += div_u64((uint64_t)alloc_size * y_data_rate,
-					total_data_rate);
-
 		if (y_data_rate) {
+			y_plane_blocks = y_minimum[plane_id];
+			y_plane_blocks += div_u64((uint64_t)alloc_size *
+					y_data_rate, total_data_rate);
 			ddb->y_plane[pipe][plane_id].start = start;
 			ddb->y_plane[pipe][plane_id].end = start + y_plane_blocks;
 		}
-
-		start += y_plane_blocks;
+		skl_enable_plane_wm_levels(dev_priv,
+					   plane_blocks,
+					   max_level,
+					   wm);
 	}
 
 	return 0;
@@ -3565,11 +3636,9 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				const struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				uint16_t *out_blocks, /* out */
-				uint8_t *out_lines, /* out */
-				bool *enabled /* out */)
+				uint8_t *out_lines /* out */)
 {
 	const struct drm_plane_state *pstate = &intel_pstate->base;
 	const struct drm_framebuffer *fb = pstate->fb;
@@ -3590,10 +3659,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
 	bool y_tiled, x_tiled;
 
-	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
-		*enabled = false;
+	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible)
 		return 0;
-	}
 
 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
 		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
@@ -3674,9 +3741,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation && ddb_allocation /
-			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
-			selected_result = min_fixed_16_16(method1, method2);
 		else if (latency >= linetime_us)
 			selected_result = method2;
 		else
@@ -3696,64 +3760,41 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			struct drm_plane *plane = pstate->plane;
+	if (res_lines >= 31 && level == 0) {
+		struct drm_plane *plane = pstate->plane;
 
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
-				      plane->base.id, plane->name,
-				      res_blocks, ddb_allocation, res_lines);
-			return -EINVAL;
-		}
+		DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
+		DRM_DEBUG_KMS("[PLANE:%d:%s] lines required = %u/31\n",
+				plane->base.id, plane->name, res_lines);
 	}
 
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     const struct intel_plane_state *intel_pstate,
 		     struct skl_plane_wm *wm)
 {
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
-	struct drm_plane *plane = intel_pstate->base.plane;
-	struct intel_plane *intel_plane = to_intel_plane(plane);
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
 	if (WARN_ON(!intel_pstate->base.fb))
 		return -EINVAL;
 
-	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
-
 	for (level = 0; level <= max_level; level++) {
 		struct skl_wm_level *result = &wm->wm[level];
 
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   &result->plane_res_b,
-					   &result->plane_res_l,
-					   &result->plane_en);
+					   &result->plane_res_l);
 		if (ret)
 			return ret;
 	}
@@ -3819,8 +3860,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 		wm = &pipe_wm->planes[plane_id];
 
-		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-					   intel_pstate, wm);
+		ret = skl_compute_wm_level(dev_priv, cstate, intel_pstate, wm);
 		if (ret)
 			return ret;
 
@@ -3934,6 +3974,45 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 	return false;
 }
 
+static int
+skl_ddb_add_affected_planes(struct intel_crtc_state *cstate,
+			    const struct skl_pipe_wm *old_pipe_wm,
+			    const struct skl_pipe_wm *pipe_wm)
+{
+	struct drm_atomic_state *state = cstate->base.state;
+	struct drm_device *dev = state->dev;
+	struct drm_crtc *crtc = cstate->base.crtc;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
+	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct drm_plane_state *plane_state;
+	struct drm_plane *plane;
+	enum pipe pipe = intel_crtc->pipe;
+
+	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
+
+	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
+		enum plane_id plane_id = to_intel_plane(plane)->id;
+		const struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+		const struct skl_plane_wm *old_wm = &old_pipe_wm->planes[plane_id];
+
+		if ((skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
+					&new_ddb->plane[pipe][plane_id]) &&
+		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
+					&new_ddb->y_plane[pipe][plane_id])) &&
+		    !memcmp(wm, old_wm, sizeof(struct skl_plane_wm)))
+			continue;
+
+		plane_state = drm_atomic_get_plane_state(state, plane);
+		if (IS_ERR(plane_state))
+			return PTR_ERR(plane_state);
+	}
+
+	return 0;
+}
+
 static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
 			      const struct skl_pipe_wm *old_pipe_wm,
 			      struct skl_pipe_wm *pipe_wm, /* out */
@@ -3947,6 +4026,18 @@ static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
 	if (ret)
 		return ret;
 
+	ret = skl_allocate_pipe_ddb(intel_cstate, ddb, pipe_wm);
+	if (ret)
+		return ret;
+	/*
+	 * TODO: Do we still need to add planes in state, As WM update is
+	 * not part of update_plane anymore, So wm for planes can be updated
+	 * irrespective of updade_plane call.
+	 */
+	ret = skl_ddb_add_affected_planes(intel_cstate, old_pipe_wm, pipe_wm);
+	if (ret)
+		return ret;
+
 	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
 		*changed = false;
 	else
@@ -3969,41 +4060,7 @@ pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_ddb_add_affected_planes(struct intel_crtc_state *cstate)
-{
-	struct drm_atomic_state *state = cstate->base.state;
-	struct drm_device *dev = state->dev;
-	struct drm_crtc *crtc = cstate->base.crtc;
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
-	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
-	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
-	struct drm_plane_state *plane_state;
-	struct drm_plane *plane;
-	enum pipe pipe = intel_crtc->pipe;
-
-	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
-
-	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
-		enum plane_id plane_id = to_intel_plane(plane)->id;
-
-		if (skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
-					&new_ddb->plane[pipe][plane_id]) &&
-		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
-					&new_ddb->y_plane[pipe][plane_id]))
-			continue;
-
-		plane_state = drm_atomic_get_plane_state(state, plane);
-		if (IS_ERR(plane_state))
-			return PTR_ERR(plane_state);
-	}
-
-	return 0;
-}
-
-static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_crtc(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -4067,14 +4124,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
-
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
-		ret = skl_ddb_add_affected_planes(cstate);
-		if (ret)
-			return ret;
 	}
 
 	return 0;
@@ -4155,7 +4204,7 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_crtc(state);
 	if (ret)
 		return ret;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 8/8] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (6 preceding siblings ...)
  2017-02-28 11:31 ` [PATCH 7/8] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
@ 2017-02-28 11:31 ` Mahesh Kumar
  2017-02-28 15:52 ` ✗ Fi.CI.BAT: failure for Implement DDB algorithm and WM cleanup Patchwork
  8 siblings, 0 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-02-28 11:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

A display resolution is only supported if it meets all the restrictions
below for Maximum Pipe Pixel Rate.

The display resolution must fit within the maximum pixel rate output
from the pipe. Make sure that the display pipe is able to feed pixels at
a rate required to support the desired resolution.
For each enabled plane on the pipe {
    If plane scaling enabled {
	Horizontal down scale amount = Maximum[1, plane horizontal size /
		    scaler horizontal window size]
	Vertical down scale amount = Maximum[1, plane vertical size /
		    scaler vertical window size]
	Plane down scale amount = Horizontal down scale amount *
		    Vertical down scale amount
	Plane Ratio = 1 / Plane down scale amount
    }
    Else {
	Plane Ratio = 1
    }
    If plane source pixel format is 64 bits per pixel {
	Plane Ratio = Plane Ratio * 8/9
    }
}

Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe

If pipe scaling is enabled {
    Horizontal down scale amount = Maximum[1, pipe horizontal source size /
		scaler horizontal window size]
    Vertical down scale amount = Maximum[1, pipe vertical source size /
		scaler vertical window size]
    Note: The progressive fetch - interlace display mode is equivalent to a
		2.0 vertical down scale
    Pipe down scale amount = Horizontal down scale amount *
		Vertical down scale amount
    Pipe Ratio = Pipe Ratio / Pipe down scale amount
}

Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      | 11 +++++
 drivers/gpu/drm/i915/intel_display.c |  3 ++
 drivers/gpu/drm/i915/intel_drv.h     |  2 +
 drivers/gpu/drm/i915/intel_pm.c      | 86 ++++++++++++++++++++++++++++++++++++
 4 files changed, 102 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e2bb1d0435f1..3d4f12abab76 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -201,6 +201,17 @@ static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val,
 	return res;
 }
 
+static inline uint32_t fixed_16_16_div_round_up_u64(uint32_t val,
+						    uint_fixed_16_16_t d)
+{
+	uint64_t interm_val;
+
+	interm_val = (uint64_t)val << 16;
+	interm_val = DIV_ROUND_UP_ULL(interm_val, d.val);
+	WARN_ON(interm_val >> 32);
+	return clamp_t(uint32_t, interm_val, 0, ~0);
+}
+
 static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
 						     uint_fixed_16_16_t mul)
 {
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index d1831809ad3d..79a55e79b1fd 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11037,6 +11037,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 			ret = skl_update_scaler_crtc(pipe_config);
 
 		if (!ret)
+			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
+							    pipe_config);
+		if (!ret)
 			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
 							 pipe_config);
 	}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 3c8aaca947d3..2e4dea6f4f62 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1825,6 +1825,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 				 int ignore);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 84979b0af684..843588ab97f6 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3187,6 +3187,92 @@ skl_plane_downscale_amount(const struct intel_plane_state *pstate)
 	return mul_fixed_16_16(downscale_w, downscale_h);
 }
 
+static uint_fixed_16_16_t
+skl_pipe_downscale_amount(const struct intel_crtc_state *config)
+{
+	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
+
+	if (!config->base.active)
+		return pipe_downscale;
+
+	if (config->pch_pfit.enabled) {
+		uint32_t src_w, src_h, dst_w, dst_h;
+		uint32_t pfit_size = config->pch_pfit.size;
+		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+		uint_fixed_16_16_t downscale_h, downscale_w;
+
+		src_w = config->pipe_src_w;
+		src_h = config->pipe_src_h;
+		dst_w = pfit_size >> 16;
+		dst_h = pfit_size & 0xffff;
+
+		if (!dst_w || !dst_h)
+			return pipe_downscale;
+
+		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+
+		pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
+	}
+
+	return pipe_downscale;
+}
+
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate)
+{
+	struct drm_crtc_state *crtc_state = &cstate->base;
+	struct drm_atomic_state *state = crtc_state->state;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
+	struct intel_plane_state *intel_pstate;
+	int crtc_clock, cdclk;
+	uint32_t pipe_max_pixel_rate;
+	uint_fixed_16_16_t pipe_downscale;
+	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
+
+	if (cstate->base.active)
+		return 0;
+
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		uint_fixed_16_16_t plane_downscale;
+		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
+		int bpp;
+
+		if (!pstate->visible)
+			continue;
+
+		if (WARN_ON(!pstate->fb))
+			continue;
+
+		intel_pstate = to_intel_plane_state(pstate);
+		plane_downscale = skl_plane_downscale_amount(intel_pstate);
+		bpp = pstate->fb->format->cpp[0] * 8;
+		if (bpp == 64)
+			plane_downscale = mul_fixed_16_16(plane_downscale,
+							  fp_9_div_8);
+
+		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
+	}
+	pipe_downscale = skl_pipe_downscale_amount(cstate);
+
+	pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
+
+	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
+	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
+	pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
+							   pipe_downscale);
+
+	if (pipe_max_pixel_rate < crtc_clock) {
+		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
+		return -ERANGE;
+	}
+
+	return 0;
+}
+
 static unsigned int
 skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 			     const struct drm_plane_state *pstate,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* ✗ Fi.CI.BAT: failure for Implement DDB algorithm and WM cleanup
  2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (7 preceding siblings ...)
  2017-02-28 11:31 ` [PATCH 8/8] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
@ 2017-02-28 15:52 ` Patchwork
  8 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2017-02-28 15:52 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Implement DDB algorithm and WM cleanup
URL   : https://patchwork.freedesktop.org/series/20376/
State : failure

== Summary ==

Series 20376v1 Implement DDB algorithm and WM cleanup
https://patchwork.freedesktop.org/api/1.0/series/20376/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s3:
                pass       -> DMESG-WARN (fi-skl-6770hq) fdo#100004
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-skl-6770hq) fdo#100004
Test kms_busy:
        Subgroup basic-flip-default-a:
                pass       -> INCOMPLETE (fi-skl-6770hq)

fdo#100004 https://bugs.freedesktop.org/show_bug.cgi?id=100004

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11 
fi-bsw-n3050     total:278  pass:239  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19 
fi-bxt-t5700     total:108  pass:95   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:278  pass:251  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:278  pass:247  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16 
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16 
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50 
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18 
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18 
fi-kbl-7500u     total:278  pass:259  dwarn:1   dfail:0   fail:0   skip:18 
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10 
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17 
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18 
fi-skl-6770hq    total:196  pass:190  dwarn:2   dfail:0   fail:0   skip:3  
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28 
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29 

89683de1f9e7f8fb56bf90ad0682c407b7037bf7 drm-tip: 2017y-02m-28d-14h-58m-42s UTC integration manifest
a21d035 drm/i915/skl+: consider max supported plane pixel rate while scaling
3de7987 drm/i915/skl: New ddb allocation algorithm
05345a5 drm/i915/skl+: Watermark calculation cleanup
4fb432b drm/i915/skl+: ddb min requirement may exceed allocation
37a7a8a drm/i915/skl+: no need to memset again
05e3efc drm/i915/skl: Fail the flip if no FB for WM calculation
d61c0e7 drm/i915/skl+: use linetime latency if ddb size is not available
12d716c drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4001/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/8] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
  2017-02-28 11:31 ` [PATCH 1/8] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
@ 2017-03-16 18:48   ` Paulo Zanoni
  0 siblings, 0 replies; 13+ messages in thread
From: Paulo Zanoni @ 2017-03-16 18:48 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: maarten.lankhorst

Em Ter, 2017-02-28 às 17:01 +0530, Mahesh Kumar escreveu:
> This patch make changes to calculate adjusted plane pixel rate &
> plane downscale amount using fixed_point functions available. This
> also
> adds few fixed point function to facilitate calculation.

Before we go into the full review I need to ask: but why does the patch
do what it says it does? Is this a bug fix? Is this needed for some
rework later in the series? What happens if we don't merge this patch
but try to merge the others without it? Having those questions answered
in the commit message help not only the patch reviewers, but also all
the future git-bisecters and anybody else who happens to end up
touching this code change.


> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 34 +++++++++++++++++++++++++++++
> ----
>  drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++-----------
> ----------
>  2 files changed, 51 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index eed9ead1b592..f26f61b0e7c8 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -170,8 +170,8 @@ static inline uint_fixed_16_16_t
> max_fixed_16_16(uint_fixed_16_16_t max1,
>  	return max;
>  }
>  
> -static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t
> val,
> -							  uint32_t
> d)
> +static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val,
> +						 uint32_t d)

Why are we renaming this? It still rounds up.

>  {
>  	uint_fixed_16_16_t fp, res;
>  
> @@ -180,8 +180,8 @@ static inline uint_fixed_16_16_t
> fixed_16_16_div_round_up(uint32_t val,
>  	return res;
>  }
>  
> -static inline uint_fixed_16_16_t
> fixed_16_16_div_round_up_u64(uint32_t val,
> -							      uint32
> _t d)
> +static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val,
> +						     uint32_t d)

Why are we renaming this? It still rounds up.

>  {
>  	uint_fixed_16_16_t res;
>  	uint64_t interm_val;
> @@ -206,6 +206,32 @@ static inline uint_fixed_16_16_t
> mul_u32_fixed_16_16(uint32_t val,
>  	return fp;
>  }
>  
> +static inline uint32_t mul_u32_fixed_16_16_round_up(uint32_t val,
> +						    uint_fixed_16_16
> _t mul)
> +{
> +	uint64_t intermediate_val;
> +	uint32_t result;
> +
> +	intermediate_val = (uint64_t) val * mul.val;
> +	intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 <<
> 16);
> +	WARN_ON(intermediate_val >> 32);
> +	result = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +	return result;
> +}
> +
> +static inline uint_fixed_16_16_t mul_fixed_16_16(uint_fixed_16_16_t
> val,
> +						 uint_fixed_16_16_t
> mul)
> +{
> +	uint64_t intermediate_val;
> +	uint_fixed_16_16_t fp;
> +
> +	intermediate_val = (uint64_t) val.val * mul.val;
> +	intermediate_val = intermediate_val >> 16;
> +	WARN_ON(intermediate_val >> 32);
> +	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +	return fp;
> +}
> +
>  static inline const char *yesno(bool v)
>  {
>  	return v ? "yes" : "no";
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 169c4908ad5b..09562d86c0fb 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3161,28 +3161,30 @@ void skl_ddb_get_hw_state(struct
> drm_i915_private *dev_priv,
>   * Return value is provided in 16.16 fixed point form to retain
> fractional part.
>   * Caller should take care of dividing & rounding off the value.
>   */
> -static uint32_t
> +static uint_fixed_16_16_t
>  skl_plane_downscale_amount(const struct intel_plane_state *pstate)
>  {
> -	uint32_t downscale_h, downscale_w;
>  	uint32_t src_w, src_h, dst_w, dst_h;
> +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +	uint_fixed_16_16_t downscale_h, downscale_w;
>  
>  	if (WARN_ON(!pstate->base.visible))
> -		return DRM_PLANE_HELPER_NO_SCALING;
> +		return u32_to_fixed_16_16(0);
>  
> -	/* n.b., src is 16.16 fixed point, dst is whole integer */
> -	src_w = drm_rect_width(&pstate->base.src);
> -	src_h = drm_rect_height(&pstate->base.src);
> +	src_w = drm_rect_width(&pstate->base.src) >> 16;
> +	src_h = drm_rect_height(&pstate->base.src) >> 16;
>  	dst_w = drm_rect_width(&pstate->base.dst);
>  	dst_h = drm_rect_height(&pstate->base.dst);
>  	if (drm_rotation_90_or_270(pstate->base.rotation))
>  		swap(dst_w, dst_h);
>  
> -	downscale_h = max(src_h / dst_h,
> (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
> -	downscale_w = max(src_w / dst_w,
> (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
> +	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
> +	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
> +	downscale_w = max_fixed_16_16(fp_w_ratio,
> u32_to_fixed_16_16(1));
> +	downscale_h = max_fixed_16_16(fp_h_ratio,
> u32_to_fixed_16_16(1));
>  
>  	/* Provide result in 16.16 fixed point */
> -	return (uint64_t)downscale_w * downscale_h >> 16;
> +	return mul_fixed_16_16(downscale_w, downscale_h);
>  }
>  
>  static unsigned int
> @@ -3191,10 +3193,11 @@ skl_plane_relative_data_rate(const struct
> intel_crtc_state *cstate,
>  			     int y)
>  {
>  	struct intel_plane_state *intel_pstate =
> to_intel_plane_state(pstate);
> -	uint32_t down_scale_amount, data_rate;
> +	uint32_t data_rate;
>  	uint32_t width = 0, height = 0;
>  	struct drm_framebuffer *fb;
>  	u32 format;
> +	uint_fixed_16_16_t downscale_amount;
>  
>  	if (!intel_pstate->base.visible)
>  		return 0;
> @@ -3226,9 +3229,9 @@ skl_plane_relative_data_rate(const struct
> intel_crtc_state *cstate,
>  		data_rate = width * height * fb->format->cpp[0];
>  	}
>  
> -	down_scale_amount =
> skl_plane_downscale_amount(intel_pstate);
> +	downscale_amount = skl_plane_downscale_amount(intel_pstate);
>  
> -	return (uint64_t)data_rate * down_scale_amount >> 16;
> +	return mul_u32_fixed_16_16_round_up(data_rate,
> downscale_amount);
>  }
>  
>  /*
> @@ -3488,7 +3491,7 @@ static uint_fixed_16_16_t
> skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
>  		return FP_16_16_MAX;
>  
>  	wm_intermediate_val = latency * pixel_rate * cpp;
> -	ret = fixed_16_16_div_round_up_u64(wm_intermediate_val, 1000
> * 512);
> +	ret = fixed_16_16_div_u64(wm_intermediate_val, 1000 * 512);
>  	return ret;
>  }
>  
> @@ -3514,8 +3517,7 @@ static uint32_t
> skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  					      struct
> intel_plane_state *pstate)
>  {
>  	uint64_t adjusted_pixel_rate;
> -	uint64_t downscale_amount;
> -	uint64_t pixel_rate;
> +	uint_fixed_16_16_t downscale_amount;
>  
>  	/* Shouldn't reach here on disabled planes... */
>  	if (WARN_ON(!pstate->base.visible))
> @@ -3528,10 +3530,8 @@ static uint32_t
> skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  	adjusted_pixel_rate = cstate->pixel_rate;
>  	downscale_amount = skl_plane_downscale_amount(pstate);
>  
> -	pixel_rate = adjusted_pixel_rate * downscale_amount >> 16;
> -	WARN_ON(pixel_rate != clamp_t(uint32_t, pixel_rate, 0, ~0));
> -
> -	return pixel_rate;
> +	return mul_u32_fixed_16_16_round_up(adjusted_pixel_rate,
> +					    downscale_amount);
>  }
>  
>  static int skl_compute_plane_wm(const struct drm_i915_private
> *dev_priv,
> @@ -3617,8 +3617,8 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	if (y_tiled) {
>  		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
>  					   y_min_scanlines, 512);
> -		plane_blocks_per_line =
> -		      fixed_16_16_div_round_up(interm_pbpl,
> y_min_scanlines);
> +		plane_blocks_per_line = fixed_16_16_div(interm_pbpl,
> +							y_min_scanli
> nes);
>  	} else if (x_tiled) {
>  		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line,
> 512);
>  		plane_blocks_per_line =
> u32_to_fixed_16_16(interm_pbpl);
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] drm/i915/skl+: ddb min requirement may exceed allocation
  2017-02-28 11:31 ` [PATCH 5/8] drm/i915/skl+: ddb min requirement may exceed allocation Mahesh Kumar
@ 2017-04-12  9:17   ` Ander Conselvan De Oliveira
  2017-04-12 15:09     ` Mahesh Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: Ander Conselvan De Oliveira @ 2017-04-12  9:17 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

On Tue, 2017-02-28 at 17:01 +0530, Mahesh Kumar wrote:
> DDB minimum requirement may also exceed the allocated DDB for CRTC.
> Instead of directly deducting from alloc_size, check against
> total_min_ddb requirement. if exceeding fail the flip.

Instead of doing a low level description of the code change, including variable
names and whatnot, use this space to describe in a high level what the patch
does and why it is necessary. For instance, in this particular case the patch
changes the modeset to fail when there isn't enough ddb for the given
configuration. Previously it succeeded, but the plane ddb allocations would be
bogus because of the negative value of alloc_size, leading to screen corruption
and system hangs.

Do I understand correctly that this patch is independent from the rest of the
series? Might make sense to merge it earlier, since the bug is there with the
old ddb allocation algorithm too.

> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 76c986166664..24e9e5d69833 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3380,6 +3380,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	int num_active;
>  	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>  	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
> +	uint16_t total_min_blocks = 0;
>  
>  	/* Clear the partitioning for disabled planes. */
>  	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> @@ -3407,10 +3408,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	 */
>  
>  	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
> -		alloc_size -= minimum[plane_id];
> -		alloc_size -= y_minimum[plane_id];
> +		total_min_blocks += minimum[plane_id];
> +		total_min_blocks += y_minimum[plane_id];
>  	}
>  
> +	if ((total_min_blocks > alloc_size)) {

One '(' is enough.

> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
> +		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
> +							alloc_size);
> +		return-EINVAL;

Space between return and -EINVAL please.

Ander

> +	}
> +
> +	alloc_size -= total_min_blocks;
>  	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>  	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>  
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 5/8] drm/i915/skl+: ddb min requirement may exceed allocation
  2017-04-12  9:17   ` Ander Conselvan De Oliveira
@ 2017-04-12 15:09     ` Mahesh Kumar
  0 siblings, 0 replies; 13+ messages in thread
From: Mahesh Kumar @ 2017-04-12 15:09 UTC (permalink / raw)
  To: Ander Conselvan De Oliveira, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Hi Ander,

Thanks for review

On Wednesday 12 April 2017 02:47 PM, Ander Conselvan De Oliveira wrote:
> On Tue, 2017-02-28 at 17:01 +0530, Mahesh Kumar wrote:
>> DDB minimum requirement may also exceed the allocated DDB for CRTC.
>> Instead of directly deducting from alloc_size, check against
>> total_min_ddb requirement. if exceeding fail the flip.
> Instead of doing a low level description of the code change, including variable
> names and whatnot, use this space to describe in a high level what the patch
> does and why it is necessary. For instance, in this particular case the patch
> changes the modeset to fail when there isn't enough ddb for the given
> configuration. Previously it succeeded, but the plane ddb allocations would be
> bogus because of the negative value of alloc_size, leading to screen corruption
> and system hangs.
Ok, will make changes,
Not only modeset, any flip whose requirement is exceeding available ddb 
allocation, we should fail that ioctl.
>
> Do I understand correctly that this patch is independent from the rest of the
> series? Might make sense to merge it earlier, since the bug is there with the
> old ddb allocation algorithm too.
Yes, patch 3,4 as well are independent of the series & applicable for 
old algorithm too.
>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 76c986166664..24e9e5d69833 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3380,6 +3380,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	int num_active;
>>   	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>>   	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
>> +	uint16_t total_min_blocks = 0;
>>   
>>   	/* Clear the partitioning for disabled planes. */
>>   	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
>> @@ -3407,10 +3408,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	 */
>>   
>>   	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>> -		alloc_size -= minimum[plane_id];
>> -		alloc_size -= y_minimum[plane_id];
>> +		total_min_blocks += minimum[plane_id];
>> +		total_min_blocks += y_minimum[plane_id];
>>   	}
>>   
>> +	if ((total_min_blocks > alloc_size)) {
> One '(' is enough.
will do :)
>
>> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
>> +		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
>> +							alloc_size);
>> +		return-EINVAL;
> Space between return and -EINVAL please.
will update.

-Mahesh
>
> Ander
>
>> +	}
>> +
>> +	alloc_size -= total_min_blocks;
>>   	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>>   	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>>   

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-04-12 15:06 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-28 11:31 [PATCH 0/8] Implement DDB algorithm and WM cleanup Mahesh Kumar
2017-02-28 11:31 ` [PATCH 1/8] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
2017-03-16 18:48   ` Paulo Zanoni
2017-02-28 11:31 ` [PATCH 2/8] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
2017-02-28 11:31 ` [PATCH 3/8] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
2017-02-28 11:31 ` [PATCH 4/8] drm/i915/skl+: no need to memset again Mahesh Kumar
2017-02-28 11:31 ` [PATCH 5/8] drm/i915/skl+: ddb min requirement may exceed allocation Mahesh Kumar
2017-04-12  9:17   ` Ander Conselvan De Oliveira
2017-04-12 15:09     ` Mahesh Kumar
2017-02-28 11:31 ` [PATCH 6/8] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
2017-02-28 11:31 ` [PATCH 7/8] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
2017-02-28 11:31 ` [PATCH 8/8] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
2017-02-28 15:52 ` ✗ Fi.CI.BAT: failure for Implement DDB algorithm and WM cleanup Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.