All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/12] Implement DDB algorithm and WM cleanup
@ 2017-05-15  8:34 Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 01/12] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
                   ` (12 more replies)
  0 siblings, 13 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This series implements new DDB allocation algorithm to solve the cases,
where we have sufficient DDB available to enable multiple planes, But
due to the current algorithm not dividing it properly among planes, we
end-up failing the flip.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.
Series also fixes/cleans-up few bug in present code.

There are two steps in current WM programming.

1. Calculate minimum number of blocks required  for a WM level to be
enabled. For 1440x2560 panel we need 41 blocks as minimum number of
blocks to enable WM0. This is the step which doesn't use vertical size.
It only depends on Pipe drain rate and plane horizontal size as per the
current Bspec algorithm.
So all the plane below have minimum  number of blocks required to enable
WM0 as 41
    Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
    Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41

2. Number of blocks allotted by the driver
    Driver allocates  12 for Plane 3   &  16 for plane 4

    Total Dbuf Available = 508
    Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476
    allocate minimum blocks for each plane 8 * 4 = 32
    remaining blocks = 476 - 32 = 444
    Relative Data Rate for Planes
       Plane 1  =  1440 * 2560 * 3  =  11059200
       Plane 2  =  1440 * 2560 * 3  =  11059200
       Plane 3  =  1440 * 48   * 3  =  207360
       Plane 4  =  1440 * 96   * 3  =  414720
       Total Relative BW            =  22740480

-   Allocate Buffer
    buffer allocation = (Plane relative data rate / total data rate)
		    * total remaming DDB + minimum plane DDB
     Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
     Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16

In this case it forced driver to disable Plane 3 & 4. Driver need to use
more efficient way to allocate buffer that is optimum for power.

New Algorithm suggested by HW team is:

1. Calculate minimum buffer allocations for each plane and for each
    watermark level

2. Add minimum buffer allocations required for enabling WM7
    for all the planes

Level 0 =  41 + 41 + 41 + 41  = 164
Level 1 =  42 + 42 + 42 + 42  = 168
Level 2 =  42 + 42 + 42 + 42  = 168
Level 3 =  94 + 94 + 94 + 94 =  376
Level 4 =  94 + 94 + 94 + 94 =  376
Level 5 =  94 + 94 + 94 + 94 =  376
Level 6 =  94 + 94 + 94 + 94 =  376
Level 7 =  94 + 94 + 94 + 94 =  376

3. Check to see how many buffer allocation are left and enable
the best case. In this case since we have 476 blocks we can enable
WM0-7 on all 4 planes.
Let's say if we have only 200 block available then the best cases
allocation is to enable Level2 which requires 168 blocks

Kumar, Mahesh (11):
  drm/i915: fix naming of fixed_16_16 wrapper.
  drm/i915: Add more wrapper for fixed_point_16_16 operations
  drm/i915: Use fixed_16_16 wrapper for division operation
  drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed
    point
  drm/i915/skl: Fail the flip if no FB for WM calculation
  drm/i915/skl+: no need to memset again
  drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe
    allocation
  drm/i915/skl+: Watermark calculation cleanup
  drm/i915/skl+: use linetime latency if ddb size is not available
  drm/i915/skl: New ddb allocation algorithm
  drm/i915/skl+: consider max supported plane pixel rate while scaling

Mahesh Kumar (1):
  drm/i915/skl+: Perform wm level calculations in separate function

 drivers/gpu/drm/i915/i915_drv.h      |  56 +++-
 drivers/gpu/drm/i915/intel_display.c |   3 +
 drivers/gpu/drm/i915/intel_drv.h     |   2 +
 drivers/gpu/drm/i915/intel_pm.c      | 527 ++++++++++++++++++++++-------------
 4 files changed, 398 insertions(+), 190 deletions(-)

-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 01/12] drm/i915: fix naming of fixed_16_16 wrapper.
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 02/12] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

fixed_16_16_div_round_up(_u64), wrapper for fixed_16_16 division
operation don't really round_up the result. Wrapper round_up only the
fraction part of the result to make it 16-bit.
This patch eliminates round_up keyword from the wrapper.

Later patch will introduce the new wrapper to do rounding-off the result
and give unt32_t output to cleanup mix use of fixed_16_16_t & uint32_t
variables.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 6 ++----
 drivers/gpu/drm/i915/intel_pm.c | 6 +++---
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ff3574a56812..f56fa1f71c11 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -153,8 +153,7 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
 	return max;
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
-							  uint32_t d)
+static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
 {
 	uint_fixed_16_16_t fp, res;
 
@@ -163,8 +162,7 @@ static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
 	return res;
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div_round_up_u64(uint32_t val,
-							      uint32_t d)
+static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
 {
 	uint_fixed_16_16_t res;
 	uint64_t interm_val;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ef0e9f8d4dbd..d12bbe651dd5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4165,7 +4165,7 @@ static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
 		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate * cpp;
-	ret = fixed_16_16_div_round_up_u64(wm_intermediate_val, 1000 * 512);
+	ret = fixed_16_16_div_u64(wm_intermediate_val, 1000 * 512);
 	return ret;
 }
 
@@ -4301,8 +4301,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
 					   y_min_scanlines, 512);
-		plane_blocks_per_line =
-		      fixed_16_16_div_round_up(interm_pbpl, y_min_scanlines);
+		plane_blocks_per_line = fixed_16_16_div(interm_pbpl,
+							y_min_scanlines);
 	} else if (x_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
 		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 02/12] drm/i915: Add more wrapper for fixed_point_16_16 operations
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 01/12] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:35   ` Matt Roper
  2017-05-15  8:34 ` [PATCH 03/12] drm/i915: Use fixed_16_16 wrapper for division operation Mahesh Kumar
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

This patch adds few wrapper to perform fixed_point_16_16 operations
mul_round_up_u32_fixed16 : Multiplies u32 and fixed_16_16_t variables
			   & returns u32 result with rounding-up.
mul_fixed16 : Multiplies two fixed_16_16_t variable & returns fixed_16_16
div_round_up_fixed16 : Perform division operation on fixed_16_16_t
		       variables & return u32 result with round-off
div_round_up_u32_fixed16 : devide uint32_t variable by fixed_16_16 variable
			   and round_up the result to uint32_t.

These wrappers will be used by later patches in the series.

Changes from V1:
 - Rename wrapper as per Matt's comment

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 43 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f56fa1f71c11..a1858c3eb33a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -153,6 +153,38 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
 	return max;
 }
 
+static inline uint32_t div_round_up_fixed16(uint_fixed_16_16_t val,
+					    uint_fixed_16_16_t d)
+{
+	return DIV_ROUND_UP(val.val, d.val);
+}
+
+static inline uint32_t mul_round_up_u32_fixed16(uint32_t val,
+						uint_fixed_16_16_t mul)
+{
+       uint64_t intermediate_val;
+       uint32_t result;
+
+       intermediate_val = (uint64_t) val * mul.val;
+       intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
+       WARN_ON(intermediate_val >> 32);
+       result = clamp_t(uint32_t, intermediate_val, 0, ~0);
+       return result;
+}
+
+static inline uint_fixed_16_16_t mul_fixed16(uint_fixed_16_16_t val,
+					     uint_fixed_16_16_t mul)
+{
+	uint64_t intermediate_val;
+	uint_fixed_16_16_t fp;
+
+	intermediate_val = (uint64_t) val.val * mul.val;
+	intermediate_val = intermediate_val >> 16;
+	WARN_ON(intermediate_val >> 32);
+	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
+	return fp;
+}
+
 static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
 {
 	uint_fixed_16_16_t fp, res;
@@ -175,6 +207,17 @@ static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
 	return res;
 }
 
+static inline uint32_t div_round_up_u32_fixed16(uint32_t val,
+						uint_fixed_16_16_t d)
+{
+	uint64_t interm_val;
+
+	interm_val = (uint64_t)val << 16;
+	interm_val = DIV_ROUND_UP_ULL(interm_val, d.val);
+	WARN_ON(interm_val >> 32);
+	return clamp_t(uint32_t, interm_val, 0, ~0);
+}
+
 static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
 						     uint_fixed_16_16_t mul)
 {
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 03/12] drm/i915: Use fixed_16_16 wrapper for division operation
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 01/12] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 02/12] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 04/12] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

Don't use fixed_16_16 structure members directly, instead use wrapper to
perform fixed_16_16 division operation.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d12bbe651dd5..8ff8cc59dcca 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4334,8 +4334,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	}
 
 	res_blocks = fixed_16_16_to_u32_round_up(selected_result) + 1;
-	res_lines = DIV_ROUND_UP(selected_result.val,
-				 plane_blocks_per_line.val);
+	res_lines = div_round_up_fixed16(selected_result,
+					 plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
 		if (y_tiled) {
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 04/12] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (2 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 03/12] drm/i915: Use fixed_16_16 wrapper for division operation Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 05/12] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

This patch make changes to calculate adjusted plane pixel rate &
plane downscale amount using fixed_point functions available.
This patch will give uniformity in code, & will help to avoid mixing of
32bit uint32_t variable for fixed-16.16 with fixed_16_16_t variables in
later patch in the series.

Changes from V1:
 - Rebase based on wrapper name change
 - Remove unnecessary comment

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8ff8cc59dcca..ab056952cfa4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3827,26 +3827,27 @@ void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
  * Return value is provided in 16.16 fixed point form to retain fractional part.
  * Caller should take care of dividing & rounding off the value.
  */
-static uint32_t
+static uint_fixed_16_16_t
 skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 			   const struct intel_plane_state *pstate)
 {
 	struct intel_plane *plane = to_intel_plane(pstate->base.plane);
-	uint32_t downscale_h, downscale_w;
 	uint32_t src_w, src_h, dst_w, dst_h;
+	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+	uint_fixed_16_16_t downscale_h, downscale_w;
 
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
-		return DRM_PLANE_HELPER_NO_SCALING;
+		return u32_to_fixed_16_16(0);
 
 	/* n.b., src is 16.16 fixed point, dst is whole integer */
 	if (plane->id == PLANE_CURSOR) {
-		src_w = pstate->base.src_w;
-		src_h = pstate->base.src_h;
+		src_w = pstate->base.src_w >> 16;
+		src_h = pstate->base.src_h >> 16;
 		dst_w = pstate->base.crtc_w;
 		dst_h = pstate->base.crtc_h;
 	} else {
-		src_w = drm_rect_width(&pstate->base.src);
-		src_h = drm_rect_height(&pstate->base.src);
+		src_w = drm_rect_width(&pstate->base.src) >> 16;
+		src_h = drm_rect_height(&pstate->base.src) >> 16;
 		dst_w = drm_rect_width(&pstate->base.dst);
 		dst_h = drm_rect_height(&pstate->base.dst);
 	}
@@ -3854,11 +3855,12 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	if (drm_rotation_90_or_270(pstate->base.rotation))
 		swap(dst_w, dst_h);
 
-	downscale_h = max(src_h / dst_h, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
-	downscale_w = max(src_w / dst_w, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
+	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+	downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+	downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
 
-	/* Provide result in 16.16 fixed point */
-	return (uint64_t)downscale_w * downscale_h >> 16;
+	return mul_fixed16(downscale_w, downscale_h);
 }
 
 static unsigned int
@@ -3868,10 +3870,11 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 {
 	struct intel_plane *plane = to_intel_plane(pstate->plane);
 	struct intel_plane_state *intel_pstate = to_intel_plane_state(pstate);
-	uint32_t down_scale_amount, data_rate;
+	uint32_t data_rate;
 	uint32_t width = 0, height = 0;
 	struct drm_framebuffer *fb;
 	u32 format;
+	uint_fixed_16_16_t down_scale_amount;
 
 	if (!intel_pstate->base.visible)
 		return 0;
@@ -3905,7 +3908,7 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 
 	down_scale_amount = skl_plane_downscale_amount(cstate, intel_pstate);
 
-	return (uint64_t)data_rate * down_scale_amount >> 16;
+	return mul_round_up_u32_fixed16(data_rate, down_scale_amount);
 }
 
 /*
@@ -4191,8 +4194,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 					      struct intel_plane_state *pstate)
 {
 	uint64_t adjusted_pixel_rate;
-	uint64_t downscale_amount;
-	uint64_t pixel_rate;
+	uint_fixed_16_16_t downscale_amount;
 
 	/* Shouldn't reach here on disabled planes... */
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
@@ -4205,10 +4207,8 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	adjusted_pixel_rate = cstate->pixel_rate;
 	downscale_amount = skl_plane_downscale_amount(cstate, pstate);
 
-	pixel_rate = adjusted_pixel_rate * downscale_amount >> 16;
-	WARN_ON(pixel_rate != clamp_t(uint32_t, pixel_rate, 0, ~0));
-
-	return pixel_rate;
+	return mul_round_up_u32_fixed16(adjusted_pixel_rate,
+					    downscale_amount);
 }
 
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 05/12] drm/i915/skl: Fail the flip if no FB for WM calculation
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (3 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 04/12] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:35   ` Matt Roper
  2017-05-15  8:34 ` [PATCH 06/12] drm/i915/skl+: no need to memset again Mahesh Kumar
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

Fail the flip if no FB is present but plane_state is set as visible.
Above is not a valid combination so instead of continue fail the flip.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ab056952cfa4..f494af358874 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4408,7 +4408,8 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 	if (!intel_pstate)
 		intel_pstate = to_intel_plane_state(plane->state);
 
-	WARN_ON(!intel_pstate->base.fb);
+	if (WARN_ON(!intel_pstate->base.fb))
+		return -EINVAL;
 
 	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 06/12] drm/i915/skl+: no need to memset again
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (4 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 05/12] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15  8:34 ` [PATCH 07/12] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

We are already doing memset of ddb structure at the begining of skl_allocate_pipe_ddb
function, No need to again do a memset.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index f494af358874..37dd3e7108c9 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4072,10 +4072,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 	skl_ddb_get_pipe_allocation_limits(dev, cstate, alloc, &num_active);
 	alloc_size = skl_ddb_entry_size(alloc);
-	if (alloc_size == 0) {
-		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+	if (alloc_size == 0)
 		return 0;
-	}
 
 	skl_ddb_calc_min(cstate, num_active, minimum, y_minimum);
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 07/12] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (5 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 06/12] drm/i915/skl+: no need to memset again Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:36   ` Matt Roper
  2017-05-15  8:34 ` [PATCH 08/12] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

DDB minimum requirement of crtc configuration (cumulative of all the
enabled planes in crtc) may exceed the allocated DDB for crtc/pipe.
This patch make changes to fail the flip/ioctl if minimum requirement
for pipe exceeds the total ddb allocated to the pipe.
Previously it succeeded but making alloc_size a negative value. Which
will make subsequent calculations for plane ddb allocation bogus & may
lead to screen corruption or system hang.

Changes from V1:
 - Improve commit message as per Ander's comment
 - Remove extra parentheses (Ander)

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 37dd3e7108c9..bbc72069ab57 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4057,6 +4057,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	int num_active;
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
+	uint16_t total_min_blocks = 0;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -4084,10 +4085,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	 */
 
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
-		alloc_size -= minimum[plane_id];
-		alloc_size -= y_minimum[plane_id];
+		total_min_blocks += minimum[plane_id];
+		total_min_blocks += y_minimum[plane_id];
 	}
 
+	if (total_min_blocks > alloc_size) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
+							alloc_size);
+		return -EINVAL;
+	}
+
+	alloc_size -= total_min_blocks;
 	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 08/12] drm/i915/skl+: Watermark calculation cleanup
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (6 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 07/12] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:36   ` Matt Roper
  2017-05-15  8:34 ` [PATCH 09/12] drm/i915/skl+: Perform wm level calculations in separate function Mahesh Kumar
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

This patch cleanup/reorganises the watermark calculation functions.
This patch make use of already available macro
"drm_atomic_crtc_state_for_each_plane_state" to walk through
plane_state list instead of calculating plane_state in function itself.

This restructuring will help later patch for new DDB allocation
algorithm to do only algo related changes.

Changes from V1:
 - split the patch in two parts as per Matt's comment

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 53 +++++++++++++++--------------------------
 1 file changed, 19 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index bbc72069ab57..c24a4e1bcb8b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4197,8 +4197,9 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 	return ret;
 }
 
-static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
-					      struct intel_plane_state *pstate)
+static uint32_t
+skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
+			      const struct intel_plane_state *pstate)
 {
 	uint64_t adjusted_pixel_rate;
 	uint_fixed_16_16_t downscale_amount;
@@ -4220,7 +4221,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
-				struct intel_plane_state *intel_pstate,
+				const struct intel_plane_state *intel_pstate,
 				uint16_t ddb_allocation,
 				int level,
 				uint16_t *out_blocks, /* out */
@@ -4228,8 +4229,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				bool *enabled /* out */)
 {
 	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
-	struct drm_plane_state *pstate = &intel_pstate->base;
-	struct drm_framebuffer *fb = pstate->fb;
+	const struct drm_plane_state *pstate = &intel_pstate->base;
+	const struct drm_framebuffer *fb = pstate->fb;
 	uint32_t latency = dev_priv->wm.skl_latency[level];
 	uint_fixed_16_16_t method1, method2;
 	uint_fixed_16_16_t plane_blocks_per_line;
@@ -4384,37 +4385,17 @@ static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
-		     struct intel_plane *intel_plane,
+		     const struct intel_plane_state *intel_pstate,
 		     int level,
 		     struct skl_wm_level *result)
 {
-	struct drm_atomic_state *state = cstate->base.state;
 	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
-	struct drm_plane *plane = &intel_plane->base;
-	struct intel_plane_state *intel_pstate = NULL;
+	struct drm_plane *plane = intel_pstate->base.plane;
+	struct intel_plane *intel_plane = to_intel_plane(plane);
 	uint16_t ddb_blocks;
 	enum pipe pipe = intel_crtc->pipe;
 	int ret;
 
-	if (state)
-		intel_pstate =
-			intel_atomic_get_existing_plane_state(state,
-							      intel_plane);
-
-	/*
-	 * Note: If we start supporting multiple pending atomic commits against
-	 * the same planes/CRTC's in the future, plane->state will no longer be
-	 * the correct pre-state to use for the calculations here and we'll
-	 * need to change where we get the 'unchanged' plane data from.
-	 *
-	 * For now this is fine because we only allow one queued commit against
-	 * a CRTC.  Even if the plane isn't modified by this transaction and we
-	 * don't have a plane lock, we still have the CRTC's lock, so we know
-	 * that no other transactions are racing with us to update it.
-	 */
-	if (!intel_pstate)
-		intel_pstate = to_intel_plane_state(plane->state);
-
 	if (WARN_ON(!intel_pstate->base.fb))
 		return -EINVAL;
 
@@ -4475,8 +4456,10 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 			     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_device *dev = cstate->base.crtc->dev;
+	struct drm_crtc_state *crtc_state = &cstate->base;
 	const struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_plane *intel_plane;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
 	struct skl_plane_wm *wm;
 	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
@@ -4487,14 +4470,16 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	 */
 	memset(pipe_wm->planes, 0, sizeof(pipe_wm->planes));
 
-	for_each_intel_plane_mask(&dev_priv->drm,
-				  intel_plane,
-				  cstate->base.plane_mask) {
-		wm = &pipe_wm->planes[intel_plane->id];
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		const struct intel_plane_state *intel_pstate =
+						to_intel_plane_state(pstate);
+		enum plane_id plane_id = to_intel_plane(plane)->id;
+
+		wm = &pipe_wm->planes[plane_id];
 
 		for (level = 0; level <= max_level; level++) {
 			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-						   intel_plane, level,
+						   intel_pstate, level,
 						   &wm->wm[level]);
 			if (ret)
 				return ret;
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 09/12] drm/i915/skl+: Perform wm level calculations in separate function
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (7 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 08/12] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:36   ` Matt Roper
  2017-05-15  8:34 ` [PATCH 10/12] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Instead of iterating over planes & wm levels in a single function use
skl_compute_wm_level function to interate over WM levels.
Change name of function to skl_compute_wm_levels (Matt).

These changes are to clean-up WM code & will help in making only new
ddb algorithm related changes in later patch in series.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 48 ++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c24a4e1bcb8b..0f1d9f672e83 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4382,18 +4382,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 }
 
 static int
-skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
-		     struct intel_crtc_state *cstate,
-		     const struct intel_plane_state *intel_pstate,
-		     int level,
-		     struct skl_wm_level *result)
+skl_compute_wm_levels(const struct drm_i915_private *dev_priv,
+		      struct skl_ddb_allocation *ddb,
+		      struct intel_crtc_state *cstate,
+		      const struct intel_plane_state *intel_pstate,
+		      struct skl_plane_wm *wm)
 {
 	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
 	struct drm_plane *plane = intel_pstate->base.plane;
 	struct intel_plane *intel_plane = to_intel_plane(plane);
 	uint16_t ddb_blocks;
 	enum pipe pipe = intel_crtc->pipe;
+	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
 	if (WARN_ON(!intel_pstate->base.fb))
@@ -4401,16 +4401,20 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 
 	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
 
-	ret = skl_compute_plane_wm(dev_priv,
-				   cstate,
-				   intel_pstate,
-				   ddb_blocks,
-				   level,
-				   &result->plane_res_b,
-				   &result->plane_res_l,
-				   &result->plane_en);
-	if (ret)
-		return ret;
+	for (level = 0; level <= max_level; level++) {
+		struct skl_wm_level *result = &wm->wm[level];
+
+		ret = skl_compute_plane_wm(dev_priv,
+					   cstate,
+					   intel_pstate,
+					   ddb_blocks,
+					   level,
+					   &result->plane_res_b,
+					   &result->plane_res_l,
+					   &result->plane_en);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
@@ -4461,7 +4465,6 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	struct drm_plane *plane;
 	const struct drm_plane_state *pstate;
 	struct skl_plane_wm *wm;
-	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
 	/*
@@ -4477,13 +4480,10 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 		wm = &pipe_wm->planes[plane_id];
 
-		for (level = 0; level <= max_level; level++) {
-			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-						   intel_pstate, level,
-						   &wm->wm[level]);
-			if (ret)
-				return ret;
-		}
+		ret = skl_compute_wm_levels(dev_priv, ddb, cstate,
+					    intel_pstate, wm);
+		if (ret)
+			return ret;
 		skl_compute_transition_wm(cstate, &wm->trans_wm);
 	}
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 10/12] drm/i915/skl+: use linetime latency if ddb size is not available
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (8 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 09/12] drm/i915/skl+: Perform wm level calculations in separate function Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:38   ` Matt Roper
  2017-05-15  8:34 ` [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

This patch make changes to use linetime latency if allocated
DDB size during plane watermark calculation is not available,
This is required to implement new DDB allocation algorithm.

In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to be used during
switch-case for WM blocks selection.

Changes since v1:
 - Rebase on top of Paulo's patch series
Changes since v2:
 - Fix if-else condition (pointed by Maarten)
Changes since v3:
 - Use common function for timetime_us calculation (Paulo)
 - rebase on drm-tip
Changes since v4:
 - Use consistent name for fixed_point operation

Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  7 +++++++
 drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++++++++++++++---------
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a1858c3eb33a..84052990e695 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -115,6 +115,13 @@ typedef struct {
 	fp; \
 })
 
+static inline bool is_fixed16_zero(uint_fixed_16_16_t val)
+{
+	if (val.val == 0)
+		return true;
+	return false;
+}
+
 static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
 {
 	uint_fixed_16_16_t fp;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0f1d9f672e83..d73369c2c2d9 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4197,6 +4197,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 	return ret;
 }
 
+static uint_fixed_16_16_t
+skl_get_linetime_us(struct intel_crtc_state *cstate)
+{
+	uint32_t pixel_rate;
+	uint32_t crtc_htotal;
+	uint_fixed_16_16_t linetime_us;
+
+	if (!cstate->base.active)
+		return u32_to_fixed_16_16(0);
+
+	pixel_rate = cstate->pixel_rate;
+
+	if (WARN_ON(pixel_rate == 0))
+		return u32_to_fixed_16_16(0);
+
+	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
+	linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate);
+
+	return linetime_us;
+}
+
 static uint32_t
 skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
 			      const struct intel_plane_state *pstate)
@@ -4331,12 +4352,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		selected_result = max_fixed_16_16(method2, y_tile_minimum);
 	} else {
+		uint32_t linetime_us;
+
+		linetime_us = fixed_16_16_to_u32_round_up(
+				skl_get_linetime_us(cstate));
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation /
+		else if ((ddb_allocation && ddb_allocation /
 			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
 			selected_result = min_fixed_16_16(method1, method2);
+		else if (latency >= linetime_us)
+			selected_result = method2;
 		else
 			selected_result = method1;
 	}
@@ -4424,19 +4451,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_i915_private *dev_priv = to_i915(state->dev);
-	uint32_t pixel_rate;
+	uint_fixed_16_16_t linetime_us;
 	uint32_t linetime_wm;
 
-	if (!cstate->base.active)
-		return 0;
+	linetime_us = skl_get_linetime_us(cstate);
 
-	pixel_rate = cstate->pixel_rate;
-
-	if (WARN_ON(pixel_rate == 0))
+	if (is_fixed16_zero(linetime_us))
 		return 0;
 
-	linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal *
-				   1000, pixel_rate);
+	linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8,
+				linetime_us));
 
 	/* Display WA #1135: bxt. */
 	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (9 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 10/12] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:38   ` Matt Roper
  2017-05-16 15:29   ` Maarten Lankhorst
  2017-05-15  8:34 ` [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
  2017-05-15 11:02 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev7) Patchwork
  12 siblings, 2 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

This patch implements new DDB allocation algorithm as per HW team
recommendation. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane in crtc, for efficient power saving.

Changes since v1:
 - Rebase on top of Paulo's patch series

Changes since v2:
 - Fix the for loop condition to enable WM

Changes since v3:
 - Fix crash in cursor i-g-t reported by Maarten
 - Rebase after addressing Paulo's comments
 - Few other ULT fixes
Changes since v4:
 - Rebase on drm-tip
 - Added separate function to enable WM levels
Changes since v5:
 - Fix a crash identified in skl-6770HQ system

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 253 ++++++++++++++++++++++++----------------
 1 file changed, 152 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d73369c2c2d9..d6b0ae0ef7a2 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4039,13 +4039,41 @@ skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
 	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
 }
 
+static void
+skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
+			   uint16_t plane_ddb,
+			   uint16_t max_level,
+			   struct skl_plane_wm *wm)
+{
+	int level;
+	/*
+	 * Now enable all levels in WM structure which can be enabled
+	 * using current DDB allocation
+	 */
+	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
+		struct skl_wm_level *level_wm = &wm->wm[level];
+
+		if (level > max_level || level_wm->plane_res_b == 0
+				      || level_wm->plane_res_l >= 31
+				      || level_wm->plane_res_b >= plane_ddb) {
+			level_wm->plane_en = false;
+			level_wm->plane_res_b = 0;
+			level_wm->plane_res_l = 0;
+		} else {
+			level_wm->plane_en = true;
+		}
+	}
+}
+
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
-		      struct skl_ddb_allocation *ddb /* out */)
+		      struct skl_ddb_allocation *ddb /* out */,
+		      struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_crtc *crtc = cstate->base.crtc;
 	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	enum pipe pipe = intel_crtc->pipe;
 	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
@@ -4058,6 +4086,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
 	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb;
+	int max_level, level;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -4096,10 +4126,43 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		return -EINVAL;
 	}
 
-	alloc_size -= total_min_blocks;
-	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
+	alloc_size -= minimum[PLANE_CURSOR];
+	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end -
+							minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
+	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
+		total_level_ddb = 0;
+		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+			/*
+			 * TODO: We should calculate watermark values for Y/UV
+			 * plane both in case of NV12 format and use both values
+			 * for ddb calculation. NV12 is disabled as of now, So
+			 * using only single/UV plane value here.
+			 */
+			struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+			uint16_t plane_res_b = wm->wm[level].plane_res_b;
+			uint16_t min = minimum[plane_id] + y_minimum[plane_id];
+
+			if (plane_id == PLANE_CURSOR)
+				continue;
+
+			total_level_ddb += max(plane_res_b, min);
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		return -EINVAL;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	/*
 	 * 2. Distribute the remaining space in proportion to the amount of
 	 * data each plane needs to fetch from memory.
@@ -4115,10 +4178,17 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	start = alloc->start;
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
 		unsigned int data_rate, y_data_rate;
-		uint16_t plane_blocks, y_plane_blocks = 0;
-
-		if (plane_id == PLANE_CURSOR)
+		uint16_t plane_blocks = 0, y_plane_blocks = 0;
+		struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+		uint16_t plane_res_b = wm->wm[max_level].plane_res_b;
+
+		if (plane_id == PLANE_CURSOR) {
+			plane_blocks =
+				skl_ddb_entry_size(&ddb->plane[pipe][plane_id]);
+			skl_enable_plane_wm_levels(dev_priv, plane_blocks,
+						   max_level, wm);
 			continue;
+		}
 
 		data_rate = plane_data_rate[plane_id];
 
@@ -4127,33 +4197,36 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[plane_id];
-		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
-					total_data_rate);
 
 		/* Leave disabled planes at (0,0) */
 		if (data_rate) {
+			plane_blocks = max(minimum[plane_id], plane_res_b);
+			plane_blocks += div_u64((uint64_t)alloc_size *
+					data_rate, total_data_rate);
 			ddb->plane[pipe][plane_id].start = start;
 			ddb->plane[pipe][plane_id].end = start + plane_blocks;
+			start += plane_blocks;
 		}
 
-		start += plane_blocks;
-
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we start calculating watermark values for Y/UV
+		 * plane both consider it for initial allowed wm blocks.
 		 */
 		y_data_rate = plane_y_data_rate[plane_id];
 
-		y_plane_blocks = y_minimum[plane_id];
-		y_plane_blocks += div_u64((uint64_t)alloc_size * y_data_rate,
-					total_data_rate);
-
 		if (y_data_rate) {
+			y_plane_blocks = y_minimum[plane_id];
+			y_plane_blocks += div_u64((uint64_t)alloc_size *
+					y_data_rate, total_data_rate);
 			ddb->y_plane[pipe][plane_id].start = start;
 			ddb->y_plane[pipe][plane_id].end = start + y_plane_blocks;
+			start += y_plane_blocks;
 		}
-
-		start += y_plane_blocks;
+		skl_enable_plane_wm_levels(dev_priv,
+					   plane_blocks,
+					   max_level,
+					   wm);
 	}
 
 	return 0;
@@ -4243,11 +4316,9 @@ skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				const struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				uint16_t *out_blocks, /* out */
-				uint8_t *out_lines, /* out */
-				bool *enabled /* out */)
+				uint8_t *out_lines /* out */)
 {
 	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
 	const struct drm_plane_state *pstate = &intel_pstate->base;
@@ -4270,10 +4341,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	bool y_tiled, x_tiled;
 
 	if (latency == 0 ||
-	    !intel_wm_plane_visible(cstate, intel_pstate)) {
-		*enabled = false;
+	    !intel_wm_plane_visible(cstate, intel_pstate))
 		return 0;
-	}
 
 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
 		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
@@ -4359,9 +4428,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation && ddb_allocation /
-			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
-			selected_result = min_fixed_16_16(method1, method2);
 		else if (latency >= linetime_us)
 			selected_result = method2;
 		else
@@ -4381,64 +4447,41 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			struct drm_plane *plane = pstate->plane;
+	if (res_lines >= 31 && level == 0) {
+		struct drm_plane *plane = pstate->plane;
 
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
-				      plane->base.id, plane->name,
-				      res_blocks, ddb_allocation, res_lines);
-			return -EINVAL;
-		}
+		DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
+		DRM_DEBUG_KMS("[PLANE:%d:%s] lines required = %u/31\n",
+				plane->base.id, plane->name, res_lines);
 	}
 
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_levels(const struct drm_i915_private *dev_priv,
-		      struct skl_ddb_allocation *ddb,
 		      struct intel_crtc_state *cstate,
 		      const struct intel_plane_state *intel_pstate,
 		      struct skl_plane_wm *wm)
 {
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
-	struct drm_plane *plane = intel_pstate->base.plane;
-	struct intel_plane *intel_plane = to_intel_plane(plane);
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
 	if (WARN_ON(!intel_pstate->base.fb))
 		return -EINVAL;
 
-	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
-
 	for (level = 0; level <= max_level; level++) {
 		struct skl_wm_level *result = &wm->wm[level];
 
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   &result->plane_res_b,
-					   &result->plane_res_l,
-					   &result->plane_en);
+					   &result->plane_res_l);
 		if (ret)
 			return ret;
 	}
@@ -4504,8 +4547,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 		wm = &pipe_wm->planes[plane_id];
 
-		ret = skl_compute_wm_levels(dev_priv, ddb, cstate,
-					    intel_pstate, wm);
+		ret = skl_compute_wm_levels(dev_priv, cstate, intel_pstate, wm);
 		if (ret)
 			return ret;
 		skl_compute_transition_wm(cstate, &wm->trans_wm);
@@ -4618,6 +4660,45 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 	return false;
 }
 
+static int
+skl_ddb_add_affected_planes(struct intel_crtc_state *cstate,
+			    const struct skl_pipe_wm *old_pipe_wm,
+			    const struct skl_pipe_wm *pipe_wm)
+{
+	struct drm_atomic_state *state = cstate->base.state;
+	struct drm_device *dev = state->dev;
+	struct drm_crtc *crtc = cstate->base.crtc;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
+	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct drm_plane_state *plane_state;
+	struct drm_plane *plane;
+	enum pipe pipe = intel_crtc->pipe;
+
+	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
+
+	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
+		enum plane_id plane_id = to_intel_plane(plane)->id;
+		const struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+		const struct skl_plane_wm *old_wm = &old_pipe_wm->planes[plane_id];
+
+		if ((skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
+					&new_ddb->plane[pipe][plane_id]) &&
+		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
+					&new_ddb->y_plane[pipe][plane_id])) &&
+		    !memcmp(wm, old_wm, sizeof(struct skl_plane_wm)))
+			continue;
+
+		plane_state = drm_atomic_get_plane_state(state, plane);
+		if (IS_ERR(plane_state))
+			return PTR_ERR(plane_state);
+	}
+
+	return 0;
+}
+
 static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
 			      const struct skl_pipe_wm *old_pipe_wm,
 			      struct skl_pipe_wm *pipe_wm, /* out */
@@ -4631,6 +4712,18 @@ static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
 	if (ret)
 		return ret;
 
+	ret = skl_allocate_pipe_ddb(intel_cstate, ddb, pipe_wm);
+	if (ret)
+		return ret;
+	/*
+	 * TODO: Do we still need to add planes in state, As WM update is
+	 * not part of update_plane anymore, So wm for planes can be updated
+	 * irrespective of updade_plane call.
+	 */
+	ret = skl_ddb_add_affected_planes(intel_cstate, old_pipe_wm, pipe_wm);
+	if (ret)
+		return ret;
+
 	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
 		*changed = false;
 	else
@@ -4653,41 +4746,7 @@ pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_ddb_add_affected_planes(struct intel_crtc_state *cstate)
-{
-	struct drm_atomic_state *state = cstate->base.state;
-	struct drm_device *dev = state->dev;
-	struct drm_crtc *crtc = cstate->base.crtc;
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
-	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
-	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
-	struct drm_plane_state *plane_state;
-	struct drm_plane *plane;
-	enum pipe pipe = intel_crtc->pipe;
-
-	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
-
-	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
-		enum plane_id plane_id = to_intel_plane(plane)->id;
-
-		if (skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
-					&new_ddb->plane[pipe][plane_id]) &&
-		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
-					&new_ddb->y_plane[pipe][plane_id]))
-			continue;
-
-		plane_state = drm_atomic_get_plane_state(state, plane);
-		if (IS_ERR(plane_state))
-			return PTR_ERR(plane_state);
-	}
-
-	return 0;
-}
-
-static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_crtc(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -4751,14 +4810,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
-
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
-		ret = skl_ddb_add_affected_planes(cstate);
-		if (ret)
-			return ret;
 	}
 
 	return 0;
@@ -4839,7 +4890,7 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_crtc(state);
 	if (ret)
 		return ret;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (10 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
@ 2017-05-15  8:34 ` Mahesh Kumar
  2017-05-15 22:39   ` Matt Roper
  2017-05-16 11:27   ` Maarten Lankhorst
  2017-05-15 11:02 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev7) Patchwork
  12 siblings, 2 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

A display resolution is only supported if it meets all the restrictions
below for Maximum Pipe Pixel Rate.

The display resolution must fit within the maximum pixel rate output
from the pipe. Make sure that the display pipe is able to feed pixels at
a rate required to support the desired resolution.
For each enabled plane on the pipe {
    If plane scaling enabled {
	Horizontal down scale amount = Maximum[1, plane horizontal size /
		    scaler horizontal window size]
	Vertical down scale amount = Maximum[1, plane vertical size /
		    scaler vertical window size]
	Plane down scale amount = Horizontal down scale amount *
		    Vertical down scale amount
	Plane Ratio = 1 / Plane down scale amount
    }
    Else {
	Plane Ratio = 1
    }
    If plane source pixel format is 64 bits per pixel {
	Plane Ratio = Plane Ratio * 8/9
    }
}

Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe

If pipe scaling is enabled {
    Horizontal down scale amount = Maximum[1, pipe horizontal source size /
		scaler horizontal window size]
    Vertical down scale amount = Maximum[1, pipe vertical source size /
		scaler vertical window size]
    Note: The progressive fetch - interlace display mode is equivalent to a
		2.0 vertical down scale
    Pipe down scale amount = Horizontal down scale amount *
		Vertical down scale amount
    Pipe Ratio = Pipe Ratio / Pipe down scale amount
}

Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio

Changes since V1:
 - separate out fixed_16_16 wrapper API definition
Changes since V2:
 - Fix buggy crtc !active condition (Maarten)
 - use intel_wm_plane_visible wrapper as per Maarten's suggestion
Changes since V3:
 - Change failure return from ERANGE to EINVAL
Changes since V3:
 - Rebase based on previous patch changes

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c |  3 ++
 drivers/gpu/drm/i915/intel_drv.h     |  2 +
 drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
 3 files changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index c1cbfcef1f89..2cc1d11d9e6c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11186,6 +11186,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 			ret = skl_update_scaler_crtc(pipe_config);
 
 		if (!ret)
+			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
+							    pipe_config);
+		if (!ret)
 			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
 							 pipe_config);
 	}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index bd500977b3fc..93afac4a83fa 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1885,6 +1885,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 				 int ignore);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d6b0ae0ef7a2..5a3d3398ae19 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3863,6 +3863,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	return mul_fixed16(downscale_w, downscale_h);
 }
 
+static uint_fixed_16_16_t
+skl_pipe_downscale_amount(const struct intel_crtc_state *config)
+{
+	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
+
+	if (!config->base.active)
+		return pipe_downscale;
+
+	if (config->pch_pfit.enabled) {
+		uint32_t src_w, src_h, dst_w, dst_h;
+		uint32_t pfit_size = config->pch_pfit.size;
+		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+		uint_fixed_16_16_t downscale_h, downscale_w;
+
+		src_w = config->pipe_src_w;
+		src_h = config->pipe_src_h;
+		dst_w = pfit_size >> 16;
+		dst_h = pfit_size & 0xffff;
+
+		if (!dst_w || !dst_h)
+			return pipe_downscale;
+
+		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+
+		pipe_downscale = mul_fixed16(downscale_w, downscale_h);
+	}
+
+	return pipe_downscale;
+}
+
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate)
+{
+	struct drm_crtc_state *crtc_state = &cstate->base;
+	struct drm_atomic_state *state = crtc_state->state;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
+	struct intel_plane_state *intel_pstate;
+	int crtc_clock, cdclk;
+	uint32_t pipe_max_pixel_rate;
+	uint_fixed_16_16_t pipe_downscale;
+	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
+
+	if (!cstate->base.active)
+		return 0;
+
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		uint_fixed_16_16_t plane_downscale;
+		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
+		int bpp;
+
+		if (!intel_wm_plane_visible(cstate,
+					    to_intel_plane_state(pstate)))
+			continue;
+
+		if (WARN_ON(!pstate->fb))
+			continue;
+
+		intel_pstate = to_intel_plane_state(pstate);
+		plane_downscale = skl_plane_downscale_amount(cstate,
+							     intel_pstate);
+		bpp = pstate->fb->format->cpp[0] * 8;
+		if (bpp == 64)
+			plane_downscale = mul_fixed16(plane_downscale,
+						      fp_9_div_8);
+
+		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
+	}
+	pipe_downscale = skl_pipe_downscale_amount(cstate);
+
+	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
+
+	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
+	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
+	pipe_max_pixel_rate = div_round_up_u32_fixed16(cdclk, pipe_downscale);
+
+	if (pipe_max_pixel_rate < crtc_clock) {
+		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static unsigned int
 skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 			     const struct drm_plane_state *pstate,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev7)
  2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (11 preceding siblings ...)
  2017-05-15  8:34 ` [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
@ 2017-05-15 11:02 ` Patchwork
  12 siblings, 0 replies; 27+ messages in thread
From: Patchwork @ 2017-05-15 11:02 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Implement DDB algorithm and WM cleanup (rev7)
URL   : https://patchwork.freedesktop.org/series/20376/
State : success

== Summary ==

Series 20376v7 Implement DDB algorithm and WM cleanup
https://patchwork.freedesktop.org/api/1.0/series/20376/revisions/7/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                fail       -> PASS       (fi-snb-2600) fdo#100007
Test kms_busy:
        Subgroup basic-flip-default-c:
                incomplete -> PASS       (fi-bxt-j4205) fdo#101047

fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007
fdo#101047 https://bugs.freedesktop.org/show_bug.cgi?id=101047

fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:432s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:524s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:491s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:485s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:424s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:417s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:493s
fi-kbl-7500u     total:278  pass:255  dwarn:5   dfail:0   fail:0   skip:18  time:463s
fi-kbl-7560u     total:278  pass:263  dwarn:5   dfail:0   fail:0   skip:10  time:573s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:584s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:464s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:501s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:441s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:535s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:409s

d79d2bb351aa06bc5b39c070068fea112cfa29ef drm-tip: 2017y-05m-15d-08h-55m-10s UTC integration manifest
d4f7d74 drm/i915/skl+: consider max supported plane pixel rate while scaling
887fd8e drm/i915/skl: New ddb allocation algorithm
47456f6 drm/i915/skl+: use linetime latency if ddb size is not available
27fd743 drm/i915/skl+: Perform wm level calculations in separate function
f2daf18 drm/i915/skl+: Watermark calculation cleanup
b2f8e82 drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
28e546a drm/i915/skl+: no need to memset again
269d79f drm/i915/skl: Fail the flip if no FB for WM calculation
07c6b2f drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
fe8a648 drm/i915: Use fixed_16_16 wrapper for division operation
4a0589d drm/i915: Add more wrapper for fixed_point_16_16 operations
58a1b07 drm/i915: fix naming of fixed_16_16 wrapper.

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4693/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 02/12] drm/i915: Add more wrapper for fixed_point_16_16 operations
  2017-05-15  8:34 ` [PATCH 02/12] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
@ 2017-05-15 22:35   ` Matt Roper
  0 siblings, 0 replies; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:35 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:27PM +0530, Mahesh Kumar wrote:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
> 
> This patch adds few wrapper to perform fixed_point_16_16 operations
> mul_round_up_u32_fixed16 : Multiplies u32 and fixed_16_16_t variables
> 			   & returns u32 result with rounding-up.
> mul_fixed16 : Multiplies two fixed_16_16_t variable & returns fixed_16_16
> div_round_up_fixed16 : Perform division operation on fixed_16_16_t
> 		       variables & return u32 result with round-off
> div_round_up_u32_fixed16 : devide uint32_t variable by fixed_16_16 variable
> 			   and round_up the result to uint32_t.
> 
> These wrappers will be used by later patches in the series.
> 
> Changes from V1:
>  - Rename wrapper as per Matt's comment
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 43 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f56fa1f71c11..a1858c3eb33a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -153,6 +153,38 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
>  	return max;
>  }
>  
> +static inline uint32_t div_round_up_fixed16(uint_fixed_16_16_t val,
> +					    uint_fixed_16_16_t d)
> +{
> +	return DIV_ROUND_UP(val.val, d.val);
> +}
> +
> +static inline uint32_t mul_round_up_u32_fixed16(uint32_t val,
> +						uint_fixed_16_16_t mul)
> +{
> +       uint64_t intermediate_val;
> +       uint32_t result;
> +
> +       intermediate_val = (uint64_t) val * mul.val;
> +       intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
> +       WARN_ON(intermediate_val >> 32);
> +       result = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +       return result;
> +}

It looks like the indentation of this function is wrong (spaces instead
of tabs).

With that fixed,

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> +
> +static inline uint_fixed_16_16_t mul_fixed16(uint_fixed_16_16_t val,
> +					     uint_fixed_16_16_t mul)
> +{
> +	uint64_t intermediate_val;
> +	uint_fixed_16_16_t fp;
> +
> +	intermediate_val = (uint64_t) val.val * mul.val;
> +	intermediate_val = intermediate_val >> 16;
> +	WARN_ON(intermediate_val >> 32);
> +	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +	return fp;
> +}
> +
>  static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
>  {
>  	uint_fixed_16_16_t fp, res;
> @@ -175,6 +207,17 @@ static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
>  	return res;
>  }
>  
> +static inline uint32_t div_round_up_u32_fixed16(uint32_t val,
> +						uint_fixed_16_16_t d)
> +{
> +	uint64_t interm_val;
> +
> +	interm_val = (uint64_t)val << 16;
> +	interm_val = DIV_ROUND_UP_ULL(interm_val, d.val);
> +	WARN_ON(interm_val >> 32);
> +	return clamp_t(uint32_t, interm_val, 0, ~0);
> +}
> +
>  static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
>  						     uint_fixed_16_16_t mul)
>  {
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 05/12] drm/i915/skl: Fail the flip if no FB for WM calculation
  2017-05-15  8:34 ` [PATCH 05/12] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
@ 2017-05-15 22:35   ` Matt Roper
  0 siblings, 0 replies; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:35 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:30PM +0530, Mahesh Kumar wrote:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
> 
> Fail the flip if no FB is present but plane_state is set as visible.
> Above is not a valid combination so instead of continue fail the flip.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_pm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index ab056952cfa4..f494af358874 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4408,7 +4408,8 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
>  	if (!intel_pstate)
>  		intel_pstate = to_intel_plane_state(plane->state);
>  
> -	WARN_ON(!intel_pstate->base.fb);
> +	if (WARN_ON(!intel_pstate->base.fb))
> +		return -EINVAL;
>  
>  	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
>  
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 07/12] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-15  8:34 ` [PATCH 07/12] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
@ 2017-05-15 22:36   ` Matt Roper
  0 siblings, 0 replies; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:36 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:32PM +0530, Mahesh Kumar wrote:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
> 
> DDB minimum requirement of crtc configuration (cumulative of all the
> enabled planes in crtc) may exceed the allocated DDB for crtc/pipe.
> This patch make changes to fail the flip/ioctl if minimum requirement
> for pipe exceeds the total ddb allocated to the pipe.
> Previously it succeeded but making alloc_size a negative value. Which
> will make subsequent calculations for plane ddb allocation bogus & may
> lead to screen corruption or system hang.
> 
> Changes from V1:
>  - Improve commit message as per Ander's comment
>  - Remove extra parentheses (Ander)
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 37dd3e7108c9..bbc72069ab57 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4057,6 +4057,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	int num_active;
>  	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>  	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
> +	uint16_t total_min_blocks = 0;
>  
>  	/* Clear the partitioning for disabled planes. */
>  	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> @@ -4084,10 +4085,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	 */
>  
>  	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
> -		alloc_size -= minimum[plane_id];
> -		alloc_size -= y_minimum[plane_id];
> +		total_min_blocks += minimum[plane_id];
> +		total_min_blocks += y_minimum[plane_id];
>  	}
>  
> +	if (total_min_blocks > alloc_size) {
> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
> +		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
> +							alloc_size);
> +		return -EINVAL;
> +	}
> +
> +	alloc_size -= total_min_blocks;
>  	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>  	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>  
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 08/12] drm/i915/skl+: Watermark calculation cleanup
  2017-05-15  8:34 ` [PATCH 08/12] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
@ 2017-05-15 22:36   ` Matt Roper
  0 siblings, 0 replies; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:36 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:33PM +0530, Mahesh Kumar wrote:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
> 
> This patch cleanup/reorganises the watermark calculation functions.
> This patch make use of already available macro
> "drm_atomic_crtc_state_for_each_plane_state" to walk through
> plane_state list instead of calculating plane_state in function itself.
> 
> This restructuring will help later patch for new DDB allocation
> algorithm to do only algo related changes.
> 
> Changes from V1:
>  - split the patch in two parts as per Matt's comment
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_pm.c | 53 +++++++++++++++--------------------------
>  1 file changed, 19 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index bbc72069ab57..c24a4e1bcb8b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4197,8 +4197,9 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
>  	return ret;
>  }
>  
> -static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
> -					      struct intel_plane_state *pstate)
> +static uint32_t
> +skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
> +			      const struct intel_plane_state *pstate)
>  {
>  	uint64_t adjusted_pixel_rate;
>  	uint_fixed_16_16_t downscale_amount;
> @@ -4220,7 +4221,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  
>  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  				struct intel_crtc_state *cstate,
> -				struct intel_plane_state *intel_pstate,
> +				const struct intel_plane_state *intel_pstate,
>  				uint16_t ddb_allocation,
>  				int level,
>  				uint16_t *out_blocks, /* out */
> @@ -4228,8 +4229,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  				bool *enabled /* out */)
>  {
>  	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
> -	struct drm_plane_state *pstate = &intel_pstate->base;
> -	struct drm_framebuffer *fb = pstate->fb;
> +	const struct drm_plane_state *pstate = &intel_pstate->base;
> +	const struct drm_framebuffer *fb = pstate->fb;
>  	uint32_t latency = dev_priv->wm.skl_latency[level];
>  	uint_fixed_16_16_t method1, method2;
>  	uint_fixed_16_16_t plane_blocks_per_line;
> @@ -4384,37 +4385,17 @@ static int
>  skl_compute_wm_level(const struct drm_i915_private *dev_priv,
>  		     struct skl_ddb_allocation *ddb,
>  		     struct intel_crtc_state *cstate,
> -		     struct intel_plane *intel_plane,
> +		     const struct intel_plane_state *intel_pstate,
>  		     int level,
>  		     struct skl_wm_level *result)
>  {
> -	struct drm_atomic_state *state = cstate->base.state;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
> -	struct drm_plane *plane = &intel_plane->base;
> -	struct intel_plane_state *intel_pstate = NULL;
> +	struct drm_plane *plane = intel_pstate->base.plane;
> +	struct intel_plane *intel_plane = to_intel_plane(plane);
>  	uint16_t ddb_blocks;
>  	enum pipe pipe = intel_crtc->pipe;
>  	int ret;
>  
> -	if (state)
> -		intel_pstate =
> -			intel_atomic_get_existing_plane_state(state,
> -							      intel_plane);
> -
> -	/*
> -	 * Note: If we start supporting multiple pending atomic commits against
> -	 * the same planes/CRTC's in the future, plane->state will no longer be
> -	 * the correct pre-state to use for the calculations here and we'll
> -	 * need to change where we get the 'unchanged' plane data from.
> -	 *
> -	 * For now this is fine because we only allow one queued commit against
> -	 * a CRTC.  Even if the plane isn't modified by this transaction and we
> -	 * don't have a plane lock, we still have the CRTC's lock, so we know
> -	 * that no other transactions are racing with us to update it.
> -	 */
> -	if (!intel_pstate)
> -		intel_pstate = to_intel_plane_state(plane->state);
> -
>  	if (WARN_ON(!intel_pstate->base.fb))
>  		return -EINVAL;
>  
> @@ -4475,8 +4456,10 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  			     struct skl_pipe_wm *pipe_wm)
>  {
>  	struct drm_device *dev = cstate->base.crtc->dev;
> +	struct drm_crtc_state *crtc_state = &cstate->base;
>  	const struct drm_i915_private *dev_priv = to_i915(dev);
> -	struct intel_plane *intel_plane;
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *pstate;
>  	struct skl_plane_wm *wm;
>  	int level, max_level = ilk_wm_max_level(dev_priv);
>  	int ret;
> @@ -4487,14 +4470,16 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  	 */
>  	memset(pipe_wm->planes, 0, sizeof(pipe_wm->planes));
>  
> -	for_each_intel_plane_mask(&dev_priv->drm,
> -				  intel_plane,
> -				  cstate->base.plane_mask) {
> -		wm = &pipe_wm->planes[intel_plane->id];
> +	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
> +		const struct intel_plane_state *intel_pstate =
> +						to_intel_plane_state(pstate);
> +		enum plane_id plane_id = to_intel_plane(plane)->id;
> +
> +		wm = &pipe_wm->planes[plane_id];
>  
>  		for (level = 0; level <= max_level; level++) {
>  			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> -						   intel_plane, level,
> +						   intel_pstate, level,
>  						   &wm->wm[level]);
>  			if (ret)
>  				return ret;
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 09/12] drm/i915/skl+: Perform wm level calculations in separate function
  2017-05-15  8:34 ` [PATCH 09/12] drm/i915/skl+: Perform wm level calculations in separate function Mahesh Kumar
@ 2017-05-15 22:36   ` Matt Roper
  0 siblings, 0 replies; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:36 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:34PM +0530, Mahesh Kumar wrote:
> Instead of iterating over planes & wm levels in a single function use
> skl_compute_wm_level function to interate over WM levels.
> Change name of function to skl_compute_wm_levels (Matt).
> 
> These changes are to clean-up WM code & will help in making only new
> ddb algorithm related changes in later patch in series.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_pm.c | 48 ++++++++++++++++++++---------------------
>  1 file changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index c24a4e1bcb8b..0f1d9f672e83 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4382,18 +4382,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  }
>  
>  static int
> -skl_compute_wm_level(const struct drm_i915_private *dev_priv,
> -		     struct skl_ddb_allocation *ddb,
> -		     struct intel_crtc_state *cstate,
> -		     const struct intel_plane_state *intel_pstate,
> -		     int level,
> -		     struct skl_wm_level *result)
> +skl_compute_wm_levels(const struct drm_i915_private *dev_priv,
> +		      struct skl_ddb_allocation *ddb,
> +		      struct intel_crtc_state *cstate,
> +		      const struct intel_plane_state *intel_pstate,
> +		      struct skl_plane_wm *wm)
>  {
>  	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
>  	struct drm_plane *plane = intel_pstate->base.plane;
>  	struct intel_plane *intel_plane = to_intel_plane(plane);
>  	uint16_t ddb_blocks;
>  	enum pipe pipe = intel_crtc->pipe;
> +	int level, max_level = ilk_wm_max_level(dev_priv);
>  	int ret;
>  
>  	if (WARN_ON(!intel_pstate->base.fb))
> @@ -4401,16 +4401,20 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
>  
>  	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
>  
> -	ret = skl_compute_plane_wm(dev_priv,
> -				   cstate,
> -				   intel_pstate,
> -				   ddb_blocks,
> -				   level,
> -				   &result->plane_res_b,
> -				   &result->plane_res_l,
> -				   &result->plane_en);
> -	if (ret)
> -		return ret;
> +	for (level = 0; level <= max_level; level++) {
> +		struct skl_wm_level *result = &wm->wm[level];
> +
> +		ret = skl_compute_plane_wm(dev_priv,
> +					   cstate,
> +					   intel_pstate,
> +					   ddb_blocks,
> +					   level,
> +					   &result->plane_res_b,
> +					   &result->plane_res_l,
> +					   &result->plane_en);
> +		if (ret)
> +			return ret;
> +	}
>  
>  	return 0;
>  }
> @@ -4461,7 +4465,6 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  	struct drm_plane *plane;
>  	const struct drm_plane_state *pstate;
>  	struct skl_plane_wm *wm;
> -	int level, max_level = ilk_wm_max_level(dev_priv);
>  	int ret;
>  
>  	/*
> @@ -4477,13 +4480,10 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  
>  		wm = &pipe_wm->planes[plane_id];
>  
> -		for (level = 0; level <= max_level; level++) {
> -			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> -						   intel_pstate, level,
> -						   &wm->wm[level]);
> -			if (ret)
> -				return ret;
> -		}
> +		ret = skl_compute_wm_levels(dev_priv, ddb, cstate,
> +					    intel_pstate, wm);
> +		if (ret)
> +			return ret;
>  		skl_compute_transition_wm(cstate, &wm->trans_wm);
>  	}
>  	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 10/12] drm/i915/skl+: use linetime latency if ddb size is not available
  2017-05-15  8:34 ` [PATCH 10/12] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
@ 2017-05-15 22:38   ` Matt Roper
  2017-05-16  9:46     ` Mahesh Kumar
  0 siblings, 1 reply; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:38 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:35PM +0530, Mahesh Kumar wrote:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
> 
> This patch make changes to use linetime latency if allocated
> DDB size during plane watermark calculation is not available,
> This is required to implement new DDB allocation algorithm.
> 
> In New Algorithm DDB is allocated based on WM values, because of which
> number of DDB blocks will not be available during WM calculation,
> So this "linetime latency" is suggested by SV/HW team to be used during
> switch-case for WM blocks selection.

(Just realized I never actually sent my review of this patch on the
previous series review; sorry for the delay.)

I'm not sure if the term "linetime" is completely self-explanatory.  You
might want to explicitly clarify that it's the time it takes to fill a
single row at a given clock rate.  So "linetime latency" here is defined
as "line time in microseconds = pipe horizontal total number of pixels /
pixel rate in MHz."

> 
> Changes since v1:
>  - Rebase on top of Paulo's patch series
> Changes since v2:
>  - Fix if-else condition (pointed by Maarten)
> Changes since v3:
>  - Use common function for timetime_us calculation (Paulo)
>  - rebase on drm-tip
> Changes since v4:
>  - Use consistent name for fixed_point operation
> 
> Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |  7 +++++++
>  drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++++++++++++++---------
>  2 files changed, 40 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a1858c3eb33a..84052990e695 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -115,6 +115,13 @@ typedef struct {
>  	fp; \
>  })
>  
> +static inline bool is_fixed16_zero(uint_fixed_16_16_t val)
> +{
> +	if (val.val == 0)
> +		return true;
> +	return false;
> +}
> +
>  static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
>  {
>  	uint_fixed_16_16_t fp;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 0f1d9f672e83..d73369c2c2d9 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4197,6 +4197,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
>  	return ret;
>  }
>  
> +static uint_fixed_16_16_t
> +skl_get_linetime_us(struct intel_crtc_state *cstate)

We use the same definition of linetime for all platforms that use the
concept of 'linetime' (hsw+).  So maybe the skl_ prefix here isn't
really appropriate?

For consistency, should we also call this new function when calculating
the HSW/BDW watermarks in hsw_compute_linetime_wm()?  Or consolidate
skl_compute_linetime_wm() with hsw_compute_linetime_wm()?  Potentially
something we could do as a follow-up patch if we don't want to deal with
it while getting this series in.

> +{
> +	uint32_t pixel_rate;
> +	uint32_t crtc_htotal;
> +	uint_fixed_16_16_t linetime_us;
> +
> +	if (!cstate->base.active)
> +		return u32_to_fixed_16_16(0);
> +
> +	pixel_rate = cstate->pixel_rate;
> +
> +	if (WARN_ON(pixel_rate == 0))
> +		return u32_to_fixed_16_16(0);
> +
> +	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
> +	linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate);
> +
> +	return linetime_us;
> +}
> +
>  static uint32_t
>  skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
>  			      const struct intel_plane_state *pstate)
> @@ -4331,12 +4352,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	if (y_tiled) {
>  		selected_result = max_fixed_16_16(method2, y_tile_minimum);
>  	} else {
> +		uint32_t linetime_us;
> +
> +		linetime_us = fixed_16_16_to_u32_round_up(
> +				skl_get_linetime_us(cstate));
>  		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
>  		    (plane_bytes_per_line / 512 < 1))
>  			selected_result = method2;
> -		else if ((ddb_allocation /
> +		else if ((ddb_allocation && ddb_allocation /
>  			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
>  			selected_result = min_fixed_16_16(method1, method2);
> +		else if (latency >= linetime_us)
> +			selected_result = method2;

Based on the commit message, I'm a little bit confused.  The plan is to
switch to the new algorithm where we calculate watermarks first and then
partition DDB second.  So at the end, ddb_allocation should never be
available here, right?  In that case, can we just remove it as a
parameter to this function and drop your first 'else if' branch here?
That would make us take the new decision logic immediately and give us
an earlier idea whether it really behaves the way we expect it to,
rather than waiting until we've made a bunch of other changes too.

It's also not clear to me why with the old algorithm, the DDB allocation
test being true meant "use the lower of the two watermark methods"
whereas the linetime test means "aways use method2."  Is there any more
explanation available for how/why the linetime latency plays into the
watermark method decision?  The new decision isn't reflected in the
bspec yet (that I can see), so it's kind of hard to review that we're
definitely doing the right thing.


Matt

>  		else
>  			selected_result = method1;
>  	}
> @@ -4424,19 +4451,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
>  {
>  	struct drm_atomic_state *state = cstate->base.state;
>  	struct drm_i915_private *dev_priv = to_i915(state->dev);
> -	uint32_t pixel_rate;
> +	uint_fixed_16_16_t linetime_us;
>  	uint32_t linetime_wm;
>  
> -	if (!cstate->base.active)
> -		return 0;
> +	linetime_us = skl_get_linetime_us(cstate);
>  
> -	pixel_rate = cstate->pixel_rate;
> -
> -	if (WARN_ON(pixel_rate == 0))
> +	if (is_fixed16_zero(linetime_us))
>  		return 0;
>  
> -	linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal *
> -				   1000, pixel_rate);
> +	linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8,
> +				linetime_us));
>  
>  	/* Display WA #1135: bxt. */
>  	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm
  2017-05-15  8:34 ` [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
@ 2017-05-15 22:38   ` Matt Roper
  2017-05-16 12:57     ` Mahesh Kumar
  2017-05-16 15:29   ` Maarten Lankhorst
  1 sibling, 1 reply; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:38 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:36PM +0530, Mahesh Kumar wrote:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
> 
> This patch implements new DDB allocation algorithm as per HW team
> recommendation. This algo takecare of scenario where we allocate less DDB
> for the planes with lower relative pixel rate, but they require more DDB
> to work.
> It also takes care of enabling same watermark level for each
> plane in crtc, for efficient power saving.
> 
> Changes since v1:
>  - Rebase on top of Paulo's patch series
> 
> Changes since v2:
>  - Fix the for loop condition to enable WM
> 
> Changes since v3:
>  - Fix crash in cursor i-g-t reported by Maarten
>  - Rebase after addressing Paulo's comments
>  - Few other ULT fixes
> Changes since v4:
>  - Rebase on drm-tip
>  - Added separate function to enable WM levels
> Changes since v5:
>  - Fix a crash identified in skl-6770HQ system
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 253 ++++++++++++++++++++++++----------------
>  1 file changed, 152 insertions(+), 101 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index d73369c2c2d9..d6b0ae0ef7a2 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4039,13 +4039,41 @@ skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
>  	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
>  }
>  
> +static void
> +skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
> +			   uint16_t plane_ddb,
> +			   uint16_t max_level,
> +			   struct skl_plane_wm *wm)
> +{
> +	int level;
> +	/*
> +	 * Now enable all levels in WM structure which can be enabled
> +	 * using current DDB allocation
> +	 */
> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
> +		struct skl_wm_level *level_wm = &wm->wm[level];
> +
> +		if (level > max_level || level_wm->plane_res_b == 0
> +				      || level_wm->plane_res_l >= 31
> +				      || level_wm->plane_res_b >= plane_ddb) {

Is it possible to hit level_wm->plane_res_b >= plane_ddb without hitting
level > max_level given our new logic?

> +			level_wm->plane_en = false;
> +			level_wm->plane_res_b = 0;
> +			level_wm->plane_res_l = 0;
> +		} else {
> +			level_wm->plane_en = true;
> +		}
> +	}
> +}
> +
>  static int
>  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
> -		      struct skl_ddb_allocation *ddb /* out */)
> +		      struct skl_ddb_allocation *ddb /* out */,
> +		      struct skl_pipe_wm *pipe_wm)

Not a huge deal, but it's a bit confusing to have the 'out' parameter in
the middle of 'in' parameters.  Maybe move the pipe_wm before the ddb
just to make this a little cleaner?

>  {
>  	struct drm_atomic_state *state = cstate->base.state;
>  	struct drm_crtc *crtc = cstate->base.crtc;
>  	struct drm_device *dev = crtc->dev;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	enum pipe pipe = intel_crtc->pipe;
>  	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
> @@ -4058,6 +4086,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>  	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
>  	uint16_t total_min_blocks = 0;
> +	uint16_t total_level_ddb;
> +	int max_level, level;
>  
>  	/* Clear the partitioning for disabled planes. */
>  	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> @@ -4096,10 +4126,43 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  		return -EINVAL;
>  	}
>  
> -	alloc_size -= total_min_blocks;
> -	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
> +	alloc_size -= minimum[PLANE_CURSOR];
> +	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end -
> +							minimum[PLANE_CURSOR];
>  	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>  
> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
> +		total_level_ddb = 0;
> +		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
> +			/*
> +			 * TODO: We should calculate watermark values for Y/UV
> +			 * plane both in case of NV12 format and use both values
> +			 * for ddb calculation. NV12 is disabled as of now, So
> +			 * using only single/UV plane value here.
> +			 */
> +			struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
> +			uint16_t plane_res_b = wm->wm[level].plane_res_b;
> +			uint16_t min = minimum[plane_id] + y_minimum[plane_id];
> +
> +			if (plane_id == PLANE_CURSOR)
> +				continue;
> +
> +			total_level_ddb += max(plane_res_b, min);
> +		}
> +
> +		if (total_level_ddb <= alloc_size)
> +			break;

Since I got confused on my first time through the review, I suspect
other people might make the same mistake I did.  You might want to put a
comment above this like "If this level can successfully be enabled with
the pipe's current DDB allocation, then all lower levels are guaranteed
to succeed as well."

> +	}
> +
> +	if ((level < 0) || (total_min_blocks > alloc_size)) {
> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
> +		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
> +				total_level_ddb : total_min_blocks, alloc_size);
> +		return -EINVAL;
> +	}
> +	max_level = level;
> +	alloc_size -= total_level_ddb;
> +
>  	/*
>  	 * 2. Distribute the remaining space in proportion to the amount of
>  	 * data each plane needs to fetch from memory.
> @@ -4115,10 +4178,17 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	start = alloc->start;
>  	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>  		unsigned int data_rate, y_data_rate;
> -		uint16_t plane_blocks, y_plane_blocks = 0;
> -
> -		if (plane_id == PLANE_CURSOR)
> +		uint16_t plane_blocks = 0, y_plane_blocks = 0;
> +		struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
> +		uint16_t plane_res_b = wm->wm[max_level].plane_res_b;
> +
> +		if (plane_id == PLANE_CURSOR) {
> +			plane_blocks =
> +				skl_ddb_entry_size(&ddb->plane[pipe][plane_id]);
> +			skl_enable_plane_wm_levels(dev_priv, plane_blocks,
> +						   max_level, wm);
>  			continue;
> +		}
>  
>  		data_rate = plane_data_rate[plane_id];
>  
> @@ -4127,33 +4197,36 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  		 * promote the expression to 64 bits to avoid overflowing, the
>  		 * result is < available as data_rate / total_data_rate < 1
>  		 */
> -		plane_blocks = minimum[plane_id];
> -		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
> -					total_data_rate);
>  
>  		/* Leave disabled planes at (0,0) */
>  		if (data_rate) {
> +			plane_blocks = max(minimum[plane_id], plane_res_b);
> +			plane_blocks += div_u64((uint64_t)alloc_size *
> +					data_rate, total_data_rate);
>  			ddb->plane[pipe][plane_id].start = start;
>  			ddb->plane[pipe][plane_id].end = start + plane_blocks;
> +			start += plane_blocks;
>  		}
>  
> -		start += plane_blocks;
> -
>  		/*
>  		 * allocation for y_plane part of planar format:
> +		 * TODO: Once we start calculating watermark values for Y/UV
> +		 * plane both consider it for initial allowed wm blocks.
>  		 */
>  		y_data_rate = plane_y_data_rate[plane_id];
>  
> -		y_plane_blocks = y_minimum[plane_id];
> -		y_plane_blocks += div_u64((uint64_t)alloc_size * y_data_rate,
> -					total_data_rate);
> -
>  		if (y_data_rate) {
> +			y_plane_blocks = y_minimum[plane_id];
> +			y_plane_blocks += div_u64((uint64_t)alloc_size *
> +					y_data_rate, total_data_rate);
>  			ddb->y_plane[pipe][plane_id].start = start;
>  			ddb->y_plane[pipe][plane_id].end = start + y_plane_blocks;
> +			start += y_plane_blocks;
>  		}
> -
> -		start += y_plane_blocks;
> +		skl_enable_plane_wm_levels(dev_priv,
> +					   plane_blocks,
> +					   max_level,
> +					   wm);
>  	}
>  
>  	return 0;
> @@ -4243,11 +4316,9 @@ skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
>  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  				struct intel_crtc_state *cstate,
>  				const struct intel_plane_state *intel_pstate,
> -				uint16_t ddb_allocation,
>  				int level,
>  				uint16_t *out_blocks, /* out */
> -				uint8_t *out_lines, /* out */
> -				bool *enabled /* out */)
> +				uint8_t *out_lines /* out */)
>  {
>  	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
>  	const struct drm_plane_state *pstate = &intel_pstate->base;
> @@ -4270,10 +4341,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	bool y_tiled, x_tiled;
>  
>  	if (latency == 0 ||
> -	    !intel_wm_plane_visible(cstate, intel_pstate)) {
> -		*enabled = false;
> +	    !intel_wm_plane_visible(cstate, intel_pstate))
>  		return 0;
> -	}
>  
>  	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>  		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> @@ -4359,9 +4428,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
>  		    (plane_bytes_per_line / 512 < 1))
>  			selected_result = method2;
> -		else if ((ddb_allocation && ddb_allocation /
> -			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
> -			selected_result = min_fixed_16_16(method1, method2);
>  		else if (latency >= linetime_us)
>  			selected_result = method2;
>  		else
> @@ -4381,64 +4447,41 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  		}
>  	}
>  
> -	if (res_blocks >= ddb_allocation || res_lines > 31) {
> -		*enabled = false;
> -
> -		/*
> -		 * If there are no valid level 0 watermarks, then we can't
> -		 * support this display configuration.
> -		 */
> -		if (level) {
> -			return 0;
> -		} else {
> -			struct drm_plane *plane = pstate->plane;
> +	if (res_lines >= 31 && level == 0) {
> +		struct drm_plane *plane = pstate->plane;
>  
> -			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
> -			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
> -				      plane->base.id, plane->name,
> -				      res_blocks, ddb_allocation, res_lines);
> -			return -EINVAL;
> -		}
> +		DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
> +		DRM_DEBUG_KMS("[PLANE:%d:%s] lines required = %u/31\n",
> +				plane->base.id, plane->name, res_lines);

Should this still be returning an EINVAL?  It looks like we'll print the
message here, but then continue on and disable watermark level 0 in
skl_enable_plane_wm_levels() without failing the commit.

>  	}
>  
>  	*out_blocks = res_blocks;
>  	*out_lines = res_lines;
> -	*enabled = true;
>  
>  	return 0;
>  }
>  
>  static int
>  skl_compute_wm_levels(const struct drm_i915_private *dev_priv,
> -		      struct skl_ddb_allocation *ddb,
>  		      struct intel_crtc_state *cstate,
>  		      const struct intel_plane_state *intel_pstate,
>  		      struct skl_plane_wm *wm)
>  {
> -	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
> -	struct drm_plane *plane = intel_pstate->base.plane;
> -	struct intel_plane *intel_plane = to_intel_plane(plane);
> -	uint16_t ddb_blocks;
> -	enum pipe pipe = intel_crtc->pipe;
>  	int level, max_level = ilk_wm_max_level(dev_priv);
>  	int ret;
>  
>  	if (WARN_ON(!intel_pstate->base.fb))
>  		return -EINVAL;
>  
> -	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
> -
>  	for (level = 0; level <= max_level; level++) {
>  		struct skl_wm_level *result = &wm->wm[level];
>  
>  		ret = skl_compute_plane_wm(dev_priv,
>  					   cstate,
>  					   intel_pstate,
> -					   ddb_blocks,
>  					   level,
>  					   &result->plane_res_b,
> -					   &result->plane_res_l,
> -					   &result->plane_en);
> +					   &result->plane_res_l);
>  		if (ret)
>  			return ret;
>  	}
> @@ -4504,8 +4547,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  
>  		wm = &pipe_wm->planes[plane_id];
>  
> -		ret = skl_compute_wm_levels(dev_priv, ddb, cstate,
> -					    intel_pstate, wm);
> +		ret = skl_compute_wm_levels(dev_priv, cstate, intel_pstate, wm);
>  		if (ret)
>  			return ret;
>  		skl_compute_transition_wm(cstate, &wm->trans_wm);
> @@ -4618,6 +4660,45 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>  	return false;
>  }
>  
> +static int
> +skl_ddb_add_affected_planes(struct intel_crtc_state *cstate,
> +			    const struct skl_pipe_wm *old_pipe_wm,
> +			    const struct skl_pipe_wm *pipe_wm)
> +{
> +	struct drm_atomic_state *state = cstate->base.state;
> +	struct drm_device *dev = state->dev;
> +	struct drm_crtc *crtc = cstate->base.crtc;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
> +	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
> +	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
> +	struct drm_plane_state *plane_state;
> +	struct drm_plane *plane;
> +	enum pipe pipe = intel_crtc->pipe;
> +
> +	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
> +
> +	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
> +		enum plane_id plane_id = to_intel_plane(plane)->id;
> +		const struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
> +		const struct skl_plane_wm *old_wm = &old_pipe_wm->planes[plane_id];
> +
> +		if ((skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
> +					&new_ddb->plane[pipe][plane_id]) &&
> +		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
> +					&new_ddb->y_plane[pipe][plane_id])) &&
> +		    !memcmp(wm, old_wm, sizeof(struct skl_plane_wm)))
> +			continue;
> +
> +		plane_state = drm_atomic_get_plane_state(state, plane);
> +		if (IS_ERR(plane_state))
> +			return PTR_ERR(plane_state);
> +	}
> +
> +	return 0;
> +}
> +
>  static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>  			      const struct skl_pipe_wm *old_pipe_wm,
>  			      struct skl_pipe_wm *pipe_wm, /* out */
> @@ -4631,6 +4712,18 @@ static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>  	if (ret)
>  		return ret;
>  
> +	ret = skl_allocate_pipe_ddb(intel_cstate, ddb, pipe_wm);
> +	if (ret)
> +		return ret;
> +	/*
> +	 * TODO: Do we still need to add planes in state, As WM update is
> +	 * not part of update_plane anymore, So wm for planes can be updated
> +	 * irrespective of updade_plane call.
> +	 */

I think the watermark registers get written by skl_write_plane_wm()
regardless of whether we add the plane to the state, yes.  However I
think the watermark registers are only armed by writing to the actual
plane registers, which I don't believe will happen anywhere unless you
add the plane to the state.

> +	ret = skl_ddb_add_affected_planes(intel_cstate, old_pipe_wm, pipe_wm);
> +	if (ret)
> +		return ret;
> +
>  	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
>  		*changed = false;
>  	else
> @@ -4653,41 +4746,7 @@ pipes_modified(struct drm_atomic_state *state)
>  }
>  
>  static int
> -skl_ddb_add_affected_planes(struct intel_crtc_state *cstate)
> -{
> -	struct drm_atomic_state *state = cstate->base.state;
> -	struct drm_device *dev = state->dev;
> -	struct drm_crtc *crtc = cstate->base.crtc;
> -	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> -	struct drm_i915_private *dev_priv = to_i915(dev);
> -	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
> -	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
> -	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
> -	struct drm_plane_state *plane_state;
> -	struct drm_plane *plane;
> -	enum pipe pipe = intel_crtc->pipe;
> -
> -	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
> -
> -	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
> -		enum plane_id plane_id = to_intel_plane(plane)->id;
> -
> -		if (skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
> -					&new_ddb->plane[pipe][plane_id]) &&
> -		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
> -					&new_ddb->y_plane[pipe][plane_id]))
> -			continue;
> -
> -		plane_state = drm_atomic_get_plane_state(state, plane);
> -		if (IS_ERR(plane_state))
> -			return PTR_ERR(plane_state);
> -	}
> -
> -	return 0;
> -}
> -
> -static int
> -skl_compute_ddb(struct drm_atomic_state *state)
> +skl_include_affected_crtc(struct drm_atomic_state *state)

Minor nitpick, but I'd make this plural as well.
skl_include_affected_crtcs().


Matt

>  {
>  	struct drm_device *dev = state->dev;
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> @@ -4751,14 +4810,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
>  		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
>  		if (IS_ERR(cstate))
>  			return PTR_ERR(cstate);
> -
> -		ret = skl_allocate_pipe_ddb(cstate, ddb);
> -		if (ret)
> -			return ret;
> -
> -		ret = skl_ddb_add_affected_planes(cstate);
> -		if (ret)
> -			return ret;
>  	}
>  
>  	return 0;
> @@ -4839,7 +4890,7 @@ skl_compute_wm(struct drm_atomic_state *state)
>  	/* Clear all dirty flags */
>  	results->dirty_pipes = 0;
>  
> -	ret = skl_compute_ddb(state);
> +	ret = skl_include_affected_crtc(state);
>  	if (ret)
>  		return ret;
>  
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-15  8:34 ` [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
@ 2017-05-15 22:39   ` Matt Roper
  2017-05-16 11:27   ` Maarten Lankhorst
  1 sibling, 0 replies; 27+ messages in thread
From: Matt Roper @ 2017-05-15 22:39 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 15, 2017 at 02:04:37PM +0530, Mahesh Kumar wrote:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
> 
> A display resolution is only supported if it meets all the restrictions
> below for Maximum Pipe Pixel Rate.
> 
> The display resolution must fit within the maximum pixel rate output
> from the pipe. Make sure that the display pipe is able to feed pixels at
> a rate required to support the desired resolution.
> For each enabled plane on the pipe {
>     If plane scaling enabled {
> 	Horizontal down scale amount = Maximum[1, plane horizontal size /
> 		    scaler horizontal window size]
> 	Vertical down scale amount = Maximum[1, plane vertical size /
> 		    scaler vertical window size]
> 	Plane down scale amount = Horizontal down scale amount *
> 		    Vertical down scale amount
> 	Plane Ratio = 1 / Plane down scale amount
>     }
>     Else {
> 	Plane Ratio = 1
>     }
>     If plane source pixel format is 64 bits per pixel {
> 	Plane Ratio = Plane Ratio * 8/9
>     }
> }
> 
> Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe
> 
> If pipe scaling is enabled {
>     Horizontal down scale amount = Maximum[1, pipe horizontal source size /
> 		scaler horizontal window size]
>     Vertical down scale amount = Maximum[1, pipe vertical source size /
> 		scaler vertical window size]
>     Note: The progressive fetch - interlace display mode is equivalent to a
> 		2.0 vertical down scale
>     Pipe down scale amount = Horizontal down scale amount *
> 		Vertical down scale amount
>     Pipe Ratio = Pipe Ratio / Pipe down scale amount
> }
> 
> Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio

It looks like you copied the above from the bspec, but you might want to
add a note that the actual calculations you do in the code just work
with down scale amounts rather than ratios, which is why a lot of the
code is inverted from the logic above.


> 
> Changes since V1:
>  - separate out fixed_16_16 wrapper API definition
> Changes since V2:
>  - Fix buggy crtc !active condition (Maarten)
>  - use intel_wm_plane_visible wrapper as per Maarten's suggestion
> Changes since V3:
>  - Change failure return from ERANGE to EINVAL
> Changes since V3:
>  - Rebase based on previous patch changes
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c |  3 ++
>  drivers/gpu/drm/i915/intel_drv.h     |  2 +
>  drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 92 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index c1cbfcef1f89..2cc1d11d9e6c 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11186,6 +11186,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
>  			ret = skl_update_scaler_crtc(pipe_config);
>  
>  		if (!ret)
> +			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
> +							    pipe_config);
> +		if (!ret)
>  			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
>  							 pipe_config);
>  	}
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index bd500977b3fc..93afac4a83fa 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1885,6 +1885,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>  				 int ignore);
>  bool ilk_disable_lp_wm(struct drm_device *dev);
>  int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
> +				  struct intel_crtc_state *cstate);
>  static inline int intel_enable_rc6(void)
>  {
>  	return i915.enable_rc6;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index d6b0ae0ef7a2..5a3d3398ae19 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3863,6 +3863,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>  	return mul_fixed16(downscale_w, downscale_h);
>  }
>  
> +static uint_fixed_16_16_t
> +skl_pipe_downscale_amount(const struct intel_crtc_state *config)
> +{
> +	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
> +
> +	if (!config->base.active)
> +		return pipe_downscale;
> +
> +	if (config->pch_pfit.enabled) {
> +		uint32_t src_w, src_h, dst_w, dst_h;
> +		uint32_t pfit_size = config->pch_pfit.size;
> +		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +		uint_fixed_16_16_t downscale_h, downscale_w;
> +
> +		src_w = config->pipe_src_w;
> +		src_h = config->pipe_src_h;
> +		dst_w = pfit_size >> 16;
> +		dst_h = pfit_size & 0xffff;
> +
> +		if (!dst_w || !dst_h)
> +			return pipe_downscale;
> +
> +		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
> +		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
> +		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
> +		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
> +
> +		pipe_downscale = mul_fixed16(downscale_w, downscale_h);
> +	}
> +
> +	return pipe_downscale;
> +}
> +
> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
> +				  struct intel_crtc_state *cstate)
> +{
> +	struct drm_crtc_state *crtc_state = &cstate->base;
> +	struct drm_atomic_state *state = crtc_state->state;
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *pstate;
> +	struct intel_plane_state *intel_pstate;
> +	int crtc_clock, cdclk;
> +	uint32_t pipe_max_pixel_rate;
> +	uint_fixed_16_16_t pipe_downscale;
> +	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
> +
> +	if (!cstate->base.active)
> +		return 0;
> +
> +	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
> +		uint_fixed_16_16_t plane_downscale;
> +		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
> +		int bpp;
> +
> +		if (!intel_wm_plane_visible(cstate,
> +					    to_intel_plane_state(pstate)))
> +			continue;
> +
> +		if (WARN_ON(!pstate->fb))
> +			continue;
> +
> +		intel_pstate = to_intel_plane_state(pstate);
> +		plane_downscale = skl_plane_downscale_amount(cstate,
> +							     intel_pstate);
> +		bpp = pstate->fb->format->cpp[0] * 8;
> +		if (bpp == 64)
> +			plane_downscale = mul_fixed16(plane_downscale,
> +						      fp_9_div_8);
> +
> +		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
> +	}
> +	pipe_downscale = skl_pipe_downscale_amount(cstate);
> +
> +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> +
> +	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
> +	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
> +	pipe_max_pixel_rate = div_round_up_u32_fixed16(cdclk, pipe_downscale);
> +
> +	if (pipe_max_pixel_rate < crtc_clock) {
> +		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");

There seems to be a typo in the message here.  I think you either meant
"Max supported pixel clock with scaling exceeded." or "Max supported
pixel clock with scaling exceeds [platform limits]."

With that fixed (and possible a slightly clarified commit message).

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
>  static unsigned int
>  skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
>  			     const struct drm_plane_state *pstate,
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 10/12] drm/i915/skl+: use linetime latency if ddb size is not available
  2017-05-15 22:38   ` Matt Roper
@ 2017-05-16  9:46     ` Mahesh Kumar
  0 siblings, 0 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-16  9:46 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

Hi,


On Tuesday 16 May 2017 04:08 AM, Matt Roper wrote:
> On Mon, May 15, 2017 at 02:04:35PM +0530, Mahesh Kumar wrote:
>> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
>>
>> This patch make changes to use linetime latency if allocated
>> DDB size during plane watermark calculation is not available,
>> This is required to implement new DDB allocation algorithm.
>>
>> In New Algorithm DDB is allocated based on WM values, because of which
>> number of DDB blocks will not be available during WM calculation,
>> So this "linetime latency" is suggested by SV/HW team to be used during
>> switch-case for WM blocks selection.
> (Just realized I never actually sent my review of this patch on the
> previous series review; sorry for the delay.)
>
> I'm not sure if the term "linetime" is completely self-explanatory.  You
> might want to explicitly clarify that it's the time it takes to fill a
> single row at a given clock rate.  So "linetime latency" here is defined
> as "line time in microseconds = pipe horizontal total number of pixels /
> pixel rate in MHz."
Will update commit message.
>
>> Changes since v1:
>>   - Rebase on top of Paulo's patch series
>> Changes since v2:
>>   - Fix if-else condition (pointed by Maarten)
>> Changes since v3:
>>   - Use common function for timetime_us calculation (Paulo)
>>   - rebase on drm-tip
>> Changes since v4:
>>   - Use consistent name for fixed_point operation
>>
>> Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h |  7 +++++++
>>   drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++++++++++++++---------
>>   2 files changed, 40 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index a1858c3eb33a..84052990e695 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -115,6 +115,13 @@ typedef struct {
>>   	fp; \
>>   })
>>   
>> +static inline bool is_fixed16_zero(uint_fixed_16_16_t val)
>> +{
>> +	if (val.val == 0)
>> +		return true;
>> +	return false;
>> +}
>> +
>>   static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
>>   {
>>   	uint_fixed_16_16_t fp;
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 0f1d9f672e83..d73369c2c2d9 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -4197,6 +4197,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
>>   	return ret;
>>   }
>>   
>> +static uint_fixed_16_16_t
>> +skl_get_linetime_us(struct intel_crtc_state *cstate)
> We use the same definition of linetime for all platforms that use the
> concept of 'linetime' (hsw+).  So maybe the skl_ prefix here isn't
> really appropriate?
will make it intel_get_linetime_us.
> For consistency, should we also call this new function when calculating
> the HSW/BDW watermarks in hsw_compute_linetime_wm()?  Or consolidate
> skl_compute_linetime_wm() with hsw_compute_linetime_wm()?  Potentially
> something we could do as a follow-up patch if we don't want to deal with
> it while getting this series in.
sounds good :), will create a new patch to use intel_get_linetime_us in 
hsw_compute_linetime_wm.
hsw & skl require bunch of other calculations/WA in linetime_wm, so will 
keep them separate function only.
>
>> +{
>> +	uint32_t pixel_rate;
>> +	uint32_t crtc_htotal;
>> +	uint_fixed_16_16_t linetime_us;
>> +
>> +	if (!cstate->base.active)
>> +		return u32_to_fixed_16_16(0);
>> +
>> +	pixel_rate = cstate->pixel_rate;
>> +
>> +	if (WARN_ON(pixel_rate == 0))
>> +		return u32_to_fixed_16_16(0);
>> +
>> +	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
>> +	linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate);
>> +
>> +	return linetime_us;
>> +}
>> +
>>   static uint32_t
>>   skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
>>   			      const struct intel_plane_state *pstate)
>> @@ -4331,12 +4352,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   	if (y_tiled) {
>>   		selected_result = max_fixed_16_16(method2, y_tile_minimum);
>>   	} else {
>> +		uint32_t linetime_us;
>> +
>> +		linetime_us = fixed_16_16_to_u32_round_up(
>> +				skl_get_linetime_us(cstate));
>>   		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
>>   		    (plane_bytes_per_line / 512 < 1))
>>   			selected_result = method2;
>> -		else if ((ddb_allocation /
>> +		else if ((ddb_allocation && ddb_allocation /
>>   			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
>>   			selected_result = min_fixed_16_16(method1, method2);
>> +		else if (latency >= linetime_us)
>> +			selected_result = method2;
> Based on the commit message, I'm a little bit confused.  The plan is to
> switch to the new algorithm where we calculate watermarks first and then
> partition DDB second.  So at the end, ddb_allocation should never be
> available here, right?  In that case, can we just remove it as a
> parameter to this function and drop your first 'else if' branch here?
> That would make us take the new decision logic immediately and give us
> an earlier idea whether it really behaves the way we expect it to,
> rather than waiting until we've made a bunch of other changes too.
right, We will eventually remove first "else if" check in next patch 
itself :)
I kept it in this patch due to earlier comment by Paulo not to remove 
(ddb_allocation / plane blocks per line)
check until patch which removes the ddb_allocation is merged/submitted.
https://patchwork.freedesktop.org/patch/115537/
>
> It's also not clear to me why with the old algorithm, the DDB allocation
> test being true meant "use the lower of the two watermark methods"
> whereas the linetime test means "aways use method2."  Is there any more
> explanation available for how/why the linetime latency plays into the
> watermark method decision?  The new decision isn't reflected in the
> bspec yet (that I can see), so it's kind of hard to review that we're
> definitely doing the right thing.
Thanks alot for catching this, I revisited the BSpec & it says to use 
minimum of method1, method2.
BTW these changes are already included in internal Bspec, After new 
version release it should be part of open BSpec.
I don't know about release process & timeline though :P

-Mahesh
>
>
> Matt
>
>>   		else
>>   			selected_result = method1;
>>   	}
>> @@ -4424,19 +4451,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
>>   {
>>   	struct drm_atomic_state *state = cstate->base.state;
>>   	struct drm_i915_private *dev_priv = to_i915(state->dev);
>> -	uint32_t pixel_rate;
>> +	uint_fixed_16_16_t linetime_us;
>>   	uint32_t linetime_wm;
>>   
>> -	if (!cstate->base.active)
>> -		return 0;
>> +	linetime_us = skl_get_linetime_us(cstate);
>>   
>> -	pixel_rate = cstate->pixel_rate;
>> -
>> -	if (WARN_ON(pixel_rate == 0))
>> +	if (is_fixed16_zero(linetime_us))
>>   		return 0;
>>   
>> -	linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal *
>> -				   1000, pixel_rate);
>> +	linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8,
>> +				linetime_us));
>>   
>>   	/* Display WA #1135: bxt. */
>>   	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
>> -- 
>> 2.11.0
>>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-15  8:34 ` [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
  2017-05-15 22:39   ` Matt Roper
@ 2017-05-16 11:27   ` Maarten Lankhorst
  1 sibling, 0 replies; 27+ messages in thread
From: Maarten Lankhorst @ 2017-05-16 11:27 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Op 15-05-17 om 10:34 schreef Mahesh Kumar:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
>
> A display resolution is only supported if it meets all the restrictions
> below for Maximum Pipe Pixel Rate.
>
> The display resolution must fit within the maximum pixel rate output
> from the pipe. Make sure that the display pipe is able to feed pixels at
> a rate required to support the desired resolution.
> For each enabled plane on the pipe {
>     If plane scaling enabled {
> 	Horizontal down scale amount = Maximum[1, plane horizontal size /
> 		    scaler horizontal window size]
> 	Vertical down scale amount = Maximum[1, plane vertical size /
> 		    scaler vertical window size]
> 	Plane down scale amount = Horizontal down scale amount *
> 		    Vertical down scale amount
> 	Plane Ratio = 1 / Plane down scale amount
>     }
>     Else {
> 	Plane Ratio = 1
>     }
>     If plane source pixel format is 64 bits per pixel {
> 	Plane Ratio = Plane Ratio * 8/9
>     }
> }
>
> Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe
>
> If pipe scaling is enabled {
>     Horizontal down scale amount = Maximum[1, pipe horizontal source size /
> 		scaler horizontal window size]
>     Vertical down scale amount = Maximum[1, pipe vertical source size /
> 		scaler vertical window size]
>     Note: The progressive fetch - interlace display mode is equivalent to a
> 		2.0 vertical down scale
>     Pipe down scale amount = Horizontal down scale amount *
> 		Vertical down scale amount
>     Pipe Ratio = Pipe Ratio / Pipe down scale amount
> }
>
> Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio
>
> Changes since V1:
>  - separate out fixed_16_16 wrapper API definition
> Changes since V2:
>  - Fix buggy crtc !active condition (Maarten)
>  - use intel_wm_plane_visible wrapper as per Maarten's suggestion
> Changes since V3:
>  - Change failure return from ERANGE to EINVAL
> Changes since V3:
>  - Rebase based on previous patch changes
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c |  3 ++
>  drivers/gpu/drm/i915/intel_drv.h     |  2 +
>  drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 92 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index c1cbfcef1f89..2cc1d11d9e6c 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11186,6 +11186,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
>  			ret = skl_update_scaler_crtc(pipe_config);
>  
>  		if (!ret)
> +			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
> +							    pipe_config);
> +		if (!ret)
>  			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
>  							 pipe_config);
>  	}
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index bd500977b3fc..93afac4a83fa 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1885,6 +1885,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>  				 int ignore);
>  bool ilk_disable_lp_wm(struct drm_device *dev);
>  int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
> +				  struct intel_crtc_state *cstate);
>  static inline int intel_enable_rc6(void)
>  {
>  	return i915.enable_rc6;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index d6b0ae0ef7a2..5a3d3398ae19 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3863,6 +3863,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>  	return mul_fixed16(downscale_w, downscale_h);
>  }
>  
> +static uint_fixed_16_16_t
> +skl_pipe_downscale_amount(const struct intel_crtc_state *config)
> +{
> +	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
> +
> +	if (!config->base.active)
> +		return pipe_downscale;
> +
> +	if (config->pch_pfit.enabled) {
> +		uint32_t src_w, src_h, dst_w, dst_h;
> +		uint32_t pfit_size = config->pch_pfit.size;
> +		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +		uint_fixed_16_16_t downscale_h, downscale_w;
> +
> +		src_w = config->pipe_src_w;
> +		src_h = config->pipe_src_h;
> +		dst_w = pfit_size >> 16;
> +		dst_h = pfit_size & 0xffff;
> +
> +		if (!dst_w || !dst_h)
> +			return pipe_downscale;
> +
> +		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
> +		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
> +		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
> +		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
> +
> +		pipe_downscale = mul_fixed16(downscale_w, downscale_h);
> +	}
> +
> +	return pipe_downscale;
> +}
> +
> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
> +				  struct intel_crtc_state *cstate)
> +{
> +	struct drm_crtc_state *crtc_state = &cstate->base;
> +	struct drm_atomic_state *state = crtc_state->state;
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *pstate;
> +	struct intel_plane_state *intel_pstate;
> +	int crtc_clock, cdclk;
> +	uint32_t pipe_max_pixel_rate;
> +	uint_fixed_16_16_t pipe_downscale;
> +	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
> +
> +	if (!cstate->base.active)
> +		return 0;
> +
> +	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
> +		uint_fixed_16_16_t plane_downscale;
> +		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
> +		int bpp;
> +
> +		if (!intel_wm_plane_visible(cstate,
> +					    to_intel_plane_state(pstate)))
> +			continue;
> +
> +		if (WARN_ON(!pstate->fb))
> +			continue;
return -EINVAL in this case? Though it should obviously never happen.

Otherwise series looks good now, I need to look at patch 11 in more detail. But for the rest definitely
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> +		intel_pstate = to_intel_plane_state(pstate);
> +		plane_downscale = skl_plane_downscale_amount(cstate,
> +							     intel_pstate);
> +		bpp = pstate->fb->format->cpp[0] * 8;
> +		if (bpp == 64)
> +			plane_downscale = mul_fixed16(plane_downscale,
> +						      fp_9_div_8);
> +
> +		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
> +	}
> +	pipe_downscale = skl_pipe_downscale_amount(cstate);
> +
> +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> +
> +	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
> +	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
> +	pipe_max_pixel_rate = div_round_up_u32_fixed16(cdclk, pipe_downscale);
> +
> +	if (pipe_max_pixel_rate < crtc_clock) {
> +		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
>  static unsigned int
>  skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
>  			     const struct drm_plane_state *pstate,


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm
  2017-05-15 22:38   ` Matt Roper
@ 2017-05-16 12:57     ` Mahesh Kumar
  0 siblings, 0 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-16 12:57 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

Hi,


On Tuesday 16 May 2017 04:08 AM, Matt Roper wrote:
> On Mon, May 15, 2017 at 02:04:36PM +0530, Mahesh Kumar wrote:
>> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
>>
>> This patch implements new DDB allocation algorithm as per HW team
>> recommendation. This algo takecare of scenario where we allocate less DDB
>> for the planes with lower relative pixel rate, but they require more DDB
>> to work.
>> It also takes care of enabling same watermark level for each
>> plane in crtc, for efficient power saving.
>>
>> Changes since v1:
>>   - Rebase on top of Paulo's patch series
>>
>> Changes since v2:
>>   - Fix the for loop condition to enable WM
>>
>> Changes since v3:
>>   - Fix crash in cursor i-g-t reported by Maarten
>>   - Rebase after addressing Paulo's comments
>>   - Few other ULT fixes
>> Changes since v4:
>>   - Rebase on drm-tip
>>   - Added separate function to enable WM levels
>> Changes since v5:
>>   - Fix a crash identified in skl-6770HQ system
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_pm.c | 253 ++++++++++++++++++++++++----------------
>>   1 file changed, 152 insertions(+), 101 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index d73369c2c2d9..d6b0ae0ef7a2 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -4039,13 +4039,41 @@ skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
>>   	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
>>   }
>>   
>> +static void
>> +skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
>> +			   uint16_t plane_ddb,
>> +			   uint16_t max_level,
>> +			   struct skl_plane_wm *wm)
>> +{
>> +	int level;
>> +	/*
>> +	 * Now enable all levels in WM structure which can be enabled
>> +	 * using current DDB allocation
>> +	 */
>> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
>> +		struct skl_wm_level *level_wm = &wm->wm[level];
>> +
>> +		if (level > max_level || level_wm->plane_res_b == 0
>> +				      || level_wm->plane_res_l >= 31
>> +				      || level_wm->plane_res_b >= plane_ddb) {
> Is it possible to hit level_wm->plane_res_b >= plane_ddb without hitting
> level > max_level given our new logic?
Yes, this is possible for cursor plane, in multi-display scenario we 
allocate fixed 8 blocks to cursor.
In some cases this may not be sufficient to enable all the WM levels 
which can be enabled for other planes.

>
>> +			level_wm->plane_en = false;
>> +			level_wm->plane_res_b = 0;
>> +			level_wm->plane_res_l = 0;
>> +		} else {
>> +			level_wm->plane_en = true;
>> +		}
>> +	}
>> +}
>> +
>>   static int
>>   skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>> -		      struct skl_ddb_allocation *ddb /* out */)
>> +		      struct skl_ddb_allocation *ddb /* out */,
>> +		      struct skl_pipe_wm *pipe_wm)
> Not a huge deal, but it's a bit confusing to have the 'out' parameter in
> the middle of 'in' parameters.  Maybe move the pipe_wm before the ddb
> just to make this a little cleaner?
sure, will fix this.
>
>>   {
>>   	struct drm_atomic_state *state = cstate->base.state;
>>   	struct drm_crtc *crtc = cstate->base.crtc;
>>   	struct drm_device *dev = crtc->dev;
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>>   	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>>   	enum pipe pipe = intel_crtc->pipe;
>>   	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
>> @@ -4058,6 +4086,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>>   	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
>>   	uint16_t total_min_blocks = 0;
>> +	uint16_t total_level_ddb;
>> +	int max_level, level;
>>   
>>   	/* Clear the partitioning for disabled planes. */
>>   	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
>> @@ -4096,10 +4126,43 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   		return -EINVAL;
>>   	}
>>   
>> -	alloc_size -= total_min_blocks;
>> -	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>> +	alloc_size -= minimum[PLANE_CURSOR];
>> +	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end -
>> +							minimum[PLANE_CURSOR];
>>   	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>>   
>> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
>> +		total_level_ddb = 0;
>> +		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>> +			/*
>> +			 * TODO: We should calculate watermark values for Y/UV
>> +			 * plane both in case of NV12 format and use both values
>> +			 * for ddb calculation. NV12 is disabled as of now, So
>> +			 * using only single/UV plane value here.
>> +			 */
>> +			struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
>> +			uint16_t plane_res_b = wm->wm[level].plane_res_b;
>> +			uint16_t min = minimum[plane_id] + y_minimum[plane_id];
>> +
>> +			if (plane_id == PLANE_CURSOR)
>> +				continue;
>> +
>> +			total_level_ddb += max(plane_res_b, min);
>> +		}
>> +
>> +		if (total_level_ddb <= alloc_size)
>> +			break;
> Since I got confused on my first time through the review, I suspect
> other people might make the same mistake I did.  You might want to put a
> comment above this like "If this level can successfully be enabled with
> the pipe's current DDB allocation, then all lower levels are guaranteed
> to succeed as well."
will fix
>> +	}
>> +
>> +	if ((level < 0) || (total_min_blocks > alloc_size)) {
>> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
>> +		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
>> +				total_level_ddb : total_min_blocks, alloc_size);
>> +		return -EINVAL;
>> +	}
>> +	max_level = level;
>> +	alloc_size -= total_level_ddb;
>> +
>>   	/*
>>   	 * 2. Distribute the remaining space in proportion to the amount of
>>   	 * data each plane needs to fetch from memory.
>> @@ -4115,10 +4178,17 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	start = alloc->start;
>>   	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>>   		unsigned int data_rate, y_data_rate;
>> -		uint16_t plane_blocks, y_plane_blocks = 0;
>> -
>> -		if (plane_id == PLANE_CURSOR)
>> +		uint16_t plane_blocks = 0, y_plane_blocks = 0;
>> +		struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
>> +		uint16_t plane_res_b = wm->wm[max_level].plane_res_b;
>> +
>> +		if (plane_id == PLANE_CURSOR) {
>> +			plane_blocks =
>> +				skl_ddb_entry_size(&ddb->plane[pipe][plane_id]);
>> +			skl_enable_plane_wm_levels(dev_priv, plane_blocks,
>> +						   max_level, wm);
>>   			continue;
>> +		}
>>   
>>   		data_rate = plane_data_rate[plane_id];
>>   
>> @@ -4127,33 +4197,36 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   		 * promote the expression to 64 bits to avoid overflowing, the
>>   		 * result is < available as data_rate / total_data_rate < 1
>>   		 */
>> -		plane_blocks = minimum[plane_id];
>> -		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
>> -					total_data_rate);
>>   
>>   		/* Leave disabled planes at (0,0) */
>>   		if (data_rate) {
>> +			plane_blocks = max(minimum[plane_id], plane_res_b);
>> +			plane_blocks += div_u64((uint64_t)alloc_size *
>> +					data_rate, total_data_rate);
>>   			ddb->plane[pipe][plane_id].start = start;
>>   			ddb->plane[pipe][plane_id].end = start + plane_blocks;
>> +			start += plane_blocks;
>>   		}
>>   
>> -		start += plane_blocks;
>> -
>>   		/*
>>   		 * allocation for y_plane part of planar format:
>> +		 * TODO: Once we start calculating watermark values for Y/UV
>> +		 * plane both consider it for initial allowed wm blocks.
>>   		 */
>>   		y_data_rate = plane_y_data_rate[plane_id];
>>   
>> -		y_plane_blocks = y_minimum[plane_id];
>> -		y_plane_blocks += div_u64((uint64_t)alloc_size * y_data_rate,
>> -					total_data_rate);
>> -
>>   		if (y_data_rate) {
>> +			y_plane_blocks = y_minimum[plane_id];
>> +			y_plane_blocks += div_u64((uint64_t)alloc_size *
>> +					y_data_rate, total_data_rate);
>>   			ddb->y_plane[pipe][plane_id].start = start;
>>   			ddb->y_plane[pipe][plane_id].end = start + y_plane_blocks;
>> +			start += y_plane_blocks;
>>   		}
>> -
>> -		start += y_plane_blocks;
>> +		skl_enable_plane_wm_levels(dev_priv,
>> +					   plane_blocks,
>> +					   max_level,
>> +					   wm);
>>   	}
>>   
>>   	return 0;
>> @@ -4243,11 +4316,9 @@ skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
>>   static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   				struct intel_crtc_state *cstate,
>>   				const struct intel_plane_state *intel_pstate,
>> -				uint16_t ddb_allocation,
>>   				int level,
>>   				uint16_t *out_blocks, /* out */
>> -				uint8_t *out_lines, /* out */
>> -				bool *enabled /* out */)
>> +				uint8_t *out_lines /* out */)
>>   {
>>   	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
>>   	const struct drm_plane_state *pstate = &intel_pstate->base;
>> @@ -4270,10 +4341,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   	bool y_tiled, x_tiled;
>>   
>>   	if (latency == 0 ||
>> -	    !intel_wm_plane_visible(cstate, intel_pstate)) {
>> -		*enabled = false;
>> +	    !intel_wm_plane_visible(cstate, intel_pstate))
>>   		return 0;
>> -	}
>>   
>>   	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>>   		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
>> @@ -4359,9 +4428,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
>>   		    (plane_bytes_per_line / 512 < 1))
>>   			selected_result = method2;
>> -		else if ((ddb_allocation && ddb_allocation /
>> -			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
>> -			selected_result = min_fixed_16_16(method1, method2);
>>   		else if (latency >= linetime_us)
>>   			selected_result = method2;
>>   		else
>> @@ -4381,64 +4447,41 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   		}
>>   	}
>>   
>> -	if (res_blocks >= ddb_allocation || res_lines > 31) {
>> -		*enabled = false;
>> -
>> -		/*
>> -		 * If there are no valid level 0 watermarks, then we can't
>> -		 * support this display configuration.
>> -		 */
>> -		if (level) {
>> -			return 0;
>> -		} else {
>> -			struct drm_plane *plane = pstate->plane;
>> +	if (res_lines >= 31 && level == 0) {
>> +		struct drm_plane *plane = pstate->plane;
>>   
>> -			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
>> -			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
>> -				      plane->base.id, plane->name,
>> -				      res_blocks, ddb_allocation, res_lines);
>> -			return -EINVAL;
>> -		}
>> +		DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
>> +		DRM_DEBUG_KMS("[PLANE:%d:%s] lines required = %u/31\n",
>> +				plane->base.id, plane->name, res_lines);
> Should this still be returning an EINVAL?  It looks like we'll print the
> message here, but then continue on and disable watermark level 0 in
> skl_enable_plane_wm_levels() without failing the commit.
If it fails it will fail the flip ioctl itself, this return will go all 
the way up till calc_watermark_data & fail the intel_atomic_check.
at-least that is my understanding :)

>>   	}
>>   
>>   	*out_blocks = res_blocks;
>>   	*out_lines = res_lines;
>> -	*enabled = true;
>>   
>>   	return 0;
>>   }
>>   
>>   static int
>>   skl_compute_wm_levels(const struct drm_i915_private *dev_priv,
>> -		      struct skl_ddb_allocation *ddb,
>>   		      struct intel_crtc_state *cstate,
>>   		      const struct intel_plane_state *intel_pstate,
>>   		      struct skl_plane_wm *wm)
>>   {
>> -	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
>> -	struct drm_plane *plane = intel_pstate->base.plane;
>> -	struct intel_plane *intel_plane = to_intel_plane(plane);
>> -	uint16_t ddb_blocks;
>> -	enum pipe pipe = intel_crtc->pipe;
>>   	int level, max_level = ilk_wm_max_level(dev_priv);
>>   	int ret;
>>   
>>   	if (WARN_ON(!intel_pstate->base.fb))
>>   		return -EINVAL;
>>   
>> -	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
>> -
>>   	for (level = 0; level <= max_level; level++) {
>>   		struct skl_wm_level *result = &wm->wm[level];
>>   
>>   		ret = skl_compute_plane_wm(dev_priv,
>>   					   cstate,
>>   					   intel_pstate,
>> -					   ddb_blocks,
>>   					   level,
>>   					   &result->plane_res_b,
>> -					   &result->plane_res_l,
>> -					   &result->plane_en);
>> +					   &result->plane_res_l);
>>   		if (ret)
>>   			return ret;
>>   	}
>> @@ -4504,8 +4547,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>>   
>>   		wm = &pipe_wm->planes[plane_id];
>>   
>> -		ret = skl_compute_wm_levels(dev_priv, ddb, cstate,
>> -					    intel_pstate, wm);
>> +		ret = skl_compute_wm_levels(dev_priv, cstate, intel_pstate, wm);
>>   		if (ret)
>>   			return ret;
>>   		skl_compute_transition_wm(cstate, &wm->trans_wm);
>> @@ -4618,6 +4660,45 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>>   	return false;
>>   }
>>   
>> +static int
>> +skl_ddb_add_affected_planes(struct intel_crtc_state *cstate,
>> +			    const struct skl_pipe_wm *old_pipe_wm,
>> +			    const struct skl_pipe_wm *pipe_wm)
>> +{
>> +	struct drm_atomic_state *state = cstate->base.state;
>> +	struct drm_device *dev = state->dev;
>> +	struct drm_crtc *crtc = cstate->base.crtc;
>> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>> +	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
>> +	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
>> +	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
>> +	struct drm_plane_state *plane_state;
>> +	struct drm_plane *plane;
>> +	enum pipe pipe = intel_crtc->pipe;
>> +
>> +	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
>> +
>> +	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
>> +		enum plane_id plane_id = to_intel_plane(plane)->id;
>> +		const struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
>> +		const struct skl_plane_wm *old_wm = &old_pipe_wm->planes[plane_id];
>> +
>> +		if ((skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
>> +					&new_ddb->plane[pipe][plane_id]) &&
>> +		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
>> +					&new_ddb->y_plane[pipe][plane_id])) &&
>> +		    !memcmp(wm, old_wm, sizeof(struct skl_plane_wm)))
>> +			continue;
>> +
>> +		plane_state = drm_atomic_get_plane_state(state, plane);
>> +		if (IS_ERR(plane_state))
>> +			return PTR_ERR(plane_state);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>>   			      const struct skl_pipe_wm *old_pipe_wm,
>>   			      struct skl_pipe_wm *pipe_wm, /* out */
>> @@ -4631,6 +4712,18 @@ static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>>   	if (ret)
>>   		return ret;
>>   
>> +	ret = skl_allocate_pipe_ddb(intel_cstate, ddb, pipe_wm);
>> +	if (ret)
>> +		return ret;
>> +	/*
>> +	 * TODO: Do we still need to add planes in state, As WM update is
>> +	 * not part of update_plane anymore, So wm for planes can be updated
>> +	 * irrespective of updade_plane call.
>> +	 */
> I think the watermark registers get written by skl_write_plane_wm()
> regardless of whether we add the plane to the state, yes.  However I
> think the watermark registers are only armed by writing to the actual
> plane registers, which I don't believe will happen anywhere unless you
> add the plane to the state.
What about instead of including all the planes, we just set one 
plane_mask variable
& at the end of flip in "intel_finish_crtc_commit" trigger the arm 
register for all the masked planes?
this will be out of scope for this series, but will be part of optimization.
What's your take on that?
>
>> +	ret = skl_ddb_add_affected_planes(intel_cstate, old_pipe_wm, pipe_wm);
>> +	if (ret)
>> +		return ret;
>> +
>>   	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
>>   		*changed = false;
>>   	else
>> @@ -4653,41 +4746,7 @@ pipes_modified(struct drm_atomic_state *state)
>>   }
>>   
>>   static int
>> -skl_ddb_add_affected_planes(struct intel_crtc_state *cstate)
>> -{
>> -	struct drm_atomic_state *state = cstate->base.state;
>> -	struct drm_device *dev = state->dev;
>> -	struct drm_crtc *crtc = cstate->base.crtc;
>> -	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> -	struct drm_i915_private *dev_priv = to_i915(dev);
>> -	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
>> -	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
>> -	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
>> -	struct drm_plane_state *plane_state;
>> -	struct drm_plane *plane;
>> -	enum pipe pipe = intel_crtc->pipe;
>> -
>> -	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
>> -
>> -	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
>> -		enum plane_id plane_id = to_intel_plane(plane)->id;
>> -
>> -		if (skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
>> -					&new_ddb->plane[pipe][plane_id]) &&
>> -		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
>> -					&new_ddb->y_plane[pipe][plane_id]))
>> -			continue;
>> -
>> -		plane_state = drm_atomic_get_plane_state(state, plane);
>> -		if (IS_ERR(plane_state))
>> -			return PTR_ERR(plane_state);
>> -	}
>> -
>> -	return 0;
>> -}
>> -
>> -static int
>> -skl_compute_ddb(struct drm_atomic_state *state)
>> +skl_include_affected_crtc(struct drm_atomic_state *state)
> Minor nitpick, but I'd make this plural as well.
> skl_include_affected_crtcs().
will fix

thanks,
-Mahesh
>
>
> Matt
>
>>   {
>>   	struct drm_device *dev = state->dev;
>>   	struct drm_i915_private *dev_priv = to_i915(dev);
>> @@ -4751,14 +4810,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
>>   		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
>>   		if (IS_ERR(cstate))
>>   			return PTR_ERR(cstate);
>> -
>> -		ret = skl_allocate_pipe_ddb(cstate, ddb);
>> -		if (ret)
>> -			return ret;
>> -
>> -		ret = skl_ddb_add_affected_planes(cstate);
>> -		if (ret)
>> -			return ret;
>>   	}
>>   
>>   	return 0;
>> @@ -4839,7 +4890,7 @@ skl_compute_wm(struct drm_atomic_state *state)
>>   	/* Clear all dirty flags */
>>   	results->dirty_pipes = 0;
>>   
>> -	ret = skl_compute_ddb(state);
>> +	ret = skl_include_affected_crtc(state);
>>   	if (ret)
>>   		return ret;
>>   
>> -- 
>> 2.11.0
>>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm
  2017-05-15  8:34 ` [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
  2017-05-15 22:38   ` Matt Roper
@ 2017-05-16 15:29   ` Maarten Lankhorst
  1 sibling, 0 replies; 27+ messages in thread
From: Maarten Lankhorst @ 2017-05-16 15:29 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Op 15-05-17 om 10:34 schreef Mahesh Kumar:
> From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>
>
> This patch implements new DDB allocation algorithm as per HW team
> recommendation. This algo takecare of scenario where we allocate less DDB
> for the planes with lower relative pixel rate, but they require more DDB
> to work.
> It also takes care of enabling same watermark level for each
> plane in crtc, for efficient power saving.
>
> Changes since v1:
>  - Rebase on top of Paulo's patch series
>
> Changes since v2:
>  - Fix the for loop condition to enable WM
>
> Changes since v3:
>  - Fix crash in cursor i-g-t reported by Maarten
>  - Rebase after addressing Paulo's comments
>  - Few other ULT fixes
> Changes since v4:
>  - Rebase on drm-tip
>  - Added separate function to enable WM levels
> Changes since v5:
>  - Fix a crash identified in skl-6770HQ system
My main fear is that we may now add crtc's that were not part of the state before, so anything that may change watermarks may cause a pageflip on a unrelated crtc to return -EBUSY.
However this only happens during atomic commit and might also happen on nonblocking modesets, so it's probably harmless for now.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 04/12] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
  2017-05-17 11:58 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
@ 2017-05-17 11:58 ` Mahesh Kumar
  0 siblings, 0 replies; 27+ messages in thread
From: Mahesh Kumar @ 2017-05-17 11:58 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

From: "Kumar, Mahesh" <mahesh1.kumar@intel.com>

This patch make changes to calculate adjusted plane pixel rate &
plane downscale amount using fixed_point functions available.
This patch will give uniformity in code, & will help to avoid mixing of
32bit uint32_t variable for fixed-16.16 with fixed_16_16_t variables in
later patch in the series.

Changes from V1:
 - Rebase based on wrapper name change
 - Remove unnecessary comment

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8ff8cc59dcca..ab056952cfa4 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3827,26 +3827,27 @@ void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
  * Return value is provided in 16.16 fixed point form to retain fractional part.
  * Caller should take care of dividing & rounding off the value.
  */
-static uint32_t
+static uint_fixed_16_16_t
 skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 			   const struct intel_plane_state *pstate)
 {
 	struct intel_plane *plane = to_intel_plane(pstate->base.plane);
-	uint32_t downscale_h, downscale_w;
 	uint32_t src_w, src_h, dst_w, dst_h;
+	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+	uint_fixed_16_16_t downscale_h, downscale_w;
 
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
-		return DRM_PLANE_HELPER_NO_SCALING;
+		return u32_to_fixed_16_16(0);
 
 	/* n.b., src is 16.16 fixed point, dst is whole integer */
 	if (plane->id == PLANE_CURSOR) {
-		src_w = pstate->base.src_w;
-		src_h = pstate->base.src_h;
+		src_w = pstate->base.src_w >> 16;
+		src_h = pstate->base.src_h >> 16;
 		dst_w = pstate->base.crtc_w;
 		dst_h = pstate->base.crtc_h;
 	} else {
-		src_w = drm_rect_width(&pstate->base.src);
-		src_h = drm_rect_height(&pstate->base.src);
+		src_w = drm_rect_width(&pstate->base.src) >> 16;
+		src_h = drm_rect_height(&pstate->base.src) >> 16;
 		dst_w = drm_rect_width(&pstate->base.dst);
 		dst_h = drm_rect_height(&pstate->base.dst);
 	}
@@ -3854,11 +3855,12 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	if (drm_rotation_90_or_270(pstate->base.rotation))
 		swap(dst_w, dst_h);
 
-	downscale_h = max(src_h / dst_h, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
-	downscale_w = max(src_w / dst_w, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
+	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+	downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+	downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
 
-	/* Provide result in 16.16 fixed point */
-	return (uint64_t)downscale_w * downscale_h >> 16;
+	return mul_fixed16(downscale_w, downscale_h);
 }
 
 static unsigned int
@@ -3868,10 +3870,11 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 {
 	struct intel_plane *plane = to_intel_plane(pstate->plane);
 	struct intel_plane_state *intel_pstate = to_intel_plane_state(pstate);
-	uint32_t down_scale_amount, data_rate;
+	uint32_t data_rate;
 	uint32_t width = 0, height = 0;
 	struct drm_framebuffer *fb;
 	u32 format;
+	uint_fixed_16_16_t down_scale_amount;
 
 	if (!intel_pstate->base.visible)
 		return 0;
@@ -3905,7 +3908,7 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 
 	down_scale_amount = skl_plane_downscale_amount(cstate, intel_pstate);
 
-	return (uint64_t)data_rate * down_scale_amount >> 16;
+	return mul_round_up_u32_fixed16(data_rate, down_scale_amount);
 }
 
 /*
@@ -4191,8 +4194,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 					      struct intel_plane_state *pstate)
 {
 	uint64_t adjusted_pixel_rate;
-	uint64_t downscale_amount;
-	uint64_t pixel_rate;
+	uint_fixed_16_16_t downscale_amount;
 
 	/* Shouldn't reach here on disabled planes... */
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
@@ -4205,10 +4207,8 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	adjusted_pixel_rate = cstate->pixel_rate;
 	downscale_amount = skl_plane_downscale_amount(cstate, pstate);
 
-	pixel_rate = adjusted_pixel_rate * downscale_amount >> 16;
-	WARN_ON(pixel_rate != clamp_t(uint32_t, pixel_rate, 0, ~0));
-
-	return pixel_rate;
+	return mul_round_up_u32_fixed16(adjusted_pixel_rate,
+					    downscale_amount);
 }
 
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-05-17 11:55 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-15  8:34 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
2017-05-15  8:34 ` [PATCH 01/12] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
2017-05-15  8:34 ` [PATCH 02/12] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
2017-05-15 22:35   ` Matt Roper
2017-05-15  8:34 ` [PATCH 03/12] drm/i915: Use fixed_16_16 wrapper for division operation Mahesh Kumar
2017-05-15  8:34 ` [PATCH 04/12] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
2017-05-15  8:34 ` [PATCH 05/12] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
2017-05-15 22:35   ` Matt Roper
2017-05-15  8:34 ` [PATCH 06/12] drm/i915/skl+: no need to memset again Mahesh Kumar
2017-05-15  8:34 ` [PATCH 07/12] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
2017-05-15 22:36   ` Matt Roper
2017-05-15  8:34 ` [PATCH 08/12] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
2017-05-15 22:36   ` Matt Roper
2017-05-15  8:34 ` [PATCH 09/12] drm/i915/skl+: Perform wm level calculations in separate function Mahesh Kumar
2017-05-15 22:36   ` Matt Roper
2017-05-15  8:34 ` [PATCH 10/12] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
2017-05-15 22:38   ` Matt Roper
2017-05-16  9:46     ` Mahesh Kumar
2017-05-15  8:34 ` [PATCH 11/12] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
2017-05-15 22:38   ` Matt Roper
2017-05-16 12:57     ` Mahesh Kumar
2017-05-16 15:29   ` Maarten Lankhorst
2017-05-15  8:34 ` [PATCH 12/12] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
2017-05-15 22:39   ` Matt Roper
2017-05-16 11:27   ` Maarten Lankhorst
2017-05-15 11:02 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev7) Patchwork
2017-05-17 11:58 [PATCH 00/12] Implement DDB algorithm and WM cleanup Mahesh Kumar
2017-05-17 11:58 ` [PATCH 04/12] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.