All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation
  2017-05-08 11:48 ` [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
@ 2017-05-08 11:48   ` Lankhorst, Maarten
  2017-05-08 12:01     ` Mahesh Kumar
  0 siblings, 1 reply; 43+ messages in thread
From: Lankhorst, Maarten @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Zanoni, Paulo R

Mahesh Kumar schreef op ma 08-05-2017 om 17:18 [+0530]:
> Fail the flip if no FB is present but plane_state is set as visible.
> Above is not a valid combination so instead of continue fail the
> flip.

Why is this patch necessary? drm_atomic_plane_check handles this.

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 00/11] Implement DDB algorithm and WM cleanup
@ 2017-05-08 11:48 Mahesh Kumar
  2017-05-08 11:48 ` [PATCH 01/11] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
                   ` (15 more replies)
  0 siblings, 16 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This series implements new DDB allocation algorithm to solve the cases,
where we have sufficient DDB available to enable multiple planes, But
due to the current algorithm not dividing it properly among planes, we
end-up failing the flip.
It also takes care of enabling same watermark level for each
plane, for efficient power saving.
Series also fixes/cleans-up few bug in present code.

There are two steps in current WM programming.

1. Calculate minimum number of blocks required  for a WM level to be
enabled. For 1440x2560 panel we need 41 blocks as minimum number of
blocks to enable WM0. This is the step which doesn't use vertical size.
It only depends on Pipe drain rate and plane horizontal size as per the
current Bspec algorithm.
So all the plane below have minimum  number of blocks required to enable
WM0 as 41
    Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
    Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
    Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41

2. Number of blocks allotted by the driver
    Driver allocates  12 for Plane 3   &  16 for plane 4

    Total Dbuf Available = 508
    Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476
    allocate minimum blocks for each plane 8 * 4 = 32
    remaining blocks = 476 - 32 = 444
    Relative Data Rate for Planes
       Plane 1  =  1440 * 2560 * 3  =  11059200
       Plane 2  =  1440 * 2560 * 3  =  11059200
       Plane 3  =  1440 * 48   * 3  =  207360
       Plane 4  =  1440 * 96   * 3  =  414720
       Total Relative BW            =  22740480

-   Allocate Buffer
    buffer allocation = (Plane relative data rate / total data rate)
		    * total remaming DDB + minimum plane DDB
     Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
     Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
     Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16

In this case it forced driver to disable Plane 3 & 4. Driver need to use
more efficient way to allocate buffer that is optimum for power.

New Algorithm suggested by HW team is:

1. Calculate minimum buffer allocations for each plane and for each
    watermark level

2. Add minimum buffer allocations required for enabling WM7
    for all the planes

Level 0 =  41 + 41 + 41 + 41  = 164
Level 1 =  42 + 42 + 42 + 42  = 168
Level 2 =  42 + 42 + 42 + 42  = 168
Level 3 =  94 + 94 + 94 + 94 =  376
Level 4 =  94 + 94 + 94 + 94 =  376
Level 5 =  94 + 94 + 94 + 94 =  376
Level 6 =  94 + 94 + 94 + 94 =  376
Level 7 =  94 + 94 + 94 + 94 =  376

3. Check to see how many buffer allocation are left and enable
the best case. In this case since we have 476 blocks we can enable
WM0-7 on all 4 planes.
Let's say if we have only 200 block available then the best cases
allocation is to enable Level2 which requires 168 blocks

Mahesh Kumar (11):
  drm/i915: fix naming of fixed_16_16 wrapper.
  drm/i915: Add more wrapper for fixed_point_16_16 operations
  drm/i915: Use fixed_16_16 wrapper for division operation
  drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed
    point
  drm/i915/skl: Fail the flip if no FB for WM calculation
  drm/i915/skl+: no need to memset again
  drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe
    allocation
  drm/i915/skl+: Watermark calculation cleanup
  drm/i915/skl+: use linetime latency if ddb size is not available
  drm/i915/skl: New ddb allocation algorithm
  drm/i915/skl+: consider max supported plane pixel rate while scaling

 drivers/gpu/drm/i915/i915_drv.h      |  56 +++-
 drivers/gpu/drm/i915/intel_display.c |   3 +
 drivers/gpu/drm/i915/intel_drv.h     |   2 +
 drivers/gpu/drm/i915/intel_pm.c      | 520 +++++++++++++++++++++++------------
 4 files changed, 395 insertions(+), 186 deletions(-)

-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/11] drm/i915: fix naming of fixed_16_16 wrapper.
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-12  0:21   ` Matt Roper
  2017-05-08 11:48 ` [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

fixed_16_16_div_round_up(_u64), wrapper for fixed_16_16 division
operation don't really round_up the result. Wrapper round_up only the
fraction part of the result to make it 16-bit.
This patch eliminates round_up keyword from the wrapper.

Later patch will introduce the new wrapper to do rounding-off the result
and give unt32_t output to cleanup mix use of fixed_16_16_t & uint32_t
variables.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 6 ++----
 drivers/gpu/drm/i915/intel_pm.c | 6 +++---
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b20ed16da0ad..5804734d30a3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -152,8 +152,7 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
 	return max;
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
-							  uint32_t d)
+static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
 {
 	uint_fixed_16_16_t fp, res;
 
@@ -162,8 +161,7 @@ static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
 	return res;
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div_round_up_u64(uint32_t val,
-							      uint32_t d)
+static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
 {
 	uint_fixed_16_16_t res;
 	uint64_t interm_val;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index cacb65fa2dd5..91f09dfdb618 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3698,7 +3698,7 @@ static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
 		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate * cpp;
-	ret = fixed_16_16_div_round_up_u64(wm_intermediate_val, 1000 * 512);
+	ret = fixed_16_16_div_u64(wm_intermediate_val, 1000 * 512);
 	return ret;
 }
 
@@ -3834,8 +3834,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
 					   y_min_scanlines, 512);
-		plane_blocks_per_line =
-		      fixed_16_16_div_round_up(interm_pbpl, y_min_scanlines);
+		plane_blocks_per_line = fixed_16_16_div(interm_pbpl,
+							y_min_scanlines);
 	} else if (x_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
 		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
  2017-05-08 11:48 ` [PATCH 01/11] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-10 12:37   ` Maarten Lankhorst
  2017-05-12  0:22   ` Matt Roper
  2017-05-08 11:48 ` [PATCH 03/11] drm/i915: Use fixed_16_16 wrapper for division operation Mahesh Kumar
                   ` (13 subsequent siblings)
  15 siblings, 2 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch adds few wrapper to perform fixed_point_16_16 operations
mul_u32_fixed_16_16_round_up : Multiplies u32 and fixed_16_16_t
				variables & returns u32 result with
				rounding-off.
mul_fixed_16_16 : Multiplies two fixed_16_16_t variable & returns
				fixed_16_16
div_round_up_fixed_16_16 : Perform division operation on fixed_16_16_t
				variables & return u32 result with round-off
fixed_16_16_div_round_up_u64 : devide uint32_t variable by fixed_16_16
				variable and round_up the result to uint32_t.

These wrappers will be used by later patches in the series.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 43 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5804734d30a3..5ff0091707c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -152,6 +152,38 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
 	return max;
 }
 
+static inline uint32_t div_round_up_fixed_16_16(uint_fixed_16_16_t val,
+						uint_fixed_16_16_t d)
+{
+	return DIV_ROUND_UP(val.val, d.val);
+}
+
+static inline uint32_t mul_u32_fixed_16_16_round_up(uint32_t val,
+						    uint_fixed_16_16_t mul)
+{
+       uint64_t intermediate_val;
+       uint32_t result;
+
+       intermediate_val = (uint64_t) val * mul.val;
+       intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
+       WARN_ON(intermediate_val >> 32);
+       result = clamp_t(uint32_t, intermediate_val, 0, ~0);
+       return result;
+}
+
+static inline uint_fixed_16_16_t mul_fixed_16_16(uint_fixed_16_16_t val,
+						 uint_fixed_16_16_t mul)
+{
+	uint64_t intermediate_val;
+	uint_fixed_16_16_t fp;
+
+	intermediate_val = (uint64_t) val.val * mul.val;
+	intermediate_val = intermediate_val >> 16;
+	WARN_ON(intermediate_val >> 32);
+	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
+	return fp;
+}
+
 static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
 {
 	uint_fixed_16_16_t fp, res;
@@ -174,6 +206,17 @@ static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
 	return res;
 }
 
+static inline uint32_t fixed_16_16_div_round_up_u64(uint32_t val,
+						    uint_fixed_16_16_t d)
+{
+	uint64_t interm_val;
+
+	interm_val = (uint64_t)val << 16;
+	interm_val = DIV_ROUND_UP_ULL(interm_val, d.val);
+	WARN_ON(interm_val >> 32);
+	return clamp_t(uint32_t, interm_val, 0, ~0);
+}
+
 static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
 						     uint_fixed_16_16_t mul)
 {
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 03/11] drm/i915: Use fixed_16_16 wrapper for division operation
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
  2017-05-08 11:48 ` [PATCH 01/11] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
  2017-05-08 11:48 ` [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-12  0:22   ` Matt Roper
  2017-05-08 11:48 ` [PATCH 04/11] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Don't use fixed_16_16 structure members directly, instead use wrapper to
perform fixed_16_16 division operation.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 91f09dfdb618..66b542ba47ad 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3867,8 +3867,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	}
 
 	res_blocks = fixed_16_16_to_u32_round_up(selected_result) + 1;
-	res_lines = DIV_ROUND_UP(selected_result.val,
-				 plane_blocks_per_line.val);
+	res_lines = div_round_up_fixed_16_16(selected_result,
+					     plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
 		if (y_tiled) {
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/11] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (2 preceding siblings ...)
  2017-05-08 11:48 ` [PATCH 03/11] drm/i915: Use fixed_16_16 wrapper for division operation Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-12  0:22   ` Matt Roper
  2017-05-08 11:48 ` [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch make changes to calculate adjusted plane pixel rate &
plane downscale amount using fixed_point functions available.
This patch will give uniformity in code, & will help to avoid mixing of
32bit uint32_t variable for fixed-16.16 with fixed_16_16_t variables in
later patch in the series.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 37 +++++++++++++++++++------------------
 1 file changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 66b542ba47ad..6414701ef09e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3360,26 +3360,27 @@ void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
  * Return value is provided in 16.16 fixed point form to retain fractional part.
  * Caller should take care of dividing & rounding off the value.
  */
-static uint32_t
+static uint_fixed_16_16_t
 skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 			   const struct intel_plane_state *pstate)
 {
 	struct intel_plane *plane = to_intel_plane(pstate->base.plane);
-	uint32_t downscale_h, downscale_w;
 	uint32_t src_w, src_h, dst_w, dst_h;
+	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+	uint_fixed_16_16_t downscale_h, downscale_w;
 
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
-		return DRM_PLANE_HELPER_NO_SCALING;
+		return u32_to_fixed_16_16(0);
 
 	/* n.b., src is 16.16 fixed point, dst is whole integer */
 	if (plane->id == PLANE_CURSOR) {
-		src_w = pstate->base.src_w;
-		src_h = pstate->base.src_h;
+		src_w = pstate->base.src_w >> 16;
+		src_h = pstate->base.src_h >> 16;
 		dst_w = pstate->base.crtc_w;
 		dst_h = pstate->base.crtc_h;
 	} else {
-		src_w = drm_rect_width(&pstate->base.src);
-		src_h = drm_rect_height(&pstate->base.src);
+		src_w = drm_rect_width(&pstate->base.src) >> 16;
+		src_h = drm_rect_height(&pstate->base.src) >> 16;
 		dst_w = drm_rect_width(&pstate->base.dst);
 		dst_h = drm_rect_height(&pstate->base.dst);
 	}
@@ -3387,11 +3388,13 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	if (drm_rotation_90_or_270(pstate->base.rotation))
 		swap(dst_w, dst_h);
 
-	downscale_h = max(src_h / dst_h, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
-	downscale_w = max(src_w / dst_w, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
+	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+	downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+	downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
 
 	/* Provide result in 16.16 fixed point */
-	return (uint64_t)downscale_w * downscale_h >> 16;
+	return mul_fixed_16_16(downscale_w, downscale_h);
 }
 
 static unsigned int
@@ -3401,10 +3404,11 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 {
 	struct intel_plane *plane = to_intel_plane(pstate->plane);
 	struct intel_plane_state *intel_pstate = to_intel_plane_state(pstate);
-	uint32_t down_scale_amount, data_rate;
+	uint32_t data_rate;
 	uint32_t width = 0, height = 0;
 	struct drm_framebuffer *fb;
 	u32 format;
+	uint_fixed_16_16_t down_scale_amount;
 
 	if (!intel_pstate->base.visible)
 		return 0;
@@ -3438,7 +3442,7 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 
 	down_scale_amount = skl_plane_downscale_amount(cstate, intel_pstate);
 
-	return (uint64_t)data_rate * down_scale_amount >> 16;
+	return mul_u32_fixed_16_16_round_up(data_rate, down_scale_amount);
 }
 
 /*
@@ -3724,8 +3728,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 					      struct intel_plane_state *pstate)
 {
 	uint64_t adjusted_pixel_rate;
-	uint64_t downscale_amount;
-	uint64_t pixel_rate;
+	uint_fixed_16_16_t downscale_amount;
 
 	/* Shouldn't reach here on disabled planes... */
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
@@ -3738,10 +3741,8 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 	adjusted_pixel_rate = cstate->pixel_rate;
 	downscale_amount = skl_plane_downscale_amount(cstate, pstate);
 
-	pixel_rate = adjusted_pixel_rate * downscale_amount >> 16;
-	WARN_ON(pixel_rate != clamp_t(uint32_t, pixel_rate, 0, ~0));
-
-	return pixel_rate;
+	return mul_u32_fixed_16_16_round_up(adjusted_pixel_rate,
+					    downscale_amount);
 }
 
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (3 preceding siblings ...)
  2017-05-08 11:48 ` [PATCH 04/11] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-08 11:48   ` Lankhorst, Maarten
  2017-05-08 11:48 ` [PATCH 06/11] drm/i915/skl+: no need to memset again Mahesh Kumar
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Fail the flip if no FB is present but plane_state is set as visible.
Above is not a valid combination so instead of continue fail the flip.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 6414701ef09e..1d7c67d7e539 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3942,7 +3942,8 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 	if (!intel_pstate)
 		intel_pstate = to_intel_plane_state(plane->state);
 
-	WARN_ON(!intel_pstate->base.fb);
+	if (WARN_ON(!intel_pstate->base.fb))
+		return -EINVAL;
 
 	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 06/11] drm/i915/skl+: no need to memset again
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (4 preceding siblings ...)
  2017-05-08 11:48 ` [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-12  0:23   ` Matt Roper
  2017-05-08 11:48 ` [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

We are already doing memset of ddb structure at the begining of skl_allocate_pipe_ddb
function, No need to again do a memset.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 1d7c67d7e539..2a4e9d89cd6f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3606,10 +3606,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 
 	skl_ddb_get_pipe_allocation_limits(dev, cstate, alloc, &num_active);
 	alloc_size = skl_ddb_entry_size(alloc);
-	if (alloc_size == 0) {
-		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
+	if (alloc_size == 0)
 		return 0;
-	}
 
 	skl_ddb_calc_min(cstate, num_active, minimum, y_minimum);
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (5 preceding siblings ...)
  2017-05-08 11:48 ` [PATCH 06/11] drm/i915/skl+: no need to memset again Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-08 14:12   ` Ander Conselvan De Oliveira
  2017-05-12  0:23   ` Matt Roper
  2017-05-08 11:48 ` [PATCH 08/11] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
                   ` (8 subsequent siblings)
  15 siblings, 2 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

DDB minimum requirement may exceed the allocated DDB for CRTC/Pipe. This
patch make changes to fail the flip if minimum requirement for pipe
exceeds the total ddb allocated to the pipe.
Previously it succeeded but making alloc_size a negative value. Which
will make later calculations for plane ddb allocation bogus & may lead
to screen corruption or system hang.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2a4e9d89cd6f..0ace94d67432 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3591,6 +3591,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	int num_active;
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
+	uint16_t total_min_blocks = 0;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -3618,10 +3619,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	 */
 
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
-		alloc_size -= minimum[plane_id];
-		alloc_size -= y_minimum[plane_id];
+		total_min_blocks += minimum[plane_id];
+		total_min_blocks += y_minimum[plane_id];
 	}
 
+	if ((total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
+							alloc_size);
+		return-EINVAL;
+	}
+
+	alloc_size -= total_min_blocks;
 	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/11] drm/i915/skl+: Watermark calculation cleanup
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (6 preceding siblings ...)
  2017-05-08 11:48 ` [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
@ 2017-05-08 11:48 ` Mahesh Kumar
  2017-05-12  0:23   ` Matt Roper
  2017-05-08 11:49 ` [PATCH 09/11] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch cleanup/reorganises the watermark calculation functions.
This patch also make use of already available macro
"drm_atomic_crtc_state_for_each_plane_state" to walk through
plane_state list instead of calculating plane_state in function itself.
Now we iterate over WM levels in skl_compute_wm_level function instead
of "skl_build_pipe_wm" function.

This restructuring will help later patch for new DDB allocation
algorithm to do only algo related changes.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 89 +++++++++++++++++------------------------
 1 file changed, 37 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0ace94d67432..9d0225862a7e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3732,7 +3732,7 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 }
 
 static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
-					      struct intel_plane_state *pstate)
+					      const struct intel_plane_state *pstate)
 {
 	uint64_t adjusted_pixel_rate;
 	uint_fixed_16_16_t downscale_amount;
@@ -3754,7 +3754,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
-				struct intel_plane_state *intel_pstate,
+				const struct intel_plane_state *intel_pstate,
 				uint16_t ddb_allocation,
 				int level,
 				uint16_t *out_blocks, /* out */
@@ -3762,8 +3762,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				bool *enabled /* out */)
 {
 	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
-	struct drm_plane_state *pstate = &intel_pstate->base;
-	struct drm_framebuffer *fb = pstate->fb;
+	const struct drm_plane_state *pstate = &intel_pstate->base;
+	const struct drm_framebuffer *fb = pstate->fb;
 	uint32_t latency = dev_priv->wm.skl_latency[level];
 	uint_fixed_16_16_t method1, method2;
 	uint_fixed_16_16_t plane_blocks_per_line;
@@ -3918,52 +3918,36 @@ static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
-		     struct intel_plane *intel_plane,
-		     int level,
-		     struct skl_wm_level *result)
+		     const struct intel_plane_state *intel_pstate,
+		     struct skl_plane_wm *wm)
 {
-	struct drm_atomic_state *state = cstate->base.state;
 	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
-	struct drm_plane *plane = &intel_plane->base;
-	struct intel_plane_state *intel_pstate = NULL;
+	struct drm_plane *plane = intel_pstate->base.plane;
+	struct intel_plane *intel_plane = to_intel_plane(plane);
 	uint16_t ddb_blocks;
 	enum pipe pipe = intel_crtc->pipe;
+	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
-	if (state)
-		intel_pstate =
-			intel_atomic_get_existing_plane_state(state,
-							      intel_plane);
-
-	/*
-	 * Note: If we start supporting multiple pending atomic commits against
-	 * the same planes/CRTC's in the future, plane->state will no longer be
-	 * the correct pre-state to use for the calculations here and we'll
-	 * need to change where we get the 'unchanged' plane data from.
-	 *
-	 * For now this is fine because we only allow one queued commit against
-	 * a CRTC.  Even if the plane isn't modified by this transaction and we
-	 * don't have a plane lock, we still have the CRTC's lock, so we know
-	 * that no other transactions are racing with us to update it.
-	 */
-	if (!intel_pstate)
-		intel_pstate = to_intel_plane_state(plane->state);
-
 	if (WARN_ON(!intel_pstate->base.fb))
 		return -EINVAL;
 
 	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
 
-	ret = skl_compute_plane_wm(dev_priv,
-				   cstate,
-				   intel_pstate,
-				   ddb_blocks,
-				   level,
-				   &result->plane_res_b,
-				   &result->plane_res_l,
-				   &result->plane_en);
-	if (ret)
-		return ret;
+	for (level = 0; level <= max_level; level++) {
+		struct skl_wm_level *result = &wm->wm[level];
+
+		ret = skl_compute_plane_wm(dev_priv,
+					   cstate,
+					   intel_pstate,
+					   ddb_blocks,
+					   level,
+					   &result->plane_res_b,
+					   &result->plane_res_l,
+					   &result->plane_en);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
@@ -4009,10 +3993,11 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 			     struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_device *dev = cstate->base.crtc->dev;
+	struct drm_crtc_state *crtc_state = &cstate->base;
 	const struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_plane *intel_plane;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
 	struct skl_plane_wm *wm;
-	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
 	/*
@@ -4021,18 +4006,18 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 	 */
 	memset(pipe_wm->planes, 0, sizeof(pipe_wm->planes));
 
-	for_each_intel_plane_mask(&dev_priv->drm,
-				  intel_plane,
-				  cstate->base.plane_mask) {
-		wm = &pipe_wm->planes[intel_plane->id];
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		const struct intel_plane_state *intel_pstate =
+						to_intel_plane_state(pstate);
+		enum plane_id plane_id = to_intel_plane(plane)->id;
+
+		wm = &pipe_wm->planes[plane_id];
+
+		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
+					   intel_pstate, wm);
+		if (ret)
+			return ret;
 
-		for (level = 0; level <= max_level; level++) {
-			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-						   intel_plane, level,
-						   &wm->wm[level]);
-			if (ret)
-				return ret;
-		}
 		skl_compute_transition_wm(cstate, &wm->trans_wm);
 	}
 	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/11] drm/i915/skl+: use linetime latency if ddb size is not available
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (7 preceding siblings ...)
  2017-05-08 11:48 ` [PATCH 08/11] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
@ 2017-05-08 11:49 ` Mahesh Kumar
  2017-05-08 11:49 ` [PATCH 10/11] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch make changes to use linetime latency if allocated
DDB size during plane watermark calculation is not available,
This is required to implement new DDB allocation algorithm.

In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to be used during
switch-case for WM blocks selection.

Changes since v1:
 - Rebase on top of Paulo's patch series
Changes since v2:
 - Fix if-else condition (pointed by Maarten)
Changes since v3:
 - Use common function for timetime_us calculation (Paulo)
 - rebase on drm-tip

Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  7 +++++++
 drivers/gpu/drm/i915/intel_pm.c | 42 ++++++++++++++++++++++++++++++++---------
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5ff0091707c0..6a3b63bc8bf0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -114,6 +114,13 @@ typedef struct {
 	fp; \
 })
 
+static inline bool is_fixed_16_16_zero(uint_fixed_16_16_t val)
+{
+	if (val.val == 0)
+		return true;
+	return false;
+}
+
 static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
 {
 	uint_fixed_16_16_t fp;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9d0225862a7e..bcf5d2523e4a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3731,6 +3731,27 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 	return ret;
 }
 
+static uint_fixed_16_16_t
+skl_get_linetime_us(struct intel_crtc_state *cstate)
+{
+	uint32_t pixel_rate;
+	uint32_t crtc_htotal;
+	uint_fixed_16_16_t linetime_us;
+
+	if (!cstate->base.active)
+		return u32_to_fixed_16_16(0);
+
+	pixel_rate = cstate->pixel_rate;
+
+	if (WARN_ON(pixel_rate == 0))
+		return u32_to_fixed_16_16(0);
+
+	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
+	linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate);
+
+	return linetime_us;
+}
+
 static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
 					      const struct intel_plane_state *pstate)
 {
@@ -3864,12 +3885,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		selected_result = max_fixed_16_16(method2, y_tile_minimum);
 	} else {
+		uint32_t linetime_us;
+
+		linetime_us = fixed_16_16_to_u32_round_up(
+				skl_get_linetime_us(cstate));
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation /
+		else if ((ddb_allocation && ddb_allocation /
 			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
 			selected_result = min_fixed_16_16(method1, method2);
+		else if (latency >= linetime_us)
+			selected_result = method2;
 		else
 			selected_result = method1;
 	}
@@ -3957,19 +3984,16 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_i915_private *dev_priv = to_i915(state->dev);
-	uint32_t pixel_rate;
+	uint_fixed_16_16_t linetime_us;
 	uint32_t linetime_wm;
 
-	if (!cstate->base.active)
-		return 0;
+	linetime_us = skl_get_linetime_us(cstate);
 
-	pixel_rate = cstate->pixel_rate;
-
-	if (WARN_ON(pixel_rate == 0))
+	if (is_fixed_16_16_zero(linetime_us))
 		return 0;
 
-	linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal *
-				   1000, pixel_rate);
+	linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8,
+				linetime_us));
 
 	/* Display WA #1135: bxt. */
 	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/11] drm/i915/skl: New ddb allocation algorithm
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (8 preceding siblings ...)
  2017-05-08 11:49 ` [PATCH 09/11] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
@ 2017-05-08 11:49 ` Mahesh Kumar
  2017-05-12 22:24   ` Matt Roper
  2017-05-08 11:49 ` [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch implements new DDB allocation algorithm as per HW team
recommendation. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane in crtc, for efficient power saving.

Changes since v1:
 - Rebase on top of Paulo's patch series

Changes since v2:
 - Fix the for loop condition to enable WM

Changes since v3:
 - Fix crash in cursor i-g-t reported by Maarten
 - Rebase after addressing Paulo's comments
 - Few other ULT fixes
Changes since v4:
 - Rebase on drm-tip
 - Added separate function to enable WM levels
Changes since v4:
 - Fix a crash identified in skl-6770HQ system

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 253 ++++++++++++++++++++++++----------------
 1 file changed, 152 insertions(+), 101 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index bcf5d2523e4a..92600cf42e12 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3573,13 +3573,41 @@ skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
 	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
 }
 
+static void
+skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
+			   uint16_t plane_ddb,
+			   uint16_t max_level,
+			   struct skl_plane_wm *wm)
+{
+	int level;
+	/*
+	 * Now enable all levels in WM structure which can be enabled
+	 * using current DDB allocation
+	 */
+	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
+		struct skl_wm_level *level_wm = &wm->wm[level];
+
+		if (level > max_level || level_wm->plane_res_b == 0
+				      || level_wm->plane_res_l >= 31
+				      || level_wm->plane_res_b >= plane_ddb) {
+			level_wm->plane_en = false;
+			level_wm->plane_res_b = 0;
+			level_wm->plane_res_l = 0;
+		} else {
+			level_wm->plane_en = true;
+		}
+	}
+}
+
 static int
 skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
-		      struct skl_ddb_allocation *ddb /* out */)
+		      struct skl_ddb_allocation *ddb /* out */,
+		      struct skl_pipe_wm *pipe_wm)
 {
 	struct drm_atomic_state *state = cstate->base.state;
 	struct drm_crtc *crtc = cstate->base.crtc;
 	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	enum pipe pipe = intel_crtc->pipe;
 	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
@@ -3592,6 +3620,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
 	uint16_t total_min_blocks = 0;
+	uint16_t total_level_ddb;
+	int max_level, level;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -3630,10 +3660,43 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		return-EINVAL;
 	}
 
-	alloc_size -= total_min_blocks;
-	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
+	alloc_size -= minimum[PLANE_CURSOR];
+	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end -
+							minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
+	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
+		total_level_ddb = 0;
+		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
+			/*
+			 * TODO: We should calculate watermark values for Y/UV
+			 * plane both in case of NV12 format and use both values
+			 * for ddb calculation. NV12 is disabled as of now, So
+			 * using only single/UV plane value here.
+			 */
+			struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+			uint16_t plane_res_b = wm->wm[level].plane_res_b;
+			uint16_t min = minimum[plane_id] + y_minimum[plane_id];
+
+			if (plane_id == PLANE_CURSOR)
+				continue;
+
+			total_level_ddb += max(plane_res_b, min);
+		}
+
+		if (total_level_ddb <= alloc_size)
+			break;
+	}
+
+	if ((level < 0) || (total_min_blocks > alloc_size)) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
+				total_level_ddb : total_min_blocks, alloc_size);
+		return -EINVAL;
+	}
+	max_level = level;
+	alloc_size -= total_level_ddb;
+
 	/*
 	 * 2. Distribute the remaining space in proportion to the amount of
 	 * data each plane needs to fetch from memory.
@@ -3649,10 +3712,17 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	start = alloc->start;
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
 		unsigned int data_rate, y_data_rate;
-		uint16_t plane_blocks, y_plane_blocks = 0;
-
-		if (plane_id == PLANE_CURSOR)
+		uint16_t plane_blocks = 0, y_plane_blocks = 0;
+		struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+		uint16_t plane_res_b = wm->wm[max_level].plane_res_b;
+
+		if (plane_id == PLANE_CURSOR) {
+			plane_blocks =
+				skl_ddb_entry_size(&ddb->plane[pipe][plane_id]);
+			skl_enable_plane_wm_levels(dev_priv, plane_blocks,
+						   max_level, wm);
 			continue;
+		}
 
 		data_rate = plane_data_rate[plane_id];
 
@@ -3661,33 +3731,36 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 		 * promote the expression to 64 bits to avoid overflowing, the
 		 * result is < available as data_rate / total_data_rate < 1
 		 */
-		plane_blocks = minimum[plane_id];
-		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
-					total_data_rate);
 
 		/* Leave disabled planes at (0,0) */
 		if (data_rate) {
+			plane_blocks = max(minimum[plane_id], plane_res_b);
+			plane_blocks += div_u64((uint64_t)alloc_size *
+					data_rate, total_data_rate);
 			ddb->plane[pipe][plane_id].start = start;
 			ddb->plane[pipe][plane_id].end = start + plane_blocks;
+			start += plane_blocks;
 		}
 
-		start += plane_blocks;
-
 		/*
 		 * allocation for y_plane part of planar format:
+		 * TODO: Once we start calculating watermark values for Y/UV
+		 * plane both consider it for initial allowed wm blocks.
 		 */
 		y_data_rate = plane_y_data_rate[plane_id];
 
-		y_plane_blocks = y_minimum[plane_id];
-		y_plane_blocks += div_u64((uint64_t)alloc_size * y_data_rate,
-					total_data_rate);
-
 		if (y_data_rate) {
+			y_plane_blocks = y_minimum[plane_id];
+			y_plane_blocks += div_u64((uint64_t)alloc_size *
+					y_data_rate, total_data_rate);
 			ddb->y_plane[pipe][plane_id].start = start;
 			ddb->y_plane[pipe][plane_id].end = start + y_plane_blocks;
+			start += y_plane_blocks;
 		}
-
-		start += y_plane_blocks;
+		skl_enable_plane_wm_levels(dev_priv,
+					   plane_blocks,
+					   max_level,
+					   wm);
 	}
 
 	return 0;
@@ -3776,11 +3849,9 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
 static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				struct intel_crtc_state *cstate,
 				const struct intel_plane_state *intel_pstate,
-				uint16_t ddb_allocation,
 				int level,
 				uint16_t *out_blocks, /* out */
-				uint8_t *out_lines, /* out */
-				bool *enabled /* out */)
+				uint8_t *out_lines /* out */)
 {
 	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
 	const struct drm_plane_state *pstate = &intel_pstate->base;
@@ -3803,10 +3874,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	bool y_tiled, x_tiled;
 
 	if (latency == 0 ||
-	    !intel_wm_plane_visible(cstate, intel_pstate)) {
-		*enabled = false;
+	    !intel_wm_plane_visible(cstate, intel_pstate))
 		return 0;
-	}
 
 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
 		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
@@ -3892,9 +3961,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation && ddb_allocation /
-			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
-			selected_result = min_fixed_16_16(method1, method2);
 		else if (latency >= linetime_us)
 			selected_result = method2;
 		else
@@ -3914,64 +3980,41 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		}
 	}
 
-	if (res_blocks >= ddb_allocation || res_lines > 31) {
-		*enabled = false;
-
-		/*
-		 * If there are no valid level 0 watermarks, then we can't
-		 * support this display configuration.
-		 */
-		if (level) {
-			return 0;
-		} else {
-			struct drm_plane *plane = pstate->plane;
+	if (res_lines >= 31 && level == 0) {
+		struct drm_plane *plane = pstate->plane;
 
-			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
-			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
-				      plane->base.id, plane->name,
-				      res_blocks, ddb_allocation, res_lines);
-			return -EINVAL;
-		}
+		DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
+		DRM_DEBUG_KMS("[PLANE:%d:%s] lines required = %u/31\n",
+				plane->base.id, plane->name, res_lines);
 	}
 
 	*out_blocks = res_blocks;
 	*out_lines = res_lines;
-	*enabled = true;
 
 	return 0;
 }
 
 static int
 skl_compute_wm_level(const struct drm_i915_private *dev_priv,
-		     struct skl_ddb_allocation *ddb,
 		     struct intel_crtc_state *cstate,
 		     const struct intel_plane_state *intel_pstate,
 		     struct skl_plane_wm *wm)
 {
-	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
-	struct drm_plane *plane = intel_pstate->base.plane;
-	struct intel_plane *intel_plane = to_intel_plane(plane);
-	uint16_t ddb_blocks;
-	enum pipe pipe = intel_crtc->pipe;
 	int level, max_level = ilk_wm_max_level(dev_priv);
 	int ret;
 
 	if (WARN_ON(!intel_pstate->base.fb))
 		return -EINVAL;
 
-	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
-
 	for (level = 0; level <= max_level; level++) {
 		struct skl_wm_level *result = &wm->wm[level];
 
 		ret = skl_compute_plane_wm(dev_priv,
 					   cstate,
 					   intel_pstate,
-					   ddb_blocks,
 					   level,
 					   &result->plane_res_b,
-					   &result->plane_res_l,
-					   &result->plane_en);
+					   &result->plane_res_l);
 		if (ret)
 			return ret;
 	}
@@ -4037,8 +4080,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
 
 		wm = &pipe_wm->planes[plane_id];
 
-		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
-					   intel_pstate, wm);
+		ret = skl_compute_wm_level(dev_priv, cstate, intel_pstate, wm);
 		if (ret)
 			return ret;
 
@@ -4152,6 +4194,45 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 	return false;
 }
 
+static int
+skl_ddb_add_affected_planes(struct intel_crtc_state *cstate,
+			    const struct skl_pipe_wm *old_pipe_wm,
+			    const struct skl_pipe_wm *pipe_wm)
+{
+	struct drm_atomic_state *state = cstate->base.state;
+	struct drm_device *dev = state->dev;
+	struct drm_crtc *crtc = cstate->base.crtc;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
+	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct drm_plane_state *plane_state;
+	struct drm_plane *plane;
+	enum pipe pipe = intel_crtc->pipe;
+
+	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
+
+	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
+		enum plane_id plane_id = to_intel_plane(plane)->id;
+		const struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
+		const struct skl_plane_wm *old_wm = &old_pipe_wm->planes[plane_id];
+
+		if ((skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
+					&new_ddb->plane[pipe][plane_id]) &&
+		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
+					&new_ddb->y_plane[pipe][plane_id])) &&
+		    !memcmp(wm, old_wm, sizeof(struct skl_plane_wm)))
+			continue;
+
+		plane_state = drm_atomic_get_plane_state(state, plane);
+		if (IS_ERR(plane_state))
+			return PTR_ERR(plane_state);
+	}
+
+	return 0;
+}
+
 static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
 			      const struct skl_pipe_wm *old_pipe_wm,
 			      struct skl_pipe_wm *pipe_wm, /* out */
@@ -4165,6 +4246,18 @@ static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
 	if (ret)
 		return ret;
 
+	ret = skl_allocate_pipe_ddb(intel_cstate, ddb, pipe_wm);
+	if (ret)
+		return ret;
+	/*
+	 * TODO: Do we still need to add planes in state, As WM update is
+	 * not part of update_plane anymore, So wm for planes can be updated
+	 * irrespective of updade_plane call.
+	 */
+	ret = skl_ddb_add_affected_planes(intel_cstate, old_pipe_wm, pipe_wm);
+	if (ret)
+		return ret;
+
 	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
 		*changed = false;
 	else
@@ -4187,41 +4280,7 @@ pipes_modified(struct drm_atomic_state *state)
 }
 
 static int
-skl_ddb_add_affected_planes(struct intel_crtc_state *cstate)
-{
-	struct drm_atomic_state *state = cstate->base.state;
-	struct drm_device *dev = state->dev;
-	struct drm_crtc *crtc = cstate->base.crtc;
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
-	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
-	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
-	struct drm_plane_state *plane_state;
-	struct drm_plane *plane;
-	enum pipe pipe = intel_crtc->pipe;
-
-	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
-
-	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
-		enum plane_id plane_id = to_intel_plane(plane)->id;
-
-		if (skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
-					&new_ddb->plane[pipe][plane_id]) &&
-		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
-					&new_ddb->y_plane[pipe][plane_id]))
-			continue;
-
-		plane_state = drm_atomic_get_plane_state(state, plane);
-		if (IS_ERR(plane_state))
-			return PTR_ERR(plane_state);
-	}
-
-	return 0;
-}
-
-static int
-skl_compute_ddb(struct drm_atomic_state *state)
+skl_include_affected_crtc(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
@@ -4285,14 +4344,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
-
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
-		if (ret)
-			return ret;
-
-		ret = skl_ddb_add_affected_planes(cstate);
-		if (ret)
-			return ret;
 	}
 
 	return 0;
@@ -4373,7 +4424,7 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
-	ret = skl_compute_ddb(state);
+	ret = skl_include_affected_crtc(state);
 	if (ret)
 		return ret;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (9 preceding siblings ...)
  2017-05-08 11:49 ` [PATCH 10/11] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
@ 2017-05-08 11:49 ` Mahesh Kumar
  2017-05-10 13:22   ` Maarten Lankhorst
  2017-05-08 12:37 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev2) Patchwork
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 11:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

A display resolution is only supported if it meets all the restrictions
below for Maximum Pipe Pixel Rate.

The display resolution must fit within the maximum pixel rate output
from the pipe. Make sure that the display pipe is able to feed pixels at
a rate required to support the desired resolution.
For each enabled plane on the pipe {
    If plane scaling enabled {
	Horizontal down scale amount = Maximum[1, plane horizontal size /
		    scaler horizontal window size]
	Vertical down scale amount = Maximum[1, plane vertical size /
		    scaler vertical window size]
	Plane down scale amount = Horizontal down scale amount *
		    Vertical down scale amount
	Plane Ratio = 1 / Plane down scale amount
    }
    Else {
	Plane Ratio = 1
    }
    If plane source pixel format is 64 bits per pixel {
	Plane Ratio = Plane Ratio * 8/9
    }
}

Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe

If pipe scaling is enabled {
    Horizontal down scale amount = Maximum[1, pipe horizontal source size /
		scaler horizontal window size]
    Vertical down scale amount = Maximum[1, pipe vertical source size /
		scaler vertical window size]
    Note: The progressive fetch - interlace display mode is equivalent to a
		2.0 vertical down scale
    Pipe down scale amount = Horizontal down scale amount *
		Vertical down scale amount
    Pipe Ratio = Pipe Ratio / Pipe down scale amount
}

Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c |  3 ++
 drivers/gpu/drm/i915/intel_drv.h     |  2 +
 drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
 3 files changed, 92 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85b9e2f521a0..d64367e810f8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10992,6 +10992,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 			ret = skl_update_scaler_crtc(pipe_config);
 
 		if (!ret)
+			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
+							    pipe_config);
+		if (!ret)
 			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
 							 pipe_config);
 	}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 54f3ff840812..8323fc2ec4f2 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1859,6 +1859,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 				 int ignore);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 92600cf42e12..69b1692ffd07 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3397,6 +3397,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	return mul_fixed_16_16(downscale_w, downscale_h);
 }
 
+static uint_fixed_16_16_t
+skl_pipe_downscale_amount(const struct intel_crtc_state *config)
+{
+	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
+
+	if (!config->base.active)
+		return pipe_downscale;
+
+	if (config->pch_pfit.enabled) {
+		uint32_t src_w, src_h, dst_w, dst_h;
+		uint32_t pfit_size = config->pch_pfit.size;
+		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+		uint_fixed_16_16_t downscale_h, downscale_w;
+
+		src_w = config->pipe_src_w;
+		src_h = config->pipe_src_h;
+		dst_w = pfit_size >> 16;
+		dst_h = pfit_size & 0xffff;
+
+		if (!dst_w || !dst_h)
+			return pipe_downscale;
+
+		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+
+		pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
+	}
+
+	return pipe_downscale;
+}
+
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate)
+{
+	struct drm_crtc_state *crtc_state = &cstate->base;
+	struct drm_atomic_state *state = crtc_state->state;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
+	struct intel_plane_state *intel_pstate;
+	int crtc_clock, cdclk;
+	uint32_t pipe_max_pixel_rate;
+	uint_fixed_16_16_t pipe_downscale;
+	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
+
+	if (cstate->base.active)
+		return 0;
+
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		uint_fixed_16_16_t plane_downscale;
+		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
+		int bpp;
+
+		if (!pstate->visible)
+			continue;
+
+		if (WARN_ON(!pstate->fb))
+			continue;
+
+		intel_pstate = to_intel_plane_state(pstate);
+		plane_downscale = skl_plane_downscale_amount(cstate,
+							     intel_pstate);
+		bpp = pstate->fb->format->cpp[0] * 8;
+		if (bpp == 64)
+			plane_downscale = mul_fixed_16_16(plane_downscale,
+							  fp_9_div_8);
+
+		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
+	}
+	pipe_downscale = skl_pipe_downscale_amount(cstate);
+
+	pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
+
+	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
+	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
+	pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
+							   pipe_downscale);
+
+	if (pipe_max_pixel_rate < crtc_clock) {
+		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
+		return -ERANGE;
+	}
+
+	return 0;
+}
+
 static unsigned int
 skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 			     const struct drm_plane_state *pstate,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation
  2017-05-08 11:48   ` Lankhorst, Maarten
@ 2017-05-08 12:01     ` Mahesh Kumar
  2017-05-12  0:22       ` Matt Roper
  0 siblings, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-08 12:01 UTC (permalink / raw)
  To: Lankhorst, Maarten, intel-gfx; +Cc: Zanoni, Paulo R

Hi,


On Monday 08 May 2017 05:18 PM, Lankhorst, Maarten wrote:
> Mahesh Kumar schreef op ma 08-05-2017 om 17:18 [+0530]:
>> Fail the flip if no FB is present but plane_state is set as visible.
>> Above is not a valid combination so instead of continue fail the
>> flip.
> Why is this patch necessary? drm_atomic_plane_check handles this.
Ideally we should never get such combination here. But current WM code 
checks for this situation and even if it's true it proceeds further. 
This patch just corrects the WM code flow decision.
I also think some of these checks are redundant here.

-Mahesh

>
> ~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev2)
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (10 preceding siblings ...)
  2017-05-08 11:49 ` [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
@ 2017-05-08 12:37 ` Patchwork
  2017-05-09  8:12 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev4) Patchwork
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 43+ messages in thread
From: Patchwork @ 2017-05-08 12:37 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Implement DDB algorithm and WM cleanup (rev2)
URL   : https://patchwork.freedesktop.org/series/20376/
State : success

== Summary ==

Series 20376v2 Implement DDB algorithm and WM cleanup
https://patchwork.freedesktop.org/api/1.0/series/20376/revisions/2/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:428s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:429s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:568s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:507s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:541s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:483s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:487s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:416s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:483s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:482s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:460s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:566s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:458s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:571s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:461s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:487s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:438s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:532s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:403s

f3c6147e2a7879ee7face0b7b634dd26704fc3d4 drm-tip: 2017y-05m-08d-11h-23m-22s UTC integration manifest
e65003c drm/i915/skl+: consider max supported plane pixel rate while scaling
0a29ecc drm/i915/skl: New ddb allocation algorithm
1b2f85b drm/i915/skl+: use linetime latency if ddb size is not available
af519ec drm/i915/skl+: Watermark calculation cleanup
cb5ca95 drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
9535309 drm/i915/skl+: no need to memset again
9be01bb drm/i915/skl: Fail the flip if no FB for WM calculation
e955e9a drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
7c8a3a3 drm/i915: Use fixed_16_16 wrapper for division operation
ff89b6a drm/i915: Add more wrapper for fixed_point_16_16 operations
7bf103a drm/i915: fix naming of fixed_16_16 wrapper.

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4640/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-08 11:48 ` [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
@ 2017-05-08 14:12   ` Ander Conselvan De Oliveira
  2017-05-09  7:50     ` Mahesh Kumar
  2017-05-09  7:52     ` Mahesh Kumar
  2017-05-12  0:23   ` Matt Roper
  1 sibling, 2 replies; 43+ messages in thread
From: Ander Conselvan De Oliveira @ 2017-05-08 14:12 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

On Mon, 2017-05-08 at 17:18 +0530, Mahesh Kumar wrote:
> DDB minimum requirement may exceed the allocated DDB for CRTC/Pipe. This
> patch make changes to fail the flip if minimum requirement for pipe
> exceeds the total ddb allocated to the pipe.
> Previously it succeeded but making alloc_size a negative value. Which
> will make later calculations for plane ddb allocation bogus & may lead
> to screen corruption or system hang.

When sending a new version of a patch, please list the changes from the previous
version here. For example:

v2: Improve commit message.

v3: ...

> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2a4e9d89cd6f..0ace94d67432 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3591,6 +3591,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	int num_active;
>  	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>  	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
> +	uint16_t total_min_blocks = 0;
>  
>  	/* Clear the partitioning for disabled planes. */
>  	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> @@ -3618,10 +3619,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	 */
>  
>  	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
> -		alloc_size -= minimum[plane_id];
> -		alloc_size -= y_minimum[plane_id];
> +		total_min_blocks += minimum[plane_id];
> +		total_min_blocks += y_minimum[plane_id];
>  	}
>  
> +	if ((total_min_blocks > alloc_size)) {
> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
> +		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
> +							alloc_size);
> +		return-EINVAL;
> +	}

The comments I made earlier [1] are still valid. With those fixed and a
changelog in the commit message,

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>

[1] https://lists.freedesktop.org/archives/intel-gfx/2017-April/125764.html
> +
> +	alloc_size -= total_min_blocks;
>  	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>  	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>  
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-08 14:12   ` Ander Conselvan De Oliveira
@ 2017-05-09  7:50     ` Mahesh Kumar
  2017-05-09  7:52     ` Mahesh Kumar
  1 sibling, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-09  7:50 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

DDB minimum requirement of crtc configuration (cumulative of all the
enabled planes in crtc) may exceed the allocated DDB for crtc/pipe.
This patch make changes to fail the flip/ioctl if minimum requirement
for pipe exceeds the total ddb allocated to the pipe.
Previously it succeeded but making alloc_size a negative value. Which
will make subsequent calculations for plane ddb allocation bogus & may
lead to screen corruption or system hang.

Changes from V1:
 - Improve commit message as per Ander's comment
 - Remove extra parentheses (Ander)

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2a4e9d89cd6f..c0f6aeab0390 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3591,6 +3591,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	int num_active;
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
+	uint16_t total_min_blocks = 0;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -3618,10 +3619,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	 */
 
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
-		alloc_size -= minimum[plane_id];
-		alloc_size -= y_minimum[plane_id];
+		total_min_blocks += minimum[plane_id];
+		total_min_blocks += y_minimum[plane_id];
 	}
 
+	if (total_min_blocks > alloc_size) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
+							alloc_size);
+		return-EINVAL;
+	}
+
+	alloc_size -= total_min_blocks;
 	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-08 14:12   ` Ander Conselvan De Oliveira
  2017-05-09  7:50     ` Mahesh Kumar
@ 2017-05-09  7:52     ` Mahesh Kumar
  1 sibling, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-09  7:52 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

DDB minimum requirement of crtc configuration (cumulative of all the
enabled planes in crtc) may exceed the allocated DDB for crtc/pipe.
This patch make changes to fail the flip/ioctl if minimum requirement
for pipe exceeds the total ddb allocated to the pipe.
Previously it succeeded but making alloc_size a negative value. Which
will make subsequent calculations for plane ddb allocation bogus & may
lead to screen corruption or system hang.

Changes from V1:
 - Improve commit message as per Ander's comment
 - Remove extra parentheses (Ander)

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2a4e9d89cd6f..c0f6aeab0390 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3591,6 +3591,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	int num_active;
 	unsigned plane_data_rate[I915_MAX_PLANES] = {};
 	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
+	uint16_t total_min_blocks = 0;
 
 	/* Clear the partitioning for disabled planes. */
 	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
@@ -3618,10 +3619,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
 	 */
 
 	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
-		alloc_size -= minimum[plane_id];
-		alloc_size -= y_minimum[plane_id];
+		total_min_blocks += minimum[plane_id];
+		total_min_blocks += y_minimum[plane_id];
 	}
 
+	if (total_min_blocks > alloc_size) {
+		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
+		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
+							alloc_size);
+		return -EINVAL;
+	}
+
+	alloc_size -= total_min_blocks;
 	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
 	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
 
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev4)
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (11 preceding siblings ...)
  2017-05-08 12:37 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev2) Patchwork
@ 2017-05-09  8:12 ` Patchwork
  2017-05-11  9:35 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev5) Patchwork
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 43+ messages in thread
From: Patchwork @ 2017-05-09  8:12 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Implement DDB algorithm and WM cleanup (rev4)
URL   : https://patchwork.freedesktop.org/series/20376/
State : success

== Summary ==

Series 20376v4 Implement DDB algorithm and WM cleanup
https://patchwork.freedesktop.org/api/1.0/series/20376/revisions/4/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-kbl-7560u) fdo#100125
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (fi-byt-j1900) fdo#100652

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125
fdo#100652 https://bugs.freedesktop.org/show_bug.cgi?id=100652

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:427s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:576s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:507s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:546s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:488s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:484s
fi-elk-e7500     total:278  pass:229  dwarn:0   dfail:0   fail:0   skip:49  time:417s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:411s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:403s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:419s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:493s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:470s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:456s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:576s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:452s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:576s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:468s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:489s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:430s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:531s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:410s

f558b185c57202d90bdb9059f3d446956cbae133 drm-tip: 2017y-05m-08d-16h-35m-28s UTC integration manifest
cb76814 drm/i915/skl+: consider max supported plane pixel rate while scaling
11009bc drm/i915/skl: New ddb allocation algorithm
6ef345e drm/i915/skl+: use linetime latency if ddb size is not available
df85039 drm/i915/skl+: Watermark calculation cleanup
1696606 drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
8545cbe drm/i915/skl+: no need to memset again
077ca5b drm/i915/skl: Fail the flip if no FB for WM calculation
d071301 drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
79b852b drm/i915: Use fixed_16_16 wrapper for division operation
66a693f drm/i915: Add more wrapper for fixed_point_16_16 operations
7775207 drm/i915: fix naming of fixed_16_16 wrapper.

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4644/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations
  2017-05-08 11:48 ` [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
@ 2017-05-10 12:37   ` Maarten Lankhorst
  2017-05-12  0:22   ` Matt Roper
  1 sibling, 0 replies; 43+ messages in thread
From: Maarten Lankhorst @ 2017-05-10 12:37 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni

Op 08-05-17 om 13:48 schreef Mahesh Kumar:
> This patch adds few wrapper to perform fixed_point_16_16 operations
> mul_u32_fixed_16_16_round_up : Multiplies u32 and fixed_16_16_t
> 				variables & returns u32 result with
> 				rounding-off.
> mul_fixed_16_16 : Multiplies two fixed_16_16_t variable & returns
> 				fixed_16_16
> div_round_up_fixed_16_16 : Perform division operation on fixed_16_16_t
> 				variables & return u32 result with round-off
> fixed_16_16_div_round_up_u64 : devide uint32_t variable by fixed_16_16
> 				variable and round_up the result to uint32_t.
>
> These wrappers will be used by later patches in the series.
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 43 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5804734d30a3..5ff0091707c0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -152,6 +152,38 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
>  	return max;
>  }
>  
> +static inline uint32_t div_round_up_fixed_16_16(uint_fixed_16_16_t val,
> +						uint_fixed_16_16_t d)
> +{
> +	return DIV_ROUND_UP(val.val, d.val);
> +}
> +
> +static inline uint32_t mul_u32_fixed_16_16_round_up(uint32_t val,
> +						    uint_fixed_16_16_t mul)
> +{
> +       uint64_t intermediate_val;
> +       uint32_t result;
> +
> +       intermediate_val = (uint64_t) val * mul.val;
> +       intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
> +       WARN_ON(intermediate_val >> 32);
> +       result = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +       return result;
> +}
> +
> +static inline uint_fixed_16_16_t mul_fixed_16_16(uint_fixed_16_16_t val,
> +						 uint_fixed_16_16_t mul)
> +{
> +	uint64_t intermediate_val;
> +	uint_fixed_16_16_t fp;
> +
> +	intermediate_val = (uint64_t) val.val * mul.val;
> +	intermediate_val = intermediate_val >> 16;
> +	WARN_ON(intermediate_val >> 32);
> +	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +	return fp;
> +}
> +
>  static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
>  {
>  	uint_fixed_16_16_t fp, res;
> @@ -174,6 +206,17 @@ static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
>  	return res;
>  }
>  
> +static inline uint32_t fixed_16_16_div_round_up_u64(uint32_t val,
> +						    uint_fixed_16_16_t d)
> +{
> +	uint64_t interm_val;
> +
> +	interm_val = (uint64_t)val << 16;
> +	interm_val = DIV_ROUND_UP_ULL(interm_val, d.val);
> +	WARN_ON(interm_val >> 32);
> +	return clamp_t(uint32_t, interm_val, 0, ~0);
> +}
> +
>  static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
>  						     uint_fixed_16_16_t mul)
>  {

I wouldn't mind if those macros would be moved later to drm core, I don't think i915 is the only one to benefit from fixed 16.16 calculations. :)

~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-08 11:49 ` [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
@ 2017-05-10 13:22   ` Maarten Lankhorst
  2017-05-11  8:36     ` Mahesh Kumar
  2017-05-11  9:18     ` [PATCH v2] " Mahesh Kumar
  0 siblings, 2 replies; 43+ messages in thread
From: Maarten Lankhorst @ 2017-05-10 13:22 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Op 08-05-17 om 13:49 schreef Mahesh Kumar:
> A display resolution is only supported if it meets all the restrictions
> below for Maximum Pipe Pixel Rate.
>
> The display resolution must fit within the maximum pixel rate output
> from the pipe. Make sure that the display pipe is able to feed pixels at
> a rate required to support the desired resolution.
> For each enabled plane on the pipe {
>     If plane scaling enabled {
> 	Horizontal down scale amount = Maximum[1, plane horizontal size /
> 		    scaler horizontal window size]
> 	Vertical down scale amount = Maximum[1, plane vertical size /
> 		    scaler vertical window size]
> 	Plane down scale amount = Horizontal down scale amount *
> 		    Vertical down scale amount
> 	Plane Ratio = 1 / Plane down scale amount
>     }
>     Else {
> 	Plane Ratio = 1
>     }
>     If plane source pixel format is 64 bits per pixel {
> 	Plane Ratio = Plane Ratio * 8/9
>     }
> }
>
> Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe
>
> If pipe scaling is enabled {
>     Horizontal down scale amount = Maximum[1, pipe horizontal source size /
> 		scaler horizontal window size]
>     Vertical down scale amount = Maximum[1, pipe vertical source size /
> 		scaler vertical window size]
>     Note: The progressive fetch - interlace display mode is equivalent to a
> 		2.0 vertical down scale
>     Pipe down scale amount = Horizontal down scale amount *
> 		Vertical down scale amount
>     Pipe Ratio = Pipe Ratio / Pipe down scale amount
> }
>
> Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio
>
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c |  3 ++
>  drivers/gpu/drm/i915/intel_drv.h     |  2 +
>  drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 92 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 85b9e2f521a0..d64367e810f8 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10992,6 +10992,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
>  			ret = skl_update_scaler_crtc(pipe_config);
>  
>  		if (!ret)
> +			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
> +							    pipe_config);
> +		if (!ret)
>  			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
>  							 pipe_config);
>  	}
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 54f3ff840812..8323fc2ec4f2 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1859,6 +1859,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>  				 int ignore);
>  bool ilk_disable_lp_wm(struct drm_device *dev);
>  int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
> +				  struct intel_crtc_state *cstate);
>  static inline int intel_enable_rc6(void)
>  {
>  	return i915.enable_rc6;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 92600cf42e12..69b1692ffd07 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3397,6 +3397,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>  	return mul_fixed_16_16(downscale_w, downscale_h);
>  }
>  
> +static uint_fixed_16_16_t
> +skl_pipe_downscale_amount(const struct intel_crtc_state *config)
> +{
> +	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
> +
> +	if (!config->base.active)
> +		return pipe_downscale;
> +
> +	if (config->pch_pfit.enabled) {
> +		uint32_t src_w, src_h, dst_w, dst_h;
> +		uint32_t pfit_size = config->pch_pfit.size;
> +		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +		uint_fixed_16_16_t downscale_h, downscale_w;
> +
> +		src_w = config->pipe_src_w;
> +		src_h = config->pipe_src_h;
> +		dst_w = pfit_size >> 16;
> +		dst_h = pfit_size & 0xffff;
> +
> +		if (!dst_w || !dst_h)
> +			return pipe_downscale;
> +
> +		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
> +		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
> +		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
> +		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
> +
> +		pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
> +	}
> +
> +	return pipe_downscale;
> +}
> +
> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
> +				  struct intel_crtc_state *cstate)
> +{
> +	struct drm_crtc_state *crtc_state = &cstate->base;
> +	struct drm_atomic_state *state = crtc_state->state;
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *pstate;
> +	struct intel_plane_state *intel_pstate;
> +	int crtc_clock, cdclk;
> +	uint32_t pipe_max_pixel_rate;
> +	uint_fixed_16_16_t pipe_downscale;
> +	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
> +
> +	if (cstate->base.active)
> +		return 0;
^This check looks buggy, are there any tests?

In general we try to do the same for !active as we do with active, so please to remove any !active checks altogether..
> +	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
> +		uint_fixed_16_16_t plane_downscale;
> +		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
> +		int bpp;
> +
> +		if (!pstate->visible)
> +			continue;
Best use intel_wm_plane_visible here.
> +
> +		if (WARN_ON(!pstate->fb))
> +			continue;
> +
> +		intel_pstate = to_intel_plane_state(pstate);
> +		plane_downscale = skl_plane_downscale_amount(cstate,
> +							     intel_pstate);
> +		bpp = pstate->fb->format->cpp[0] * 8;
> +		if (bpp == 64)
> +			plane_downscale = mul_fixed_16_16(plane_downscale,
> +							  fp_9_div_8);
> +
> +		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
> +	}
> +	pipe_downscale = skl_pipe_downscale_amount(cstate);
> +
> +	pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
> +
> +	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
> +	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
> +	pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
> +							   pipe_downscale);
> +
> +	if (pipe_max_pixel_rate < crtc_clock) {
> +		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
> +		return -ERANGE;
> +	}
> +
> +	return 0;
> +}
> +
>  static unsigned int
>  skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
>  			     const struct drm_plane_state *pstate,


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-10 13:22   ` Maarten Lankhorst
@ 2017-05-11  8:36     ` Mahesh Kumar
  2017-05-11  9:48       ` Maarten Lankhorst
  2017-05-11  9:18     ` [PATCH v2] " Mahesh Kumar
  1 sibling, 1 reply; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-11  8:36 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Hi,

Thanks for review.

On Wednesday 10 May 2017 06:52 PM, Maarten Lankhorst wrote:
> Op 08-05-17 om 13:49 schreef Mahesh Kumar:
>> A display resolution is only supported if it meets all the restrictions
>> below for Maximum Pipe Pixel Rate.
>>
>> The display resolution must fit within the maximum pixel rate output
>> from the pipe. Make sure that the display pipe is able to feed pixels at
>> a rate required to support the desired resolution.
>> For each enabled plane on the pipe {
>>      If plane scaling enabled {
>> 	Horizontal down scale amount = Maximum[1, plane horizontal size /
>> 		    scaler horizontal window size]
>> 	Vertical down scale amount = Maximum[1, plane vertical size /
>> 		    scaler vertical window size]
>> 	Plane down scale amount = Horizontal down scale amount *
>> 		    Vertical down scale amount
>> 	Plane Ratio = 1 / Plane down scale amount
>>      }
>>      Else {
>> 	Plane Ratio = 1
>>      }
>>      If plane source pixel format is 64 bits per pixel {
>> 	Plane Ratio = Plane Ratio * 8/9
>>      }
>> }
>>
>> Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe
>>
>> If pipe scaling is enabled {
>>      Horizontal down scale amount = Maximum[1, pipe horizontal source size /
>> 		scaler horizontal window size]
>>      Vertical down scale amount = Maximum[1, pipe vertical source size /
>> 		scaler vertical window size]
>>      Note: The progressive fetch - interlace display mode is equivalent to a
>> 		2.0 vertical down scale
>>      Pipe down scale amount = Horizontal down scale amount *
>> 		Vertical down scale amount
>>      Pipe Ratio = Pipe Ratio / Pipe down scale amount
>> }
>>
>> Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_display.c |  3 ++
>>   drivers/gpu/drm/i915/intel_drv.h     |  2 +
>>   drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
>>   3 files changed, 92 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index 85b9e2f521a0..d64367e810f8 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -10992,6 +10992,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
>>   			ret = skl_update_scaler_crtc(pipe_config);
>>   
>>   		if (!ret)
>> +			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
>> +							    pipe_config);
>> +		if (!ret)
>>   			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
>>   							 pipe_config);
>>   	}
>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>> index 54f3ff840812..8323fc2ec4f2 100644
>> --- a/drivers/gpu/drm/i915/intel_drv.h
>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>> @@ -1859,6 +1859,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>>   				 int ignore);
>>   bool ilk_disable_lp_wm(struct drm_device *dev);
>>   int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
>> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
>> +				  struct intel_crtc_state *cstate);
>>   static inline int intel_enable_rc6(void)
>>   {
>>   	return i915.enable_rc6;
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 92600cf42e12..69b1692ffd07 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3397,6 +3397,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>>   	return mul_fixed_16_16(downscale_w, downscale_h);
>>   }
>>   
>> +static uint_fixed_16_16_t
>> +skl_pipe_downscale_amount(const struct intel_crtc_state *config)
>> +{
>> +	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
>> +
>> +	if (!config->base.active)
>> +		return pipe_downscale;
>> +
>> +	if (config->pch_pfit.enabled) {
>> +		uint32_t src_w, src_h, dst_w, dst_h;
>> +		uint32_t pfit_size = config->pch_pfit.size;
>> +		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
>> +		uint_fixed_16_16_t downscale_h, downscale_w;
>> +
>> +		src_w = config->pipe_src_w;
>> +		src_h = config->pipe_src_h;
>> +		dst_w = pfit_size >> 16;
>> +		dst_h = pfit_size & 0xffff;
>> +
>> +		if (!dst_w || !dst_h)
>> +			return pipe_downscale;
>> +
>> +		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
>> +		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
>> +		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
>> +		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
>> +
>> +		pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
>> +	}
>> +
>> +	return pipe_downscale;
>> +}
>> +
>> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
>> +				  struct intel_crtc_state *cstate)
>> +{
>> +	struct drm_crtc_state *crtc_state = &cstate->base;
>> +	struct drm_atomic_state *state = crtc_state->state;
>> +	struct drm_plane *plane;
>> +	const struct drm_plane_state *pstate;
>> +	struct intel_plane_state *intel_pstate;
>> +	int crtc_clock, cdclk;
>> +	uint32_t pipe_max_pixel_rate;
>> +	uint_fixed_16_16_t pipe_downscale;
>> +	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
>> +
>> +	if (cstate->base.active)
>> +		return 0;
> ^This check looks buggy, are there any tests?
>
> In general we try to do the same for !active as we do with active, so please to remove any !active checks altogether..
Thanks for catching this, yes it's a bug, during my testing this check 
was not there but I added this as an extra precaution & screwed-up :P
>> +	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
>> +		uint_fixed_16_16_t plane_downscale;
>> +		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
>> +		int bpp;
>> +
>> +		if (!pstate->visible)
>> +			continue;
> Best use intel_wm_plane_visible here.
will do,

-Mahesh
>> +
>> +		if (WARN_ON(!pstate->fb))
>> +			continue;
>> +
>> +		intel_pstate = to_intel_plane_state(pstate);
>> +		plane_downscale = skl_plane_downscale_amount(cstate,
>> +							     intel_pstate);
>> +		bpp = pstate->fb->format->cpp[0] * 8;
>> +		if (bpp == 64)
>> +			plane_downscale = mul_fixed_16_16(plane_downscale,
>> +							  fp_9_div_8);
>> +
>> +		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
>> +	}
>> +	pipe_downscale = skl_pipe_downscale_amount(cstate);
>> +
>> +	pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
>> +
>> +	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
>> +	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
>> +	pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
>> +							   pipe_downscale);
>> +
>> +	if (pipe_max_pixel_rate < crtc_clock) {
>> +		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
>> +		return -ERANGE;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   static unsigned int
>>   skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
>>   			     const struct drm_plane_state *pstate,
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v2] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-10 13:22   ` Maarten Lankhorst
  2017-05-11  8:36     ` Mahesh Kumar
@ 2017-05-11  9:18     ` Mahesh Kumar
  1 sibling, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-11  9:18 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

A display resolution is only supported if it meets all the restrictions
below for Maximum Pipe Pixel Rate.

The display resolution must fit within the maximum pixel rate output
from the pipe. Make sure that the display pipe is able to feed pixels at
a rate required to support the desired resolution.
For each enabled plane on the pipe {
    If plane scaling enabled {
	Horizontal down scale amount = Maximum[1, plane horizontal size /
		    scaler horizontal window size]
	Vertical down scale amount = Maximum[1, plane vertical size /
		    scaler vertical window size]
	Plane down scale amount = Horizontal down scale amount *
		    Vertical down scale amount
	Plane Ratio = 1 / Plane down scale amount
    }
    Else {
	Plane Ratio = 1
    }
    If plane source pixel format is 64 bits per pixel {
	Plane Ratio = Plane Ratio * 8/9
    }
}

Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe

If pipe scaling is enabled {
    Horizontal down scale amount = Maximum[1, pipe horizontal source size /
		scaler horizontal window size]
    Vertical down scale amount = Maximum[1, pipe vertical source size /
		scaler vertical window size]
    Note: The progressive fetch - interlace display mode is equivalent to a
		2.0 vertical down scale
    Pipe down scale amount = Horizontal down scale amount *
		Vertical down scale amount
    Pipe Ratio = Pipe Ratio / Pipe down scale amount
}

Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio

Changes since V1:
 - separate out fixed_16_16 wrapper API definition
Changes since V2:
 - Fix buggy crtc !active condition (Maarten)
 - use intel_wm_plane_visible wrapper as per Maarten's suggestion

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c |  3 ++
 drivers/gpu/drm/i915/intel_drv.h     |  2 +
 drivers/gpu/drm/i915/intel_pm.c      | 88 ++++++++++++++++++++++++++++++++++++
 3 files changed, 93 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85b9e2f521a0..d64367e810f8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10992,6 +10992,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 			ret = skl_update_scaler_crtc(pipe_config);
 
 		if (!ret)
+			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
+							    pipe_config);
+		if (!ret)
 			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
 							 pipe_config);
 	}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 54f3ff840812..8323fc2ec4f2 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1859,6 +1859,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 				 int ignore);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 041cbca7105e..92b047054f7e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3397,6 +3397,94 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	return mul_fixed_16_16(downscale_w, downscale_h);
 }
 
+static uint_fixed_16_16_t
+skl_pipe_downscale_amount(const struct intel_crtc_state *config)
+{
+	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
+
+	if (!config->base.active)
+		return pipe_downscale;
+
+	if (config->pch_pfit.enabled) {
+		uint32_t src_w, src_h, dst_w, dst_h;
+		uint32_t pfit_size = config->pch_pfit.size;
+		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+		uint_fixed_16_16_t downscale_h, downscale_w;
+
+		src_w = config->pipe_src_w;
+		src_h = config->pipe_src_h;
+		dst_w = pfit_size >> 16;
+		dst_h = pfit_size & 0xffff;
+
+		if (!dst_w || !dst_h)
+			return pipe_downscale;
+
+		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+
+		pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
+	}
+
+	return pipe_downscale;
+}
+
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate)
+{
+	struct drm_crtc_state *crtc_state = &cstate->base;
+	struct drm_atomic_state *state = crtc_state->state;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
+	struct intel_plane_state *intel_pstate;
+	int crtc_clock, cdclk;
+	uint32_t pipe_max_pixel_rate;
+	uint_fixed_16_16_t pipe_downscale;
+	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
+
+	if (!cstate->base.active)
+		return 0;
+
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		uint_fixed_16_16_t plane_downscale;
+		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
+		int bpp;
+
+		if (!intel_wm_plane_visible(cstate,
+					    to_intel_plane_state(pstate)))
+			continue;
+
+		if (WARN_ON(!pstate->fb))
+			continue;
+
+		intel_pstate = to_intel_plane_state(pstate);
+		plane_downscale = skl_plane_downscale_amount(cstate,
+							     intel_pstate);
+		bpp = pstate->fb->format->cpp[0] * 8;
+		if (bpp == 64)
+			plane_downscale = mul_fixed_16_16(plane_downscale,
+							  fp_9_div_8);
+
+		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
+	}
+	pipe_downscale = skl_pipe_downscale_amount(cstate);
+
+	pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
+
+	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
+	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
+	pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
+							   pipe_downscale);
+
+	if (pipe_max_pixel_rate < crtc_clock) {
+		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
+		return -ERANGE;
+	}
+
+	return 0;
+}
+
 static unsigned int
 skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 			     const struct drm_plane_state *pstate,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev5)
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (12 preceding siblings ...)
  2017-05-09  8:12 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev4) Patchwork
@ 2017-05-11  9:35 ` Patchwork
  2017-05-11 14:11 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev6) Patchwork
  2017-05-12  0:21 ` [PATCH 00/11] Implement DDB algorithm and WM cleanup Matt Roper
  15 siblings, 0 replies; 43+ messages in thread
From: Patchwork @ 2017-05-11  9:35 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Implement DDB algorithm and WM cleanup (rev5)
URL   : https://patchwork.freedesktop.org/series/20376/
State : success

== Summary ==

Series 20376v5 Implement DDB algorithm and WM cleanup
https://patchwork.freedesktop.org/api/1.0/series/20376/revisions/5/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:437s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:434s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:575s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:516s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:571s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:495s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:491s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:414s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:416s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:425s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:492s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:478s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:675s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:458s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:581s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:469s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:506s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:443s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:537s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:417s

4149c7b22164d81960dae0987f497725117b68e6 drm-tip: 2017y-05m-10d-18h-09m-15s UTC integration manifest
04c9c80 drm/i915/skl+: consider max supported plane pixel rate while scaling
0257976 drm/i915/skl: New ddb allocation algorithm
e5ddde8 drm/i915/skl+: use linetime latency if ddb size is not available
2879b17 drm/i915/skl+: Watermark calculation cleanup
924111b drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
56c574d drm/i915/skl+: no need to memset again
71f5a11 drm/i915/skl: Fail the flip if no FB for WM calculation
63cc879 drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
daee6f3 drm/i915: Use fixed_16_16 wrapper for division operation
2b17d69 drm/i915: Add more wrapper for fixed_point_16_16 operations
78b019d drm/i915: fix naming of fixed_16_16 wrapper.

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4671/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-11  8:36     ` Mahesh Kumar
@ 2017-05-11  9:48       ` Maarten Lankhorst
  2017-05-11 10:59         ` Mahesh Kumar
  2017-05-11 13:05         ` [PATCH v4 " Mahesh Kumar
  0 siblings, 2 replies; 43+ messages in thread
From: Maarten Lankhorst @ 2017-05-11  9:48 UTC (permalink / raw)
  To: Mahesh Kumar, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

Op 11-05-17 om 10:36 schreef Mahesh Kumar:
> Hi,
>
> Thanks for review.
>
> On Wednesday 10 May 2017 06:52 PM, Maarten Lankhorst wrote:
>> Op 08-05-17 om 13:49 schreef Mahesh Kumar:
>>> A display resolution is only supported if it meets all the restrictions
>>> below for Maximum Pipe Pixel Rate.
>>>
>>> The display resolution must fit within the maximum pixel rate output
>>> from the pipe. Make sure that the display pipe is able to feed pixels at
>>> a rate required to support the desired resolution.
>>> For each enabled plane on the pipe {
>>>      If plane scaling enabled {
>>>     Horizontal down scale amount = Maximum[1, plane horizontal size /
>>>             scaler horizontal window size]
>>>     Vertical down scale amount = Maximum[1, plane vertical size /
>>>             scaler vertical window size]
>>>     Plane down scale amount = Horizontal down scale amount *
>>>             Vertical down scale amount
>>>     Plane Ratio = 1 / Plane down scale amount
>>>      }
>>>      Else {
>>>     Plane Ratio = 1
>>>      }
>>>      If plane source pixel format is 64 bits per pixel {
>>>     Plane Ratio = Plane Ratio * 8/9
>>>      }
>>> }
>>>
>>> Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe
>>>
>>> If pipe scaling is enabled {
>>>      Horizontal down scale amount = Maximum[1, pipe horizontal source size /
>>>         scaler horizontal window size]
>>>      Vertical down scale amount = Maximum[1, pipe vertical source size /
>>>         scaler vertical window size]
>>>      Note: The progressive fetch - interlace display mode is equivalent to a
>>>         2.0 vertical down scale
>>>      Pipe down scale amount = Horizontal down scale amount *
>>>         Vertical down scale amount
>>>      Pipe Ratio = Pipe Ratio / Pipe down scale amount
>>> }
>>>
>>> Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio
>>>
>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_display.c |  3 ++
>>>   drivers/gpu/drm/i915/intel_drv.h     |  2 +
>>>   drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
>>>   3 files changed, 92 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>>> index 85b9e2f521a0..d64367e810f8 100644
>>> --- a/drivers/gpu/drm/i915/intel_display.c
>>> +++ b/drivers/gpu/drm/i915/intel_display.c
>>> @@ -10992,6 +10992,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
>>>               ret = skl_update_scaler_crtc(pipe_config);
>>>             if (!ret)
>>> +            ret = skl_check_pipe_max_pixel_rate(intel_crtc,
>>> +                                pipe_config);
>>> +        if (!ret)
>>>               ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
>>>                                pipe_config);
>>>       }
>>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>>> index 54f3ff840812..8323fc2ec4f2 100644
>>> --- a/drivers/gpu/drm/i915/intel_drv.h
>>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>>> @@ -1859,6 +1859,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>>>                    int ignore);
>>>   bool ilk_disable_lp_wm(struct drm_device *dev);
>>>   int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
>>> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
>>> +                  struct intel_crtc_state *cstate);
>>>   static inline int intel_enable_rc6(void)
>>>   {
>>>       return i915.enable_rc6;
>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>>> index 92600cf42e12..69b1692ffd07 100644
>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>> @@ -3397,6 +3397,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>>>       return mul_fixed_16_16(downscale_w, downscale_h);
>>>   }
>>>   +static uint_fixed_16_16_t
>>> +skl_pipe_downscale_amount(const struct intel_crtc_state *config)
>>> +{
>>> +    uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
>>> +
>>> +    if (!config->base.active)
>>> +        return pipe_downscale;
>>> +
>>> +    if (config->pch_pfit.enabled) {
>>> +        uint32_t src_w, src_h, dst_w, dst_h;
>>> +        uint32_t pfit_size = config->pch_pfit.size;
>>> +        uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
>>> +        uint_fixed_16_16_t downscale_h, downscale_w;
>>> +
>>> +        src_w = config->pipe_src_w;
>>> +        src_h = config->pipe_src_h;
>>> +        dst_w = pfit_size >> 16;
>>> +        dst_h = pfit_size & 0xffff;
>>> +
>>> +        if (!dst_w || !dst_h)
>>> +            return pipe_downscale;
>>> +
>>> +        fp_w_ratio = fixed_16_16_div(src_w, dst_w);
>>> +        fp_h_ratio = fixed_16_16_div(src_h, dst_h);
>>> +        downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
>>> +        downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
>>> +
>>> +        pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
>>> +    }
>>> +
>>> +    return pipe_downscale;
>>> +}
>>> +
>>> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
>>> +                  struct intel_crtc_state *cstate)
>>> +{
>>> +    struct drm_crtc_state *crtc_state = &cstate->base;
>>> +    struct drm_atomic_state *state = crtc_state->state;
>>> +    struct drm_plane *plane;
>>> +    const struct drm_plane_state *pstate;
>>> +    struct intel_plane_state *intel_pstate;
>>> +    int crtc_clock, cdclk;
>>> +    uint32_t pipe_max_pixel_rate;
>>> +    uint_fixed_16_16_t pipe_downscale;
>>> +    uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
>>> +
>>> +    if (cstate->base.active)
>>> +        return 0;
>> ^This check looks buggy, are there any tests?
>>
>> In general we try to do the same for !active as we do with active, so please to remove any !active checks altogether..
> Thanks for catching this, yes it's a bug, during my testing this check was not there but I added this as an extra precaution & screwed-up :P
>>> +    drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
>>> +        uint_fixed_16_16_t plane_downscale;
>>> +        uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
>>> +        int bpp;
>>> +
>>> +        if (!pstate->visible)
>>> +            continue;
>> Best use intel_wm_plane_visible here.
> will do,
>
> -Mahesh
>>> +
>>> +        if (WARN_ON(!pstate->fb))
>>> +            continue;
>>> +
>>> +        intel_pstate = to_intel_plane_state(pstate);
>>> +        plane_downscale = skl_plane_downscale_amount(cstate,
>>> +                                 intel_pstate);
>>> +        bpp = pstate->fb->format->cpp[0] * 8;
>>> +        if (bpp == 64)
>>> +            plane_downscale = mul_fixed_16_16(plane_downscale,
>>> +                              fp_9_div_8);
>>> +
>>> +        max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
>>> +    }
>>> +    pipe_downscale = skl_pipe_downscale_amount(cstate);
>>> +
>>> +    pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
>>> +
>>> +    crtc_clock = crtc_state->adjusted_mode.crtc_clock;
>>> +    cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
>>> +    pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
>>> +                               pipe_downscale);
>>> +
>>> +    if (pipe_max_pixel_rate < crtc_clock) {
>>> +        DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
>>> +        return -ERANGE; 
Oops, please also only return -EINVAL, definitely not ERANGE.

~Maarten
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-11  9:48       ` Maarten Lankhorst
@ 2017-05-11 10:59         ` Mahesh Kumar
  2017-05-11 13:05         ` [PATCH v4 " Mahesh Kumar
  1 sibling, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-11 10:59 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst



On Thursday 11 May 2017 03:18 PM, Maarten Lankhorst wrote:
> Op 11-05-17 om 10:36 schreef Mahesh Kumar:
>> Hi,
>>
>> Thanks for review.
>>
>> On Wednesday 10 May 2017 06:52 PM, Maarten Lankhorst wrote:
>>> Op 08-05-17 om 13:49 schreef Mahesh Kumar:
>>>> A display resolution is only supported if it meets all the restrictions
>>>> below for Maximum Pipe Pixel Rate.
>>>>
>>>> The display resolution must fit within the maximum pixel rate output
>>>> from the pipe. Make sure that the display pipe is able to feed pixels at
>>>> a rate required to support the desired resolution.
>>>> For each enabled plane on the pipe {
>>>>       If plane scaling enabled {
>>>>      Horizontal down scale amount = Maximum[1, plane horizontal size /
>>>>              scaler horizontal window size]
>>>>      Vertical down scale amount = Maximum[1, plane vertical size /
>>>>              scaler vertical window size]
>>>>      Plane down scale amount = Horizontal down scale amount *
>>>>              Vertical down scale amount
>>>>      Plane Ratio = 1 / Plane down scale amount
>>>>       }
>>>>       Else {
>>>>      Plane Ratio = 1
>>>>       }
>>>>       If plane source pixel format is 64 bits per pixel {
>>>>      Plane Ratio = Plane Ratio * 8/9
>>>>       }
>>>> }
>>>>
>>>> Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe
>>>>
>>>> If pipe scaling is enabled {
>>>>       Horizontal down scale amount = Maximum[1, pipe horizontal source size /
>>>>          scaler horizontal window size]
>>>>       Vertical down scale amount = Maximum[1, pipe vertical source size /
>>>>          scaler vertical window size]
>>>>       Note: The progressive fetch - interlace display mode is equivalent to a
>>>>          2.0 vertical down scale
>>>>       Pipe down scale amount = Horizontal down scale amount *
>>>>          Vertical down scale amount
>>>>       Pipe Ratio = Pipe Ratio / Pipe down scale amount
>>>> }
>>>>
>>>> Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio
>>>>
>>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/intel_display.c |  3 ++
>>>>    drivers/gpu/drm/i915/intel_drv.h     |  2 +
>>>>    drivers/gpu/drm/i915/intel_pm.c      | 87 ++++++++++++++++++++++++++++++++++++
>>>>    3 files changed, 92 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>>>> index 85b9e2f521a0..d64367e810f8 100644
>>>> --- a/drivers/gpu/drm/i915/intel_display.c
>>>> +++ b/drivers/gpu/drm/i915/intel_display.c
>>>> @@ -10992,6 +10992,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
>>>>                ret = skl_update_scaler_crtc(pipe_config);
>>>>              if (!ret)
>>>> +            ret = skl_check_pipe_max_pixel_rate(intel_crtc,
>>>> +                                pipe_config);
>>>> +        if (!ret)
>>>>                ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
>>>>                                 pipe_config);
>>>>        }
>>>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>>>> index 54f3ff840812..8323fc2ec4f2 100644
>>>> --- a/drivers/gpu/drm/i915/intel_drv.h
>>>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>>>> @@ -1859,6 +1859,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>>>>                     int ignore);
>>>>    bool ilk_disable_lp_wm(struct drm_device *dev);
>>>>    int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
>>>> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
>>>> +                  struct intel_crtc_state *cstate);
>>>>    static inline int intel_enable_rc6(void)
>>>>    {
>>>>        return i915.enable_rc6;
>>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>>>> index 92600cf42e12..69b1692ffd07 100644
>>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>>> @@ -3397,6 +3397,93 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>>>>        return mul_fixed_16_16(downscale_w, downscale_h);
>>>>    }
>>>>    +static uint_fixed_16_16_t
>>>> +skl_pipe_downscale_amount(const struct intel_crtc_state *config)
>>>> +{
>>>> +    uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
>>>> +
>>>> +    if (!config->base.active)
>>>> +        return pipe_downscale;
>>>> +
>>>> +    if (config->pch_pfit.enabled) {
>>>> +        uint32_t src_w, src_h, dst_w, dst_h;
>>>> +        uint32_t pfit_size = config->pch_pfit.size;
>>>> +        uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
>>>> +        uint_fixed_16_16_t downscale_h, downscale_w;
>>>> +
>>>> +        src_w = config->pipe_src_w;
>>>> +        src_h = config->pipe_src_h;
>>>> +        dst_w = pfit_size >> 16;
>>>> +        dst_h = pfit_size & 0xffff;
>>>> +
>>>> +        if (!dst_w || !dst_h)
>>>> +            return pipe_downscale;
>>>> +
>>>> +        fp_w_ratio = fixed_16_16_div(src_w, dst_w);
>>>> +        fp_h_ratio = fixed_16_16_div(src_h, dst_h);
>>>> +        downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
>>>> +        downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
>>>> +
>>>> +        pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
>>>> +    }
>>>> +
>>>> +    return pipe_downscale;
>>>> +}
>>>> +
>>>> +int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
>>>> +                  struct intel_crtc_state *cstate)
>>>> +{
>>>> +    struct drm_crtc_state *crtc_state = &cstate->base;
>>>> +    struct drm_atomic_state *state = crtc_state->state;
>>>> +    struct drm_plane *plane;
>>>> +    const struct drm_plane_state *pstate;
>>>> +    struct intel_plane_state *intel_pstate;
>>>> +    int crtc_clock, cdclk;
>>>> +    uint32_t pipe_max_pixel_rate;
>>>> +    uint_fixed_16_16_t pipe_downscale;
>>>> +    uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
>>>> +
>>>> +    if (cstate->base.active)
>>>> +        return 0;
>>> ^This check looks buggy, are there any tests?
>>>
>>> In general we try to do the same for !active as we do with active, so please to remove any !active checks altogether..
>> Thanks for catching this, yes it's a bug, during my testing this check was not there but I added this as an extra precaution & screwed-up :P
>>>> +    drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
>>>> +        uint_fixed_16_16_t plane_downscale;
>>>> +        uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
>>>> +        int bpp;
>>>> +
>>>> +        if (!pstate->visible)
>>>> +            continue;
>>> Best use intel_wm_plane_visible here.
>> will do,
>>
>> -Mahesh
>>>> +
>>>> +        if (WARN_ON(!pstate->fb))
>>>> +            continue;
>>>> +
>>>> +        intel_pstate = to_intel_plane_state(pstate);
>>>> +        plane_downscale = skl_plane_downscale_amount(cstate,
>>>> +                                 intel_pstate);
>>>> +        bpp = pstate->fb->format->cpp[0] * 8;
>>>> +        if (bpp == 64)
>>>> +            plane_downscale = mul_fixed_16_16(plane_downscale,
>>>> +                              fp_9_div_8);
>>>> +
>>>> +        max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
>>>> +    }
>>>> +    pipe_downscale = skl_pipe_downscale_amount(cstate);
>>>> +
>>>> +    pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
>>>> +
>>>> +    crtc_clock = crtc_state->adjusted_mode.crtc_clock;
>>>> +    cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
>>>> +    pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
>>>> +                               pipe_downscale);
>>>> +
>>>> +    if (pipe_max_pixel_rate < crtc_clock) {
>>>> +        DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
>>>> +        return -ERANGE;
> Oops, please also only return -EINVAL, definitely not ERANGE.
Requested scaling here is out of supported range that's why used 
-ERANGE, should I still change it to -EINVAL?

-Mahesh
>
> ~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v4 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling
  2017-05-11  9:48       ` Maarten Lankhorst
  2017-05-11 10:59         ` Mahesh Kumar
@ 2017-05-11 13:05         ` Mahesh Kumar
  1 sibling, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-11 13:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

A display resolution is only supported if it meets all the restrictions
below for Maximum Pipe Pixel Rate.

The display resolution must fit within the maximum pixel rate output
from the pipe. Make sure that the display pipe is able to feed pixels at
a rate required to support the desired resolution.
For each enabled plane on the pipe {
    If plane scaling enabled {
	Horizontal down scale amount = Maximum[1, plane horizontal size /
		    scaler horizontal window size]
	Vertical down scale amount = Maximum[1, plane vertical size /
		    scaler vertical window size]
	Plane down scale amount = Horizontal down scale amount *
		    Vertical down scale amount
	Plane Ratio = 1 / Plane down scale amount
    }
    Else {
	Plane Ratio = 1
    }
    If plane source pixel format is 64 bits per pixel {
	Plane Ratio = Plane Ratio * 8/9
    }
}

Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe

If pipe scaling is enabled {
    Horizontal down scale amount = Maximum[1, pipe horizontal source size /
		scaler horizontal window size]
    Vertical down scale amount = Maximum[1, pipe vertical source size /
		scaler vertical window size]
    Note: The progressive fetch - interlace display mode is equivalent to a
		2.0 vertical down scale
    Pipe down scale amount = Horizontal down scale amount *
		Vertical down scale amount
    Pipe Ratio = Pipe Ratio / Pipe down scale amount
}

Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio

Changes since V1:
 - separate out fixed_16_16 wrapper API definition
Changes since V2:
 - Fix buggy crtc !active condition (Maarten)
 - use intel_wm_plane_visible wrapper as per Maarten's suggestion
Changes since V3:
 - Change failure return from ERANGE to EINVAL

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c |  3 ++
 drivers/gpu/drm/i915/intel_drv.h     |  2 +
 drivers/gpu/drm/i915/intel_pm.c      | 88 ++++++++++++++++++++++++++++++++++++
 3 files changed, 93 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85b9e2f521a0..d64367e810f8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10992,6 +10992,9 @@ static int intel_crtc_atomic_check(struct drm_crtc *crtc,
 			ret = skl_update_scaler_crtc(pipe_config);
 
 		if (!ret)
+			ret = skl_check_pipe_max_pixel_rate(intel_crtc,
+							    pipe_config);
+		if (!ret)
 			ret = intel_atomic_setup_scalers(dev_priv, intel_crtc,
 							 pipe_config);
 	}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 54f3ff840812..8323fc2ec4f2 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1859,6 +1859,8 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 				 int ignore);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 041cbca7105e..4f94dd6be8ed 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3397,6 +3397,94 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	return mul_fixed_16_16(downscale_w, downscale_h);
 }
 
+static uint_fixed_16_16_t
+skl_pipe_downscale_amount(const struct intel_crtc_state *config)
+{
+	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
+
+	if (!config->base.active)
+		return pipe_downscale;
+
+	if (config->pch_pfit.enabled) {
+		uint32_t src_w, src_h, dst_w, dst_h;
+		uint32_t pfit_size = config->pch_pfit.size;
+		uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+		uint_fixed_16_16_t downscale_h, downscale_w;
+
+		src_w = config->pipe_src_w;
+		src_h = config->pipe_src_h;
+		dst_w = pfit_size >> 16;
+		dst_h = pfit_size & 0xffff;
+
+		if (!dst_w || !dst_h)
+			return pipe_downscale;
+
+		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
+		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
+		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
+		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+
+		pipe_downscale = mul_fixed_16_16(downscale_w, downscale_h);
+	}
+
+	return pipe_downscale;
+}
+
+int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
+				  struct intel_crtc_state *cstate)
+{
+	struct drm_crtc_state *crtc_state = &cstate->base;
+	struct drm_atomic_state *state = crtc_state->state;
+	struct drm_plane *plane;
+	const struct drm_plane_state *pstate;
+	struct intel_plane_state *intel_pstate;
+	int crtc_clock, cdclk;
+	uint32_t pipe_max_pixel_rate;
+	uint_fixed_16_16_t pipe_downscale;
+	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
+
+	if (!cstate->base.active)
+		return 0;
+
+	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
+		uint_fixed_16_16_t plane_downscale;
+		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
+		int bpp;
+
+		if (!intel_wm_plane_visible(cstate,
+					    to_intel_plane_state(pstate)))
+			continue;
+
+		if (WARN_ON(!pstate->fb))
+			continue;
+
+		intel_pstate = to_intel_plane_state(pstate);
+		plane_downscale = skl_plane_downscale_amount(cstate,
+							     intel_pstate);
+		bpp = pstate->fb->format->cpp[0] * 8;
+		if (bpp == 64)
+			plane_downscale = mul_fixed_16_16(plane_downscale,
+							  fp_9_div_8);
+
+		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
+	}
+	pipe_downscale = skl_pipe_downscale_amount(cstate);
+
+	pipe_downscale = mul_fixed_16_16(pipe_downscale, max_downscale);
+
+	crtc_clock = crtc_state->adjusted_mode.crtc_clock;
+	cdclk = to_intel_atomic_state(state)->cdclk.logical.cdclk;
+	pipe_max_pixel_rate = fixed_16_16_div_round_up_u64(cdclk,
+							   pipe_downscale);
+
+	if (pipe_max_pixel_rate < crtc_clock) {
+		DRM_ERROR("Max supported pixel clock with scaling exceeds\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static unsigned int
 skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
 			     const struct drm_plane_state *pstate,
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev6)
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (13 preceding siblings ...)
  2017-05-11  9:35 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev5) Patchwork
@ 2017-05-11 14:11 ` Patchwork
  2017-05-12  0:21 ` [PATCH 00/11] Implement DDB algorithm and WM cleanup Matt Roper
  15 siblings, 0 replies; 43+ messages in thread
From: Patchwork @ 2017-05-11 14:11 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: Implement DDB algorithm and WM cleanup (rev6)
URL   : https://patchwork.freedesktop.org/series/20376/
State : success

== Summary ==

Series 20376v6 Implement DDB algorithm and WM cleanup
https://patchwork.freedesktop.org/api/1.0/series/20376/revisions/6/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:440s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:434s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:587s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:515s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:495s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:484s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:418s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:410s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:416s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:494s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:482s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:462s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:573s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:453s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:575s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:475s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:501s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:437s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:537s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:416s
fi-bxt-t5700 failed to collect. IGT log at Patchwork_4675/fi-bxt-t5700/igt.log

51350ea90d1a5c8bb7e6532eb61c64a57e83ab9b drm-tip: 2017y-05m-11d-13h-05m-06s UTC integration manifest
97bac45 drm/i915/skl+: consider max supported plane pixel rate while scaling
ce02568 drm/i915/skl: New ddb allocation algorithm
1ab634e drm/i915/skl+: use linetime latency if ddb size is not available
d2a7155 drm/i915/skl+: Watermark calculation cleanup
a603680 drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
42ba052 drm/i915/skl+: no need to memset again
1157f1d drm/i915/skl: Fail the flip if no FB for WM calculation
877435c drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
002cc23 drm/i915: Use fixed_16_16 wrapper for division operation
71b458f drm/i915: Add more wrapper for fixed_point_16_16 operations
d7c17ff drm/i915: fix naming of fixed_16_16 wrapper.

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4675/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/11] Implement DDB algorithm and WM cleanup
  2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
                   ` (14 preceding siblings ...)
  2017-05-11 14:11 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev6) Patchwork
@ 2017-05-12  0:21 ` Matt Roper
  2017-05-12  8:25   ` Mahesh Kumar
  15 siblings, 1 reply; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:21 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:51PM +0530, Mahesh Kumar wrote:
> This series implements new DDB allocation algorithm to solve the cases,
> where we have sufficient DDB available to enable multiple planes, But
> due to the current algorithm not dividing it properly among planes, we
> end-up failing the flip.
> It also takes care of enabling same watermark level for each
> plane, for efficient power saving.
> Series also fixes/cleans-up few bug in present code.
> 
> There are two steps in current WM programming.
> 
> 1. Calculate minimum number of blocks required  for a WM level to be
> enabled. For 1440x2560 panel we need 41 blocks as minimum number of
> blocks to enable WM0. This is the step which doesn't use vertical size.
> It only depends on Pipe drain rate and plane horizontal size as per the
> current Bspec algorithm.
> So all the plane below have minimum  number of blocks required to enable
> WM0 as 41
>     Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
>     Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
>     Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
>     Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41
> 
> 2. Number of blocks allotted by the driver
>     Driver allocates  12 for Plane 3   &  16 for plane 4
> 
>     Total Dbuf Available = 508
>     Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476

Given the dbuf size of 508, I assume this example is for Broxton
hardware, right?  In that case, you wouldn't actually be able to use the
cursor plane since Plane 4 (1440x96) is mutually exclusive with the
cursor, so there wouldn't be a need to reserve these 32 blocks.   I
guess there's also the issue that the upstream driver can't actually
expose/use Plane 4 at all today.

That said, your overall example here still gets the important points
across and is very much appreciated.


>     allocate minimum blocks for each plane 8 * 4 = 32
>     remaining blocks = 476 - 32 = 444
>     Relative Data Rate for Planes
>        Plane 1  =  1440 * 2560 * 3  =  11059200
>        Plane 2  =  1440 * 2560 * 3  =  11059200
>        Plane 3  =  1440 * 48   * 3  =  207360
>        Plane 4  =  1440 * 96   * 3  =  414720
>        Total Relative BW            =  22740480
> 
> -   Allocate Buffer
>     buffer allocation = (Plane relative data rate / total data rate)
> 		    * total remaming DDB + minimum plane DDB
>      Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
>      Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
>      Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
>      Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16
> 
> In this case it forced driver to disable Plane 3 & 4. Driver need to use
> more efficient way to allocate buffer that is optimum for power.
> 
> New Algorithm suggested by HW team is:
> 
> 1. Calculate minimum buffer allocations for each plane and for each
>     watermark level
> 
> 2. Add minimum buffer allocations required for enabling WM7
>     for all the planes
> 
> Level 0 =  41 + 41 + 41 + 41  = 164
> Level 1 =  42 + 42 + 42 + 42  = 168
> Level 2 =  42 + 42 + 42 + 42  = 168
> Level 3 =  94 + 94 + 94 + 94 =  376
> Level 4 =  94 + 94 + 94 + 94 =  376
> Level 5 =  94 + 94 + 94 + 94 =  376
> Level 6 =  94 + 94 + 94 + 94 =  376
> Level 7 =  94 + 94 + 94 + 94 =  376
> 
> 3. Check to see how many buffer allocation are left and enable
> the best case. In this case since we have 476 blocks we can enable
> WM0-7 on all 4 planes.
> Let's say if we have only 200 block available then the best cases
> allocation is to enable Level2 which requires 168 blocks

It's probably worth noting that the use cases that are most likely to
benefit from this are those with large differences in the height of the
'shortest' plane vs the height of the 'tallest' plane.  It's the
blind proportional distribution of remaining blocks in the current
algorithm that prevents 'short' planes from reaching their minimum block
requirements for various watermark levels (and if they can't even reach
the WM0 minimum, then the plane can't be used at all).

There will certainly still be cases where the overall display
configuration (with lots of pipes and planes in use) simply requires
more blocks than the hardware has to even reach WM0, no matter how we
slice up the limited DDB size, but the changes here will definitely help
prevent us from rejecting atomic commits for some configurations we
actually could handle.


Matt

> 
> Mahesh Kumar (11):
>   drm/i915: fix naming of fixed_16_16 wrapper.
>   drm/i915: Add more wrapper for fixed_point_16_16 operations
>   drm/i915: Use fixed_16_16 wrapper for division operation
>   drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed
>     point
>   drm/i915/skl: Fail the flip if no FB for WM calculation
>   drm/i915/skl+: no need to memset again
>   drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe
>     allocation
>   drm/i915/skl+: Watermark calculation cleanup
>   drm/i915/skl+: use linetime latency if ddb size is not available
>   drm/i915/skl: New ddb allocation algorithm
>   drm/i915/skl+: consider max supported plane pixel rate while scaling
> 
>  drivers/gpu/drm/i915/i915_drv.h      |  56 +++-
>  drivers/gpu/drm/i915/intel_display.c |   3 +
>  drivers/gpu/drm/i915/intel_drv.h     |   2 +
>  drivers/gpu/drm/i915/intel_pm.c      | 520 +++++++++++++++++++++++------------
>  4 files changed, 395 insertions(+), 186 deletions(-)
> 
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/11] drm/i915: fix naming of fixed_16_16 wrapper.
  2017-05-08 11:48 ` [PATCH 01/11] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
@ 2017-05-12  0:21   ` Matt Roper
  0 siblings, 0 replies; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:21 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:52PM +0530, Mahesh Kumar wrote:
> fixed_16_16_div_round_up(_u64), wrapper for fixed_16_16 division
> operation don't really round_up the result. Wrapper round_up only the
> fraction part of the result to make it 16-bit.
> This patch eliminates round_up keyword from the wrapper.
> 
> Later patch will introduce the new wrapper to do rounding-off the result
> and give unt32_t output to cleanup mix use of fixed_16_16_t & uint32_t
> variables.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

Yeah, the naming here is confusing.  You're not rounding up to the next
integer value (which is typically what "round up" refers to), but you
are rounding the fractional part up rather than taking a floor.  Do we
actually need to DIV_ROUND_UP the fractional part in the implementation
of the function?  If we can just use a standard '/' there, it will more
closely match the new function name since since there won't be any round
happening.

Not that it really matters much either way.  With or without changes,

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>


> ---
>  drivers/gpu/drm/i915/i915_drv.h | 6 ++----
>  drivers/gpu/drm/i915/intel_pm.c | 6 +++---
>  2 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b20ed16da0ad..5804734d30a3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -152,8 +152,7 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
>  	return max;
>  }
>  
> -static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
> -							  uint32_t d)
> +static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
>  {
>  	uint_fixed_16_16_t fp, res;
>  
> @@ -162,8 +161,7 @@ static inline uint_fixed_16_16_t fixed_16_16_div_round_up(uint32_t val,
>  	return res;
>  }
>  
> -static inline uint_fixed_16_16_t fixed_16_16_div_round_up_u64(uint32_t val,
> -							      uint32_t d)
> +static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
>  {
>  	uint_fixed_16_16_t res;
>  	uint64_t interm_val;
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index cacb65fa2dd5..91f09dfdb618 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3698,7 +3698,7 @@ static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
>  		return FP_16_16_MAX;
>  
>  	wm_intermediate_val = latency * pixel_rate * cpp;
> -	ret = fixed_16_16_div_round_up_u64(wm_intermediate_val, 1000 * 512);
> +	ret = fixed_16_16_div_u64(wm_intermediate_val, 1000 * 512);
>  	return ret;
>  }
>  
> @@ -3834,8 +3834,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	if (y_tiled) {
>  		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
>  					   y_min_scanlines, 512);
> -		plane_blocks_per_line =
> -		      fixed_16_16_div_round_up(interm_pbpl, y_min_scanlines);
> +		plane_blocks_per_line = fixed_16_16_div(interm_pbpl,
> +							y_min_scanlines);
>  	} else if (x_tiled) {
>  		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
>  		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations
  2017-05-08 11:48 ` [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
  2017-05-10 12:37   ` Maarten Lankhorst
@ 2017-05-12  0:22   ` Matt Roper
  2017-05-12  8:55     ` Mahesh Kumar
  1 sibling, 1 reply; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:22 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:53PM +0530, Mahesh Kumar wrote:
> This patch adds few wrapper to perform fixed_point_16_16 operations
> mul_u32_fixed_16_16_round_up : Multiplies u32 and fixed_16_16_t
> 				variables & returns u32 result with
> 				rounding-off.
> mul_fixed_16_16 : Multiplies two fixed_16_16_t variable & returns
> 				fixed_16_16
> div_round_up_fixed_16_16 : Perform division operation on fixed_16_16_t
> 				variables & return u32 result with round-off
> fixed_16_16_div_round_up_u64 : devide uint32_t variable by fixed_16_16
> 				variable and round_up the result to uint32_t.

Minor nitpick, but I'd say "rounding-up" rather than "rounding-off" in
your descriptions here.

> 
> These wrappers will be used by later patches in the series.

The name on this last one (fixed_16_16_div_round_up_u64) doesn't seem to
match what it does.  You're using u64 internally, but the actual
operands are a u32 and a fixed point value.  Wouldn't it make more sense
to call it div_round_up_u32_fixed_16_16 (that also keeps the dividend
first and the divisor second, which seems more natural to me).

Actually the naming of all the fixed point math functions is a bit
confusing/inconsistent.  Some of them are "op_type[_type][_round_up],"
some of them are "op[_round_up]_type," some of them are
"type_op[_round_up]," and some of them are "type_op[_round_up]_type."
Also, none of them specify the return value type, but I guess that's not
too much of a problem since the use of a fixed point structure will
ensure the compiler warns if someone tries to use them assuming the
wrong kind of result.

Maybe we could standardize the names a bit to help avoid confusion?  I'd
suggest "op[_round_up]_type1_type2."  If both types are the same, we'd
still repeat the type for clarity, but maybe we could just use "fixed16"
or "fix16" to keep the overall names from getting too long.  And for
division we'd always keep the dividend as the first type, divisor as
second.

E.g.,
        mul_round_up_u32_fixed16()
        div_round_up_u32_fixed16()
        mul_fixed16_fixed16()
        div_round_up_fixed16_fixed16()

And presumably the existing fixed point operations could be renamed in
the same manner.


Matt

> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 43 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5804734d30a3..5ff0091707c0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -152,6 +152,38 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
>  	return max;
>  }
>  
> +static inline uint32_t div_round_up_fixed_16_16(uint_fixed_16_16_t val,
> +						uint_fixed_16_16_t d)
> +{
> +	return DIV_ROUND_UP(val.val, d.val);
> +}
> +
> +static inline uint32_t mul_u32_fixed_16_16_round_up(uint32_t val,
> +						    uint_fixed_16_16_t mul)
> +{
> +       uint64_t intermediate_val;
> +       uint32_t result;
> +
> +       intermediate_val = (uint64_t) val * mul.val;
> +       intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
> +       WARN_ON(intermediate_val >> 32);
> +       result = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +       return result;
> +}
> +
> +static inline uint_fixed_16_16_t mul_fixed_16_16(uint_fixed_16_16_t val,
> +						 uint_fixed_16_16_t mul)
> +{
> +	uint64_t intermediate_val;
> +	uint_fixed_16_16_t fp;
> +
> +	intermediate_val = (uint64_t) val.val * mul.val;
> +	intermediate_val = intermediate_val >> 16;
> +	WARN_ON(intermediate_val >> 32);
> +	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
> +	return fp;
> +}
> +
>  static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
>  {
>  	uint_fixed_16_16_t fp, res;
> @@ -174,6 +206,17 @@ static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
>  	return res;
>  }
>  
> +static inline uint32_t fixed_16_16_div_round_up_u64(uint32_t val,
> +						    uint_fixed_16_16_t d)
> +{
> +	uint64_t interm_val;
> +
> +	interm_val = (uint64_t)val << 16;
> +	interm_val = DIV_ROUND_UP_ULL(interm_val, d.val);
> +	WARN_ON(interm_val >> 32);
> +	return clamp_t(uint32_t, interm_val, 0, ~0);
> +}
> +
>  static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
>  						     uint_fixed_16_16_t mul)
>  {
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 03/11] drm/i915: Use fixed_16_16 wrapper for division operation
  2017-05-08 11:48 ` [PATCH 03/11] drm/i915: Use fixed_16_16 wrapper for division operation Mahesh Kumar
@ 2017-05-12  0:22   ` Matt Roper
  0 siblings, 0 replies; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:22 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:54PM +0530, Mahesh Kumar wrote:
> Don't use fixed_16_16 structure members directly, instead use wrapper to
> perform fixed_16_16 division operation.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_pm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 91f09dfdb618..66b542ba47ad 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3867,8 +3867,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	}
>  
>  	res_blocks = fixed_16_16_to_u32_round_up(selected_result) + 1;
> -	res_lines = DIV_ROUND_UP(selected_result.val,
> -				 plane_blocks_per_line.val);
> +	res_lines = div_round_up_fixed_16_16(selected_result,
> +					     plane_blocks_per_line);
>  
>  	if (level >= 1 && level <= 7) {
>  		if (y_tiled) {
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 04/11] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point
  2017-05-08 11:48 ` [PATCH 04/11] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
@ 2017-05-12  0:22   ` Matt Roper
  0 siblings, 0 replies; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:22 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:55PM +0530, Mahesh Kumar wrote:
> This patch make changes to calculate adjusted plane pixel rate &
> plane downscale amount using fixed_point functions available.
> This patch will give uniformity in code, & will help to avoid mixing of
> 32bit uint32_t variable for fixed-16.16 with fixed_16_16_t variables in
> later patch in the series.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 37 +++++++++++++++++++------------------
>  1 file changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 66b542ba47ad..6414701ef09e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3360,26 +3360,27 @@ void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
>   * Return value is provided in 16.16 fixed point form to retain fractional part.
>   * Caller should take care of dividing & rounding off the value.
>   */
> -static uint32_t
> +static uint_fixed_16_16_t
>  skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>  			   const struct intel_plane_state *pstate)
>  {
>  	struct intel_plane *plane = to_intel_plane(pstate->base.plane);
> -	uint32_t downscale_h, downscale_w;
>  	uint32_t src_w, src_h, dst_w, dst_h;
> +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +	uint_fixed_16_16_t downscale_h, downscale_w;
>  
>  	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
> -		return DRM_PLANE_HELPER_NO_SCALING;
> +		return u32_to_fixed_16_16(0);

I don't think this really changes anything, right?  Both of the places
that call this function have already bailed out and returned 0 data rate
if the plane is invisible.  Not that it hurts anything either.

>  
>  	/* n.b., src is 16.16 fixed point, dst is whole integer */
>  	if (plane->id == PLANE_CURSOR) {
> -		src_w = pstate->base.src_w;
> -		src_h = pstate->base.src_h;
> +		src_w = pstate->base.src_w >> 16;
> +		src_h = pstate->base.src_h >> 16;
>  		dst_w = pstate->base.crtc_w;
>  		dst_h = pstate->base.crtc_h;
>  	} else {
> -		src_w = drm_rect_width(&pstate->base.src);
> -		src_h = drm_rect_height(&pstate->base.src);
> +		src_w = drm_rect_width(&pstate->base.src) >> 16;
> +		src_h = drm_rect_height(&pstate->base.src) >> 16;
>  		dst_w = drm_rect_width(&pstate->base.dst);
>  		dst_h = drm_rect_height(&pstate->base.dst);
>  	}
> @@ -3387,11 +3388,13 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
>  	if (drm_rotation_90_or_270(pstate->base.rotation))
>  		swap(dst_w, dst_h);
>  
> -	downscale_h = max(src_h / dst_h, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
> -	downscale_w = max(src_w / dst_w, (uint32_t)DRM_PLANE_HELPER_NO_SCALING);
> +	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
> +	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
> +	downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
> +	downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
>  
>  	/* Provide result in 16.16 fixed point */
> -	return (uint64_t)downscale_w * downscale_h >> 16;
> +	return mul_fixed_16_16(downscale_w, downscale_h);

I think we can drop the comment here now; it was originally added to
remind why we were doing the extra shift at return, but now it's pretty
obvious at this point that everything here is dealing with fixed point.

With or without the comment dropped,

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

>  }
>  
>  static unsigned int
> @@ -3401,10 +3404,11 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
>  {
>  	struct intel_plane *plane = to_intel_plane(pstate->plane);
>  	struct intel_plane_state *intel_pstate = to_intel_plane_state(pstate);
> -	uint32_t down_scale_amount, data_rate;
> +	uint32_t data_rate;
>  	uint32_t width = 0, height = 0;
>  	struct drm_framebuffer *fb;
>  	u32 format;
> +	uint_fixed_16_16_t down_scale_amount;
>  
>  	if (!intel_pstate->base.visible)
>  		return 0;
> @@ -3438,7 +3442,7 @@ skl_plane_relative_data_rate(const struct intel_crtc_state *cstate,
>  
>  	down_scale_amount = skl_plane_downscale_amount(cstate, intel_pstate);
>  
> -	return (uint64_t)data_rate * down_scale_amount >> 16;
> +	return mul_u32_fixed_16_16_round_up(data_rate, down_scale_amount);
>  }
>  
>  /*
> @@ -3724,8 +3728,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  					      struct intel_plane_state *pstate)
>  {
>  	uint64_t adjusted_pixel_rate;
> -	uint64_t downscale_amount;
> -	uint64_t pixel_rate;
> +	uint_fixed_16_16_t downscale_amount;
>  
>  	/* Shouldn't reach here on disabled planes... */
>  	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
> @@ -3738,10 +3741,8 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  	adjusted_pixel_rate = cstate->pixel_rate;
>  	downscale_amount = skl_plane_downscale_amount(cstate, pstate);
>  
> -	pixel_rate = adjusted_pixel_rate * downscale_amount >> 16;
> -	WARN_ON(pixel_rate != clamp_t(uint32_t, pixel_rate, 0, ~0));
> -
> -	return pixel_rate;
> +	return mul_u32_fixed_16_16_round_up(adjusted_pixel_rate,
> +					    downscale_amount);
>  }
>  
>  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation
  2017-05-08 12:01     ` Mahesh Kumar
@ 2017-05-12  0:22       ` Matt Roper
  0 siblings, 0 replies; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:22 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, Zanoni, Paulo R, Lankhorst, Maarten

On Mon, May 08, 2017 at 05:31:30PM +0530, Mahesh Kumar wrote:
> Hi,
> 
> 
> On Monday 08 May 2017 05:18 PM, Lankhorst, Maarten wrote:
> > Mahesh Kumar schreef op ma 08-05-2017 om 17:18 [+0530]:
> > > Fail the flip if no FB is present but plane_state is set as visible.
> > > Above is not a valid combination so instead of continue fail the
> > > flip.
> > Why is this patch necessary? drm_atomic_plane_check handles this.
> Ideally we should never get such combination here. But current WM code
> checks for this situation and even if it's true it proceeds further. This
> patch just corrects the WM code flow decision.
> I also think some of these checks are redundant here.

Yeah, WARN()'s are basically "I'm sure this could never happen" type
assertions.  Of course sometimes we restructure parts of the code and
forget about assumptions we've made elsewhere (or we just plain screw up
and add new bugs to existing code), so we wind up hitting the WARN()'s
anyway.  If proceeding on here could lead to a panic (which I think it
could since we dereference the fb in skl_compute_plane_wm()), then
adding a sensible bail-out here seems okay to me; the extra paranoia
only costs one extra line of code.


Matt

> 
> -Mahesh
> 
> > 
> > ~Maarten
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 06/11] drm/i915/skl+: no need to memset again
  2017-05-08 11:48 ` [PATCH 06/11] drm/i915/skl+: no need to memset again Mahesh Kumar
@ 2017-05-12  0:23   ` Matt Roper
  0 siblings, 0 replies; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:23 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:57PM +0530, Mahesh Kumar wrote:
> We are already doing memset of ddb structure at the begining of skl_allocate_pipe_ddb
> function, No need to again do a memset.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> ---
>  drivers/gpu/drm/i915/intel_pm.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 1d7c67d7e539..2a4e9d89cd6f 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3606,10 +3606,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  
>  	skl_ddb_get_pipe_allocation_limits(dev, cstate, alloc, &num_active);
>  	alloc_size = skl_ddb_entry_size(alloc);
> -	if (alloc_size == 0) {
> -		memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> +	if (alloc_size == 0)
>  		return 0;
> -	}
>  
>  	skl_ddb_calc_min(cstate, num_active, minimum, y_minimum);
>  
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-08 11:48 ` [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
  2017-05-08 14:12   ` Ander Conselvan De Oliveira
@ 2017-05-12  0:23   ` Matt Roper
  2017-05-12 13:44     ` Mahesh Kumar
  1 sibling, 1 reply; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:23 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:58PM +0530, Mahesh Kumar wrote:
> DDB minimum requirement may exceed the allocated DDB for CRTC/Pipe. This
> patch make changes to fail the flip if minimum requirement for pipe
> exceeds the total ddb allocated to the pipe.
> Previously it succeeded but making alloc_size a negative value. Which
> will make later calculations for plane ddb allocation bogus & may lead
> to screen corruption or system hang.
> 

I think originally the bspec defined the initial minimum DDB allocation
for each plane to just be '8' in all cases.  So even with 4 planes on a
pipe, your initial pipe minimum would only be 32, so the code here never
had a risk of overflowing (note that the initial minimum is different
than the WM0 minimum requirement we ultimately wind up calculating).
But then the bspec was updated to suggest a higher initial minimum for
y-tiled buffers (calculated based on the width of the buffer), so our
initial assumptions here don't really hold anymore.  We should still be
safe if we're not using y-tile, but even a single 4k y-tiled plane would
have an initial minimum of something like 250 blocks iirc., so we can't
assume we're always safe anymore.


> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2a4e9d89cd6f..0ace94d67432 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3591,6 +3591,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	int num_active;
>  	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>  	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
> +	uint16_t total_min_blocks = 0;
>  
>  	/* Clear the partitioning for disabled planes. */
>  	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> @@ -3618,10 +3619,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	 */
>  
>  	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
> -		alloc_size -= minimum[plane_id];
> -		alloc_size -= y_minimum[plane_id];
> +		total_min_blocks += minimum[plane_id];
> +		total_min_blocks += y_minimum[plane_id];
>  	}
>  
> +	if ((total_min_blocks > alloc_size)) {
> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
> +		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
> +							alloc_size);
> +		return-EINVAL;

I think you're missing a space here?

With that fixed,

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

> +	}
> +
> +	alloc_size -= total_min_blocks;
>  	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>  	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>  
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/11] drm/i915/skl+: Watermark calculation cleanup
  2017-05-08 11:48 ` [PATCH 08/11] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
@ 2017-05-12  0:23   ` Matt Roper
  2017-05-12 13:47     ` Mahesh Kumar
  0 siblings, 1 reply; 43+ messages in thread
From: Matt Roper @ 2017-05-12  0:23 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:18:59PM +0530, Mahesh Kumar wrote:
> This patch cleanup/reorganises the watermark calculation functions.
> This patch also make use of already available macro
> "drm_atomic_crtc_state_for_each_plane_state" to walk through
> plane_state list instead of calculating plane_state in function itself.
> Now we iterate over WM levels in skl_compute_wm_level function instead
> of "skl_build_pipe_wm" function.

I'd split the minor cleanup (const-ifying,
drm_atomic_crtc_state_for_each_plane_state usage, etc.) into a separate
patch from the code flow change (looping over levels in
skl_compute_wm_level instead of skl_build_pipe_wm).  You'll probably
also want to rename the function to skl_compute_wm_level*s* (plural) if
its going to calculate all levels now instead of just one.


Matt

> 
> This restructuring will help later patch for new DDB allocation
> algorithm to do only algo related changes.
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 89 +++++++++++++++++------------------------
>  1 file changed, 37 insertions(+), 52 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 0ace94d67432..9d0225862a7e 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3732,7 +3732,7 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
>  }
>  
>  static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
> -					      struct intel_plane_state *pstate)
> +					      const struct intel_plane_state *pstate)
>  {
>  	uint64_t adjusted_pixel_rate;
>  	uint_fixed_16_16_t downscale_amount;
> @@ -3754,7 +3754,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  
>  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  				struct intel_crtc_state *cstate,
> -				struct intel_plane_state *intel_pstate,
> +				const struct intel_plane_state *intel_pstate,
>  				uint16_t ddb_allocation,
>  				int level,
>  				uint16_t *out_blocks, /* out */
> @@ -3762,8 +3762,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  				bool *enabled /* out */)
>  {
>  	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
> -	struct drm_plane_state *pstate = &intel_pstate->base;
> -	struct drm_framebuffer *fb = pstate->fb;
> +	const struct drm_plane_state *pstate = &intel_pstate->base;
> +	const struct drm_framebuffer *fb = pstate->fb;
>  	uint32_t latency = dev_priv->wm.skl_latency[level];
>  	uint_fixed_16_16_t method1, method2;
>  	uint_fixed_16_16_t plane_blocks_per_line;
> @@ -3918,52 +3918,36 @@ static int
>  skl_compute_wm_level(const struct drm_i915_private *dev_priv,
>  		     struct skl_ddb_allocation *ddb,
>  		     struct intel_crtc_state *cstate,
> -		     struct intel_plane *intel_plane,
> -		     int level,
> -		     struct skl_wm_level *result)
> +		     const struct intel_plane_state *intel_pstate,
> +		     struct skl_plane_wm *wm)
>  {
> -	struct drm_atomic_state *state = cstate->base.state;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
> -	struct drm_plane *plane = &intel_plane->base;
> -	struct intel_plane_state *intel_pstate = NULL;
> +	struct drm_plane *plane = intel_pstate->base.plane;
> +	struct intel_plane *intel_plane = to_intel_plane(plane);
>  	uint16_t ddb_blocks;
>  	enum pipe pipe = intel_crtc->pipe;
> +	int level, max_level = ilk_wm_max_level(dev_priv);
>  	int ret;
>  
> -	if (state)
> -		intel_pstate =
> -			intel_atomic_get_existing_plane_state(state,
> -							      intel_plane);
> -
> -	/*
> -	 * Note: If we start supporting multiple pending atomic commits against
> -	 * the same planes/CRTC's in the future, plane->state will no longer be
> -	 * the correct pre-state to use for the calculations here and we'll
> -	 * need to change where we get the 'unchanged' plane data from.
> -	 *
> -	 * For now this is fine because we only allow one queued commit against
> -	 * a CRTC.  Even if the plane isn't modified by this transaction and we
> -	 * don't have a plane lock, we still have the CRTC's lock, so we know
> -	 * that no other transactions are racing with us to update it.
> -	 */
> -	if (!intel_pstate)
> -		intel_pstate = to_intel_plane_state(plane->state);
> -
>  	if (WARN_ON(!intel_pstate->base.fb))
>  		return -EINVAL;
>  
>  	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
>  
> -	ret = skl_compute_plane_wm(dev_priv,
> -				   cstate,
> -				   intel_pstate,
> -				   ddb_blocks,
> -				   level,
> -				   &result->plane_res_b,
> -				   &result->plane_res_l,
> -				   &result->plane_en);
> -	if (ret)
> -		return ret;
> +	for (level = 0; level <= max_level; level++) {
> +		struct skl_wm_level *result = &wm->wm[level];
> +
> +		ret = skl_compute_plane_wm(dev_priv,
> +					   cstate,
> +					   intel_pstate,
> +					   ddb_blocks,
> +					   level,
> +					   &result->plane_res_b,
> +					   &result->plane_res_l,
> +					   &result->plane_en);
> +		if (ret)
> +			return ret;
> +	}
>  
>  	return 0;
>  }
> @@ -4009,10 +3993,11 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  			     struct skl_pipe_wm *pipe_wm)
>  {
>  	struct drm_device *dev = cstate->base.crtc->dev;
> +	struct drm_crtc_state *crtc_state = &cstate->base;
>  	const struct drm_i915_private *dev_priv = to_i915(dev);
> -	struct intel_plane *intel_plane;
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *pstate;
>  	struct skl_plane_wm *wm;
> -	int level, max_level = ilk_wm_max_level(dev_priv);
>  	int ret;
>  
>  	/*
> @@ -4021,18 +4006,18 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  	 */
>  	memset(pipe_wm->planes, 0, sizeof(pipe_wm->planes));
>  
> -	for_each_intel_plane_mask(&dev_priv->drm,
> -				  intel_plane,
> -				  cstate->base.plane_mask) {
> -		wm = &pipe_wm->planes[intel_plane->id];
> +	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
> +		const struct intel_plane_state *intel_pstate =
> +						to_intel_plane_state(pstate);
> +		enum plane_id plane_id = to_intel_plane(plane)->id;
> +
> +		wm = &pipe_wm->planes[plane_id];
> +
> +		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> +					   intel_pstate, wm);
> +		if (ret)
> +			return ret;
>  
> -		for (level = 0; level <= max_level; level++) {
> -			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> -						   intel_plane, level,
> -						   &wm->wm[level]);
> -			if (ret)
> -				return ret;
> -		}
>  		skl_compute_transition_wm(cstate, &wm->trans_wm);
>  	}
>  	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 00/11] Implement DDB algorithm and WM cleanup
  2017-05-12  0:21 ` [PATCH 00/11] Implement DDB algorithm and WM cleanup Matt Roper
@ 2017-05-12  8:25   ` Mahesh Kumar
  0 siblings, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-12  8:25 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

Hi Matt,

Thanks for review,


On Friday 12 May 2017 05:51 AM, Matt Roper wrote:
> On Mon, May 08, 2017 at 05:18:51PM +0530, Mahesh Kumar wrote:
>> This series implements new DDB allocation algorithm to solve the cases,
>> where we have sufficient DDB available to enable multiple planes, But
>> due to the current algorithm not dividing it properly among planes, we
>> end-up failing the flip.
>> It also takes care of enabling same watermark level for each
>> plane, for efficient power saving.
>> Series also fixes/cleans-up few bug in present code.
>>
>> There are two steps in current WM programming.
>>
>> 1. Calculate minimum number of blocks required  for a WM level to be
>> enabled. For 1440x2560 panel we need 41 blocks as minimum number of
>> blocks to enable WM0. This is the step which doesn't use vertical size.
>> It only depends on Pipe drain rate and plane horizontal size as per the
>> current Bspec algorithm.
>> So all the plane below have minimum  number of blocks required to enable
>> WM0 as 41
>>      Plane 1  - 1440x2560        -    Min blocks to enable WM0 = 41
>>      Plane 2  - 1440x2560        -    Min blocks to enable WM0 = 41
>>      Plane 3  - 1440x48          -    Min blocks to enable WM0 = 41
>>      Plane 4  - 1440x96          -    Min blocks to enable WM0 = 41
>>
>> 2. Number of blocks allotted by the driver
>>      Driver allocates  12 for Plane 3   &  16 for plane 4
>>
>>      Total Dbuf Available = 508
>>      Dbuf Available after 32 blocks for cursor = 508 - (32)  = 476
> Given the dbuf size of 508, I assume this example is for Broxton
> hardware, right?  In that case, you wouldn't actually be able to use the
> cursor plane since Plane 4 (1440x96) is mutually exclusive with the
> cursor, so there wouldn't be a need to reserve these 32 blocks.   I
> guess there's also the issue that the upstream driver can't actually
> expose/use Plane 4 at all today.
yes, this example is for Broxton. During this writeup only optimization 
of "not to use cursor plane instead use 4th plane" was there, but code 
was still allocating DDB for cursor.
True, upstream doesn't expose 4th plane, this was as per the local 
optimization for Broxton.

Regards,
-Mahesh
>
> That said, your overall example here still gets the important points
> across and is very much appreciated.
>
>
>>      allocate minimum blocks for each plane 8 * 4 = 32
>>      remaining blocks = 476 - 32 = 444
>>      Relative Data Rate for Planes
>>         Plane 1  =  1440 * 2560 * 3  =  11059200
>>         Plane 2  =  1440 * 2560 * 3  =  11059200
>>         Plane 3  =  1440 * 48   * 3  =  207360
>>         Plane 4  =  1440 * 96   * 3  =  414720
>>         Total Relative BW            =  22740480
>>
>> -   Allocate Buffer
>>      buffer allocation = (Plane relative data rate / total data rate)
>> 		    * total remaming DDB + minimum plane DDB
>>       Plane 1  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
>>       Plane 2  buffer allocation = (11059200 / 22740480) * 444 + 8 = 223
>>       Plane 3  buffer allocation = (207360   / 22740480) * 444 + 8 = 12
>>       Plane 4  buffer allocation = (414720   / 22740480) * 444 + 8 = 16
>>
>> In this case it forced driver to disable Plane 3 & 4. Driver need to use
>> more efficient way to allocate buffer that is optimum for power.
>>
>> New Algorithm suggested by HW team is:
>>
>> 1. Calculate minimum buffer allocations for each plane and for each
>>      watermark level
>>
>> 2. Add minimum buffer allocations required for enabling WM7
>>      for all the planes
>>
>> Level 0 =  41 + 41 + 41 + 41  = 164
>> Level 1 =  42 + 42 + 42 + 42  = 168
>> Level 2 =  42 + 42 + 42 + 42  = 168
>> Level 3 =  94 + 94 + 94 + 94 =  376
>> Level 4 =  94 + 94 + 94 + 94 =  376
>> Level 5 =  94 + 94 + 94 + 94 =  376
>> Level 6 =  94 + 94 + 94 + 94 =  376
>> Level 7 =  94 + 94 + 94 + 94 =  376
>>
>> 3. Check to see how many buffer allocation are left and enable
>> the best case. In this case since we have 476 blocks we can enable
>> WM0-7 on all 4 planes.
>> Let's say if we have only 200 block available then the best cases
>> allocation is to enable Level2 which requires 168 blocks
> It's probably worth noting that the use cases that are most likely to
> benefit from this are those with large differences in the height of the
> 'shortest' plane vs the height of the 'tallest' plane.  It's the
> blind proportional distribution of remaining blocks in the current
> algorithm that prevents 'short' planes from reaching their minimum block
> requirements for various watermark levels (and if they can't even reach
> the WM0 minimum, then the plane can't be used at all).
>
> There will certainly still be cases where the overall display
> configuration (with lots of pipes and planes in use) simply requires
> more blocks than the hardware has to even reach WM0, no matter how we
> slice up the limited DDB size, but the changes here will definitely help
> prevent us from rejecting atomic commits for some configurations we
> actually could handle.
>
> Matt
>
>> Mahesh Kumar (11):
>>    drm/i915: fix naming of fixed_16_16 wrapper.
>>    drm/i915: Add more wrapper for fixed_point_16_16 operations
>>    drm/i915: Use fixed_16_16 wrapper for division operation
>>    drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed
>>      point
>>    drm/i915/skl: Fail the flip if no FB for WM calculation
>>    drm/i915/skl+: no need to memset again
>>    drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe
>>      allocation
>>    drm/i915/skl+: Watermark calculation cleanup
>>    drm/i915/skl+: use linetime latency if ddb size is not available
>>    drm/i915/skl: New ddb allocation algorithm
>>    drm/i915/skl+: consider max supported plane pixel rate while scaling
>>
>>   drivers/gpu/drm/i915/i915_drv.h      |  56 +++-
>>   drivers/gpu/drm/i915/intel_display.c |   3 +
>>   drivers/gpu/drm/i915/intel_drv.h     |   2 +
>>   drivers/gpu/drm/i915/intel_pm.c      | 520 +++++++++++++++++++++++------------
>>   4 files changed, 395 insertions(+), 186 deletions(-)
>>
>> -- 
>> 2.11.0
>>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations
  2017-05-12  0:22   ` Matt Roper
@ 2017-05-12  8:55     ` Mahesh Kumar
  0 siblings, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-12  8:55 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

Hi,


On Friday 12 May 2017 05:52 AM, Matt Roper wrote:
> On Mon, May 08, 2017 at 05:18:53PM +0530, Mahesh Kumar wrote:
>> This patch adds few wrapper to perform fixed_point_16_16 operations
>> mul_u32_fixed_16_16_round_up : Multiplies u32 and fixed_16_16_t
>> 				variables & returns u32 result with
>> 				rounding-off.
>> mul_fixed_16_16 : Multiplies two fixed_16_16_t variable & returns
>> 				fixed_16_16
>> div_round_up_fixed_16_16 : Perform division operation on fixed_16_16_t
>> 				variables & return u32 result with round-off
>> fixed_16_16_div_round_up_u64 : devide uint32_t variable by fixed_16_16
>> 				variable and round_up the result to uint32_t.
> Minor nitpick, but I'd say "rounding-up" rather than "rounding-off" in
> your descriptions here.
Will update.
>
>> These wrappers will be used by later patches in the series.
> The name on this last one (fixed_16_16_div_round_up_u64) doesn't seem to
> match what it does.  You're using u64 internally, but the actual
> operands are a u32 and a fixed point value.  Wouldn't it make more sense
> to call it div_round_up_u32_fixed_16_16 (that also keeps the dividend
> first and the divisor second, which seems more natural to me).
>
> Actually the naming of all the fixed point math functions is a bit
> confusing/inconsistent.  Some of them are "op_type[_type][_round_up],"
> some of them are "op[_round_up]_type," some of them are
> "type_op[_round_up]," and some of them are "type_op[_round_up]_type."
> Also, none of them specify the return value type, but I guess that's not
> too much of a problem since the use of a fixed point structure will
> ensure the compiler warns if someone tries to use them assuming the
> wrong kind of result.
>
> Maybe we could standardize the names a bit to help avoid confusion?  I'd
> suggest "op[_round_up]_type1_type2."  If both types are the same, we'd
> still repeat the type for clarity, but maybe we could just use "fixed16"
> or "fix16" to keep the overall names from getting too long.  And for
> division we'd always keep the dividend as the first type, divisor as
> second.
>
> E.g.,
>          mul_round_up_u32_fixed16()
>          div_round_up_u32_fixed16()
>          mul_fixed16_fixed16()
> div_round_up_fixed16_fixed16()
will use name "mul_fixed16 / div_round_up_fixed16" (assuming same type 
for both the operand if only one type is there in function name), hope 
that is fine with you.
> And presumably the existing fixed point operations could be renamed in
> the same manner.
thanks for suggestion/review, It sounds logical & more consistent, will 
fix it,
In this patch will update only these 4 wrappers, will create a new patch 
at end of the series to address naming for other wrappers.

thanks,
-Mahesh

>
> Matt
>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h | 43 +++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 43 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 5804734d30a3..5ff0091707c0 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -152,6 +152,38 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
>>   	return max;
>>   }
>>   
>> +static inline uint32_t div_round_up_fixed_16_16(uint_fixed_16_16_t val,
>> +						uint_fixed_16_16_t d)
>> +{
>> +	return DIV_ROUND_UP(val.val, d.val);
>> +}
>> +
>> +static inline uint32_t mul_u32_fixed_16_16_round_up(uint32_t val,
>> +						    uint_fixed_16_16_t mul)
>> +{
>> +       uint64_t intermediate_val;
>> +       uint32_t result;
>> +
>> +       intermediate_val = (uint64_t) val * mul.val;
>> +       intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
>> +       WARN_ON(intermediate_val >> 32);
>> +       result = clamp_t(uint32_t, intermediate_val, 0, ~0);
>> +       return result;
>> +}
>> +
>> +static inline uint_fixed_16_16_t mul_fixed_16_16(uint_fixed_16_16_t val,
>> +						 uint_fixed_16_16_t mul)
>> +{
>> +	uint64_t intermediate_val;
>> +	uint_fixed_16_16_t fp;
>> +
>> +	intermediate_val = (uint64_t) val.val * mul.val;
>> +	intermediate_val = intermediate_val >> 16;
>> +	WARN_ON(intermediate_val >> 32);
>> +	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
>> +	return fp;
>> +}
>> +
>>   static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
>>   {
>>   	uint_fixed_16_16_t fp, res;
>> @@ -174,6 +206,17 @@ static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
>>   	return res;
>>   }
>>   
>> +static inline uint32_t fixed_16_16_div_round_up_u64(uint32_t val,
>> +						    uint_fixed_16_16_t d)
>> +{
>> +	uint64_t interm_val;
>> +
>> +	interm_val = (uint64_t)val << 16;
>> +	interm_val = DIV_ROUND_UP_ULL(interm_val, d.val);
>> +	WARN_ON(interm_val >> 32);
>> +	return clamp_t(uint32_t, interm_val, 0, ~0);
>> +}
>> +
>>   static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
>>   						     uint_fixed_16_16_t mul)
>>   {
>> -- 
>> 2.11.0
>>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation
  2017-05-12  0:23   ` Matt Roper
@ 2017-05-12 13:44     ` Mahesh Kumar
  0 siblings, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-12 13:44 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst



On Friday 12 May 2017 05:53 AM, Matt Roper wrote:
> On Mon, May 08, 2017 at 05:18:58PM +0530, Mahesh Kumar wrote:
>> DDB minimum requirement may exceed the allocated DDB for CRTC/Pipe. This
>> patch make changes to fail the flip if minimum requirement for pipe
>> exceeds the total ddb allocated to the pipe.
>> Previously it succeeded but making alloc_size a negative value. Which
>> will make later calculations for plane ddb allocation bogus & may lead
>> to screen corruption or system hang.
>>
> I think originally the bspec defined the initial minimum DDB allocation
> for each plane to just be '8' in all cases.  So even with 4 planes on a
> pipe, your initial pipe minimum would only be 32, so the code here never
> had a risk of overflowing (note that the initial minimum is different
> than the WM0 minimum requirement we ultimately wind up calculating).
> But then the bspec was updated to suggest a higher initial minimum for
> y-tiled buffers (calculated based on the width of the buffer), so our
> initial assumptions here don't really hold anymore.  We should still be
> safe if we're not using y-tile, but even a single 4k y-tiled plane would
> have an initial minimum of something like 250 blocks iirc., so we can't
> assume we're always safe anymore.
>
>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_pm.c | 13 +++++++++++--
>>   1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 2a4e9d89cd6f..0ace94d67432 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3591,6 +3591,7 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	int num_active;
>>   	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>>   	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
>> +	uint16_t total_min_blocks = 0;
>>   
>>   	/* Clear the partitioning for disabled planes. */
>>   	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
>> @@ -3618,10 +3619,18 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	 */
>>   
>>   	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>> -		alloc_size -= minimum[plane_id];
>> -		alloc_size -= y_minimum[plane_id];
>> +		total_min_blocks += minimum[plane_id];
>> +		total_min_blocks += y_minimum[plane_id];
>>   	}
>>   
>> +	if ((total_min_blocks > alloc_size)) {
>> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
>> +		DRM_DEBUG_KMS("minimum required %d/%d\n", total_min_blocks,
>> +							alloc_size);
>> +		return-EINVAL;
> I think you're missing a space here?
>
> With that fixed,
I think I already fixed this in newer version of patch. :),
anyway resubmitting complete series again with other comments addressed.
thanks,
-Mahesh
>
> Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
>
>> +	}
>> +
>> +	alloc_size -= total_min_blocks;
>>   	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>>   	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>>   
>> -- 
>> 2.11.0
>>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 08/11] drm/i915/skl+: Watermark calculation cleanup
  2017-05-12  0:23   ` Matt Roper
@ 2017-05-12 13:47     ` Mahesh Kumar
  0 siblings, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-12 13:47 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst


[-- Attachment #1.1: Type: text/plain, Size: 7208 bytes --]

Hi,


On Friday 12 May 2017 05:53 AM, Matt Roper wrote:
> On Mon, May 08, 2017 at 05:18:59PM +0530, Mahesh Kumar wrote:
>> This patch cleanup/reorganises the watermark calculation functions.
>> This patch also make use of already available macro
>> "drm_atomic_crtc_state_for_each_plane_state" to walk through
>> plane_state list instead of calculating plane_state in function itself.
>> Now we iterate over WM levels in skl_compute_wm_level function instead
>> of "skl_build_pipe_wm" function.
> I'd split the minor cleanup (const-ifying,
> drm_atomic_crtc_state_for_each_plane_state usage, etc.) into a separate
> patch from the code flow change (looping over levels in
> skl_compute_wm_level instead of skl_build_pipe_wm).  You'll probably
> also want to rename the function to skl_compute_wm_level*s* (plural) if
> its going to calculate all levels now instead of just one.
const-ifying is required by drm_atomic_crtc_state_for_each_plane_state 
Macro,
dividing it into 2 patches one to use the the macro & another one to 
move level iteration in separate
skl_compute_wm_level*s *function.

-Mahesh
>
>
> Matt
>
>> This restructuring will help later patch for new DDB allocation
>> algorithm to do only algo related changes.
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_pm.c | 89 +++++++++++++++++------------------------
>>   1 file changed, 37 insertions(+), 52 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 0ace94d67432..9d0225862a7e 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3732,7 +3732,7 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
>>   }
>>   
>>   static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cstate,
>> -					      struct intel_plane_state *pstate)
>> +					      const struct intel_plane_state *pstate)
>>   {
>>   	uint64_t adjusted_pixel_rate;
>>   	uint_fixed_16_16_t downscale_amount;
>> @@ -3754,7 +3754,7 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>>   
>>   static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   				struct intel_crtc_state *cstate,
>> -				struct intel_plane_state *intel_pstate,
>> +				const struct intel_plane_state *intel_pstate,
>>   				uint16_t ddb_allocation,
>>   				int level,
>>   				uint16_t *out_blocks, /* out */
>> @@ -3762,8 +3762,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   				bool *enabled /* out */)
>>   {
>>   	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
>> -	struct drm_plane_state *pstate = &intel_pstate->base;
>> -	struct drm_framebuffer *fb = pstate->fb;
>> +	const struct drm_plane_state *pstate = &intel_pstate->base;
>> +	const struct drm_framebuffer *fb = pstate->fb;
>>   	uint32_t latency = dev_priv->wm.skl_latency[level];
>>   	uint_fixed_16_16_t method1, method2;
>>   	uint_fixed_16_16_t plane_blocks_per_line;
>> @@ -3918,52 +3918,36 @@ static int
>>   skl_compute_wm_level(const struct drm_i915_private *dev_priv,
>>   		     struct skl_ddb_allocation *ddb,
>>   		     struct intel_crtc_state *cstate,
>> -		     struct intel_plane *intel_plane,
>> -		     int level,
>> -		     struct skl_wm_level *result)
>> +		     const struct intel_plane_state *intel_pstate,
>> +		     struct skl_plane_wm *wm)
>>   {
>> -	struct drm_atomic_state *state = cstate->base.state;
>>   	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
>> -	struct drm_plane *plane = &intel_plane->base;
>> -	struct intel_plane_state *intel_pstate = NULL;
>> +	struct drm_plane *plane = intel_pstate->base.plane;
>> +	struct intel_plane *intel_plane = to_intel_plane(plane);
>>   	uint16_t ddb_blocks;
>>   	enum pipe pipe = intel_crtc->pipe;
>> +	int level, max_level = ilk_wm_max_level(dev_priv);
>>   	int ret;
>>   
>> -	if (state)
>> -		intel_pstate =
>> -			intel_atomic_get_existing_plane_state(state,
>> -							      intel_plane);
>> -
>> -	/*
>> -	 * Note: If we start supporting multiple pending atomic commits against
>> -	 * the same planes/CRTC's in the future, plane->state will no longer be
>> -	 * the correct pre-state to use for the calculations here and we'll
>> -	 * need to change where we get the 'unchanged' plane data from.
>> -	 *
>> -	 * For now this is fine because we only allow one queued commit against
>> -	 * a CRTC.  Even if the plane isn't modified by this transaction and we
>> -	 * don't have a plane lock, we still have the CRTC's lock, so we know
>> -	 * that no other transactions are racing with us to update it.
>> -	 */
>> -	if (!intel_pstate)
>> -		intel_pstate = to_intel_plane_state(plane->state);
>> -
>>   	if (WARN_ON(!intel_pstate->base.fb))
>>   		return -EINVAL;
>>   
>>   	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
>>   
>> -	ret = skl_compute_plane_wm(dev_priv,
>> -				   cstate,
>> -				   intel_pstate,
>> -				   ddb_blocks,
>> -				   level,
>> -				   &result->plane_res_b,
>> -				   &result->plane_res_l,
>> -				   &result->plane_en);
>> -	if (ret)
>> -		return ret;
>> +	for (level = 0; level <= max_level; level++) {
>> +		struct skl_wm_level *result = &wm->wm[level];
>> +
>> +		ret = skl_compute_plane_wm(dev_priv,
>> +					   cstate,
>> +					   intel_pstate,
>> +					   ddb_blocks,
>> +					   level,
>> +					   &result->plane_res_b,
>> +					   &result->plane_res_l,
>> +					   &result->plane_en);
>> +		if (ret)
>> +			return ret;
>> +	}
>>   
>>   	return 0;
>>   }
>> @@ -4009,10 +3993,11 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>>   			     struct skl_pipe_wm *pipe_wm)
>>   {
>>   	struct drm_device *dev = cstate->base.crtc->dev;
>> +	struct drm_crtc_state *crtc_state = &cstate->base;
>>   	const struct drm_i915_private *dev_priv = to_i915(dev);
>> -	struct intel_plane *intel_plane;
>> +	struct drm_plane *plane;
>> +	const struct drm_plane_state *pstate;
>>   	struct skl_plane_wm *wm;
>> -	int level, max_level = ilk_wm_max_level(dev_priv);
>>   	int ret;
>>   
>>   	/*
>> @@ -4021,18 +4006,18 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>>   	 */
>>   	memset(pipe_wm->planes, 0, sizeof(pipe_wm->planes));
>>   
>> -	for_each_intel_plane_mask(&dev_priv->drm,
>> -				  intel_plane,
>> -				  cstate->base.plane_mask) {
>> -		wm = &pipe_wm->planes[intel_plane->id];
>> +	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
>> +		const struct intel_plane_state *intel_pstate =
>> +						to_intel_plane_state(pstate);
>> +		enum plane_id plane_id = to_intel_plane(plane)->id;
>> +
>> +		wm = &pipe_wm->planes[plane_id];
>> +
>> +		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
>> +					   intel_pstate, wm);
>> +		if (ret)
>> +			return ret;
>>   
>> -		for (level = 0; level <= max_level; level++) {
>> -			ret = skl_compute_wm_level(dev_priv, ddb, cstate,
>> -						   intel_plane, level,
>> -						   &wm->wm[level]);
>> -			if (ret)
>> -				return ret;
>> -		}
>>   		skl_compute_transition_wm(cstate, &wm->trans_wm);
>>   	}
>>   	pipe_wm->linetime = skl_compute_linetime_wm(cstate);
>> -- 
>> 2.11.0
>>


[-- Attachment #1.2: Type: text/html, Size: 7739 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/11] drm/i915/skl: New ddb allocation algorithm
  2017-05-08 11:49 ` [PATCH 10/11] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
@ 2017-05-12 22:24   ` Matt Roper
  2017-05-15  8:15     ` Mahesh Kumar
  0 siblings, 1 reply; 43+ messages in thread
From: Matt Roper @ 2017-05-12 22:24 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

On Mon, May 08, 2017 at 05:19:01PM +0530, Mahesh Kumar wrote:
> This patch implements new DDB allocation algorithm as per HW team
> recommendation. This algo takecare of scenario where we allocate less DDB
> for the planes with lower relative pixel rate, but they require more DDB
> to work.
> It also takes care of enabling same watermark level for each
> plane in crtc, for efficient power saving.
> 
> Changes since v1:
>  - Rebase on top of Paulo's patch series
> 
> Changes since v2:
>  - Fix the for loop condition to enable WM
> 
> Changes since v3:
>  - Fix crash in cursor i-g-t reported by Maarten
>  - Rebase after addressing Paulo's comments
>  - Few other ULT fixes
> Changes since v4:
>  - Rebase on drm-tip
>  - Added separate function to enable WM levels
> Changes since v4:
>  - Fix a crash identified in skl-6770HQ system
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 253 ++++++++++++++++++++++++----------------
>  1 file changed, 152 insertions(+), 101 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index bcf5d2523e4a..92600cf42e12 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3573,13 +3573,41 @@ skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
>  	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
>  }
>  
> +static void
> +skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
> +			   uint16_t plane_ddb,
> +			   uint16_t max_level,
> +			   struct skl_plane_wm *wm)
> +{
> +	int level;
> +	/*
> +	 * Now enable all levels in WM structure which can be enabled
> +	 * using current DDB allocation
> +	 */
> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
> +		struct skl_wm_level *level_wm = &wm->wm[level];
> +
> +		if (level > max_level || level_wm->plane_res_b == 0
> +				      || level_wm->plane_res_l >= 31
> +				      || level_wm->plane_res_b >= plane_ddb) {
> +			level_wm->plane_en = false;
> +			level_wm->plane_res_b = 0;
> +			level_wm->plane_res_l = 0;
> +		} else {
> +			level_wm->plane_en = true;
> +		}
> +	}
> +}
> +
>  static int
>  skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
> -		      struct skl_ddb_allocation *ddb /* out */)
> +		      struct skl_ddb_allocation *ddb /* out */,
> +		      struct skl_pipe_wm *pipe_wm)
>  {
>  	struct drm_atomic_state *state = cstate->base.state;
>  	struct drm_crtc *crtc = cstate->base.crtc;
>  	struct drm_device *dev = crtc->dev;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	enum pipe pipe = intel_crtc->pipe;
>  	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
> @@ -3592,6 +3620,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>  	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
>  	uint16_t total_min_blocks = 0;
> +	uint16_t total_level_ddb;
> +	int max_level, level;
>  
>  	/* Clear the partitioning for disabled planes. */
>  	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
> @@ -3630,10 +3660,43 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  		return-EINVAL;
>  	}
>  
> -	alloc_size -= total_min_blocks;
> -	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
> +	alloc_size -= minimum[PLANE_CURSOR];
> +	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end -
> +							minimum[PLANE_CURSOR];
>  	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>  
> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {

Why do we walk backwards here (max level down to 0)?  Shouldn't we be
going the other direction so that we allocate blocks for WM0, then move
on to the higher levels until we eventually fail?  Maybe I'm missing
something...

> +		total_level_ddb = 0;
> +		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
> +			/*
> +			 * TODO: We should calculate watermark values for Y/UV
> +			 * plane both in case of NV12 format and use both values
> +			 * for ddb calculation. NV12 is disabled as of now, So
> +			 * using only single/UV plane value here.
> +			 */
> +			struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
> +			uint16_t plane_res_b = wm->wm[level].plane_res_b;
> +			uint16_t min = minimum[plane_id] + y_minimum[plane_id];
> +
> +			if (plane_id == PLANE_CURSOR)
> +				continue;
> +
> +			total_level_ddb += max(plane_res_b, min);
> +		}
> +
> +		if (total_level_ddb <= alloc_size)
> +			break;
> +	}
> +
> +	if ((level < 0) || (total_min_blocks > alloc_size)) {

If we had plenty of DDB space and could enable every single level, the
backwards walk above would leave level=-1 when the loop ends.  So then
we'd complain here that we'd exceeded limitations and fail the commit,
which doesn't seem right.


Matt

> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
> +		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
> +				total_level_ddb : total_min_blocks, alloc_size);
> +		return -EINVAL;
> +	}
> +	max_level = level;
> +	alloc_size -= total_level_ddb;
> +
>  	/*
>  	 * 2. Distribute the remaining space in proportion to the amount of
>  	 * data each plane needs to fetch from memory.
> @@ -3649,10 +3712,17 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  	start = alloc->start;
>  	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>  		unsigned int data_rate, y_data_rate;
> -		uint16_t plane_blocks, y_plane_blocks = 0;
> -
> -		if (plane_id == PLANE_CURSOR)
> +		uint16_t plane_blocks = 0, y_plane_blocks = 0;
> +		struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
> +		uint16_t plane_res_b = wm->wm[max_level].plane_res_b;
> +
> +		if (plane_id == PLANE_CURSOR) {
> +			plane_blocks =
> +				skl_ddb_entry_size(&ddb->plane[pipe][plane_id]);
> +			skl_enable_plane_wm_levels(dev_priv, plane_blocks,
> +						   max_level, wm);
>  			continue;
> +		}
>  
>  		data_rate = plane_data_rate[plane_id];
>  
> @@ -3661,33 +3731,36 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>  		 * promote the expression to 64 bits to avoid overflowing, the
>  		 * result is < available as data_rate / total_data_rate < 1
>  		 */
> -		plane_blocks = minimum[plane_id];
> -		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
> -					total_data_rate);
>  
>  		/* Leave disabled planes at (0,0) */
>  		if (data_rate) {
> +			plane_blocks = max(minimum[plane_id], plane_res_b);
> +			plane_blocks += div_u64((uint64_t)alloc_size *
> +					data_rate, total_data_rate);
>  			ddb->plane[pipe][plane_id].start = start;
>  			ddb->plane[pipe][plane_id].end = start + plane_blocks;
> +			start += plane_blocks;
>  		}
>  
> -		start += plane_blocks;
> -
>  		/*
>  		 * allocation for y_plane part of planar format:
> +		 * TODO: Once we start calculating watermark values for Y/UV
> +		 * plane both consider it for initial allowed wm blocks.
>  		 */
>  		y_data_rate = plane_y_data_rate[plane_id];
>  
> -		y_plane_blocks = y_minimum[plane_id];
> -		y_plane_blocks += div_u64((uint64_t)alloc_size * y_data_rate,
> -					total_data_rate);
> -
>  		if (y_data_rate) {
> +			y_plane_blocks = y_minimum[plane_id];
> +			y_plane_blocks += div_u64((uint64_t)alloc_size *
> +					y_data_rate, total_data_rate);
>  			ddb->y_plane[pipe][plane_id].start = start;
>  			ddb->y_plane[pipe][plane_id].end = start + y_plane_blocks;
> +			start += y_plane_blocks;
>  		}
> -
> -		start += y_plane_blocks;
> +		skl_enable_plane_wm_levels(dev_priv,
> +					   plane_blocks,
> +					   max_level,
> +					   wm);
>  	}
>  
>  	return 0;
> @@ -3776,11 +3849,9 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>  static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  				struct intel_crtc_state *cstate,
>  				const struct intel_plane_state *intel_pstate,
> -				uint16_t ddb_allocation,
>  				int level,
>  				uint16_t *out_blocks, /* out */
> -				uint8_t *out_lines, /* out */
> -				bool *enabled /* out */)
> +				uint8_t *out_lines /* out */)
>  {
>  	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
>  	const struct drm_plane_state *pstate = &intel_pstate->base;
> @@ -3803,10 +3874,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  	bool y_tiled, x_tiled;
>  
>  	if (latency == 0 ||
> -	    !intel_wm_plane_visible(cstate, intel_pstate)) {
> -		*enabled = false;
> +	    !intel_wm_plane_visible(cstate, intel_pstate))
>  		return 0;
> -	}
>  
>  	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>  		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
> @@ -3892,9 +3961,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
>  		    (plane_bytes_per_line / 512 < 1))
>  			selected_result = method2;
> -		else if ((ddb_allocation && ddb_allocation /
> -			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
> -			selected_result = min_fixed_16_16(method1, method2);
>  		else if (latency >= linetime_us)
>  			selected_result = method2;
>  		else
> @@ -3914,64 +3980,41 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>  		}
>  	}
>  
> -	if (res_blocks >= ddb_allocation || res_lines > 31) {
> -		*enabled = false;
> -
> -		/*
> -		 * If there are no valid level 0 watermarks, then we can't
> -		 * support this display configuration.
> -		 */
> -		if (level) {
> -			return 0;
> -		} else {
> -			struct drm_plane *plane = pstate->plane;
> +	if (res_lines >= 31 && level == 0) {
> +		struct drm_plane *plane = pstate->plane;
>  
> -			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
> -			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
> -				      plane->base.id, plane->name,
> -				      res_blocks, ddb_allocation, res_lines);
> -			return -EINVAL;
> -		}
> +		DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
> +		DRM_DEBUG_KMS("[PLANE:%d:%s] lines required = %u/31\n",
> +				plane->base.id, plane->name, res_lines);
>  	}
>  
>  	*out_blocks = res_blocks;
>  	*out_lines = res_lines;
> -	*enabled = true;
>  
>  	return 0;
>  }
>  
>  static int
>  skl_compute_wm_level(const struct drm_i915_private *dev_priv,
> -		     struct skl_ddb_allocation *ddb,
>  		     struct intel_crtc_state *cstate,
>  		     const struct intel_plane_state *intel_pstate,
>  		     struct skl_plane_wm *wm)
>  {
> -	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
> -	struct drm_plane *plane = intel_pstate->base.plane;
> -	struct intel_plane *intel_plane = to_intel_plane(plane);
> -	uint16_t ddb_blocks;
> -	enum pipe pipe = intel_crtc->pipe;
>  	int level, max_level = ilk_wm_max_level(dev_priv);
>  	int ret;
>  
>  	if (WARN_ON(!intel_pstate->base.fb))
>  		return -EINVAL;
>  
> -	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
> -
>  	for (level = 0; level <= max_level; level++) {
>  		struct skl_wm_level *result = &wm->wm[level];
>  
>  		ret = skl_compute_plane_wm(dev_priv,
>  					   cstate,
>  					   intel_pstate,
> -					   ddb_blocks,
>  					   level,
>  					   &result->plane_res_b,
> -					   &result->plane_res_l,
> -					   &result->plane_en);
> +					   &result->plane_res_l);
>  		if (ret)
>  			return ret;
>  	}
> @@ -4037,8 +4080,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>  
>  		wm = &pipe_wm->planes[plane_id];
>  
> -		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
> -					   intel_pstate, wm);
> +		ret = skl_compute_wm_level(dev_priv, cstate, intel_pstate, wm);
>  		if (ret)
>  			return ret;
>  
> @@ -4152,6 +4194,45 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>  	return false;
>  }
>  
> +static int
> +skl_ddb_add_affected_planes(struct intel_crtc_state *cstate,
> +			    const struct skl_pipe_wm *old_pipe_wm,
> +			    const struct skl_pipe_wm *pipe_wm)
> +{
> +	struct drm_atomic_state *state = cstate->base.state;
> +	struct drm_device *dev = state->dev;
> +	struct drm_crtc *crtc = cstate->base.crtc;
> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
> +	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
> +	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
> +	struct drm_plane_state *plane_state;
> +	struct drm_plane *plane;
> +	enum pipe pipe = intel_crtc->pipe;
> +
> +	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
> +
> +	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
> +		enum plane_id plane_id = to_intel_plane(plane)->id;
> +		const struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
> +		const struct skl_plane_wm *old_wm = &old_pipe_wm->planes[plane_id];
> +
> +		if ((skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
> +					&new_ddb->plane[pipe][plane_id]) &&
> +		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
> +					&new_ddb->y_plane[pipe][plane_id])) &&
> +		    !memcmp(wm, old_wm, sizeof(struct skl_plane_wm)))
> +			continue;
> +
> +		plane_state = drm_atomic_get_plane_state(state, plane);
> +		if (IS_ERR(plane_state))
> +			return PTR_ERR(plane_state);
> +	}
> +
> +	return 0;
> +}
> +
>  static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>  			      const struct skl_pipe_wm *old_pipe_wm,
>  			      struct skl_pipe_wm *pipe_wm, /* out */
> @@ -4165,6 +4246,18 @@ static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>  	if (ret)
>  		return ret;
>  
> +	ret = skl_allocate_pipe_ddb(intel_cstate, ddb, pipe_wm);
> +	if (ret)
> +		return ret;
> +	/*
> +	 * TODO: Do we still need to add planes in state, As WM update is
> +	 * not part of update_plane anymore, So wm for planes can be updated
> +	 * irrespective of updade_plane call.
> +	 */
> +	ret = skl_ddb_add_affected_planes(intel_cstate, old_pipe_wm, pipe_wm);
> +	if (ret)
> +		return ret;
> +
>  	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
>  		*changed = false;
>  	else
> @@ -4187,41 +4280,7 @@ pipes_modified(struct drm_atomic_state *state)
>  }
>  
>  static int
> -skl_ddb_add_affected_planes(struct intel_crtc_state *cstate)
> -{
> -	struct drm_atomic_state *state = cstate->base.state;
> -	struct drm_device *dev = state->dev;
> -	struct drm_crtc *crtc = cstate->base.crtc;
> -	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> -	struct drm_i915_private *dev_priv = to_i915(dev);
> -	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
> -	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
> -	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
> -	struct drm_plane_state *plane_state;
> -	struct drm_plane *plane;
> -	enum pipe pipe = intel_crtc->pipe;
> -
> -	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
> -
> -	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
> -		enum plane_id plane_id = to_intel_plane(plane)->id;
> -
> -		if (skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
> -					&new_ddb->plane[pipe][plane_id]) &&
> -		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
> -					&new_ddb->y_plane[pipe][plane_id]))
> -			continue;
> -
> -		plane_state = drm_atomic_get_plane_state(state, plane);
> -		if (IS_ERR(plane_state))
> -			return PTR_ERR(plane_state);
> -	}
> -
> -	return 0;
> -}
> -
> -static int
> -skl_compute_ddb(struct drm_atomic_state *state)
> +skl_include_affected_crtc(struct drm_atomic_state *state)
>  {
>  	struct drm_device *dev = state->dev;
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> @@ -4285,14 +4344,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
>  		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
>  		if (IS_ERR(cstate))
>  			return PTR_ERR(cstate);
> -
> -		ret = skl_allocate_pipe_ddb(cstate, ddb);
> -		if (ret)
> -			return ret;
> -
> -		ret = skl_ddb_add_affected_planes(cstate);
> -		if (ret)
> -			return ret;
>  	}
>  
>  	return 0;
> @@ -4373,7 +4424,7 @@ skl_compute_wm(struct drm_atomic_state *state)
>  	/* Clear all dirty flags */
>  	results->dirty_pipes = 0;
>  
> -	ret = skl_compute_ddb(state);
> +	ret = skl_include_affected_crtc(state);
>  	if (ret)
>  		return ret;
>  
> -- 
> 2.11.0
> 

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/11] drm/i915/skl: New ddb allocation algorithm
  2017-05-12 22:24   ` Matt Roper
@ 2017-05-15  8:15     ` Mahesh Kumar
  0 siblings, 0 replies; 43+ messages in thread
From: Mahesh Kumar @ 2017-05-15  8:15 UTC (permalink / raw)
  To: Matt Roper; +Cc: intel-gfx, paulo.r.zanoni, maarten.lankhorst

Hi,


On Saturday 13 May 2017 03:54 AM, Matt Roper wrote:
> On Mon, May 08, 2017 at 05:19:01PM +0530, Mahesh Kumar wrote:
>> This patch implements new DDB allocation algorithm as per HW team
>> recommendation. This algo takecare of scenario where we allocate less DDB
>> for the planes with lower relative pixel rate, but they require more DDB
>> to work.
>> It also takes care of enabling same watermark level for each
>> plane in crtc, for efficient power saving.
>>
>> Changes since v1:
>>   - Rebase on top of Paulo's patch series
>>
>> Changes since v2:
>>   - Fix the for loop condition to enable WM
>>
>> Changes since v3:
>>   - Fix crash in cursor i-g-t reported by Maarten
>>   - Rebase after addressing Paulo's comments
>>   - Few other ULT fixes
>> Changes since v4:
>>   - Rebase on drm-tip
>>   - Added separate function to enable WM levels
>> Changes since v4:
>>   - Fix a crash identified in skl-6770HQ system
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_pm.c | 253 ++++++++++++++++++++++++----------------
>>   1 file changed, 152 insertions(+), 101 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index bcf5d2523e4a..92600cf42e12 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3573,13 +3573,41 @@ skl_ddb_calc_min(const struct intel_crtc_state *cstate, int num_active,
>>   	minimum[PLANE_CURSOR] = skl_cursor_allocation(num_active);
>>   }
>>   
>> +static void
>> +skl_enable_plane_wm_levels(const struct drm_i915_private *dev_priv,
>> +			   uint16_t plane_ddb,
>> +			   uint16_t max_level,
>> +			   struct skl_plane_wm *wm)
>> +{
>> +	int level;
>> +	/*
>> +	 * Now enable all levels in WM structure which can be enabled
>> +	 * using current DDB allocation
>> +	 */
>> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
>> +		struct skl_wm_level *level_wm = &wm->wm[level];
>> +
>> +		if (level > max_level || level_wm->plane_res_b == 0
>> +				      || level_wm->plane_res_l >= 31
>> +				      || level_wm->plane_res_b >= plane_ddb) {
>> +			level_wm->plane_en = false;
>> +			level_wm->plane_res_b = 0;
>> +			level_wm->plane_res_l = 0;
>> +		} else {
>> +			level_wm->plane_en = true;
>> +		}
>> +	}
>> +}
>> +
>>   static int
>>   skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>> -		      struct skl_ddb_allocation *ddb /* out */)
>> +		      struct skl_ddb_allocation *ddb /* out */,
>> +		      struct skl_pipe_wm *pipe_wm)
>>   {
>>   	struct drm_atomic_state *state = cstate->base.state;
>>   	struct drm_crtc *crtc = cstate->base.crtc;
>>   	struct drm_device *dev = crtc->dev;
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>>   	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>>   	enum pipe pipe = intel_crtc->pipe;
>>   	struct skl_ddb_entry *alloc = &cstate->wm.skl.ddb;
>> @@ -3592,6 +3620,8 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	unsigned plane_data_rate[I915_MAX_PLANES] = {};
>>   	unsigned plane_y_data_rate[I915_MAX_PLANES] = {};
>>   	uint16_t total_min_blocks = 0;
>> +	uint16_t total_level_ddb;
>> +	int max_level, level;
>>   
>>   	/* Clear the partitioning for disabled planes. */
>>   	memset(ddb->plane[pipe], 0, sizeof(ddb->plane[pipe]));
>> @@ -3630,10 +3660,43 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   		return-EINVAL;
>>   	}
>>   
>> -	alloc_size -= total_min_blocks;
>> -	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end - minimum[PLANE_CURSOR];
>> +	alloc_size -= minimum[PLANE_CURSOR];
>> +	ddb->plane[pipe][PLANE_CURSOR].start = alloc->end -
>> +							minimum[PLANE_CURSOR];
>>   	ddb->plane[pipe][PLANE_CURSOR].end = alloc->end;
>>   
>> +	for (level = ilk_wm_max_level(dev_priv); level >= 0; level--) {
> Why do we walk backwards here (max level down to 0)?  Shouldn't we be
> going the other direction so that we allocate blocks for WM0, then move
> on to the higher levels until we eventually fail?  Maybe I'm missing
> something...
We are checking of levels which "can't be enabled" here, and prune those 
level. that's the reason walking from max level down to 0.
>> +		total_level_ddb = 0;
>> +		for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>> +			/*
>> +			 * TODO: We should calculate watermark values for Y/UV
>> +			 * plane both in case of NV12 format and use both values
>> +			 * for ddb calculation. NV12 is disabled as of now, So
>> +			 * using only single/UV plane value here.
>> +			 */
>> +			struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
>> +			uint16_t plane_res_b = wm->wm[level].plane_res_b;
>> +			uint16_t min = minimum[plane_id] + y_minimum[plane_id];
>> +
>> +			if (plane_id == PLANE_CURSOR)
>> +				continue;
>> +
>> +			total_level_ddb += max(plane_res_b, min);
>> +		}
>> +
>> +		if (total_level_ddb <= alloc_size)
>> +			break;
>> +	}
>> +
>> +	if ((level < 0) || (total_min_blocks > alloc_size)) {
> If we had plenty of DDB space and could enable every single level, the
> backwards walk above would leave level=-1 when the loop ends.  So then
> we'd complain here that we'd exceeded limitations and fail the commit,
> which doesn't seem right.
If we have plenty of DDB then, level-7 itself can be enabled & we'll 
return from there.
total_level_ddb is required DDB to enable wm level :)
If we are reaching to -1 that means we don't have sufficient DDB to 
enable even WM level-0 & hence failing the flip.

-Mahesh
>
>
> Matt
>
>> +		DRM_DEBUG_KMS("Requested display configuration exceeds system DDB limitations");
>> +		DRM_DEBUG_KMS("minimum required %d/%d\n", (level < 0) ?
>> +				total_level_ddb : total_min_blocks, alloc_size);
>> +		return -EINVAL;
>> +	}
>> +	max_level = level;
>> +	alloc_size -= total_level_ddb;
>> +
>>   	/*
>>   	 * 2. Distribute the remaining space in proportion to the amount of
>>   	 * data each plane needs to fetch from memory.
>> @@ -3649,10 +3712,17 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   	start = alloc->start;
>>   	for_each_plane_id_on_crtc(intel_crtc, plane_id) {
>>   		unsigned int data_rate, y_data_rate;
>> -		uint16_t plane_blocks, y_plane_blocks = 0;
>> -
>> -		if (plane_id == PLANE_CURSOR)
>> +		uint16_t plane_blocks = 0, y_plane_blocks = 0;
>> +		struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
>> +		uint16_t plane_res_b = wm->wm[max_level].plane_res_b;
>> +
>> +		if (plane_id == PLANE_CURSOR) {
>> +			plane_blocks =
>> +				skl_ddb_entry_size(&ddb->plane[pipe][plane_id]);
>> +			skl_enable_plane_wm_levels(dev_priv, plane_blocks,
>> +						   max_level, wm);
>>   			continue;
>> +		}
>>   
>>   		data_rate = plane_data_rate[plane_id];
>>   
>> @@ -3661,33 +3731,36 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
>>   		 * promote the expression to 64 bits to avoid overflowing, the
>>   		 * result is < available as data_rate / total_data_rate < 1
>>   		 */
>> -		plane_blocks = minimum[plane_id];
>> -		plane_blocks += div_u64((uint64_t)alloc_size * data_rate,
>> -					total_data_rate);
>>   
>>   		/* Leave disabled planes at (0,0) */
>>   		if (data_rate) {
>> +			plane_blocks = max(minimum[plane_id], plane_res_b);
>> +			plane_blocks += div_u64((uint64_t)alloc_size *
>> +					data_rate, total_data_rate);
>>   			ddb->plane[pipe][plane_id].start = start;
>>   			ddb->plane[pipe][plane_id].end = start + plane_blocks;
>> +			start += plane_blocks;
>>   		}
>>   
>> -		start += plane_blocks;
>> -
>>   		/*
>>   		 * allocation for y_plane part of planar format:
>> +		 * TODO: Once we start calculating watermark values for Y/UV
>> +		 * plane both consider it for initial allowed wm blocks.
>>   		 */
>>   		y_data_rate = plane_y_data_rate[plane_id];
>>   
>> -		y_plane_blocks = y_minimum[plane_id];
>> -		y_plane_blocks += div_u64((uint64_t)alloc_size * y_data_rate,
>> -					total_data_rate);
>> -
>>   		if (y_data_rate) {
>> +			y_plane_blocks = y_minimum[plane_id];
>> +			y_plane_blocks += div_u64((uint64_t)alloc_size *
>> +					y_data_rate, total_data_rate);
>>   			ddb->y_plane[pipe][plane_id].start = start;
>>   			ddb->y_plane[pipe][plane_id].end = start + y_plane_blocks;
>> +			start += y_plane_blocks;
>>   		}
>> -
>> -		start += y_plane_blocks;
>> +		skl_enable_plane_wm_levels(dev_priv,
>> +					   plane_blocks,
>> +					   max_level,
>> +					   wm);
>>   	}
>>   
>>   	return 0;
>> @@ -3776,11 +3849,9 @@ static uint32_t skl_adjusted_plane_pixel_rate(const struct intel_crtc_state *cst
>>   static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   				struct intel_crtc_state *cstate,
>>   				const struct intel_plane_state *intel_pstate,
>> -				uint16_t ddb_allocation,
>>   				int level,
>>   				uint16_t *out_blocks, /* out */
>> -				uint8_t *out_lines, /* out */
>> -				bool *enabled /* out */)
>> +				uint8_t *out_lines /* out */)
>>   {
>>   	struct intel_plane *plane = to_intel_plane(intel_pstate->base.plane);
>>   	const struct drm_plane_state *pstate = &intel_pstate->base;
>> @@ -3803,10 +3874,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   	bool y_tiled, x_tiled;
>>   
>>   	if (latency == 0 ||
>> -	    !intel_wm_plane_visible(cstate, intel_pstate)) {
>> -		*enabled = false;
>> +	    !intel_wm_plane_visible(cstate, intel_pstate))
>>   		return 0;
>> -	}
>>   
>>   	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
>>   		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
>> @@ -3892,9 +3961,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
>>   		    (plane_bytes_per_line / 512 < 1))
>>   			selected_result = method2;
>> -		else if ((ddb_allocation && ddb_allocation /
>> -			fixed_16_16_to_u32_round_up(plane_blocks_per_line)) >= 1)
>> -			selected_result = min_fixed_16_16(method1, method2);
>>   		else if (latency >= linetime_us)
>>   			selected_result = method2;
>>   		else
>> @@ -3914,64 +3980,41 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
>>   		}
>>   	}
>>   
>> -	if (res_blocks >= ddb_allocation || res_lines > 31) {
>> -		*enabled = false;
>> -
>> -		/*
>> -		 * If there are no valid level 0 watermarks, then we can't
>> -		 * support this display configuration.
>> -		 */
>> -		if (level) {
>> -			return 0;
>> -		} else {
>> -			struct drm_plane *plane = pstate->plane;
>> +	if (res_lines >= 31 && level == 0) {
>> +		struct drm_plane *plane = pstate->plane;
>>   
>> -			DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
>> -			DRM_DEBUG_KMS("[PLANE:%d:%s] blocks required = %u/%u, lines required = %u/31\n",
>> -				      plane->base.id, plane->name,
>> -				      res_blocks, ddb_allocation, res_lines);
>> -			return -EINVAL;
>> -		}
>> +		DRM_DEBUG_KMS("Requested display configuration exceeds system watermark limitations\n");
>> +		DRM_DEBUG_KMS("[PLANE:%d:%s] lines required = %u/31\n",
>> +				plane->base.id, plane->name, res_lines);
>>   	}
>>   
>>   	*out_blocks = res_blocks;
>>   	*out_lines = res_lines;
>> -	*enabled = true;
>>   
>>   	return 0;
>>   }
>>   
>>   static int
>>   skl_compute_wm_level(const struct drm_i915_private *dev_priv,
>> -		     struct skl_ddb_allocation *ddb,
>>   		     struct intel_crtc_state *cstate,
>>   		     const struct intel_plane_state *intel_pstate,
>>   		     struct skl_plane_wm *wm)
>>   {
>> -	struct intel_crtc *intel_crtc = to_intel_crtc(cstate->base.crtc);
>> -	struct drm_plane *plane = intel_pstate->base.plane;
>> -	struct intel_plane *intel_plane = to_intel_plane(plane);
>> -	uint16_t ddb_blocks;
>> -	enum pipe pipe = intel_crtc->pipe;
>>   	int level, max_level = ilk_wm_max_level(dev_priv);
>>   	int ret;
>>   
>>   	if (WARN_ON(!intel_pstate->base.fb))
>>   		return -EINVAL;
>>   
>> -	ddb_blocks = skl_ddb_entry_size(&ddb->plane[pipe][intel_plane->id]);
>> -
>>   	for (level = 0; level <= max_level; level++) {
>>   		struct skl_wm_level *result = &wm->wm[level];
>>   
>>   		ret = skl_compute_plane_wm(dev_priv,
>>   					   cstate,
>>   					   intel_pstate,
>> -					   ddb_blocks,
>>   					   level,
>>   					   &result->plane_res_b,
>> -					   &result->plane_res_l,
>> -					   &result->plane_en);
>> +					   &result->plane_res_l);
>>   		if (ret)
>>   			return ret;
>>   	}
>> @@ -4037,8 +4080,7 @@ static int skl_build_pipe_wm(struct intel_crtc_state *cstate,
>>   
>>   		wm = &pipe_wm->planes[plane_id];
>>   
>> -		ret = skl_compute_wm_level(dev_priv, ddb, cstate,
>> -					   intel_pstate, wm);
>> +		ret = skl_compute_wm_level(dev_priv, cstate, intel_pstate, wm);
>>   		if (ret)
>>   			return ret;
>>   
>> @@ -4152,6 +4194,45 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
>>   	return false;
>>   }
>>   
>> +static int
>> +skl_ddb_add_affected_planes(struct intel_crtc_state *cstate,
>> +			    const struct skl_pipe_wm *old_pipe_wm,
>> +			    const struct skl_pipe_wm *pipe_wm)
>> +{
>> +	struct drm_atomic_state *state = cstate->base.state;
>> +	struct drm_device *dev = state->dev;
>> +	struct drm_crtc *crtc = cstate->base.crtc;
>> +	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>> +	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
>> +	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
>> +	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
>> +	struct drm_plane_state *plane_state;
>> +	struct drm_plane *plane;
>> +	enum pipe pipe = intel_crtc->pipe;
>> +
>> +	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
>> +
>> +	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
>> +		enum plane_id plane_id = to_intel_plane(plane)->id;
>> +		const struct skl_plane_wm *wm = &pipe_wm->planes[plane_id];
>> +		const struct skl_plane_wm *old_wm = &old_pipe_wm->planes[plane_id];
>> +
>> +		if ((skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
>> +					&new_ddb->plane[pipe][plane_id]) &&
>> +		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
>> +					&new_ddb->y_plane[pipe][plane_id])) &&
>> +		    !memcmp(wm, old_wm, sizeof(struct skl_plane_wm)))
>> +			continue;
>> +
>> +		plane_state = drm_atomic_get_plane_state(state, plane);
>> +		if (IS_ERR(plane_state))
>> +			return PTR_ERR(plane_state);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>>   			      const struct skl_pipe_wm *old_pipe_wm,
>>   			      struct skl_pipe_wm *pipe_wm, /* out */
>> @@ -4165,6 +4246,18 @@ static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
>>   	if (ret)
>>   		return ret;
>>   
>> +	ret = skl_allocate_pipe_ddb(intel_cstate, ddb, pipe_wm);
>> +	if (ret)
>> +		return ret;
>> +	/*
>> +	 * TODO: Do we still need to add planes in state, As WM update is
>> +	 * not part of update_plane anymore, So wm for planes can be updated
>> +	 * irrespective of updade_plane call.
>> +	 */
>> +	ret = skl_ddb_add_affected_planes(intel_cstate, old_pipe_wm, pipe_wm);
>> +	if (ret)
>> +		return ret;
>> +
>>   	if (!memcmp(old_pipe_wm, pipe_wm, sizeof(*pipe_wm)))
>>   		*changed = false;
>>   	else
>> @@ -4187,41 +4280,7 @@ pipes_modified(struct drm_atomic_state *state)
>>   }
>>   
>>   static int
>> -skl_ddb_add_affected_planes(struct intel_crtc_state *cstate)
>> -{
>> -	struct drm_atomic_state *state = cstate->base.state;
>> -	struct drm_device *dev = state->dev;
>> -	struct drm_crtc *crtc = cstate->base.crtc;
>> -	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>> -	struct drm_i915_private *dev_priv = to_i915(dev);
>> -	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
>> -	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
>> -	struct skl_ddb_allocation *cur_ddb = &dev_priv->wm.skl_hw.ddb;
>> -	struct drm_plane_state *plane_state;
>> -	struct drm_plane *plane;
>> -	enum pipe pipe = intel_crtc->pipe;
>> -
>> -	WARN_ON(!drm_atomic_get_existing_crtc_state(state, crtc));
>> -
>> -	drm_for_each_plane_mask(plane, dev, cstate->base.plane_mask) {
>> -		enum plane_id plane_id = to_intel_plane(plane)->id;
>> -
>> -		if (skl_ddb_entry_equal(&cur_ddb->plane[pipe][plane_id],
>> -					&new_ddb->plane[pipe][plane_id]) &&
>> -		    skl_ddb_entry_equal(&cur_ddb->y_plane[pipe][plane_id],
>> -					&new_ddb->y_plane[pipe][plane_id]))
>> -			continue;
>> -
>> -		plane_state = drm_atomic_get_plane_state(state, plane);
>> -		if (IS_ERR(plane_state))
>> -			return PTR_ERR(plane_state);
>> -	}
>> -
>> -	return 0;
>> -}
>> -
>> -static int
>> -skl_compute_ddb(struct drm_atomic_state *state)
>> +skl_include_affected_crtc(struct drm_atomic_state *state)
>>   {
>>   	struct drm_device *dev = state->dev;
>>   	struct drm_i915_private *dev_priv = to_i915(dev);
>> @@ -4285,14 +4344,6 @@ skl_compute_ddb(struct drm_atomic_state *state)
>>   		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
>>   		if (IS_ERR(cstate))
>>   			return PTR_ERR(cstate);
>> -
>> -		ret = skl_allocate_pipe_ddb(cstate, ddb);
>> -		if (ret)
>> -			return ret;
>> -
>> -		ret = skl_ddb_add_affected_planes(cstate);
>> -		if (ret)
>> -			return ret;
>>   	}
>>   
>>   	return 0;
>> @@ -4373,7 +4424,7 @@ skl_compute_wm(struct drm_atomic_state *state)
>>   	/* Clear all dirty flags */
>>   	results->dirty_pipes = 0;
>>   
>> -	ret = skl_compute_ddb(state);
>> +	ret = skl_include_affected_crtc(state);
>>   	if (ret)
>>   		return ret;
>>   
>> -- 
>> 2.11.0
>>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2017-05-15  8:11 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-08 11:48 [PATCH 00/11] Implement DDB algorithm and WM cleanup Mahesh Kumar
2017-05-08 11:48 ` [PATCH 01/11] drm/i915: fix naming of fixed_16_16 wrapper Mahesh Kumar
2017-05-12  0:21   ` Matt Roper
2017-05-08 11:48 ` [PATCH 02/11] drm/i915: Add more wrapper for fixed_point_16_16 operations Mahesh Kumar
2017-05-10 12:37   ` Maarten Lankhorst
2017-05-12  0:22   ` Matt Roper
2017-05-12  8:55     ` Mahesh Kumar
2017-05-08 11:48 ` [PATCH 03/11] drm/i915: Use fixed_16_16 wrapper for division operation Mahesh Kumar
2017-05-12  0:22   ` Matt Roper
2017-05-08 11:48 ` [PATCH 04/11] drm/i915/skl+: calculate pixel_rate & relative_data_rate in fixed point Mahesh Kumar
2017-05-12  0:22   ` Matt Roper
2017-05-08 11:48 ` [PATCH 05/11] drm/i915/skl: Fail the flip if no FB for WM calculation Mahesh Kumar
2017-05-08 11:48   ` Lankhorst, Maarten
2017-05-08 12:01     ` Mahesh Kumar
2017-05-12  0:22       ` Matt Roper
2017-05-08 11:48 ` [PATCH 06/11] drm/i915/skl+: no need to memset again Mahesh Kumar
2017-05-12  0:23   ` Matt Roper
2017-05-08 11:48 ` [PATCH 07/11] drm/i915/skl+: Fail the flip if ddb min requirement exceeds pipe allocation Mahesh Kumar
2017-05-08 14:12   ` Ander Conselvan De Oliveira
2017-05-09  7:50     ` Mahesh Kumar
2017-05-09  7:52     ` Mahesh Kumar
2017-05-12  0:23   ` Matt Roper
2017-05-12 13:44     ` Mahesh Kumar
2017-05-08 11:48 ` [PATCH 08/11] drm/i915/skl+: Watermark calculation cleanup Mahesh Kumar
2017-05-12  0:23   ` Matt Roper
2017-05-12 13:47     ` Mahesh Kumar
2017-05-08 11:49 ` [PATCH 09/11] drm/i915/skl+: use linetime latency if ddb size is not available Mahesh Kumar
2017-05-08 11:49 ` [PATCH 10/11] drm/i915/skl: New ddb allocation algorithm Mahesh Kumar
2017-05-12 22:24   ` Matt Roper
2017-05-15  8:15     ` Mahesh Kumar
2017-05-08 11:49 ` [PATCH 11/11] drm/i915/skl+: consider max supported plane pixel rate while scaling Mahesh Kumar
2017-05-10 13:22   ` Maarten Lankhorst
2017-05-11  8:36     ` Mahesh Kumar
2017-05-11  9:48       ` Maarten Lankhorst
2017-05-11 10:59         ` Mahesh Kumar
2017-05-11 13:05         ` [PATCH v4 " Mahesh Kumar
2017-05-11  9:18     ` [PATCH v2] " Mahesh Kumar
2017-05-08 12:37 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev2) Patchwork
2017-05-09  8:12 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev4) Patchwork
2017-05-11  9:35 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev5) Patchwork
2017-05-11 14:11 ` ✓ Fi.CI.BAT: success for Implement DDB algorithm and WM cleanup (rev6) Patchwork
2017-05-12  0:21 ` [PATCH 00/11] Implement DDB algorithm and WM cleanup Matt Roper
2017-05-12  8:25   ` Mahesh Kumar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.