* [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK
@ 2020-03-30 12:23 Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks Stanislav Lisovskiy
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-03-30 12:23 UTC (permalink / raw)
  To: intel-gfx

We need to calculate CDCLK after the watermarks/DDB have been calculated,
because on recent hardware CDCLK has to be adjusted according to DBuf
requirements, which is not possible with the current code organization.

Setting CDCLK according to DBuf BW requirements, rather than just
rejecting a configuration that doesn't satisfy them, will allow us to
save power when possible and gain additional bandwidth when needed, i.e.
it improves both our power management and performance capabilities.

The first patch prepares for that: it extracts the cdclk calculation
from the modeset checks, so that it can be called after wm/ddb has
been calculated.

Stanislav Lisovskiy (5):
  drm/i915: Decouple cdclk calculation from modeset checks
  drm/i915: Force recalculate min_cdclk if planes config changed
  drm/i915: Introduce for_each_dbuf_slice_in_mask macro
  drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
  drm/i915: Remove unneeded hack now for CDCLK

 drivers/gpu/drm/i915/display/intel_bw.c       | 61 ++++++++++++++++++-
 drivers/gpu/drm/i915/display/intel_bw.h       |  8 +++
 drivers/gpu/drm/i915/display/intel_cdclk.c    | 31 +++++++---
 drivers/gpu/drm/i915/display/intel_display.c  | 36 ++++++++---
 drivers/gpu/drm/i915/display/intel_display.h  |  7 +++
 .../drm/i915/display/intel_display_power.h    |  5 ++
 drivers/gpu/drm/i915/intel_pm.c               | 34 ++++++++++-
 drivers/gpu/drm/i915/intel_pm.h               |  3 +
 8 files changed, 163 insertions(+), 22 deletions(-)

-- 
2.24.1.485.gad05a3d8e5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks
  2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
@ 2020-03-30 12:23 ` Stanislav Lisovskiy
  2020-03-30 15:44   ` Ville Syrjälä
  2020-04-07  7:33   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed Stanislav Lisovskiy
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-03-30 12:23 UTC (permalink / raw)
  To: intel-gfx

We need to calculate CDCLK after the watermarks/DDB have been calculated,
because on recent hardware CDCLK has to be adjusted according to DBuf
requirements, which is not possible with the current code organization.

Setting CDCLK according to DBuf BW requirements, rather than just
rejecting a configuration that doesn't satisfy them, will allow us to
save power when possible and gain additional bandwidth when needed, i.e.
it improves both our power management and performance capabilities.

This patch is preparation for that: we first extract the cdclk
calculation from the modeset checks, so that it can be called after
wm/ddb has been calculated.

v2: - Extract only intel_modeset_calc_cdclk from intel_modeset_checks
      (Ville Syrjälä)

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 7c45d676c9b7..17d83f37f49f 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -14545,10 +14545,6 @@ static int intel_modeset_checks(struct intel_atomic_state *state)
 			return ret;
 	}
 
-	ret = intel_modeset_calc_cdclk(state);
-	if (ret)
-		return ret;
-
 	intel_modeset_clear_plls(state);
 
 	if (IS_HASWELL(dev_priv))
@@ -14882,10 +14878,6 @@ static int intel_atomic_check(struct drm_device *dev,
 			goto fail;
 	}
 
-	ret = intel_atomic_check_crtcs(state);
-	if (ret)
-		goto fail;
-
 	intel_fbc_choose_crtc(dev_priv, state);
 	ret = calc_watermark_data(state);
 	if (ret)
@@ -14895,6 +14887,16 @@ static int intel_atomic_check(struct drm_device *dev,
 	if (ret)
 		goto fail;
 
+	if (any_ms) {
+		ret = intel_modeset_calc_cdclk(state);
+		if (ret)
+			return ret;
+	}
+
+	ret = intel_atomic_check_crtcs(state);
+	if (ret)
+		goto fail;
+
 	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
 					    new_crtc_state, i) {
 		if (!needs_modeset(new_crtc_state) &&
-- 
2.24.1.485.gad05a3d8e5


* [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed
  2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks Stanislav Lisovskiy
@ 2020-03-30 12:23 ` Stanislav Lisovskiy
  2020-03-30 16:18   ` Ville Syrjälä
  2020-04-07  7:36   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro Stanislav Lisovskiy
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-03-30 12:23 UTC (permalink / raw)
  To: intel-gfx

On Gen11+, whenever we might exceed the DBuf bandwidth we might need to
recalculate CDCLK, which the DBuf bandwidth scales with.
The total DBuf bw used might change based on the needs of particular planes.
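
A minimal sketch of why comparing plane counts is not enough, assuming
8-bit active-plane masks as in the diff below. All sketch_* names and
the plane bit positions are hypothetical, for illustration only, and
are not the driver's API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical plane bit positions, for illustration only. */
enum sketch_plane {
	SKETCH_PLANE_PRIMARY,
	SKETCH_PLANE_SPRITE0,
	SKETCH_PLANE_SPRITE1,
};

#define SKETCH_BIT(n) (1u << (n))

/* Count set bits in an 8-bit mask (stand-in for the kernel's hweight8()). */
unsigned int sketch_hweight8(uint8_t mask)
{
	unsigned int count = 0;

	while (mask) {
		count += mask & 1u;
		mask >>= 1;
	}
	return count;
}

/* Old check: only a change in the number of active planes is noticed. */
bool sketch_plane_count_changed(uint8_t old_mask, uint8_t new_mask)
{
	return sketch_hweight8(old_mask) != sketch_hweight8(new_mask);
}

/* New check: any change in the set of active planes is noticed. */
bool sketch_plane_set_changed(uint8_t old_mask, uint8_t new_mask)
{
	return old_mask != new_mask;
}
```

Swapping one sprite plane for another keeps the count at two, so only
the second check would report a change in that case.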

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 17d83f37f49f..9fd32d61ebfe 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -14623,7 +14623,7 @@ static bool active_planes_affects_min_cdclk(struct drm_i915_private *dev_priv)
 	/* See {hsw,vlv,ivb}_plane_ratio() */
 	return IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv) ||
 		IS_CHERRYVIEW(dev_priv) || IS_VALLEYVIEW(dev_priv) ||
-		IS_IVYBRIDGE(dev_priv);
+		IS_IVYBRIDGE(dev_priv) || (INTEL_GEN(dev_priv) >= 11);
 }
 
 static int intel_atomic_check_planes(struct intel_atomic_state *state,
@@ -14669,7 +14669,13 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
 		old_active_planes = old_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
 		new_active_planes = new_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
 
-		if (hweight8(old_active_planes) == hweight8(new_active_planes))
+		/*
+		 * Not only the number of planes, but if the plane configuration had
+		 * changed might already mean we need to recompute min CDCLK,
+		 * because different planes might consume different amount of Dbuf bandwidth
+		 * according to formula: Bw per plane = Pixel rate * bpp * pipe/plane scale factor
+		 */
+		if (old_active_planes == new_active_planes)
 			continue;
 
 		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
-- 
2.24.1.485.gad05a3d8e5


* [Intel-gfx] [PATCH v3 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro
  2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed Stanislav Lisovskiy
@ 2020-03-30 12:23 ` Stanislav Lisovskiy
  2020-03-30 16:07   ` Ville Syrjälä
  2020-04-07  7:38   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs Stanislav Lisovskiy
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-03-30 12:23 UTC (permalink / raw)
  To: intel-gfx

We now quite often need to iterate over only the dbuf slices set in a
mask, whether they are the active ones or those related to a particular crtc.

Let's make our life a bit easier and use a macro for that.
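
A standalone sketch of the iteration pattern, assuming a two-slice
configuration. The sketch_* names are hypothetical; the
"if (!(cond)) {} else" form mirrors what the kernel's for_each_if()
expands to:

```c
#include <assert.h>

#define SKETCH_MAX_DBUF_SLICES 2

/* Visit every slice index below SKETCH_MAX_DBUF_SLICES whose bit is set in mask. */
#define sketch_for_each_dbuf_slice_in_mask(slice, mask) \
	for ((slice) = 0; (slice) < SKETCH_MAX_DBUF_SLICES; (slice)++) \
		if (!((1u << (slice)) & (mask))) {} else

/* Sum the per-slice values selected by mask. */
int sketch_sum_selected_slices(const int *per_slice, unsigned int mask)
{
	int slice;
	int sum = 0;

	assert(per_slice);

	sketch_for_each_dbuf_slice_in_mask(slice, mask)
		sum += per_slice[slice];

	return sum;
}
```

The trailing "else" lets the macro take a single statement or a braced
block as its body without capturing a following "else" by accident.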

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.h       | 7 +++++++
 drivers/gpu/drm/i915/display/intel_display_power.h | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index adb1225a3480..c898285f0dc3 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -187,6 +187,13 @@ enum plane_id {
 	for ((__p) = PLANE_PRIMARY; (__p) < I915_MAX_PLANES; (__p)++) \
 		for_each_if((__crtc)->plane_ids_mask & BIT(__p))
 
+#define for_each_dbuf_slice_in_mask(__slice, __mask) \
+	for ((__slice) = 0; (__slice) < I915_MAX_DBUF_SLICES; (__slice)++) \
+		for_each_if((1 << (__slice)) & (__mask))
+
+#define for_each_dbuf_slice(__slice) \
+	for_each_dbuf_slice_in_mask(__slice, (1 << I915_MAX_DBUF_SLICES) - 1)
+
 enum port {
 	PORT_NONE = -1,
 
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index da64a5edae7a..468e8fb0203a 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -311,8 +311,11 @@ intel_display_power_put_async(struct drm_i915_private *i915,
 enum dbuf_slice {
 	DBUF_S1,
 	DBUF_S2,
+	DBUF_SLICE_MAX
 };
 
+#define I915_DBUF_MAX_SLICES DBUF_SLICE_MAX
+
 #define with_intel_display_power(i915, domain, wf) \
 	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
 	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
-- 
2.24.1.485.gad05a3d8e5


* [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
  2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
                   ` (2 preceding siblings ...)
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro Stanislav Lisovskiy
@ 2020-03-30 12:23 ` Stanislav Lisovskiy
  2020-03-30 17:07   ` Ville Syrjälä
  2020-04-07  7:39   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 5/5] drm/i915: Remove unneeded hack now for CDCLK Stanislav Lisovskiy
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-03-30 12:23 UTC (permalink / raw)
  To: intel-gfx

According to BSpec, the max BW per slice is calculated using the formula
Max BW = CDCLK * 64. Currently, when calculating min CDCLK we
account only for per-plane requirements; however, in order to avoid
FIFO underruns we need to estimate the accumulated BW consumed by
all planes (basically DDB entries) residing on that particular
DBuf slice. This will allow us to lower CDCLK and save power
when we don't need that much bandwidth, or gain additional
performance once plane consumption grows.

v2: - Fix long line warning
    - Limit the new DBuf bw checks to gens >= 11 only

v3: - Track the used DBuf bw per slice and per crtc in the bw state
      (or maybe in a DBuf state in the future); that way we don't need
      to have all crtcs in the state, and pull those in only if we
      detect that we are actually going to change cdclk, the same way
      as we do with other stuff, i.e. intel_atomic_serialize_global_state
      and co. Just as per Ville's paradigm.
    - Make the dbuf bw calculation procedure look nicer by introducing
      for_each_dbuf_slice_in_mask - we will now often need to iterate
      over slices using a mask.
    - According to experimental results, CDCLK * 64 accounts for the
      overall bandwidth across all dbufs, not per dbuf.
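
The calculation described above can be sketched roughly as follows.
This is a simplified standalone illustration, not the driver code: the
sketch_* names, the fixed array sizes and the assumption that the DDB
is split into equal slices are all made up for the example. It keeps
the truncating division by 64 that the diff below uses:

```c
#include <assert.h>

#define SKETCH_MAX_DBUF_SLICES 2
#define SKETCH_MAX_PIPES 4

/* A DDB allocation covering blocks [start, end). */
struct sketch_ddb_entry {
	unsigned int start;
	unsigned int end;
};

/* Mask of slices touched by entry, for a DDB split into equal slices. */
unsigned int sketch_slice_mask(const struct sketch_ddb_entry *entry,
			       unsigned int ddb_size, unsigned int num_slices)
{
	unsigned int slice_size, slice, last, mask = 0;

	assert(num_slices > 0 && num_slices <= SKETCH_MAX_DBUF_SLICES);
	assert(ddb_size >= num_slices);

	if (entry->end <= entry->start)
		return 0;	/* an empty allocation touches no slice */

	slice_size = ddb_size / num_slices;
	last = (entry->end - 1) / slice_size;

	/* An entry is contiguous, so it covers every slice from first to last. */
	for (slice = entry->start / slice_size; slice <= last; slice++)
		mask |= 1u << slice;

	return mask;
}

/* Sum the bandwidth over all pipes and slices, then apply Max BW = CDCLK * 64. */
int sketch_min_cdclk(int dbuf_bw[SKETCH_MAX_PIPES][SKETCH_MAX_DBUF_SLICES])
{
	int pipe, slice, total_bw = 0;

	for (slice = 0; slice < SKETCH_MAX_DBUF_SLICES; slice++)
		for (pipe = 0; pipe < SKETCH_MAX_PIPES; pipe++)
			total_bw += dbuf_bw[pipe][slice];

	return total_bw / 64;
}
```

With a 2048 block DDB and two slices, an entry covering blocks
512-1535 touches both slices, so its data rate would be added to both
per-slice totals before the division.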

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c       | 61 ++++++++++++++++++-
 drivers/gpu/drm/i915/display/intel_bw.h       |  8 +++
 drivers/gpu/drm/i915/display/intel_cdclk.c    | 25 ++++++++
 drivers/gpu/drm/i915/display/intel_display.c  |  8 +++
 .../drm/i915/display/intel_display_power.h    |  2 +
 drivers/gpu/drm/i915/intel_pm.c               | 34 ++++++++++-
 drivers/gpu/drm/i915/intel_pm.h               |  3 +
 7 files changed, 138 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 573a1c206b60..e9d65820fb76 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -6,6 +6,7 @@
 #include <drm/drm_atomic_state_helper.h>
 
 #include "intel_bw.h"
+#include "intel_pm.h"
 #include "intel_display_types.h"
 #include "intel_sideband.h"
 #include "intel_atomic.h"
@@ -338,7 +339,6 @@ static unsigned int intel_bw_crtc_data_rate(const struct intel_crtc_state *crtc_
 
 	return data_rate;
 }
-
 void intel_bw_crtc_update(struct intel_bw_state *bw_state,
 			  const struct intel_crtc_state *crtc_state)
 {
@@ -419,6 +419,65 @@ intel_atomic_bw_get_state(struct intel_atomic_state *state)
 	return to_intel_bw_state(bw_state);
 }
 
+int intel_bw_calc_min_cdclk(struct intel_atomic_state *state)
+{
+	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
+	int i = 0;
+	enum plane_id plane_id;
+	struct intel_crtc_state *crtc_state;
+	struct intel_crtc *crtc;
+	int max_bw = 0;
+	int min_cdclk;
+	enum pipe pipe;
+	struct intel_bw_state *bw_state;
+	int slice_id = 0;
+
+	bw_state = intel_atomic_bw_get_state(state);
+
+	if (IS_ERR(bw_state))
+		return PTR_ERR(bw_state);
+
+	for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
+		struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[crtc->pipe];
+
+		memset(&crtc_bw->dbuf_bw, 0, sizeof(crtc_bw->dbuf_bw));
+
+		for_each_plane_id_on_crtc(crtc, plane_id) {
+			struct skl_ddb_entry *plane_alloc =
+				&crtc_state->wm.skl.plane_ddb_y[plane_id];
+			struct skl_ddb_entry *uv_plane_alloc =
+				&crtc_state->wm.skl.plane_ddb_uv[plane_id];
+			unsigned int data_rate = crtc_state->data_rate[plane_id];
+
+			unsigned int dbuf_mask = skl_ddb_dbuf_slice_mask(dev_priv, plane_alloc);
+
+			dbuf_mask |= skl_ddb_dbuf_slice_mask(dev_priv, uv_plane_alloc);
+
+			DRM_DEBUG_KMS("Got dbuf mask %x for pipe %c ddb %d-%d plane %d data rate %d\n",
+				      dbuf_mask, pipe_name(crtc->pipe), plane_alloc->start,
+				      plane_alloc->end, plane_id, data_rate);
+
+			for_each_dbuf_slice_in_mask(slice_id, dbuf_mask)
+				crtc_bw->dbuf_bw[slice_id] += data_rate;
+		}
+	}
+
+	for_each_dbuf_slice(slice_id) {
+		int total_bw_per_slice = 0;
+
+		for_each_pipe(dev_priv, pipe) {
+			struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[pipe];
+
+			total_bw_per_slice += crtc_bw->dbuf_bw[slice_id];
+		}
+		max_bw += total_bw_per_slice;
+	}
+
+	min_cdclk = max_bw / 64;
+
+	return min_cdclk;
+}
+
 int intel_bw_atomic_check(struct intel_atomic_state *state)
 {
 	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
diff --git a/drivers/gpu/drm/i915/display/intel_bw.h b/drivers/gpu/drm/i915/display/intel_bw.h
index 9a5627be6876..d2b5f32b0791 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.h
+++ b/drivers/gpu/drm/i915/display/intel_bw.h
@@ -10,11 +10,16 @@
 
 #include "intel_display.h"
 #include "intel_global_state.h"
+#include "intel_display_power.h"
 
 struct drm_i915_private;
 struct intel_atomic_state;
 struct intel_crtc_state;
 
+struct intel_crtc_bw {
+	int dbuf_bw[I915_MAX_DBUF_SLICES];
+};
+
 struct intel_bw_state {
 	struct intel_global_state base;
 
@@ -31,6 +36,8 @@ struct intel_bw_state {
 	 */
 	u8 qgv_points_mask;
 
+	struct intel_crtc_bw dbuf_bw_used[I915_MAX_PIPES];
+
 	unsigned int data_rate[I915_MAX_PIPES];
 	u8 num_active_planes[I915_MAX_PIPES];
 };
@@ -53,5 +60,6 @@ void intel_bw_crtc_update(struct intel_bw_state *bw_state,
 			  const struct intel_crtc_state *crtc_state);
 int icl_pcode_restrict_qgv_points(struct drm_i915_private *dev_priv,
 				  u32 points_mask);
+int intel_bw_calc_min_cdclk(struct intel_atomic_state *state);
 
 #endif /* __INTEL_BW_H__ */
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 979a0241fdcb..036774e7f3ec 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -25,6 +25,7 @@
 #include "intel_cdclk.h"
 #include "intel_display_types.h"
 #include "intel_sideband.h"
+#include "intel_bw.h"
 
 /**
  * DOC: CDCLK / RAWCLK
@@ -2001,11 +2002,19 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
 {
 	struct drm_i915_private *dev_priv =
 		to_i915(crtc_state->uapi.crtc->dev);
+	struct intel_atomic_state *state = NULL;
 	int min_cdclk;
 
 	if (!crtc_state->hw.enable)
 		return 0;
 
+	/*
+	 * FIXME: Unfortunately when this gets called from intel_modeset_setup_hw_state
+	 * there is no intel_atomic_state at all. So lets not then use it.
+	 */
+	if (crtc_state->uapi.state)
+		state = to_intel_atomic_state(crtc_state->uapi.state);
+
 	min_cdclk = intel_pixel_rate_to_cdclk(crtc_state);
 
 	/* pixel rate mustn't exceed 95% of cdclk with IPS on BDW */
@@ -2080,6 +2089,22 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
 	if (IS_TIGERLAKE(dev_priv))
 		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
 
+	/*
+	 * Similar story as with skl_write_plane_wm and intel_enable_sagv
+	 * - in some certain driver parts, we don't have any guarantee that
+	 * parent exists. So we might be having a crtc_state without
+	 * parent state.
+	 */
+	if (INTEL_GEN(dev_priv) >= 11) {
+		if (state) {
+			int dbuf_bw_cdclk = intel_bw_calc_min_cdclk(state);
+
+			DRM_DEBUG_KMS("DBuf bw min cdclk %d current min_cdclk %d\n",
+				      dbuf_bw_cdclk, min_cdclk);
+			min_cdclk = max(min_cdclk, dbuf_bw_cdclk);
+		}
+	}
+
 	if (min_cdclk > dev_priv->max_cdclk_freq) {
 		drm_dbg_kms(&dev_priv->drm,
 			    "required cdclk (%d kHz) exceeds max (%d kHz)\n",
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 9fd32d61ebfe..fa2870c0d7fd 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -14678,6 +14678,14 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
 		if (old_active_planes == new_active_planes)
 			continue;
 
+		/*
+		 * active_planes bitmask has been updated, whenever amount
+		 * of active planes had changed we need to recalculate CDCLK
+		 * as it depends on total bandwidth now, not only min_cdclk
+		 * per plane.
+		 */
+		*need_cdclk_calc = true;
+
 		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index 468e8fb0203a..9e33fb90422f 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -308,6 +308,8 @@ intel_display_power_put_async(struct drm_i915_private *i915,
 }
 #endif
 
+#define I915_MAX_DBUF_SLICES 2
+
 enum dbuf_slice {
 	DBUF_S1,
 	DBUF_S2,
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 551933e3f7da..5dcd1cd09ad7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4055,10 +4055,9 @@ icl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
 	return offset;
 }
 
-static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
+u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
 {
 	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
-
 	drm_WARN_ON(&dev_priv->drm, ddb_size == 0);
 
 	if (INTEL_GEN(dev_priv) < 11)
@@ -4067,6 +4066,37 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
 	return ddb_size;
 }
 
+u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
+			    const struct skl_ddb_entry *entry)
+{
+	u32 slice_mask = 0;
+	u16 ddb_size = intel_get_ddb_size(dev_priv);
+	u16 num_supported_slices = INTEL_INFO(dev_priv)->num_supported_dbuf_slices;
+	u16 slice_size = ddb_size / num_supported_slices;
+	u16 start_slice;
+	u16 end_slice;
+
+	if (!skl_ddb_entry_size(entry))
+		return 0;
+
+	start_slice = entry->start / slice_size;
+	end_slice = (entry->end - 1) / slice_size;
+
+	DRM_DEBUG_KMS("ddb size %d slices %d slice size %d start slice %d end slice %d\n",
+		      ddb_size, num_supported_slices, slice_size, start_slice, end_slice);
+
+	/*
+	 * In the worst case a per-plane DDB entry can span multiple slices,
+	 * but a single entry is always contiguous.
+	 */
+	while (start_slice <= end_slice) {
+		slice_mask |= 1 << start_slice;
+		start_slice++;
+	}
+
+	return slice_mask;
+}
+
 static u8 skl_compute_dbuf_slices(const struct intel_crtc_state *crtc_state,
 				  u8 active_pipes);
 
diff --git a/drivers/gpu/drm/i915/intel_pm.h b/drivers/gpu/drm/i915/intel_pm.h
index 069515f04170..41c61ad71ce6 100644
--- a/drivers/gpu/drm/i915/intel_pm.h
+++ b/drivers/gpu/drm/i915/intel_pm.h
@@ -37,6 +37,9 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
 			       struct skl_ddb_entry *ddb_y,
 			       struct skl_ddb_entry *ddb_uv);
 void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv);
+u16 intel_get_ddb_size(struct drm_i915_private *dev_priv);
+u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
+			    const struct skl_ddb_entry *entry);
 void skl_pipe_wm_get_hw_state(struct intel_crtc *crtc,
 			      struct skl_pipe_wm *out);
 void g4x_wm_sanitize(struct drm_i915_private *dev_priv);
-- 
2.24.1.485.gad05a3d8e5


* [Intel-gfx] [PATCH v3 5/5] drm/i915: Remove unneeded hack now for CDCLK
  2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
                   ` (3 preceding siblings ...)
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs Stanislav Lisovskiy
@ 2020-03-30 12:23 ` Stanislav Lisovskiy
  2020-03-30 18:07 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev3) Patchwork
  2020-04-07  8:03 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev7) Patchwork
  6 siblings, 0 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-03-30 12:23 UTC (permalink / raw)
  To: intel-gfx

There is no need to bump up CDCLK anymore, as it is now calculated
correctly, accounting for DBuf BW as BSpec says.

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_cdclk.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 036774e7f3ec..13e7ea6f471e 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -2077,18 +2077,6 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
 	/* Account for additional needs from the planes */
 	min_cdclk = max(intel_planes_min_cdclk(crtc_state), min_cdclk);
 
-	/*
-	 * HACK. Currently for TGL platforms we calculate
-	 * min_cdclk initially based on pixel_rate divided
-	 * by 2, accounting for also plane requirements,
-	 * however in some cases the lowest possible CDCLK
-	 * doesn't work and causing the underruns.
-	 * Explicitly stating here that this seems to be currently
-	 * rather a Hack, than final solution.
-	 */
-	if (IS_TIGERLAKE(dev_priv))
-		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
-
 	/*
 	 * Similar story as with skl_write_plane_wm and intel_enable_sagv
 	 * - in some certain driver parts, we don't have any guarantee that
-- 
2.24.1.485.gad05a3d8e5


* Re: [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks Stanislav Lisovskiy
@ 2020-03-30 15:44   ` Ville Syrjälä
  2020-04-07  7:33   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  1 sibling, 0 replies; 20+ messages in thread
From: Ville Syrjälä @ 2020-03-30 15:44 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

On Mon, Mar 30, 2020 at 03:23:50PM +0300, Stanislav Lisovskiy wrote:
> We need to calculate cdclk after watermarks/ddb has been calculated
> as with recent hw CDCLK needs to be adjusted accordingly to DBuf
> requirements, which is not possible with current code organization.
> 
> Setting CDCLK according to DBuf BW requirements and not just rejecting
> if it doesn't satisfy BW requirements, will allow us to save power when
> it is possible and gain additional bandwidth when it's needed - i.e.
> boosting both our power management and performance capabilities.
> 
> This patch is preparation for that, first we now extract modeset
> calculation from modeset checks, in order to call it after wm/ddb
> has been calculated.
> 
> v2: - Extract only intel_modeset_calc_cdclk from intel_modeset_checks
>       (Ville Syrjälä)
> 
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 7c45d676c9b7..17d83f37f49f 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -14545,10 +14545,6 @@ static int intel_modeset_checks(struct intel_atomic_state *state)
>  			return ret;
>  	}
>  
> -	ret = intel_modeset_calc_cdclk(state);
> -	if (ret)
> -		return ret;
> -
>  	intel_modeset_clear_plls(state);
>  
>  	if (IS_HASWELL(dev_priv))
> @@ -14882,10 +14878,6 @@ static int intel_atomic_check(struct drm_device *dev,
>  			goto fail;
>  	}
>  
> -	ret = intel_atomic_check_crtcs(state);
> -	if (ret)
> -		goto fail;
> -
>  	intel_fbc_choose_crtc(dev_priv, state);
>  	ret = calc_watermark_data(state);
>  	if (ret)
> @@ -14895,6 +14887,16 @@ static int intel_atomic_check(struct drm_device *dev,
>  	if (ret)
>  		goto fail;
>  
> +	if (any_ms) {
> +		ret = intel_modeset_calc_cdclk(state);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	ret = intel_atomic_check_crtcs(state);
> +	if (ret)
> +		goto fail;

I was thinking we'd do this as two patches. One with just the
extraction, and another one with the bigger reordering. But I think I
convinced myself that it should be safe, so maybe a single patch is
fine.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

> +
>  	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
>  					    new_crtc_state, i) {
>  		if (!needs_modeset(new_crtc_state) &&
> -- 
> 2.24.1.485.gad05a3d8e5

-- 
Ville Syrjälä
Intel

* Re: [Intel-gfx] [PATCH v3 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro Stanislav Lisovskiy
@ 2020-03-30 16:07   ` Ville Syrjälä
  2020-04-07  7:38   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  1 sibling, 0 replies; 20+ messages in thread
From: Ville Syrjälä @ 2020-03-30 16:07 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

On Mon, Mar 30, 2020 at 03:23:52PM +0300, Stanislav Lisovskiy wrote:
> We quite often need now to iterate only particular dbuf slices
> in mask, whether they are active or related to particular crtc.
> 
> Let's make our life a bit easier and use a macro for that.
> 
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.h       | 7 +++++++
>  drivers/gpu/drm/i915/display/intel_display_power.h | 3 +++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
> index adb1225a3480..c898285f0dc3 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.h
> +++ b/drivers/gpu/drm/i915/display/intel_display.h
> @@ -187,6 +187,13 @@ enum plane_id {
>  	for ((__p) = PLANE_PRIMARY; (__p) < I915_MAX_PLANES; (__p)++) \
>  		for_each_if((__crtc)->plane_ids_mask & BIT(__p))
>  
> +#define for_each_dbuf_slice_in_mask(__slice, __mask) \

Please stick to established conventions.

> +	for ((__slice) = 0; (__slice) < I915_MAX_DBUF_SLICES; (__slice)++) \
> +		for_each_if((1 << (__slice)) & (__mask))
> +
> +#define for_each_dbuf_slice(__slice) \
> +	for_each_dbuf_slice_in_mask(__slice, (1 << I915_MAX_DBUF_SLICES) - 1)
> +
>  enum port {
>  	PORT_NONE = -1,
>  
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> index da64a5edae7a..468e8fb0203a 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> @@ -311,8 +311,11 @@ intel_display_power_put_async(struct drm_i915_private *i915,
>  enum dbuf_slice {
>  	DBUF_S1,
>  	DBUF_S2,
> +	DBUF_SLICE_MAX
>  };
>  
> +#define I915_DBUF_MAX_SLICES DBUF_SLICE_MAX
> +

Huh?

>  #define with_intel_display_power(i915, domain, wf) \
>  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
>  	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
> -- 
> 2.24.1.485.gad05a3d8e5

-- 
Ville Syrjälä
Intel

* Re: [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed Stanislav Lisovskiy
@ 2020-03-30 16:18   ` Ville Syrjälä
  2020-03-30 17:56     ` Lisovskiy, Stanislav
  2020-04-07  7:36   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  1 sibling, 1 reply; 20+ messages in thread
From: Ville Syrjälä @ 2020-03-30 16:18 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

On Mon, Mar 30, 2020 at 03:23:51PM +0300, Stanislav Lisovskiy wrote:
> In Gen11+ whenever we might exceed DBuf bandwidth we might need to
> recalculate CDCLK which DBuf bandwidth is scaled with.
> Total Dbuf bw used might change based on particular plane needs.
> 
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 17d83f37f49f..9fd32d61ebfe 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -14623,7 +14623,7 @@ static bool active_planes_affects_min_cdclk(struct drm_i915_private *dev_priv)
>  	/* See {hsw,vlv,ivb}_plane_ratio() */
>  	return IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv) ||
>  		IS_CHERRYVIEW(dev_priv) || IS_VALLEYVIEW(dev_priv) ||
> -		IS_IVYBRIDGE(dev_priv);
> +		IS_IVYBRIDGE(dev_priv) || (INTEL_GEN(dev_priv) >= 11);
>  }
>  
>  static int intel_atomic_check_planes(struct intel_atomic_state *state,
> @@ -14669,7 +14669,13 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
>  		old_active_planes = old_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
>  		new_active_planes = new_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
>  
> -		if (hweight8(old_active_planes) == hweight8(new_active_planes))
> +		/*
> +		 * Not only the number of planes, but if the plane configuration had
> +		 * changed might already mean we need to recompute min CDCLK,
> +		 * because different planes might consume different amount of Dbuf bandwidth
> +		 * according to formula: Bw per plane = Pixel rate * bpp * pipe/plane scale factor
> +		 */

The set of active planes doesn't dictate the bandwidth since it
doesn't consider most of the parameters you listed above. I.e. this
doesn't seem to be the right place for this stuff.

The decision to bump the cdclk should really come from the dbuf code
not from some totally distinct piece of code. But to get this right
I have a feeling we'll need the dbuf state and totally decouple cdclk
from any_ms.

> +		if (old_active_planes == new_active_planes)
>  			continue;
>  
>  		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
> -- 
> 2.24.1.485.gad05a3d8e5

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs Stanislav Lisovskiy
@ 2020-03-30 17:07   ` Ville Syrjälä
  2020-03-30 18:16     ` Lisovskiy, Stanislav
  2020-04-07  7:39   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
  1 sibling, 1 reply; 20+ messages in thread
From: Ville Syrjälä @ 2020-03-30 17:07 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

On Mon, Mar 30, 2020 at 03:23:53PM +0300, Stanislav Lisovskiy wrote:
> According to BSpec max BW per slice is calculated using formula
> Max BW = CDCLK * 64. Currently when calculating min CDCLK we
> account only per plane requirements, however in order to avoid
> FIFO underruns we need to estimate accumulated BW consumed by
> all planes(ddb entries basically) residing on that particular
> DBuf slice. This will allow us to put CDCLK lower and save power
> when we don't need that much bandwidth or gain additional
> performance once plane consumption grows.
> 
> v2: - Fix long line warning
>     - Limited new DBuf bw checks to only gens >= 11
> 
> v3: - Lets track used Dbuf bw per slice and per crtc in bw state
>       (or may be in DBuf state in future), that way we don't need
>       to have all crtcs in state and those only if we detect if
>       are actually going to change cdclk, just same way as we
>       do with other stuff, i.e intel_atomic_serialize_global_state
>       and co. Just as per Ville's paradigm.
>     - Made dbuf bw calculation procedure look nicer by introducing
>       for_each_dbuf_slice_in_mask - we often will now need to iterate
>       slices using mask.
>     - According to experimental results CDCLK * 64 accounts for
>       overall bandwidth across all dbufs, not per dbuf.
> 
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_bw.c       | 61 ++++++++++++++++++-
>  drivers/gpu/drm/i915/display/intel_bw.h       |  8 +++
>  drivers/gpu/drm/i915/display/intel_cdclk.c    | 25 ++++++++
>  drivers/gpu/drm/i915/display/intel_display.c  |  8 +++
>  .../drm/i915/display/intel_display_power.h    |  2 +
>  drivers/gpu/drm/i915/intel_pm.c               | 34 ++++++++++-
>  drivers/gpu/drm/i915/intel_pm.h               |  3 +
>  7 files changed, 138 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
> index 573a1c206b60..e9d65820fb76 100644
> --- a/drivers/gpu/drm/i915/display/intel_bw.c
> +++ b/drivers/gpu/drm/i915/display/intel_bw.c
> @@ -6,6 +6,7 @@
>  #include <drm/drm_atomic_state_helper.h>
>  
>  #include "intel_bw.h"
> +#include "intel_pm.h"
>  #include "intel_display_types.h"
>  #include "intel_sideband.h"
>  #include "intel_atomic.h"
> @@ -338,7 +339,6 @@ static unsigned int intel_bw_crtc_data_rate(const struct intel_crtc_state *crtc_
>  
>  	return data_rate;
>  }
> -
>  void intel_bw_crtc_update(struct intel_bw_state *bw_state,
>  			  const struct intel_crtc_state *crtc_state)
>  {
> @@ -419,6 +419,65 @@ intel_atomic_bw_get_state(struct intel_atomic_state *state)
>  	return to_intel_bw_state(bw_state);
>  }
>  
> +int intel_bw_calc_min_cdclk(struct intel_atomic_state *state)
> +{
> +	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
> +	int i = 0;
> +	enum plane_id plane_id;
> +	struct intel_crtc_state *crtc_state;

const

> +	struct intel_crtc *crtc;
> +	int max_bw = 0;
> +	int min_cdclk;
> +	enum pipe pipe;
> +	struct intel_bw_state *bw_state;
> +	int slice_id = 0;

Bunch of needless initialization, needlessly wide scope, etc.

> +
> +	bw_state = intel_atomic_bw_get_state(state);
> +

Spurious whitespace.

> +	if (IS_ERR(bw_state))
> +		return PTR_ERR(bw_state);
> +
> +	for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
> +		struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[crtc->pipe];
> +
> +		memset(&crtc_bw->dbuf_bw, 0, sizeof(crtc_bw->dbuf_bw));
> +
> +		for_each_plane_id_on_crtc(crtc, plane_id) {
> +			struct skl_ddb_entry *plane_alloc =
> +				&crtc_state->wm.skl.plane_ddb_y[plane_id];
> +			struct skl_ddb_entry *uv_plane_alloc =
> +				&crtc_state->wm.skl.plane_ddb_uv[plane_id];

const

> +			unsigned int data_rate = crtc_state->data_rate[plane_id];
> +

more strange whitespace

> +			unsigned int dbuf_mask = skl_ddb_dbuf_slice_mask(dev_priv, plane_alloc);
> +
> +			dbuf_mask |= skl_ddb_dbuf_slice_mask(dev_priv, uv_plane_alloc);

Looks bad when you initialize part of it in declaration and the rest
later.

> +
> +			DRM_DEBUG_KMS("Got dbuf mask %x for pipe %c ddb %d-%d plane %d data rate %d\n",
> +				      dbuf_mask, pipe_name(crtc->pipe), plane_alloc->start,
> +				      plane_alloc->end, plane_id, data_rate);
> +
> +			for_each_dbuf_slice_in_mask(slice_id, dbuf_mask)
> +				crtc_bw->dbuf_bw[slice_id] += data_rate;

This doesn't feel quite right for planar formats.

For pre-icl it works by accident since we only have the one slice so
we don't end up accounting for the full bandwidth from both color planes
to multiple slices. If we had multiple slices and chroma and luma had
been allocated to different slices we'd count the same thing multiple
times.

For icl+ we seem to assign the full data rate to the UV plane's slice(s)
since only the UV plane has data_rate[] != 0.
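
A minimal sketch of the double counting described above, in userspace C
with hypothetical names (account_union, account_split); it is not i915
code. It assumes two slices and a planar plane whose luma allocation sits
in slice 0 and whose chroma allocation sits in slice 1. The 2:1 luma/chroma
rate split is an assumption chosen to resemble NV12.

```c
#include <assert.h>

#define MAX_SLICES 2

/*
 * v3-style accounting: the plane's full data rate is added to every
 * slice touched by either color plane (the union of both masks).
 */
static void account_union(int bw[MAX_SLICES], unsigned int y_mask,
			  unsigned int uv_mask, int rate)
{
	unsigned int mask = y_mask | uv_mask;

	for (int slice = 0; slice < MAX_SLICES; slice++)
		if (mask & (1u << slice))
			bw[slice] += rate;
}

/*
 * Hypothetical alternative: each color plane carries its own rate and
 * only charges the slices its own allocation touches.
 */
static void account_split(int bw[MAX_SLICES], unsigned int y_mask, int y_rate,
			  unsigned int uv_mask, int uv_rate)
{
	for (int slice = 0; slice < MAX_SLICES; slice++) {
		if (y_mask & (1u << slice))
			bw[slice] += y_rate;
		if (uv_mask & (1u << slice))
			bw[slice] += uv_rate;
	}
}
```

With a total rate of 300 (200 luma + 100 chroma), the union variant
charges 300 to each slice, 600 overall, while the split variant charges
200 and 100. A single color plane spanning both slices is still charged
in full to each slice in both variants.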

> +		}
> +	}
> +
> +	for_each_dbuf_slice(slice_id) {
> +		int total_bw_per_slice = 0;
> +
> +		for_each_pipe(dev_priv, pipe) {
> +			struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[pipe];
> +
> +			total_bw_per_slice += crtc_bw->dbuf_bw[slice_id];
> +		}
> +		max_bw += total_bw_per_slice;

So we're aggregating all the bw instead of per-slice? Is this based on
the other mail you sent? Deserves a comment explaining why we do such
odd things.
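
To make the difference concrete, here is a small userspace C sketch with
hypothetical names (struct crtc_bw, min_cdclk_per_slice, min_cdclk_total);
it is not the driver code. It contrasts reading "Max BW = CDCLK * 64" as
a per-slice limit with reading it as a limit on the sum over all slices,
which is what the loop above computes. Array sizes and units are
assumptions.

```c
#include <assert.h>

#define MAX_SLICES 2
#define MAX_PIPES 4

/* Stand-in for the per-crtc bookkeeping in the patch. */
struct crtc_bw {
	int dbuf_bw[MAX_SLICES];
};

/* CDCLK * 64 limits each slice on its own: the busiest slice decides. */
static int min_cdclk_per_slice(const struct crtc_bw *bw)
{
	int max_bw = 0;

	for (int slice = 0; slice < MAX_SLICES; slice++) {
		int slice_bw = 0;

		for (int pipe = 0; pipe < MAX_PIPES; pipe++)
			slice_bw += bw[pipe].dbuf_bw[slice];
		if (slice_bw > max_bw)
			max_bw = slice_bw;
	}

	return max_bw / 64;
}

/* CDCLK * 64 limits all slices combined, as in the v3 patch. */
static int min_cdclk_total(const struct crtc_bw *bw)
{
	int total_bw = 0;

	for (int slice = 0; slice < MAX_SLICES; slice++)
		for (int pipe = 0; pipe < MAX_PIPES; pipe++)
			total_bw += bw[pipe].dbuf_bw[slice];

	return total_bw / 64;
}
```

For one pipe using 64000 on slice 0 and another using 128000 on slice 1,
the per-slice reading yields 2000 and the aggregate reading yields 3000,
so the aggregate is always at least as conservative.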

> +	}
> +
> +	min_cdclk = max_bw / 64;
> +
> +	return min_cdclk;
> +}
> +
>  int intel_bw_atomic_check(struct intel_atomic_state *state)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
> diff --git a/drivers/gpu/drm/i915/display/intel_bw.h b/drivers/gpu/drm/i915/display/intel_bw.h
> index 9a5627be6876..d2b5f32b0791 100644
> --- a/drivers/gpu/drm/i915/display/intel_bw.h
> +++ b/drivers/gpu/drm/i915/display/intel_bw.h
> @@ -10,11 +10,16 @@
>  
>  #include "intel_display.h"
>  #include "intel_global_state.h"
> +#include "intel_display_power.h"
>  
>  struct drm_i915_private;
>  struct intel_atomic_state;
>  struct intel_crtc_state;
>  
> +struct intel_crtc_bw {
> +	int dbuf_bw[I915_MAX_DBUF_SLICES];
> +};
> +
>  struct intel_bw_state {
>  	struct intel_global_state base;
>  
> @@ -31,6 +36,8 @@ struct intel_bw_state {
>  	 */
>  	u8 qgv_points_mask;
>  
> +	struct intel_crtc_bw dbuf_bw_used[I915_MAX_PIPES];

The name of the struct isn't very good if it just contains the
dbuf bw numbers.

> +
>  	unsigned int data_rate[I915_MAX_PIPES];
>  	u8 num_active_planes[I915_MAX_PIPES];

Maybe collect all this to the per-crtc struct?

>  };
> @@ -53,5 +60,6 @@ void intel_bw_crtc_update(struct intel_bw_state *bw_state,
>  			  const struct intel_crtc_state *crtc_state);
>  int icl_pcode_restrict_qgv_points(struct drm_i915_private *dev_priv,
>  				  u32 points_mask);
> +int intel_bw_calc_min_cdclk(struct intel_atomic_state *state);
>  
>  #endif /* __INTEL_BW_H__ */
> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
> index 979a0241fdcb..036774e7f3ec 100644
> --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> @@ -25,6 +25,7 @@
>  #include "intel_cdclk.h"
>  #include "intel_display_types.h"
>  #include "intel_sideband.h"
> +#include "intel_bw.h"
>  
>  /**
>   * DOC: CDCLK / RAWCLK
> @@ -2001,11 +2002,19 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
>  {
>  	struct drm_i915_private *dev_priv =
>  		to_i915(crtc_state->uapi.crtc->dev);
> +	struct intel_atomic_state *state = NULL;
>  	int min_cdclk;
>  
>  	if (!crtc_state->hw.enable)
>  		return 0;
>  
> +	/*
> +	 * FIXME: Unfortunately when this gets called from intel_modeset_setup_hw_state
> +	 * there is no intel_atomic_state at all. So lets not then use it.
> +	 */
> +	if (crtc_state->uapi.state)
> +		state = to_intel_atomic_state(crtc_state->uapi.state);

This still indicates that either this isn't the right place to call this
or we have the state stored in the wrong place.

I think I'd just move the thing into intel_compute_min_cdclk() as a start.

> +
>  	min_cdclk = intel_pixel_rate_to_cdclk(crtc_state);
>  
>  	/* pixel rate mustn't exceed 95% of cdclk with IPS on BDW */
> @@ -2080,6 +2089,22 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
>  	if (IS_TIGERLAKE(dev_priv))
>  		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
>  
> +	/*
> +	 * Similar story as with skl_write_plane_wm and intel_enable_sagv
> +	 * - in some certain driver parts, we don't have any guarantee that
> +	 * parent exists. So we might be having a crtc_state without
> +	 * parent state.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11) {
> +		if (state) {
> +			int dbuf_bw_cdclk = intel_bw_calc_min_cdclk(state);
> +
> +			DRM_DEBUG_KMS("DBuf bw min cdclk %d current min_cdclk %d\n",
> +				      dbuf_bw_cdclk, min_cdclk);
> +			min_cdclk = max(min_cdclk, dbuf_bw_cdclk);
> +		}
> +	}
> +
>  	if (min_cdclk > dev_priv->max_cdclk_freq) {
>  		drm_dbg_kms(&dev_priv->drm,
>  			    "required cdclk (%d kHz) exceeds max (%d kHz)\n",
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 9fd32d61ebfe..fa2870c0d7fd 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -14678,6 +14678,14 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
>  		if (old_active_planes == new_active_planes)
>  			continue;
>  
> +		/*
> +		 * active_planes bitmask has been updated, whenever amount
> +		 * of active planes had changed we need to recalculate CDCLK
> +		 * as it depends on total bandwidth now, not only min_cdclk
> +		 * per plane.
> +		 */
> +		*need_cdclk_calc = true;
> +
>  		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
>  		if (ret)
>  			return ret;
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> index 468e8fb0203a..9e33fb90422f 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> @@ -308,6 +308,8 @@ intel_display_power_put_async(struct drm_i915_private *i915,
>  }
>  #endif
>  
> +#define I915_MAX_DBUF_SLICES 2
> +
>  enum dbuf_slice {
>  	DBUF_S1,
>  	DBUF_S2,
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 551933e3f7da..5dcd1cd09ad7 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -4055,10 +4055,9 @@ icl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
>  	return offset;
>  }
>  
> -static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
> +u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
>  {
>  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> -
>  	drm_WARN_ON(&dev_priv->drm, ddb_size == 0);
>  
>  	if (INTEL_GEN(dev_priv) < 11)
> @@ -4067,6 +4066,37 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
>  	return ddb_size;
>  }
>  
> +u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
> +			    const struct skl_ddb_entry *entry)
> +{
> +	u32 slice_mask = 0;
> +	u16 ddb_size = intel_get_ddb_size(dev_priv);
> +	u16 num_supported_slices = INTEL_INFO(dev_priv)->num_supported_dbuf_slices;
> +	u16 slice_size = ddb_size / num_supported_slices;
> +	u16 start_slice;
> +	u16 end_slice;
> +
> +	if (!skl_ddb_entry_size(entry))
> +		return 0;
> +
> +	start_slice = entry->start / slice_size;
> +	end_slice = (entry->end - 1) / slice_size;
> +
> +	DRM_DEBUG_KMS("ddb size %d slices %d slice size %d start slice %d end slice %d\n",
> +		      ddb_size, num_supported_slices, slice_size, start_slice, end_slice);
> +
> +	/*
> +	 * Per plane DDB entry can in a really worst case be on multiple slices
> +	 * but single entry is anyway contigious.
> +	 */
> +	while (start_slice <= end_slice) {
> +		slice_mask |= 1 << start_slice;
> +		start_slice++;
> +	}

GENMASK() or something?
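
A minimal sketch of what that could look like, written as userspace C so
it can be compiled on its own. GENMASK_U32 is a local stand-in for the
kernel's GENMASK(h, l), and slice_mask() is a hypothetical helper that
takes the slice size directly instead of deriving it from dev_priv.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for GENMASK(h, l): bits l..h set, inclusive. */
#define GENMASK_U32(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/*
 * Hypothetical helper: mask of the slices covered by the DDB blocks
 * [start, end). An empty entry covers no slice.
 */
static uint32_t slice_mask(uint16_t start, uint16_t end, uint16_t slice_size)
{
	if (end <= start)
		return 0;

	/* A DDB entry is contiguous, so the covered slices form one bit range. */
	return GENMASK_U32((end - 1) / slice_size, start / slice_size);
}
```

With two slices of 1024 blocks each (an assumed layout), an entry at
0-512 maps to 0x1, 512-1536 to 0x3, and 1024-2048 to 0x2, which matches
what the while loop produces.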

> +
> +	return slice_mask;
> +}
> +
>  static u8 skl_compute_dbuf_slices(const struct intel_crtc_state *crtc_state,
>  				  u8 active_pipes);
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.h b/drivers/gpu/drm/i915/intel_pm.h
> index 069515f04170..41c61ad71ce6 100644
> --- a/drivers/gpu/drm/i915/intel_pm.h
> +++ b/drivers/gpu/drm/i915/intel_pm.h
> @@ -37,6 +37,9 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
>  			       struct skl_ddb_entry *ddb_y,
>  			       struct skl_ddb_entry *ddb_uv);
>  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv);
> +u16 intel_get_ddb_size(struct drm_i915_private *dev_priv);
> +u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
> +			    const struct skl_ddb_entry *entry);
>  void skl_pipe_wm_get_hw_state(struct intel_crtc *crtc,
>  			      struct skl_pipe_wm *out);
>  void g4x_wm_sanitize(struct drm_i915_private *dev_priv);
> -- 
> 2.24.1.485.gad05a3d8e5

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed
  2020-03-30 16:18   ` Ville Syrjälä
@ 2020-03-30 17:56     ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 20+ messages in thread
From: Lisovskiy, Stanislav @ 2020-03-30 17:56 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Mar 30, 2020 at 07:18:57PM +0300, Ville Syrjälä wrote:
> On Mon, Mar 30, 2020 at 03:23:51PM +0300, Stanislav Lisovskiy wrote:
> > In Gen11+ whenever we might exceed DBuf bandwidth we might need to
> > recalculate CDCLK which DBuf bandwidth is scaled with.
> > Total Dbuf bw used might change based on particular plane needs.
> > 
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index 17d83f37f49f..9fd32d61ebfe 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -14623,7 +14623,7 @@ static bool active_planes_affects_min_cdclk(struct drm_i915_private *dev_priv)
> >  	/* See {hsw,vlv,ivb}_plane_ratio() */
> >  	return IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv) ||
> >  		IS_CHERRYVIEW(dev_priv) || IS_VALLEYVIEW(dev_priv) ||
> > -		IS_IVYBRIDGE(dev_priv);
> > +		IS_IVYBRIDGE(dev_priv) || (INTEL_GEN(dev_priv) >= 11);
> >  }
> >  
> >  static int intel_atomic_check_planes(struct intel_atomic_state *state,
> > @@ -14669,7 +14669,13 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
> >  		old_active_planes = old_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
> >  		new_active_planes = new_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
> >  
> > -		if (hweight8(old_active_planes) == hweight8(new_active_planes))
> > +		/*
> > +		 * Not only the number of planes, but if the plane configuration had
> > +		 * changed might already mean we need to recompute min CDCLK,
> > +		 * because different planes might consume different amount of Dbuf bandwidth
> > +		 * according to formula: Bw per plane = Pixel rate * bpp * pipe/plane scale factor
> > +		 */
> 
> The set of active planes doesn't dictate the bandwidth since it
> doesn't consider most of the parameters you listed above. I.e. this
> doesn't seem to be the right place for this stuff.
> 
> The decision to bump the cdclk should really come from the dbuf code
> not from some totally distinct piece of code. But to get this right
> I have a feeling we'll need the dbuf state and totally decouple cdclk
> from any_ms.

My idea was that if the active plane configuration has changed, we need
to recalculate the bandwidth used by those planes. Since the bandwidth used per
slice/per pipe is stored in bw_state, we recalculate only those in the state, and
that should be fine. If the recalculated bandwidth results in a different CDCLK,
then we need to change it.

Or do you mean we need to recalculate the bandwidth constantly as part of DBuf state?
I don't really like recalculating it every time, so filtering out the case where it will
obviously be the same seems to be a good idea. Maybe this check is not correct, but I
would prefer to recalculate the used bw only if the planes have changed.

Also I'm fine with tracking it in DBuf state. We first need to land the DBuf state patches
(I need to look at those again), but meanwhile we could store it in bw_state.

> 
> > +		if (old_active_planes == new_active_planes)
> >  			continue;
> >  
> >  		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
> > -- 
> > 2.24.1.485.gad05a3d8e5
> 
> -- 
> Ville Syrjälä
> Intel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev3)
  2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
                   ` (4 preceding siblings ...)
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 5/5] drm/i915: Remove unneeded hack now for CDCLK Stanislav Lisovskiy
@ 2020-03-30 18:07 ` Patchwork
  2020-04-07  8:03 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev7) Patchwork
  6 siblings, 0 replies; 20+ messages in thread
From: Patchwork @ 2020-03-30 18:07 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

== Series Details ==

Series: Consider DBuf bandwidth when calculating CDCLK (rev3)
URL   : https://patchwork.freedesktop.org/series/74739/
State : failure

== Summary ==

Applying: drm/i915: Decouple cdclk calculation from modeset checks
Applying: drm/i915: Force recalculate min_cdclk if planes config changed
Applying: drm/i915: Introduce for_each_dbuf_slice_in_mask macro
Applying: drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
error: sha1 information is lacking or useless (drivers/gpu/drm/i915/display/intel_bw.c).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0004 drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
  2020-03-30 17:07   ` Ville Syrjälä
@ 2020-03-30 18:16     ` Lisovskiy, Stanislav
  2020-04-07 16:26       ` Ville Syrjälä
  0 siblings, 1 reply; 20+ messages in thread
From: Lisovskiy, Stanislav @ 2020-03-30 18:16 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Mar 30, 2020 at 08:07:31PM +0300, Ville Syrjälä wrote:
> On Mon, Mar 30, 2020 at 03:23:53PM +0300, Stanislav Lisovskiy wrote:
> > According to BSpec max BW per slice is calculated using formula
> > Max BW = CDCLK * 64. Currently when calculating min CDCLK we
> > account only per plane requirements, however in order to avoid
> > FIFO underruns we need to estimate accumulated BW consumed by
> > all planes(ddb entries basically) residing on that particular
> > DBuf slice. This will allow us to put CDCLK lower and save power
> > when we don't need that much bandwidth or gain additional
> > performance once plane consumption grows.
> > 
> > v2: - Fix long line warning
> >     - Limited new DBuf bw checks to only gens >= 11
> > 
> > v3: - Lets track used Dbuf bw per slice and per crtc in bw state
> >       (or may be in DBuf state in future), that way we don't need
> >       to have all crtcs in state and those only if we detect if
> >       are actually going to change cdclk, just same way as we
> >       do with other stuff, i.e intel_atomic_serialize_global_state
> >       and co. Just as per Ville's paradigm.
> >     - Made dbuf bw calculation procedure look nicer by introducing
> >       for_each_dbuf_slice_in_mask - we often will now need to iterate
> >       slices using mask.
> >     - According to experimental results CDCLK * 64 accounts for
> >       overall bandwidth across all dbufs, not per dbuf.
> > 
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_bw.c       | 61 ++++++++++++++++++-
> >  drivers/gpu/drm/i915/display/intel_bw.h       |  8 +++
> >  drivers/gpu/drm/i915/display/intel_cdclk.c    | 25 ++++++++
> >  drivers/gpu/drm/i915/display/intel_display.c  |  8 +++
> >  .../drm/i915/display/intel_display_power.h    |  2 +
> >  drivers/gpu/drm/i915/intel_pm.c               | 34 ++++++++++-
> >  drivers/gpu/drm/i915/intel_pm.h               |  3 +
> >  7 files changed, 138 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
> > index 573a1c206b60..e9d65820fb76 100644
> > --- a/drivers/gpu/drm/i915/display/intel_bw.c
> > +++ b/drivers/gpu/drm/i915/display/intel_bw.c
> > @@ -6,6 +6,7 @@
> >  #include <drm/drm_atomic_state_helper.h>
> >  
> >  #include "intel_bw.h"
> > +#include "intel_pm.h"
> >  #include "intel_display_types.h"
> >  #include "intel_sideband.h"
> >  #include "intel_atomic.h"
> > @@ -338,7 +339,6 @@ static unsigned int intel_bw_crtc_data_rate(const struct intel_crtc_state *crtc_
> >  
> >  	return data_rate;
> >  }
> > -
> >  void intel_bw_crtc_update(struct intel_bw_state *bw_state,
> >  			  const struct intel_crtc_state *crtc_state)
> >  {
> > @@ -419,6 +419,65 @@ intel_atomic_bw_get_state(struct intel_atomic_state *state)
> >  	return to_intel_bw_state(bw_state);
> >  }
> >  
> > +int intel_bw_calc_min_cdclk(struct intel_atomic_state *state)
> > +{
> > +	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
> > +	int i = 0;
> > +	enum plane_id plane_id;
> > +	struct intel_crtc_state *crtc_state;
> 
> const

Why? There are lots of places where we use it that way.
Even in intel_compute_min_cdclk:

struct intel_crtc_state *crtc_state;
int min_cdclk, i;
enum pipe pipe;

for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {

Also I grepped: almost every place in intel_display.c that has
for_each_new_intel_crtc_in_state uses crtc_state as non-const.

It is used that way in intel_calc_active_pipes, intel_atomic_check_crtcs, ...
almost everywhere in intel_display.c.

So can you please explain what you mean?

> 
> > +	struct intel_crtc *crtc;
> > +	int max_bw = 0;
> > +	int min_cdclk;
> > +	enum pipe pipe;
> > +	struct intel_bw_state *bw_state;
> > +	int slice_id = 0;
> 
> Bunch of needless initialization, needlessly wide scope, etc.

max_bw needs to be accumulated, so I guess it should be initialized to 0.

Agreed regarding slice_id: it is just a leftover after I started to
use the for_each_dbuf_slice macro.

> 
> > +
> > +	bw_state = intel_atomic_bw_get_state(state);
> > +
> 
> Spurious whitespace.
> 
> > +	if (IS_ERR(bw_state))
> > +		return PTR_ERR(bw_state);
> > +
> > +	for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
> > +		struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[crtc->pipe];
> > +
> > +		memset(&crtc_bw->dbuf_bw, 0, sizeof(crtc_bw->dbuf_bw));
> > +
> > +		for_each_plane_id_on_crtc(crtc, plane_id) {
> > +			struct skl_ddb_entry *plane_alloc =
> > +				&crtc_state->wm.skl.plane_ddb_y[plane_id];
> > +			struct skl_ddb_entry *uv_plane_alloc =
> > +				&crtc_state->wm.skl.plane_ddb_uv[plane_id];
> 
> const
> 
> > +			unsigned int data_rate = crtc_state->data_rate[plane_id];
> > +
> 
> more strange whitespace
> 
> > +			unsigned int dbuf_mask = skl_ddb_dbuf_slice_mask(dev_priv, plane_alloc);
> > +
> > +			dbuf_mask |= skl_ddb_dbuf_slice_mask(dev_priv, uv_plane_alloc);
> 
> Looks bad when you initialize part of it in declaration and the rest
> later.

I didn't want to have a long line here; I will have to make a separate function then.

> 
> > +
> > +			DRM_DEBUG_KMS("Got dbuf mask %x for pipe %c ddb %d-%d plane %d data rate %d\n",
> > +				      dbuf_mask, pipe_name(crtc->pipe), plane_alloc->start,
> > +				      plane_alloc->end, plane_id, data_rate);
> > +
> > +			for_each_dbuf_slice_in_mask(slice_id, dbuf_mask)
> > +				crtc_bw->dbuf_bw[slice_id] += data_rate;
> 
> This doesn't feel quite right for planar formats.
> 
> For pre-icl it works by accident since we only have the one slice so
> we don't end up accounting for the full bandwidth from both color planes
> to multiple slices. If we had multiple slices and chroma and luma had
> been allocated to different slices we'd count the same thing multiple
> times.
> 
> For icl+ we seem to assign the full data rate to the UV plane's slice(s)
> since only the UV plane has data_rate[] != 0.

This sounds interesting. Do you mean that we need something like data_rate
and data_rate_uv?
My logic here is that we just get the set of slices used by the uv and y planes
in total and then account data_rate to those.
We would probably need another patch to split the data rate then.

I was aware of that actually, but being pessimistic here is better anyway.
The same applies if a plane is sitting on 2 slices at the same time.

> 
> > +		}
> > +	}
> > +
> > +	for_each_dbuf_slice(slice_id) {
> > +		int total_bw_per_slice = 0;
> > +
> > +		for_each_pipe(dev_priv, pipe) {
> > +			struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[pipe];
> > +
> > +			total_bw_per_slice += crtc_bw->dbuf_bw[slice_id];
> > +		}
> > +		max_bw += total_bw_per_slice;
> 
> So we're aggregating all the bw instead of per-slice? Is this based on
> the other mail you sent? Deserves a comment explaining why we do such
> odd things.

Agreed, I need to add a comment. Currently this seems to be the only way to make it
work without either having underruns or 8K being unusable with bumped CDCLK.

> 
> > +	}
> > +
> > +	min_cdclk = max_bw / 64;
> > +
> > +	return min_cdclk;
> > +}
> > +
> >  int intel_bw_atomic_check(struct intel_atomic_state *state)
> >  {
> >  	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
> > diff --git a/drivers/gpu/drm/i915/display/intel_bw.h b/drivers/gpu/drm/i915/display/intel_bw.h
> > index 9a5627be6876..d2b5f32b0791 100644
> > --- a/drivers/gpu/drm/i915/display/intel_bw.h
> > +++ b/drivers/gpu/drm/i915/display/intel_bw.h
> > @@ -10,11 +10,16 @@
> >  
> >  #include "intel_display.h"
> >  #include "intel_global_state.h"
> > +#include "intel_display_power.h"
> >  
> >  struct drm_i915_private;
> >  struct intel_atomic_state;
> >  struct intel_crtc_state;
> >  
> > +struct intel_crtc_bw {
> > +	int dbuf_bw[I915_MAX_DBUF_SLICES];
> > +};
> > +
> >  struct intel_bw_state {
> >  	struct intel_global_state base;
> >  
> > @@ -31,6 +36,8 @@ struct intel_bw_state {
> >  	 */
> >  	u8 qgv_points_mask;
> >  
> > +	struct intel_crtc_bw dbuf_bw_used[I915_MAX_PIPES];
> 
> The name of the struct isn't very good if it just contains the
> dbuf bw numbers.
> 
> > +
> >  	unsigned int data_rate[I915_MAX_PIPES];
> >  	u8 num_active_planes[I915_MAX_PIPES];
> 
> Maybe collect all this to the per-crtc struct?
> 
> >  };
> > @@ -53,5 +60,6 @@ void intel_bw_crtc_update(struct intel_bw_state *bw_state,
> >  			  const struct intel_crtc_state *crtc_state);
> >  int icl_pcode_restrict_qgv_points(struct drm_i915_private *dev_priv,
> >  				  u32 points_mask);
> > +int intel_bw_calc_min_cdclk(struct intel_atomic_state *state);
> >  
> >  #endif /* __INTEL_BW_H__ */
> > diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
> > index 979a0241fdcb..036774e7f3ec 100644
> > --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> > +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> > @@ -25,6 +25,7 @@
> >  #include "intel_cdclk.h"
> >  #include "intel_display_types.h"
> >  #include "intel_sideband.h"
> > +#include "intel_bw.h"
> >  
> >  /**
> >   * DOC: CDCLK / RAWCLK
> > @@ -2001,11 +2002,19 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
> >  {
> >  	struct drm_i915_private *dev_priv =
> >  		to_i915(crtc_state->uapi.crtc->dev);
> > +	struct intel_atomic_state *state = NULL;
> >  	int min_cdclk;
> >  
> >  	if (!crtc_state->hw.enable)
> >  		return 0;
> >  
> > +	/*
> > +	 * FIXME: Unfortunately when this gets called from intel_modeset_setup_hw_state
> > +	 * there is no intel_atomic_state at all. So lets not then use it.
> > +	 */
> > +	if (crtc_state->uapi.state)
> > +		state = to_intel_atomic_state(crtc_state->uapi.state);
> 
> This still indicates that either this isn't the right place to call this
> or we have the state stored in the wrong place.
> 
> I think I'd just move the thing into intel_compute_min_cdclk() as a start.

I was thinking about this. The reason I didn't put it there is that I wanted
to keep all of the CDCLK regulating logic in this function: intel_compute_min_cdclk()
is already a higher-level function that is supposed to call this one per crtc_state.
Otherwise the CDCLK regulating logic ends up spread across the two functions.

> 
> > +
> >  	min_cdclk = intel_pixel_rate_to_cdclk(crtc_state);
> >  
> >  	/* pixel rate mustn't exceed 95% of cdclk with IPS on BDW */
> > @@ -2080,6 +2089,22 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
> >  	if (IS_TIGERLAKE(dev_priv))
> >  		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
> >  
> > +	/*
> > +	 * Similar story as with skl_write_plane_wm and intel_enable_sagv
> > +	 * - in some certain driver parts, we don't have any guarantee that
> > +	 * parent exists. So we might be having a crtc_state without
> > +	 * parent state.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11) {
> > +		if (state) {
> > +			int dbuf_bw_cdclk = intel_bw_calc_min_cdclk(state);
> > +
> > +			DRM_DEBUG_KMS("DBuf bw min cdclk %d current min_cdclk %d\n",
> > +				      dbuf_bw_cdclk, min_cdclk);
> > +			min_cdclk = max(min_cdclk, dbuf_bw_cdclk);
> > +		}
> > +	}
> > +
> >  	if (min_cdclk > dev_priv->max_cdclk_freq) {
> >  		drm_dbg_kms(&dev_priv->drm,
> >  			    "required cdclk (%d kHz) exceeds max (%d kHz)\n",
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index 9fd32d61ebfe..fa2870c0d7fd 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -14678,6 +14678,14 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
> >  		if (old_active_planes == new_active_planes)
> >  			continue;
> >  
> > +		/*
> > +		 * active_planes bitmask has been updated, whenever amount
> > +		 * of active planes had changed we need to recalculate CDCLK
> > +		 * as it depends on total bandwidth now, not only min_cdclk
> > +		 * per plane.
> > +		 */
> > +		*need_cdclk_calc = true;
> > +
> >  		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
> >  		if (ret)
> >  			return ret;
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> > index 468e8fb0203a..9e33fb90422f 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > @@ -308,6 +308,8 @@ intel_display_power_put_async(struct drm_i915_private *i915,
> >  }
> >  #endif
> >  
> > +#define I915_MAX_DBUF_SLICES 2
> > +
> >  enum dbuf_slice {
> >  	DBUF_S1,
> >  	DBUF_S2,
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 551933e3f7da..5dcd1cd09ad7 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -4055,10 +4055,9 @@ icl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> >  	return offset;
> >  }
> >  
> > -static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
> > +u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
> >  {
> >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > -
> >  	drm_WARN_ON(&dev_priv->drm, ddb_size == 0);
> >  
> >  	if (INTEL_GEN(dev_priv) < 11)
> > @@ -4067,6 +4066,37 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
> >  	return ddb_size;
> >  }
> >  
> > +u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
> > +			    const struct skl_ddb_entry *entry)
> > +{
> > +	u32 slice_mask = 0;
> > +	u16 ddb_size = intel_get_ddb_size(dev_priv);
> > +	u16 num_supported_slices = INTEL_INFO(dev_priv)->num_supported_dbuf_slices;
> > +	u16 slice_size = ddb_size / num_supported_slices;
> > +	u16 start_slice;
> > +	u16 end_slice;
> > +
> > +	if (!skl_ddb_entry_size(entry))
> > +		return 0;
> > +
> > +	start_slice = entry->start / slice_size;
> > +	end_slice = (entry->end - 1) / slice_size;
> > +
> > +	DRM_DEBUG_KMS("ddb size %d slices %d slice size %d start slice %d end slice %d\n",
> > +		      ddb_size, num_supported_slices, slice_size, start_slice, end_slice);
> > +
> > +	/*
> > +	 * Per plane DDB entry can in a really worst case be on multiple slices
> > +	 * but single entry is anyway contiguous.
> > +	 */
> > +	while (start_slice <= end_slice) {
> > +		slice_mask |= 1 << start_slice;
> > +		start_slice++;
> > +	}
> 
> > GENMASK() or something?

Will fix, thanks for the suggestion.
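
For reference, a minimal userspace sketch of what a GENMASK()-style version of the loop above could look like. GENMASK_U32 is a local stand-in for the kernel's GENMASK(h, l), and ddb_entry_slice_mask() with its parameters is a hypothetical name used only for illustration; this is not necessarily what the next revision will contain.

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's GENMASK(h, l):
 * bits l..h set, valid for 0 <= l <= h <= 31.
 */
#define GENMASK_U32(h, l) \
	((UINT32_MAX << (l)) & (UINT32_MAX >> (31 - (h))))

/*
 * Hypothetical helper: [start, end) is the DDB entry in blocks,
 * slice_size is the number of blocks in one DBuf slice.
 */
static uint32_t ddb_entry_slice_mask(uint16_t start, uint16_t end,
				     uint16_t slice_size)
{
	unsigned int start_slice, end_slice;

	/* Empty entry (or bogus slice size): no slice is used. */
	if (end <= start || slice_size == 0)
		return 0;

	start_slice = start / slice_size;
	end_slice = (end - 1) / slice_size;

	/* A single entry is contiguous, so the slices form one bit range. */
	return GENMASK_U32(end_slice, start_slice);
}
```

With two slices of 1024 blocks each (illustrative numbers only), an entry covering blocks 512..1535 yields a mask of 0x3, i.e. both slices.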

> 
> > +
> > +	return slice_mask;
> > +}
> > +
> >  static u8 skl_compute_dbuf_slices(const struct intel_crtc_state *crtc_state,
> >  				  u8 active_pipes);
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_pm.h b/drivers/gpu/drm/i915/intel_pm.h
> > index 069515f04170..41c61ad71ce6 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.h
> > +++ b/drivers/gpu/drm/i915/intel_pm.h
> > @@ -37,6 +37,9 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
> >  			       struct skl_ddb_entry *ddb_y,
> >  			       struct skl_ddb_entry *ddb_uv);
> >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv);
> > +u16 intel_get_ddb_size(struct drm_i915_private *dev_priv);
> > +u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
> > +			    const struct skl_ddb_entry *entry);
> >  void skl_pipe_wm_get_hw_state(struct intel_crtc *crtc,
> >  			      struct skl_pipe_wm *out);
> >  void g4x_wm_sanitize(struct drm_i915_private *dev_priv);
> > -- 
> > 2.24.1.485.gad05a3d8e5
> 
> -- 
> Ville Syrjälä
> Intel

* [Intel-gfx] [PATCH v4 1/5] drm/i915: Decouple cdclk calculation from modeset checks
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks Stanislav Lisovskiy
  2020-03-30 15:44   ` Ville Syrjälä
@ 2020-04-07  7:33   ` Stanislav Lisovskiy
  2020-04-07 16:15     ` Ville Syrjälä
  1 sibling, 1 reply; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-04-07  7:33 UTC (permalink / raw)
  To: intel-gfx

We need to calculate CDCLK after the watermarks/DDB have been calculated,
because on recent hw CDCLK needs to be adjusted according to DBuf
requirements, which is not possible with the current code organization.

Setting CDCLK according to DBuf BW requirements, rather than just rejecting
a configuration that doesn't satisfy them, will allow us to save power when
possible and gain additional bandwidth when needed, i.e. it
improves both our power management and performance capabilities.

This patch is preparation for that: first we extract the CDCLK
calculation (intel_modeset_calc_cdclk) from the modeset checks, in order
to call it after wm/ddb has been calculated.

v2: - Extract only intel_modeset_calc_cdclk from intel_modeset_checks
      (Ville Syrjälä)

v3: - Clear plls after intel_modeset_calc_cdclk

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 22 +++++++++++---------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 70ec301fe6e3..c77088e1d033 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -14464,12 +14464,6 @@ static int intel_modeset_checks(struct intel_atomic_state *state)
 			return ret;
 	}
 
-	ret = intel_modeset_calc_cdclk(state);
-	if (ret)
-		return ret;
-
-	intel_modeset_clear_plls(state);
-
 	if (IS_HASWELL(dev_priv))
 		return hsw_mode_set_planes_workaround(state);
 
@@ -14801,10 +14795,6 @@ static int intel_atomic_check(struct drm_device *dev,
 			goto fail;
 	}
 
-	ret = intel_atomic_check_crtcs(state);
-	if (ret)
-		goto fail;
-
 	intel_fbc_choose_crtc(dev_priv, state);
 	ret = calc_watermark_data(state);
 	if (ret)
@@ -14814,6 +14804,18 @@ static int intel_atomic_check(struct drm_device *dev,
 	if (ret)
 		goto fail;
 
+	if (any_ms) {
+		ret = intel_modeset_calc_cdclk(state);
+		if (ret)
+			return ret;
+
+		intel_modeset_clear_plls(state);
+	}
+
+	ret = intel_atomic_check_crtcs(state);
+	if (ret)
+		goto fail;
+
 	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
 					    new_crtc_state, i) {
 		if (!needs_modeset(new_crtc_state) &&
-- 
2.24.1.485.gad05a3d8e5

* [Intel-gfx] [PATCH v4 2/5] drm/i915: Force recalculate min_cdclk if planes config changed
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed Stanislav Lisovskiy
  2020-03-30 16:18   ` Ville Syrjälä
@ 2020-04-07  7:36   ` Stanislav Lisovskiy
  1 sibling, 0 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-04-07  7:36 UTC (permalink / raw)
  To: intel-gfx

On Gen11+, whenever we might exceed the DBuf bandwidth we might need to
recalculate CDCLK, which the DBuf bandwidth scales with.
The total DBuf bw used might change based on particular plane needs.

In intel_atomic_check_planes we try to filter out the cases where
we definitely don't need to recalculate the required bandwidth/CDCLK.
The current code compares the number of active planes and skips the
recalculation if they are equal.
This requirement seems too relaxed and might even be wrong,
because the plane combination might differ even though the number
of planes is the same; that requires recalculating min cdclk and
consumed bandwidth.

v2: - Changed commit message to properly reflect why
      we might want to change from hamming weight comparison
      to actual plane combination checking.

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index c77088e1d033..307636b23ac9 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -14540,7 +14540,7 @@ static bool active_planes_affects_min_cdclk(struct drm_i915_private *dev_priv)
 	/* See {hsw,vlv,ivb}_plane_ratio() */
 	return IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv) ||
 		IS_CHERRYVIEW(dev_priv) || IS_VALLEYVIEW(dev_priv) ||
-		IS_IVYBRIDGE(dev_priv);
+		IS_IVYBRIDGE(dev_priv) || (INTEL_GEN(dev_priv) >= 11);
 }
 
 static int intel_atomic_check_planes(struct intel_atomic_state *state,
@@ -14586,7 +14586,13 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
 		old_active_planes = old_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
 		new_active_planes = new_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
 
-		if (hweight8(old_active_planes) == hweight8(new_active_planes))
+		/*
+		 * Not only a change in the number of planes, but any change in the
+		 * plane configuration might mean we need to recompute min CDCLK,
+		 * because different planes might consume different amounts of DBuf bandwidth
+		 * according to the formula: Bw per plane = Pixel rate * bpp * pipe/plane scale factor
+		 */
+		if (old_active_planes == new_active_planes)
 			continue;
 
 		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
-- 
2.24.1.485.gad05a3d8e5
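
To illustrate the difference between the old hweight8() comparison and the mask comparison introduced by this patch, here is a small standalone sketch. popcount8() is a local stand-in for the kernel's hweight8(), and the two helper names are hypothetical; the plane masks are made-up examples.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's hweight8(): number of set bits in a byte. */
static unsigned int popcount8(uint8_t v)
{
	unsigned int n = 0;

	/* Clear the lowest set bit until none are left. */
	for (; v; v &= v - 1)
		n++;

	return n;
}

/* Old check: only the number of active planes is compared. */
static bool planes_changed_by_count(uint8_t old_mask, uint8_t new_mask)
{
	return popcount8(old_mask) != popcount8(new_mask);
}

/* New check: any change in which planes are active is detected. */
static bool planes_changed_by_mask(uint8_t old_mask, uint8_t new_mask)
{
	return old_mask != new_mask;
}
```

Going from planes {0, 1} (mask 0x03) to planes {0, 2} (mask 0x05) keeps the plane count at two, so the count-based check reports no change while the mask-based check does.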

* [Intel-gfx] [PATCH v4 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro Stanislav Lisovskiy
  2020-03-30 16:07   ` Ville Syrjälä
@ 2020-04-07  7:38   ` Stanislav Lisovskiy
  1 sibling, 0 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-04-07  7:38 UTC (permalink / raw)
  To: intel-gfx

We now quite often need to iterate over only particular DBuf slices
in a mask, whether they are active or related to a particular crtc.

Let's make our life a bit easier and use a macro for that.

v2: - Minor code refactoring

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.h       | 7 +++++++
 drivers/gpu/drm/i915/display/intel_display_power.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index cc7f287804d7..9aab7e5181fd 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -187,6 +187,13 @@ enum plane_id {
 	for ((__p) = PLANE_PRIMARY; (__p) < I915_MAX_PLANES; (__p)++) \
 		for_each_if((__crtc)->plane_ids_mask & BIT(__p))
 
+#define for_each_dbuf_slice_in_mask(__slice, __mask) \
+	for ((__slice) = DBUF_S1; (__slice) < I915_MAX_DBUF_SLICES; (__slice)++) \
+		for_each_if((BIT(__slice)) & (__mask))
+
+#define for_each_dbuf_slice(__slice) \
+	for_each_dbuf_slice_in_mask(__slice, BIT(I915_MAX_DBUF_SLICES) - 1)
+
 enum port {
 	PORT_NONE = -1,
 
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index da64a5edae7a..815d8fec7e4a 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -311,6 +311,7 @@ intel_display_power_put_async(struct drm_i915_private *i915,
 enum dbuf_slice {
 	DBUF_S1,
 	DBUF_S2,
+	I915_DBUF_MAX_SLICES
 };
 
 #define with_intel_display_power(i915, domain, wf) \
-- 
2.24.1.485.gad05a3d8e5
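
A standalone sketch of how the new iterator behaves, for anyone who wants to try it outside the kernel. BIT() and for_each_if() are re-declared locally to mirror the kernel/DRM helpers, the slice count is a local constant (the patch itself adds I915_DBUF_MAX_SLICES to the enum while the macro refers to I915_MAX_DBUF_SLICES), and sum_bw_in_mask() is a hypothetical user of the macro.

```c
#include <stdint.h>

/* Local copies mirroring the kernel's BIT() and DRM's for_each_if(). */
#define BIT(n) (1u << (n))
#define for_each_if(cond) if (!(cond)) {} else

#define I915_MAX_DBUF_SLICES 2	/* assumption: two slices */

enum dbuf_slice {
	DBUF_S1,
	DBUF_S2,
};

/* Visit only those slices whose bit is set in __mask. */
#define for_each_dbuf_slice_in_mask(__slice, __mask) \
	for ((__slice) = DBUF_S1; (__slice) < I915_MAX_DBUF_SLICES; (__slice)++) \
		for_each_if((BIT(__slice)) & (__mask))

/* Hypothetical user: sum the per-slice values selected by the mask. */
static unsigned int sum_bw_in_mask(const unsigned int *bw, uint32_t mask)
{
	unsigned int total = 0;
	int slice;

	for_each_dbuf_slice_in_mask(slice, mask)
		total += bw[slice];

	return total;
}
```

The for_each_if() form keeps a trailing else from binding to the macro's hidden if, which is why the DRM iterators are written this way instead of using a plain if.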

* [Intel-gfx] [PATCH v4 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
  2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs Stanislav Lisovskiy
  2020-03-30 17:07   ` Ville Syrjälä
@ 2020-04-07  7:39   ` Stanislav Lisovskiy
  1 sibling, 0 replies; 20+ messages in thread
From: Stanislav Lisovskiy @ 2020-04-07  7:39 UTC (permalink / raw)
  To: intel-gfx

According to BSpec, max BW per slice is calculated using the formula
Max BW = CDCLK * 64. Currently, when calculating min CDCLK, we
account only for per-plane requirements; however, in order to avoid
FIFO underruns we need to estimate the accumulated BW consumed by
all planes (DDB entries, basically) residing on that particular
DBuf slice. This will allow us to set CDCLK lower and save power
when we don't need that much bandwidth, or gain additional
performance once plane consumption grows.

v2: - Fix long line warning
    - Limited new DBuf bw checks to only gens >= 11

v3: - Let's track used DBuf bw per slice and per crtc in bw state
      (or maybe in DBuf state in the future); that way we don't need
      to have all crtcs in the state, and add them only if we detect
      that we are actually going to change cdclk, the same way as we
      do with other stuff, i.e. intel_atomic_serialize_global_state
      and co. Just as per Ville's paradigm.
    - Made the dbuf bw calculation procedure look nicer by introducing
      for_each_dbuf_slice_in_mask - we will now often need to iterate
      slices using a mask.
    - According to experimental results, CDCLK * 64 accounts for
      overall bandwidth across all dbufs, not per dbuf.

v4: - Fixed missing const (Ville)
    - Removed spurious whitespace (Ville)
    - Fixed local variable init (reduced scope where not needed)
    - Added some comments about data rate for planar formats
    - Changed struct intel_crtc_bw to intel_dbuf_bw
    - Moved dbuf bw calculation to intel_compute_min_cdclk (Ville)

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_bw.c       | 73 ++++++++++++++++++-
 drivers/gpu/drm/i915/display/intel_bw.h       |  7 ++
 drivers/gpu/drm/i915/display/intel_cdclk.c    | 27 +++++++
 drivers/gpu/drm/i915/display/intel_display.c  |  8 ++
 .../drm/i915/display/intel_display_power.h    |  2 +
 drivers/gpu/drm/i915/intel_pm.c               | 31 +++++++-
 drivers/gpu/drm/i915/intel_pm.h               |  3 +
 7 files changed, 148 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
index 58b264bc318d..43f7177d98dd 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -6,6 +6,7 @@
 #include <drm/drm_atomic_state_helper.h>
 
 #include "intel_bw.h"
+#include "intel_pm.h"
 #include "intel_display_types.h"
 #include "intel_sideband.h"
 
@@ -333,7 +334,6 @@ static unsigned int intel_bw_crtc_data_rate(const struct intel_crtc_state *crtc_
 
 	return data_rate;
 }
-
 void intel_bw_crtc_update(struct intel_bw_state *bw_state,
 			  const struct intel_crtc_state *crtc_state)
 {
@@ -387,6 +387,77 @@ intel_atomic_get_bw_state(struct intel_atomic_state *state)
 	return to_intel_bw_state(bw_state);
 }
 
+int intel_bw_calc_min_cdclk(struct intel_atomic_state *state)
+{
+	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
+	int i;
+	const struct intel_crtc_state *crtc_state;
+	struct intel_crtc *crtc;
+	int max_bw = 0;
+	int min_cdclk;
+	struct intel_bw_state *bw_state;
+	int slice_id;
+
+	bw_state = intel_atomic_get_bw_state(state);
+	if (IS_ERR(bw_state))
+		return PTR_ERR(bw_state);
+
+	for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
+		enum plane_id plane_id;
+		struct intel_dbuf_bw *crtc_bw = &bw_state->dbuf_bw[crtc->pipe];
+
+		memset(&crtc_bw->used_bw, 0, sizeof(crtc_bw->used_bw));
+
+		for_each_plane_id_on_crtc(crtc, plane_id) {
+			const struct skl_ddb_entry *plane_alloc =
+				&crtc_state->wm.skl.plane_ddb_y[plane_id];
+			const struct skl_ddb_entry *uv_plane_alloc =
+				&crtc_state->wm.skl.plane_ddb_uv[plane_id];
+			unsigned int data_rate = crtc_state->data_rate[plane_id];
+			unsigned int dbuf_mask = 0;
+
+			dbuf_mask |= skl_ddb_dbuf_slice_mask(dev_priv, plane_alloc);
+			dbuf_mask |= skl_ddb_dbuf_slice_mask(dev_priv, uv_plane_alloc);
+
+			/*
+			 * FIXME: To calculate this more properly we probably need
+			 * to split the per-plane data_rate into data_rate_y and data_rate_uv
+			 * for multiplanar formats, so that they are not accounted twice
+			 * if they happen to reside on different slices.
+			 * However, for pre-icl this works anyway because we have only a single
+			 * slice, and for icl+ the uv plane has a non-zero data rate.
+			 * So in the worst case these calculations are a bit pessimistic, which
+			 * shouldn't pose any significant problem anyway.
+			 */
+			for_each_dbuf_slice_in_mask(slice_id, dbuf_mask)
+				crtc_bw->used_bw[slice_id] += data_rate;
+		}
+	}
+
+	for_each_dbuf_slice(slice_id) {
+		int total_bw_per_slice = 0;
+		enum pipe pipe;
+
+		/*
+		 * Current experimental observations show that contrary to BSpec
+		 * we get underruns once we exceed 64 * CDCLK for slices in total.
+		 * As a temporary measure in order not to keep CDCLK bumped up all the
+		 * time we calculate CDCLK according to this formula for overall bw
+		 * consumed by slices.
+		 */
+		for_each_pipe(dev_priv, pipe) {
+			struct intel_dbuf_bw *crtc_bw = &bw_state->dbuf_bw[pipe];
+
+			total_bw_per_slice += crtc_bw->used_bw[slice_id];
+		}
+		max_bw += total_bw_per_slice;
+	}
+
+	min_cdclk = max_bw / 64;
+
+	return min_cdclk;
+}
+
 int intel_bw_atomic_check(struct intel_atomic_state *state)
 {
 	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
diff --git a/drivers/gpu/drm/i915/display/intel_bw.h b/drivers/gpu/drm/i915/display/intel_bw.h
index a8aa7624c5aa..e0ac43c38b3d 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.h
+++ b/drivers/gpu/drm/i915/display/intel_bw.h
@@ -10,13 +10,19 @@
 
 #include "intel_display.h"
 #include "intel_global_state.h"
+#include "intel_display_power.h"
 
 struct drm_i915_private;
 struct intel_atomic_state;
 struct intel_crtc_state;
 
+struct intel_dbuf_bw {
+	int used_bw[I915_MAX_DBUF_SLICES];
+};
+
 struct intel_bw_state {
 	struct intel_global_state base;
+	struct intel_dbuf_bw dbuf_bw[I915_MAX_PIPES];
 
 	unsigned int data_rate[I915_MAX_PIPES];
 	u8 num_active_planes[I915_MAX_PIPES];
@@ -29,5 +35,6 @@ int intel_bw_init(struct drm_i915_private *dev_priv);
 int intel_bw_atomic_check(struct intel_atomic_state *state);
 void intel_bw_crtc_update(struct intel_bw_state *bw_state,
 			  const struct intel_crtc_state *crtc_state);
+int intel_bw_calc_min_cdclk(struct intel_atomic_state *state);
 
 #endif /* __INTEL_BW_H__ */
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 979a0241fdcb..fbb8cbcee3d2 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -25,6 +25,7 @@
 #include "intel_cdclk.h"
 #include "intel_display_types.h"
 #include "intel_sideband.h"
+#include "intel_bw.h"
 
 /**
  * DOC: CDCLK / RAWCLK
@@ -2090,6 +2091,23 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
 	return min_cdclk;
 }
 
+static int intel_compute_dbuf_min_cdclk(struct intel_atomic_state *state)
+{
+	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
+	int min_cdclk = intel_bw_calc_min_cdclk(state);
+
+	DRM_DEBUG_KMS("DBuf bw min cdclk %d\n", min_cdclk);
+
+	if (min_cdclk > dev_priv->max_cdclk_freq) {
+		drm_dbg_kms(&dev_priv->drm,
+			    "required cdclk (%d kHz) exceeds max (%d kHz)\n",
+			    min_cdclk, dev_priv->max_cdclk_freq);
+		return -EINVAL;
+	}
+
+	return min_cdclk;
+}
+
 static int intel_compute_min_cdclk(struct intel_cdclk_state *cdclk_state)
 {
 	struct intel_atomic_state *state = cdclk_state->base.state;
@@ -2098,6 +2116,13 @@ static int intel_compute_min_cdclk(struct intel_cdclk_state *cdclk_state)
 	struct intel_crtc_state *crtc_state;
 	int min_cdclk, i;
 	enum pipe pipe;
+	int dbuf_bw_min_cdclk = 0;
+
+	if (INTEL_GEN(dev_priv) >= 11) {
+		dbuf_bw_min_cdclk = intel_compute_dbuf_min_cdclk(state);
+		if (dbuf_bw_min_cdclk < 0)
+			return dbuf_bw_min_cdclk;
+	}
 
 	for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
 		int ret;
@@ -2106,6 +2131,8 @@ static int intel_compute_min_cdclk(struct intel_cdclk_state *cdclk_state)
 		if (min_cdclk < 0)
 			return min_cdclk;
 
+		min_cdclk = max(min_cdclk, dbuf_bw_min_cdclk);
+
 		if (cdclk_state->min_cdclk[i] == min_cdclk)
 			continue;
 
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 307636b23ac9..f9ca883a605e 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -14595,6 +14595,14 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
 		if (old_active_planes == new_active_planes)
 			continue;
 
+		/*
+		 * The active_planes bitmask has been updated. Whenever the
+		 * active planes change we need to recalculate CDCLK,
+		 * as it now depends on total bandwidth, not only on min_cdclk
+		 * per plane.
+		 */
+		*need_cdclk_calc = true;
+
 		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
 		if (ret)
 			return ret;
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index 815d8fec7e4a..da3d9595ae49 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -308,6 +308,8 @@ intel_display_power_put_async(struct drm_i915_private *i915,
 }
 #endif
 
+#define I915_MAX_DBUF_SLICES 2
+
 enum dbuf_slice {
 	DBUF_S1,
 	DBUF_S2,
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8375054ba27d..51716af263cb 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3844,10 +3844,9 @@ icl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
 	return offset;
 }
 
-static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
+u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
 {
 	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
-
 	drm_WARN_ON(&dev_priv->drm, ddb_size == 0);
 
 	if (INTEL_GEN(dev_priv) < 11)
@@ -3856,6 +3855,34 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
 	return ddb_size;
 }
 
+u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
+			    const struct skl_ddb_entry *entry)
+{
+	u32 slice_mask = 0;
+	u16 ddb_size = intel_get_ddb_size(dev_priv);
+	u16 num_supported_slices = INTEL_INFO(dev_priv)->num_supported_dbuf_slices;
+	u16 slice_size = ddb_size / num_supported_slices;
+	u16 start_slice;
+	u16 end_slice;
+
+	if (!skl_ddb_entry_size(entry))
+		return 0;
+
+	start_slice = entry->start / slice_size;
+	end_slice = (entry->end - 1) / slice_size;
+
+	/*
+	 * Per plane DDB entry can in a really worst case be on multiple slices
+	 * but single entry is anyway contiguous.
+	 */
+	while (start_slice <= end_slice) {
+		slice_mask |= BIT(start_slice);
+		start_slice++;
+	}
+
+	return slice_mask;
+}
+
 static u8 skl_compute_dbuf_slices(const struct intel_crtc_state *crtc_state,
 				  u8 active_pipes);
 
diff --git a/drivers/gpu/drm/i915/intel_pm.h b/drivers/gpu/drm/i915/intel_pm.h
index d60a85421c5a..a9e3835927a8 100644
--- a/drivers/gpu/drm/i915/intel_pm.h
+++ b/drivers/gpu/drm/i915/intel_pm.h
@@ -37,6 +37,9 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
 			       struct skl_ddb_entry *ddb_y,
 			       struct skl_ddb_entry *ddb_uv);
 void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv);
+u16 intel_get_ddb_size(struct drm_i915_private *dev_priv);
+u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
+			    const struct skl_ddb_entry *entry);
 void skl_pipe_wm_get_hw_state(struct intel_crtc *crtc,
 			      struct skl_pipe_wm *out);
 void g4x_wm_sanitize(struct drm_i915_private *dev_priv);
-- 
2.24.1.485.gad05a3d8e5
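
The calculation in intel_bw_calc_min_cdclk() can be sketched in plain C as follows. This is only an illustration under simplifying assumptions: pipes are flattened away, struct plane_bw and both function names are hypothetical, and the numbers in the note below are made up. Like the patch, it counts a plane once per slice it touches, which is the pessimistic case mentioned in the FIXME.

```c
#include <stddef.h>

#define NUM_DBUF_SLICES 2	/* assumption: two slices, as in the patch */

/*
 * Hypothetical flattened view of one plane: its data rate and the
 * DBuf slices its DDB entries touch (one bit per slice).
 */
struct plane_bw {
	unsigned int data_rate;
	unsigned int slice_mask;
};

/* Accumulate bandwidth per slice, sum over all slices, divide by 64. */
static int dbuf_bw_min_cdclk(const struct plane_bw *planes, size_t num_planes)
{
	unsigned int used_bw[NUM_DBUF_SLICES] = { 0 };
	unsigned int total_bw = 0;
	size_t i;
	int slice;

	for (i = 0; i < num_planes; i++) {
		for (slice = 0; slice < NUM_DBUF_SLICES; slice++) {
			if (planes[i].slice_mask & (1u << slice))
				used_bw[slice] += planes[i].data_rate;
		}
	}

	/* Per the v3 note: CDCLK * 64 is treated as the limit across all slices. */
	for (slice = 0; slice < NUM_DBUF_SLICES; slice++)
		total_bw += used_bw[slice];

	return (int)(total_bw / 64);
}

/* The DBuf-derived value can only raise the per-crtc minimum, never lower it. */
static int combine_min_cdclk(int crtc_min_cdclk, int dbuf_min_cdclk)
{
	return crtc_min_cdclk > dbuf_min_cdclk ? crtc_min_cdclk : dbuf_min_cdclk;
}
```

For example, one plane with data rate 6400 on slice 0 plus one plane with data rate 3200 spanning both slices gives 9600 + 3200 = 12800 in total, so the DBuf-derived minimum is 12800 / 64 = 200.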

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev7)
  2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
                   ` (5 preceding siblings ...)
  2020-03-30 18:07 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev3) Patchwork
@ 2020-04-07  8:03 ` Patchwork
  6 siblings, 0 replies; 20+ messages in thread
From: Patchwork @ 2020-04-07  8:03 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: Consider DBuf bandwidth when calculating CDCLK (rev7)
URL   : https://patchwork.freedesktop.org/series/74739/
State : failure

== Summary ==

Applying: drm/i915: Decouple cdclk calculation from modeset checks
Applying: drm/i915: Force recalculate min_cdclk if planes config changed
Applying: drm/i915: Introduce for_each_dbuf_slice_in_mask macro
Applying: drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
Applying: drm/i915: Remove unneeded hack now for CDCLK
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/display/intel_cdclk.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/display/intel_cdclk.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/display/intel_cdclk.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0005 drm/i915: Remove unneeded hack now for CDCLK
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

* Re: [Intel-gfx] [PATCH v4 1/5] drm/i915: Decouple cdclk calculation from modeset checks
  2020-04-07  7:33   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
@ 2020-04-07 16:15     ` Ville Syrjälä
  0 siblings, 0 replies; 20+ messages in thread
From: Ville Syrjälä @ 2020-04-07 16:15 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

On Tue, Apr 07, 2020 at 10:33:56AM +0300, Stanislav Lisovskiy wrote:
> We need to calculate CDCLK after the watermarks/DDB have been calculated,
> because on recent hw CDCLK needs to be adjusted according to DBuf
> requirements, which is not possible with the current code organization.
> 
> Setting CDCLK according to DBuf BW requirements, rather than just rejecting
> a configuration that doesn't satisfy them, will allow us to save power when
> possible and gain additional bandwidth when needed, i.e. it
> improves both our power management and performance capabilities.
> 
> This patch is preparation for that: first we extract the CDCLK
> calculation (intel_modeset_calc_cdclk) from the modeset checks, in order
> to call it after wm/ddb has been calculated.
> 
> v2: - Extract only intel_modeset_calc_cdclk from intel_modeset_checks
>       (Ville Syrjälä)
> 
> v3: - Clear plls after intel_modeset_calc_cdclk

Why did we change this now?

> 
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 22 +++++++++++---------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 70ec301fe6e3..c77088e1d033 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -14464,12 +14464,6 @@ static int intel_modeset_checks(struct intel_atomic_state *state)
>  			return ret;
>  	}
>  
> -	ret = intel_modeset_calc_cdclk(state);
> -	if (ret)
> -		return ret;
> -
> -	intel_modeset_clear_plls(state);
> -
>  	if (IS_HASWELL(dev_priv))
>  		return hsw_mode_set_planes_workaround(state);
>  
> @@ -14801,10 +14795,6 @@ static int intel_atomic_check(struct drm_device *dev,
>  			goto fail;
>  	}
>  
> -	ret = intel_atomic_check_crtcs(state);
> -	if (ret)
> -		goto fail;
> -
>  	intel_fbc_choose_crtc(dev_priv, state);
>  	ret = calc_watermark_data(state);
>  	if (ret)
> @@ -14814,6 +14804,18 @@ static int intel_atomic_check(struct drm_device *dev,
>  	if (ret)
>  		goto fail;
>  
> +	if (any_ms) {
> +		ret = intel_modeset_calc_cdclk(state);
> +		if (ret)
> +			return ret;
> +
> +		intel_modeset_clear_plls(state);
> +	}
> +
> +	ret = intel_atomic_check_crtcs(state);
> +	if (ret)
> +		goto fail;
> +
>  	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
>  					    new_crtc_state, i) {
>  		if (!needs_modeset(new_crtc_state) &&
> -- 
> 2.24.1.485.gad05a3d8e5

-- 
Ville Syrjälä
Intel

* Re: [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs
  2020-03-30 18:16     ` Lisovskiy, Stanislav
@ 2020-04-07 16:26       ` Ville Syrjälä
  0 siblings, 0 replies; 20+ messages in thread
From: Ville Syrjälä @ 2020-04-07 16:26 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

On Mon, Mar 30, 2020 at 09:16:49PM +0300, Lisovskiy, Stanislav wrote:
> On Mon, Mar 30, 2020 at 08:07:31PM +0300, Ville Syrjälä wrote:
> > On Mon, Mar 30, 2020 at 03:23:53PM +0300, Stanislav Lisovskiy wrote:
> > > According to BSpec max BW per slice is calculated using formula
> > > Max BW = CDCLK * 64. Currently when calculating min CDCLK we
> > > account only per plane requirements, however in order to avoid
> > > FIFO underruns we need to estimate accumulated BW consumed by
> > > all planes(ddb entries basically) residing on that particular
> > > DBuf slice. This will allow us to put CDCLK lower and save power
> > > when we don't need that much bandwidth or gain additional
> > > performance once plane consumption grows.
> > > 
> > > v2: - Fix long line warning
> > >     - Limited new DBuf bw checks to only gens >= 11
> > > 
> > > v3: - Lets track used Dbuf bw per slice and per crtc in bw state
> > >       (or may be in DBuf state in future), that way we don't need
> > >       to have all crtcs in state and those only if we detect if
> > >       are actually going to change cdclk, just same way as we
> > >       do with other stuff, i.e intel_atomic_serialize_global_state
> > >       and co. Just as per Ville's paradigm.
> > >     - Made dbuf bw calculation procedure look nicer by introducing
> > >       for_each_dbuf_slice_in_mask - we often will now need to iterate
> > >       slices using mask.
> > >     - According to experimental results CDCLK * 64 accounts for
> > >       overall bandwidth across all dbufs, not per dbuf.
> > > 
> > > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/display/intel_bw.c       | 61 ++++++++++++++++++-
> > >  drivers/gpu/drm/i915/display/intel_bw.h       |  8 +++
> > >  drivers/gpu/drm/i915/display/intel_cdclk.c    | 25 ++++++++
> > >  drivers/gpu/drm/i915/display/intel_display.c  |  8 +++
> > >  .../drm/i915/display/intel_display_power.h    |  2 +
> > >  drivers/gpu/drm/i915/intel_pm.c               | 34 ++++++++++-
> > >  drivers/gpu/drm/i915/intel_pm.h               |  3 +
> > >  7 files changed, 138 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c
> > > index 573a1c206b60..e9d65820fb76 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_bw.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_bw.c
> > > @@ -6,6 +6,7 @@
> > >  #include <drm/drm_atomic_state_helper.h>
> > >  
> > >  #include "intel_bw.h"
> > > +#include "intel_pm.h"
> > >  #include "intel_display_types.h"
> > >  #include "intel_sideband.h"
> > >  #include "intel_atomic.h"
> > > @@ -338,7 +339,6 @@ static unsigned int intel_bw_crtc_data_rate(const struct intel_crtc_state *crtc_
> > >  
> > >  	return data_rate;
> > >  }
> > > -
> > >  void intel_bw_crtc_update(struct intel_bw_state *bw_state,
> > >  			  const struct intel_crtc_state *crtc_state)
> > >  {
> > > @@ -419,6 +419,65 @@ intel_atomic_bw_get_state(struct intel_atomic_state *state)
> > >  	return to_intel_bw_state(bw_state);
> > >  }
> > >  
> > > +int intel_bw_calc_min_cdclk(struct intel_atomic_state *state)
> > > +{
> > > +	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
> > > +	int i = 0;
> > > +	enum plane_id plane_id;
> > > +	struct intel_crtc_state *crtc_state;
> > 
> > const
> 
> Why? There are lots of places where we use it that way.

Always make it const if you don't have to change it. Makes life a lot
easier when you immediately see that it can't mutate when you're not
looking. Which is also why pure functions are preferable to those
with side effects.
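
A minimal sketch of the point, in plain C rather than i915 code. The
struct and function names below are made up for illustration and are
not part of the driver:

```c
#include <stddef.h>

/* Hypothetical stand-in for a crtc state; not the real i915 struct. */
struct fake_crtc_state {
	unsigned int data_rate;
};

/*
 * Pure function: the const qualifiers tell the reader (and the
 * compiler) that the states are only read, never modified.
 */
static unsigned int total_data_rate(const struct fake_crtc_state *states,
				    size_t count)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		const struct fake_crtc_state *crtc_state = &states[i];

		/* "crtc_state->data_rate = 0;" would not compile here. */
		total += crtc_state->data_rate;
	}

	return total;
}
```

With the pointer declared const, an accidental write is rejected at
compile time instead of being found later in review.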

> Even in intel_compute_min_cdclk:
> 
> struct intel_crtc_state *crtc_state;
> int min_cdclk, i;
> enum pipe pipe;
> 
> for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
> 
> Also I grepped - almost every place in intel_display.c
> which has for_each_new_intel_crtc_in_state uses crtc_state as non-const.
> 
> It is called that way intel_calc_active_pipes, intel_atomic_check_crtcs,...
> almost everywhere in intel_display.c
> 
> So can you please explain what you mean?
> 
> > 
> > > +	struct intel_crtc *crtc;
> > > +	int max_bw = 0;
> > > +	int min_cdclk;
> > > +	enum pipe pipe;
> > > +	struct intel_bw_state *bw_state;
> > > +	int slice_id = 0;
> > 
> > Bunch of needless initialization, needlessly wide scope, etc.
> 
> max_bw needs to be accumulated so I guess it should be init to 0.
> 
> Agree regarding slice_id - just a leftover after I started to 
> use for_each_dbuf_slice macro.
> 
> > 
> > > +
> > > +	bw_state = intel_atomic_bw_get_state(state);
> > > +
> > 
> > Spurious whitespace.
> > 
> > > +	if (IS_ERR(bw_state))
> > > +		return PTR_ERR(bw_state);
> > > +
> > > +	for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
> > > +		struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[crtc->pipe];
> > > +
> > > +		memset(&crtc_bw->dbuf_bw, 0, sizeof(crtc_bw->dbuf_bw));
> > > +
> > > +		for_each_plane_id_on_crtc(crtc, plane_id) {
> > > +			struct skl_ddb_entry *plane_alloc =
> > > +				&crtc_state->wm.skl.plane_ddb_y[plane_id];
> > > +			struct skl_ddb_entry *uv_plane_alloc =
> > > +				&crtc_state->wm.skl.plane_ddb_uv[plane_id];
> > 
> > const
> > 
> > > +			unsigned int data_rate = crtc_state->data_rate[plane_id];
> > > +
> > 
> > more strange whitespace
> > 
> > > +			unsigned int dbuf_mask = skl_ddb_dbuf_slice_mask(dev_priv, plane_alloc);
> > > +
> > > +			dbuf_mask |= skl_ddb_dbuf_slice_mask(dev_priv, uv_plane_alloc);
> > 
> > Looks bad when you initialize part of it in declaration and the rest
> > later.
> 
> Didn't want to have a long line here, will have to make a separate function then.

Not sure why you want a separate function.

foo = a | b ?
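
Spelled out as a standalone sketch, assuming a simplified stand-in for
skl_ddb_entry and a hypothetical slice_mask() helper (neither is the
real driver code), the whole mask can be computed in the declaration:

```c
#include <stdint.h>

/* Hypothetical stand-in for skl_ddb_entry: blocks in [start, end). */
struct ddb_entry {
	uint16_t start, end;
};

/* Simplified helper for this sketch: mask of slices the entry touches. */
static uint32_t slice_mask(const struct ddb_entry *entry, uint16_t slice_size)
{
	uint32_t mask = 0;
	uint16_t slice;

	/* An empty entry occupies no slice. */
	if (entry->end <= entry->start)
		return 0;

	for (slice = entry->start / slice_size;
	     slice <= (entry->end - 1) / slice_size; slice++)
		mask |= 1u << slice;

	return mask;
}

static uint32_t plane_slice_mask(const struct ddb_entry *y,
				 const struct ddb_entry *uv,
				 uint16_t slice_size)
{
	/* Whole value computed in the declaration, nothing OR'ed in later. */
	const uint32_t dbuf_mask = slice_mask(y, slice_size) |
				   slice_mask(uv, slice_size);

	return dbuf_mask;
}
```

The declaration can be wrapped after the `|` to stay within the line
length limit, so no extra function is needed just for that.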

> 
> > 
> > > +
> > > +			DRM_DEBUG_KMS("Got dbuf mask %x for pipe %c ddb %d-%d plane %d data rate %d\n",
> > > +				      dbuf_mask, pipe_name(crtc->pipe), plane_alloc->start,
> > > +				      plane_alloc->end, plane_id, data_rate);
> > > +
> > > +			for_each_dbuf_slice_in_mask(slice_id, dbuf_mask)
> > > +				crtc_bw->dbuf_bw[slice_id] += data_rate;
> > 
> > This doesn't feel quite right for planar formats.
> > 
> > For pre-icl it works by accident since we only have the one slice so
> > we don't end up accounting for the full bandwidth from both color planes
> > to multiple slices. If we had multiple slices and chroma and luma had
> > been allocated to different slices we'd count the same thing multiple
> > times.
> > 
> > For icl+ we seem to assign the full data rate to the UV plane's slice(s)
> > since only the UV plane has data_rate[] != 0.
> 
> This sounds interesting, so do you mean that we need something like data_rate 
> and data_rate_uv? 

Perhaps. Or maybe we just start tracking the data rate separately
for the master and slave planes on icl+. No real issue on pre-icl so
that would actually work AFAICS.
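
To make the double-counting concern concrete, here is a rough sketch
loosely modelled on the accounting loop in the patch. The function
names, the two-slice limit and the separate y_rate/uv_rate inputs are
assumptions for illustration only, not existing driver fields:

```c
#include <stdint.h>

#define MAX_SLICES 2

/* Add `rate` to every slice selected in `mask`. */
static void account(unsigned int bw[MAX_SLICES], uint32_t mask,
		    unsigned int rate)
{
	int slice;

	for (slice = 0; slice < MAX_SLICES; slice++)
		if (mask & (1u << slice))
			bw[slice] += rate;
}

/* Combined: the full rate is charged to the union of both slice sets. */
static void account_combined(unsigned int bw[MAX_SLICES],
			     uint32_t y_mask, uint32_t uv_mask,
			     unsigned int y_rate, unsigned int uv_rate)
{
	account(bw, y_mask | uv_mask, y_rate + uv_rate);
}

/* Split: each color plane only loads the slices it actually occupies. */
static void account_split(unsigned int bw[MAX_SLICES],
			  uint32_t y_mask, uint32_t uv_mask,
			  unsigned int y_rate, unsigned int uv_rate)
{
	account(bw, y_mask, y_rate);
	account(bw, uv_mask, uv_rate);
}
```

When luma and chroma sit on different slices, the combined variant
charges the summed rate to both slices, while the split variant
charges each slice only for the plane that lives there.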

> My logic here is that we just get a set of slices used by uv and y planes
> in total and then account for data_rate for those. 
> We probably then need another patch to split data rate then.
> 
> I was aware of that actually but anyway being pessimistic here is better.
> Also same about if plane would be sitting on 2 slices at the same time.
> 
> > 
> > > +		}
> > > +	}
> > > +
> > > +	for_each_dbuf_slice(slice_id) {
> > > +		int total_bw_per_slice = 0;
> > > +
> > > +		for_each_pipe(dev_priv, pipe) {
> > > +			struct intel_crtc_bw *crtc_bw = &bw_state->dbuf_bw_used[pipe];
> > > +
> > > +			total_bw_per_slice += crtc_bw->dbuf_bw[slice_id];
> > > +		}
> > > +		max_bw += total_bw_per_slice;
> > 
> > So we're aggregating all the bw instead of per-slice? Is this based on
> > the other mail you sent? Deserves a comment explaining why we do such
> > odd things.
> 
> Agree need to add a comment. Currently it seems the only way to make it
> work without either having underruns or 8K not usable with bumped CDCLK.
> 
> > 
> > > +	}
> > > +
> > > +	min_cdclk = max_bw / 64;
> > > +
> > > +	return min_cdclk;
> > > +}
> > > +
> > >  int intel_bw_atomic_check(struct intel_atomic_state *state)
> > >  {
> > >  	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
> > > diff --git a/drivers/gpu/drm/i915/display/intel_bw.h b/drivers/gpu/drm/i915/display/intel_bw.h
> > > index 9a5627be6876..d2b5f32b0791 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_bw.h
> > > +++ b/drivers/gpu/drm/i915/display/intel_bw.h
> > > @@ -10,11 +10,16 @@
> > >  
> > >  #include "intel_display.h"
> > >  #include "intel_global_state.h"
> > > +#include "intel_display_power.h"
> > >  
> > >  struct drm_i915_private;
> > >  struct intel_atomic_state;
> > >  struct intel_crtc_state;
> > >  
> > > +struct intel_crtc_bw {
> > > +	int dbuf_bw[I915_MAX_DBUF_SLICES];
> > > +};
> > > +
> > >  struct intel_bw_state {
> > >  	struct intel_global_state base;
> > >  
> > > @@ -31,6 +36,8 @@ struct intel_bw_state {
> > >  	 */
> > >  	u8 qgv_points_mask;
> > >  
> > > +	struct intel_crtc_bw dbuf_bw_used[I915_MAX_PIPES];
> > 
> > The name of the struct isn't very good if it just contains the
> > dbuf bw numbers.
> > 
> > > +
> > >  	unsigned int data_rate[I915_MAX_PIPES];
> > >  	u8 num_active_planes[I915_MAX_PIPES];
> > 
> > Maybe collect all this to the per-crtc struct?
> > 
> > >  };
> > > @@ -53,5 +60,6 @@ void intel_bw_crtc_update(struct intel_bw_state *bw_state,
> > >  			  const struct intel_crtc_state *crtc_state);
> > >  int icl_pcode_restrict_qgv_points(struct drm_i915_private *dev_priv,
> > >  				  u32 points_mask);
> > > +int intel_bw_calc_min_cdclk(struct intel_atomic_state *state);
> > >  
> > >  #endif /* __INTEL_BW_H__ */
> > > diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
> > > index 979a0241fdcb..036774e7f3ec 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> > > @@ -25,6 +25,7 @@
> > >  #include "intel_cdclk.h"
> > >  #include "intel_display_types.h"
> > >  #include "intel_sideband.h"
> > > +#include "intel_bw.h"
> > >  
> > >  /**
> > >   * DOC: CDCLK / RAWCLK
> > > @@ -2001,11 +2002,19 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
> > >  {
> > >  	struct drm_i915_private *dev_priv =
> > >  		to_i915(crtc_state->uapi.crtc->dev);
> > > +	struct intel_atomic_state *state = NULL;
> > >  	int min_cdclk;
> > >  
> > >  	if (!crtc_state->hw.enable)
> > >  		return 0;
> > >  
> > > +	/*
> > > +	 * FIXME: Unfortunately when this gets called from intel_modeset_setup_hw_state
> > > +	 * there is no intel_atomic_state at all. So lets not then use it.
> > > +	 */
> > > +	if (crtc_state->uapi.state)
> > > +		state = to_intel_atomic_state(crtc_state->uapi.state);
> > 
> > This still indicates that either this isn't the right place to call this
> > or we have the state stored in the wrong place.
> > 
> > I think I'd just move the thing into intel_compute_min_cdclk() as a start.
> 
> Was thinking about this, why I didn't put it here is because just wanted
> to keep all cdclk regulating logic in this function, because intel_compute_min_cdclk
> is kind of higher level already which is supposed to call that one per crtc_state.
> Otherwise CDCLK regulating logic will be kind of spread across those two.

This is not a per-crtc limit so having it in the per-crtc function seems
wrong anyway to me.
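
A rough sketch of what applying the limit at the global level could
look like, in plain C. The array sizes, names and the flat inputs are
assumptions for illustration, not the actual intel_compute_min_cdclk()
code; only the division by 64 is taken from the patch:

```c
#define MAX_PIPES 4
#define MAX_SLICES 2

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical global computation: per-crtc minimums are combined
 * first, then the device-wide DBuf bandwidth limit is applied once.
 */
static int compute_min_cdclk(const int crtc_min_cdclk[MAX_PIPES],
			     const int dbuf_bw[MAX_PIPES][MAX_SLICES])
{
	int min_cdclk = 0;
	int total_bw = 0;
	int pipe, slice;

	for (pipe = 0; pipe < MAX_PIPES; pipe++) {
		min_cdclk = max_int(min_cdclk, crtc_min_cdclk[pipe]);

		for (slice = 0; slice < MAX_SLICES; slice++)
			total_bw += dbuf_bw[pipe][slice];
	}

	/* The patch derives the limit as total bandwidth / 64. */
	return max_int(min_cdclk, total_bw / 64);
}
```

Here the bandwidth-derived value is computed once per device state
rather than once per crtc, and no crtc_state needs a parent atomic
state to reach it.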

> 
> > 
> > > +
> > >  	min_cdclk = intel_pixel_rate_to_cdclk(crtc_state);
> > >  
> > >  	/* pixel rate mustn't exceed 95% of cdclk with IPS on BDW */
> > > @@ -2080,6 +2089,22 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
> > >  	if (IS_TIGERLAKE(dev_priv))
> > >  		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
> > >  
> > > +	/*
> > > +	 * Similar story as with skl_write_plane_wm and intel_enable_sagv
> > > +	 * - in some certain driver parts, we don't have any guarantee that
> > > +	 * parent exists. So we might be having a crtc_state without
> > > +	 * parent state.
> > > +	 */
> > > +	if (INTEL_GEN(dev_priv) >= 11) {
> > > +		if (state) {
> > > +			int dbuf_bw_cdclk = intel_bw_calc_min_cdclk(state);
> > > +
> > > +			DRM_DEBUG_KMS("DBuf bw min cdclk %d current min_cdclk %d\n",
> > > +				      dbuf_bw_cdclk, min_cdclk);
> > > +			min_cdclk = max(min_cdclk, dbuf_bw_cdclk);
> > > +		}
> > > +	}
> > > +
> > >  	if (min_cdclk > dev_priv->max_cdclk_freq) {
> > >  		drm_dbg_kms(&dev_priv->drm,
> > >  			    "required cdclk (%d kHz) exceeds max (%d kHz)\n",
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > > index 9fd32d61ebfe..fa2870c0d7fd 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > @@ -14678,6 +14678,14 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state,
> > >  		if (old_active_planes == new_active_planes)
> > >  			continue;
> > >  
> > > +		/*
> > > +		 * active_planes bitmask has been updated, whenever amount
> > > +		 * of active planes had changed we need to recalculate CDCLK
> > > +		 * as it depends on total bandwidth now, not only min_cdclk
> > > +		 * per plane.
> > > +		 */
> > > +		*need_cdclk_calc = true;
> > > +
> > >  		ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
> > >  		if (ret)
> > >  			return ret;
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > index 468e8fb0203a..9e33fb90422f 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > @@ -308,6 +308,8 @@ intel_display_power_put_async(struct drm_i915_private *i915,
> > >  }
> > >  #endif
> > >  
> > > +#define I915_MAX_DBUF_SLICES 2
> > > +
> > >  enum dbuf_slice {
> > >  	DBUF_S1,
> > >  	DBUF_S2,
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > > index 551933e3f7da..5dcd1cd09ad7 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > @@ -4055,10 +4055,9 @@ icl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > >  	return offset;
> > >  }
> > >  
> > > -static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
> > > +u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
> > >  {
> > >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > -
> > >  	drm_WARN_ON(&dev_priv->drm, ddb_size == 0);
> > >  
> > >  	if (INTEL_GEN(dev_priv) < 11)
> > > @@ -4067,6 +4066,37 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv)
> > >  	return ddb_size;
> > >  }
> > >  
> > > +u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
> > > +			    const struct skl_ddb_entry *entry)
> > > +{
> > > +	u32 slice_mask = 0;
> > > +	u16 ddb_size = intel_get_ddb_size(dev_priv);
> > > +	u16 num_supported_slices = INTEL_INFO(dev_priv)->num_supported_dbuf_slices;
> > > +	u16 slice_size = ddb_size / num_supported_slices;
> > > +	u16 start_slice;
> > > +	u16 end_slice;
> > > +
> > > +	if (!skl_ddb_entry_size(entry))
> > > +		return 0;
> > > +
> > > +	start_slice = entry->start / slice_size;
> > > +	end_slice = (entry->end - 1) / slice_size;
> > > +
> > > +	DRM_DEBUG_KMS("ddb size %d slices %d slice size %d start slice %d end slice %d\n",
> > > +		      ddb_size, num_supported_slices, slice_size, start_slice, end_slice);
> > > +
> > > +	/*
> > > +	 * Per plane DDB entry can in a really worst case be on multiple slices
> > > +	 * but single entry is anyway contigious.
> > > +	 */
> > > +	while (start_slice <= end_slice) {
> > > +		slice_mask |= 1 << start_slice;
> > > +		start_slice++;
> > > +	}
> > 
> > GENMASK() or something?
> 
> Will fix, thanks for proposal.
> 
> > 
> > > +
> > > +	return slice_mask;
> > > +}
> > > +
> > >  static u8 skl_compute_dbuf_slices(const struct intel_crtc_state *crtc_state,
> > >  				  u8 active_pipes);
> > >  
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.h b/drivers/gpu/drm/i915/intel_pm.h
> > > index 069515f04170..41c61ad71ce6 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.h
> > > +++ b/drivers/gpu/drm/i915/intel_pm.h
> > > @@ -37,6 +37,9 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
> > >  			       struct skl_ddb_entry *ddb_y,
> > >  			       struct skl_ddb_entry *ddb_uv);
> > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv);
> > > +u16 intel_get_ddb_size(struct drm_i915_private *dev_priv);
> > > +u32 skl_ddb_dbuf_slice_mask(struct drm_i915_private *dev_priv,
> > > +			    const struct skl_ddb_entry *entry);
> > >  void skl_pipe_wm_get_hw_state(struct intel_crtc *crtc,
> > >  			      struct skl_pipe_wm *out);
> > >  void g4x_wm_sanitize(struct drm_i915_private *dev_priv);
> > > -- 
> > > 2.24.1.485.gad05a3d8e5
> > 
> > -- 
> > Ville Syrjälä
> > Intel
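
For reference, the GENMASK() suggestion for the slice mask loop could
be sketched as below. This is a userspace approximation: SLICE_GENMASK
is a local stand-in for the kernel's GENMASK(h, l), and
ddb_slice_mask() with its flat arguments is a hypothetical
simplification of skl_ddb_dbuf_slice_mask():

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l): bits l..h set. */
#define SLICE_GENMASK(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Mask of the slices covered by the DDB range [start, end). */
static uint32_t ddb_slice_mask(uint16_t start, uint16_t end,
			       uint16_t slice_size)
{
	/* An empty entry occupies no slice. */
	if (end <= start)
		return 0;

	/* A single entry is contiguous, so one bit range covers it. */
	return SLICE_GENMASK((end - 1) / slice_size, start / slice_size);
}
```

This replaces the while loop with a single expression and relies on
the same assumption as the original comment, namely that one DDB entry
is contiguous.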

-- 
Ville Syrjälä
Intel

end of thread, other threads:[~2020-04-07 16:27 UTC | newest]

Thread overview: 20+ messages
2020-03-30 12:23 [Intel-gfx] [PATCH v3 0/5] Consider DBuf bandwidth when calculating CDCLK Stanislav Lisovskiy
2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 1/5] drm/i915: Decouple cdclk calculation from modeset checks Stanislav Lisovskiy
2020-03-30 15:44   ` Ville Syrjälä
2020-04-07  7:33   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
2020-04-07 16:15     ` Ville Syrjälä
2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 2/5] drm/i915: Force recalculate min_cdclk if planes config changed Stanislav Lisovskiy
2020-03-30 16:18   ` Ville Syrjälä
2020-03-30 17:56     ` Lisovskiy, Stanislav
2020-04-07  7:36   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 3/5] drm/i915: Introduce for_each_dbuf_slice_in_mask macro Stanislav Lisovskiy
2020-03-30 16:07   ` Ville Syrjälä
2020-04-07  7:38   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 4/5] drm/i915: Adjust CDCLK accordingly to our DBuf bw needs Stanislav Lisovskiy
2020-03-30 17:07   ` Ville Syrjälä
2020-03-30 18:16     ` Lisovskiy, Stanislav
2020-04-07 16:26       ` Ville Syrjälä
2020-04-07  7:39   ` [Intel-gfx] [PATCH v4 " Stanislav Lisovskiy
2020-03-30 12:23 ` [Intel-gfx] [PATCH v3 5/5] drm/i915: Remove unneeded hack now for CDCLK Stanislav Lisovskiy
2020-03-30 18:07 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev3) Patchwork
2020-04-07  8:03 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Consider DBuf bandwidth when calculating CDCLK (rev7) Patchwork
