All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-14 12:24 ` Stanislav Lisovskiy
  0 siblings, 0 replies; 34+ messages in thread
From: Stanislav Lisovskiy @ 2019-11-14 12:24 UTC (permalink / raw)
  To: intel-gfx

Also implemented algorithm for choosing DBuf slice configuration
based on active pipes, pipe ratio as stated in BSpec 12716.

Now pipe allocation still stays proportional to pipe width as before,
however within allowed DBuf slice for this particular configuration.

v2: Remove unneeded check from commit as ddb enabled slices might
    now differ from hw state.

v3: - Added new field "supported_slices" to ddb to separate max
      supported slices vs currently enabled, to avoid confusion.
    - Removed obsolete comments and code related to DBuf(Matthew Roper).
    - Some code style and long line removal(Matthew Roper).
    - Added WARN_ON to new ddb range offset calc function(Matthew Roper).
    - Removed platform specific call to calc pipe ratio from ddb
      allocation function and fixed the return value(Matthew Roper)
    - Refactored DBUF slice allocation table to improve readability
    - Added DBUF slice allocation for TGL as well.
    - ICL(however not TGL) seems to voluntarily enable second DBuf slice
      after pm suspend/resume causing a mismatch failure, because we
      update DBuf slices only if we do a modeset, however this check
      is done always. Fixed it to be done only when modeset for ICL.

v4: - Now enabled slices is not actually a number, but a bitmask,
      because we might need to enable second slice only and number
      of slices would still 1 and that current code doesn't allow.
    - Remove redundant duplicate code to have some unified way of
      enabling dbuf slices instead of hardcoding.

v5: - Fix failing gen9_assert_dbuf_enabled as it was naively thinking
      that we have only one DBUF_CTL slice. Now another version for
      gen11 will check both slices as only second one can be enabled,
      to keep CI happy.

v6: - Removed enabled dbuf assertion completely. Previous code
      was keeping dbuf enabled, even(!) when _dbuf_disable is called.
      Currently we enable/disable dbuf slices based on requrement
      so no point in asserting this here.
    - Removed unnecessary modeset check from verify_wm_state(Matthew Roper)
    - Slices intersection after union is same as final result(Matthew Roper)
    - Moved DBuf slices to intel_device_info(Matthew Roper)
    - Some macros added(Matthew Roper)
    - ddb range is now always less or equal to ddb size - no need for additional
      checks(previously needed as we had some bandwidth checks in there which
      could "suddenly" return ddb_size smaller than it is.

v7: Minor costemic changes:
    - Changed icl_dbuf_slices_restore name to icl_program_dbuf_slices
      as it more corresponds to the actual functionality.
    - Some simplification with supported slices for BXT and GLK
    - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
      this is not used for skl anymore.
    - Changed DBuf slice assignment order for TGL case

Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Matthew Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
 .../drm/i915/display/intel_display_power.c    |  98 ++---
 .../drm/i915/display/intel_display_power.h    |   2 +
 drivers/gpu/drm/i915/i915_drv.c               |   5 +
 drivers/gpu/drm/i915/i915_drv.h               |   2 +-
 drivers/gpu/drm/i915/i915_pci.c               |   6 +-
 drivers/gpu/drm/i915/i915_reg.h               |   8 +-
 drivers/gpu/drm/i915/intel_device_info.h      |   1 +
 drivers/gpu/drm/i915/intel_pm.c               | 387 ++++++++++++++++--
 9 files changed, 431 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 876fc25968bf..bd7aff675198 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13387,7 +13387,7 @@ static void verify_wm_state(struct intel_crtc *crtc,
 
 	if (INTEL_GEN(dev_priv) >= 11 &&
 	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
-		DRM_ERROR("mismatch in DBUF Slices (expected %u, got %u)\n",
+		DRM_ERROR("mismatch in DBUF Slices (expected %x, got %x)\n",
 			  sw_ddb->enabled_slices,
 			  hw->ddb.enabled_slices);
 
@@ -14614,15 +14614,24 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
 	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
 	u8 required_slices = state->wm_results.ddb.enabled_slices;
 	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
+	u8 slices_union = hw_enabled_slices | required_slices;
+	u8 slices_intersection = required_slices;
 
 	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i)
 		/* ignore allocations for crtc's that have been turned off. */
 		if (new_crtc_state->hw.active)
 			entries[i] = old_crtc_state->wm.skl.ddb;
 
-	/* If 2nd DBuf slice required, enable it here */
-	if (INTEL_GEN(dev_priv) >= 11 && required_slices > hw_enabled_slices)
-		icl_dbuf_slices_update(dev_priv, required_slices);
+	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
+		      required_slices, hw_enabled_slices);
+
+	/*
+	 * If some other DBuf slice required, enable it here,
+	 * as here we only add more slices, but not remove to prevent
+	 * issues if somebody still uses those.
+	 */
+	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
+		icl_dbuf_slices_update(dev_priv, slices_union);
 
 	/*
 	 * Whenever the number of active pipes changes, we need to make sure we
@@ -14681,9 +14690,12 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
 		}
 	} while (progress);
 
-	/* If 2nd DBuf slice is no more required disable it */
-	if (INTEL_GEN(dev_priv) >= 11 && required_slices < hw_enabled_slices)
-		icl_dbuf_slices_update(dev_priv, required_slices);
+	/*
+	 * If some other DBuf slice is not needed, disable it here,
+	 * as here we only remove slices, which are not needed anymore.
+	 */
+	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
+		icl_dbuf_slices_update(dev_priv, slices_intersection);
 }
 
 static void intel_atomic_helper_free_state(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
index ce1b64f4dd44..a78bc9b61862 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -1031,15 +1031,6 @@ static bool gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
 		(I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
 }
 
-static void gen9_assert_dbuf_enabled(struct drm_i915_private *dev_priv)
-{
-	u32 tmp = I915_READ(DBUF_CTL);
-
-	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
-	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
-	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
-}
-
 static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
 {
 	struct intel_cdclk_state cdclk_state = {};
@@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
 	/* Can't read out voltage_level so can't use intel_cdclk_changed() */
 	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw, &cdclk_state));
 
-	gen9_assert_dbuf_enabled(dev_priv);
-
 	if (IS_GEN9_LP(dev_priv))
 		bxt_verify_ddi_phy_power_wells(dev_priv);
 
@@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct drm_i915_private *dev_priv)
 	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
 }
 
-static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
+int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
 {
-	if (INTEL_GEN(dev_priv) < 11)
-		return 1;
-	return 2;
+	return INTEL_INFO(dev_priv)->supported_slices;
+}
+
+void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
+{
+	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
+
+	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
 }
 
 void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
 			    u8 req_slices)
 {
-	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
-	bool ret;
+	bool ret = true;
+	int i;
+	int max_slices = intel_dbuf_max_slices(dev_priv);
 
-	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
+	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
 		DRM_ERROR("Invalid number of dbuf slices requested\n");
 		return;
 	}
 
-	if (req_slices == hw_enabled_slices || req_slices == 0)
-		return;
+	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
 
-	if (req_slices > hw_enabled_slices)
-		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, true);
-	else
-		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, false);
+	for (i = 0; i < max_slices; i++) {
+		int slice_bit = BIT(i);
+		bool slice_set = (slice_bit & req_slices) != 0;
+
+		switch (slice_bit) {
+		case DBUF_S1_BIT:
+			ret = ret && intel_dbuf_slice_set(dev_priv,
+							  DBUF_CTL_S1,
+							  slice_set);
+			break;
+		case DBUF_S2_BIT:
+			ret = ret && intel_dbuf_slice_set(dev_priv,
+							  DBUF_CTL_S2,
+							  slice_set);
+			break;
+		default:
+			MISSING_CASE(slice_bit);
+		}
+	}
 
 	if (ret)
 		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
@@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
 
 static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
 {
-	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) | DBUF_POWER_REQUEST);
-	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) | DBUF_POWER_REQUEST);
-	POSTING_READ(DBUF_CTL_S2);
-
-	udelay(10);
-
-	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
-	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
-		DRM_ERROR("DBuf power enable timeout\n");
-	else
-		/*
-		 * FIXME: for now pretend that we only have 1 slice, see
-		 * intel_enabled_dbuf_slices_num().
-		 */
-		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
+	/*
+	 * Just power up 1 slice, we will
+	 * figure out later which slices we have and what we need.
+	 */
+	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
+	icl_program_dbuf_slices(dev_priv);
 }
 
 static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
 {
-	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) & ~DBUF_POWER_REQUEST);
-	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) & ~DBUF_POWER_REQUEST);
-	POSTING_READ(DBUF_CTL_S2);
-
-	udelay(10);
-
-	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
-	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
-		DRM_ERROR("DBuf power disable timeout!\n");
-	else
-		/*
-		 * FIXME: for now pretend that the first slice is always
-		 * enabled, see intel_enabled_dbuf_slices_num().
-		 */
-		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
+	/*
+	 * Disable all slices
+	 */
+	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
+	icl_program_dbuf_slices(dev_priv);
 }
 
 static void icl_mbus_init(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index 1da04f3e0fb3..b7057d68ad78 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -311,8 +311,10 @@ intel_display_power_put_async(struct drm_i915_private *i915,
 	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
 	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
 
+int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
 void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
 			    u8 req_slices);
+void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
 
 void chv_phy_powergate_lanes(struct intel_encoder *encoder,
 			     bool override, unsigned int mask);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 82e4e6bf08c3..96741e68633a 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -51,6 +51,7 @@
 #include "display/intel_bw.h"
 #include "display/intel_cdclk.h"
 #include "display/intel_display_types.h"
+#include "display/intel_display_power.h"
 #include "display/intel_dp.h"
 #include "display/intel_fbdev.h"
 #include "display/intel_hotplug.h"
@@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct device *kdev)
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		ret = vlv_resume_prepare(dev_priv, true);
 
+	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices reg */
+	if (INTEL_GEN(dev_priv) == 11)
+		icl_program_dbuf_slices(dev_priv);
+
 	intel_uncore_runtime_resume(&dev_priv->uncore);
 
 	intel_runtime_pm_enable_interrupts(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7e0f67babe20..a396977c9c2d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const struct skl_ddb_entry *e1,
 }
 
 struct skl_ddb_allocation {
-	u8 enabled_slices; /* GEN11 has configurable 2 slices */
+	u8 enabled_slices;   /* Bitmask of currently enabled DBuf slices */
 };
 
 struct skl_ddb_values {
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 1bb701d32a5d..8e86af505730 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -614,7 +614,8 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_gt_uc = 1, \
 	.display.has_hdcp = 1, \
 	.display.has_ipc = 1, \
-	.ddb_size = 896
+	.ddb_size = 896, \
+	.supported_slices = 1
 
 #define SKL_PLATFORM \
 	GEN9_FEATURES, \
@@ -650,6 +651,7 @@ static const struct intel_device_info intel_skylake_gt4_info = {
 	GEN(9), \
 	.is_lp = 1, \
 	.display.has_hotplug = 1, \
+	.supported_slices = 1, \
 	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
 	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
 	.has_64bit_reloc = 1, \
@@ -737,6 +739,7 @@ static const struct intel_device_info intel_coffeelake_gt3_info = {
 	GEN9_FEATURES, \
 	GEN(10), \
 	.ddb_size = 1024, \
+	.supported_slices = 1, \
 	.display.has_dsc = 1, \
 	.has_coherent_ggtt = false, \
 	GLK_COLORS
@@ -773,6 +776,7 @@ static const struct intel_device_info intel_cannonlake_info = {
 	}, \
 	GEN(11), \
 	.ddb_size = 2048, \
+	.supported_slices = 2, \
 	.has_logical_ring_elsq = 1, \
 	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a607ea520829..fba5731063d8 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7734,9 +7734,11 @@ enum {
 #define DISP_ARB_CTL2	_MMIO(0x45004)
 #define  DISP_DATA_PARTITION_5_6	(1 << 6)
 #define  DISP_IPC_ENABLE		(1 << 3)
-#define DBUF_CTL	_MMIO(0x45008)
-#define DBUF_CTL_S1	_MMIO(0x45008)
-#define DBUF_CTL_S2	_MMIO(0x44FE8)
+#define DBUF_S1_BIT			BIT(0)
+#define DBUF_S2_BIT			BIT(1)
+#define DBUF_CTL			_MMIO(0x45008)
+#define DBUF_CTL_S1			_MMIO(0x45008)
+#define DBUF_CTL_S2			_MMIO(0x44FE8)
 #define  DBUF_POWER_REQUEST		(1 << 31)
 #define  DBUF_POWER_STATE		(1 << 30)
 #define GEN7_MSG_CTL	_MMIO(0x45010)
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 4bdf8a6cfb47..ba34e1a5c591 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -180,6 +180,7 @@ struct intel_device_info {
 	} display;
 
 	u16 ddb_size; /* in blocks */
+	u8 supported_slices; /* number of DBuf slices */
 
 	/* Register offsets for the various display pipes and transcoders */
 	int pipe_offsets[I915_MAX_TRANSCODERS];
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2d389e437e87..1ea81ab92429 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
 }
 
-static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
+static u8 intel_enabled_dbuf_slices(struct drm_i915_private *dev_priv)
 {
-	u8 enabled_slices;
-
-	/* Slice 1 will always be enabled */
-	enabled_slices = 1;
+	u8 enabled_slices = 0;
 
 	/* Gen prior to GEN11 have only one DBuf slice */
 	if (INTEL_GEN(dev_priv) < 11)
-		return enabled_slices;
+		return DBUF_S1_BIT;
 
-	/*
-	 * FIXME: for now we'll only ever use 1 slice; pretend that we have
-	 * only that 1 slice enabled until we have a proper way for on-demand
-	 * toggling of the second slice.
-	 */
-	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
-		enabled_slices++;
+	/* Check if second DBuf slice is enabled */
+	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
+		enabled_slices |= DBUF_S1_BIT;
+
+	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
+		enabled_slices |= DBUF_S2_BIT;
 
 	return enabled_slices;
 }
@@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
 			      const int num_active,
 			      struct skl_ddb_allocation *ddb)
 {
-	const struct drm_display_mode *adjusted_mode;
-	u64 total_data_bw;
 	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
-
 	WARN_ON(ddb_size == 0);
 
 	if (INTEL_GEN(dev_priv) < 11)
 		return ddb_size - 4; /* 4 blocks for bypass path allocation */
 
-	adjusted_mode = &crtc_state->hw.adjusted_mode;
-	total_data_bw = total_data_rate * drm_mode_vrefresh(adjusted_mode);
+	return ddb_size;
+}
 
-	/*
-	 * 12GB/s is maximum BW supported by single DBuf slice.
-	 *
-	 * FIXME dbuf slice code is broken:
-	 * - must wait for planes to stop using the slice before powering it off
-	 * - plane straddling both slices is illegal in multi-pipe scenarios
-	 * - should validate we stay within the hw bandwidth limits
-	 */
-	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
-		ddb->enabled_slices = 2;
-	} else {
-		ddb->enabled_slices = 1;
-		ddb_size /= 2;
-	}
+/*
+ * Calculate initial DBuf slice offset, based on slice size
+ * and mask(i.e if slice size is 1024 and second slice is enabled
+ * offset would be 1024)
+ */
+static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
+					   u32 slice_size, u32 ddb_size)
+{
+	u32 offset = 0;
 
-	return ddb_size;
+	if (!dbuf_slice_mask)
+		return 0;
+
+	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
+
+	WARN_ON(offset >= ddb_size);
+	return offset;
 }
 
+static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
+				      int pipe, u32 active_pipes,
+				      const struct intel_crtc_state *crtc_state);
+
 static void
 skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 				   const struct intel_crtc_state *crtc_state,
@@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
 	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
 	u16 ddb_size;
+	u32 ddb_range_size;
 	u32 i;
+	u32 dbuf_slice_mask;
+	u32 active_pipes;
+	u32 offset;
+	u32 slice_size;
+	u32 total_slice_mask;
+	u32 start, end;
 
 	if (WARN_ON(!state) || !crtc_state->hw.active) {
 		alloc->start = 0;
@@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 		return;
 	}
 
-	if (intel_state->active_pipe_changes)
+	if (intel_state->active_pipe_changes) {
 		*num_active = hweight8(intel_state->active_pipes);
-	else
+		active_pipes = intel_state->active_pipes;
+	} else {
 		*num_active = hweight8(dev_priv->active_pipes);
+		active_pipes = dev_priv->active_pipes;
+	}
 
 	ddb_size = intel_get_ddb_size(dev_priv, crtc_state, total_data_rate,
 				      *num_active, ddb);
 
+	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
+
+	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
+
+	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
+
 	/*
 	 * If the state doesn't change the active CRTC's or there is no
 	 * modeset request, then there's no need to recalculate;
@@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 		return;
 	}
 
+	/*
+	 * Get allowed DBuf slices for correspondent pipe and platform.
+	 */
+	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv, for_pipe,
+						     active_pipes, crtc_state);
+
+	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
+		      dbuf_slice_mask,
+		      for_pipe, active_pipes);
+
+	/*
+	 * Figure out at which DBuf slice we start, i.e if we start at Dbuf S2
+	 * and slice size is 1024, the offset would be 1024
+	 */
+	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
+						 slice_size, ddb_size);
+
+	/*
+	 * Figure out total size of allowed DBuf slices, which is basically
+	 * a number of allowed slices for that pipe multiplied by slice size.
+	 * Inside of this
+	 * range ddb entries are still allocated in proportion to display width.
+	 */
+	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
+
 	/*
 	 * Watermark/ddb requirement highly depends upon width of the
 	 * framebuffer, So instead of allocating DDB equally among pipes
 	 * distribute DDB based on resolution/width of the display.
 	 */
+	total_slice_mask = dbuf_slice_mask;
 	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state, i) {
 		const struct drm_display_mode *adjusted_mode =
 			&crtc_state->hw.adjusted_mode;
 		enum pipe pipe = crtc->pipe;
 		int hdisplay, vdisplay;
+		u32 pipe_dbuf_slice_mask = \
+			i915_get_allowed_dbuf_mask(dev_priv,
+						pipe,
+						active_pipes,
+						crtc_state);
 
 		if (!crtc_state->hw.enable)
 			continue;
 
+		/*
+		 * According to BSpec pipe can share one dbuf slice with another
+		 * pipes or pipe can use multiple dbufs, in both cases we
+		 * account for other pipes only if they have exactly same mask.
+		 * However we need to account how many slices we should enable
+		 * in total.
+		 */
+		total_slice_mask |= pipe_dbuf_slice_mask;
+
+		/*
+		 * Do not account pipes using other slice sets
+		 * luckily as of current BSpec slice sets do not partially
+		 * intersect(pipes share either same one slice or same slice set
+		 * i.e no partial intersection), so it is enough to check for
+		 * equality for now.
+		 */
+		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
+			continue;
+
 		drm_mode_get_hv_timing(adjusted_mode, &hdisplay, &vdisplay);
 		total_width += hdisplay;
 
@@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 			pipe_width = hdisplay;
 	}
 
-	alloc->start = ddb_size * width_before_pipe / total_width;
-	alloc->end = ddb_size * (width_before_pipe + pipe_width) / total_width;
+	ddb->enabled_slices = total_slice_mask;
+
+	start = ddb_range_size * width_before_pipe / total_width;
+	end = ddb_range_size * (width_before_pipe + pipe_width) / total_width;
+
+	alloc->start = offset + start;
+	alloc->end = offset + end;
+
+	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
+		      alloc->start, alloc->end);
+	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
+		      ddb->enabled_slices,
+		      INTEL_INFO(dev_priv)->supported_slices);
 }
 
 static int skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
@@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
 void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
 			  struct skl_ddb_allocation *ddb /* out */)
 {
-	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
+	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
+	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb->enabled_slices);
 }
 
 /*
@@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
 	return total_data_rate;
 }
 
+static uint_fixed_16_16_t
+icl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
+{
+	u32 src_w, src_h, dst_w, dst_h;
+	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+	uint_fixed_16_16_t downscale_h, downscale_w;
+	const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode;
+
+	src_w = crtc_state->pipe_src_w;
+	src_h = crtc_state->pipe_src_h;
+	dst_w = adjusted_mode->crtc_hdisplay;
+	dst_h = adjusted_mode->crtc_vdisplay;
+
+	fp_w_ratio = div_fixed16(src_w, dst_w);
+	fp_h_ratio = div_fixed16(src_h, dst_h);
+	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
+	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
+
+	return mul_fixed16(downscale_w, downscale_h);
+}
+
+uint_fixed_16_16_t
+icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
+{
+	struct drm_plane *plane;
+	const struct drm_plane_state *drm_plane_state;
+	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
+	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+
+	if (!crtc_state->uapi.enable)
+		return max_downscale;
+
+	drm_atomic_crtc_state_for_each_plane_state(plane, drm_plane_state, &crtc_state->uapi) {
+		uint_fixed_16_16_t plane_downscale;
+		const struct intel_plane_state *plane_state =
+			to_intel_plane_state(drm_plane_state);
+
+		if (!intel_wm_plane_visible(crtc_state, plane_state))
+			continue;
+
+		plane_downscale = skl_plane_downscale_amount(crtc_state, plane_state);
+
+		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
+			      to_intel_plane(plane)->id,
+			      plane_downscale.val >> 16,
+			      plane_downscale.val & 0xffff);
+
+		max_downscale = max_fixed16(plane_downscale, max_downscale);
+	}
+
+
+	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
+
+	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
+		       crtc->pipe, pipe_downscale.val >> 16,
+		       pipe_downscale.val & 0xffff);
+
+	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
+
+	/*
+	 * Convert result to percentage: get the actual ratio instead of rate,
+	 * then multiply by 100 to get percentage.
+	 */
+	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1, pipe_downscale));
+
+	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
+		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
+
+	return pipe_ratio;
+}
+
+struct dbuf_slice_conf_entry {
+	u32 dbuf_mask[I915_MAX_PIPES];
+	u32 active_pipes;
+};
+
+
+#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
+	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
+#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
+	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
+#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
+	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
+#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
+	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
+#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
+#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
+#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
+#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
+#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
+#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
+#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
+#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D) }
+#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D) }
+#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D) }
+#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
+#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
+	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
+					  BIT(PIPE_C) | BIT(PIPE_D) }
+
+/*
+ * Table taken from Bspec 12716
+ * Pipes do have some preferred DBuf slice affinity,
+ * plus there are some hardcoded requirements on how
+ * those should be distributed for multipipe scenarios.
+ * For more DBuf slices algorithm can get even more messy
+ * and less readable, so decided to use a table almost
+ * as is from BSpec itself - that way it is at least easier
+ * to compare, change and check.
+ */
+struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
+{ { 0, 0, 0, 0 }, 0 },
+ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
+ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
+ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
+ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
+ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
+ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe ratio >= 88.8 */
+ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
+};
+
+/*
+ * Table taken from Bspec 49255
+ * Pipes do have some preferred DBuf slice affinity,
+ * plus there are some hardcoded requirements on how
+ * those should be distributed for multipipe scenarios.
+ * For more DBuf slices algorithm can get even more messy
+ * and less readable, so decided to use a table almost
+ * as is from BSpec itself - that way it is at least easier
+ * to compare, change and check.
+ */
+struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
+{ { 0, 0, 0, 0 }, 0 },
+ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
+ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
+ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
+};
+
+/*
+ * This function finds an entry with same enabled pipe configuration and
+ * returns correspondent DBuf slice mask as stated in BSpec for particular
+ * platform.
+ */
+static u32 icl_get_allowed_dbuf_mask(int pipe,
+				     u32 active_pipes,
+				     const struct intel_crtc_state *crtc_state)
+{
+	int i;
+	uint_fixed_16_16_t pipe_ratio;
+
+	/*
+	 * Calculate pipe ratio as stated in BSpec 28692
+	 */
+	pipe_ratio = icl_get_pipe_ratio(crtc_state);
+
+	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
+		if (icl_allowed_dbufs[i].active_pipes == active_pipes) {
+			if (hweight32(active_pipes) == 1) {
+				/*
+				 * According to BSpec 12716: if ratio >= 88.8
+				 * always use single dbuf slice.
+				 */
+				if (pipe_ratio.val >= div_fixed16(888, 10).val)
+					++i;
+			}
+			return icl_allowed_dbufs[i].dbuf_mask[pipe];
+		}
+	}
+	return 0;
+}
+
+/*
+ * This function finds an entry with same enabled pipe configuration and
+ * returns correspondent DBuf slice mask as stated in BSpec for particular
+ * platform.
+ */
+static u32 tgl_get_allowed_dbuf_mask(int pipe,
+				     u32 active_pipes,
+				     const struct intel_crtc_state *crtc_state)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
+		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
+			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
+	}
+	return 0;
+}
+
+u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
+				      int pipe, u32 active_pipes,
+				      const struct intel_crtc_state *crtc_state)
+{
+	if (IS_GEN(dev_priv, 11))
+		return icl_get_allowed_dbuf_mask(pipe,
+						 active_pipes,
+						 crtc_state);
+	else if (IS_GEN(dev_priv, 12))
+		return tgl_get_allowed_dbuf_mask(pipe,
+						 active_pipes,
+						 crtc_state);
+	/*
+	 * For anything else just return one slice yet.
+	 * Should be extended for other platforms.
+	 */
+	return DBUF_S1_BIT;
+}
+
 static u64
 icl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
 				 u64 *plane_data_rate)
-- 
2.17.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-14 12:24 ` Stanislav Lisovskiy
  0 siblings, 0 replies; 34+ messages in thread
From: Stanislav Lisovskiy @ 2019-11-14 12:24 UTC (permalink / raw)
  To: intel-gfx

Also implemented algorithm for choosing DBuf slice configuration
based on active pipes, pipe ratio as stated in BSpec 12716.

Now pipe allocation still stays proportional to pipe width as before,
however within allowed DBuf slice for this particular configuration.

v2: Remove unneeded check from commit as ddb enabled slices might
    now differ from hw state.

v3: - Added new field "supported_slices" to ddb to separate max
      supported slices vs currently enabled, to avoid confusion.
    - Removed obsolete comments and code related to DBuf(Matthew Roper).
    - Some code style and long line removal(Matthew Roper).
    - Added WARN_ON to new ddb range offset calc function(Matthew Roper).
    - Removed platform specific call to calc pipe ratio from ddb
      allocation function and fixed the return value(Matthew Roper)
    - Refactored DBUF slice allocation table to improve readability
    - Added DBUF slice allocation for TGL as well.
    - ICL(however not TGL) seems to voluntarily enable second DBuf slice
      after pm suspend/resume causing a mismatch failure, because we
      update DBuf slices only if we do a modeset, however this check
      is done always. Fixed it to be done only when modeset for ICL.

v4: - Now enabled slices is not actually a number, but a bitmask,
      because we might need to enable second slice only and number
      of slices would still 1 and that current code doesn't allow.
    - Remove redundant duplicate code to have some unified way of
      enabling dbuf slices instead of hardcoding.

v5: - Fix failing gen9_assert_dbuf_enabled as it was naively thinking
      that we have only one DBUF_CTL slice. Now another version for
      gen11 will check both slices as only second one can be enabled,
      to keep CI happy.

v6: - Removed enabled dbuf assertion completely. Previous code
      was keeping dbuf enabled, even(!) when _dbuf_disable is called.
      Currently we enable/disable dbuf slices based on requrement
      so no point in asserting this here.
    - Removed unnecessary modeset check from verify_wm_state(Matthew Roper)
    - Slices intersection after union is same as final result(Matthew Roper)
    - Moved DBuf slices to intel_device_info(Matthew Roper)
    - Some macros added(Matthew Roper)
    - ddb range is now always less or equal to ddb size - no need for additional
      checks(previously needed as we had some bandwidth checks in there which
      could "suddenly" return ddb_size smaller than it is.

v7: Minor costemic changes:
    - Changed icl_dbuf_slices_restore name to icl_program_dbuf_slices
      as it more corresponds to the actual functionality.
    - Some simplification with supported slices for BXT and GLK
    - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
      this is not used for skl anymore.
    - Changed DBuf slice assignment order for TGL case

Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Matthew Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@intel.com>
Cc: James Ausmus <james.ausmus@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
 .../drm/i915/display/intel_display_power.c    |  98 ++---
 .../drm/i915/display/intel_display_power.h    |   2 +
 drivers/gpu/drm/i915/i915_drv.c               |   5 +
 drivers/gpu/drm/i915/i915_drv.h               |   2 +-
 drivers/gpu/drm/i915/i915_pci.c               |   6 +-
 drivers/gpu/drm/i915/i915_reg.h               |   8 +-
 drivers/gpu/drm/i915/intel_device_info.h      |   1 +
 drivers/gpu/drm/i915/intel_pm.c               | 387 ++++++++++++++++--
 9 files changed, 431 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 876fc25968bf..bd7aff675198 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13387,7 +13387,7 @@ static void verify_wm_state(struct intel_crtc *crtc,
 
 	if (INTEL_GEN(dev_priv) >= 11 &&
 	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
-		DRM_ERROR("mismatch in DBUF Slices (expected %u, got %u)\n",
+		DRM_ERROR("mismatch in DBUF Slices (expected %x, got %x)\n",
 			  sw_ddb->enabled_slices,
 			  hw->ddb.enabled_slices);
 
@@ -14614,15 +14614,24 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
 	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
 	u8 required_slices = state->wm_results.ddb.enabled_slices;
 	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
+	u8 slices_union = hw_enabled_slices | required_slices;
+	u8 slices_intersection = required_slices;
 
 	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i)
 		/* ignore allocations for crtc's that have been turned off. */
 		if (new_crtc_state->hw.active)
 			entries[i] = old_crtc_state->wm.skl.ddb;
 
-	/* If 2nd DBuf slice required, enable it here */
-	if (INTEL_GEN(dev_priv) >= 11 && required_slices > hw_enabled_slices)
-		icl_dbuf_slices_update(dev_priv, required_slices);
+	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
+		      required_slices, hw_enabled_slices);
+
+	/*
+	 * If some other DBuf slice required, enable it here,
+	 * as here we only add more slices, but not remove to prevent
+	 * issues if somebody still uses those.
+	 */
+	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
+		icl_dbuf_slices_update(dev_priv, slices_union);
 
 	/*
 	 * Whenever the number of active pipes changes, we need to make sure we
@@ -14681,9 +14690,12 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
 		}
 	} while (progress);
 
-	/* If 2nd DBuf slice is no more required disable it */
-	if (INTEL_GEN(dev_priv) >= 11 && required_slices < hw_enabled_slices)
-		icl_dbuf_slices_update(dev_priv, required_slices);
+	/*
+	 * If some other DBuf slice is not needed, disable it here,
+	 * as here we only remove slices, which are not needed anymore.
+	 */
+	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
+		icl_dbuf_slices_update(dev_priv, slices_intersection);
 }
 
 static void intel_atomic_helper_free_state(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
index ce1b64f4dd44..a78bc9b61862 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -1031,15 +1031,6 @@ static bool gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
 		(I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
 }
 
-static void gen9_assert_dbuf_enabled(struct drm_i915_private *dev_priv)
-{
-	u32 tmp = I915_READ(DBUF_CTL);
-
-	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
-	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
-	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
-}
-
 static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
 {
 	struct intel_cdclk_state cdclk_state = {};
@@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
 	/* Can't read out voltage_level so can't use intel_cdclk_changed() */
 	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw, &cdclk_state));
 
-	gen9_assert_dbuf_enabled(dev_priv);
-
 	if (IS_GEN9_LP(dev_priv))
 		bxt_verify_ddi_phy_power_wells(dev_priv);
 
@@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct drm_i915_private *dev_priv)
 	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
 }
 
-static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
+int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
 {
-	if (INTEL_GEN(dev_priv) < 11)
-		return 1;
-	return 2;
+	return INTEL_INFO(dev_priv)->supported_slices;
+}
+
+void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
+{
+	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
+
+	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
 }
 
 void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
 			    u8 req_slices)
 {
-	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
-	bool ret;
+	bool ret = true;
+	int i;
+	int max_slices = intel_dbuf_max_slices(dev_priv);
 
-	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
+	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
 		DRM_ERROR("Invalid number of dbuf slices requested\n");
 		return;
 	}
 
-	if (req_slices == hw_enabled_slices || req_slices == 0)
-		return;
+	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
 
-	if (req_slices > hw_enabled_slices)
-		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, true);
-	else
-		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, false);
+	for (i = 0; i < max_slices; i++) {
+		int slice_bit = BIT(i);
+		bool slice_set = (slice_bit & req_slices) != 0;
+
+		switch (slice_bit) {
+		case DBUF_S1_BIT:
+			ret = ret && intel_dbuf_slice_set(dev_priv,
+							  DBUF_CTL_S1,
+							  slice_set);
+			break;
+		case DBUF_S2_BIT:
+			ret = ret && intel_dbuf_slice_set(dev_priv,
+							  DBUF_CTL_S2,
+							  slice_set);
+			break;
+		default:
+			MISSING_CASE(slice_bit);
+		}
+	}
 
 	if (ret)
 		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
@@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
 
 static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
 {
-	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) | DBUF_POWER_REQUEST);
-	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) | DBUF_POWER_REQUEST);
-	POSTING_READ(DBUF_CTL_S2);
-
-	udelay(10);
-
-	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
-	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
-		DRM_ERROR("DBuf power enable timeout\n");
-	else
-		/*
-		 * FIXME: for now pretend that we only have 1 slice, see
-		 * intel_enabled_dbuf_slices_num().
-		 */
-		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
+	/*
+	 * Just power up 1 slice, we will
+	 * figure out later which slices we have and what we need.
+	 */
+	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
+	icl_program_dbuf_slices(dev_priv);
 }
 
 static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
 {
-	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) & ~DBUF_POWER_REQUEST);
-	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) & ~DBUF_POWER_REQUEST);
-	POSTING_READ(DBUF_CTL_S2);
-
-	udelay(10);
-
-	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
-	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
-		DRM_ERROR("DBuf power disable timeout!\n");
-	else
-		/*
-		 * FIXME: for now pretend that the first slice is always
-		 * enabled, see intel_enabled_dbuf_slices_num().
-		 */
-		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
+	/*
+	 * Disable all slices
+	 */
+	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
+	icl_program_dbuf_slices(dev_priv);
 }
 
 static void icl_mbus_init(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
index 1da04f3e0fb3..b7057d68ad78 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -311,8 +311,10 @@ intel_display_power_put_async(struct drm_i915_private *i915,
 	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
 	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
 
+int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
 void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
 			    u8 req_slices);
+void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
 
 void chv_phy_powergate_lanes(struct intel_encoder *encoder,
 			     bool override, unsigned int mask);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 82e4e6bf08c3..96741e68633a 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -51,6 +51,7 @@
 #include "display/intel_bw.h"
 #include "display/intel_cdclk.h"
 #include "display/intel_display_types.h"
+#include "display/intel_display_power.h"
 #include "display/intel_dp.h"
 #include "display/intel_fbdev.h"
 #include "display/intel_hotplug.h"
@@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct device *kdev)
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
 		ret = vlv_resume_prepare(dev_priv, true);
 
+	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices reg */
+	if (INTEL_GEN(dev_priv) == 11)
+		icl_program_dbuf_slices(dev_priv);
+
 	intel_uncore_runtime_resume(&dev_priv->uncore);
 
 	intel_runtime_pm_enable_interrupts(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7e0f67babe20..a396977c9c2d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const struct skl_ddb_entry *e1,
 }
 
 struct skl_ddb_allocation {
-	u8 enabled_slices; /* GEN11 has configurable 2 slices */
+	u8 enabled_slices;   /* Bitmask of currently enabled DBuf slices */
 };
 
 struct skl_ddb_values {
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 1bb701d32a5d..8e86af505730 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -614,7 +614,8 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_gt_uc = 1, \
 	.display.has_hdcp = 1, \
 	.display.has_ipc = 1, \
-	.ddb_size = 896
+	.ddb_size = 896, \
+	.supported_slices = 1
 
 #define SKL_PLATFORM \
 	GEN9_FEATURES, \
@@ -650,6 +651,7 @@ static const struct intel_device_info intel_skylake_gt4_info = {
 	GEN(9), \
 	.is_lp = 1, \
 	.display.has_hotplug = 1, \
+	.supported_slices = 1, \
 	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
 	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
 	.has_64bit_reloc = 1, \
@@ -737,6 +739,7 @@ static const struct intel_device_info intel_coffeelake_gt3_info = {
 	GEN9_FEATURES, \
 	GEN(10), \
 	.ddb_size = 1024, \
+	.supported_slices = 1, \
 	.display.has_dsc = 1, \
 	.has_coherent_ggtt = false, \
 	GLK_COLORS
@@ -773,6 +776,7 @@ static const struct intel_device_info intel_cannonlake_info = {
 	}, \
 	GEN(11), \
 	.ddb_size = 2048, \
+	.supported_slices = 2, \
 	.has_logical_ring_elsq = 1, \
 	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a607ea520829..fba5731063d8 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7734,9 +7734,11 @@ enum {
 #define DISP_ARB_CTL2	_MMIO(0x45004)
 #define  DISP_DATA_PARTITION_5_6	(1 << 6)
 #define  DISP_IPC_ENABLE		(1 << 3)
-#define DBUF_CTL	_MMIO(0x45008)
-#define DBUF_CTL_S1	_MMIO(0x45008)
-#define DBUF_CTL_S2	_MMIO(0x44FE8)
+#define DBUF_S1_BIT			BIT(0)
+#define DBUF_S2_BIT			BIT(1)
+#define DBUF_CTL			_MMIO(0x45008)
+#define DBUF_CTL_S1			_MMIO(0x45008)
+#define DBUF_CTL_S2			_MMIO(0x44FE8)
 #define  DBUF_POWER_REQUEST		(1 << 31)
 #define  DBUF_POWER_STATE		(1 << 30)
 #define GEN7_MSG_CTL	_MMIO(0x45010)
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 4bdf8a6cfb47..ba34e1a5c591 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -180,6 +180,7 @@ struct intel_device_info {
 	} display;
 
 	u16 ddb_size; /* in blocks */
+	u8 supported_slices; /* number of DBuf slices */
 
 	/* Register offsets for the various display pipes and transcoders */
 	int pipe_offsets[I915_MAX_TRANSCODERS];
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2d389e437e87..1ea81ab92429 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
 }
 
-static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
+static u8 intel_enabled_dbuf_slices(struct drm_i915_private *dev_priv)
 {
-	u8 enabled_slices;
-
-	/* Slice 1 will always be enabled */
-	enabled_slices = 1;
+	u8 enabled_slices = 0;
 
 	/* Gen prior to GEN11 have only one DBuf slice */
 	if (INTEL_GEN(dev_priv) < 11)
-		return enabled_slices;
+		return DBUF_S1_BIT;
 
-	/*
-	 * FIXME: for now we'll only ever use 1 slice; pretend that we have
-	 * only that 1 slice enabled until we have a proper way for on-demand
-	 * toggling of the second slice.
-	 */
-	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
-		enabled_slices++;
+	/* Check if second DBuf slice is enabled */
+	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
+		enabled_slices |= DBUF_S1_BIT;
+
+	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
+		enabled_slices |= DBUF_S2_BIT;
 
 	return enabled_slices;
 }
@@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
 			      const int num_active,
 			      struct skl_ddb_allocation *ddb)
 {
-	const struct drm_display_mode *adjusted_mode;
-	u64 total_data_bw;
 	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
-
 	WARN_ON(ddb_size == 0);
 
 	if (INTEL_GEN(dev_priv) < 11)
 		return ddb_size - 4; /* 4 blocks for bypass path allocation */
 
-	adjusted_mode = &crtc_state->hw.adjusted_mode;
-	total_data_bw = total_data_rate * drm_mode_vrefresh(adjusted_mode);
+	return ddb_size;
+}
 
-	/*
-	 * 12GB/s is maximum BW supported by single DBuf slice.
-	 *
-	 * FIXME dbuf slice code is broken:
-	 * - must wait for planes to stop using the slice before powering it off
-	 * - plane straddling both slices is illegal in multi-pipe scenarios
-	 * - should validate we stay within the hw bandwidth limits
-	 */
-	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
-		ddb->enabled_slices = 2;
-	} else {
-		ddb->enabled_slices = 1;
-		ddb_size /= 2;
-	}
+/*
+ * Calculate initial DBuf slice offset, based on slice size
+ * and mask(i.e if slice size is 1024 and second slice is enabled
+ * offset would be 1024)
+ */
+static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
+					   u32 slice_size, u32 ddb_size)
+{
+	u32 offset = 0;
 
-	return ddb_size;
+	if (!dbuf_slice_mask)
+		return 0;
+
+	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
+
+	WARN_ON(offset >= ddb_size);
+	return offset;
 }
 
+static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
+				      int pipe, u32 active_pipes,
+				      const struct intel_crtc_state *crtc_state);
+
 static void
 skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 				   const struct intel_crtc_state *crtc_state,
@@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
 	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
 	u16 ddb_size;
+	u32 ddb_range_size;
 	u32 i;
+	u32 dbuf_slice_mask;
+	u32 active_pipes;
+	u32 offset;
+	u32 slice_size;
+	u32 total_slice_mask;
+	u32 start, end;
 
 	if (WARN_ON(!state) || !crtc_state->hw.active) {
 		alloc->start = 0;
@@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 		return;
 	}
 
-	if (intel_state->active_pipe_changes)
+	if (intel_state->active_pipe_changes) {
 		*num_active = hweight8(intel_state->active_pipes);
-	else
+		active_pipes = intel_state->active_pipes;
+	} else {
 		*num_active = hweight8(dev_priv->active_pipes);
+		active_pipes = dev_priv->active_pipes;
+	}
 
 	ddb_size = intel_get_ddb_size(dev_priv, crtc_state, total_data_rate,
 				      *num_active, ddb);
 
+	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
+
+	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
+
+	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
+
 	/*
 	 * If the state doesn't change the active CRTC's or there is no
 	 * modeset request, then there's no need to recalculate;
@@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 		return;
 	}
 
+	/*
+	 * Get allowed DBuf slices for correspondent pipe and platform.
+	 */
+	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv, for_pipe,
+						     active_pipes, crtc_state);
+
+	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
+		      dbuf_slice_mask,
+		      for_pipe, active_pipes);
+
+	/*
+	 * Figure out at which DBuf slice we start, i.e if we start at Dbuf S2
+	 * and slice size is 1024, the offset would be 1024
+	 */
+	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
+						 slice_size, ddb_size);
+
+	/*
+	 * Figure out total size of allowed DBuf slices, which is basically
+	 * a number of allowed slices for that pipe multiplied by slice size.
+	 * Inside of this
+	 * range ddb entries are still allocated in proportion to display width.
+	 */
+	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
+
 	/*
 	 * Watermark/ddb requirement highly depends upon width of the
 	 * framebuffer, So instead of allocating DDB equally among pipes
 	 * distribute DDB based on resolution/width of the display.
 	 */
+	total_slice_mask = dbuf_slice_mask;
 	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state, i) {
 		const struct drm_display_mode *adjusted_mode =
 			&crtc_state->hw.adjusted_mode;
 		enum pipe pipe = crtc->pipe;
 		int hdisplay, vdisplay;
+		u32 pipe_dbuf_slice_mask = \
+			i915_get_allowed_dbuf_mask(dev_priv,
+						pipe,
+						active_pipes,
+						crtc_state);
 
 		if (!crtc_state->hw.enable)
 			continue;
 
+		/*
+		 * According to BSpec pipe can share one dbuf slice with another
+		 * pipes or pipe can use multiple dbufs, in both cases we
+		 * account for other pipes only if they have exactly same mask.
+		 * However we need to account how many slices we should enable
+		 * in total.
+		 */
+		total_slice_mask |= pipe_dbuf_slice_mask;
+
+		/*
+		 * Do not account pipes using other slice sets
+		 * luckily as of current BSpec slice sets do not partially
+		 * intersect(pipes share either same one slice or same slice set
+		 * i.e no partial intersection), so it is enough to check for
+		 * equality for now.
+		 */
+		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
+			continue;
+
 		drm_mode_get_hv_timing(adjusted_mode, &hdisplay, &vdisplay);
 		total_width += hdisplay;
 
@@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
 			pipe_width = hdisplay;
 	}
 
-	alloc->start = ddb_size * width_before_pipe / total_width;
-	alloc->end = ddb_size * (width_before_pipe + pipe_width) / total_width;
+	ddb->enabled_slices = total_slice_mask;
+
+	start = ddb_range_size * width_before_pipe / total_width;
+	end = ddb_range_size * (width_before_pipe + pipe_width) / total_width;
+
+	alloc->start = offset + start;
+	alloc->end = offset + end;
+
+	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
+		      alloc->start, alloc->end);
+	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
+		      ddb->enabled_slices,
+		      INTEL_INFO(dev_priv)->supported_slices);
 }
 
 static int skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
@@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
 void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
 			  struct skl_ddb_allocation *ddb /* out */)
 {
-	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
+	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
+	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb->enabled_slices);
 }
 
 /*
@@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
 	return total_data_rate;
 }
 
+static uint_fixed_16_16_t
+icl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
+{
+	u32 src_w, src_h, dst_w, dst_h;
+	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
+	uint_fixed_16_16_t downscale_h, downscale_w;
+	const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode;
+
+	src_w = crtc_state->pipe_src_w;
+	src_h = crtc_state->pipe_src_h;
+	dst_w = adjusted_mode->crtc_hdisplay;
+	dst_h = adjusted_mode->crtc_vdisplay;
+
+	fp_w_ratio = div_fixed16(src_w, dst_w);
+	fp_h_ratio = div_fixed16(src_h, dst_h);
+	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
+	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
+
+	return mul_fixed16(downscale_w, downscale_h);
+}
+
+uint_fixed_16_16_t
+icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
+{
+	struct drm_plane *plane;
+	const struct drm_plane_state *drm_plane_state;
+	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
+	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
+	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+
+	if (!crtc_state->uapi.enable)
+		return max_downscale;
+
+	drm_atomic_crtc_state_for_each_plane_state(plane, drm_plane_state, &crtc_state->uapi) {
+		uint_fixed_16_16_t plane_downscale;
+		const struct intel_plane_state *plane_state =
+			to_intel_plane_state(drm_plane_state);
+
+		if (!intel_wm_plane_visible(crtc_state, plane_state))
+			continue;
+
+		plane_downscale = skl_plane_downscale_amount(crtc_state, plane_state);
+
+		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
+			      to_intel_plane(plane)->id,
+			      plane_downscale.val >> 16,
+			      plane_downscale.val & 0xffff);
+
+		max_downscale = max_fixed16(plane_downscale, max_downscale);
+	}
+
+
+	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
+
+	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
+		       crtc->pipe, pipe_downscale.val >> 16,
+		       pipe_downscale.val & 0xffff);
+
+	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
+
+	/*
+	 * Convert result to percentage: get the actual ratio instead of rate,
+	 * then multiply by 100 to get percentage.
+	 */
+	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1, pipe_downscale));
+
+	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
+		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
+
+	return pipe_ratio;
+}
+
+struct dbuf_slice_conf_entry {
+	u32 dbuf_mask[I915_MAX_PIPES];
+	u32 active_pipes;
+};
+
+
+#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
+	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
+#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
+	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
+#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
+	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
+#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
+	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
+#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
+#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
+#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
+#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
+#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
+#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
+	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
+#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
+#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D) }
+#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D) }
+#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D) }
+#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
+	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
+#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
+	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
+					  BIT(PIPE_C) | BIT(PIPE_D) }
+
+/*
+ * Table taken from Bspec 12716
+ * Pipes do have some preferred DBuf slice affinity,
+ * plus there are some hardcoded requirements on how
+ * those should be distributed for multipipe scenarios.
+ * For more DBuf slices algorithm can get even more messy
+ * and less readable, so decided to use a table almost
+ * as is from BSpec itself - that way it is at least easier
+ * to compare, change and check.
+ */
+struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
+{ { 0, 0, 0, 0 }, 0 },
+ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
+ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
+ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
+ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
+ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
+ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe ratio >= 88.8 */
+ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
+};
+
+/*
+ * Table taken from Bspec 49255
+ * Pipes do have some preferred DBuf slice affinity,
+ * plus there are some hardcoded requirements on how
+ * those should be distributed for multipipe scenarios.
+ * For more DBuf slices algorithm can get even more messy
+ * and less readable, so decided to use a table almost
+ * as is from BSpec itself - that way it is at least easier
+ * to compare, change and check.
+ */
+struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
+{ { 0, 0, 0, 0 }, 0 },
+ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
+ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
+ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
+ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
+ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
+ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
+};
+
+/*
+ * This function finds an entry with same enabled pipe configuration and
+ * returns correspondent DBuf slice mask as stated in BSpec for particular
+ * platform.
+ */
+static u32 icl_get_allowed_dbuf_mask(int pipe,
+				     u32 active_pipes,
+				     const struct intel_crtc_state *crtc_state)
+{
+	int i;
+	uint_fixed_16_16_t pipe_ratio;
+
+	/*
+	 * Calculate pipe ratio as stated in BSpec 28692
+	 */
+	pipe_ratio = icl_get_pipe_ratio(crtc_state);
+
+	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
+		if (icl_allowed_dbufs[i].active_pipes == active_pipes) {
+			if (hweight32(active_pipes) == 1) {
+				/*
+				 * According to BSpec 12716: if ratio >= 88.8
+				 * always use single dbuf slice.
+				 */
+				if (pipe_ratio.val >= div_fixed16(888, 10).val)
+					++i;
+			}
+			return icl_allowed_dbufs[i].dbuf_mask[pipe];
+		}
+	}
+	return 0;
+}
+
+/*
+ * This function finds an entry with same enabled pipe configuration and
+ * returns correspondent DBuf slice mask as stated in BSpec for particular
+ * platform.
+ */
+static u32 tgl_get_allowed_dbuf_mask(int pipe,
+				     u32 active_pipes,
+				     const struct intel_crtc_state *crtc_state)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
+		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
+			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
+	}
+	return 0;
+}
+
+u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
+				      int pipe, u32 active_pipes,
+				      const struct intel_crtc_state *crtc_state)
+{
+	if (IS_GEN(dev_priv, 11))
+		return icl_get_allowed_dbuf_mask(pipe,
+						 active_pipes,
+						 crtc_state);
+	else if (IS_GEN(dev_priv, 12))
+		return tgl_get_allowed_dbuf_mask(pipe,
+						 active_pipes,
+						 crtc_state);
+	/*
+	 * For anything else just return one slice yet.
+	 * Should be extended for other platforms.
+	 */
+	return DBUF_S1_BIT;
+}
+
 static u64
 icl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
 				 u64 *plane_data_rate)
-- 
2.17.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-14 13:49   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-14 13:49 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
dc5cb9d402db drm/i915: Enable second dbuf slice for ICL and TGL
-:48: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#48: 
    - Slices intersection after union is same as final result(Matthew Roper)

-:277: WARNING:SUSPECT_CODE_INDENT: suspect code indent for conditional statements (8, 0)
#277: FILE: drivers/gpu/drm/i915/display/intel_display_power.h:311:
 	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
[...]
+int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);

-:570: WARNING:LINE_CONTINUATIONS: Avoid unnecessary line continuations
#570: FILE: drivers/gpu/drm/i915/intel_pm.c:3952:
+		u32 pipe_dbuf_slice_mask = \

-:572: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#572: FILE: drivers/gpu/drm/i915/intel_pm.c:3954:
+			i915_get_allowed_dbuf_mask(dev_priv,
+						pipe,

-:688: CHECK:LINE_SPACING: Please don't use multiple blank lines
#688: FILE: drivers/gpu/drm/i915/intel_pm.c:4302:
+
+

-:692: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#692: FILE: drivers/gpu/drm/i915/intel_pm.c:4306:
+	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
+		       crtc->pipe, pipe_downscale.val >> 16,

-:714: CHECK:LINE_SPACING: Please don't use multiple blank lines
#714: FILE: drivers/gpu/drm/i915/intel_pm.c:4328:
+
+

-:715: CHECK:CAMELCASE: Avoid CamelCase: <DBuf1>
#715: FILE: drivers/gpu/drm/i915/intel_pm.c:4329:
+#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \

-:723: CHECK:CAMELCASE: Avoid CamelCase: <DBuf2>
#723: FILE: drivers/gpu/drm/i915/intel_pm.c:4337:
+#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \

-:735: CHECK:CAMELCASE: Avoid CamelCase: <DBuf3>
#735: FILE: drivers/gpu/drm/i915/intel_pm.c:4349:
+#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \

-:745: CHECK:CAMELCASE: Avoid CamelCase: <DBuf4>
#745: FILE: drivers/gpu/drm/i915/intel_pm.c:4359:
+#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \

-:854: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#854: FILE: drivers/gpu/drm/i915/intel_pm.c:4468:
+u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
+				      int pipe, u32 active_pipes,

total: 0 errors, 3 warnings, 9 checks, 745 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-14 13:49   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-14 13:49 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
dc5cb9d402db drm/i915: Enable second dbuf slice for ICL and TGL
-:48: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#48: 
    - Slices intersection after union is same as final result(Matthew Roper)

-:277: WARNING:SUSPECT_CODE_INDENT: suspect code indent for conditional statements (8, 0)
#277: FILE: drivers/gpu/drm/i915/display/intel_display_power.h:311:
 	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
[...]
+int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);

-:570: WARNING:LINE_CONTINUATIONS: Avoid unnecessary line continuations
#570: FILE: drivers/gpu/drm/i915/intel_pm.c:3952:
+		u32 pipe_dbuf_slice_mask = \

-:572: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#572: FILE: drivers/gpu/drm/i915/intel_pm.c:3954:
+			i915_get_allowed_dbuf_mask(dev_priv,
+						pipe,

-:688: CHECK:LINE_SPACING: Please don't use multiple blank lines
#688: FILE: drivers/gpu/drm/i915/intel_pm.c:4302:
+
+

-:692: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#692: FILE: drivers/gpu/drm/i915/intel_pm.c:4306:
+	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
+		       crtc->pipe, pipe_downscale.val >> 16,

-:714: CHECK:LINE_SPACING: Please don't use multiple blank lines
#714: FILE: drivers/gpu/drm/i915/intel_pm.c:4328:
+
+

-:715: CHECK:CAMELCASE: Avoid CamelCase: <DBuf1>
#715: FILE: drivers/gpu/drm/i915/intel_pm.c:4329:
+#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \

-:723: CHECK:CAMELCASE: Avoid CamelCase: <DBuf2>
#723: FILE: drivers/gpu/drm/i915/intel_pm.c:4337:
+#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \

-:735: CHECK:CAMELCASE: Avoid CamelCase: <DBuf3>
#735: FILE: drivers/gpu/drm/i915/intel_pm.c:4349:
+#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \

-:745: CHECK:CAMELCASE: Avoid CamelCase: <DBuf4>
#745: FILE: drivers/gpu/drm/i915/intel_pm.c:4359:
+#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \

-:854: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#854: FILE: drivers/gpu/drm/i915/intel_pm.c:4468:
+u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
+				      int pipe, u32 active_pipes,

total: 0 errors, 3 warnings, 9 checks, 745 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* ✗ Fi.CI.SPARSE: warning for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-14 13:50   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-14 13:50 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.6.0
Commit: drm/i915: Enable second dbuf slice for ICL and TGL
+drivers/gpu/drm/i915/intel_pm.c:4273:1: warning: symbol 'icl_get_pipe_ratio' was not declared. Should it be static?
+drivers/gpu/drm/i915/intel_pm.c:4373:30: warning: symbol 'icl_allowed_dbufs' was not declared. Should it be static?
+drivers/gpu/drm/i915/intel_pm.c:4397:30: warning: symbol 'tgl_allowed_dbufs' was not declared. Should it be static?
+drivers/gpu/drm/i915/intel_pm.c:4467:5: warning: symbol 'i915_get_allowed_dbuf_mask' was not declared. Should it be static?

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-14 13:50   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-14 13:50 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.6.0
Commit: drm/i915: Enable second dbuf slice for ICL and TGL
+drivers/gpu/drm/i915/intel_pm.c:4273:1: warning: symbol 'icl_get_pipe_ratio' was not declared. Should it be static?
+drivers/gpu/drm/i915/intel_pm.c:4373:30: warning: symbol 'icl_allowed_dbufs' was not declared. Should it be static?
+drivers/gpu/drm/i915/intel_pm.c:4397:30: warning: symbol 'tgl_allowed_dbufs' was not declared. Should it be static?
+drivers/gpu/drm/i915/intel_pm.c:4467:5: warning: symbol 'i915_get_allowed_dbuf_mask' was not declared. Should it be static?

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-14 14:18   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-14 14:18 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7344 -> Patchwork_15260
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/index.html

Known issues
------------

  Here are the changes found in Patchwork_15260 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-skl-6700k2:      [PASS][1] -> [INCOMPLETE][2] ([fdo#104108])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6700k2/igt@gem_exec_suspend@basic-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6700k2/igt@gem_exec_suspend@basic-s3.html

  * igt@i915_selftest@live_blt:
    - fi-bsw-n3050:       [PASS][3] -> [DMESG-FAIL][4] ([fdo#112176])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-bsw-n3050/igt@i915_selftest@live_blt.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-bsw-n3050/igt@i915_selftest@live_blt.html

  * igt@i915_selftest@live_hangcheck:
    - fi-hsw-4770r:       [PASS][5] -> [DMESG-FAIL][6] ([fdo#111991])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html

  * igt@kms_busy@basic-flip-pipe-b:
    - fi-skl-6770hq:      [PASS][7] -> [DMESG-WARN][8] ([fdo#105541])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html

  
#### Possible fixes ####

  * igt@i915_module_load@reload-with-fault-injection:
    - {fi-kbl-7560u}:     [INCOMPLETE][9] ([fdo#110343]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html

  * igt@kms_flip@basic-flip-vs-modeset:
    - fi-skl-6770hq:      [FAIL][11] -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_flip@basic-flip-vs-modeset.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_flip@basic-flip-vs-modeset.html

  * igt@kms_flip@basic-plain-flip:
    - fi-skl-6770hq:      [DMESG-WARN][13] ([fdo#105541]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_flip@basic-plain-flip.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_flip@basic-plain-flip.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - fi-skl-6770hq:      [WARN][15] ([fdo#112252]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105541]: https://bugs.freedesktop.org/show_bug.cgi?id=105541
  [fdo#110343]: https://bugs.freedesktop.org/show_bug.cgi?id=110343
  [fdo#111991]: https://bugs.freedesktop.org/show_bug.cgi?id=111991
  [fdo#112176]: https://bugs.freedesktop.org/show_bug.cgi?id=112176
  [fdo#112252]: https://bugs.freedesktop.org/show_bug.cgi?id=112252


Participating hosts (50 -> 45)
------------------------------

  Missing    (5): fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7344 -> Patchwork_15260

  CI-20190529: 20190529
  CI_DRM_7344: f1908e34ae20deabec070a88efaa495aa8e051f1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5280: d053b819673dda567faca00ccbd7e826f66da586 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15260: dc5cb9d402dbac06b94a9109933dc54797e4a0b9 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

dc5cb9d402db drm/i915: Enable second dbuf slice for ICL and TGL

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-14 14:18   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-14 14:18 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7344 -> Patchwork_15260
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/index.html

Known issues
------------

  Here are the changes found in Patchwork_15260 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s3:
    - fi-skl-6700k2:      [PASS][1] -> [INCOMPLETE][2] ([fdo#104108])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6700k2/igt@gem_exec_suspend@basic-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6700k2/igt@gem_exec_suspend@basic-s3.html

  * igt@i915_selftest@live_blt:
    - fi-bsw-n3050:       [PASS][3] -> [DMESG-FAIL][4] ([fdo#112176])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-bsw-n3050/igt@i915_selftest@live_blt.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-bsw-n3050/igt@i915_selftest@live_blt.html

  * igt@i915_selftest@live_hangcheck:
    - fi-hsw-4770r:       [PASS][5] -> [DMESG-FAIL][6] ([fdo#111991])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-hsw-4770r/igt@i915_selftest@live_hangcheck.html

  * igt@kms_busy@basic-flip-pipe-b:
    - fi-skl-6770hq:      [PASS][7] -> [DMESG-WARN][8] ([fdo#105541])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_busy@basic-flip-pipe-b.html

  
#### Possible fixes ####

  * igt@i915_module_load@reload-with-fault-injection:
    - {fi-kbl-7560u}:     [INCOMPLETE][9] ([fdo#110343]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html

  * igt@kms_flip@basic-flip-vs-modeset:
    - fi-skl-6770hq:      [FAIL][11] -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_flip@basic-flip-vs-modeset.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_flip@basic-flip-vs-modeset.html

  * igt@kms_flip@basic-plain-flip:
    - fi-skl-6770hq:      [DMESG-WARN][13] ([fdo#105541]) -> [PASS][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_flip@basic-plain-flip.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_flip@basic-plain-flip.html

  * igt@kms_setmode@basic-clone-single-crtc:
    - fi-skl-6770hq:      [WARN][15] ([fdo#112252]) -> [PASS][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/fi-skl-6770hq/igt@kms_setmode@basic-clone-single-crtc.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105541]: https://bugs.freedesktop.org/show_bug.cgi?id=105541
  [fdo#110343]: https://bugs.freedesktop.org/show_bug.cgi?id=110343
  [fdo#111991]: https://bugs.freedesktop.org/show_bug.cgi?id=111991
  [fdo#112176]: https://bugs.freedesktop.org/show_bug.cgi?id=112176
  [fdo#112252]: https://bugs.freedesktop.org/show_bug.cgi?id=112252


Participating hosts (50 -> 45)
------------------------------

  Missing    (5): fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_7344 -> Patchwork_15260

  CI-20190529: 20190529
  CI_DRM_7344: f1908e34ae20deabec070a88efaa495aa8e051f1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5280: d053b819673dda567faca00ccbd7e826f66da586 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_15260: dc5cb9d402dbac06b94a9109933dc54797e4a0b9 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

dc5cb9d402db drm/i915: Enable second dbuf slice for ICL and TGL

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-15  9:59   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-15  9:59 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7344_full -> Patchwork_15260_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_15260_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@rcs0-s3:
    - shard-tglb:         [PASS][1] -> [INCOMPLETE][2] ([fdo#111832])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb9/igt@gem_ctx_isolation@rcs0-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb7/igt@gem_ctx_isolation@rcs0-s3.html

  * igt@gem_ctx_switch@vcs1-queue:
    - shard-iclb:         [PASS][3] -> [SKIP][4] ([fdo#112080]) +4 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb2/igt@gem_ctx_switch@vcs1-queue.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb5/igt@gem_ctx_switch@vcs1-queue.html

  * igt@gem_exec_async@concurrent-writes-bsd:
    - shard-iclb:         [PASS][5] -> [SKIP][6] ([fdo#112146]) +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@gem_exec_async@concurrent-writes-bsd.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@gem_exec_async@concurrent-writes-bsd.html

  * igt@gem_exec_await@wide-all:
    - shard-tglb:         [PASS][7] -> [INCOMPLETE][8] ([fdo#111736])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb9/igt@gem_exec_await@wide-all.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb6/igt@gem_exec_await@wide-all.html

  * igt@gem_exec_nop@basic-parallel:
    - shard-tglb:         [PASS][9] -> [INCOMPLETE][10] ([fdo#111747])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb1/igt@gem_exec_nop@basic-parallel.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb5/igt@gem_exec_nop@basic-parallel.html

  * igt@gem_exec_schedule@independent-bsd2:
    - shard-iclb:         [PASS][11] -> [SKIP][12] ([fdo#109276]) +13 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb2/igt@gem_exec_schedule@independent-bsd2.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb5/igt@gem_exec_schedule@independent-bsd2.html

  * igt@gem_userptr_blits@sync-unmap-after-close:
    - shard-hsw:          [PASS][13] -> [DMESG-WARN][14] ([fdo#111870]) +3 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw1/igt@gem_userptr_blits@sync-unmap-after-close.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw7/igt@gem_userptr_blits@sync-unmap-after-close.html
    - shard-snb:          [PASS][15] -> [DMESG-WARN][16] ([fdo#111870]) +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-snb5/igt@gem_userptr_blits@sync-unmap-after-close.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-snb7/igt@gem_userptr_blits@sync-unmap-after-close.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-iclb:         [PASS][17] -> [FAIL][18] ([fdo#111830 ])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb5/igt@i915_pm_dc@dc6-psr.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb2/igt@i915_pm_dc@dc6-psr.html

  * igt@i915_selftest@live_hangcheck:
    - shard-hsw:          [PASS][19] -> [DMESG-FAIL][20] ([fdo#111991])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw4/igt@i915_selftest@live_hangcheck.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw5/igt@i915_selftest@live_hangcheck.html
    - shard-snb:          [PASS][21] -> [INCOMPLETE][22] ([fdo#105411])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-snb4/igt@i915_selftest@live_hangcheck.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-snb1/igt@i915_selftest@live_hangcheck.html

  * igt@kms_color@pipe-b-ctm-0-5:
    - shard-skl:          [PASS][23] -> [DMESG-WARN][24] ([fdo#106107])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_color@pipe-b-ctm-0-5.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl1/igt@kms_color@pipe-b-ctm-0-5.html

  * igt@kms_cursor_crc@pipe-a-cursor-64x64-offscreen:
    - shard-skl:          [PASS][25] -> [FAIL][26] ([fdo#103232])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl3/igt@kms_cursor_crc@pipe-a-cursor-64x64-offscreen.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl9/igt@kms_cursor_crc@pipe-a-cursor-64x64-offscreen.html

  * igt@kms_cursor_crc@pipe-c-cursor-128x128-sliding:
    - shard-hsw:          [PASS][27] -> [INCOMPLETE][28] ([fdo#103540])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw1/igt@kms_cursor_crc@pipe-c-cursor-128x128-sliding.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw4/igt@kms_cursor_crc@pipe-c-cursor-128x128-sliding.html

  * igt@kms_cursor_legacy@cursor-vs-flip-toggle:
    - shard-skl:          [PASS][29] -> [FAIL][30] ([fdo#103355])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl2/igt@kms_cursor_legacy@cursor-vs-flip-toggle.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl5/igt@kms_cursor_legacy@cursor-vs-flip-toggle.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-skl:          [PASS][31] -> [FAIL][32] ([fdo#105363]) +1 similar issue
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl4/igt@kms_flip@flip-vs-expired-vblank.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl10/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-skl:          [PASS][33] -> [INCOMPLETE][34] ([fdo#109507])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl1/igt@kms_flip@flip-vs-suspend.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl7/igt@kms_flip@flip-vs-suspend.html
    - shard-tglb:         [PASS][35] -> [INCOMPLETE][36] ([fdo#111850] / [fdo#112031])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb6/igt@kms_flip@flip-vs-suspend.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb5/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-indfb-plflip-blt:
    - shard-tglb:         [PASS][37] -> [FAIL][38] ([fdo#103167]) +3 similar issues
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb8/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-indfb-plflip-blt.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb5/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-indfb-plflip-blt.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-kbl:          [PASS][39] -> [DMESG-WARN][40] ([fdo#108566]) +4 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-kbl6/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-kbl3/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render:
    - shard-iclb:         [PASS][41] -> [FAIL][42] ([fdo#103167]) +3 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt:
    - shard-skl:          [PASS][43] -> [FAIL][44] ([fdo#103167])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl3/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@psr-suspend:
    - shard-tglb:         [PASS][45] -> [INCOMPLETE][46] ([fdo#111832] / [fdo#111850]) +2 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb6/igt@kms_frontbuffer_tracking@psr-suspend.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb8/igt@kms_frontbuffer_tracking@psr-suspend.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
    - shard-tglb:         [PASS][47] -> [INCOMPLETE][48] ([fdo#111850])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb8/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb4/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes:
    - shard-apl:          [PASS][49] -> [DMESG-WARN][50] ([fdo#108566]) +1 similar issue
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-apl3/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-apl6/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes:
    - shard-skl:          [PASS][51] -> [INCOMPLETE][52] ([fdo#104108])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl2/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl5/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:
    - shard-skl:          [PASS][53] -> [FAIL][54] ([fdo#108145])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl3/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl9/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [PASS][55] -> [FAIL][56] ([fdo#108145] / [fdo#110403])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl3/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl10/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_plane_onoff:
    - shard-iclb:         [PASS][57] -> [SKIP][58] ([fdo#109441]) +1 similar issue
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb5/igt@kms_psr@psr2_cursor_plane_onoff.html

  * igt@perf@gen8-unprivileged-single-ctx-counters:
    - shard-skl:          [PASS][59] -> [TIMEOUT][60] ([fdo#111732 ])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl1/igt@perf@gen8-unprivileged-single-ctx-counters.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl7/igt@perf@gen8-unprivileged-single-ctx-counters.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@vcs1-clean:
    - shard-iclb:         [SKIP][61] ([fdo#109276] / [fdo#112080]) -> [PASS][62] +1 similar issue
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb8/igt@gem_ctx_isolation@vcs1-clean.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb2/igt@gem_ctx_isolation@vcs1-clean.html

  * igt@gem_ctx_isolation@vcs1-s3:
    - shard-tglb:         [INCOMPLETE][63] ([fdo#111832]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb2/igt@gem_ctx_isolation@vcs1-s3.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb2/igt@gem_ctx_isolation@vcs1-s3.html

  * igt@gem_exec_parallel@vcs1-fds:
    - shard-iclb:         [SKIP][65] ([fdo#112080]) -> [PASS][66] +6 similar issues
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@gem_exec_parallel@vcs1-fds.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@gem_exec_parallel@vcs1-fds.html

  * igt@gem_exec_schedule@reorder-wide-bsd:
    - shard-iclb:         [SKIP][67] ([fdo#112146]) -> [PASS][68] +3 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb4/igt@gem_exec_schedule@reorder-wide-bsd.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb8/igt@gem_exec_schedule@reorder-wide-bsd.html

  * igt@gem_persistent_relocs@forked-faulting-reloc-thrashing:
    - shard-iclb:         [TIMEOUT][69] ([fdo#112068 ]) -> [PASS][70]
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb6/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb3/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html

  * igt@gem_persistent_relocs@forked-thrash-inactive:
    - shard-hsw:          [INCOMPLETE][71] ([fdo#103540] / [fdo#112068 ]) -> [PASS][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw5/igt@gem_persistent_relocs@forked-thrash-inactive.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw2/igt@gem_persistent_relocs@forked-thrash-inactive.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-snb:          [DMESG-WARN][73] ([fdo#111870]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-snb2/igt@gem_userptr_blits@dmabuf-sync.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-snb6/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@dmabuf-unsync:
    - shard-hsw:          [DMESG-WARN][75] ([fdo#111870]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw5/igt@gem_userptr_blits@dmabuf-unsync.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw4/igt@gem_userptr_blits@dmabuf-unsync.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [DMESG-WARN][77] ([fdo#108566]) -> [PASS][78] +3 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-apl6/igt@i915_suspend@fence-restore-tiled2untiled.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-apl8/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_color@pipe-a-ctm-0-5:
    - shard-skl:          [DMESG-WARN][79] ([fdo#106107]) -> [PASS][80] +1 similar issue
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl9/igt@kms_color@pipe-a-ctm-0-5.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl6/igt@kms_color@pipe-a-ctm-0-5.html

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic:
    - shard-hsw:          [FAIL][81] ([fdo#105767]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw4/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw5/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          [FAIL][83] ([fdo#102670]) -> [PASS][84]
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl2/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_fbcon_fbt@fbc-suspend:
    - shard-tglb:         [INCOMPLETE][85] ([fdo#111747] / [fdo#111832] / [fdo#111850]) -> [PASS][86]
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb4/igt@kms_fbcon_fbt@fbc-suspend.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb6/igt@kms_fbcon_fbt@fbc-suspend.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-glk:          [FAIL][87] ([fdo#105363]) -> [PASS][88]
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-glk3/igt@kms_flip@flip-vs-expired-vblank.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-glk3/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-mmap-gtt:
    - shard-tglb:         [FAIL][89] ([fdo#103167]) -> [PASS][90] +6 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb5/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-mmap-gtt.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb8/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-mmap-gtt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite:
    - shard-iclb:         [FAIL][91] ([fdo#103167]) -> [PASS][92] +4 similar issues
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min:
    - shard-skl:          [FAIL][93] ([fdo#108145]) -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl2/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html

  * igt@kms_psr@psr2_no_drrs:
    - shard-iclb:         [SKIP][95] ([fdo#109441]) -> [PASS][96] +1 similar issue
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@kms_psr@psr2_no_drrs.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb2/igt@kms_psr@psr2_no_drrs.html

  * igt@kms_setmode@basic:
    - shard-apl:          [FAIL][97] ([fdo#99912]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-apl6/igt@kms_setmode@basic.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-apl6/igt@kms_setmode@basic.html

  * igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend:
    - shard-skl:          [INCOMPLETE][99] ([fdo#104108]) -> [PASS][100] +1 similar issue
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl7/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-tglb:         [INCOMPLETE][101] ([fdo#111832] / [fdo#111850]) -> [PASS][102] +2 similar issues
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb5/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb9/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  * igt@kms_vblank@pipe-c-ts-continuation-suspend:
    - shard-kbl:          [DMESG-WARN][103] ([fdo#108566]) -> [PASS][104] +1 similar issue
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-kbl3/igt@kms_vblank@pipe-c-ts-continuation-suspend.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-kbl1/igt@kms_vblank@pipe-c-ts-continuation-suspend.html

  * igt@prime_vgem@fence-wait-bsd2:
    - shard-iclb:         [SKIP][105] ([fdo#109276]) -> [PASS][106] +14 similar issues
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@prime_vgem@fence-wait-bsd2.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@prime_vgem@fence-wait-bsd2.html

  
#### Warnings ###

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
@ 2019-11-15  9:59   ` Patchwork
  0 siblings, 0 replies; 34+ messages in thread
From: Patchwork @ 2019-11-15  9:59 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Enable second dbuf slice for ICL and TGL (rev4)
URL   : https://patchwork.freedesktop.org/series/69124/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_7344_full -> Patchwork_15260_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_15260_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@rcs0-s3:
    - shard-tglb:         [PASS][1] -> [INCOMPLETE][2] ([fdo#111832])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb9/igt@gem_ctx_isolation@rcs0-s3.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb7/igt@gem_ctx_isolation@rcs0-s3.html

  * igt@gem_ctx_switch@vcs1-queue:
    - shard-iclb:         [PASS][3] -> [SKIP][4] ([fdo#112080]) +4 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb2/igt@gem_ctx_switch@vcs1-queue.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb5/igt@gem_ctx_switch@vcs1-queue.html

  * igt@gem_exec_async@concurrent-writes-bsd:
    - shard-iclb:         [PASS][5] -> [SKIP][6] ([fdo#112146]) +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@gem_exec_async@concurrent-writes-bsd.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@gem_exec_async@concurrent-writes-bsd.html

  * igt@gem_exec_await@wide-all:
    - shard-tglb:         [PASS][7] -> [INCOMPLETE][8] ([fdo#111736])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb9/igt@gem_exec_await@wide-all.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb6/igt@gem_exec_await@wide-all.html

  * igt@gem_exec_nop@basic-parallel:
    - shard-tglb:         [PASS][9] -> [INCOMPLETE][10] ([fdo#111747])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb1/igt@gem_exec_nop@basic-parallel.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb5/igt@gem_exec_nop@basic-parallel.html

  * igt@gem_exec_schedule@independent-bsd2:
    - shard-iclb:         [PASS][11] -> [SKIP][12] ([fdo#109276]) +13 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb2/igt@gem_exec_schedule@independent-bsd2.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb5/igt@gem_exec_schedule@independent-bsd2.html

  * igt@gem_userptr_blits@sync-unmap-after-close:
    - shard-hsw:          [PASS][13] -> [DMESG-WARN][14] ([fdo#111870]) +3 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw1/igt@gem_userptr_blits@sync-unmap-after-close.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw7/igt@gem_userptr_blits@sync-unmap-after-close.html
    - shard-snb:          [PASS][15] -> [DMESG-WARN][16] ([fdo#111870]) +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-snb5/igt@gem_userptr_blits@sync-unmap-after-close.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-snb7/igt@gem_userptr_blits@sync-unmap-after-close.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-iclb:         [PASS][17] -> [FAIL][18] ([fdo#111830 ])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb5/igt@i915_pm_dc@dc6-psr.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb2/igt@i915_pm_dc@dc6-psr.html

  * igt@i915_selftest@live_hangcheck:
    - shard-hsw:          [PASS][19] -> [DMESG-FAIL][20] ([fdo#111991])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw4/igt@i915_selftest@live_hangcheck.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw5/igt@i915_selftest@live_hangcheck.html
    - shard-snb:          [PASS][21] -> [INCOMPLETE][22] ([fdo#105411])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-snb4/igt@i915_selftest@live_hangcheck.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-snb1/igt@i915_selftest@live_hangcheck.html

  * igt@kms_color@pipe-b-ctm-0-5:
    - shard-skl:          [PASS][23] -> [DMESG-WARN][24] ([fdo#106107])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_color@pipe-b-ctm-0-5.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl1/igt@kms_color@pipe-b-ctm-0-5.html

  * igt@kms_cursor_crc@pipe-a-cursor-64x64-offscreen:
    - shard-skl:          [PASS][25] -> [FAIL][26] ([fdo#103232])
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl3/igt@kms_cursor_crc@pipe-a-cursor-64x64-offscreen.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl9/igt@kms_cursor_crc@pipe-a-cursor-64x64-offscreen.html

  * igt@kms_cursor_crc@pipe-c-cursor-128x128-sliding:
    - shard-hsw:          [PASS][27] -> [INCOMPLETE][28] ([fdo#103540])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw1/igt@kms_cursor_crc@pipe-c-cursor-128x128-sliding.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw4/igt@kms_cursor_crc@pipe-c-cursor-128x128-sliding.html

  * igt@kms_cursor_legacy@cursor-vs-flip-toggle:
    - shard-skl:          [PASS][29] -> [FAIL][30] ([fdo#103355])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl2/igt@kms_cursor_legacy@cursor-vs-flip-toggle.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl5/igt@kms_cursor_legacy@cursor-vs-flip-toggle.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-skl:          [PASS][31] -> [FAIL][32] ([fdo#105363]) +1 similar issue
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl4/igt@kms_flip@flip-vs-expired-vblank.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl10/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_flip@flip-vs-suspend:
    - shard-skl:          [PASS][33] -> [INCOMPLETE][34] ([fdo#109507])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl1/igt@kms_flip@flip-vs-suspend.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl7/igt@kms_flip@flip-vs-suspend.html
    - shard-tglb:         [PASS][35] -> [INCOMPLETE][36] ([fdo#111850] / [fdo#112031])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb6/igt@kms_flip@flip-vs-suspend.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb5/igt@kms_flip@flip-vs-suspend.html

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-indfb-plflip-blt:
    - shard-tglb:         [PASS][37] -> [FAIL][38] ([fdo#103167]) +3 similar issues
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb8/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-indfb-plflip-blt.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb5/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-indfb-plflip-blt.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
    - shard-kbl:          [PASS][39] -> [DMESG-WARN][40] ([fdo#108566]) +4 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-kbl6/igt@kms_frontbuffer_tracking@fbc-suspend.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-kbl3/igt@kms_frontbuffer_tracking@fbc-suspend.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render:
    - shard-iclb:         [PASS][41] -> [FAIL][42] ([fdo#103167]) +3 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt:
    - shard-skl:          [PASS][43] -> [FAIL][44] ([fdo#103167])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl8/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl3/igt@kms_frontbuffer_tracking@psr-1p-primscrn-indfb-msflip-blt.html

  * igt@kms_frontbuffer_tracking@psr-suspend:
    - shard-tglb:         [PASS][45] -> [INCOMPLETE][46] ([fdo#111832] / [fdo#111850]) +2 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb6/igt@kms_frontbuffer_tracking@psr-suspend.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb8/igt@kms_frontbuffer_tracking@psr-suspend.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b:
    - shard-tglb:         [PASS][47] -> [INCOMPLETE][48] ([fdo#111850])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb8/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb4/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes:
    - shard-apl:          [PASS][49] -> [DMESG-WARN][50] ([fdo#108566]) +1 similar issue
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-apl3/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-apl6/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-a-planes.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes:
    - shard-skl:          [PASS][51] -> [INCOMPLETE][52] ([fdo#104108])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl2/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl5/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min:
    - shard-skl:          [PASS][53] -> [FAIL][54] ([fdo#108145])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl3/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl9/igt@kms_plane_alpha_blend@pipe-a-constant-alpha-min.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [PASS][55] -> [FAIL][56] ([fdo#108145] / [fdo#110403])
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl3/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl10/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_plane_onoff:
    - shard-iclb:         [PASS][57] -> [SKIP][58] ([fdo#109441]) +1 similar issue
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb5/igt@kms_psr@psr2_cursor_plane_onoff.html

  * igt@perf@gen8-unprivileged-single-ctx-counters:
    - shard-skl:          [PASS][59] -> [TIMEOUT][60] ([fdo#111732 ])
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl1/igt@perf@gen8-unprivileged-single-ctx-counters.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl7/igt@perf@gen8-unprivileged-single-ctx-counters.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@vcs1-clean:
    - shard-iclb:         [SKIP][61] ([fdo#109276] / [fdo#112080]) -> [PASS][62] +1 similar issue
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb8/igt@gem_ctx_isolation@vcs1-clean.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb2/igt@gem_ctx_isolation@vcs1-clean.html

  * igt@gem_ctx_isolation@vcs1-s3:
    - shard-tglb:         [INCOMPLETE][63] ([fdo#111832]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb2/igt@gem_ctx_isolation@vcs1-s3.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb2/igt@gem_ctx_isolation@vcs1-s3.html

  * igt@gem_exec_parallel@vcs1-fds:
    - shard-iclb:         [SKIP][65] ([fdo#112080]) -> [PASS][66] +6 similar issues
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@gem_exec_parallel@vcs1-fds.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@gem_exec_parallel@vcs1-fds.html

  * igt@gem_exec_schedule@reorder-wide-bsd:
    - shard-iclb:         [SKIP][67] ([fdo#112146]) -> [PASS][68] +3 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb4/igt@gem_exec_schedule@reorder-wide-bsd.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb8/igt@gem_exec_schedule@reorder-wide-bsd.html

  * igt@gem_persistent_relocs@forked-faulting-reloc-thrashing:
    - shard-iclb:         [TIMEOUT][69] ([fdo#112068 ]) -> [PASS][70]
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb6/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb3/igt@gem_persistent_relocs@forked-faulting-reloc-thrashing.html

  * igt@gem_persistent_relocs@forked-thrash-inactive:
    - shard-hsw:          [INCOMPLETE][71] ([fdo#103540] / [fdo#112068 ]) -> [PASS][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw5/igt@gem_persistent_relocs@forked-thrash-inactive.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw2/igt@gem_persistent_relocs@forked-thrash-inactive.html

  * igt@gem_userptr_blits@dmabuf-sync:
    - shard-snb:          [DMESG-WARN][73] ([fdo#111870]) -> [PASS][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-snb2/igt@gem_userptr_blits@dmabuf-sync.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-snb6/igt@gem_userptr_blits@dmabuf-sync.html

  * igt@gem_userptr_blits@dmabuf-unsync:
    - shard-hsw:          [DMESG-WARN][75] ([fdo#111870]) -> [PASS][76]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw5/igt@gem_userptr_blits@dmabuf-unsync.html
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw4/igt@gem_userptr_blits@dmabuf-unsync.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [DMESG-WARN][77] ([fdo#108566]) -> [PASS][78] +3 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-apl6/igt@i915_suspend@fence-restore-tiled2untiled.html
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-apl8/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_color@pipe-a-ctm-0-5:
    - shard-skl:          [DMESG-WARN][79] ([fdo#106107]) -> [PASS][80] +1 similar issue
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl9/igt@kms_color@pipe-a-ctm-0-5.html
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl6/igt@kms_color@pipe-a-ctm-0-5.html

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic:
    - shard-hsw:          [FAIL][81] ([fdo#105767]) -> [PASS][82]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-hsw4/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic.html
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-hsw5/igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          [FAIL][83] ([fdo#102670]) -> [PASS][84]
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl2/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_fbcon_fbt@fbc-suspend:
    - shard-tglb:         [INCOMPLETE][85] ([fdo#111747] / [fdo#111832] / [fdo#111850]) -> [PASS][86]
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb4/igt@kms_fbcon_fbt@fbc-suspend.html
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb6/igt@kms_fbcon_fbt@fbc-suspend.html

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-glk:          [FAIL][87] ([fdo#105363]) -> [PASS][88]
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-glk3/igt@kms_flip@flip-vs-expired-vblank.html
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-glk3/igt@kms_flip@flip-vs-expired-vblank.html

  * igt@kms_frontbuffer_tracking@fbc-rgb565-draw-mmap-gtt:
    - shard-tglb:         [FAIL][89] ([fdo#103167]) -> [PASS][90] +6 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb5/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-mmap-gtt.html
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb8/igt@kms_frontbuffer_tracking@fbc-rgb565-draw-mmap-gtt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite:
    - shard-iclb:         [FAIL][91] ([fdo#103167]) -> [PASS][92] +4 similar issues
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min:
    - shard-skl:          [FAIL][93] ([fdo#108145]) -> [PASS][94]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl2/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html

  * igt@kms_psr@psr2_no_drrs:
    - shard-iclb:         [SKIP][95] ([fdo#109441]) -> [PASS][96] +1 similar issue
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@kms_psr@psr2_no_drrs.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb2/igt@kms_psr@psr2_no_drrs.html

  * igt@kms_setmode@basic:
    - shard-apl:          [FAIL][97] ([fdo#99912]) -> [PASS][98]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-apl6/igt@kms_setmode@basic.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-apl6/igt@kms_setmode@basic.html

  * igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend:
    - shard-skl:          [INCOMPLETE][99] ([fdo#104108]) -> [PASS][100] +1 similar issue
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-skl5/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-skl7/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-tglb:         [INCOMPLETE][101] ([fdo#111832] / [fdo#111850]) -> [PASS][102] +2 similar issues
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-tglb5/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-tglb9/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  * igt@kms_vblank@pipe-c-ts-continuation-suspend:
    - shard-kbl:          [DMESG-WARN][103] ([fdo#108566]) -> [PASS][104] +1 similar issue
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-kbl3/igt@kms_vblank@pipe-c-ts-continuation-suspend.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-kbl1/igt@kms_vblank@pipe-c-ts-continuation-suspend.html

  * igt@prime_vgem@fence-wait-bsd2:
    - shard-iclb:         [SKIP][105] ([fdo#109276]) -> [PASS][106] +14 similar issues
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_7344/shard-iclb7/igt@prime_vgem@fence-wait-bsd2.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/shard-iclb1/igt@prime_vgem@fence-wait-bsd2.html

  
#### Warnings ###

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15260/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-15 20:19   ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-15 20:19 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy wrote:
> Also implemented algorithm for choosing DBuf slice configuration
> based on active pipes, pipe ratio as stated in BSpec 12716.
> 
> Now pipe allocation still stays proportional to pipe width as before,
> however within allowed DBuf slice for this particular configuration.
> 
> v2: Remove unneeded check from commit as ddb enabled slices might
>     now differ from hw state.
> 
> v3: - Added new field "supported_slices" to ddb to separate max
>       supported slices vs currently enabled, to avoid confusion.
>     - Removed obsolete comments and code related to DBuf(Matthew Roper).
>     - Some code style and long line removal(Matthew Roper).
>     - Added WARN_ON to new ddb range offset calc function(Matthew Roper).
>     - Removed platform specific call to calc pipe ratio from ddb
>       allocation function and fixed the return value(Matthew Roper)
>     - Refactored DBUF slice allocation table to improve readability
>     - Added DBUF slice allocation for TGL as well.
>     - ICL(however not TGL) seems to voluntarily enable second DBuf slice
>       after pm suspend/resume causing a mismatch failure, because we
>       update DBuf slices only if we do a modeset, however this check
>       is done always. Fixed it to be done only when modeset for ICL.
> 
> v4: - Now enabled slices is not actually a number, but a bitmask,
>       because we might need to enable second slice only and number
>       of slices would still 1 and that current code doesn't allow.
>     - Remove redundant duplicate code to have some unified way of
>       enabling dbuf slices instead of hardcoding.
> 
> v5: - Fix failing gen9_assert_dbuf_enabled as it was naively thinking
>       that we have only one DBUF_CTL slice. Now another version for
>       gen11 will check both slices as only second one can be enabled,
>       to keep CI happy.
> 
> v6: - Removed enabled dbuf assertion completely. Previous code
>       was keeping dbuf enabled, even(!) when _dbuf_disable is called.
>       Currently we enable/disable dbuf slices based on requrement
>       so no point in asserting this here.
>     - Removed unnecessary modeset check from verify_wm_state(Matthew Roper)
>     - Slices intersection after union is same as final result(Matthew Roper)
>     - Moved DBuf slices to intel_device_info(Matthew Roper)
>     - Some macros added(Matthew Roper)
>     - ddb range is now always less or equal to ddb size - no need for additional
>       checks(previously needed as we had some bandwidth checks in there which
>       could "suddenly" return ddb_size smaller than it is.
> 
> v7: Minor costemic changes:
>     - Changed icl_dbuf_slices_restore name to icl_program_dbuf_slices
>       as it more corresponds to the actual functionality.
>     - Some simplification with supported slices for BXT and GLK
>     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
>       this is not used for skl anymore.
>     - Changed DBuf slice assignment order for TGL case
> 
> Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> Cc: Matthew Roper <matthew.d.roper@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@intel.com>
> Cc: James Ausmus <james.ausmus@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
>  .../drm/i915/display/intel_display_power.c    |  98 ++---
>  .../drm/i915/display/intel_display_power.h    |   2 +
>  drivers/gpu/drm/i915/i915_drv.c               |   5 +
>  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
>  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
>  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
>  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
>  drivers/gpu/drm/i915/intel_pm.c               | 387 ++++++++++++++++--
>  9 files changed, 431 insertions(+), 104 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 876fc25968bf..bd7aff675198 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct intel_crtc *crtc,
>  
>  	if (INTEL_GEN(dev_priv) >= 11 &&
>  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got %u)\n",
> +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got %x)\n",
>  			  sw_ddb->enabled_slices,
>  			  hw->ddb.enabled_slices);
>  
> @@ -14614,15 +14614,24 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
>  	u8 required_slices = state->wm_results.ddb.enabled_slices;
>  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> +	u8 slices_union = hw_enabled_slices | required_slices;
> +	u8 slices_intersection = required_slices;
>  
>  	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i)
>  		/* ignore allocations for crtc's that have been turned off. */
>  		if (new_crtc_state->hw.active)
>  			entries[i] = old_crtc_state->wm.skl.ddb;
>  
> -	/* If 2nd DBuf slice required, enable it here */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices > hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> +		      required_slices, hw_enabled_slices);
> +
> +	/*
> +	 * If some other DBuf slice required, enable it here,
> +	 * as here we only add more slices, but not remove to prevent
> +	 * issues if somebody still uses those.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_union);
>  
>  	/*
>  	 * Whenever the number of active pipes changes, we need to make sure we
> @@ -14681,9 +14690,12 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  		}
>  	} while (progress);
>  
> -	/* If 2nd DBuf slice is no more required disable it */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices < hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	/*
> +	 * If some other DBuf slice is not needed, disable it here,
> +	 * as here we only remove slices, which are not needed anymore.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_intersection);

This is still a totally wrong place for this. We need to be sure the
planes have switched over to the new ddb layout before we power off
anything.

>  }
>  
>  static void intel_atomic_helper_free_state(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
> index ce1b64f4dd44..a78bc9b61862 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> @@ -1031,15 +1031,6 @@ static bool gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
>  		(I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
>  }
>  
> -static void gen9_assert_dbuf_enabled(struct drm_i915_private *dev_priv)
> -{
> -	u32 tmp = I915_READ(DBUF_CTL);
> -
> -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> -}
> -
>  static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  {
>  	struct intel_cdclk_state cdclk_state = {};
> @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  	/* Can't read out voltage_level so can't use intel_cdclk_changed() */
>  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw, &cdclk_state));
>  
> -	gen9_assert_dbuf_enabled(dev_priv);
> -
>  	if (IS_GEN9_LP(dev_priv))
>  		bxt_verify_ddi_phy_power_wells(dev_priv);
>  
> @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct drm_i915_private *dev_priv)
>  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
>  }
>  
> -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
>  {
> -	if (INTEL_GEN(dev_priv) < 11)
> -		return 1;
> -	return 2;
> +	return INTEL_INFO(dev_priv)->supported_slices;
> +}

Kinda pointless function now. 'supported_slices' is not a very good
name. Should be 'num_dbuf_slices' or something along those lines.

> +
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> +{
> +	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;

Seems a rather pointless variable. const at least is pointless.

> +
> +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
>  }
>  
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices)
>  {
> -	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> -	bool ret;
> +	bool ret = true;
> +	int i;
> +	int max_slices = intel_dbuf_max_slices(dev_priv);
>  
> -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
>  		DRM_ERROR("Invalid number of dbuf slices requested\n");
>  		return;
>  	}
>  
> -	if (req_slices == hw_enabled_slices || req_slices == 0)
> -		return;
> +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
>  
> -	if (req_slices > hw_enabled_slices)
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, true);
> -	else
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, false);
> +	for (i = 0; i < max_slices; i++) {
> +		int slice_bit = BIT(i);
> +		bool slice_set = (slice_bit & req_slices) != 0;
> +
> +		switch (slice_bit) {
> +		case DBUF_S1_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S1,
> +							  slice_set);
> +			break;
> +		case DBUF_S2_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S2,
> +							  slice_set);

The return value on intel_dbuf_slice_set() seems pointless. I'd nuke it.

Also as long you have intel_dbuf_slice_set() I'd probably just pass the
slice index there and let it deal with the register details. DBUF_CTL
should probably be parametrized as well so you don't need annoying switch
statements.

> +			break;
> +		default:
> +			MISSING_CASE(slice_bit);
> +		}
> +	}
>  
>  	if (ret)
>  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  
>  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) | DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) | DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power enable timeout\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that we only have 1 slice, see
> -		 * intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Just power up 1 slice, we will
> +	 * figure out later which slices we have and what we need.
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;

That stuff really shouldn't be modified by low level functions.
The intial value should come from readout, and after that it should
be adjusted as part of the overall atomic state.

> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) & ~DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) & ~DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power disable timeout!\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that the first slice is always
> -		 * enabled, see intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Disable all slices
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> index 1da04f3e0fb3..b7057d68ad78 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> @@ -311,8 +311,10 @@ intel_display_power_put_async(struct drm_i915_private *i915,
>  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
>  	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
>  
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices);
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
>  
>  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
>  			     bool override, unsigned int mask);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 82e4e6bf08c3..96741e68633a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -51,6 +51,7 @@
>  #include "display/intel_bw.h"
>  #include "display/intel_cdclk.h"
>  #include "display/intel_display_types.h"
> +#include "display/intel_display_power.h"
>  #include "display/intel_dp.h"
>  #include "display/intel_fbdev.h"
>  #include "display/intel_hotplug.h"
> @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct device *kdev)
>  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
>  		ret = vlv_resume_prepare(dev_priv, true);
>  
> +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices reg */
> +	if (INTEL_GEN(dev_priv) == 11)
> +		icl_program_dbuf_slices(dev_priv);
> +
>  	intel_uncore_runtime_resume(&dev_priv->uncore);
>  
>  	intel_runtime_pm_enable_interrupts(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7e0f67babe20..a396977c9c2d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const struct skl_ddb_entry *e1,
>  }
>  
>  struct skl_ddb_allocation {

This struct is quite pointless. Should be killed off.

> -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf slices */
>  };
>  
>  struct skl_ddb_values {
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 1bb701d32a5d..8e86af505730 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -614,7 +614,8 @@ static const struct intel_device_info intel_cherryview_info = {
>  	.has_gt_uc = 1, \
>  	.display.has_hdcp = 1, \
>  	.display.has_ipc = 1, \
> -	.ddb_size = 896
> +	.ddb_size = 896, \
> +	.supported_slices = 1
>  
>  #define SKL_PLATFORM \
>  	GEN9_FEATURES, \
> @@ -650,6 +651,7 @@ static const struct intel_device_info intel_skylake_gt4_info = {
>  	GEN(9), \
>  	.is_lp = 1, \
>  	.display.has_hotplug = 1, \
> +	.supported_slices = 1, \
>  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
>  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
>  	.has_64bit_reloc = 1, \
> @@ -737,6 +739,7 @@ static const struct intel_device_info intel_coffeelake_gt3_info = {
>  	GEN9_FEATURES, \
>  	GEN(10), \
>  	.ddb_size = 1024, \
> +	.supported_slices = 1, \
>  	.display.has_dsc = 1, \
>  	.has_coherent_ggtt = false, \
>  	GLK_COLORS
> @@ -773,6 +776,7 @@ static const struct intel_device_info intel_cannonlake_info = {
>  	}, \
>  	GEN(11), \
>  	.ddb_size = 2048, \
> +	.supported_slices = 2, \
>  	.has_logical_ring_elsq = 1, \
>  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a607ea520829..fba5731063d8 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7734,9 +7734,11 @@ enum {
>  #define DISP_ARB_CTL2	_MMIO(0x45004)
>  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
>  #define  DISP_IPC_ENABLE		(1 << 3)
> -#define DBUF_CTL	_MMIO(0x45008)
> -#define DBUF_CTL_S1	_MMIO(0x45008)
> -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> +#define DBUF_S1_BIT			BIT(0)
> +#define DBUF_S2_BIT			BIT(1)

These are not bits inside DBUF_CTL so they should not be here.
I'd just nuke them entirely. If you think we need more than an
int to refer to a slice then IMO something simple like

enum dbuf_slice {
	DBUF_S1,
	DBUF_S2,
};

would match the general apporach we use for other things much better
rather than defining some shifted values.

> +#define DBUF_CTL			_MMIO(0x45008)
> +#define DBUF_CTL_S1			_MMIO(0x45008)
> +#define DBUF_CTL_S2			_MMIO(0x44FE8)
>  #define  DBUF_POWER_REQUEST		(1 << 31)
>  #define  DBUF_POWER_STATE		(1 << 30)
>  #define GEN7_MSG_CTL	_MMIO(0x45010)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 4bdf8a6cfb47..ba34e1a5c591 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -180,6 +180,7 @@ struct intel_device_info {
>  	} display;
>  
>  	u16 ddb_size; /* in blocks */
> +	u8 supported_slices; /* number of DBuf slices */
>  
>  	/* Register offsets for the various display pipes and transcoders */
>  	int pipe_offsets[I915_MAX_TRANSCODERS];
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2d389e437e87..1ea81ab92429 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
>  }
>  
> -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
> +static u8 intel_enabled_dbuf_slices(struct drm_i915_private *dev_priv)
>  {
> -	u8 enabled_slices;
> -
> -	/* Slice 1 will always be enabled */
> -	enabled_slices = 1;
> +	u8 enabled_slices = 0;
>  
>  	/* Gen prior to GEN11 have only one DBuf slice */
>  	if (INTEL_GEN(dev_priv) < 11)
> -		return enabled_slices;
> +		return DBUF_S1_BIT;

I'd nuke this special case entirely and just read everything from the
hw. Assuming we need readout somewhere.

>  
> -	/*
> -	 * FIXME: for now we'll only ever use 1 slice; pretend that we have
> -	 * only that 1 slice enabled until we have a proper way for on-demand
> -	 * toggling of the second slice.
> -	 */
> -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> -		enabled_slices++;
> +	/* Check if second DBuf slice is enabled */
> +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S1_BIT;
> +
> +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S2_BIT;
>  
>  	return enabled_slices;
>  }
> @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
>  			      const int num_active,
>  			      struct skl_ddb_allocation *ddb)
>  {
> -	const struct drm_display_mode *adjusted_mode;
> -	u64 total_data_bw;
>  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> -
>  	WARN_ON(ddb_size == 0);
>  
>  	if (INTEL_GEN(dev_priv) < 11)
>  		return ddb_size - 4; /* 4 blocks for bypass path allocation */
>  
> -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> -	total_data_bw = total_data_rate * drm_mode_vrefresh(adjusted_mode);
> +	return ddb_size;
> +}
>  
> -	/*
> -	 * 12GB/s is maximum BW supported by single DBuf slice.
> -	 *
> -	 * FIXME dbuf slice code is broken:
> -	 * - must wait for planes to stop using the slice before powering it off
> -	 * - plane straddling both slices is illegal in multi-pipe scenarios
> -	 * - should validate we stay within the hw bandwidth limits
> -	 */
> -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> -		ddb->enabled_slices = 2;
> -	} else {
> -		ddb->enabled_slices = 1;
> -		ddb_size /= 2;
> -	}
> +/*
> + * Calculate initial DBuf slice offset, based on slice size
> + * and mask(i.e if slice size is 1024 and second slice is enabled
> + * offset would be 1024)
> + */
> +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> +					   u32 slice_size, u32 ddb_size)
> +{
> +	u32 offset = 0;
>  
> -	return ddb_size;
> +	if (!dbuf_slice_mask)
> +		return 0;
> +
> +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> +
> +	WARN_ON(offset >= ddb_size);
> +	return offset;
>  }
>  
> +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state);
> +
>  static void
>  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  				   const struct intel_crtc_state *crtc_state,
> @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
>  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
>  	u16 ddb_size;
> +	u32 ddb_range_size;
>  	u32 i;
> +	u32 dbuf_slice_mask;
> +	u32 active_pipes;
> +	u32 offset;
> +	u32 slice_size;
> +	u32 total_slice_mask;
> +	u32 start, end;
>  
>  	if (WARN_ON(!state) || !crtc_state->hw.active) {
>  		alloc->start = 0;
> @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> -	if (intel_state->active_pipe_changes)
> +	if (intel_state->active_pipe_changes) {
>  		*num_active = hweight8(intel_state->active_pipes);
> -	else
> +		active_pipes = intel_state->active_pipes;
> +	} else {
>  		*num_active = hweight8(dev_priv->active_pipes);
> +		active_pipes = dev_priv->active_pipes;
> +	}
>  
>  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state, total_data_rate,
>  				      *num_active, ddb);
>  
> +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> +
> +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> +
> +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> +
>  	/*
>  	 * If the state doesn't change the active CRTC's or there is no
>  	 * modeset request, then there's no need to recalculate;
> @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> +	/*
> +	 * Get allowed DBuf slices for correspondent pipe and platform.
> +	 */
> +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv, for_pipe,
> +						     active_pipes, crtc_state);

Quite a few redundant arguments. The function name looks alien.

skl_crtc_possible_dbuf_slices() or something perhaps?

> +
> +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> +		      dbuf_slice_mask,
> +		      for_pipe, active_pipes);
> +
> +	/*
> +	 * Figure out at which DBuf slice we start, i.e if we start at Dbuf S2
> +	 * and slice size is 1024, the offset would be 1024
> +	 */
> +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> +						 slice_size, ddb_size);
> +
> +	/*
> +	 * Figure out total size of allowed DBuf slices, which is basically
> +	 * a number of allowed slices for that pipe multiplied by slice size.
> +	 * Inside of this
> +	 * range ddb entries are still allocated in proportion to display width.
> +	 */
> +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> +
>  	/*
>  	 * Watermark/ddb requirement highly depends upon width of the
>  	 * framebuffer, So instead of allocating DDB equally among pipes
>  	 * distribute DDB based on resolution/width of the display.
>  	 */
> +	total_slice_mask = dbuf_slice_mask;

Isn't this already taken care by the loop below?

>  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state, i) {
>  		const struct drm_display_mode *adjusted_mode =
>  			&crtc_state->hw.adjusted_mode;
>  		enum pipe pipe = crtc->pipe;
>  		int hdisplay, vdisplay;
> +		u32 pipe_dbuf_slice_mask = \
> +			i915_get_allowed_dbuf_mask(dev_priv,
> +						pipe,
> +						active_pipes,
> +						crtc_state);
>  
>  		if (!crtc_state->hw.enable)
>  			continue;
>  
> +		/*
> +		 * According to BSpec pipe can share one dbuf slice with another
> +		 * pipes or pipe can use multiple dbufs, in both cases we
> +		 * account for other pipes only if they have exactly same mask.
> +		 * However we need to account how many slices we should enable
> +		 * in total.
> +		 */
> +		total_slice_mask |= pipe_dbuf_slice_mask;

total_slice_mask will now account only the crtcs in the state. What
happens to the other pipes' slices?

> +
> +		/*
> +		 * Do not account pipes using other slice sets
> +		 * luckily as of current BSpec slice sets do not partially
> +		 * intersect(pipes share either same one slice or same slice set
> +		 * i.e no partial intersection), so it is enough to check for
> +		 * equality for now.
> +		 */
> +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> +			continue;
> +
>  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay, &vdisplay);
>  		total_width += hdisplay;
>  
> @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  			pipe_width = hdisplay;
>  	}
>  
> -	alloc->start = ddb_size * width_before_pipe / total_width;
> -	alloc->end = ddb_size * (width_before_pipe + pipe_width) / total_width;
> +	ddb->enabled_slices = total_slice_mask;
> +
> +	start = ddb_range_size * width_before_pipe / total_width;
> +	end = ddb_range_size * (width_before_pipe + pipe_width) / total_width;
> +
> +	alloc->start = offset + start;
> +	alloc->end = offset + end;
> +
> +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> +		      alloc->start, alloc->end);
> +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> +		      ddb->enabled_slices,
> +		      INTEL_INFO(dev_priv)->supported_slices);
>  }
>  
>  static int skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
> @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
>  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
>  			  struct skl_ddb_allocation *ddb /* out */)
>  {
> -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb->enabled_slices);
>  }
>  
>  /*
> @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  	return total_data_rate;
>  }
>  
> +static uint_fixed_16_16_t
> +icl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
> +{
> +	u32 src_w, src_h, dst_w, dst_h;
> +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +	uint_fixed_16_16_t downscale_h, downscale_w;
> +	const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode;
> +
> +	src_w = crtc_state->pipe_src_w;
> +	src_h = crtc_state->pipe_src_h;
> +	dst_w = adjusted_mode->crtc_hdisplay;
> +	dst_h = adjusted_mode->crtc_vdisplay;
> +
> +	fp_w_ratio = div_fixed16(src_w, dst_w);
> +	fp_h_ratio = div_fixed16(src_h, dst_h);
> +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> +
> +	return mul_fixed16(downscale_w, downscale_h);
> +}
> +
> +uint_fixed_16_16_t
> +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> +{
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *drm_plane_state;
> +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> +	if (!crtc_state->uapi.enable)
> +		return max_downscale;
> +
> +	drm_atomic_crtc_state_for_each_plane_state(plane, drm_plane_state, &crtc_state->uapi) {
> +		uint_fixed_16_16_t plane_downscale;
> +		const struct intel_plane_state *plane_state =
> +			to_intel_plane_state(drm_plane_state);
> +
> +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> +			continue;
> +
> +		plane_downscale = skl_plane_downscale_amount(crtc_state, plane_state);
> +
> +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> +			      to_intel_plane(plane)->id,
> +			      plane_downscale.val >> 16,
> +			      plane_downscale.val & 0xffff);
> +
> +		max_downscale = max_fixed16(plane_downscale, max_downscale);
> +	}
> +
> +
> +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> +
> +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> +		       crtc->pipe, pipe_downscale.val >> 16,
> +		       pipe_downscale.val & 0xffff);
> +
> +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> +
> +	/*
> +	 * Convert result to percentage: get the actual ratio instead of rate,
> +	 * then multiply by 100 to get percentage.
> +	 */
> +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1, pipe_downscale));
> +
> +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> +
> +	return pipe_ratio;
> +}

Yuck. I just killed these.

> +
> +struct dbuf_slice_conf_entry {
> +	u32 dbuf_mask[I915_MAX_PIPES];
> +	u32 active_pipes;
> +};
> +
> +
> +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> +					  BIT(PIPE_C) | BIT(PIPE_D) }
> +
> +/*
> + * Table taken from Bspec 12716
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +};
> +
> +/*
> + * Table taken from Bspec 49255
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +};

My eyes!

I have to think we should be able to reduce all that to a handful
of lines of code.


> +
> +/*
> + * This function

"This function" is totally redundant. Ditto elsewhere.

> finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 icl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +	uint_fixed_16_16_t pipe_ratio;
> +
> +	/*
> +	 * Calculate pipe ratio as stated in BSpec 28692
> +	 */
> +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> +
> +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> +		if (icl_allowed_dbufs[i].active_pipes == active_pipes) {
> +			if (hweight32(active_pipes) == 1) {
> +				/*
> +				 * According to BSpec 12716: if ratio >= 88.8
> +				 * always use single dbuf slice.
> +				 */
> +				if (pipe_ratio.val >= div_fixed16(888, 10).val)
> +					++i;
> +			}
> +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> +		}
> +	}
> +	return 0;
> +}
> +
> +/*
> + * This function finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> +	}
> +	return 0;
> +}
> +
> +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state)
> +{
> +	if (IS_GEN(dev_priv, 11))
> +		return icl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	else if (IS_GEN(dev_priv, 12))
> +		return tgl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	/*
> +	 * For anything else just return one slice yet.
> +	 * Should be extended for other platforms.
> +	 */
> +	return DBUF_S1_BIT;
> +}
> +
>  static u64
>  icl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  				 u64 *plane_data_rate)
> -- 
> 2.17.1

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-15 20:19   ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-15 20:19 UTC (permalink / raw)
  To: Stanislav Lisovskiy; +Cc: intel-gfx

On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy wrote:
> Also implemented algorithm for choosing DBuf slice configuration
> based on active pipes, pipe ratio as stated in BSpec 12716.
> 
> Now pipe allocation still stays proportional to pipe width as before,
> however within allowed DBuf slice for this particular configuration.
> 
> v2: Remove unneeded check from commit as ddb enabled slices might
>     now differ from hw state.
> 
> v3: - Added new field "supported_slices" to ddb to separate max
>       supported slices vs currently enabled, to avoid confusion.
>     - Removed obsolete comments and code related to DBuf(Matthew Roper).
>     - Some code style and long line removal(Matthew Roper).
>     - Added WARN_ON to new ddb range offset calc function(Matthew Roper).
>     - Removed platform specific call to calc pipe ratio from ddb
>       allocation function and fixed the return value(Matthew Roper)
>     - Refactored DBUF slice allocation table to improve readability
>     - Added DBUF slice allocation for TGL as well.
>     - ICL(however not TGL) seems to voluntarily enable second DBuf slice
>       after pm suspend/resume causing a mismatch failure, because we
>       update DBuf slices only if we do a modeset, however this check
>       is done always. Fixed it to be done only when modeset for ICL.
> 
> v4: - Now enabled slices is not actually a number, but a bitmask,
>       because we might need to enable second slice only and number
>       of slices would still 1 and that current code doesn't allow.
>     - Remove redundant duplicate code to have some unified way of
>       enabling dbuf slices instead of hardcoding.
> 
> v5: - Fix failing gen9_assert_dbuf_enabled as it was naively thinking
>       that we have only one DBUF_CTL slice. Now another version for
>       gen11 will check both slices as only second one can be enabled,
>       to keep CI happy.
> 
> v6: - Removed enabled dbuf assertion completely. Previous code
>       was keeping dbuf enabled, even(!) when _dbuf_disable is called.
>       Currently we enable/disable dbuf slices based on requrement
>       so no point in asserting this here.
>     - Removed unnecessary modeset check from verify_wm_state(Matthew Roper)
>     - Slices intersection after union is same as final result(Matthew Roper)
>     - Moved DBuf slices to intel_device_info(Matthew Roper)
>     - Some macros added(Matthew Roper)
>     - ddb range is now always less or equal to ddb size - no need for additional
>       checks(previously needed as we had some bandwidth checks in there which
>       could "suddenly" return ddb_size smaller than it is.
> 
> v7: Minor costemic changes:
>     - Changed icl_dbuf_slices_restore name to icl_program_dbuf_slices
>       as it more corresponds to the actual functionality.
>     - Some simplification with supported slices for BXT and GLK
>     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
>       this is not used for skl anymore.
>     - Changed DBuf slice assignment order for TGL case
> 
> Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> Cc: Matthew Roper <matthew.d.roper@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@intel.com>
> Cc: James Ausmus <james.ausmus@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
>  .../drm/i915/display/intel_display_power.c    |  98 ++---
>  .../drm/i915/display/intel_display_power.h    |   2 +
>  drivers/gpu/drm/i915/i915_drv.c               |   5 +
>  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
>  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
>  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
>  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
>  drivers/gpu/drm/i915/intel_pm.c               | 387 ++++++++++++++++--
>  9 files changed, 431 insertions(+), 104 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 876fc25968bf..bd7aff675198 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct intel_crtc *crtc,
>  
>  	if (INTEL_GEN(dev_priv) >= 11 &&
>  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got %u)\n",
> +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got %x)\n",
>  			  sw_ddb->enabled_slices,
>  			  hw->ddb.enabled_slices);
>  
> @@ -14614,15 +14614,24 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
>  	u8 required_slices = state->wm_results.ddb.enabled_slices;
>  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> +	u8 slices_union = hw_enabled_slices | required_slices;
> +	u8 slices_intersection = required_slices;
>  
>  	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i)
>  		/* ignore allocations for crtc's that have been turned off. */
>  		if (new_crtc_state->hw.active)
>  			entries[i] = old_crtc_state->wm.skl.ddb;
>  
> -	/* If 2nd DBuf slice required, enable it here */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices > hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> +		      required_slices, hw_enabled_slices);
> +
> +	/*
> +	 * If some other DBuf slice required, enable it here,
> +	 * as here we only add more slices, but not remove to prevent
> +	 * issues if somebody still uses those.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_union);
>  
>  	/*
>  	 * Whenever the number of active pipes changes, we need to make sure we
> @@ -14681,9 +14690,12 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  		}
>  	} while (progress);
>  
> -	/* If 2nd DBuf slice is no more required disable it */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices < hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	/*
> +	 * If some other DBuf slice is not needed, disable it here,
> +	 * as here we only remove slices, which are not needed anymore.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_intersection);

This is still a totally wrong place for this. We need to be sure the
planes have switched over to the new ddb layout before we power off
anything.

>  }
>  
>  static void intel_atomic_helper_free_state(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
> index ce1b64f4dd44..a78bc9b61862 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> @@ -1031,15 +1031,6 @@ static bool gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
>  		(I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
>  }
>  
> -static void gen9_assert_dbuf_enabled(struct drm_i915_private *dev_priv)
> -{
> -	u32 tmp = I915_READ(DBUF_CTL);
> -
> -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> -}
> -
>  static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  {
>  	struct intel_cdclk_state cdclk_state = {};
> @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  	/* Can't read out voltage_level so can't use intel_cdclk_changed() */
>  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw, &cdclk_state));
>  
> -	gen9_assert_dbuf_enabled(dev_priv);
> -
>  	if (IS_GEN9_LP(dev_priv))
>  		bxt_verify_ddi_phy_power_wells(dev_priv);
>  
> @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct drm_i915_private *dev_priv)
>  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
>  }
>  
> -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
>  {
> -	if (INTEL_GEN(dev_priv) < 11)
> -		return 1;
> -	return 2;
> +	return INTEL_INFO(dev_priv)->supported_slices;
> +}

Kinda pointless function now. 'supported_slices' is not a very good
name. Should be 'num_dbuf_slices' or something along those lines.

> +
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> +{
> +	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;

Seems a rather pointless variable. const at least is pointless.

> +
> +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
>  }
>  
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices)
>  {
> -	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> -	bool ret;
> +	bool ret = true;
> +	int i;
> +	int max_slices = intel_dbuf_max_slices(dev_priv);
>  
> -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
>  		DRM_ERROR("Invalid number of dbuf slices requested\n");
>  		return;
>  	}
>  
> -	if (req_slices == hw_enabled_slices || req_slices == 0)
> -		return;
> +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
>  
> -	if (req_slices > hw_enabled_slices)
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, true);
> -	else
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, false);
> +	for (i = 0; i < max_slices; i++) {
> +		int slice_bit = BIT(i);
> +		bool slice_set = (slice_bit & req_slices) != 0;
> +
> +		switch (slice_bit) {
> +		case DBUF_S1_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S1,
> +							  slice_set);
> +			break;
> +		case DBUF_S2_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S2,
> +							  slice_set);

The return value on intel_dbuf_slice_set() seems pointless. I'd nuke it.

Also as long you have intel_dbuf_slice_set() I'd probably just pass the
slice index there and let it deal with the register details. DBUF_CTL
should probably be parametrized as well so you don't need annoying switch
statements.

> +			break;
> +		default:
> +			MISSING_CASE(slice_bit);
> +		}
> +	}
>  
>  	if (ret)
>  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  
>  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) | DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) | DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power enable timeout\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that we only have 1 slice, see
> -		 * intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Just power up 1 slice, we will
> +	 * figure out later which slices we have and what we need.
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;

That stuff really shouldn't be modified by low level functions.
The intial value should come from readout, and after that it should
be adjusted as part of the overall atomic state.

> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) & ~DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) & ~DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power disable timeout!\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that the first slice is always
> -		 * enabled, see intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Disable all slices
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> index 1da04f3e0fb3..b7057d68ad78 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> @@ -311,8 +311,10 @@ intel_display_power_put_async(struct drm_i915_private *i915,
>  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
>  	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
>  
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices);
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
>  
>  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
>  			     bool override, unsigned int mask);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 82e4e6bf08c3..96741e68633a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -51,6 +51,7 @@
>  #include "display/intel_bw.h"
>  #include "display/intel_cdclk.h"
>  #include "display/intel_display_types.h"
> +#include "display/intel_display_power.h"
>  #include "display/intel_dp.h"
>  #include "display/intel_fbdev.h"
>  #include "display/intel_hotplug.h"
> @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct device *kdev)
>  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
>  		ret = vlv_resume_prepare(dev_priv, true);
>  
> +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices reg */
> +	if (INTEL_GEN(dev_priv) == 11)
> +		icl_program_dbuf_slices(dev_priv);
> +
>  	intel_uncore_runtime_resume(&dev_priv->uncore);
>  
>  	intel_runtime_pm_enable_interrupts(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7e0f67babe20..a396977c9c2d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const struct skl_ddb_entry *e1,
>  }
>  
>  struct skl_ddb_allocation {

This struct is quite pointless. Should be killed off.

> -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf slices */
>  };
>  
>  struct skl_ddb_values {
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 1bb701d32a5d..8e86af505730 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -614,7 +614,8 @@ static const struct intel_device_info intel_cherryview_info = {
>  	.has_gt_uc = 1, \
>  	.display.has_hdcp = 1, \
>  	.display.has_ipc = 1, \
> -	.ddb_size = 896
> +	.ddb_size = 896, \
> +	.supported_slices = 1
>  
>  #define SKL_PLATFORM \
>  	GEN9_FEATURES, \
> @@ -650,6 +651,7 @@ static const struct intel_device_info intel_skylake_gt4_info = {
>  	GEN(9), \
>  	.is_lp = 1, \
>  	.display.has_hotplug = 1, \
> +	.supported_slices = 1, \
>  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
>  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
>  	.has_64bit_reloc = 1, \
> @@ -737,6 +739,7 @@ static const struct intel_device_info intel_coffeelake_gt3_info = {
>  	GEN9_FEATURES, \
>  	GEN(10), \
>  	.ddb_size = 1024, \
> +	.supported_slices = 1, \
>  	.display.has_dsc = 1, \
>  	.has_coherent_ggtt = false, \
>  	GLK_COLORS
> @@ -773,6 +776,7 @@ static const struct intel_device_info intel_cannonlake_info = {
>  	}, \
>  	GEN(11), \
>  	.ddb_size = 2048, \
> +	.supported_slices = 2, \
>  	.has_logical_ring_elsq = 1, \
>  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a607ea520829..fba5731063d8 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7734,9 +7734,11 @@ enum {
>  #define DISP_ARB_CTL2	_MMIO(0x45004)
>  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
>  #define  DISP_IPC_ENABLE		(1 << 3)
> -#define DBUF_CTL	_MMIO(0x45008)
> -#define DBUF_CTL_S1	_MMIO(0x45008)
> -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> +#define DBUF_S1_BIT			BIT(0)
> +#define DBUF_S2_BIT			BIT(1)

These are not bits inside DBUF_CTL so they should not be here.
I'd just nuke them entirely. If you think we need more than an
int to refer to a slice then IMO something simple like

enum dbuf_slice {
	DBUF_S1,
	DBUF_S2,
};

would match the general apporach we use for other things much better
rather than defining some shifted values.

> +#define DBUF_CTL			_MMIO(0x45008)
> +#define DBUF_CTL_S1			_MMIO(0x45008)
> +#define DBUF_CTL_S2			_MMIO(0x44FE8)
>  #define  DBUF_POWER_REQUEST		(1 << 31)
>  #define  DBUF_POWER_STATE		(1 << 30)
>  #define GEN7_MSG_CTL	_MMIO(0x45010)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 4bdf8a6cfb47..ba34e1a5c591 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -180,6 +180,7 @@ struct intel_device_info {
>  	} display;
>  
>  	u16 ddb_size; /* in blocks */
> +	u8 supported_slices; /* number of DBuf slices */
>  
>  	/* Register offsets for the various display pipes and transcoders */
>  	int pipe_offsets[I915_MAX_TRANSCODERS];
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2d389e437e87..1ea81ab92429 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
>  }
>  
> -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
> +static u8 intel_enabled_dbuf_slices(struct drm_i915_private *dev_priv)
>  {
> -	u8 enabled_slices;
> -
> -	/* Slice 1 will always be enabled */
> -	enabled_slices = 1;
> +	u8 enabled_slices = 0;
>  
>  	/* Gen prior to GEN11 have only one DBuf slice */
>  	if (INTEL_GEN(dev_priv) < 11)
> -		return enabled_slices;
> +		return DBUF_S1_BIT;

I'd nuke this special case entirely and just read everything from the
hw. Assuming we need readout somewhere.

>  
> -	/*
> -	 * FIXME: for now we'll only ever use 1 slice; pretend that we have
> -	 * only that 1 slice enabled until we have a proper way for on-demand
> -	 * toggling of the second slice.
> -	 */
> -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> -		enabled_slices++;
> +	/* Check if second DBuf slice is enabled */
> +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S1_BIT;
> +
> +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S2_BIT;
>  
>  	return enabled_slices;
>  }
> @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
>  			      const int num_active,
>  			      struct skl_ddb_allocation *ddb)
>  {
> -	const struct drm_display_mode *adjusted_mode;
> -	u64 total_data_bw;
>  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> -
>  	WARN_ON(ddb_size == 0);
>  
>  	if (INTEL_GEN(dev_priv) < 11)
>  		return ddb_size - 4; /* 4 blocks for bypass path allocation */
>  
> -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> -	total_data_bw = total_data_rate * drm_mode_vrefresh(adjusted_mode);
> +	return ddb_size;
> +}
>  
> -	/*
> -	 * 12GB/s is maximum BW supported by single DBuf slice.
> -	 *
> -	 * FIXME dbuf slice code is broken:
> -	 * - must wait for planes to stop using the slice before powering it off
> -	 * - plane straddling both slices is illegal in multi-pipe scenarios
> -	 * - should validate we stay within the hw bandwidth limits
> -	 */
> -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> -		ddb->enabled_slices = 2;
> -	} else {
> -		ddb->enabled_slices = 1;
> -		ddb_size /= 2;
> -	}
> +/*
> + * Calculate initial DBuf slice offset, based on slice size
> + * and mask(i.e if slice size is 1024 and second slice is enabled
> + * offset would be 1024)
> + */
> +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> +					   u32 slice_size, u32 ddb_size)
> +{
> +	u32 offset = 0;
>  
> -	return ddb_size;
> +	if (!dbuf_slice_mask)
> +		return 0;
> +
> +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> +
> +	WARN_ON(offset >= ddb_size);
> +	return offset;
>  }
>  
> +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state);
> +
>  static void
>  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  				   const struct intel_crtc_state *crtc_state,
> @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
>  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
>  	u16 ddb_size;
> +	u32 ddb_range_size;
>  	u32 i;
> +	u32 dbuf_slice_mask;
> +	u32 active_pipes;
> +	u32 offset;
> +	u32 slice_size;
> +	u32 total_slice_mask;
> +	u32 start, end;
>  
>  	if (WARN_ON(!state) || !crtc_state->hw.active) {
>  		alloc->start = 0;
> @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> -	if (intel_state->active_pipe_changes)
> +	if (intel_state->active_pipe_changes) {
>  		*num_active = hweight8(intel_state->active_pipes);
> -	else
> +		active_pipes = intel_state->active_pipes;
> +	} else {
>  		*num_active = hweight8(dev_priv->active_pipes);
> +		active_pipes = dev_priv->active_pipes;
> +	}
>  
>  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state, total_data_rate,
>  				      *num_active, ddb);
>  
> +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> +
> +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> +
> +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> +
>  	/*
>  	 * If the state doesn't change the active CRTC's or there is no
>  	 * modeset request, then there's no need to recalculate;
> @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> +	/*
> +	 * Get allowed DBuf slices for correspondent pipe and platform.
> +	 */
> +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv, for_pipe,
> +						     active_pipes, crtc_state);

Quite a few redundant arguments. The function name looks alien.

skl_crtc_possible_dbuf_slices() or something perhaps?

> +
> +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> +		      dbuf_slice_mask,
> +		      for_pipe, active_pipes);
> +
> +	/*
> +	 * Figure out at which DBuf slice we start, i.e if we start at Dbuf S2
> +	 * and slice size is 1024, the offset would be 1024
> +	 */
> +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> +						 slice_size, ddb_size);
> +
> +	/*
> +	 * Figure out total size of allowed DBuf slices, which is basically
> +	 * a number of allowed slices for that pipe multiplied by slice size.
> +	 * Inside of this
> +	 * range ddb entries are still allocated in proportion to display width.
> +	 */
> +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> +
>  	/*
>  	 * Watermark/ddb requirement highly depends upon width of the
>  	 * framebuffer, So instead of allocating DDB equally among pipes
>  	 * distribute DDB based on resolution/width of the display.
>  	 */
> +	total_slice_mask = dbuf_slice_mask;

Isn't this already taken care by the loop below?

>  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state, i) {
>  		const struct drm_display_mode *adjusted_mode =
>  			&crtc_state->hw.adjusted_mode;
>  		enum pipe pipe = crtc->pipe;
>  		int hdisplay, vdisplay;
> +		u32 pipe_dbuf_slice_mask = \
> +			i915_get_allowed_dbuf_mask(dev_priv,
> +						pipe,
> +						active_pipes,
> +						crtc_state);
>  
>  		if (!crtc_state->hw.enable)
>  			continue;
>  
> +		/*
> +		 * According to BSpec pipe can share one dbuf slice with another
> +		 * pipes or pipe can use multiple dbufs, in both cases we
> +		 * account for other pipes only if they have exactly same mask.
> +		 * However we need to account how many slices we should enable
> +		 * in total.
> +		 */
> +		total_slice_mask |= pipe_dbuf_slice_mask;

total_slice_mask will now account only the crtcs in the state. What
happens to the other pipes' slices?

> +
> +		/*
> +		 * Do not account pipes using other slice sets
> +		 * luckily as of current BSpec slice sets do not partially
> +		 * intersect(pipes share either same one slice or same slice set
> +		 * i.e no partial intersection), so it is enough to check for
> +		 * equality for now.
> +		 */
> +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> +			continue;
> +
>  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay, &vdisplay);
>  		total_width += hdisplay;
>  
> @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  			pipe_width = hdisplay;
>  	}
>  
> -	alloc->start = ddb_size * width_before_pipe / total_width;
> -	alloc->end = ddb_size * (width_before_pipe + pipe_width) / total_width;
> +	ddb->enabled_slices = total_slice_mask;
> +
> +	start = ddb_range_size * width_before_pipe / total_width;
> +	end = ddb_range_size * (width_before_pipe + pipe_width) / total_width;
> +
> +	alloc->start = offset + start;
> +	alloc->end = offset + end;
> +
> +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> +		      alloc->start, alloc->end);
> +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> +		      ddb->enabled_slices,
> +		      INTEL_INFO(dev_priv)->supported_slices);
>  }
>  
>  static int skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
> @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
>  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
>  			  struct skl_ddb_allocation *ddb /* out */)
>  {
> -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb->enabled_slices);
>  }
>  
>  /*
> @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  	return total_data_rate;
>  }
>  
> +static uint_fixed_16_16_t
> +icl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
> +{
> +	u32 src_w, src_h, dst_w, dst_h;
> +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +	uint_fixed_16_16_t downscale_h, downscale_w;
> +	const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode;
> +
> +	src_w = crtc_state->pipe_src_w;
> +	src_h = crtc_state->pipe_src_h;
> +	dst_w = adjusted_mode->crtc_hdisplay;
> +	dst_h = adjusted_mode->crtc_vdisplay;
> +
> +	fp_w_ratio = div_fixed16(src_w, dst_w);
> +	fp_h_ratio = div_fixed16(src_h, dst_h);
> +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> +
> +	return mul_fixed16(downscale_w, downscale_h);
> +}
> +
> +uint_fixed_16_16_t
> +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> +{
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *drm_plane_state;
> +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> +	if (!crtc_state->uapi.enable)
> +		return max_downscale;
> +
> +	drm_atomic_crtc_state_for_each_plane_state(plane, drm_plane_state, &crtc_state->uapi) {
> +		uint_fixed_16_16_t plane_downscale;
> +		const struct intel_plane_state *plane_state =
> +			to_intel_plane_state(drm_plane_state);
> +
> +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> +			continue;
> +
> +		plane_downscale = skl_plane_downscale_amount(crtc_state, plane_state);
> +
> +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> +			      to_intel_plane(plane)->id,
> +			      plane_downscale.val >> 16,
> +			      plane_downscale.val & 0xffff);
> +
> +		max_downscale = max_fixed16(plane_downscale, max_downscale);
> +	}
> +
> +
> +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> +
> +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> +		       crtc->pipe, pipe_downscale.val >> 16,
> +		       pipe_downscale.val & 0xffff);
> +
> +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> +
> +	/*
> +	 * Convert result to percentage: get the actual ratio instead of rate,
> +	 * then multiply by 100 to get percentage.
> +	 */
> +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1, pipe_downscale));
> +
> +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> +
> +	return pipe_ratio;
> +}

Yuck. I just killed these.

> +
> +struct dbuf_slice_conf_entry {
> +	u32 dbuf_mask[I915_MAX_PIPES];
> +	u32 active_pipes;
> +};
> +
> +
> +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> +					  BIT(PIPE_C) | BIT(PIPE_D) }
> +
> +/*
> + * Table taken from Bspec 12716
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +};
> +
> +/*
> + * Table taken from Bspec 49255
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +};

My eyes!

I have to think we should be able to reduce all that to a handful
of lines of code.


> +
> +/*
> + * This function

"This function" is totally redundant. Ditto elsewhere.

> finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 icl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +	uint_fixed_16_16_t pipe_ratio;
> +
> +	/*
> +	 * Calculate pipe ratio as stated in BSpec 28692
> +	 */
> +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> +
> +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> +		if (icl_allowed_dbufs[i].active_pipes == active_pipes) {
> +			if (hweight32(active_pipes) == 1) {
> +				/*
> +				 * According to BSpec 12716: if ratio >= 88.8
> +				 * always use single dbuf slice.
> +				 */
> +				if (pipe_ratio.val >= div_fixed16(888, 10).val)
> +					++i;
> +			}
> +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> +		}
> +	}
> +	return 0;
> +}
> +
> +/*
> + * This function finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> +	}
> +	return 0;
> +}
> +
> +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state)
> +{
> +	if (IS_GEN(dev_priv, 11))
> +		return icl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	else if (IS_GEN(dev_priv, 12))
> +		return tgl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	/*
> +	 * For anything else just return one slice yet.
> +	 * Should be extended for other platforms.
> +	 */
> +	return DBUF_S1_BIT;
> +}
> +
>  static u64
>  icl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  				 u64 *plane_data_rate)
> -- 
> 2.17.1

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18  9:19     ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18  9:19 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy wrote:
> > Also implemented algorithm for choosing DBuf slice configuration
> > based on active pipes, pipe ratio as stated in BSpec 12716.
> > 
> > Now pipe allocation still stays proportional to pipe width as
> > before,
> > however within allowed DBuf slice for this particular
> > configuration.
> > 
> > v2: Remove unneeded check from commit as ddb enabled slices might
> >     now differ from hw state.
> > 
> > v3: - Added new field "supported_slices" to ddb to separate max
> >       supported slices vs currently enabled, to avoid confusion.
> >     - Removed obsolete comments and code related to DBuf(Matthew
> > Roper).
> >     - Some code style and long line removal(Matthew Roper).
> >     - Added WARN_ON to new ddb range offset calc function(Matthew
> > Roper).
> >     - Removed platform specific call to calc pipe ratio from ddb
> >       allocation function and fixed the return value(Matthew Roper)
> >     - Refactored DBUF slice allocation table to improve readability
> >     - Added DBUF slice allocation for TGL as well.
> >     - ICL(however not TGL) seems to voluntarily enable second DBuf
> > slice
> >       after pm suspend/resume causing a mismatch failure, because
> > we
> >       update DBuf slices only if we do a modeset, however this
> > check
> >       is done always. Fixed it to be done only when modeset for
> > ICL.
> > 
> > v4: - Now enabled slices is not actually a number, but a bitmask,
> >       because we might need to enable second slice only and number
> >       of slices would still 1 and that current code doesn't allow.
> >     - Remove redundant duplicate code to have some unified way of
> >       enabling dbuf slices instead of hardcoding.
> > 
> > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > thinking
> >       that we have only one DBUF_CTL slice. Now another version for
> >       gen11 will check both slices as only second one can be
> > enabled,
> >       to keep CI happy.
> > 
> > v6: - Removed enabled dbuf assertion completely. Previous code
> >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > called.
> >       Currently we enable/disable dbuf slices based on requrement
> >       so no point in asserting this here.
> >     - Removed unnecessary modeset check from
> > verify_wm_state(Matthew Roper)
> >     - Slices intersection after union is same as final
> > result(Matthew Roper)
> >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> >     - Some macros added(Matthew Roper)
> >     - ddb range is now always less or equal to ddb size - no need
> > for additional
> >       checks(previously needed as we had some bandwidth checks in
> > there which
> >       could "suddenly" return ddb_size smaller than it is.
> > 
> > v7: Minor costemic changes:
> >     - Changed icl_dbuf_slices_restore name to
> > icl_program_dbuf_slices
> >       as it more corresponds to the actual functionality.
> >     - Some simplification with supported slices for BXT and GLK
> >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> >       this is not used for skl anymore.
> >     - Changed DBuf slice assignment order for TGL case
> > 
> > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > Cc: James Ausmus <james.ausmus@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> >  .../drm/i915/display/intel_display_power.h    |   2 +
> >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > ++++++++++++++++--
> >  9 files changed, 431 insertions(+), 104 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > b/drivers/gpu/drm/i915/display/intel_display.c
> > index 876fc25968bf..bd7aff675198 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > intel_crtc *crtc,
> >  
> >  	if (INTEL_GEN(dev_priv) >= 11 &&
> >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got
> > %u)\n",
> > +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got
> > %x)\n",
> >  			  sw_ddb->enabled_slices,
> >  			  hw->ddb.enabled_slices);
> >  
> > @@ -14614,15 +14614,24 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> >  	u8 required_slices = state->wm_results.ddb.enabled_slices;
> >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > +	u8 slices_union = hw_enabled_slices | required_slices;
> > +	u8 slices_intersection = required_slices;
> >  
> >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > old_crtc_state, new_crtc_state, i)
> >  		/* ignore allocations for crtc's that have been turned
> > off. */
> >  		if (new_crtc_state->hw.active)
> >  			entries[i] = old_crtc_state->wm.skl.ddb;
> >  
> > -	/* If 2nd DBuf slice required, enable it here */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > +		      required_slices, hw_enabled_slices);
> > +
> > +	/*
> > +	 * If some other DBuf slice required, enable it here,
> > +	 * as here we only add more slices, but not remove to prevent
> > +	 * issues if somebody still uses those.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_union);
> >  
> >  	/*
> >  	 * Whenever the number of active pipes changes, we need to make
> > sure we
> > @@ -14681,9 +14690,12 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  		}
> >  	} while (progress);
> >  
> > -	/* If 2nd DBuf slice is no more required disable it */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	/*
> > +	 * If some other DBuf slice is not needed, disable it here,
> > +	 * as here we only remove slices, which are not needed anymore.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_intersection);
> 
> This is still a totally wrong place for this. We need to be sure the
> planes have switched over to the new ddb layout before we power off
> anything.

I have just modified already existing line here, not added by me.
The point of this patch is starting to use second DBuf slice but
not fixing all kinds of issues related to DBuf code at the same time.

Of course this is nice idea to fix it - if that is wrong plane,
but current code somehow lives with it and probably this should be
with some other patch.

> 
> >  }
> >  
> >  static void intel_atomic_helper_free_state(struct drm_i915_private
> > *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > index ce1b64f4dd44..a78bc9b61862 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > @@ -1031,15 +1031,6 @@ static bool
> > gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
> >  		(I915_READ(DC_STATE_EN) &
> > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> >  }
> >  
> > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > *dev_priv)
> > -{
> > -	u32 tmp = I915_READ(DBUF_CTL);
> > -
> > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> > -}
> > -
> >  static void gen9_disable_dc_states(struct drm_i915_private
> > *dev_priv)
> >  {
> >  	struct intel_cdclk_state cdclk_state = {};
> > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > drm_i915_private *dev_priv)
> >  	/* Can't read out voltage_level so can't use
> > intel_cdclk_changed() */
> >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > &cdclk_state));
> >  
> > -	gen9_assert_dbuf_enabled(dev_priv);
> > -
> >  	if (IS_GEN9_LP(dev_priv))
> >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> >  
> > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > drm_i915_private *dev_priv)
> >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> >  }
> >  
> > -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> >  {
> > -	if (INTEL_GEN(dev_priv) < 11)
> > -		return 1;
> > -	return 2;
> > +	return INTEL_INFO(dev_priv)->supported_slices;
> > +}
> 
> Kinda pointless function now. 'supported_slices' is not a very good
> name. Should be 'num_dbuf_slices' or something along those lines.

Supported slices really means how much slices we support in hardware.
num_dbuf_slices will not reflect that, it will be more ambigious.
I don't like this name.

> 
> > +
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> > +{
> > +	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> 
> Seems a rather pointless variable. const at least is pointless.
> 
> > +
> > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> >  }
> >  
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices)
> >  {
> > -	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> > -	bool ret;
> > +	bool ret = true;
> > +	int i;
> > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> >  
> > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
> >  		DRM_ERROR("Invalid number of dbuf slices requested\n");
> >  		return;
> >  	}
> >  
> > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > -		return;
> > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
> >  
> > -	if (req_slices > hw_enabled_slices)
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > true);
> > -	else
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > false);
> > +	for (i = 0; i < max_slices; i++) {
> > +		int slice_bit = BIT(i);
> > +		bool slice_set = (slice_bit & req_slices) != 0;
> > +
> > +		switch (slice_bit) {
> > +		case DBUF_S1_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S1,
> > +							  slice_set);
> > +			break;
> > +		case DBUF_S2_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S2,
> > +							  slice_set);
> 
> The return value on intel_dbuf_slice_set() seems pointless. I'd nuke
> it.
> 
> Also as long you have intel_dbuf_slice_set() I'd probably just pass
> the
> slice index there and let it deal with the register details. DBUF_CTL
> should probably be parametrized as well so you don't need annoying
> switch
> statements.

What if it fails? Shouldn't we still have some clue what state do we
have/had at the moment?

> 
> > +			break;
> > +		default:
> > +			MISSING_CASE(slice_bit);
> > +		}
> > +	}
> >  
> >  	if (ret)
> >  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > drm_i915_private *dev_priv,
> >  
> >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power enable timeout\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that we only have 1 slice,
> > see
> > -		 * intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Just power up 1 slice, we will
> > +	 * figure out later which slices we have and what we need.
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> 
> That stuff really shouldn't be modified by low level functions.
> The intial value should come from readout, and after that it should
> be adjusted as part of the overall atomic state.

Well, as you can see currently this is even worse - has some weird
hardcoded stuff and is modified like that. Also if we are enabling the 
dbuf slices, shouldn't we really determine what value should be there?

Also I wouldn't really always trust the readout, as to me it seems
icelake resets it constantly to 2 after suspend/resume. 

> 
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > ~DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > ~DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power disable timeout!\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that the first slice is
> > always
> > -		 * enabled, see intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Disable all slices
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > index 1da04f3e0fb3..b7057d68ad78 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > drm_i915_private *i915,
> >  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
> >  	     intel_display_power_put_async((i915), (domain), (wf)),
> > (wf) = 0)
> >  
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices);
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
> >  
> >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> >  			     bool override, unsigned int mask);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 82e4e6bf08c3..96741e68633a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -51,6 +51,7 @@
> >  #include "display/intel_bw.h"
> >  #include "display/intel_cdclk.h"
> >  #include "display/intel_display_types.h"
> > +#include "display/intel_display_power.h"
> >  #include "display/intel_dp.h"
> >  #include "display/intel_fbdev.h"
> >  #include "display/intel_hotplug.h"
> > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > device *kdev)
> >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> >  		ret = vlv_resume_prepare(dev_priv, true);
> >  
> > +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices
> > reg */
> > +	if (INTEL_GEN(dev_priv) == 11)
> > +		icl_program_dbuf_slices(dev_priv);
> > +
> >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> >  
> >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 7e0f67babe20..a396977c9c2d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const
> > struct skl_ddb_entry *e1,
> >  }
> >  
> >  struct skl_ddb_allocation {
> 
> This struct is quite pointless. Should be killed off.

Legacy stuff, not added by me. Quite fine with removing it - 
but should we really do all this at the same time and in same
patch? 

> 
> > -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> > +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf
> > slices */
> >  };
> >  
> >  struct skl_ddb_values {
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > b/drivers/gpu/drm/i915/i915_pci.c
> > index 1bb701d32a5d..8e86af505730 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > intel_cherryview_info = {
> >  	.has_gt_uc = 1, \
> >  	.display.has_hdcp = 1, \
> >  	.display.has_ipc = 1, \
> > -	.ddb_size = 896
> > +	.ddb_size = 896, \
> > +	.supported_slices = 1
> >  
> >  #define SKL_PLATFORM \
> >  	GEN9_FEATURES, \
> > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > intel_skylake_gt4_info = {
> >  	GEN(9), \
> >  	.is_lp = 1, \
> >  	.display.has_hotplug = 1, \
> > +	.supported_slices = 1, \
> >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0),
> > \
> >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> >  	.has_64bit_reloc = 1, \
> > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > intel_coffeelake_gt3_info = {
> >  	GEN9_FEATURES, \
> >  	GEN(10), \
> >  	.ddb_size = 1024, \
> > +	.supported_slices = 1, \
> >  	.display.has_dsc = 1, \
> >  	.has_coherent_ggtt = false, \
> >  	GLK_COLORS
> > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > intel_cannonlake_info = {
> >  	}, \
> >  	GEN(11), \
> >  	.ddb_size = 2048, \
> > +	.supported_slices = 2, \
> >  	.has_logical_ring_elsq = 1, \
> >  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index a607ea520829..fba5731063d8 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -7734,9 +7734,11 @@ enum {
> >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> >  #define  DISP_IPC_ENABLE		(1 << 3)
> > -#define DBUF_CTL	_MMIO(0x45008)
> > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > +#define DBUF_S1_BIT			BIT(0)
> > +#define DBUF_S2_BIT			BIT(1)
> 
> These are not bits inside DBUF_CTL so they should not be here.
> I'd just nuke them entirely. If you think we need more than an
> int to refer to a slice then IMO something simple like
> 
> enum dbuf_slice {
> 	DBUF_S1,
> 	DBUF_S2,
> };
> 
> would match the general apporach we use for other things much better
> rather than defining some shifted values.

We need to combine slices - so bitmask is quite suitable for that, 
I can of course use this enum, but then I'll have to re-encode it 
to bitmask anyways.


> 
> > +#define DBUF_CTL			_MMIO(0x45008)
> > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> >  #define  DBUF_POWER_REQUEST		(1 << 31)
> >  #define  DBUF_POWER_STATE		(1 << 30)
> >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > b/drivers/gpu/drm/i915/intel_device_info.h
> > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > @@ -180,6 +180,7 @@ struct intel_device_info {
> >  	} display;
> >  
> >  	u16 ddb_size; /* in blocks */
> > +	u8 supported_slices; /* number of DBuf slices */
> >  
> >  	/* Register offsets for the various display pipes and
> > transcoders */
> >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 2d389e437e87..1ea81ab92429 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device
> > *dev)
> >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> >  }
> >  
> > -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private
> > *dev_priv)
> > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > *dev_priv)
> >  {
> > -	u8 enabled_slices;
> > -
> > -	/* Slice 1 will always be enabled */
> > -	enabled_slices = 1;
> > +	u8 enabled_slices = 0;
> >  
> >  	/* Gen prior to GEN11 have only one DBuf slice */
> >  	if (INTEL_GEN(dev_priv) < 11)
> > -		return enabled_slices;
> > +		return DBUF_S1_BIT;
> 
> I'd nuke this special case entirely and just read everything from the
> hw. Assuming we need readout somewhere.

We have it that way hardcoded right now in drm-tip. I think this is a
subject to other patch, but not this change. As I understand we need 
to first figure out how to do it properly. 
This patch anyway makes it better by at least removing hardcoded
relation between number of slices and which slices are enabled.
With current code there is no way for example to enable second DBuf
slice only, which is required by BSpec.

> 
> >  
> > -	/*
> > -	 * FIXME: for now we'll only ever use 1 slice; pretend that we
> > have
> > -	 * only that 1 slice enabled until we have a proper way for on-
> > demand
> > -	 * toggling of the second slice.
> > -	 */
> > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > -		enabled_slices++;
> > +	/* Check if second DBuf slice is enabled */
> > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S1_BIT;
> > +
> > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S2_BIT;
> >  
> >  	return enabled_slices;
> >  }
> > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > drm_i915_private *dev_priv,
> >  			      const int num_active,
> >  			      struct skl_ddb_allocation *ddb)
> >  {
> > -	const struct drm_display_mode *adjusted_mode;
> > -	u64 total_data_bw;
> >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > -
> >  	WARN_ON(ddb_size == 0);
> >  
> >  	if (INTEL_GEN(dev_priv) < 11)
> >  		return ddb_size - 4; /* 4 blocks for bypass path
> > allocation */
> >  
> > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > -	total_data_bw = total_data_rate *
> > drm_mode_vrefresh(adjusted_mode);
> > +	return ddb_size;
> > +}
> >  
> > -	/*
> > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > -	 *
> > -	 * FIXME dbuf slice code is broken:
> > -	 * - must wait for planes to stop using the slice before
> > powering it off
> > -	 * - plane straddling both slices is illegal in multi-pipe
> > scenarios
> > -	 * - should validate we stay within the hw bandwidth limits
> > -	 */
> > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> > -		ddb->enabled_slices = 2;
> > -	} else {
> > -		ddb->enabled_slices = 1;
> > -		ddb_size /= 2;
> > -	}
> > +/*
> > + * Calculate initial DBuf slice offset, based on slice size
> > + * and mask(i.e if slice size is 1024 and second slice is enabled
> > + * offset would be 1024)
> > + */
> > +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > +					   u32 slice_size, u32
> > ddb_size)
> > +{
> > +	u32 offset = 0;
> >  
> > -	return ddb_size;
> > +	if (!dbuf_slice_mask)
> > +		return 0;
> > +
> > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > +
> > +	WARN_ON(offset >= ddb_size);
> > +	return offset;
> >  }
> >  
> > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state);
> > +
> >  static void
> >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > *dev_priv,
> >  				   const struct intel_crtc_state
> > *crtc_state,
> > @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
> >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> >  	u16 ddb_size;
> > +	u32 ddb_range_size;
> >  	u32 i;
> > +	u32 dbuf_slice_mask;
> > +	u32 active_pipes;
> > +	u32 offset;
> > +	u32 slice_size;
> > +	u32 total_slice_mask;
> > +	u32 start, end;
> >  
> >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> >  		alloc->start = 0;
> > @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > -	if (intel_state->active_pipe_changes)
> > +	if (intel_state->active_pipe_changes) {
> >  		*num_active = hweight8(intel_state->active_pipes);
> > -	else
> > +		active_pipes = intel_state->active_pipes;
> > +	} else {
> >  		*num_active = hweight8(dev_priv->active_pipes);
> > +		active_pipes = dev_priv->active_pipes;
> > +	}
> >  
> >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > total_data_rate,
> >  				      *num_active, ddb);
> >  
> > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > +
> > +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> > +
> > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > +
> >  	/*
> >  	 * If the state doesn't change the active CRTC's or there is no
> >  	 * modeset request, then there's no need to recalculate;
> > @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > +	/*
> > +	 * Get allowed DBuf slices for correspondent pipe and platform.
> > +	 */
> > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > for_pipe,
> > +						     active_pipes,
> > crtc_state);
> 
> Quite a few redundant arguments. The function name looks alien.
> 
> skl_crtc_possible_dbuf_slices() or something perhaps?
> 
> > +
> > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> > +		      dbuf_slice_mask,
> > +		      for_pipe, active_pipes);
> > +
> > +	/*
> > +	 * Figure out at which DBuf slice we start, i.e if we start at
> > Dbuf S2
> > +	 * and slice size is 1024, the offset would be 1024
> > +	 */
> > +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > +						 slice_size, ddb_size);
> > +
> > +	/*
> > +	 * Figure out total size of allowed DBuf slices, which is
> > basically
> > +	 * a number of allowed slices for that pipe multiplied by slice
> > size.
> > +	 * Inside of this
> > +	 * range ddb entries are still allocated in proportion to
> > display width.
> > +	 */
> > +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> > +
> >  	/*
> >  	 * Watermark/ddb requirement highly depends upon width of the
> >  	 * framebuffer, So instead of allocating DDB equally among
> > pipes
> >  	 * distribute DDB based on resolution/width of the display.
> >  	 */
> > +	total_slice_mask = dbuf_slice_mask;
> 
> Isn't this already taken care by the loop below?
> 
> >  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state,
> > i) {
> >  		const struct drm_display_mode *adjusted_mode =
> >  			&crtc_state->hw.adjusted_mode;
> >  		enum pipe pipe = crtc->pipe;
> >  		int hdisplay, vdisplay;
> > +		u32 pipe_dbuf_slice_mask = \
> > +			i915_get_allowed_dbuf_mask(dev_priv,
> > +						pipe,
> > +						active_pipes,
> > +						crtc_state);
> >  
> >  		if (!crtc_state->hw.enable)
> >  			continue;
> >  
> > +		/*
> > +		 * According to BSpec pipe can share one dbuf slice
> > with another
> > +		 * pipes or pipe can use multiple dbufs, in both cases
> > we
> > +		 * account for other pipes only if they have exactly
> > same mask.
> > +		 * However we need to account how many slices we should
> > enable
> > +		 * in total.
> > +		 */
> > +		total_slice_mask |= pipe_dbuf_slice_mask;
> 
> total_slice_mask will now account only the crtcs in the state. What
> happens to the other pipes' slices?

Yes we need to iterate through all pipes/crtcs here, however
for getting correct dbuf slice mask most important is active_pipes.
However yes, I will change the loop for it to be more correct.

> 
> > +
> > +		/*
> > +		 * Do not account pipes using other slice sets
> > +		 * luckily as of current BSpec slice sets do not
> > partially
> > +		 * intersect(pipes share either same one slice or same
> > slice set
> > +		 * i.e no partial intersection), so it is enough to
> > check for
> > +		 * equality for now.
> > +		 */
> > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > +			continue;
> > +
> >  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay,
> > &vdisplay);
> >  		total_width += hdisplay;
> >  
> > @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  			pipe_width = hdisplay;
> >  	}
> >  
> > -	alloc->start = ddb_size * width_before_pipe / total_width;
> > -	alloc->end = ddb_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +	ddb->enabled_slices = total_slice_mask;
> > +
> > +	start = ddb_range_size * width_before_pipe / total_width;
> > +	end = ddb_range_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +
> > +	alloc->start = offset + start;
> > +	alloc->end = offset + end;
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > +		      alloc->start, alloc->end);
> > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> > +		      ddb->enabled_slices,
> > +		      INTEL_INFO(dev_priv)->supported_slices);
> >  }
> >  
> >  static int skl_compute_wm_params(const struct intel_crtc_state
> > *crtc_state,
> > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > intel_crtc *crtc,
> >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> >  			  struct skl_ddb_allocation *ddb /* out */)
> >  {
> > -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> > +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > >enabled_slices);
> >  }
> >  
> >  /*
> > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > intel_crtc_state *crtc_state,
> >  	return total_data_rate;
> >  }
> >  
> > +static uint_fixed_16_16_t
> > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	u32 src_w, src_h, dst_w, dst_h;
> > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > +	const struct drm_display_mode *adjusted_mode = &crtc_state-
> > >hw.adjusted_mode;
> > +
> > +	src_w = crtc_state->pipe_src_w;
> > +	src_h = crtc_state->pipe_src_h;
> > +	dst_w = adjusted_mode->crtc_hdisplay;
> > +	dst_h = adjusted_mode->crtc_vdisplay;
> > +
> > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> > +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> > +
> > +	return mul_fixed16(downscale_w, downscale_h);
> > +}
> > +
> > +uint_fixed_16_16_t
> > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > +{
> > +	struct drm_plane *plane;
> > +	const struct drm_plane_state *drm_plane_state;
> > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > +
> > +	if (!crtc_state->uapi.enable)
> > +		return max_downscale;
> > +
> > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > drm_plane_state, &crtc_state->uapi) {
> > +		uint_fixed_16_16_t plane_downscale;
> > +		const struct intel_plane_state *plane_state =
> > +			to_intel_plane_state(drm_plane_state);
> > +
> > +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> > +			continue;
> > +
> > +		plane_downscale =
> > skl_plane_downscale_amount(crtc_state, plane_state);
> > +
> > +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> > +			      to_intel_plane(plane)->id,
> > +			      plane_downscale.val >> 16,
> > +			      plane_downscale.val & 0xffff);
> > +
> > +		max_downscale = max_fixed16(plane_downscale,
> > max_downscale);
> > +	}
> > +
> > +
> > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > +
> > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > +		       crtc->pipe, pipe_downscale.val >> 16,
> > +		       pipe_downscale.val & 0xffff);
> > +
> > +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> > +
> > +	/*
> > +	 * Convert result to percentage: get the actual ratio instead
> > of rate,
> > +	 * then multiply by 100 to get percentage.
> > +	 */
> > +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1,
> > pipe_downscale));
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> > +
> > +	return pipe_ratio;
> > +}
> 
> Yuck. I just killed these.

I still need pipe_ratio fro ICL as you know, this is dictated by BSpec,
not by me.

> 
> > +
> > +struct dbuf_slice_conf_entry {
> > +	u32 dbuf_mask[I915_MAX_PIPES];
> > +	u32 active_pipes;
> > +};
> > +
> > +
> > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> > +					  BIT(PIPE_C) | BIT(PIPE_D) }
> > +
> > +/*
> > + * Table taken from Bspec 12716
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +};
> > +
> > +/*
> > + * Table taken from Bspec 49255
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT,
> > DBUF_S2_BIT),
> > +};
> 
> My eyes!
> 
> I have to think we should be able to reduce all that to a handful
> of lines of code.

Yeah, then we are going to have a huge function with lots of weird
definitions, unfortunately BSpec table has quite strange DBuf
assignments like 
+ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
+ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),

i.e slices can get mixed in a quite various ways. There is no
rational pattern in those and could be even dangerous to try to 
optimize it some way, as one day it might change again in some
unpredictable way.

Would you prefer to have it like

if (pipes are A and B)
    program S2 to A and S1 to B
if (pipes are A and C)
    program S1 to A and S2 to C
...

I would prefer at least to see it in a comparable way with the
table we have in BSpec, rather than having lots of weird looking
if-else statements, GEN checks and so on.

I knew this table was not going to look like it is done typically
here - but really my opinion that it is way better than hardcoding
it into some weird algorithm, which would be hard to compare to 
initial table, which is already strange enough.

If you don't like it - could you please explain why this is exactly
worse than having long functions with hardcoded checks?

I think this is pretty stupid to fix stuff which looks new or unusual
just because somebody doesn't like it, without even constructive
arguing.

- Stan

> 
> 
> > +
> > +/*
> > + * This function
> 
> "This function" is totally redundant. Ditto elsewhere.
> 
> > finds an entry with same enabled pipe configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +	uint_fixed_16_16_t pipe_ratio;
> > +
> > +	/*
> > +	 * Calculate pipe ratio as stated in BSpec 28692
> > +	 */
> > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > +
> > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > +		if (icl_allowed_dbufs[i].active_pipes == active_pipes)
> > {
> > +			if (hweight32(active_pipes) == 1) {
> > +				/*
> > +				 * According to BSpec 12716: if ratio
> > >= 88.8
> > +				 * always use single dbuf slice.
> > +				 */
> > +				if (pipe_ratio.val >= div_fixed16(888,
> > 10).val)
> > +					++i;
> > +			}
> > +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> > +		}
> > +	}
> > +	return 0;
> > +}
> > +
> > +/*
> > + * This function finds an entry with same enabled pipe
> > configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> > +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > +	}
> > +	return 0;
> > +}
> > +
> > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	if (IS_GEN(dev_priv, 11))
> > +		return icl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	else if (IS_GEN(dev_priv, 12))
> > +		return tgl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	/*
> > +	 * For anything else just return one slice yet.
> > +	 * Should be extended for other platforms.
> > +	 */
> > +	return DBUF_S1_BIT;
> > +}
> > +
> >  static u64
> >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > *crtc_state,
> >  				 u64 *plane_data_rate)
> > -- 
> > 2.17.1
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18  9:19     ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18  9:19 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy wrote:
> > Also implemented algorithm for choosing DBuf slice configuration
> > based on active pipes, pipe ratio as stated in BSpec 12716.
> > 
> > Now pipe allocation still stays proportional to pipe width as
> > before,
> > however within allowed DBuf slice for this particular
> > configuration.
> > 
> > v2: Remove unneeded check from commit as ddb enabled slices might
> >     now differ from hw state.
> > 
> > v3: - Added new field "supported_slices" to ddb to separate max
> >       supported slices vs currently enabled, to avoid confusion.
> >     - Removed obsolete comments and code related to DBuf(Matthew
> > Roper).
> >     - Some code style and long line removal(Matthew Roper).
> >     - Added WARN_ON to new ddb range offset calc function(Matthew
> > Roper).
> >     - Removed platform specific call to calc pipe ratio from ddb
> >       allocation function and fixed the return value(Matthew Roper)
> >     - Refactored DBUF slice allocation table to improve readability
> >     - Added DBUF slice allocation for TGL as well.
> >     - ICL(however not TGL) seems to voluntarily enable second DBuf
> > slice
> >       after pm suspend/resume causing a mismatch failure, because
> > we
> >       update DBuf slices only if we do a modeset, however this
> > check
> >       is done always. Fixed it to be done only when modeset for
> > ICL.
> > 
> > v4: - Now enabled slices is not actually a number, but a bitmask,
> >       because we might need to enable second slice only and number
> >       of slices would still 1 and that current code doesn't allow.
> >     - Remove redundant duplicate code to have some unified way of
> >       enabling dbuf slices instead of hardcoding.
> > 
> > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > thinking
> >       that we have only one DBUF_CTL slice. Now another version for
> >       gen11 will check both slices as only second one can be
> > enabled,
> >       to keep CI happy.
> > 
> > v6: - Removed enabled dbuf assertion completely. Previous code
> >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > called.
> >       Currently we enable/disable dbuf slices based on requrement
> >       so no point in asserting this here.
> >     - Removed unnecessary modeset check from
> > verify_wm_state(Matthew Roper)
> >     - Slices intersection after union is same as final
> > result(Matthew Roper)
> >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> >     - Some macros added(Matthew Roper)
> >     - ddb range is now always less or equal to ddb size - no need
> > for additional
> >       checks(previously needed as we had some bandwidth checks in
> > there which
> >       could "suddenly" return ddb_size smaller than it is.
> > 
> > v7: Minor costemic changes:
> >     - Changed icl_dbuf_slices_restore name to
> > icl_program_dbuf_slices
> >       as it more corresponds to the actual functionality.
> >     - Some simplification with supported slices for BXT and GLK
> >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> >       this is not used for skl anymore.
> >     - Changed DBuf slice assignment order for TGL case
> > 
> > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > Cc: James Ausmus <james.ausmus@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> >  .../drm/i915/display/intel_display_power.h    |   2 +
> >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > ++++++++++++++++--
> >  9 files changed, 431 insertions(+), 104 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > b/drivers/gpu/drm/i915/display/intel_display.c
> > index 876fc25968bf..bd7aff675198 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > intel_crtc *crtc,
> >  
> >  	if (INTEL_GEN(dev_priv) >= 11 &&
> >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got
> > %u)\n",
> > +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got
> > %x)\n",
> >  			  sw_ddb->enabled_slices,
> >  			  hw->ddb.enabled_slices);
> >  
> > @@ -14614,15 +14614,24 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> >  	u8 required_slices = state->wm_results.ddb.enabled_slices;
> >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > +	u8 slices_union = hw_enabled_slices | required_slices;
> > +	u8 slices_intersection = required_slices;
> >  
> >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > old_crtc_state, new_crtc_state, i)
> >  		/* ignore allocations for crtc's that have been turned
> > off. */
> >  		if (new_crtc_state->hw.active)
> >  			entries[i] = old_crtc_state->wm.skl.ddb;
> >  
> > -	/* If 2nd DBuf slice required, enable it here */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > +		      required_slices, hw_enabled_slices);
> > +
> > +	/*
> > +	 * If some other DBuf slice required, enable it here,
> > +	 * as here we only add more slices, but not remove to prevent
> > +	 * issues if somebody still uses those.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_union);
> >  
> >  	/*
> >  	 * Whenever the number of active pipes changes, we need to make
> > sure we
> > @@ -14681,9 +14690,12 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  		}
> >  	} while (progress);
> >  
> > -	/* If 2nd DBuf slice is no more required disable it */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	/*
> > +	 * If some other DBuf slice is not needed, disable it here,
> > +	 * as here we only remove slices, which are not needed anymore.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_intersection);
> 
> This is still a totally wrong place for this. We need to be sure the
> planes have switched over to the new ddb layout before we power off
> anything.

I have just modified already existing line here, not added by me.
The point of this patch is starting to use second DBuf slice but
not fixing all kinds of issues related to DBuf code at the same time.

Of course this is nice idea to fix it - if that is wrong plane,
but current code somehow lives with it and probably this should be
with some other patch.

> 
> >  }
> >  
> >  static void intel_atomic_helper_free_state(struct drm_i915_private
> > *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > index ce1b64f4dd44..a78bc9b61862 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > @@ -1031,15 +1031,6 @@ static bool
> > gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
> >  		(I915_READ(DC_STATE_EN) &
> > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> >  }
> >  
> > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > *dev_priv)
> > -{
> > -	u32 tmp = I915_READ(DBUF_CTL);
> > -
> > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> > -}
> > -
> >  static void gen9_disable_dc_states(struct drm_i915_private
> > *dev_priv)
> >  {
> >  	struct intel_cdclk_state cdclk_state = {};
> > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > drm_i915_private *dev_priv)
> >  	/* Can't read out voltage_level so can't use
> > intel_cdclk_changed() */
> >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > &cdclk_state));
> >  
> > -	gen9_assert_dbuf_enabled(dev_priv);
> > -
> >  	if (IS_GEN9_LP(dev_priv))
> >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> >  
> > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > drm_i915_private *dev_priv)
> >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> >  }
> >  
> > -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> >  {
> > -	if (INTEL_GEN(dev_priv) < 11)
> > -		return 1;
> > -	return 2;
> > +	return INTEL_INFO(dev_priv)->supported_slices;
> > +}
> 
> Kinda pointless function now. 'supported_slices' is not a very good
> name. Should be 'num_dbuf_slices' or something along those lines.

Supported slices really means how much slices we support in hardware.
num_dbuf_slices will not reflect that, it will be more ambigious.
I don't like this name.

> 
> > +
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> > +{
> > +	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> 
> Seems a rather pointless variable. const at least is pointless.
> 
> > +
> > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> >  }
> >  
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices)
> >  {
> > -	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> > -	bool ret;
> > +	bool ret = true;
> > +	int i;
> > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> >  
> > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
> >  		DRM_ERROR("Invalid number of dbuf slices requested\n");
> >  		return;
> >  	}
> >  
> > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > -		return;
> > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
> >  
> > -	if (req_slices > hw_enabled_slices)
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > true);
> > -	else
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > false);
> > +	for (i = 0; i < max_slices; i++) {
> > +		int slice_bit = BIT(i);
> > +		bool slice_set = (slice_bit & req_slices) != 0;
> > +
> > +		switch (slice_bit) {
> > +		case DBUF_S1_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S1,
> > +							  slice_set);
> > +			break;
> > +		case DBUF_S2_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S2,
> > +							  slice_set);
> 
> The return value on intel_dbuf_slice_set() seems pointless. I'd nuke
> it.
> 
> Also as long you have intel_dbuf_slice_set() I'd probably just pass
> the
> slice index there and let it deal with the register details. DBUF_CTL
> should probably be parametrized as well so you don't need annoying
> switch
> statements.

What if it fails? Shouldn't we still have some clue what state do we
have/had at the moment?

> 
> > +			break;
> > +		default:
> > +			MISSING_CASE(slice_bit);
> > +		}
> > +	}
> >  
> >  	if (ret)
> >  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > drm_i915_private *dev_priv,
> >  
> >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power enable timeout\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that we only have 1 slice,
> > see
> > -		 * intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Just power up 1 slice, we will
> > +	 * figure out later which slices we have and what we need.
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> 
> That stuff really shouldn't be modified by low level functions.
> The intial value should come from readout, and after that it should
> be adjusted as part of the overall atomic state.

Well, as you can see currently this is even worse - has some weird
hardcoded stuff and is modified like that. Also if we are enabling the 
dbuf slices, shouldn't we really determine what value should be there?

Also I wouldn't really always trust the readout, as to me it seems
icelake resets it constantly to 2 after suspend/resume. 

> 
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > ~DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > ~DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power disable timeout!\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that the first slice is
> > always
> > -		 * enabled, see intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Disable all slices
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > index 1da04f3e0fb3..b7057d68ad78 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > drm_i915_private *i915,
> >  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
> >  	     intel_display_power_put_async((i915), (domain), (wf)),
> > (wf) = 0)
> >  
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices);
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
> >  
> >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> >  			     bool override, unsigned int mask);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 82e4e6bf08c3..96741e68633a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -51,6 +51,7 @@
> >  #include "display/intel_bw.h"
> >  #include "display/intel_cdclk.h"
> >  #include "display/intel_display_types.h"
> > +#include "display/intel_display_power.h"
> >  #include "display/intel_dp.h"
> >  #include "display/intel_fbdev.h"
> >  #include "display/intel_hotplug.h"
> > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > device *kdev)
> >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> >  		ret = vlv_resume_prepare(dev_priv, true);
> >  
> > +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices
> > reg */
> > +	if (INTEL_GEN(dev_priv) == 11)
> > +		icl_program_dbuf_slices(dev_priv);
> > +
> >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> >  
> >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 7e0f67babe20..a396977c9c2d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const
> > struct skl_ddb_entry *e1,
> >  }
> >  
> >  struct skl_ddb_allocation {
> 
> This struct is quite pointless. Should be killed off.

Legacy stuff, not added by me. Quite fine with removing it - 
but should we really do all this at the same time and in same
patch? 

> 
> > -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> > +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf
> > slices */
> >  };
> >  
> >  struct skl_ddb_values {
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > b/drivers/gpu/drm/i915/i915_pci.c
> > index 1bb701d32a5d..8e86af505730 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > intel_cherryview_info = {
> >  	.has_gt_uc = 1, \
> >  	.display.has_hdcp = 1, \
> >  	.display.has_ipc = 1, \
> > -	.ddb_size = 896
> > +	.ddb_size = 896, \
> > +	.supported_slices = 1
> >  
> >  #define SKL_PLATFORM \
> >  	GEN9_FEATURES, \
> > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > intel_skylake_gt4_info = {
> >  	GEN(9), \
> >  	.is_lp = 1, \
> >  	.display.has_hotplug = 1, \
> > +	.supported_slices = 1, \
> >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0),
> > \
> >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> >  	.has_64bit_reloc = 1, \
> > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > intel_coffeelake_gt3_info = {
> >  	GEN9_FEATURES, \
> >  	GEN(10), \
> >  	.ddb_size = 1024, \
> > +	.supported_slices = 1, \
> >  	.display.has_dsc = 1, \
> >  	.has_coherent_ggtt = false, \
> >  	GLK_COLORS
> > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > intel_cannonlake_info = {
> >  	}, \
> >  	GEN(11), \
> >  	.ddb_size = 2048, \
> > +	.supported_slices = 2, \
> >  	.has_logical_ring_elsq = 1, \
> >  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index a607ea520829..fba5731063d8 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -7734,9 +7734,11 @@ enum {
> >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> >  #define  DISP_IPC_ENABLE		(1 << 3)
> > -#define DBUF_CTL	_MMIO(0x45008)
> > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > +#define DBUF_S1_BIT			BIT(0)
> > +#define DBUF_S2_BIT			BIT(1)
> 
> These are not bits inside DBUF_CTL so they should not be here.
> I'd just nuke them entirely. If you think we need more than an
> int to refer to a slice then IMO something simple like
> 
> enum dbuf_slice {
> 	DBUF_S1,
> 	DBUF_S2,
> };
> 
> would match the general apporach we use for other things much better
> rather than defining some shifted values.

We need to combine slices - so bitmask is quite suitable for that, 
I can of course use this enum, but then I'll have to re-encode it 
to bitmask anyways.


> 
> > +#define DBUF_CTL			_MMIO(0x45008)
> > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> >  #define  DBUF_POWER_REQUEST		(1 << 31)
> >  #define  DBUF_POWER_STATE		(1 << 30)
> >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > b/drivers/gpu/drm/i915/intel_device_info.h
> > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > @@ -180,6 +180,7 @@ struct intel_device_info {
> >  	} display;
> >  
> >  	u16 ddb_size; /* in blocks */
> > +	u8 supported_slices; /* number of DBuf slices */
> >  
> >  	/* Register offsets for the various display pipes and
> > transcoders */
> >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 2d389e437e87..1ea81ab92429 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device
> > *dev)
> >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> >  }
> >  
> > -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private
> > *dev_priv)
> > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > *dev_priv)
> >  {
> > -	u8 enabled_slices;
> > -
> > -	/* Slice 1 will always be enabled */
> > -	enabled_slices = 1;
> > +	u8 enabled_slices = 0;
> >  
> >  	/* Gen prior to GEN11 have only one DBuf slice */
> >  	if (INTEL_GEN(dev_priv) < 11)
> > -		return enabled_slices;
> > +		return DBUF_S1_BIT;
> 
> I'd nuke this special case entirely and just read everything from the
> hw. Assuming we need readout somewhere.

We have it that way hardcoded right now in drm-tip. I think this is a
subject to other patch, but not this change. As I understand we need 
to first figure out how to do it properly. 
This patch anyway makes it better by at least removing hardcoded
relation between number of slices and which slices are enabled.
With current code there is no way for example to enable second DBuf
slice only, which is required by BSpec.

> 
> >  
> > -	/*
> > -	 * FIXME: for now we'll only ever use 1 slice; pretend that we
> > have
> > -	 * only that 1 slice enabled until we have a proper way for on-
> > demand
> > -	 * toggling of the second slice.
> > -	 */
> > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > -		enabled_slices++;
> > +	/* Check if second DBuf slice is enabled */
> > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S1_BIT;
> > +
> > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S2_BIT;
> >  
> >  	return enabled_slices;
> >  }
> > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > drm_i915_private *dev_priv,
> >  			      const int num_active,
> >  			      struct skl_ddb_allocation *ddb)
> >  {
> > -	const struct drm_display_mode *adjusted_mode;
> > -	u64 total_data_bw;
> >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > -
> >  	WARN_ON(ddb_size == 0);
> >  
> >  	if (INTEL_GEN(dev_priv) < 11)
> >  		return ddb_size - 4; /* 4 blocks for bypass path
> > allocation */
> >  
> > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > -	total_data_bw = total_data_rate *
> > drm_mode_vrefresh(adjusted_mode);
> > +	return ddb_size;
> > +}
> >  
> > -	/*
> > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > -	 *
> > -	 * FIXME dbuf slice code is broken:
> > -	 * - must wait for planes to stop using the slice before
> > powering it off
> > -	 * - plane straddling both slices is illegal in multi-pipe
> > scenarios
> > -	 * - should validate we stay within the hw bandwidth limits
> > -	 */
> > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> > -		ddb->enabled_slices = 2;
> > -	} else {
> > -		ddb->enabled_slices = 1;
> > -		ddb_size /= 2;
> > -	}
> > +/*
> > + * Calculate initial DBuf slice offset, based on slice size
> > + * and mask(i.e if slice size is 1024 and second slice is enabled
> > + * offset would be 1024)
> > + */
> > +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > +					   u32 slice_size, u32
> > ddb_size)
> > +{
> > +	u32 offset = 0;
> >  
> > -	return ddb_size;
> > +	if (!dbuf_slice_mask)
> > +		return 0;
> > +
> > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > +
> > +	WARN_ON(offset >= ddb_size);
> > +	return offset;
> >  }
> >  
> > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state);
> > +
> >  static void
> >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > *dev_priv,
> >  				   const struct intel_crtc_state
> > *crtc_state,
> > @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
> >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> >  	u16 ddb_size;
> > +	u32 ddb_range_size;
> >  	u32 i;
> > +	u32 dbuf_slice_mask;
> > +	u32 active_pipes;
> > +	u32 offset;
> > +	u32 slice_size;
> > +	u32 total_slice_mask;
> > +	u32 start, end;
> >  
> >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> >  		alloc->start = 0;
> > @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > -	if (intel_state->active_pipe_changes)
> > +	if (intel_state->active_pipe_changes) {
> >  		*num_active = hweight8(intel_state->active_pipes);
> > -	else
> > +		active_pipes = intel_state->active_pipes;
> > +	} else {
> >  		*num_active = hweight8(dev_priv->active_pipes);
> > +		active_pipes = dev_priv->active_pipes;
> > +	}
> >  
> >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > total_data_rate,
> >  				      *num_active, ddb);
> >  
> > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > +
> > +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> > +
> > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > +
> >  	/*
> >  	 * If the state doesn't change the active CRTC's or there is no
> >  	 * modeset request, then there's no need to recalculate;
> > @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > +	/*
> > +	 * Get allowed DBuf slices for correspondent pipe and platform.
> > +	 */
> > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > for_pipe,
> > +						     active_pipes,
> > crtc_state);
> 
> Quite a few redundant arguments. The function name looks alien.
> 
> skl_crtc_possible_dbuf_slices() or something perhaps?
> 
> > +
> > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> > +		      dbuf_slice_mask,
> > +		      for_pipe, active_pipes);
> > +
> > +	/*
> > +	 * Figure out at which DBuf slice we start, i.e if we start at
> > Dbuf S2
> > +	 * and slice size is 1024, the offset would be 1024
> > +	 */
> > +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > +						 slice_size, ddb_size);
> > +
> > +	/*
> > +	 * Figure out total size of allowed DBuf slices, which is
> > basically
> > +	 * a number of allowed slices for that pipe multiplied by slice
> > size.
> > +	 * Inside of this
> > +	 * range ddb entries are still allocated in proportion to
> > display width.
> > +	 */
> > +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> > +
> >  	/*
> >  	 * Watermark/ddb requirement highly depends upon width of the
> >  	 * framebuffer, So instead of allocating DDB equally among
> > pipes
> >  	 * distribute DDB based on resolution/width of the display.
> >  	 */
> > +	total_slice_mask = dbuf_slice_mask;
> 
> Isn't this already taken care by the loop below?
> 
> >  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state,
> > i) {
> >  		const struct drm_display_mode *adjusted_mode =
> >  			&crtc_state->hw.adjusted_mode;
> >  		enum pipe pipe = crtc->pipe;
> >  		int hdisplay, vdisplay;
> > +		u32 pipe_dbuf_slice_mask = \
> > +			i915_get_allowed_dbuf_mask(dev_priv,
> > +						pipe,
> > +						active_pipes,
> > +						crtc_state);
> >  
> >  		if (!crtc_state->hw.enable)
> >  			continue;
> >  
> > +		/*
> > +		 * According to BSpec pipe can share one dbuf slice
> > with another
> > +		 * pipes or pipe can use multiple dbufs, in both cases
> > we
> > +		 * account for other pipes only if they have exactly
> > same mask.
> > +		 * However we need to account how many slices we should
> > enable
> > +		 * in total.
> > +		 */
> > +		total_slice_mask |= pipe_dbuf_slice_mask;
> 
> total_slice_mask will now account only the crtcs in the state. What
> happens to the other pipes' slices?

Yes we need to iterate through all pipes/crtcs here, however
for getting correct dbuf slice mask most important is active_pipes.
However yes, I will change the loop for it to be more correct.

> 
> > +
> > +		/*
> > +		 * Do not account pipes using other slice sets
> > +		 * luckily as of current BSpec slice sets do not
> > partially
> > +		 * intersect(pipes share either same one slice or same
> > slice set
> > +		 * i.e no partial intersection), so it is enough to
> > check for
> > +		 * equality for now.
> > +		 */
> > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > +			continue;
> > +
> >  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay,
> > &vdisplay);
> >  		total_width += hdisplay;
> >  
> > @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  			pipe_width = hdisplay;
> >  	}
> >  
> > -	alloc->start = ddb_size * width_before_pipe / total_width;
> > -	alloc->end = ddb_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +	ddb->enabled_slices = total_slice_mask;
> > +
> > +	start = ddb_range_size * width_before_pipe / total_width;
> > +	end = ddb_range_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +
> > +	alloc->start = offset + start;
> > +	alloc->end = offset + end;
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > +		      alloc->start, alloc->end);
> > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> > +		      ddb->enabled_slices,
> > +		      INTEL_INFO(dev_priv)->supported_slices);
> >  }
> >  
> >  static int skl_compute_wm_params(const struct intel_crtc_state
> > *crtc_state,
> > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > intel_crtc *crtc,
> >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> >  			  struct skl_ddb_allocation *ddb /* out */)
> >  {
> > -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> > +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > >enabled_slices);
> >  }
> >  
> >  /*
> > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > intel_crtc_state *crtc_state,
> >  	return total_data_rate;
> >  }
> >  
> > +static uint_fixed_16_16_t
> > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	u32 src_w, src_h, dst_w, dst_h;
> > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > +	const struct drm_display_mode *adjusted_mode = &crtc_state-
> > >hw.adjusted_mode;
> > +
> > +	src_w = crtc_state->pipe_src_w;
> > +	src_h = crtc_state->pipe_src_h;
> > +	dst_w = adjusted_mode->crtc_hdisplay;
> > +	dst_h = adjusted_mode->crtc_vdisplay;
> > +
> > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> > +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> > +
> > +	return mul_fixed16(downscale_w, downscale_h);
> > +}
> > +
> > +uint_fixed_16_16_t
> > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > +{
> > +	struct drm_plane *plane;
> > +	const struct drm_plane_state *drm_plane_state;
> > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > +
> > +	if (!crtc_state->uapi.enable)
> > +		return max_downscale;
> > +
> > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > drm_plane_state, &crtc_state->uapi) {
> > +		uint_fixed_16_16_t plane_downscale;
> > +		const struct intel_plane_state *plane_state =
> > +			to_intel_plane_state(drm_plane_state);
> > +
> > +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> > +			continue;
> > +
> > +		plane_downscale =
> > skl_plane_downscale_amount(crtc_state, plane_state);
> > +
> > +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> > +			      to_intel_plane(plane)->id,
> > +			      plane_downscale.val >> 16,
> > +			      plane_downscale.val & 0xffff);
> > +
> > +		max_downscale = max_fixed16(plane_downscale,
> > max_downscale);
> > +	}
> > +
> > +
> > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > +
> > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > +		       crtc->pipe, pipe_downscale.val >> 16,
> > +		       pipe_downscale.val & 0xffff);
> > +
> > +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> > +
> > +	/*
> > +	 * Convert result to percentage: get the actual ratio instead
> > of rate,
> > +	 * then multiply by 100 to get percentage.
> > +	 */
> > +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1,
> > pipe_downscale));
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> > +
> > +	return pipe_ratio;
> > +}
> 
> Yuck. I just killed these.

I still need pipe_ratio fro ICL as you know, this is dictated by BSpec,
not by me.

> 
> > +
> > +struct dbuf_slice_conf_entry {
> > +	u32 dbuf_mask[I915_MAX_PIPES];
> > +	u32 active_pipes;
> > +};
> > +
> > +
> > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> > +					  BIT(PIPE_C) | BIT(PIPE_D) }
> > +
> > +/*
> > + * Table taken from Bspec 12716
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +};
> > +
> > +/*
> > + * Table taken from Bspec 49255
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT,
> > DBUF_S2_BIT),
> > +};
> 
> My eyes!
> 
> I have to think we should be able to reduce all that to a handful
> of lines of code.

Yeah, then we are going to have a huge function with lots of weird
definitions, unfortunately BSpec table has quite strange DBuf
assignments like 
+ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
+ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),

i.e slices can get mixed in a quite various ways. There is no
rational pattern in those and could be even dangerous to try to 
optimize it some way, as one day it might change again in some
unpredictable way.

Would you prefer to have it like

if (pipes are A and B)
    program S2 to A and S1 to B
if (pipes are A and C)
    program S1 to A and S2 to C
...

I would prefer at least to see it in a comparable way with the
table we have in BSpec, rather than having lots of weird looking
if-else statements, GEN checks and so on.

I knew this table was not going to look like it is done typically
here - but really my opinion that it is way better than hardcoding
it into some weird algorithm, which would be hard to compare to 
initial table, which is already strange enough.

If you don't like it - could you please explain why this is exactly
worse than having long functions with hardcoded checks?

I think this is pretty stupid to fix stuff which looks new or unusual
just because somebody doesn't like it, without even constructive
arguing.

- Stan

> 
> 
> > +
> > +/*
> > + * This function
> 
> "This function" is totally redundant. Ditto elsewhere.
> 
> > finds an entry with same enabled pipe configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +	uint_fixed_16_16_t pipe_ratio;
> > +
> > +	/*
> > +	 * Calculate pipe ratio as stated in BSpec 28692
> > +	 */
> > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > +
> > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > +		if (icl_allowed_dbufs[i].active_pipes == active_pipes)
> > {
> > +			if (hweight32(active_pipes) == 1) {
> > +				/*
> > +				 * According to BSpec 12716: if ratio
> > >= 88.8
> > +				 * always use single dbuf slice.
> > +				 */
> > +				if (pipe_ratio.val >= div_fixed16(888,
> > 10).val)
> > +					++i;
> > +			}
> > +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> > +		}
> > +	}
> > +	return 0;
> > +}
> > +
> > +/*
> > + * This function finds an entry with same enabled pipe
> > configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> > +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > +	}
> > +	return 0;
> > +}
> > +
> > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	if (IS_GEN(dev_priv, 11))
> > +		return icl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	else if (IS_GEN(dev_priv, 12))
> > +		return tgl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	/*
> > +	 * For anything else just return one slice yet.
> > +	 * Should be extended for other platforms.
> > +	 */
> > +	return DBUF_S1_BIT;
> > +}
> > +
> >  static u64
> >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > *crtc_state,
> >  				 u64 *plane_data_rate)
> > -- 
> > 2.17.1
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 14:22   ` Jani Nikula
  0 siblings, 0 replies; 34+ messages in thread
From: Jani Nikula @ 2019-11-18 14:22 UTC (permalink / raw)
  To: Stanislav Lisovskiy, intel-gfx

On Thu, 14 Nov 2019, Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> wrote:
> Also implemented algorithm for choosing DBuf slice configuration
> based on active pipes, pipe ratio as stated in BSpec 12716.
>
> Now pipe allocation still stays proportional to pipe width as before,
> however within allowed DBuf slice for this particular configuration.
>
> v2: Remove unneeded check from commit as ddb enabled slices might
>     now differ from hw state.
>
> v3: - Added new field "supported_slices" to ddb to separate max
>       supported slices vs currently enabled, to avoid confusion.
>     - Removed obsolete comments and code related to DBuf(Matthew Roper).
>     - Some code style and long line removal(Matthew Roper).
>     - Added WARN_ON to new ddb range offset calc function(Matthew Roper).
>     - Removed platform specific call to calc pipe ratio from ddb
>       allocation function and fixed the return value(Matthew Roper)
>     - Refactored DBUF slice allocation table to improve readability
>     - Added DBUF slice allocation for TGL as well.
>     - ICL(however not TGL) seems to voluntarily enable second DBuf slice
>       after pm suspend/resume causing a mismatch failure, because we
>       update DBuf slices only if we do a modeset, however this check
>       is done always. Fixed it to be done only when modeset for ICL.
>
> v4: - Now enabled slices is not actually a number, but a bitmask,
>       because we might need to enable second slice only and number
>       of slices would still 1 and that current code doesn't allow.
>     - Remove redundant duplicate code to have some unified way of
>       enabling dbuf slices instead of hardcoding.
>
> v5: - Fix failing gen9_assert_dbuf_enabled as it was naively thinking
>       that we have only one DBUF_CTL slice. Now another version for
>       gen11 will check both slices as only second one can be enabled,
>       to keep CI happy.
>
> v6: - Removed enabled dbuf assertion completely. Previous code
>       was keeping dbuf enabled, even(!) when _dbuf_disable is called.
>       Currently we enable/disable dbuf slices based on requrement
>       so no point in asserting this here.
>     - Removed unnecessary modeset check from verify_wm_state(Matthew Roper)
>     - Slices intersection after union is same as final result(Matthew Roper)
>     - Moved DBuf slices to intel_device_info(Matthew Roper)
>     - Some macros added(Matthew Roper)
>     - ddb range is now always less or equal to ddb size - no need for additional
>       checks(previously needed as we had some bandwidth checks in there which
>       could "suddenly" return ddb_size smaller than it is.
>
> v7: Minor costemic changes:
>     - Changed icl_dbuf_slices_restore name to icl_program_dbuf_slices
>       as it more corresponds to the actual functionality.
>     - Some simplification with supported slices for BXT and GLK
>     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
>       this is not used for skl anymore.
>     - Changed DBuf slice assignment order for TGL case
>
> Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> Cc: Matthew Roper <matthew.d.roper@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@intel.com>
> Cc: James Ausmus <james.ausmus@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
>  .../drm/i915/display/intel_display_power.c    |  98 ++---
>  .../drm/i915/display/intel_display_power.h    |   2 +
>  drivers/gpu/drm/i915/i915_drv.c               |   5 +
>  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
>  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
>  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
>  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
>  drivers/gpu/drm/i915/intel_pm.c               | 387 ++++++++++++++++--
>  9 files changed, 431 insertions(+), 104 deletions(-)

This is a rather big patch, and that's one of the few things we can't
fix after committing.

If there's a bug, by our rate of change this turns practically
un-revertible in a matter of weeks due to conflicts. If there's a
bisected regression, and this is the regressing commit, what are our
chances of identifying the bug based on the sha1 alone? Can't revert and
can't immediately spot the bug/fix is a bad combination to have.

Also we have this nice CI infrastructure reporting about valid
checkpatch and sparse issues for several versions of the patch. Please
fix.


BR,
Jani.


>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 876fc25968bf..bd7aff675198 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct intel_crtc *crtc,
>  
>  	if (INTEL_GEN(dev_priv) >= 11 &&
>  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got %u)\n",
> +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got %x)\n",
>  			  sw_ddb->enabled_slices,
>  			  hw->ddb.enabled_slices);
>  
> @@ -14614,15 +14614,24 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
>  	u8 required_slices = state->wm_results.ddb.enabled_slices;
>  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> +	u8 slices_union = hw_enabled_slices | required_slices;
> +	u8 slices_intersection = required_slices;
>  
>  	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i)
>  		/* ignore allocations for crtc's that have been turned off. */
>  		if (new_crtc_state->hw.active)
>  			entries[i] = old_crtc_state->wm.skl.ddb;
>  
> -	/* If 2nd DBuf slice required, enable it here */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices > hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> +		      required_slices, hw_enabled_slices);
> +
> +	/*
> +	 * If some other DBuf slice required, enable it here,
> +	 * as here we only add more slices, but not remove to prevent
> +	 * issues if somebody still uses those.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_union);
>  
>  	/*
>  	 * Whenever the number of active pipes changes, we need to make sure we
> @@ -14681,9 +14690,12 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  		}
>  	} while (progress);
>  
> -	/* If 2nd DBuf slice is no more required disable it */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices < hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	/*
> +	 * If some other DBuf slice is not needed, disable it here,
> +	 * as here we only remove slices, which are not needed anymore.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_intersection);
>  }
>  
>  static void intel_atomic_helper_free_state(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
> index ce1b64f4dd44..a78bc9b61862 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> @@ -1031,15 +1031,6 @@ static bool gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
>  		(I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
>  }
>  
> -static void gen9_assert_dbuf_enabled(struct drm_i915_private *dev_priv)
> -{
> -	u32 tmp = I915_READ(DBUF_CTL);
> -
> -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> -}
> -
>  static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  {
>  	struct intel_cdclk_state cdclk_state = {};
> @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  	/* Can't read out voltage_level so can't use intel_cdclk_changed() */
>  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw, &cdclk_state));
>  
> -	gen9_assert_dbuf_enabled(dev_priv);
> -
>  	if (IS_GEN9_LP(dev_priv))
>  		bxt_verify_ddi_phy_power_wells(dev_priv);
>  
> @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct drm_i915_private *dev_priv)
>  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
>  }
>  
> -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
>  {
> -	if (INTEL_GEN(dev_priv) < 11)
> -		return 1;
> -	return 2;
> +	return INTEL_INFO(dev_priv)->supported_slices;
> +}
> +
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> +{
> +	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> +
> +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
>  }
>  
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices)
>  {
> -	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> -	bool ret;
> +	bool ret = true;
> +	int i;
> +	int max_slices = intel_dbuf_max_slices(dev_priv);
>  
> -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
>  		DRM_ERROR("Invalid number of dbuf slices requested\n");
>  		return;
>  	}
>  
> -	if (req_slices == hw_enabled_slices || req_slices == 0)
> -		return;
> +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
>  
> -	if (req_slices > hw_enabled_slices)
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, true);
> -	else
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, false);
> +	for (i = 0; i < max_slices; i++) {
> +		int slice_bit = BIT(i);
> +		bool slice_set = (slice_bit & req_slices) != 0;
> +
> +		switch (slice_bit) {
> +		case DBUF_S1_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S1,
> +							  slice_set);
> +			break;
> +		case DBUF_S2_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S2,
> +							  slice_set);
> +			break;
> +		default:
> +			MISSING_CASE(slice_bit);
> +		}
> +	}
>  
>  	if (ret)
>  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  
>  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) | DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) | DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power enable timeout\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that we only have 1 slice, see
> -		 * intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Just power up 1 slice, we will
> +	 * figure out later which slices we have and what we need.
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) & ~DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) & ~DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power disable timeout!\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that the first slice is always
> -		 * enabled, see intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Disable all slices
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> index 1da04f3e0fb3..b7057d68ad78 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> @@ -311,8 +311,10 @@ intel_display_power_put_async(struct drm_i915_private *i915,
>  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
>  	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
>  
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices);
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
>  
>  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
>  			     bool override, unsigned int mask);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 82e4e6bf08c3..96741e68633a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -51,6 +51,7 @@
>  #include "display/intel_bw.h"
>  #include "display/intel_cdclk.h"
>  #include "display/intel_display_types.h"
> +#include "display/intel_display_power.h"
>  #include "display/intel_dp.h"
>  #include "display/intel_fbdev.h"
>  #include "display/intel_hotplug.h"
> @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct device *kdev)
>  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
>  		ret = vlv_resume_prepare(dev_priv, true);
>  
> +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices reg */
> +	if (INTEL_GEN(dev_priv) == 11)
> +		icl_program_dbuf_slices(dev_priv);
> +
>  	intel_uncore_runtime_resume(&dev_priv->uncore);
>  
>  	intel_runtime_pm_enable_interrupts(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7e0f67babe20..a396977c9c2d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const struct skl_ddb_entry *e1,
>  }
>  
>  struct skl_ddb_allocation {
> -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf slices */
>  };
>  
>  struct skl_ddb_values {
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 1bb701d32a5d..8e86af505730 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -614,7 +614,8 @@ static const struct intel_device_info intel_cherryview_info = {
>  	.has_gt_uc = 1, \
>  	.display.has_hdcp = 1, \
>  	.display.has_ipc = 1, \
> -	.ddb_size = 896
> +	.ddb_size = 896, \
> +	.supported_slices = 1
>  
>  #define SKL_PLATFORM \
>  	GEN9_FEATURES, \
> @@ -650,6 +651,7 @@ static const struct intel_device_info intel_skylake_gt4_info = {
>  	GEN(9), \
>  	.is_lp = 1, \
>  	.display.has_hotplug = 1, \
> +	.supported_slices = 1, \
>  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
>  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
>  	.has_64bit_reloc = 1, \
> @@ -737,6 +739,7 @@ static const struct intel_device_info intel_coffeelake_gt3_info = {
>  	GEN9_FEATURES, \
>  	GEN(10), \
>  	.ddb_size = 1024, \
> +	.supported_slices = 1, \
>  	.display.has_dsc = 1, \
>  	.has_coherent_ggtt = false, \
>  	GLK_COLORS
> @@ -773,6 +776,7 @@ static const struct intel_device_info intel_cannonlake_info = {
>  	}, \
>  	GEN(11), \
>  	.ddb_size = 2048, \
> +	.supported_slices = 2, \
>  	.has_logical_ring_elsq = 1, \
>  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a607ea520829..fba5731063d8 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7734,9 +7734,11 @@ enum {
>  #define DISP_ARB_CTL2	_MMIO(0x45004)
>  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
>  #define  DISP_IPC_ENABLE		(1 << 3)
> -#define DBUF_CTL	_MMIO(0x45008)
> -#define DBUF_CTL_S1	_MMIO(0x45008)
> -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> +#define DBUF_S1_BIT			BIT(0)
> +#define DBUF_S2_BIT			BIT(1)
> +#define DBUF_CTL			_MMIO(0x45008)
> +#define DBUF_CTL_S1			_MMIO(0x45008)
> +#define DBUF_CTL_S2			_MMIO(0x44FE8)
>  #define  DBUF_POWER_REQUEST		(1 << 31)
>  #define  DBUF_POWER_STATE		(1 << 30)
>  #define GEN7_MSG_CTL	_MMIO(0x45010)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 4bdf8a6cfb47..ba34e1a5c591 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -180,6 +180,7 @@ struct intel_device_info {
>  	} display;
>  
>  	u16 ddb_size; /* in blocks */
> +	u8 supported_slices; /* number of DBuf slices */
>  
>  	/* Register offsets for the various display pipes and transcoders */
>  	int pipe_offsets[I915_MAX_TRANSCODERS];
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2d389e437e87..1ea81ab92429 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
>  }
>  
> -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
> +static u8 intel_enabled_dbuf_slices(struct drm_i915_private *dev_priv)
>  {
> -	u8 enabled_slices;
> -
> -	/* Slice 1 will always be enabled */
> -	enabled_slices = 1;
> +	u8 enabled_slices = 0;
>  
>  	/* Gen prior to GEN11 have only one DBuf slice */
>  	if (INTEL_GEN(dev_priv) < 11)
> -		return enabled_slices;
> +		return DBUF_S1_BIT;
>  
> -	/*
> -	 * FIXME: for now we'll only ever use 1 slice; pretend that we have
> -	 * only that 1 slice enabled until we have a proper way for on-demand
> -	 * toggling of the second slice.
> -	 */
> -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> -		enabled_slices++;
> +	/* Check if second DBuf slice is enabled */
> +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S1_BIT;
> +
> +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S2_BIT;
>  
>  	return enabled_slices;
>  }
> @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
>  			      const int num_active,
>  			      struct skl_ddb_allocation *ddb)
>  {
> -	const struct drm_display_mode *adjusted_mode;
> -	u64 total_data_bw;
>  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> -
>  	WARN_ON(ddb_size == 0);
>  
>  	if (INTEL_GEN(dev_priv) < 11)
>  		return ddb_size - 4; /* 4 blocks for bypass path allocation */
>  
> -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> -	total_data_bw = total_data_rate * drm_mode_vrefresh(adjusted_mode);
> +	return ddb_size;
> +}
>  
> -	/*
> -	 * 12GB/s is maximum BW supported by single DBuf slice.
> -	 *
> -	 * FIXME dbuf slice code is broken:
> -	 * - must wait for planes to stop using the slice before powering it off
> -	 * - plane straddling both slices is illegal in multi-pipe scenarios
> -	 * - should validate we stay within the hw bandwidth limits
> -	 */
> -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> -		ddb->enabled_slices = 2;
> -	} else {
> -		ddb->enabled_slices = 1;
> -		ddb_size /= 2;
> -	}
> +/*
> + * Calculate initial DBuf slice offset, based on slice size
> + * and mask(i.e if slice size is 1024 and second slice is enabled
> + * offset would be 1024)
> + */
> +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> +					   u32 slice_size, u32 ddb_size)
> +{
> +	u32 offset = 0;
>  
> -	return ddb_size;
> +	if (!dbuf_slice_mask)
> +		return 0;
> +
> +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> +
> +	WARN_ON(offset >= ddb_size);
> +	return offset;
>  }
>  
> +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state);
> +
>  static void
>  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  				   const struct intel_crtc_state *crtc_state,
> @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
>  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
>  	u16 ddb_size;
> +	u32 ddb_range_size;
>  	u32 i;
> +	u32 dbuf_slice_mask;
> +	u32 active_pipes;
> +	u32 offset;
> +	u32 slice_size;
> +	u32 total_slice_mask;
> +	u32 start, end;
>  
>  	if (WARN_ON(!state) || !crtc_state->hw.active) {
>  		alloc->start = 0;
> @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> -	if (intel_state->active_pipe_changes)
> +	if (intel_state->active_pipe_changes) {
>  		*num_active = hweight8(intel_state->active_pipes);
> -	else
> +		active_pipes = intel_state->active_pipes;
> +	} else {
>  		*num_active = hweight8(dev_priv->active_pipes);
> +		active_pipes = dev_priv->active_pipes;
> +	}
>  
>  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state, total_data_rate,
>  				      *num_active, ddb);
>  
> +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> +
> +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> +
> +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> +
>  	/*
>  	 * If the state doesn't change the active CRTC's or there is no
>  	 * modeset request, then there's no need to recalculate;
> @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> +	/*
> +	 * Get allowed DBuf slices for correspondent pipe and platform.
> +	 */
> +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv, for_pipe,
> +						     active_pipes, crtc_state);
> +
> +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> +		      dbuf_slice_mask,
> +		      for_pipe, active_pipes);
> +
> +	/*
> +	 * Figure out at which DBuf slice we start, i.e if we start at Dbuf S2
> +	 * and slice size is 1024, the offset would be 1024
> +	 */
> +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> +						 slice_size, ddb_size);
> +
> +	/*
> +	 * Figure out total size of allowed DBuf slices, which is basically
> +	 * a number of allowed slices for that pipe multiplied by slice size.
> +	 * Inside of this
> +	 * range ddb entries are still allocated in proportion to display width.
> +	 */
> +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> +
>  	/*
>  	 * Watermark/ddb requirement highly depends upon width of the
>  	 * framebuffer, So instead of allocating DDB equally among pipes
>  	 * distribute DDB based on resolution/width of the display.
>  	 */
> +	total_slice_mask = dbuf_slice_mask;
>  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state, i) {
>  		const struct drm_display_mode *adjusted_mode =
>  			&crtc_state->hw.adjusted_mode;
>  		enum pipe pipe = crtc->pipe;
>  		int hdisplay, vdisplay;
> +		u32 pipe_dbuf_slice_mask = \
> +			i915_get_allowed_dbuf_mask(dev_priv,
> +						pipe,
> +						active_pipes,
> +						crtc_state);
>  
>  		if (!crtc_state->hw.enable)
>  			continue;
>  
> +		/*
> +		 * According to BSpec pipe can share one dbuf slice with another
> +		 * pipes or pipe can use multiple dbufs, in both cases we
> +		 * account for other pipes only if they have exactly same mask.
> +		 * However we need to account how many slices we should enable
> +		 * in total.
> +		 */
> +		total_slice_mask |= pipe_dbuf_slice_mask;
> +
> +		/*
> +		 * Do not account pipes using other slice sets
> +		 * luckily as of current BSpec slice sets do not partially
> +		 * intersect(pipes share either same one slice or same slice set
> +		 * i.e no partial intersection), so it is enough to check for
> +		 * equality for now.
> +		 */
> +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> +			continue;
> +
>  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay, &vdisplay);
>  		total_width += hdisplay;
>  
> @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  			pipe_width = hdisplay;
>  	}
>  
> -	alloc->start = ddb_size * width_before_pipe / total_width;
> -	alloc->end = ddb_size * (width_before_pipe + pipe_width) / total_width;
> +	ddb->enabled_slices = total_slice_mask;
> +
> +	start = ddb_range_size * width_before_pipe / total_width;
> +	end = ddb_range_size * (width_before_pipe + pipe_width) / total_width;
> +
> +	alloc->start = offset + start;
> +	alloc->end = offset + end;
> +
> +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> +		      alloc->start, alloc->end);
> +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> +		      ddb->enabled_slices,
> +		      INTEL_INFO(dev_priv)->supported_slices);
>  }
>  
>  static int skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
> @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
>  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
>  			  struct skl_ddb_allocation *ddb /* out */)
>  {
> -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb->enabled_slices);
>  }
>  
>  /*
> @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  	return total_data_rate;
>  }
>  
> +static uint_fixed_16_16_t
> +icl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
> +{
> +	u32 src_w, src_h, dst_w, dst_h;
> +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +	uint_fixed_16_16_t downscale_h, downscale_w;
> +	const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode;
> +
> +	src_w = crtc_state->pipe_src_w;
> +	src_h = crtc_state->pipe_src_h;
> +	dst_w = adjusted_mode->crtc_hdisplay;
> +	dst_h = adjusted_mode->crtc_vdisplay;
> +
> +	fp_w_ratio = div_fixed16(src_w, dst_w);
> +	fp_h_ratio = div_fixed16(src_h, dst_h);
> +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> +
> +	return mul_fixed16(downscale_w, downscale_h);
> +}
> +
> +uint_fixed_16_16_t
> +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> +{
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *drm_plane_state;
> +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> +	if (!crtc_state->uapi.enable)
> +		return max_downscale;
> +
> +	drm_atomic_crtc_state_for_each_plane_state(plane, drm_plane_state, &crtc_state->uapi) {
> +		uint_fixed_16_16_t plane_downscale;
> +		const struct intel_plane_state *plane_state =
> +			to_intel_plane_state(drm_plane_state);
> +
> +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> +			continue;
> +
> +		plane_downscale = skl_plane_downscale_amount(crtc_state, plane_state);
> +
> +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> +			      to_intel_plane(plane)->id,
> +			      plane_downscale.val >> 16,
> +			      plane_downscale.val & 0xffff);
> +
> +		max_downscale = max_fixed16(plane_downscale, max_downscale);
> +	}
> +
> +
> +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> +
> +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> +		       crtc->pipe, pipe_downscale.val >> 16,
> +		       pipe_downscale.val & 0xffff);
> +
> +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> +
> +	/*
> +	 * Convert result to percentage: get the actual ratio instead of rate,
> +	 * then multiply by 100 to get percentage.
> +	 */
> +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1, pipe_downscale));
> +
> +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> +
> +	return pipe_ratio;
> +}
> +
> +struct dbuf_slice_conf_entry {
> +	u32 dbuf_mask[I915_MAX_PIPES];
> +	u32 active_pipes;
> +};
> +
> +
> +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> +					  BIT(PIPE_C) | BIT(PIPE_D) }
> +
> +/*
> + * Table taken from Bspec 12716
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +};
> +
> +/*
> + * Table taken from Bspec 49255
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +};
> +
> +/*
> + * This function finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 icl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +	uint_fixed_16_16_t pipe_ratio;
> +
> +	/*
> +	 * Calculate pipe ratio as stated in BSpec 28692
> +	 */
> +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> +
> +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> +		if (icl_allowed_dbufs[i].active_pipes == active_pipes) {
> +			if (hweight32(active_pipes) == 1) {
> +				/*
> +				 * According to BSpec 12716: if ratio >= 88.8
> +				 * always use single dbuf slice.
> +				 */
> +				if (pipe_ratio.val >= div_fixed16(888, 10).val)
> +					++i;
> +			}
> +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> +		}
> +	}
> +	return 0;
> +}
> +
> +/*
> + * This function finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> +	}
> +	return 0;
> +}
> +
> +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state)
> +{
> +	if (IS_GEN(dev_priv, 11))
> +		return icl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	else if (IS_GEN(dev_priv, 12))
> +		return tgl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	/*
> +	 * For anything else just return one slice yet.
> +	 * Should be extended for other platforms.
> +	 */
> +	return DBUF_S1_BIT;
> +}
> +
>  static u64
>  icl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  				 u64 *plane_data_rate)

-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 14:22   ` Jani Nikula
  0 siblings, 0 replies; 34+ messages in thread
From: Jani Nikula @ 2019-11-18 14:22 UTC (permalink / raw)
  To: Stanislav Lisovskiy, intel-gfx

On Thu, 14 Nov 2019, Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> wrote:
> Also implemented algorithm for choosing DBuf slice configuration
> based on active pipes, pipe ratio as stated in BSpec 12716.
>
> Now pipe allocation still stays proportional to pipe width as before,
> however within allowed DBuf slice for this particular configuration.
>
> v2: Remove unneeded check from commit as ddb enabled slices might
>     now differ from hw state.
>
> v3: - Added new field "supported_slices" to ddb to separate max
>       supported slices vs currently enabled, to avoid confusion.
>     - Removed obsolete comments and code related to DBuf(Matthew Roper).
>     - Some code style and long line removal(Matthew Roper).
>     - Added WARN_ON to new ddb range offset calc function(Matthew Roper).
>     - Removed platform specific call to calc pipe ratio from ddb
>       allocation function and fixed the return value(Matthew Roper)
>     - Refactored DBUF slice allocation table to improve readability
>     - Added DBUF slice allocation for TGL as well.
>     - ICL(however not TGL) seems to voluntarily enable second DBuf slice
>       after pm suspend/resume causing a mismatch failure, because we
>       update DBuf slices only if we do a modeset, however this check
>       is done always. Fixed it to be done only when modeset for ICL.
>
> v4: - Now enabled slices is not actually a number, but a bitmask,
>       because we might need to enable second slice only and number
>       of slices would still 1 and that current code doesn't allow.
>     - Remove redundant duplicate code to have some unified way of
>       enabling dbuf slices instead of hardcoding.
>
> v5: - Fix failing gen9_assert_dbuf_enabled as it was naively thinking
>       that we have only one DBUF_CTL slice. Now another version for
>       gen11 will check both slices as only second one can be enabled,
>       to keep CI happy.
>
> v6: - Removed enabled dbuf assertion completely. Previous code
>       was keeping dbuf enabled, even(!) when _dbuf_disable is called.
>       Currently we enable/disable dbuf slices based on requrement
>       so no point in asserting this here.
>     - Removed unnecessary modeset check from verify_wm_state(Matthew Roper)
>     - Slices intersection after union is same as final result(Matthew Roper)
>     - Moved DBuf slices to intel_device_info(Matthew Roper)
>     - Some macros added(Matthew Roper)
>     - ddb range is now always less or equal to ddb size - no need for additional
>       checks(previously needed as we had some bandwidth checks in there which
>       could "suddenly" return ddb_size smaller than it is.
>
> v7: Minor costemic changes:
>     - Changed icl_dbuf_slices_restore name to icl_program_dbuf_slices
>       as it more corresponds to the actual functionality.
>     - Some simplification with supported slices for BXT and GLK
>     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
>       this is not used for skl anymore.
>     - Changed DBuf slice assignment order for TGL case
>
> Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> Cc: Matthew Roper <matthew.d.roper@intel.com>
> Cc: Ville Syrjälä <ville.syrjala@intel.com>
> Cc: James Ausmus <james.ausmus@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
>  .../drm/i915/display/intel_display_power.c    |  98 ++---
>  .../drm/i915/display/intel_display_power.h    |   2 +
>  drivers/gpu/drm/i915/i915_drv.c               |   5 +
>  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
>  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
>  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
>  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
>  drivers/gpu/drm/i915/intel_pm.c               | 387 ++++++++++++++++--
>  9 files changed, 431 insertions(+), 104 deletions(-)

This is a rather big patch, and that's one of the few things we can't
fix after committing.

If there's a bug, by our rate of change this turns practically
un-revertible in a matter of weeks due to conflicts. If there's a
bisected regression, and this is the regressing commit, what are our
chances of identifying the bug based on the sha1 alone? Can't revert and
can't immediately spot the bug/fix is a bad combination to have.

Also we have this nice CI infrastructure reporting about valid
checkpatch and sparse issues for several versions of the patch. Please
fix.


BR,
Jani.


>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 876fc25968bf..bd7aff675198 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct intel_crtc *crtc,
>  
>  	if (INTEL_GEN(dev_priv) >= 11 &&
>  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got %u)\n",
> +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got %x)\n",
>  			  sw_ddb->enabled_slices,
>  			  hw->ddb.enabled_slices);
>  
> @@ -14614,15 +14614,24 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
>  	u8 required_slices = state->wm_results.ddb.enabled_slices;
>  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> +	u8 slices_union = hw_enabled_slices | required_slices;
> +	u8 slices_intersection = required_slices;
>  
>  	for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i)
>  		/* ignore allocations for crtc's that have been turned off. */
>  		if (new_crtc_state->hw.active)
>  			entries[i] = old_crtc_state->wm.skl.ddb;
>  
> -	/* If 2nd DBuf slice required, enable it here */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices > hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> +		      required_slices, hw_enabled_slices);
> +
> +	/*
> +	 * If some other DBuf slice required, enable it here,
> +	 * as here we only add more slices, but not remove to prevent
> +	 * issues if somebody still uses those.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_union);
>  
>  	/*
>  	 * Whenever the number of active pipes changes, we need to make sure we
> @@ -14681,9 +14690,12 @@ static void skl_commit_modeset_enables(struct intel_atomic_state *state)
>  		}
>  	} while (progress);
>  
> -	/* If 2nd DBuf slice is no more required disable it */
> -	if (INTEL_GEN(dev_priv) >= 11 && required_slices < hw_enabled_slices)
> -		icl_dbuf_slices_update(dev_priv, required_slices);
> +	/*
> +	 * If some other DBuf slice is not needed, disable it here,
> +	 * as here we only remove slices, which are not needed anymore.
> +	 */
> +	if (INTEL_GEN(dev_priv) >= 11 && required_slices != hw_enabled_slices)
> +		icl_dbuf_slices_update(dev_priv, slices_intersection);
>  }
>  
>  static void intel_atomic_helper_free_state(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
> index ce1b64f4dd44..a78bc9b61862 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> @@ -1031,15 +1031,6 @@ static bool gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
>  		(I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
>  }
>  
> -static void gen9_assert_dbuf_enabled(struct drm_i915_private *dev_priv)
> -{
> -	u32 tmp = I915_READ(DBUF_CTL);
> -
> -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> -}
> -
>  static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  {
>  	struct intel_cdclk_state cdclk_state = {};
> @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct drm_i915_private *dev_priv)
>  	/* Can't read out voltage_level so can't use intel_cdclk_changed() */
>  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw, &cdclk_state));
>  
> -	gen9_assert_dbuf_enabled(dev_priv);
> -
>  	if (IS_GEN9_LP(dev_priv))
>  		bxt_verify_ddi_phy_power_wells(dev_priv);
>  
> @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct drm_i915_private *dev_priv)
>  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
>  }
>  
> -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
>  {
> -	if (INTEL_GEN(dev_priv) < 11)
> -		return 1;
> -	return 2;
> +	return INTEL_INFO(dev_priv)->supported_slices;
> +}
> +
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> +{
> +	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> +
> +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
>  }
>  
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices)
>  {
> -	const u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> -	bool ret;
> +	bool ret = true;
> +	int i;
> +	int max_slices = intel_dbuf_max_slices(dev_priv);
>  
> -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
>  		DRM_ERROR("Invalid number of dbuf slices requested\n");
>  		return;
>  	}
>  
> -	if (req_slices == hw_enabled_slices || req_slices == 0)
> -		return;
> +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
>  
> -	if (req_slices > hw_enabled_slices)
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, true);
> -	else
> -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2, false);
> +	for (i = 0; i < max_slices; i++) {
> +		int slice_bit = BIT(i);
> +		bool slice_set = (slice_bit & req_slices) != 0;
> +
> +		switch (slice_bit) {
> +		case DBUF_S1_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S1,
> +							  slice_set);
> +			break;
> +		case DBUF_S2_BIT:
> +			ret = ret && intel_dbuf_slice_set(dev_priv,
> +							  DBUF_CTL_S2,
> +							  slice_set);
> +			break;
> +		default:
> +			MISSING_CASE(slice_bit);
> +		}
> +	}
>  
>  	if (ret)
>  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  
>  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) | DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) | DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power enable timeout\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that we only have 1 slice, see
> -		 * intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Just power up 1 slice, we will
> +	 * figure out later which slices we have and what we need.
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
>  {
> -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) & ~DBUF_POWER_REQUEST);
> -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) & ~DBUF_POWER_REQUEST);
> -	POSTING_READ(DBUF_CTL_S2);
> -
> -	udelay(10);
> -
> -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> -		DRM_ERROR("DBuf power disable timeout!\n");
> -	else
> -		/*
> -		 * FIXME: for now pretend that the first slice is always
> -		 * enabled, see intel_enabled_dbuf_slices_num().
> -		 */
> -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> +	/*
> +	 * Disable all slices
> +	 */
> +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> +	icl_program_dbuf_slices(dev_priv);
>  }
>  
>  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h
> index 1da04f3e0fb3..b7057d68ad78 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> @@ -311,8 +311,10 @@ intel_display_power_put_async(struct drm_i915_private *i915,
>  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
>  	     intel_display_power_put_async((i915), (domain), (wf)), (wf) = 0)
>  
> +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
>  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
>  			    u8 req_slices);
> +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
>  
>  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
>  			     bool override, unsigned int mask);
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 82e4e6bf08c3..96741e68633a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -51,6 +51,7 @@
>  #include "display/intel_bw.h"
>  #include "display/intel_cdclk.h"
>  #include "display/intel_display_types.h"
> +#include "display/intel_display_power.h"
>  #include "display/intel_dp.h"
>  #include "display/intel_fbdev.h"
>  #include "display/intel_hotplug.h"
> @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct device *kdev)
>  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
>  		ret = vlv_resume_prepare(dev_priv, true);
>  
> +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices reg */
> +	if (INTEL_GEN(dev_priv) == 11)
> +		icl_program_dbuf_slices(dev_priv);
> +
>  	intel_uncore_runtime_resume(&dev_priv->uncore);
>  
>  	intel_runtime_pm_enable_interrupts(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7e0f67babe20..a396977c9c2d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const struct skl_ddb_entry *e1,
>  }
>  
>  struct skl_ddb_allocation {
> -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf slices */
>  };
>  
>  struct skl_ddb_values {
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 1bb701d32a5d..8e86af505730 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -614,7 +614,8 @@ static const struct intel_device_info intel_cherryview_info = {
>  	.has_gt_uc = 1, \
>  	.display.has_hdcp = 1, \
>  	.display.has_ipc = 1, \
> -	.ddb_size = 896
> +	.ddb_size = 896, \
> +	.supported_slices = 1
>  
>  #define SKL_PLATFORM \
>  	GEN9_FEATURES, \
> @@ -650,6 +651,7 @@ static const struct intel_device_info intel_skylake_gt4_info = {
>  	GEN(9), \
>  	.is_lp = 1, \
>  	.display.has_hotplug = 1, \
> +	.supported_slices = 1, \
>  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0), \
>  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
>  	.has_64bit_reloc = 1, \
> @@ -737,6 +739,7 @@ static const struct intel_device_info intel_coffeelake_gt3_info = {
>  	GEN9_FEATURES, \
>  	GEN(10), \
>  	.ddb_size = 1024, \
> +	.supported_slices = 1, \
>  	.display.has_dsc = 1, \
>  	.has_coherent_ggtt = false, \
>  	GLK_COLORS
> @@ -773,6 +776,7 @@ static const struct intel_device_info intel_cannonlake_info = {
>  	}, \
>  	GEN(11), \
>  	.ddb_size = 2048, \
> +	.supported_slices = 2, \
>  	.has_logical_ring_elsq = 1, \
>  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
>  
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a607ea520829..fba5731063d8 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -7734,9 +7734,11 @@ enum {
>  #define DISP_ARB_CTL2	_MMIO(0x45004)
>  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
>  #define  DISP_IPC_ENABLE		(1 << 3)
> -#define DBUF_CTL	_MMIO(0x45008)
> -#define DBUF_CTL_S1	_MMIO(0x45008)
> -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> +#define DBUF_S1_BIT			BIT(0)
> +#define DBUF_S2_BIT			BIT(1)
> +#define DBUF_CTL			_MMIO(0x45008)
> +#define DBUF_CTL_S1			_MMIO(0x45008)
> +#define DBUF_CTL_S2			_MMIO(0x44FE8)
>  #define  DBUF_POWER_REQUEST		(1 << 31)
>  #define  DBUF_POWER_STATE		(1 << 30)
>  #define GEN7_MSG_CTL	_MMIO(0x45010)
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 4bdf8a6cfb47..ba34e1a5c591 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -180,6 +180,7 @@ struct intel_device_info {
>  	} display;
>  
>  	u16 ddb_size; /* in blocks */
> +	u8 supported_slices; /* number of DBuf slices */
>  
>  	/* Register offsets for the various display pipes and transcoders */
>  	int pipe_offsets[I915_MAX_TRANSCODERS];
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 2d389e437e87..1ea81ab92429 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
>  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
>  }
>  
> -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private *dev_priv)
> +static u8 intel_enabled_dbuf_slices(struct drm_i915_private *dev_priv)
>  {
> -	u8 enabled_slices;
> -
> -	/* Slice 1 will always be enabled */
> -	enabled_slices = 1;
> +	u8 enabled_slices = 0;
>  
>  	/* Gen prior to GEN11 have only one DBuf slice */
>  	if (INTEL_GEN(dev_priv) < 11)
> -		return enabled_slices;
> +		return DBUF_S1_BIT;
>  
> -	/*
> -	 * FIXME: for now we'll only ever use 1 slice; pretend that we have
> -	 * only that 1 slice enabled until we have a proper way for on-demand
> -	 * toggling of the second slice.
> -	 */
> -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> -		enabled_slices++;
> +	/* Check if second DBuf slice is enabled */
> +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S1_BIT;
> +
> +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> +		enabled_slices |= DBUF_S2_BIT;
>  
>  	return enabled_slices;
>  }
> @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct drm_i915_private *dev_priv,
>  			      const int num_active,
>  			      struct skl_ddb_allocation *ddb)
>  {
> -	const struct drm_display_mode *adjusted_mode;
> -	u64 total_data_bw;
>  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> -
>  	WARN_ON(ddb_size == 0);
>  
>  	if (INTEL_GEN(dev_priv) < 11)
>  		return ddb_size - 4; /* 4 blocks for bypass path allocation */
>  
> -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> -	total_data_bw = total_data_rate * drm_mode_vrefresh(adjusted_mode);
> +	return ddb_size;
> +}
>  
> -	/*
> -	 * 12GB/s is maximum BW supported by single DBuf slice.
> -	 *
> -	 * FIXME dbuf slice code is broken:
> -	 * - must wait for planes to stop using the slice before powering it off
> -	 * - plane straddling both slices is illegal in multi-pipe scenarios
> -	 * - should validate we stay within the hw bandwidth limits
> -	 */
> -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> -		ddb->enabled_slices = 2;
> -	} else {
> -		ddb->enabled_slices = 1;
> -		ddb_size /= 2;
> -	}
> +/*
> + * Calculate initial DBuf slice offset, based on slice size
> + * and mask(i.e if slice size is 1024 and second slice is enabled
> + * offset would be 1024)
> + */
> +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> +					   u32 slice_size, u32 ddb_size)
> +{
> +	u32 offset = 0;
>  
> -	return ddb_size;
> +	if (!dbuf_slice_mask)
> +		return 0;
> +
> +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> +
> +	WARN_ON(offset >= ddb_size);
> +	return offset;
>  }
>  
> +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state);
> +
>  static void
>  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  				   const struct intel_crtc_state *crtc_state,
> @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
>  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
>  	u16 ddb_size;
> +	u32 ddb_range_size;
>  	u32 i;
> +	u32 dbuf_slice_mask;
> +	u32 active_pipes;
> +	u32 offset;
> +	u32 slice_size;
> +	u32 total_slice_mask;
> +	u32 start, end;
>  
>  	if (WARN_ON(!state) || !crtc_state->hw.active) {
>  		alloc->start = 0;
> @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> -	if (intel_state->active_pipe_changes)
> +	if (intel_state->active_pipe_changes) {
>  		*num_active = hweight8(intel_state->active_pipes);
> -	else
> +		active_pipes = intel_state->active_pipes;
> +	} else {
>  		*num_active = hweight8(dev_priv->active_pipes);
> +		active_pipes = dev_priv->active_pipes;
> +	}
>  
>  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state, total_data_rate,
>  				      *num_active, ddb);
>  
> +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> +
> +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> +
> +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> +
>  	/*
>  	 * If the state doesn't change the active CRTC's or there is no
>  	 * modeset request, then there's no need to recalculate;
> @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  		return;
>  	}
>  
> +	/*
> +	 * Get allowed DBuf slices for correspondent pipe and platform.
> +	 */
> +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv, for_pipe,
> +						     active_pipes, crtc_state);
> +
> +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> +		      dbuf_slice_mask,
> +		      for_pipe, active_pipes);
> +
> +	/*
> +	 * Figure out at which DBuf slice we start, i.e if we start at Dbuf S2
> +	 * and slice size is 1024, the offset would be 1024
> +	 */
> +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> +						 slice_size, ddb_size);
> +
> +	/*
> +	 * Figure out total size of allowed DBuf slices, which is basically
> +	 * a number of allowed slices for that pipe multiplied by slice size.
> +	 * Inside of this
> +	 * range ddb entries are still allocated in proportion to display width.
> +	 */
> +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> +
>  	/*
>  	 * Watermark/ddb requirement highly depends upon width of the
>  	 * framebuffer, So instead of allocating DDB equally among pipes
>  	 * distribute DDB based on resolution/width of the display.
>  	 */
> +	total_slice_mask = dbuf_slice_mask;
>  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state, i) {
>  		const struct drm_display_mode *adjusted_mode =
>  			&crtc_state->hw.adjusted_mode;
>  		enum pipe pipe = crtc->pipe;
>  		int hdisplay, vdisplay;
> +		u32 pipe_dbuf_slice_mask = \
> +			i915_get_allowed_dbuf_mask(dev_priv,
> +						pipe,
> +						active_pipes,
> +						crtc_state);
>  
>  		if (!crtc_state->hw.enable)
>  			continue;
>  
> +		/*
> +		 * According to BSpec pipe can share one dbuf slice with another
> +		 * pipes or pipe can use multiple dbufs, in both cases we
> +		 * account for other pipes only if they have exactly same mask.
> +		 * However we need to account how many slices we should enable
> +		 * in total.
> +		 */
> +		total_slice_mask |= pipe_dbuf_slice_mask;
> +
> +		/*
> +		 * Do not account pipes using other slice sets
> +		 * luckily as of current BSpec slice sets do not partially
> +		 * intersect(pipes share either same one slice or same slice set
> +		 * i.e no partial intersection), so it is enough to check for
> +		 * equality for now.
> +		 */
> +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> +			continue;
> +
>  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay, &vdisplay);
>  		total_width += hdisplay;
>  
> @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct drm_i915_private *dev_priv,
>  			pipe_width = hdisplay;
>  	}
>  
> -	alloc->start = ddb_size * width_before_pipe / total_width;
> -	alloc->end = ddb_size * (width_before_pipe + pipe_width) / total_width;
> +	ddb->enabled_slices = total_slice_mask;
> +
> +	start = ddb_range_size * width_before_pipe / total_width;
> +	end = ddb_range_size * (width_before_pipe + pipe_width) / total_width;
> +
> +	alloc->start = offset + start;
> +	alloc->end = offset + end;
> +
> +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> +		      alloc->start, alloc->end);
> +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> +		      ddb->enabled_slices,
> +		      INTEL_INFO(dev_priv)->supported_slices);
>  }
>  
>  static int skl_compute_wm_params(const struct intel_crtc_state *crtc_state,
> @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct intel_crtc *crtc,
>  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
>  			  struct skl_ddb_allocation *ddb /* out */)
>  {
> -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb->enabled_slices);
>  }
>  
>  /*
> @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  	return total_data_rate;
>  }
>  
> +static uint_fixed_16_16_t
> +icl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
> +{
> +	u32 src_w, src_h, dst_w, dst_h;
> +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> +	uint_fixed_16_16_t downscale_h, downscale_w;
> +	const struct drm_display_mode *adjusted_mode = &crtc_state->hw.adjusted_mode;
> +
> +	src_w = crtc_state->pipe_src_w;
> +	src_h = crtc_state->pipe_src_h;
> +	dst_w = adjusted_mode->crtc_hdisplay;
> +	dst_h = adjusted_mode->crtc_vdisplay;
> +
> +	fp_w_ratio = div_fixed16(src_w, dst_w);
> +	fp_h_ratio = div_fixed16(src_h, dst_h);
> +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> +
> +	return mul_fixed16(downscale_w, downscale_h);
> +}
> +
> +uint_fixed_16_16_t
> +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> +{
> +	struct drm_plane *plane;
> +	const struct drm_plane_state *drm_plane_state;
> +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> +	if (!crtc_state->uapi.enable)
> +		return max_downscale;
> +
> +	drm_atomic_crtc_state_for_each_plane_state(plane, drm_plane_state, &crtc_state->uapi) {
> +		uint_fixed_16_16_t plane_downscale;
> +		const struct intel_plane_state *plane_state =
> +			to_intel_plane_state(drm_plane_state);
> +
> +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> +			continue;
> +
> +		plane_downscale = skl_plane_downscale_amount(crtc_state, plane_state);
> +
> +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> +			      to_intel_plane(plane)->id,
> +			      plane_downscale.val >> 16,
> +			      plane_downscale.val & 0xffff);
> +
> +		max_downscale = max_fixed16(plane_downscale, max_downscale);
> +	}
> +
> +
> +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> +
> +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> +		       crtc->pipe, pipe_downscale.val >> 16,
> +		       pipe_downscale.val & 0xffff);
> +
> +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> +
> +	/*
> +	 * Convert result to percentage: get the actual ratio instead of rate,
> +	 * then multiply by 100 to get percentage.
> +	 */
> +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1, pipe_downscale));
> +
> +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> +
> +	return pipe_ratio;
> +}
> +
> +struct dbuf_slice_conf_entry {
> +	u32 dbuf_mask[I915_MAX_PIPES];
> +	u32 active_pipes;
> +};
> +
> +
> +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D) }
> +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) }
> +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> +					  BIT(PIPE_C) | BIT(PIPE_D) }
> +
> +/*
> + * Table taken from Bspec 12716
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio < 88.8 */
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe ratio >= 88.8 */
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +};
> +
> +/*
> + * Table taken from Bspec 49255
> + * Pipes do have some preferred DBuf slice affinity,
> + * plus there are some hardcoded requirements on how
> + * those should be distributed for multipipe scenarios.
> + * For more DBuf slices algorithm can get even more messy
> + * and less readable, so decided to use a table almost
> + * as is from BSpec itself - that way it is at least easier
> + * to compare, change and check.
> + */
> +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> +{ { 0, 0, 0, 0 }, 0 },
> +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> +};
> +
> +/*
> + * This function finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 icl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +	uint_fixed_16_16_t pipe_ratio;
> +
> +	/*
> +	 * Calculate pipe ratio as stated in BSpec 28692
> +	 */
> +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> +
> +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> +		if (icl_allowed_dbufs[i].active_pipes == active_pipes) {
> +			if (hweight32(active_pipes) == 1) {
> +				/*
> +				 * According to BSpec 12716: if ratio >= 88.8
> +				 * always use single dbuf slice.
> +				 */
> +				if (pipe_ratio.val >= div_fixed16(888, 10).val)
> +					++i;
> +			}
> +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> +		}
> +	}
> +	return 0;
> +}
> +
> +/*
> + * This function finds an entry with same enabled pipe configuration and
> + * returns correspondent DBuf slice mask as stated in BSpec for particular
> + * platform.
> + */
> +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> +				     u32 active_pipes,
> +				     const struct intel_crtc_state *crtc_state)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> +	}
> +	return 0;
> +}
> +
> +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> +				      int pipe, u32 active_pipes,
> +				      const struct intel_crtc_state *crtc_state)
> +{
> +	if (IS_GEN(dev_priv, 11))
> +		return icl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	else if (IS_GEN(dev_priv, 12))
> +		return tgl_get_allowed_dbuf_mask(pipe,
> +						 active_pipes,
> +						 crtc_state);
> +	/*
> +	 * For anything else just return one slice yet.
> +	 * Should be extended for other platforms.
> +	 */
> +	return DBUF_S1_BIT;
> +}
> +
>  static u64
>  icl_get_total_relative_data_rate(struct intel_crtc_state *crtc_state,
>  				 u64 *plane_data_rate)

-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 14:32     ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18 14:32 UTC (permalink / raw)
  To: intel-gfx, jani.nikula

On Mon, 2019-11-18 at 16:22 +0200, Jani Nikula wrote:
> On Thu, 14 Nov 2019, Stanislav Lisovskiy <
> stanislav.lisovskiy@intel.com> wrote:
> > Also implemented algorithm for choosing DBuf slice configuration
> > based on active pipes, pipe ratio as stated in BSpec 12716.
> > 
> > Now pipe allocation still stays proportional to pipe width as
> > before,
> > however within allowed DBuf slice for this particular
> > configuration.
> > 
> > v2: Remove unneeded check from commit as ddb enabled slices might
> >     now differ from hw state.
> > 
> > v3: - Added new field "supported_slices" to ddb to separate max
> >       supported slices vs currently enabled, to avoid confusion.
> >     - Removed obsolete comments and code related to DBuf(Matthew
> > Roper).
> >     - Some code style and long line removal(Matthew Roper).
> >     - Added WARN_ON to new ddb range offset calc function(Matthew
> > Roper).
> >     - Removed platform specific call to calc pipe ratio from ddb
> >       allocation function and fixed the return value(Matthew Roper)
> >     - Refactored DBUF slice allocation table to improve readability
> >     - Added DBUF slice allocation for TGL as well.
> >     - ICL(however not TGL) seems to voluntarily enable second DBuf
> > slice
> >       after pm suspend/resume causing a mismatch failure, because
> > we
> >       update DBuf slices only if we do a modeset, however this
> > check
> >       is done always. Fixed it to be done only when modeset for
> > ICL.
> > 
> > v4: - Now enabled slices is not actually a number, but a bitmask,
> >       because we might need to enable second slice only and number
> >       of slices would still 1 and that current code doesn't allow.
> >     - Remove redundant duplicate code to have some unified way of
> >       enabling dbuf slices instead of hardcoding.
> > 
> > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > thinking
> >       that we have only one DBUF_CTL slice. Now another version for
> >       gen11 will check both slices as only second one can be
> > enabled,
> >       to keep CI happy.
> > 
> > v6: - Removed enabled dbuf assertion completely. Previous code
> >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > called.
> >       Currently we enable/disable dbuf slices based on requrement
> >       so no point in asserting this here.
> >     - Removed unnecessary modeset check from
> > verify_wm_state(Matthew Roper)
> >     - Slices intersection after union is same as final
> > result(Matthew Roper)
> >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> >     - Some macros added(Matthew Roper)
> >     - ddb range is now always less or equal to ddb size - no need
> > for additional
> >       checks(previously needed as we had some bandwidth checks in
> > there which
> >       could "suddenly" return ddb_size smaller than it is.
> > 
> > v7: Minor costemic changes:
> >     - Changed icl_dbuf_slices_restore name to
> > icl_program_dbuf_slices
> >       as it more corresponds to the actual functionality.
> >     - Some simplification with supported slices for BXT and GLK
> >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> >       this is not used for skl anymore.
> >     - Changed DBuf slice assignment order for TGL case
> > 
> > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > Cc: James Ausmus <james.ausmus@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> >  .../drm/i915/display/intel_display_power.h    |   2 +
> >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > ++++++++++++++++--
> >  9 files changed, 431 insertions(+), 104 deletions(-)
> 
> This is a rather big patch, and that's one of the few things we can't
> fix after committing.
> 
> If there's a bug, by our rate of change this turns practically
> un-revertible in a matter of weeks due to conflicts. If there's a
> bisected regression, and this is the regressing commit, what are our
> chances of identifying the bug based on the sha1 alone? Can't revert
> and
> can't immediately spot the bug/fix is a bad combination to have.
> 
> Also we have this nice CI infrastructure reporting about valid
> checkpatch and sparse issues for several versions of the patch.
> Please
> fix.
> 
> 
> BR,
> Jani.

I'm fine with fixing checkpatch and sparse failures(even though
most of those are about 80 line width limitation violation or line
continuations, which tbh are anyway quite common in our code). 

One thing I don't get is why this should cause any regression, as both
BAT and IGT are green here for two runs in a row.

What is the point of having a CI then.

Best Regards,

Lisovskiy Stanislav

> 
> 
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > b/drivers/gpu/drm/i915/display/intel_display.c
> > index 876fc25968bf..bd7aff675198 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > intel_crtc *crtc,
> >  
> >  	if (INTEL_GEN(dev_priv) >= 11 &&
> >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got
> > %u)\n",
> > +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got
> > %x)\n",
> >  			  sw_ddb->enabled_slices,
> >  			  hw->ddb.enabled_slices);
> >  
> > @@ -14614,15 +14614,24 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> >  	u8 required_slices = state->wm_results.ddb.enabled_slices;
> >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > +	u8 slices_union = hw_enabled_slices | required_slices;
> > +	u8 slices_intersection = required_slices;
> >  
> >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > old_crtc_state, new_crtc_state, i)
> >  		/* ignore allocations for crtc's that have been turned
> > off. */
> >  		if (new_crtc_state->hw.active)
> >  			entries[i] = old_crtc_state->wm.skl.ddb;
> >  
> > -	/* If 2nd DBuf slice required, enable it here */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > +		      required_slices, hw_enabled_slices);
> > +
> > +	/*
> > +	 * If some other DBuf slice required, enable it here,
> > +	 * as here we only add more slices, but not remove to prevent
> > +	 * issues if somebody still uses those.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_union);
> >  
> >  	/*
> >  	 * Whenever the number of active pipes changes, we need to make
> > sure we
> > @@ -14681,9 +14690,12 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  		}
> >  	} while (progress);
> >  
> > -	/* If 2nd DBuf slice is no more required disable it */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	/*
> > +	 * If some other DBuf slice is not needed, disable it here,
> > +	 * as here we only remove slices, which are not needed anymore.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_intersection);
> >  }
> >  
> >  static void intel_atomic_helper_free_state(struct drm_i915_private
> > *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > index ce1b64f4dd44..a78bc9b61862 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > @@ -1031,15 +1031,6 @@ static bool
> > gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
> >  		(I915_READ(DC_STATE_EN) &
> > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> >  }
> >  
> > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > *dev_priv)
> > -{
> > -	u32 tmp = I915_READ(DBUF_CTL);
> > -
> > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> > -}
> > -
> >  static void gen9_disable_dc_states(struct drm_i915_private
> > *dev_priv)
> >  {
> >  	struct intel_cdclk_state cdclk_state = {};
> > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > drm_i915_private *dev_priv)
> >  	/* Can't read out voltage_level so can't use
> > intel_cdclk_changed() */
> >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > &cdclk_state));
> >  
> > -	gen9_assert_dbuf_enabled(dev_priv);
> > -
> >  	if (IS_GEN9_LP(dev_priv))
> >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> >  
> > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > drm_i915_private *dev_priv)
> >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> >  }
> >  
> > -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> >  {
> > -	if (INTEL_GEN(dev_priv) < 11)
> > -		return 1;
> > -	return 2;
> > +	return INTEL_INFO(dev_priv)->supported_slices;
> > +}
> > +
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> > +{
> > +	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> > +
> > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> >  }
> >  
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices)
> >  {
> > -	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> > -	bool ret;
> > +	bool ret = true;
> > +	int i;
> > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> >  
> > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
> >  		DRM_ERROR("Invalid number of dbuf slices requested\n");
> >  		return;
> >  	}
> >  
> > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > -		return;
> > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
> >  
> > -	if (req_slices > hw_enabled_slices)
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > true);
> > -	else
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > false);
> > +	for (i = 0; i < max_slices; i++) {
> > +		int slice_bit = BIT(i);
> > +		bool slice_set = (slice_bit & req_slices) != 0;
> > +
> > +		switch (slice_bit) {
> > +		case DBUF_S1_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S1,
> > +							  slice_set);
> > +			break;
> > +		case DBUF_S2_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S2,
> > +							  slice_set);
> > +			break;
> > +		default:
> > +			MISSING_CASE(slice_bit);
> > +		}
> > +	}
> >  
> >  	if (ret)
> >  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > drm_i915_private *dev_priv,
> >  
> >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power enable timeout\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that we only have 1 slice,
> > see
> > -		 * intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Just power up 1 slice, we will
> > +	 * figure out later which slices we have and what we need.
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > ~DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > ~DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power disable timeout!\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that the first slice is
> > always
> > -		 * enabled, see intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Disable all slices
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > index 1da04f3e0fb3..b7057d68ad78 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > drm_i915_private *i915,
> >  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
> >  	     intel_display_power_put_async((i915), (domain), (wf)),
> > (wf) = 0)
> >  
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices);
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
> >  
> >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> >  			     bool override, unsigned int mask);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 82e4e6bf08c3..96741e68633a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -51,6 +51,7 @@
> >  #include "display/intel_bw.h"
> >  #include "display/intel_cdclk.h"
> >  #include "display/intel_display_types.h"
> > +#include "display/intel_display_power.h"
> >  #include "display/intel_dp.h"
> >  #include "display/intel_fbdev.h"
> >  #include "display/intel_hotplug.h"
> > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > device *kdev)
> >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> >  		ret = vlv_resume_prepare(dev_priv, true);
> >  
> > +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices
> > reg */
> > +	if (INTEL_GEN(dev_priv) == 11)
> > +		icl_program_dbuf_slices(dev_priv);
> > +
> >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> >  
> >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 7e0f67babe20..a396977c9c2d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const
> > struct skl_ddb_entry *e1,
> >  }
> >  
> >  struct skl_ddb_allocation {
> > -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> > +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf
> > slices */
> >  };
> >  
> >  struct skl_ddb_values {
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > b/drivers/gpu/drm/i915/i915_pci.c
> > index 1bb701d32a5d..8e86af505730 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > intel_cherryview_info = {
> >  	.has_gt_uc = 1, \
> >  	.display.has_hdcp = 1, \
> >  	.display.has_ipc = 1, \
> > -	.ddb_size = 896
> > +	.ddb_size = 896, \
> > +	.supported_slices = 1
> >  
> >  #define SKL_PLATFORM \
> >  	GEN9_FEATURES, \
> > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > intel_skylake_gt4_info = {
> >  	GEN(9), \
> >  	.is_lp = 1, \
> >  	.display.has_hotplug = 1, \
> > +	.supported_slices = 1, \
> >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0),
> > \
> >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> >  	.has_64bit_reloc = 1, \
> > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > intel_coffeelake_gt3_info = {
> >  	GEN9_FEATURES, \
> >  	GEN(10), \
> >  	.ddb_size = 1024, \
> > +	.supported_slices = 1, \
> >  	.display.has_dsc = 1, \
> >  	.has_coherent_ggtt = false, \
> >  	GLK_COLORS
> > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > intel_cannonlake_info = {
> >  	}, \
> >  	GEN(11), \
> >  	.ddb_size = 2048, \
> > +	.supported_slices = 2, \
> >  	.has_logical_ring_elsq = 1, \
> >  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index a607ea520829..fba5731063d8 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -7734,9 +7734,11 @@ enum {
> >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> >  #define  DISP_IPC_ENABLE		(1 << 3)
> > -#define DBUF_CTL	_MMIO(0x45008)
> > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > +#define DBUF_S1_BIT			BIT(0)
> > +#define DBUF_S2_BIT			BIT(1)
> > +#define DBUF_CTL			_MMIO(0x45008)
> > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> >  #define  DBUF_POWER_REQUEST		(1 << 31)
> >  #define  DBUF_POWER_STATE		(1 << 30)
> >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > b/drivers/gpu/drm/i915/intel_device_info.h
> > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > @@ -180,6 +180,7 @@ struct intel_device_info {
> >  	} display;
> >  
> >  	u16 ddb_size; /* in blocks */
> > +	u8 supported_slices; /* number of DBuf slices */
> >  
> >  	/* Register offsets for the various display pipes and
> > transcoders */
> >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 2d389e437e87..1ea81ab92429 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device
> > *dev)
> >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> >  }
> >  
> > -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private
> > *dev_priv)
> > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > *dev_priv)
> >  {
> > -	u8 enabled_slices;
> > -
> > -	/* Slice 1 will always be enabled */
> > -	enabled_slices = 1;
> > +	u8 enabled_slices = 0;
> >  
> >  	/* Gen prior to GEN11 have only one DBuf slice */
> >  	if (INTEL_GEN(dev_priv) < 11)
> > -		return enabled_slices;
> > +		return DBUF_S1_BIT;
> >  
> > -	/*
> > -	 * FIXME: for now we'll only ever use 1 slice; pretend that we
> > have
> > -	 * only that 1 slice enabled until we have a proper way for on-
> > demand
> > -	 * toggling of the second slice.
> > -	 */
> > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > -		enabled_slices++;
> > +	/* Check if second DBuf slice is enabled */
> > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S1_BIT;
> > +
> > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S2_BIT;
> >  
> >  	return enabled_slices;
> >  }
> > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > drm_i915_private *dev_priv,
> >  			      const int num_active,
> >  			      struct skl_ddb_allocation *ddb)
> >  {
> > -	const struct drm_display_mode *adjusted_mode;
> > -	u64 total_data_bw;
> >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > -
> >  	WARN_ON(ddb_size == 0);
> >  
> >  	if (INTEL_GEN(dev_priv) < 11)
> >  		return ddb_size - 4; /* 4 blocks for bypass path
> > allocation */
> >  
> > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > -	total_data_bw = total_data_rate *
> > drm_mode_vrefresh(adjusted_mode);
> > +	return ddb_size;
> > +}
> >  
> > -	/*
> > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > -	 *
> > -	 * FIXME dbuf slice code is broken:
> > -	 * - must wait for planes to stop using the slice before
> > powering it off
> > -	 * - plane straddling both slices is illegal in multi-pipe
> > scenarios
> > -	 * - should validate we stay within the hw bandwidth limits
> > -	 */
> > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> > -		ddb->enabled_slices = 2;
> > -	} else {
> > -		ddb->enabled_slices = 1;
> > -		ddb_size /= 2;
> > -	}
> > +/*
> > + * Calculate initial DBuf slice offset, based on slice size
> > + * and mask(i.e if slice size is 1024 and second slice is enabled
> > + * offset would be 1024)
> > + */
> > +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > +					   u32 slice_size, u32
> > ddb_size)
> > +{
> > +	u32 offset = 0;
> >  
> > -	return ddb_size;
> > +	if (!dbuf_slice_mask)
> > +		return 0;
> > +
> > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > +
> > +	WARN_ON(offset >= ddb_size);
> > +	return offset;
> >  }
> >  
> > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state);
> > +
> >  static void
> >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > *dev_priv,
> >  				   const struct intel_crtc_state
> > *crtc_state,
> > @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
> >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> >  	u16 ddb_size;
> > +	u32 ddb_range_size;
> >  	u32 i;
> > +	u32 dbuf_slice_mask;
> > +	u32 active_pipes;
> > +	u32 offset;
> > +	u32 slice_size;
> > +	u32 total_slice_mask;
> > +	u32 start, end;
> >  
> >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> >  		alloc->start = 0;
> > @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > -	if (intel_state->active_pipe_changes)
> > +	if (intel_state->active_pipe_changes) {
> >  		*num_active = hweight8(intel_state->active_pipes);
> > -	else
> > +		active_pipes = intel_state->active_pipes;
> > +	} else {
> >  		*num_active = hweight8(dev_priv->active_pipes);
> > +		active_pipes = dev_priv->active_pipes;
> > +	}
> >  
> >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > total_data_rate,
> >  				      *num_active, ddb);
> >  
> > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > +
> > +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> > +
> > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > +
> >  	/*
> >  	 * If the state doesn't change the active CRTC's or there is no
> >  	 * modeset request, then there's no need to recalculate;
> > @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > +	/*
> > +	 * Get allowed DBuf slices for correspondent pipe and platform.
> > +	 */
> > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > for_pipe,
> > +						     active_pipes,
> > crtc_state);
> > +
> > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> > +		      dbuf_slice_mask,
> > +		      for_pipe, active_pipes);
> > +
> > +	/*
> > +	 * Figure out at which DBuf slice we start, i.e if we start at
> > Dbuf S2
> > +	 * and slice size is 1024, the offset would be 1024
> > +	 */
> > +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > +						 slice_size, ddb_size);
> > +
> > +	/*
> > +	 * Figure out total size of allowed DBuf slices, which is
> > basically
> > +	 * a number of allowed slices for that pipe multiplied by slice
> > size.
> > +	 * Inside of this
> > +	 * range ddb entries are still allocated in proportion to
> > display width.
> > +	 */
> > +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> > +
> >  	/*
> >  	 * Watermark/ddb requirement highly depends upon width of the
> >  	 * framebuffer, So instead of allocating DDB equally among
> > pipes
> >  	 * distribute DDB based on resolution/width of the display.
> >  	 */
> > +	total_slice_mask = dbuf_slice_mask;
> >  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state,
> > i) {
> >  		const struct drm_display_mode *adjusted_mode =
> >  			&crtc_state->hw.adjusted_mode;
> >  		enum pipe pipe = crtc->pipe;
> >  		int hdisplay, vdisplay;
> > +		u32 pipe_dbuf_slice_mask = \
> > +			i915_get_allowed_dbuf_mask(dev_priv,
> > +						pipe,
> > +						active_pipes,
> > +						crtc_state);
> >  
> >  		if (!crtc_state->hw.enable)
> >  			continue;
> >  
> > +		/*
> > +		 * According to BSpec pipe can share one dbuf slice
> > with another
> > +		 * pipes or pipe can use multiple dbufs, in both cases
> > we
> > +		 * account for other pipes only if they have exactly
> > same mask.
> > +		 * However we need to account how many slices we should
> > enable
> > +		 * in total.
> > +		 */
> > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > +
> > +		/*
> > +		 * Do not account pipes using other slice sets
> > +		 * luckily as of current BSpec slice sets do not
> > partially
> > +		 * intersect(pipes share either same one slice or same
> > slice set
> > +		 * i.e no partial intersection), so it is enough to
> > check for
> > +		 * equality for now.
> > +		 */
> > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > +			continue;
> > +
> >  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay,
> > &vdisplay);
> >  		total_width += hdisplay;
> >  
> > @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  			pipe_width = hdisplay;
> >  	}
> >  
> > -	alloc->start = ddb_size * width_before_pipe / total_width;
> > -	alloc->end = ddb_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +	ddb->enabled_slices = total_slice_mask;
> > +
> > +	start = ddb_range_size * width_before_pipe / total_width;
> > +	end = ddb_range_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +
> > +	alloc->start = offset + start;
> > +	alloc->end = offset + end;
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > +		      alloc->start, alloc->end);
> > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> > +		      ddb->enabled_slices,
> > +		      INTEL_INFO(dev_priv)->supported_slices);
> >  }
> >  
> >  static int skl_compute_wm_params(const struct intel_crtc_state
> > *crtc_state,
> > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > intel_crtc *crtc,
> >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> >  			  struct skl_ddb_allocation *ddb /* out */)
> >  {
> > -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> > +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > >enabled_slices);
> >  }
> >  
> >  /*
> > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > intel_crtc_state *crtc_state,
> >  	return total_data_rate;
> >  }
> >  
> > +static uint_fixed_16_16_t
> > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	u32 src_w, src_h, dst_w, dst_h;
> > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > +	const struct drm_display_mode *adjusted_mode = &crtc_state-
> > >hw.adjusted_mode;
> > +
> > +	src_w = crtc_state->pipe_src_w;
> > +	src_h = crtc_state->pipe_src_h;
> > +	dst_w = adjusted_mode->crtc_hdisplay;
> > +	dst_h = adjusted_mode->crtc_vdisplay;
> > +
> > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> > +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> > +
> > +	return mul_fixed16(downscale_w, downscale_h);
> > +}
> > +
> > +uint_fixed_16_16_t
> > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > +{
> > +	struct drm_plane *plane;
> > +	const struct drm_plane_state *drm_plane_state;
> > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > +
> > +	if (!crtc_state->uapi.enable)
> > +		return max_downscale;
> > +
> > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > drm_plane_state, &crtc_state->uapi) {
> > +		uint_fixed_16_16_t plane_downscale;
> > +		const struct intel_plane_state *plane_state =
> > +			to_intel_plane_state(drm_plane_state);
> > +
> > +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> > +			continue;
> > +
> > +		plane_downscale =
> > skl_plane_downscale_amount(crtc_state, plane_state);
> > +
> > +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> > +			      to_intel_plane(plane)->id,
> > +			      plane_downscale.val >> 16,
> > +			      plane_downscale.val & 0xffff);
> > +
> > +		max_downscale = max_fixed16(plane_downscale,
> > max_downscale);
> > +	}
> > +
> > +
> > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > +
> > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > +		       crtc->pipe, pipe_downscale.val >> 16,
> > +		       pipe_downscale.val & 0xffff);
> > +
> > +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> > +
> > +	/*
> > +	 * Convert result to percentage: get the actual ratio instead
> > of rate,
> > +	 * then multiply by 100 to get percentage.
> > +	 */
> > +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1,
> > pipe_downscale));
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> > +
> > +	return pipe_ratio;
> > +}
> > +
> > +struct dbuf_slice_conf_entry {
> > +	u32 dbuf_mask[I915_MAX_PIPES];
> > +	u32 active_pipes;
> > +};
> > +
> > +
> > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> > +					  BIT(PIPE_C) | BIT(PIPE_D) }
> > +
> > +/*
> > + * Table taken from Bspec 12716
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +};
> > +
> > +/*
> > + * Table taken from Bspec 49255
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT,
> > DBUF_S2_BIT),
> > +};
> > +
> > +/*
> > + * This function finds an entry with same enabled pipe
> > configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +	uint_fixed_16_16_t pipe_ratio;
> > +
> > +	/*
> > +	 * Calculate pipe ratio as stated in BSpec 28692
> > +	 */
> > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > +
> > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > +		if (icl_allowed_dbufs[i].active_pipes == active_pipes)
> > {
> > +			if (hweight32(active_pipes) == 1) {
> > +				/*
> > +				 * According to BSpec 12716: if ratio
> > >= 88.8
> > +				 * always use single dbuf slice.
> > +				 */
> > +				if (pipe_ratio.val >= div_fixed16(888,
> > 10).val)
> > +					++i;
> > +			}
> > +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> > +		}
> > +	}
> > +	return 0;
> > +}
> > +
> > +/*
> > + * This function finds an entry with same enabled pipe
> > configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> > +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > +	}
> > +	return 0;
> > +}
> > +
> > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	if (IS_GEN(dev_priv, 11))
> > +		return icl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	else if (IS_GEN(dev_priv, 12))
> > +		return tgl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	/*
> > +	 * For anything else just return one slice yet.
> > +	 * Should be extended for other platforms.
> > +	 */
> > +	return DBUF_S1_BIT;
> > +}
> > +
> >  static u64
> >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > *crtc_state,
> >  				 u64 *plane_data_rate)
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 14:32     ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18 14:32 UTC (permalink / raw)
  To: intel-gfx, jani.nikula

On Mon, 2019-11-18 at 16:22 +0200, Jani Nikula wrote:
> On Thu, 14 Nov 2019, Stanislav Lisovskiy <
> stanislav.lisovskiy@intel.com> wrote:
> > Also implemented algorithm for choosing DBuf slice configuration
> > based on active pipes, pipe ratio as stated in BSpec 12716.
> > 
> > Now pipe allocation still stays proportional to pipe width as
> > before,
> > however within allowed DBuf slice for this particular
> > configuration.
> > 
> > v2: Remove unneeded check from commit as ddb enabled slices might
> >     now differ from hw state.
> > 
> > v3: - Added new field "supported_slices" to ddb to separate max
> >       supported slices vs currently enabled, to avoid confusion.
> >     - Removed obsolete comments and code related to DBuf(Matthew
> > Roper).
> >     - Some code style and long line removal(Matthew Roper).
> >     - Added WARN_ON to new ddb range offset calc function(Matthew
> > Roper).
> >     - Removed platform specific call to calc pipe ratio from ddb
> >       allocation function and fixed the return value(Matthew Roper)
> >     - Refactored DBUF slice allocation table to improve readability
> >     - Added DBUF slice allocation for TGL as well.
> >     - ICL(however not TGL) seems to voluntarily enable second DBuf
> > slice
> >       after pm suspend/resume causing a mismatch failure, because
> > we
> >       update DBuf slices only if we do a modeset, however this
> > check
> >       is done always. Fixed it to be done only when modeset for
> > ICL.
> > 
> > v4: - Now enabled slices is not actually a number, but a bitmask,
> >       because we might need to enable second slice only and number
> >       of slices would still 1 and that current code doesn't allow.
> >     - Remove redundant duplicate code to have some unified way of
> >       enabling dbuf slices instead of hardcoding.
> > 
> > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > thinking
> >       that we have only one DBUF_CTL slice. Now another version for
> >       gen11 will check both slices as only second one can be
> > enabled,
> >       to keep CI happy.
> > 
> > v6: - Removed enabled dbuf assertion completely. Previous code
> >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > called.
> >       Currently we enable/disable dbuf slices based on requrement
> >       so no point in asserting this here.
> >     - Removed unnecessary modeset check from
> > verify_wm_state(Matthew Roper)
> >     - Slices intersection after union is same as final
> > result(Matthew Roper)
> >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> >     - Some macros added(Matthew Roper)
> >     - ddb range is now always less or equal to ddb size - no need
> > for additional
> >       checks(previously needed as we had some bandwidth checks in
> > there which
> >       could "suddenly" return ddb_size smaller than it is.
> > 
> > v7: Minor costemic changes:
> >     - Changed icl_dbuf_slices_restore name to
> > icl_program_dbuf_slices
> >       as it more corresponds to the actual functionality.
> >     - Some simplification with supported slices for BXT and GLK
> >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> >       this is not used for skl anymore.
> >     - Changed DBuf slice assignment order for TGL case
> > 
> > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > Cc: James Ausmus <james.ausmus@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> >  .../drm/i915/display/intel_display_power.h    |   2 +
> >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > ++++++++++++++++--
> >  9 files changed, 431 insertions(+), 104 deletions(-)
> 
> This is a rather big patch, and that's one of the few things we can't
> fix after committing.
> 
> If there's a bug, by our rate of change this turns practically
> un-revertible in a matter of weeks due to conflicts. If there's a
> bisected regression, and this is the regressing commit, what are our
> chances of identifying the bug based on the sha1 alone? Can't revert
> and
> can't immediately spot the bug/fix is a bad combination to have.
> 
> Also we have this nice CI infrastructure reporting about valid
> checkpatch and sparse issues for several versions of the patch.
> Please
> fix.
> 
> 
> BR,
> Jani.

I'm fine with fixing checkpatch and sparse failures(even though
most of those are about 80 line width limitation violation or line
continuations, which tbh are anyway quite common in our code). 

One thing I don't get is why this should cause any regression, as both
BAT and IGT are green here for two runs in a row.

What is the point of having a CI then.

Best Regards,

Lisovskiy Stanislav

> 
> 
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > b/drivers/gpu/drm/i915/display/intel_display.c
> > index 876fc25968bf..bd7aff675198 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > intel_crtc *crtc,
> >  
> >  	if (INTEL_GEN(dev_priv) >= 11 &&
> >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got
> > %u)\n",
> > +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got
> > %x)\n",
> >  			  sw_ddb->enabled_slices,
> >  			  hw->ddb.enabled_slices);
> >  
> > @@ -14614,15 +14614,24 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> >  	u8 required_slices = state->wm_results.ddb.enabled_slices;
> >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > +	u8 slices_union = hw_enabled_slices | required_slices;
> > +	u8 slices_intersection = required_slices;
> >  
> >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > old_crtc_state, new_crtc_state, i)
> >  		/* ignore allocations for crtc's that have been turned
> > off. */
> >  		if (new_crtc_state->hw.active)
> >  			entries[i] = old_crtc_state->wm.skl.ddb;
> >  
> > -	/* If 2nd DBuf slice required, enable it here */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > +		      required_slices, hw_enabled_slices);
> > +
> > +	/*
> > +	 * If some other DBuf slice required, enable it here,
> > +	 * as here we only add more slices, but not remove to prevent
> > +	 * issues if somebody still uses those.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_union);
> >  
> >  	/*
> >  	 * Whenever the number of active pipes changes, we need to make
> > sure we
> > @@ -14681,9 +14690,12 @@ static void
> > skl_commit_modeset_enables(struct intel_atomic_state *state)
> >  		}
> >  	} while (progress);
> >  
> > -	/* If 2nd DBuf slice is no more required disable it */
> > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > hw_enabled_slices)
> > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > +	/*
> > +	 * If some other DBuf slice is not needed, disable it here,
> > +	 * as here we only remove slices, which are not needed anymore.
> > +	 */
> > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > hw_enabled_slices)
> > +		icl_dbuf_slices_update(dev_priv, slices_intersection);
> >  }
> >  
> >  static void intel_atomic_helper_free_state(struct drm_i915_private
> > *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > index ce1b64f4dd44..a78bc9b61862 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > @@ -1031,15 +1031,6 @@ static bool
> > gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
> >  		(I915_READ(DC_STATE_EN) &
> > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> >  }
> >  
> > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > *dev_priv)
> > -{
> > -	u32 tmp = I915_READ(DBUF_CTL);
> > -
> > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> > -}
> > -
> >  static void gen9_disable_dc_states(struct drm_i915_private
> > *dev_priv)
> >  {
> >  	struct intel_cdclk_state cdclk_state = {};
> > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > drm_i915_private *dev_priv)
> >  	/* Can't read out voltage_level so can't use
> > intel_cdclk_changed() */
> >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > &cdclk_state));
> >  
> > -	gen9_assert_dbuf_enabled(dev_priv);
> > -
> >  	if (IS_GEN9_LP(dev_priv))
> >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> >  
> > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > drm_i915_private *dev_priv)
> >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> >  }
> >  
> > -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> >  {
> > -	if (INTEL_GEN(dev_priv) < 11)
> > -		return 1;
> > -	return 2;
> > +	return INTEL_INFO(dev_priv)->supported_slices;
> > +}
> > +
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> > +{
> > +	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> > +
> > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> >  }
> >  
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices)
> >  {
> > -	const u8 hw_enabled_slices = dev_priv-
> > >wm.skl_hw.ddb.enabled_slices;
> > -	bool ret;
> > +	bool ret = true;
> > +	int i;
> > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> >  
> > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
> >  		DRM_ERROR("Invalid number of dbuf slices requested\n");
> >  		return;
> >  	}
> >  
> > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > -		return;
> > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
> >  
> > -	if (req_slices > hw_enabled_slices)
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > true);
> > -	else
> > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > false);
> > +	for (i = 0; i < max_slices; i++) {
> > +		int slice_bit = BIT(i);
> > +		bool slice_set = (slice_bit & req_slices) != 0;
> > +
> > +		switch (slice_bit) {
> > +		case DBUF_S1_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S1,
> > +							  slice_set);
> > +			break;
> > +		case DBUF_S2_BIT:
> > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > +							  DBUF_CTL_S2,
> > +							  slice_set);
> > +			break;
> > +		default:
> > +			MISSING_CASE(slice_bit);
> > +		}
> > +	}
> >  
> >  	if (ret)
> >  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > drm_i915_private *dev_priv,
> >  
> >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power enable timeout\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that we only have 1 slice,
> > see
> > -		 * intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Just power up 1 slice, we will
> > +	 * figure out later which slices we have and what we need.
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
> >  {
> > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > ~DBUF_POWER_REQUEST);
> > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > ~DBUF_POWER_REQUEST);
> > -	POSTING_READ(DBUF_CTL_S2);
> > -
> > -	udelay(10);
> > -
> > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > -		DRM_ERROR("DBuf power disable timeout!\n");
> > -	else
> > -		/*
> > -		 * FIXME: for now pretend that the first slice is
> > always
> > -		 * enabled, see intel_enabled_dbuf_slices_num().
> > -		 */
> > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > +	/*
> > +	 * Disable all slices
> > +	 */
> > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > +	icl_program_dbuf_slices(dev_priv);
> >  }
> >  
> >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > index 1da04f3e0fb3..b7057d68ad78 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > drm_i915_private *i915,
> >  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
> >  	     intel_display_power_put_async((i915), (domain), (wf)),
> > (wf) = 0)
> >  
> > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> >  			    u8 req_slices);
> > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
> >  
> >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> >  			     bool override, unsigned int mask);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 82e4e6bf08c3..96741e68633a 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -51,6 +51,7 @@
> >  #include "display/intel_bw.h"
> >  #include "display/intel_cdclk.h"
> >  #include "display/intel_display_types.h"
> > +#include "display/intel_display_power.h"
> >  #include "display/intel_dp.h"
> >  #include "display/intel_fbdev.h"
> >  #include "display/intel_hotplug.h"
> > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > device *kdev)
> >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> >  		ret = vlv_resume_prepare(dev_priv, true);
> >  
> > +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices
> > reg */
> > +	if (INTEL_GEN(dev_priv) == 11)
> > +		icl_program_dbuf_slices(dev_priv);
> > +
> >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> >  
> >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 7e0f67babe20..a396977c9c2d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const
> > struct skl_ddb_entry *e1,
> >  }
> >  
> >  struct skl_ddb_allocation {
> > -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> > +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf
> > slices */
> >  };
> >  
> >  struct skl_ddb_values {
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > b/drivers/gpu/drm/i915/i915_pci.c
> > index 1bb701d32a5d..8e86af505730 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > intel_cherryview_info = {
> >  	.has_gt_uc = 1, \
> >  	.display.has_hdcp = 1, \
> >  	.display.has_ipc = 1, \
> > -	.ddb_size = 896
> > +	.ddb_size = 896, \
> > +	.supported_slices = 1
> >  
> >  #define SKL_PLATFORM \
> >  	GEN9_FEATURES, \
> > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > intel_skylake_gt4_info = {
> >  	GEN(9), \
> >  	.is_lp = 1, \
> >  	.display.has_hotplug = 1, \
> > +	.supported_slices = 1, \
> >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0),
> > \
> >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> >  	.has_64bit_reloc = 1, \
> > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > intel_coffeelake_gt3_info = {
> >  	GEN9_FEATURES, \
> >  	GEN(10), \
> >  	.ddb_size = 1024, \
> > +	.supported_slices = 1, \
> >  	.display.has_dsc = 1, \
> >  	.has_coherent_ggtt = false, \
> >  	GLK_COLORS
> > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > intel_cannonlake_info = {
> >  	}, \
> >  	GEN(11), \
> >  	.ddb_size = 2048, \
> > +	.supported_slices = 2, \
> >  	.has_logical_ring_elsq = 1, \
> >  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index a607ea520829..fba5731063d8 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -7734,9 +7734,11 @@ enum {
> >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> >  #define  DISP_IPC_ENABLE		(1 << 3)
> > -#define DBUF_CTL	_MMIO(0x45008)
> > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > +#define DBUF_S1_BIT			BIT(0)
> > +#define DBUF_S2_BIT			BIT(1)
> > +#define DBUF_CTL			_MMIO(0x45008)
> > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> >  #define  DBUF_POWER_REQUEST		(1 << 31)
> >  #define  DBUF_POWER_STATE		(1 << 30)
> >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > b/drivers/gpu/drm/i915/intel_device_info.h
> > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > @@ -180,6 +180,7 @@ struct intel_device_info {
> >  	} display;
> >  
> >  	u16 ddb_size; /* in blocks */
> > +	u8 supported_slices; /* number of DBuf slices */
> >  
> >  	/* Register offsets for the various display pipes and
> > transcoders */
> >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > b/drivers/gpu/drm/i915/intel_pm.c
> > index 2d389e437e87..1ea81ab92429 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device
> > *dev)
> >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> >  }
> >  
> > -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private
> > *dev_priv)
> > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > *dev_priv)
> >  {
> > -	u8 enabled_slices;
> > -
> > -	/* Slice 1 will always be enabled */
> > -	enabled_slices = 1;
> > +	u8 enabled_slices = 0;
> >  
> >  	/* Gen prior to GEN11 have only one DBuf slice */
> >  	if (INTEL_GEN(dev_priv) < 11)
> > -		return enabled_slices;
> > +		return DBUF_S1_BIT;
> >  
> > -	/*
> > -	 * FIXME: for now we'll only ever use 1 slice; pretend that we
> > have
> > -	 * only that 1 slice enabled until we have a proper way for on-
> > demand
> > -	 * toggling of the second slice.
> > -	 */
> > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > -		enabled_slices++;
> > +	/* Check if second DBuf slice is enabled */
> > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S1_BIT;
> > +
> > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > +		enabled_slices |= DBUF_S2_BIT;
> >  
> >  	return enabled_slices;
> >  }
> > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > drm_i915_private *dev_priv,
> >  			      const int num_active,
> >  			      struct skl_ddb_allocation *ddb)
> >  {
> > -	const struct drm_display_mode *adjusted_mode;
> > -	u64 total_data_bw;
> >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > -
> >  	WARN_ON(ddb_size == 0);
> >  
> >  	if (INTEL_GEN(dev_priv) < 11)
> >  		return ddb_size - 4; /* 4 blocks for bypass path
> > allocation */
> >  
> > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > -	total_data_bw = total_data_rate *
> > drm_mode_vrefresh(adjusted_mode);
> > +	return ddb_size;
> > +}
> >  
> > -	/*
> > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > -	 *
> > -	 * FIXME dbuf slice code is broken:
> > -	 * - must wait for planes to stop using the slice before
> > powering it off
> > -	 * - plane straddling both slices is illegal in multi-pipe
> > scenarios
> > -	 * - should validate we stay within the hw bandwidth limits
> > -	 */
> > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> > -		ddb->enabled_slices = 2;
> > -	} else {
> > -		ddb->enabled_slices = 1;
> > -		ddb_size /= 2;
> > -	}
> > +/*
> > + * Calculate initial DBuf slice offset, based on slice size
> > + * and mask(i.e if slice size is 1024 and second slice is enabled
> > + * offset would be 1024)
> > + */
> > +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > +					   u32 slice_size, u32
> > ddb_size)
> > +{
> > +	u32 offset = 0;
> >  
> > -	return ddb_size;
> > +	if (!dbuf_slice_mask)
> > +		return 0;
> > +
> > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > +
> > +	WARN_ON(offset >= ddb_size);
> > +	return offset;
> >  }
> >  
> > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state);
> > +
> >  static void
> >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > *dev_priv,
> >  				   const struct intel_crtc_state
> > *crtc_state,
> > @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
> >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> >  	u16 ddb_size;
> > +	u32 ddb_range_size;
> >  	u32 i;
> > +	u32 dbuf_slice_mask;
> > +	u32 active_pipes;
> > +	u32 offset;
> > +	u32 slice_size;
> > +	u32 total_slice_mask;
> > +	u32 start, end;
> >  
> >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> >  		alloc->start = 0;
> > @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > -	if (intel_state->active_pipe_changes)
> > +	if (intel_state->active_pipe_changes) {
> >  		*num_active = hweight8(intel_state->active_pipes);
> > -	else
> > +		active_pipes = intel_state->active_pipes;
> > +	} else {
> >  		*num_active = hweight8(dev_priv->active_pipes);
> > +		active_pipes = dev_priv->active_pipes;
> > +	}
> >  
> >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > total_data_rate,
> >  				      *num_active, ddb);
> >  
> > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > +
> > +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> > +
> > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > +
> >  	/*
> >  	 * If the state doesn't change the active CRTC's or there is no
> >  	 * modeset request, then there's no need to recalculate;
> > @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  		return;
> >  	}
> >  
> > +	/*
> > +	 * Get allowed DBuf slices for correspondent pipe and platform.
> > +	 */
> > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > for_pipe,
> > +						     active_pipes,
> > crtc_state);
> > +
> > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> > +		      dbuf_slice_mask,
> > +		      for_pipe, active_pipes);
> > +
> > +	/*
> > +	 * Figure out at which DBuf slice we start, i.e if we start at
> > Dbuf S2
> > +	 * and slice size is 1024, the offset would be 1024
> > +	 */
> > +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > +						 slice_size, ddb_size);
> > +
> > +	/*
> > +	 * Figure out total size of allowed DBuf slices, which is
> > basically
> > +	 * a number of allowed slices for that pipe multiplied by slice
> > size.
> > +	 * Inside of this
> > +	 * range ddb entries are still allocated in proportion to
> > display width.
> > +	 */
> > +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> > +
> >  	/*
> >  	 * Watermark/ddb requirement highly depends upon width of the
> >  	 * framebuffer, So instead of allocating DDB equally among
> > pipes
> >  	 * distribute DDB based on resolution/width of the display.
> >  	 */
> > +	total_slice_mask = dbuf_slice_mask;
> >  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state,
> > i) {
> >  		const struct drm_display_mode *adjusted_mode =
> >  			&crtc_state->hw.adjusted_mode;
> >  		enum pipe pipe = crtc->pipe;
> >  		int hdisplay, vdisplay;
> > +		u32 pipe_dbuf_slice_mask = \
> > +			i915_get_allowed_dbuf_mask(dev_priv,
> > +						pipe,
> > +						active_pipes,
> > +						crtc_state);
> >  
> >  		if (!crtc_state->hw.enable)
> >  			continue;
> >  
> > +		/*
> > +		 * According to BSpec pipe can share one dbuf slice
> > with another
> > +		 * pipes or pipe can use multiple dbufs, in both cases
> > we
> > +		 * account for other pipes only if they have exactly
> > same mask.
> > +		 * However we need to account how many slices we should
> > enable
> > +		 * in total.
> > +		 */
> > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > +
> > +		/*
> > +		 * Do not account pipes using other slice sets
> > +		 * luckily as of current BSpec slice sets do not
> > partially
> > +		 * intersect(pipes share either same one slice or same
> > slice set
> > +		 * i.e no partial intersection), so it is enough to
> > check for
> > +		 * equality for now.
> > +		 */
> > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > +			continue;
> > +
> >  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay,
> > &vdisplay);
> >  		total_width += hdisplay;
> >  
> > @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct
> > drm_i915_private *dev_priv,
> >  			pipe_width = hdisplay;
> >  	}
> >  
> > -	alloc->start = ddb_size * width_before_pipe / total_width;
> > -	alloc->end = ddb_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +	ddb->enabled_slices = total_slice_mask;
> > +
> > +	start = ddb_range_size * width_before_pipe / total_width;
> > +	end = ddb_range_size * (width_before_pipe + pipe_width) /
> > total_width;
> > +
> > +	alloc->start = offset + start;
> > +	alloc->end = offset + end;
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > +		      alloc->start, alloc->end);
> > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> > +		      ddb->enabled_slices,
> > +		      INTEL_INFO(dev_priv)->supported_slices);
> >  }
> >  
> >  static int skl_compute_wm_params(const struct intel_crtc_state
> > *crtc_state,
> > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > intel_crtc *crtc,
> >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> >  			  struct skl_ddb_allocation *ddb /* out */)
> >  {
> > -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> > +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > >enabled_slices);
> >  }
> >  
> >  /*
> > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > intel_crtc_state *crtc_state,
> >  	return total_data_rate;
> >  }
> >  
> > +static uint_fixed_16_16_t
> > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	u32 src_w, src_h, dst_w, dst_h;
> > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > +	const struct drm_display_mode *adjusted_mode = &crtc_state-
> > >hw.adjusted_mode;
> > +
> > +	src_w = crtc_state->pipe_src_w;
> > +	src_h = crtc_state->pipe_src_h;
> > +	dst_w = adjusted_mode->crtc_hdisplay;
> > +	dst_h = adjusted_mode->crtc_vdisplay;
> > +
> > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> > +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> > +
> > +	return mul_fixed16(downscale_w, downscale_h);
> > +}
> > +
> > +uint_fixed_16_16_t
> > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > +{
> > +	struct drm_plane *plane;
> > +	const struct drm_plane_state *drm_plane_state;
> > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > +
> > +	if (!crtc_state->uapi.enable)
> > +		return max_downscale;
> > +
> > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > drm_plane_state, &crtc_state->uapi) {
> > +		uint_fixed_16_16_t plane_downscale;
> > +		const struct intel_plane_state *plane_state =
> > +			to_intel_plane_state(drm_plane_state);
> > +
> > +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> > +			continue;
> > +
> > +		plane_downscale =
> > skl_plane_downscale_amount(crtc_state, plane_state);
> > +
> > +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> > +			      to_intel_plane(plane)->id,
> > +			      plane_downscale.val >> 16,
> > +			      plane_downscale.val & 0xffff);
> > +
> > +		max_downscale = max_fixed16(plane_downscale,
> > max_downscale);
> > +	}
> > +
> > +
> > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > +
> > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > +		       crtc->pipe, pipe_downscale.val >> 16,
> > +		       pipe_downscale.val & 0xffff);
> > +
> > +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> > +
> > +	/*
> > +	 * Convert result to percentage: get the actual ratio instead
> > of rate,
> > +	 * then multiply by 100 to get percentage.
> > +	 */
> > +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1,
> > pipe_downscale));
> > +
> > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> > +
> > +	return pipe_ratio;
> > +}
> > +
> > +struct dbuf_slice_conf_entry {
> > +	u32 dbuf_mask[I915_MAX_PIPES];
> > +	u32 active_pipes;
> > +};
> > +
> > +
> > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_D) }
> > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > BIT(PIPE_C) }
> > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> > +					  BIT(PIPE_C) | BIT(PIPE_D) }
> > +
> > +/*
> > + * Table taken from Bspec 12716
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > < 88.8 */
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe
> > ratio >= 88.8 */
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +};
> > +
> > +/*
> > + * Table taken from Bspec 49255
> > + * Pipes do have some preferred DBuf slice affinity,
> > + * plus there are some hardcoded requirements on how
> > + * those should be distributed for multipipe scenarios.
> > + * For more DBuf slices algorithm can get even more messy
> > + * and less readable, so decided to use a table almost
> > + * as is from BSpec itself - that way it is at least easier
> > + * to compare, change and check.
> > + */
> > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > +{ { 0, 0, 0, 0 }, 0 },
> > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT,
> > DBUF_S2_BIT),
> > +};
> > +
> > +/*
> > + * This function finds an entry with same enabled pipe
> > configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +	uint_fixed_16_16_t pipe_ratio;
> > +
> > +	/*
> > +	 * Calculate pipe ratio as stated in BSpec 28692
> > +	 */
> > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > +
> > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > +		if (icl_allowed_dbufs[i].active_pipes == active_pipes)
> > {
> > +			if (hweight32(active_pipes) == 1) {
> > +				/*
> > +				 * According to BSpec 12716: if ratio
> > >= 88.8
> > +				 * always use single dbuf slice.
> > +				 */
> > +				if (pipe_ratio.val >= div_fixed16(888,
> > 10).val)
> > +					++i;
> > +			}
> > +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> > +		}
> > +	}
> > +	return 0;
> > +}
> > +
> > +/*
> > + * This function finds an entry with same enabled pipe
> > configuration and
> > + * returns correspondent DBuf slice mask as stated in BSpec for
> > particular
> > + * platform.
> > + */
> > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > +				     u32 active_pipes,
> > +				     const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	int i;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> > +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > +	}
> > +	return 0;
> > +}
> > +
> > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> > +				      int pipe, u32 active_pipes,
> > +				      const struct intel_crtc_state
> > *crtc_state)
> > +{
> > +	if (IS_GEN(dev_priv, 11))
> > +		return icl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	else if (IS_GEN(dev_priv, 12))
> > +		return tgl_get_allowed_dbuf_mask(pipe,
> > +						 active_pipes,
> > +						 crtc_state);
> > +	/*
> > +	 * For anything else just return one slice yet.
> > +	 * Should be extended for other platforms.
> > +	 */
> > +	return DBUF_S1_BIT;
> > +}
> > +
> >  static u64
> >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > *crtc_state,
> >  				 u64 *plane_data_rate)
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 15:26       ` Jani Nikula
  0 siblings, 0 replies; 34+ messages in thread
From: Jani Nikula @ 2019-11-18 15:26 UTC (permalink / raw)
  To: Lisovskiy, Stanislav, intel-gfx

On Mon, 18 Nov 2019, "Lisovskiy, Stanislav" <stanislav.lisovskiy@intel.com> wrote:
> I'm fine with fixing checkpatch and sparse failures(even though
> most of those are about 80 line width limitation violation or line
> continuations, which tbh are anyway quite common in our code). 

Please look through the warnings again. Only the COMMIT_LOG_LONG_LINE
one is to be ignored in this case. There's nothing else about 80
character line width. Line continuation is about using backslash to
continue to another line while that's only needed in macros.

Please let's just try to leave the code in neater shape after our
changes. We have these steps in CI so people wouldn't have to nitpick
about them separately.

> One thing I don't get is why this should cause any regression, as both
> BAT and IGT are green here for two runs in a row.
>
> What is the point of having a CI then.

Are you saying there's no point in having CI if it can't catch
absolutely everything?

Our CI is like every other CI everywhere; imperfect. Both in terms of
hardware and software coverage. We would have no bugs reported by the
community otherwise.


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 15:26       ` Jani Nikula
  0 siblings, 0 replies; 34+ messages in thread
From: Jani Nikula @ 2019-11-18 15:26 UTC (permalink / raw)
  To: Lisovskiy, Stanislav, intel-gfx

On Mon, 18 Nov 2019, "Lisovskiy, Stanislav" <stanislav.lisovskiy@intel.com> wrote:
> I'm fine with fixing checkpatch and sparse failures(even though
> most of those are about 80 line width limitation violation or line
> continuations, which tbh are anyway quite common in our code). 

Please look through the warnings again. Only the COMMIT_LOG_LONG_LINE
one is to be ignored in this case. There's nothing else about 80
character line width. Line continuation is about using backslash to
continue to another line while that's only needed in macros.

Please let's just try to leave the code in neater shape after our
changes. We have these steps in CI so people wouldn't have to nitpick
about them separately.

> One thing I don't get is why this should cause any regression, as both
> BAT and IGT are green here for two runs in a row.
>
> What is the point of having a CI then.

Are you saying there's no point in having CI if it can't catch
absolutely everything?

Our CI is like every other CI everywhere; imperfect. Both in terms of
hardware and software coverage. We would have no bugs reported by the
community otherwise.


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 15:27       ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-18 15:27 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy wrote:
> > > Also implemented algorithm for choosing DBuf slice configuration
> > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > 
> > > Now pipe allocation still stays proportional to pipe width as
> > > before,
> > > however within allowed DBuf slice for this particular
> > > configuration.
> > > 
> > > v2: Remove unneeded check from commit as ddb enabled slices might
> > >     now differ from hw state.
> > > 
> > > v3: - Added new field "supported_slices" to ddb to separate max
> > >       supported slices vs currently enabled, to avoid confusion.
> > >     - Removed obsolete comments and code related to DBuf(Matthew
> > > Roper).
> > >     - Some code style and long line removal(Matthew Roper).
> > >     - Added WARN_ON to new ddb range offset calc function(Matthew
> > > Roper).
> > >     - Removed platform specific call to calc pipe ratio from ddb
> > >       allocation function and fixed the return value(Matthew Roper)
> > >     - Refactored DBUF slice allocation table to improve readability
> > >     - Added DBUF slice allocation for TGL as well.
> > >     - ICL(however not TGL) seems to voluntarily enable second DBuf
> > > slice
> > >       after pm suspend/resume causing a mismatch failure, because
> > > we
> > >       update DBuf slices only if we do a modeset, however this
> > > check
> > >       is done always. Fixed it to be done only when modeset for
> > > ICL.
> > > 
> > > v4: - Now enabled slices is not actually a number, but a bitmask,
> > >       because we might need to enable second slice only and number
> > >       of slices would still 1 and that current code doesn't allow.
> > >     - Remove redundant duplicate code to have some unified way of
> > >       enabling dbuf slices instead of hardcoding.
> > > 
> > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > thinking
> > >       that we have only one DBUF_CTL slice. Now another version for
> > >       gen11 will check both slices as only second one can be
> > > enabled,
> > >       to keep CI happy.
> > > 
> > > v6: - Removed enabled dbuf assertion completely. Previous code
> > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > called.
> > >       Currently we enable/disable dbuf slices based on requrement
> > >       so no point in asserting this here.
> > >     - Removed unnecessary modeset check from
> > > verify_wm_state(Matthew Roper)
> > >     - Slices intersection after union is same as final
> > > result(Matthew Roper)
> > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > >     - Some macros added(Matthew Roper)
> > >     - ddb range is now always less or equal to ddb size - no need
> > > for additional
> > >       checks(previously needed as we had some bandwidth checks in
> > > there which
> > >       could "suddenly" return ddb_size smaller than it is.
> > > 
> > > v7: Minor costemic changes:
> > >     - Changed icl_dbuf_slices_restore name to
> > > icl_program_dbuf_slices
> > >       as it more corresponds to the actual functionality.
> > >     - Some simplification with supported slices for BXT and GLK
> > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > >       this is not used for skl anymore.
> > >     - Changed DBuf slice assignment order for TGL case
> > > 
> > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > Cc: James Ausmus <james.ausmus@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > ++++++++++++++++--
> > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > index 876fc25968bf..bd7aff675198 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > intel_crtc *crtc,
> > >  
> > >  	if (INTEL_GEN(dev_priv) >= 11 &&
> > >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got
> > > %u)\n",
> > > +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got
> > > %x)\n",
> > >  			  sw_ddb->enabled_slices,
> > >  			  hw->ddb.enabled_slices);
> > >  
> > > @@ -14614,15 +14614,24 @@ static void
> > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > >  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> > >  	u8 required_slices = state->wm_results.ddb.enabled_slices;
> > >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > +	u8 slices_union = hw_enabled_slices | required_slices;
> > > +	u8 slices_intersection = required_slices;
> > >  
> > >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > old_crtc_state, new_crtc_state, i)
> > >  		/* ignore allocations for crtc's that have been turned
> > > off. */
> > >  		if (new_crtc_state->hw.active)
> > >  			entries[i] = old_crtc_state->wm.skl.ddb;
> > >  
> > > -	/* If 2nd DBuf slice required, enable it here */
> > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > hw_enabled_slices)
> > > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > +		      required_slices, hw_enabled_slices);
> > > +
> > > +	/*
> > > +	 * If some other DBuf slice required, enable it here,
> > > +	 * as here we only add more slices, but not remove to prevent
> > > +	 * issues if somebody still uses those.
> > > +	 */
> > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > hw_enabled_slices)
> > > +		icl_dbuf_slices_update(dev_priv, slices_union);
> > >  
> > >  	/*
> > >  	 * Whenever the number of active pipes changes, we need to make
> > > sure we
> > > @@ -14681,9 +14690,12 @@ static void
> > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > >  		}
> > >  	} while (progress);
> > >  
> > > -	/* If 2nd DBuf slice is no more required disable it */
> > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > hw_enabled_slices)
> > > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > > +	/*
> > > +	 * If some other DBuf slice is not needed, disable it here,
> > > +	 * as here we only remove slices, which are not needed anymore.
> > > +	 */
> > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > hw_enabled_slices)
> > > +		icl_dbuf_slices_update(dev_priv, slices_intersection);
> > 
> > This is still a totally wrong place for this. We need to be sure the
> > planes have switched over to the new ddb layout before we power off
> > anything.
> 
> I have just modified already existing line here, not added by me.
> The point of this patch is starting to use second DBuf slice but
> not fixing all kinds of issues related to DBuf code at the same time.

You're re-enabling code I disabled on purpose. It has to be fixed
properly or not touched at all. In hindsight I should have probably
just ripped it all out to not give people bad ideas.

> 
> Of course this is nice idea to fix it - if that is wrong plane,
> but current code somehow lives with it and probably this should be
> with some other patch.
> 
> > 
> > >  }
> > >  
> > >  static void intel_atomic_helper_free_state(struct drm_i915_private
> > > *dev_priv)
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > index ce1b64f4dd44..a78bc9b61862 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > @@ -1031,15 +1031,6 @@ static bool
> > > gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
> > >  		(I915_READ(DC_STATE_EN) &
> > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > >  }
> > >  
> > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > *dev_priv)
> > > -{
> > > -	u32 tmp = I915_READ(DBUF_CTL);
> > > -
> > > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> > > -}
> > > -
> > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > *dev_priv)
> > >  {
> > >  	struct intel_cdclk_state cdclk_state = {};
> > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > drm_i915_private *dev_priv)
> > >  	/* Can't read out voltage_level so can't use
> > > intel_cdclk_changed() */
> > >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > &cdclk_state));
> > >  
> > > -	gen9_assert_dbuf_enabled(dev_priv);
> > > -
> > >  	if (IS_GEN9_LP(dev_priv))
> > >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> > >  
> > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > drm_i915_private *dev_priv)
> > >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > >  }
> > >  
> > > -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > >  {
> > > -	if (INTEL_GEN(dev_priv) < 11)
> > > -		return 1;
> > > -	return 2;
> > > +	return INTEL_INFO(dev_priv)->supported_slices;
> > > +}
> > 
> > Kinda pointless function now. 'supported_slices' is not a very good
> > name. Should be 'num_dbuf_slices' or something along those lines.
> 
> Supported slices really means how much slices we support in hardware.
> num_dbuf_slices will not reflect that, it will be more ambigious.
> I don't like this name.

supported_slices doesn't even say "dbuf" anywhere. So it's even more
ambiguous. Also we already have num_pipes, num_sprites, etc.

> 
> > 
> > > +
> > > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> > > +{
> > > +	const u8 hw_enabled_slices = dev_priv-
> > > >wm.skl_hw.ddb.enabled_slices;
> > 
> > Seems a rather pointless variable. const at least is pointless.
> > 
> > > +
> > > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > >  }
> > >  
> > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > >  			    u8 req_slices)
> > >  {
> > > -	const u8 hw_enabled_slices = dev_priv-
> > > >wm.skl_hw.ddb.enabled_slices;
> > > -	bool ret;
> > > +	bool ret = true;
> > > +	int i;
> > > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> > >  
> > > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
> > >  		DRM_ERROR("Invalid number of dbuf slices requested\n");
> > >  		return;
> > >  	}
> > >  
> > > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > > -		return;
> > > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
> > >  
> > > -	if (req_slices > hw_enabled_slices)
> > > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > > true);
> > > -	else
> > > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > > false);
> > > +	for (i = 0; i < max_slices; i++) {
> > > +		int slice_bit = BIT(i);
> > > +		bool slice_set = (slice_bit & req_slices) != 0;
> > > +
> > > +		switch (slice_bit) {
> > > +		case DBUF_S1_BIT:
> > > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > > +							  DBUF_CTL_S1,
> > > +							  slice_set);
> > > +			break;
> > > +		case DBUF_S2_BIT:
> > > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > > +							  DBUF_CTL_S2,
> > > +							  slice_set);
> > 
> > The return value on intel_dbuf_slice_set() seems pointless. I'd nuke
> > it.
> > 
> > Also as long you have intel_dbuf_slice_set() I'd probably just pass
> > the
> > slice index there and let it deal with the register details. DBUF_CTL
> > should probably be parametrized as well so you don't need annoying
> > switch
> > statements.
> 
> What if it fails? Shouldn't we still have some clue what state do we
> have/had at the moment?

It if fails you log an error and keep going.

> 
> > 
> > > +			break;
> > > +		default:
> > > +			MISSING_CASE(slice_bit);
> > > +		}
> > > +	}
> > >  
> > >  	if (ret)
> > >  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > drm_i915_private *dev_priv,
> > >  
> > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > >  {
> > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > DBUF_POWER_REQUEST);
> > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > DBUF_POWER_REQUEST);
> > > -	POSTING_READ(DBUF_CTL_S2);
> > > -
> > > -	udelay(10);
> > > -
> > > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > -		DRM_ERROR("DBuf power enable timeout\n");
> > > -	else
> > > -		/*
> > > -		 * FIXME: for now pretend that we only have 1 slice,
> > > see
> > > -		 * intel_enabled_dbuf_slices_num().
> > > -		 */
> > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > +	/*
> > > +	 * Just power up 1 slice, we will
> > > +	 * figure out later which slices we have and what we need.
> > > +	 */
> > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > 
> > That stuff really shouldn't be modified by low level functions.
> > The intial value should come from readout, and after that it should
> > be adjusted as part of the overall atomic state.
> 
> Well, as you can see currently this is even worse - has some weird
> hardcoded stuff and is modified like that. Also if we are enabling the 
> dbuf slices, shouldn't we really determine what value should be there?
> 
> Also I wouldn't really always trust the readout, as to me it seems
> icelake resets it constantly to 2 after suspend/resume. 

The point of readout is that we syncrhonize our software state with the
current state of hardware without having to modify the hardware state.
Once synchronized the driver takes over and tells the hw what to do.

> 
> > 
> > > +	icl_program_dbuf_slices(dev_priv);
> > >  }
> > >  
> > >  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
> > >  {
> > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > ~DBUF_POWER_REQUEST);
> > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > ~DBUF_POWER_REQUEST);
> > > -	POSTING_READ(DBUF_CTL_S2);
> > > -
> > > -	udelay(10);
> > > -
> > > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > -		DRM_ERROR("DBuf power disable timeout!\n");
> > > -	else
> > > -		/*
> > > -		 * FIXME: for now pretend that the first slice is
> > > always
> > > -		 * enabled, see intel_enabled_dbuf_slices_num().
> > > -		 */
> > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > +	/*
> > > +	 * Disable all slices
> > > +	 */
> > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > +	icl_program_dbuf_slices(dev_priv);
> > >  }
> > >  
> > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > drm_i915_private *i915,
> > >  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
> > >  	     intel_display_power_put_async((i915), (domain), (wf)),
> > > (wf) = 0)
> > >  
> > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > >  			    u8 req_slices);
> > > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
> > >  
> > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > >  			     bool override, unsigned int mask);
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c
> > > index 82e4e6bf08c3..96741e68633a 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -51,6 +51,7 @@
> > >  #include "display/intel_bw.h"
> > >  #include "display/intel_cdclk.h"
> > >  #include "display/intel_display_types.h"
> > > +#include "display/intel_display_power.h"
> > >  #include "display/intel_dp.h"
> > >  #include "display/intel_fbdev.h"
> > >  #include "display/intel_hotplug.h"
> > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > device *kdev)
> > >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > >  		ret = vlv_resume_prepare(dev_priv, true);
> > >  
> > > +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices
> > > reg */
> > > +	if (INTEL_GEN(dev_priv) == 11)
> > > +		icl_program_dbuf_slices(dev_priv);
> > > +
> > >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> > >  
> > >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 7e0f67babe20..a396977c9c2d 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const
> > > struct skl_ddb_entry *e1,
> > >  }
> > >  
> > >  struct skl_ddb_allocation {
> > 
> > This struct is quite pointless. Should be killed off.
> 
> Legacy stuff, not added by me. Quite fine with removing it - 
> but should we really do all this at the same time and in same
> patch? 

Not the same patch. But clean up the cobwebs first, then add the new
stuff. Much easier that way around.

> 
> > 
> > > -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> > > +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf
> > > slices */
> > >  };
> > >  
> > >  struct skl_ddb_values {
> > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > b/drivers/gpu/drm/i915/i915_pci.c
> > > index 1bb701d32a5d..8e86af505730 100644
> > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > intel_cherryview_info = {
> > >  	.has_gt_uc = 1, \
> > >  	.display.has_hdcp = 1, \
> > >  	.display.has_ipc = 1, \
> > > -	.ddb_size = 896
> > > +	.ddb_size = 896, \
> > > +	.supported_slices = 1
> > >  
> > >  #define SKL_PLATFORM \
> > >  	GEN9_FEATURES, \
> > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > intel_skylake_gt4_info = {
> > >  	GEN(9), \
> > >  	.is_lp = 1, \
> > >  	.display.has_hotplug = 1, \
> > > +	.supported_slices = 1, \
> > >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0),
> > > \
> > >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > >  	.has_64bit_reloc = 1, \
> > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > intel_coffeelake_gt3_info = {
> > >  	GEN9_FEATURES, \
> > >  	GEN(10), \
> > >  	.ddb_size = 1024, \
> > > +	.supported_slices = 1, \
> > >  	.display.has_dsc = 1, \
> > >  	.has_coherent_ggtt = false, \
> > >  	GLK_COLORS
> > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > intel_cannonlake_info = {
> > >  	}, \
> > >  	GEN(11), \
> > >  	.ddb_size = 2048, \
> > > +	.supported_slices = 2, \
> > >  	.has_logical_ring_elsq = 1, \
> > >  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > b/drivers/gpu/drm/i915/i915_reg.h
> > > index a607ea520829..fba5731063d8 100644
> > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > @@ -7734,9 +7734,11 @@ enum {
> > >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> > >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> > >  #define  DISP_IPC_ENABLE		(1 << 3)
> > > -#define DBUF_CTL	_MMIO(0x45008)
> > > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > > +#define DBUF_S1_BIT			BIT(0)
> > > +#define DBUF_S2_BIT			BIT(1)
> > 
> > These are not bits inside DBUF_CTL so they should not be here.
> > I'd just nuke them entirely. If you think we need more than an
> > int to refer to a slice then IMO something simple like
> > 
> > enum dbuf_slice {
> > 	DBUF_S1,
> > 	DBUF_S2,
> > };
> > 
> > would match the general apporach we use for other things much better
> > rather than defining some shifted values.
> 
> We need to combine slices - so bitmask is quite suitable for that, 
> I can of course use this enum, but then I'll have to re-encode it 
> to bitmask anyways.

Which is pretty much how everything else works in the driver.

> 
> 
> > 
> > > +#define DBUF_CTL			_MMIO(0x45008)
> > > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> > >  #define  DBUF_POWER_REQUEST		(1 << 31)
> > >  #define  DBUF_POWER_STATE		(1 << 30)
> > >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > >  	} display;
> > >  
> > >  	u16 ddb_size; /* in blocks */
> > > +	u8 supported_slices; /* number of DBuf slices */
> > >  
> > >  	/* Register offsets for the various display pipes and
> > > transcoders */
> > >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > b/drivers/gpu/drm/i915/intel_pm.c
> > > index 2d389e437e87..1ea81ab92429 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device
> > > *dev)
> > >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > >  }
> > >  
> > > -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private
> > > *dev_priv)
> > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > *dev_priv)
> > >  {
> > > -	u8 enabled_slices;
> > > -
> > > -	/* Slice 1 will always be enabled */
> > > -	enabled_slices = 1;
> > > +	u8 enabled_slices = 0;
> > >  
> > >  	/* Gen prior to GEN11 have only one DBuf slice */
> > >  	if (INTEL_GEN(dev_priv) < 11)
> > > -		return enabled_slices;
> > > +		return DBUF_S1_BIT;
> > 
> > I'd nuke this special case entirely and just read everything from the
> > hw. Assuming we need readout somewhere.
> 
> We have it that way hardcoded right now in drm-tip. I think this is a
> subject to other patch, but not this change. As I understand we need 
> to first figure out how to do it properly. 
> This patch anyway makes it better by at least removing hardcoded
> relation between number of slices and which slices are enabled.
> With current code there is no way for example to enable second DBuf
> slice only, which is required by BSpec.
> 
> > 
> > >  
> > > -	/*
> > > -	 * FIXME: for now we'll only ever use 1 slice; pretend that we
> > > have
> > > -	 * only that 1 slice enabled until we have a proper way for on-
> > > demand
> > > -	 * toggling of the second slice.
> > > -	 */
> > > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > -		enabled_slices++;
> > > +	/* Check if second DBuf slice is enabled */
> > > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > +		enabled_slices |= DBUF_S1_BIT;
> > > +
> > > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > +		enabled_slices |= DBUF_S2_BIT;
> > >  
> > >  	return enabled_slices;
> > >  }
> > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > drm_i915_private *dev_priv,
> > >  			      const int num_active,
> > >  			      struct skl_ddb_allocation *ddb)
> > >  {
> > > -	const struct drm_display_mode *adjusted_mode;
> > > -	u64 total_data_bw;
> > >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > -
> > >  	WARN_ON(ddb_size == 0);
> > >  
> > >  	if (INTEL_GEN(dev_priv) < 11)
> > >  		return ddb_size - 4; /* 4 blocks for bypass path
> > > allocation */
> > >  
> > > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > -	total_data_bw = total_data_rate *
> > > drm_mode_vrefresh(adjusted_mode);
> > > +	return ddb_size;
> > > +}
> > >  
> > > -	/*
> > > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > > -	 *
> > > -	 * FIXME dbuf slice code is broken:
> > > -	 * - must wait for planes to stop using the slice before
> > > powering it off
> > > -	 * - plane straddling both slices is illegal in multi-pipe
> > > scenarios
> > > -	 * - should validate we stay within the hw bandwidth limits
> > > -	 */
> > > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> > > -		ddb->enabled_slices = 2;
> > > -	} else {
> > > -		ddb->enabled_slices = 1;
> > > -		ddb_size /= 2;
> > > -	}
> > > +/*
> > > + * Calculate initial DBuf slice offset, based on slice size
> > > + * and mask(i.e if slice size is 1024 and second slice is enabled
> > > + * offset would be 1024)
> > > + */
> > > +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > > +					   u32 slice_size, u32
> > > ddb_size)
> > > +{
> > > +	u32 offset = 0;
> > >  
> > > -	return ddb_size;
> > > +	if (!dbuf_slice_mask)
> > > +		return 0;
> > > +
> > > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > +
> > > +	WARN_ON(offset >= ddb_size);
> > > +	return offset;
> > >  }
> > >  
> > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > *dev_priv,
> > > +				      int pipe, u32 active_pipes,
> > > +				      const struct intel_crtc_state
> > > *crtc_state);
> > > +
> > >  static void
> > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > *dev_priv,
> > >  				   const struct intel_crtc_state
> > > *crtc_state,
> > > @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
> > >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > >  	u16 ddb_size;
> > > +	u32 ddb_range_size;
> > >  	u32 i;
> > > +	u32 dbuf_slice_mask;
> > > +	u32 active_pipes;
> > > +	u32 offset;
> > > +	u32 slice_size;
> > > +	u32 total_slice_mask;
> > > +	u32 start, end;
> > >  
> > >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> > >  		alloc->start = 0;
> > > @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  		return;
> > >  	}
> > >  
> > > -	if (intel_state->active_pipe_changes)
> > > +	if (intel_state->active_pipe_changes) {
> > >  		*num_active = hweight8(intel_state->active_pipes);
> > > -	else
> > > +		active_pipes = intel_state->active_pipes;
> > > +	} else {
> > >  		*num_active = hweight8(dev_priv->active_pipes);
> > > +		active_pipes = dev_priv->active_pipes;
> > > +	}
> > >  
> > >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > total_data_rate,
> > >  				      *num_active, ddb);
> > >  
> > > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > +
> > > +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> > > +
> > > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > +
> > >  	/*
> > >  	 * If the state doesn't change the active CRTC's or there is no
> > >  	 * modeset request, then there's no need to recalculate;
> > > @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  		return;
> > >  	}
> > >  
> > > +	/*
> > > +	 * Get allowed DBuf slices for correspondent pipe and platform.
> > > +	 */
> > > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > for_pipe,
> > > +						     active_pipes,
> > > crtc_state);
> > 
> > Quite a few redundant arguments. The function name looks alien.
> > 
> > skl_crtc_possible_dbuf_slices() or something perhaps?
> > 
> > > +
> > > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> > > +		      dbuf_slice_mask,
> > > +		      for_pipe, active_pipes);
> > > +
> > > +	/*
> > > +	 * Figure out at which DBuf slice we start, i.e if we start at
> > > Dbuf S2
> > > +	 * and slice size is 1024, the offset would be 1024
> > > +	 */
> > > +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > +						 slice_size, ddb_size);
> > > +
> > > +	/*
> > > +	 * Figure out total size of allowed DBuf slices, which is
> > > basically
> > > +	 * a number of allowed slices for that pipe multiplied by slice
> > > size.
> > > +	 * Inside of this
> > > +	 * range ddb entries are still allocated in proportion to
> > > display width.
> > > +	 */
> > > +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> > > +
> > >  	/*
> > >  	 * Watermark/ddb requirement highly depends upon width of the
> > >  	 * framebuffer, So instead of allocating DDB equally among
> > > pipes
> > >  	 * distribute DDB based on resolution/width of the display.
> > >  	 */
> > > +	total_slice_mask = dbuf_slice_mask;
> > 
> > Isn't this already taken care by the loop below?
> > 
> > >  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state,
> > > i) {
> > >  		const struct drm_display_mode *adjusted_mode =
> > >  			&crtc_state->hw.adjusted_mode;
> > >  		enum pipe pipe = crtc->pipe;
> > >  		int hdisplay, vdisplay;
> > > +		u32 pipe_dbuf_slice_mask = \
> > > +			i915_get_allowed_dbuf_mask(dev_priv,
> > > +						pipe,
> > > +						active_pipes,
> > > +						crtc_state);
> > >  
> > >  		if (!crtc_state->hw.enable)
> > >  			continue;
> > >  
> > > +		/*
> > > +		 * According to BSpec pipe can share one dbuf slice
> > > with another
> > > +		 * pipes or pipe can use multiple dbufs, in both cases
> > > we
> > > +		 * account for other pipes only if they have exactly
> > > same mask.
> > > +		 * However we need to account how many slices we should
> > > enable
> > > +		 * in total.
> > > +		 */
> > > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > 
> > total_slice_mask will now account only the crtcs in the state. What
> > happens to the other pipes' slices?
> 
> Yes we need to iterate through all pipes/crtcs here, however
> for getting correct dbuf slice mask most important is active_pipes.
> However yes, I will change the loop for it to be more correct.
> 
> > 
> > > +
> > > +		/*
> > > +		 * Do not account pipes using other slice sets
> > > +		 * luckily as of current BSpec slice sets do not
> > > partially
> > > +		 * intersect(pipes share either same one slice or same
> > > slice set
> > > +		 * i.e no partial intersection), so it is enough to
> > > check for
> > > +		 * equality for now.
> > > +		 */
> > > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > +			continue;
> > > +
> > >  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay,
> > > &vdisplay);
> > >  		total_width += hdisplay;
> > >  
> > > @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  			pipe_width = hdisplay;
> > >  	}
> > >  
> > > -	alloc->start = ddb_size * width_before_pipe / total_width;
> > > -	alloc->end = ddb_size * (width_before_pipe + pipe_width) /
> > > total_width;
> > > +	ddb->enabled_slices = total_slice_mask;
> > > +
> > > +	start = ddb_range_size * width_before_pipe / total_width;
> > > +	end = ddb_range_size * (width_before_pipe + pipe_width) /
> > > total_width;
> > > +
> > > +	alloc->start = offset + start;
> > > +	alloc->end = offset + end;
> > > +
> > > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > +		      alloc->start, alloc->end);
> > > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> > > +		      ddb->enabled_slices,
> > > +		      INTEL_INFO(dev_priv)->supported_slices);
> > >  }
> > >  
> > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > *crtc_state,
> > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > intel_crtc *crtc,
> > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > >  			  struct skl_ddb_allocation *ddb /* out */)
> > >  {
> > > -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> > > +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> > > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > >enabled_slices);
> > >  }
> > >  
> > >  /*
> > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > intel_crtc_state *crtc_state,
> > >  	return total_data_rate;
> > >  }
> > >  
> > > +static uint_fixed_16_16_t
> > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	u32 src_w, src_h, dst_w, dst_h;
> > > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > > +	const struct drm_display_mode *adjusted_mode = &crtc_state-
> > > >hw.adjusted_mode;
> > > +
> > > +	src_w = crtc_state->pipe_src_w;
> > > +	src_h = crtc_state->pipe_src_h;
> > > +	dst_w = adjusted_mode->crtc_hdisplay;
> > > +	dst_h = adjusted_mode->crtc_vdisplay;
> > > +
> > > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > > +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> > > +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> > > +
> > > +	return mul_fixed16(downscale_w, downscale_h);
> > > +}
> > > +
> > > +uint_fixed_16_16_t
> > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > +{
> > > +	struct drm_plane *plane;
> > > +	const struct drm_plane_state *drm_plane_state;
> > > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > > +
> > > +	if (!crtc_state->uapi.enable)
> > > +		return max_downscale;
> > > +
> > > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > > drm_plane_state, &crtc_state->uapi) {
> > > +		uint_fixed_16_16_t plane_downscale;
> > > +		const struct intel_plane_state *plane_state =
> > > +			to_intel_plane_state(drm_plane_state);
> > > +
> > > +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> > > +			continue;
> > > +
> > > +		plane_downscale =
> > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > +
> > > +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> > > +			      to_intel_plane(plane)->id,
> > > +			      plane_downscale.val >> 16,
> > > +			      plane_downscale.val & 0xffff);
> > > +
> > > +		max_downscale = max_fixed16(plane_downscale,
> > > max_downscale);
> > > +	}
> > > +
> > > +
> > > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > +
> > > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > +		       crtc->pipe, pipe_downscale.val >> 16,
> > > +		       pipe_downscale.val & 0xffff);
> > > +
> > > +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> > > +
> > > +	/*
> > > +	 * Convert result to percentage: get the actual ratio instead
> > > of rate,
> > > +	 * then multiply by 100 to get percentage.
> > > +	 */
> > > +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1,
> > > pipe_downscale));
> > > +
> > > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> > > +
> > > +	return pipe_ratio;
> > > +}
> > 
> > Yuck. I just killed these.
> 
> I still need pipe_ratio fro ICL as you know, this is dictated by BSpec,
> not by me.

That part of the spec is pretty crazy IMO. I think we should just ignore
it for now. Also nothing in there says we must use both slices for a
single pipe so ignoring doesn't even violate the spec in any way.

> 
> > 
> > > +
> > > +struct dbuf_slice_conf_entry {
> > > +	u32 dbuf_mask[I915_MAX_PIPES];
> > > +	u32 active_pipes;
> > > +};
> > > +
> > > +
> > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > > BIT(PIPE_C) }
> > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) |
> > > BIT(PIPE_D) }
> > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) |
> > > BIT(PIPE_D) }
> > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > > BIT(PIPE_D) }
> > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > > BIT(PIPE_C) }
> > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> > > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> > > +					  BIT(PIPE_C) | BIT(PIPE_D) }
> > > +
> > > +/*
> > > + * Table taken from Bspec 12716
> > > + * Pipes do have some preferred DBuf slice affinity,
> > > + * plus there are some hardcoded requirements on how
> > > + * those should be distributed for multipipe scenarios.
> > > + * For more DBuf slices algorithm can get even more messy
> > > + * and less readable, so decided to use a table almost
> > > + * as is from BSpec itself - that way it is at least easier
> > > + * to compare, change and check.
> > > + */
> > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > +{ { 0, 0, 0, 0 }, 0 },
> > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > > < 88.8 */
> > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > > ratio >= 88.8 */
> > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > > < 88.8 */
> > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > > ratio >= 88.8 */
> > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > > < 88.8 */
> > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe
> > > ratio >= 88.8 */
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > > +};
> > > +
> > > +/*
> > > + * Table taken from Bspec 49255
> > > + * Pipes do have some preferred DBuf slice affinity,
> > > + * plus there are some hardcoded requirements on how
> > > + * those should be distributed for multipipe scenarios.
> > > + * For more DBuf slices algorithm can get even more messy
> > > + * and less readable, so decided to use a table almost
> > > + * as is from BSpec itself - that way it is at least easier
> > > + * to compare, change and check.
> > > + */
> > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > +{ { 0, 0, 0, 0 }, 0 },
> > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT,
> > > DBUF_S2_BIT),
> > > +};
> > 
> > My eyes!
> > 
> > I have to think we should be able to reduce all that to a handful
> > of lines of code.
> 
> Yeah, then we are going to have a huge function with lots of weird
> definitions, unfortunately BSpec table has quite strange DBuf
> assignments like 
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> 
> i.e slices can get mixed in a quite various ways. There is no
> rational pattern in those and could be even dangerous to try to 
> optimize it some way, as one day it might change again in some
> unpredictable way.
> 
> Would you prefer to have it like
> 
> if (pipes are A and B)
>     program S2 to A and S1 to B
> if (pipes are A and C)
>     program S1 to A and S2 to C
> ...
> 
> I would prefer at least to see it in a comparable way with the
> table we have in BSpec, rather than having lots of weird looking
> if-else statements, GEN checks and so on.
> 
> I knew this table was not going to look like it is done typically
> here - but really my opinion that it is way better than hardcoding
> it into some weird algorithm, which would be hard to compare to 
> initial table, which is already strange enough.
> 
> If you don't like it - could you please explain why this is exactly
> worse than having long functions with hardcoded checks?

I think we should be able to encode this simply using the
"Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
is the distance of each pipe from each dbuf slice. Then all we have
to do is minimize the total distance. The following tables I think
should encode that reasonably concisely:

/* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
static const u8 icl_dbuf_distance[][] {
	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
};

/* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
static const u8 tgl_dbuf_distance[][] {
	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
	[PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
};

/*
 * pipeA - DBUF_S0 - DBUF_S1 - pipeB
 * pipeC - DBUF_S2 - DBUF_S3 - pipeD
 */
static const u8 future_dbuf_distance[][] {
	[PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
	[PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
	[PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
	[PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
};

I suppose we could make that even more minimal by just
directly sticking that "Pipe and DBUF ordering" language
into a single array but and then calculate each distance
from it, but that is perhaps more work, and we'd need
some kind of enum just for that so that we have a single
namespace for the pipes and dbuf slices. So i have a
feeling having those distances pre-computed in a table
like thet is maybe a better way to go.

> 
> I think this is pretty stupid to fix stuff which looks new or unusual
> just because somebody doesn't like it, without even constructive
> arguing.
> 
> - Stan
> 
> > 
> > 
> > > +
> > > +/*
> > > + * This function
> > 
> > "This function" is totally redundant. Ditto elsewhere.
> > 
> > > finds an entry with same enabled pipe configuration and
> > > + * returns correspondent DBuf slice mask as stated in BSpec for
> > > particular
> > > + * platform.
> > > + */
> > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > +				     u32 active_pipes,
> > > +				     const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	int i;
> > > +	uint_fixed_16_16_t pipe_ratio;
> > > +
> > > +	/*
> > > +	 * Calculate pipe ratio as stated in BSpec 28692
> > > +	 */
> > > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > +		if (icl_allowed_dbufs[i].active_pipes == active_pipes)
> > > {
> > > +			if (hweight32(active_pipes) == 1) {
> > > +				/*
> > > +				 * According to BSpec 12716: if ratio
> > > >= 88.8
> > > +				 * always use single dbuf slice.
> > > +				 */
> > > +				if (pipe_ratio.val >= div_fixed16(888,
> > > 10).val)
> > > +					++i;
> > > +			}
> > > +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > +		}
> > > +	}
> > > +	return 0;
> > > +}
> > > +
> > > +/*
> > > + * This function finds an entry with same enabled pipe
> > > configuration and
> > > + * returns correspondent DBuf slice mask as stated in BSpec for
> > > particular
> > > + * platform.
> > > + */
> > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > +				     u32 active_pipes,
> > > +				     const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> > > +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > +	}
> > > +	return 0;
> > > +}
> > > +
> > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> > > +				      int pipe, u32 active_pipes,
> > > +				      const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	if (IS_GEN(dev_priv, 11))
> > > +		return icl_get_allowed_dbuf_mask(pipe,
> > > +						 active_pipes,
> > > +						 crtc_state);
> > > +	else if (IS_GEN(dev_priv, 12))
> > > +		return tgl_get_allowed_dbuf_mask(pipe,
> > > +						 active_pipes,
> > > +						 crtc_state);
> > > +	/*
> > > +	 * For anything else just return one slice yet.
> > > +	 * Should be extended for other platforms.
> > > +	 */
> > > +	return DBUF_S1_BIT;
> > > +}
> > > +
> > >  static u64
> > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > *crtc_state,
> > >  				 u64 *plane_data_rate)
> > > -- 
> > > 2.17.1
> > 
> > 

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 15:27       ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-18 15:27 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy wrote:
> > > Also implemented algorithm for choosing DBuf slice configuration
> > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > 
> > > Now pipe allocation still stays proportional to pipe width as
> > > before,
> > > however within allowed DBuf slice for this particular
> > > configuration.
> > > 
> > > v2: Remove unneeded check from commit as ddb enabled slices might
> > >     now differ from hw state.
> > > 
> > > v3: - Added new field "supported_slices" to ddb to separate max
> > >       supported slices vs currently enabled, to avoid confusion.
> > >     - Removed obsolete comments and code related to DBuf(Matthew
> > > Roper).
> > >     - Some code style and long line removal(Matthew Roper).
> > >     - Added WARN_ON to new ddb range offset calc function(Matthew
> > > Roper).
> > >     - Removed platform specific call to calc pipe ratio from ddb
> > >       allocation function and fixed the return value(Matthew Roper)
> > >     - Refactored DBUF slice allocation table to improve readability
> > >     - Added DBUF slice allocation for TGL as well.
> > >     - ICL(however not TGL) seems to voluntarily enable second DBuf
> > > slice
> > >       after pm suspend/resume causing a mismatch failure, because
> > > we
> > >       update DBuf slices only if we do a modeset, however this
> > > check
> > >       is done always. Fixed it to be done only when modeset for
> > > ICL.
> > > 
> > > v4: - Now enabled slices is not actually a number, but a bitmask,
> > >       because we might need to enable second slice only and number
> > >       of slices would still 1 and that current code doesn't allow.
> > >     - Remove redundant duplicate code to have some unified way of
> > >       enabling dbuf slices instead of hardcoding.
> > > 
> > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > thinking
> > >       that we have only one DBUF_CTL slice. Now another version for
> > >       gen11 will check both slices as only second one can be
> > > enabled,
> > >       to keep CI happy.
> > > 
> > > v6: - Removed enabled dbuf assertion completely. Previous code
> > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > called.
> > >       Currently we enable/disable dbuf slices based on requrement
> > >       so no point in asserting this here.
> > >     - Removed unnecessary modeset check from
> > > verify_wm_state(Matthew Roper)
> > >     - Slices intersection after union is same as final
> > > result(Matthew Roper)
> > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > >     - Some macros added(Matthew Roper)
> > >     - ddb range is now always less or equal to ddb size - no need
> > > for additional
> > >       checks(previously needed as we had some bandwidth checks in
> > > there which
> > >       could "suddenly" return ddb_size smaller than it is.
> > > 
> > > v7: Minor costemic changes:
> > >     - Changed icl_dbuf_slices_restore name to
> > > icl_program_dbuf_slices
> > >       as it more corresponds to the actual functionality.
> > >     - Some simplification with supported slices for BXT and GLK
> > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > >       this is not used for skl anymore.
> > >     - Changed DBuf slice assignment order for TGL case
> > > 
> > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > Cc: James Ausmus <james.ausmus@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > ++++++++++++++++--
> > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > index 876fc25968bf..bd7aff675198 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > intel_crtc *crtc,
> > >  
> > >  	if (INTEL_GEN(dev_priv) >= 11 &&
> > >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > -		DRM_ERROR("mismatch in DBUF Slices (expected %u, got
> > > %u)\n",
> > > +		DRM_ERROR("mismatch in DBUF Slices (expected %x, got
> > > %x)\n",
> > >  			  sw_ddb->enabled_slices,
> > >  			  hw->ddb.enabled_slices);
> > >  
> > > @@ -14614,15 +14614,24 @@ static void
> > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > >  	u8 hw_enabled_slices = dev_priv->wm.skl_hw.ddb.enabled_slices;
> > >  	u8 required_slices = state->wm_results.ddb.enabled_slices;
> > >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > +	u8 slices_union = hw_enabled_slices | required_slices;
> > > +	u8 slices_intersection = required_slices;
> > >  
> > >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > old_crtc_state, new_crtc_state, i)
> > >  		/* ignore allocations for crtc's that have been turned
> > > off. */
> > >  		if (new_crtc_state->hw.active)
> > >  			entries[i] = old_crtc_state->wm.skl.ddb;
> > >  
> > > -	/* If 2nd DBuf slice required, enable it here */
> > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > hw_enabled_slices)
> > > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > +		      required_slices, hw_enabled_slices);
> > > +
> > > +	/*
> > > +	 * If some other DBuf slice required, enable it here,
> > > +	 * as here we only add more slices, but not remove to prevent
> > > +	 * issues if somebody still uses those.
> > > +	 */
> > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > hw_enabled_slices)
> > > +		icl_dbuf_slices_update(dev_priv, slices_union);
> > >  
> > >  	/*
> > >  	 * Whenever the number of active pipes changes, we need to make
> > > sure we
> > > @@ -14681,9 +14690,12 @@ static void
> > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > >  		}
> > >  	} while (progress);
> > >  
> > > -	/* If 2nd DBuf slice is no more required disable it */
> > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > hw_enabled_slices)
> > > -		icl_dbuf_slices_update(dev_priv, required_slices);
> > > +	/*
> > > +	 * If some other DBuf slice is not needed, disable it here,
> > > +	 * as here we only remove slices, which are not needed anymore.
> > > +	 */
> > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > hw_enabled_slices)
> > > +		icl_dbuf_slices_update(dev_priv, slices_intersection);
> > 
> > This is still a totally wrong place for this. We need to be sure the
> > planes have switched over to the new ddb layout before we power off
> > anything.
> 
> I have just modified already existing line here, not added by me.
> The point of this patch is starting to use second DBuf slice but
> not fixing all kinds of issues related to DBuf code at the same time.

You're re-enabling code I disabled on purpose. It has to be fixed
properly or not touched at all. In hindsight I should have probably
just ripped it all out to not give people bad ideas.

> 
> Of course this is nice idea to fix it - if that is wrong plane,
> but current code somehow lives with it and probably this should be
> with some other patch.
> 
> > 
> > >  }
> > >  
> > >  static void intel_atomic_helper_free_state(struct drm_i915_private
> > > *dev_priv)
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > index ce1b64f4dd44..a78bc9b61862 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > @@ -1031,15 +1031,6 @@ static bool
> > > gen9_dc_off_power_well_enabled(struct drm_i915_private *dev_priv,
> > >  		(I915_READ(DC_STATE_EN) &
> > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > >  }
> > >  
> > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > *dev_priv)
> > > -{
> > > -	u32 tmp = I915_READ(DBUF_CTL);
> > > -
> > > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > -	     "Unexpected DBuf power power state (0x%08x)\n", tmp);
> > > -}
> > > -
> > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > *dev_priv)
> > >  {
> > >  	struct intel_cdclk_state cdclk_state = {};
> > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > drm_i915_private *dev_priv)
> > >  	/* Can't read out voltage_level so can't use
> > > intel_cdclk_changed() */
> > >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > &cdclk_state));
> > >  
> > > -	gen9_assert_dbuf_enabled(dev_priv);
> > > -
> > >  	if (IS_GEN9_LP(dev_priv))
> > >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> > >  
> > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > drm_i915_private *dev_priv)
> > >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > >  }
> > >  
> > > -static u8 intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > >  {
> > > -	if (INTEL_GEN(dev_priv) < 11)
> > > -		return 1;
> > > -	return 2;
> > > +	return INTEL_INFO(dev_priv)->supported_slices;
> > > +}
> > 
> > Kinda pointless function now. 'supported_slices' is not a very good
> > name. Should be 'num_dbuf_slices' or something along those lines.
> 
> Supported slices really means how much slices we support in hardware.
> num_dbuf_slices will not reflect that, it will be more ambigious.
> I don't like this name.

supported_slices doesn't even say "dbuf" anywhere. So it's even more
ambiguous. Also we already have num_pipes, num_sprites, etc.

> 
> > 
> > > +
> > > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv)
> > > +{
> > > +	const u8 hw_enabled_slices = dev_priv-
> > > >wm.skl_hw.ddb.enabled_slices;
> > 
> > Seems a rather pointless variable. const at least is pointless.
> > 
> > > +
> > > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > >  }
> > >  
> > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > >  			    u8 req_slices)
> > >  {
> > > -	const u8 hw_enabled_slices = dev_priv-
> > > >wm.skl_hw.ddb.enabled_slices;
> > > -	bool ret;
> > > +	bool ret = true;
> > > +	int i;
> > > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> > >  
> > > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > +	if (hweight8(req_slices) > intel_dbuf_max_slices(dev_priv)) {
> > >  		DRM_ERROR("Invalid number of dbuf slices requested\n");
> > >  		return;
> > >  	}
> > >  
> > > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > > -		return;
> > > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n", req_slices);
> > >  
> > > -	if (req_slices > hw_enabled_slices)
> > > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > > true);
> > > -	else
> > > -		ret = intel_dbuf_slice_set(dev_priv, DBUF_CTL_S2,
> > > false);
> > > +	for (i = 0; i < max_slices; i++) {
> > > +		int slice_bit = BIT(i);
> > > +		bool slice_set = (slice_bit & req_slices) != 0;
> > > +
> > > +		switch (slice_bit) {
> > > +		case DBUF_S1_BIT:
> > > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > > +							  DBUF_CTL_S1,
> > > +							  slice_set);
> > > +			break;
> > > +		case DBUF_S2_BIT:
> > > +			ret = ret && intel_dbuf_slice_set(dev_priv,
> > > +							  DBUF_CTL_S2,
> > > +							  slice_set);
> > 
> > The return value on intel_dbuf_slice_set() seems pointless. I'd nuke
> > it.
> > 
> > Also as long you have intel_dbuf_slice_set() I'd probably just pass
> > the
> > slice index there and let it deal with the register details. DBUF_CTL
> > should probably be parametrized as well so you don't need annoying
> > switch
> > statements.
> 
> What if it fails? Shouldn't we still have some clue what state do we
> have/had at the moment?

It if fails you log an error and keep going.

> 
> > 
> > > +			break;
> > > +		default:
> > > +			MISSING_CASE(slice_bit);
> > > +		}
> > > +	}
> > >  
> > >  	if (ret)
> > >  		dev_priv->wm.skl_hw.ddb.enabled_slices = req_slices;
> > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > drm_i915_private *dev_priv,
> > >  
> > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > >  {
> > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > DBUF_POWER_REQUEST);
> > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > DBUF_POWER_REQUEST);
> > > -	POSTING_READ(DBUF_CTL_S2);
> > > -
> > > -	udelay(10);
> > > -
> > > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > -		DRM_ERROR("DBuf power enable timeout\n");
> > > -	else
> > > -		/*
> > > -		 * FIXME: for now pretend that we only have 1 slice,
> > > see
> > > -		 * intel_enabled_dbuf_slices_num().
> > > -		 */
> > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > +	/*
> > > +	 * Just power up 1 slice, we will
> > > +	 * figure out later which slices we have and what we need.
> > > +	 */
> > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > 
> > That stuff really shouldn't be modified by low level functions.
> > The intial value should come from readout, and after that it should
> > be adjusted as part of the overall atomic state.
> 
> Well, as you can see currently this is even worse - has some weird
> hardcoded stuff and is modified like that. Also if we are enabling the 
> dbuf slices, shouldn't we really determine what value should be there?
> 
> Also I wouldn't really always trust the readout, as to me it seems
> icelake resets it constantly to 2 after suspend/resume. 

The point of readout is that we syncrhonize our software state with the
current state of hardware without having to modify the hardware state.
Once synchronized the driver takes over and tells the hw what to do.

> 
> > 
> > > +	icl_program_dbuf_slices(dev_priv);
> > >  }
> > >  
> > >  static void icl_dbuf_disable(struct drm_i915_private *dev_priv)
> > >  {
> > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > ~DBUF_POWER_REQUEST);
> > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > ~DBUF_POWER_REQUEST);
> > > -	POSTING_READ(DBUF_CTL_S2);
> > > -
> > > -	udelay(10);
> > > -
> > > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > -		DRM_ERROR("DBuf power disable timeout!\n");
> > > -	else
> > > -		/*
> > > -		 * FIXME: for now pretend that the first slice is
> > > always
> > > -		 * enabled, see intel_enabled_dbuf_slices_num().
> > > -		 */
> > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > +	/*
> > > +	 * Disable all slices
> > > +	 */
> > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > +	icl_program_dbuf_slices(dev_priv);
> > >  }
> > >  
> > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > drm_i915_private *i915,
> > >  	for ((wf) = intel_display_power_get((i915), (domain)); (wf); \
> > >  	     intel_display_power_put_async((i915), (domain), (wf)),
> > > (wf) = 0)
> > >  
> > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > >  			    u8 req_slices);
> > > +void icl_program_dbuf_slices(struct drm_i915_private *dev_priv);
> > >  
> > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > >  			     bool override, unsigned int mask);
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c
> > > index 82e4e6bf08c3..96741e68633a 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -51,6 +51,7 @@
> > >  #include "display/intel_bw.h"
> > >  #include "display/intel_cdclk.h"
> > >  #include "display/intel_display_types.h"
> > > +#include "display/intel_display_power.h"
> > >  #include "display/intel_dp.h"
> > >  #include "display/intel_fbdev.h"
> > >  #include "display/intel_hotplug.h"
> > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > device *kdev)
> > >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > >  		ret = vlv_resume_prepare(dev_priv, true);
> > >  
> > > +	/* Weird hack to fix ICL hardware bug, as it resets DBUF slices
> > > reg */
> > > +	if (INTEL_GEN(dev_priv) == 11)
> > > +		icl_program_dbuf_slices(dev_priv);
> > > +
> > >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> > >  
> > >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 7e0f67babe20..a396977c9c2d 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -804,7 +804,7 @@ static inline bool skl_ddb_entry_equal(const
> > > struct skl_ddb_entry *e1,
> > >  }
> > >  
> > >  struct skl_ddb_allocation {
> > 
> > This struct is quite pointless. Should be killed off.
> 
> Legacy stuff, not added by me. Quite fine with removing it - 
> but should we really do all this at the same time and in same
> patch? 

Not the same patch. But clean up the cobwebs first, then add the new
stuff. Much easier that way around.

> 
> > 
> > > -	u8 enabled_slices; /* GEN11 has configurable 2 slices */
> > > +	u8 enabled_slices;   /* Bitmask of currently enabled DBuf
> > > slices */
> > >  };
> > >  
> > >  struct skl_ddb_values {
> > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > b/drivers/gpu/drm/i915/i915_pci.c
> > > index 1bb701d32a5d..8e86af505730 100644
> > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > intel_cherryview_info = {
> > >  	.has_gt_uc = 1, \
> > >  	.display.has_hdcp = 1, \
> > >  	.display.has_ipc = 1, \
> > > -	.ddb_size = 896
> > > +	.ddb_size = 896, \
> > > +	.supported_slices = 1
> > >  
> > >  #define SKL_PLATFORM \
> > >  	GEN9_FEATURES, \
> > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > intel_skylake_gt4_info = {
> > >  	GEN(9), \
> > >  	.is_lp = 1, \
> > >  	.display.has_hotplug = 1, \
> > > +	.supported_slices = 1, \
> > >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) | BIT(VECS0),
> > > \
> > >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > >  	.has_64bit_reloc = 1, \
> > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > intel_coffeelake_gt3_info = {
> > >  	GEN9_FEATURES, \
> > >  	GEN(10), \
> > >  	.ddb_size = 1024, \
> > > +	.supported_slices = 1, \
> > >  	.display.has_dsc = 1, \
> > >  	.has_coherent_ggtt = false, \
> > >  	GLK_COLORS
> > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > intel_cannonlake_info = {
> > >  	}, \
> > >  	GEN(11), \
> > >  	.ddb_size = 2048, \
> > > +	.supported_slices = 2, \
> > >  	.has_logical_ring_elsq = 1, \
> > >  	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > b/drivers/gpu/drm/i915/i915_reg.h
> > > index a607ea520829..fba5731063d8 100644
> > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > @@ -7734,9 +7734,11 @@ enum {
> > >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> > >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> > >  #define  DISP_IPC_ENABLE		(1 << 3)
> > > -#define DBUF_CTL	_MMIO(0x45008)
> > > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > > +#define DBUF_S1_BIT			BIT(0)
> > > +#define DBUF_S2_BIT			BIT(1)
> > 
> > These are not bits inside DBUF_CTL so they should not be here.
> > I'd just nuke them entirely. If you think we need more than an
> > int to refer to a slice then IMO something simple like
> > 
> > enum dbuf_slice {
> > 	DBUF_S1,
> > 	DBUF_S2,
> > };
> > 
> > would match the general apporach we use for other things much better
> > rather than defining some shifted values.
> 
> We need to combine slices - so bitmask is quite suitable for that, 
> I can of course use this enum, but then I'll have to re-encode it 
> to bitmask anyways.

Which is pretty much how everything else works in the driver.

> 
> 
> > 
> > > +#define DBUF_CTL			_MMIO(0x45008)
> > > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> > >  #define  DBUF_POWER_REQUEST		(1 << 31)
> > >  #define  DBUF_POWER_STATE		(1 << 30)
> > >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > >  	} display;
> > >  
> > >  	u16 ddb_size; /* in blocks */
> > > +	u8 supported_slices; /* number of DBuf slices */
> > >  
> > >  	/* Register offsets for the various display pipes and
> > > transcoders */
> > >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > b/drivers/gpu/drm/i915/intel_pm.c
> > > index 2d389e437e87..1ea81ab92429 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct drm_device
> > > *dev)
> > >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > >  }
> > >  
> > > -static u8 intel_enabled_dbuf_slices_num(struct drm_i915_private
> > > *dev_priv)
> > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > *dev_priv)
> > >  {
> > > -	u8 enabled_slices;
> > > -
> > > -	/* Slice 1 will always be enabled */
> > > -	enabled_slices = 1;
> > > +	u8 enabled_slices = 0;
> > >  
> > >  	/* Gen prior to GEN11 have only one DBuf slice */
> > >  	if (INTEL_GEN(dev_priv) < 11)
> > > -		return enabled_slices;
> > > +		return DBUF_S1_BIT;
> > 
> > I'd nuke this special case entirely and just read everything from the
> > hw. Assuming we need readout somewhere.
> 
> We have it that way hardcoded right now in drm-tip. I think this is a
> subject to other patch, but not this change. As I understand we need 
> to first figure out how to do it properly. 
> This patch anyway makes it better by at least removing hardcoded
> relation between number of slices and which slices are enabled.
> With current code there is no way for example to enable second DBuf
> slice only, which is required by BSpec.
> 
> > 
> > >  
> > > -	/*
> > > -	 * FIXME: for now we'll only ever use 1 slice; pretend that we
> > > have
> > > -	 * only that 1 slice enabled until we have a proper way for on-
> > > demand
> > > -	 * toggling of the second slice.
> > > -	 */
> > > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > -		enabled_slices++;
> > > +	/* Check if second DBuf slice is enabled */
> > > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > +		enabled_slices |= DBUF_S1_BIT;
> > > +
> > > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > +		enabled_slices |= DBUF_S2_BIT;
> > >  
> > >  	return enabled_slices;
> > >  }
> > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > drm_i915_private *dev_priv,
> > >  			      const int num_active,
> > >  			      struct skl_ddb_allocation *ddb)
> > >  {
> > > -	const struct drm_display_mode *adjusted_mode;
> > > -	u64 total_data_bw;
> > >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > -
> > >  	WARN_ON(ddb_size == 0);
> > >  
> > >  	if (INTEL_GEN(dev_priv) < 11)
> > >  		return ddb_size - 4; /* 4 blocks for bypass path
> > > allocation */
> > >  
> > > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > -	total_data_bw = total_data_rate *
> > > drm_mode_vrefresh(adjusted_mode);
> > > +	return ddb_size;
> > > +}
> > >  
> > > -	/*
> > > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > > -	 *
> > > -	 * FIXME dbuf slice code is broken:
> > > -	 * - must wait for planes to stop using the slice before
> > > powering it off
> > > -	 * - plane straddling both slices is illegal in multi-pipe
> > > scenarios
> > > -	 * - should validate we stay within the hw bandwidth limits
> > > -	 */
> > > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12))) {
> > > -		ddb->enabled_slices = 2;
> > > -	} else {
> > > -		ddb->enabled_slices = 1;
> > > -		ddb_size /= 2;
> > > -	}
> > > +/*
> > > + * Calculate initial DBuf slice offset, based on slice size
> > > + * and mask(i.e if slice size is 1024 and second slice is enabled
> > > + * offset would be 1024)
> > > + */
> > > +static u32 skl_get_first_dbuf_slice_offset(u32 dbuf_slice_mask,
> > > +					   u32 slice_size, u32
> > > ddb_size)
> > > +{
> > > +	u32 offset = 0;
> > >  
> > > -	return ddb_size;
> > > +	if (!dbuf_slice_mask)
> > > +		return 0;
> > > +
> > > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > +
> > > +	WARN_ON(offset >= ddb_size);
> > > +	return offset;
> > >  }
> > >  
> > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > *dev_priv,
> > > +				      int pipe, u32 active_pipes,
> > > +				      const struct intel_crtc_state
> > > *crtc_state);
> > > +
> > >  static void
> > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > *dev_priv,
> > >  				   const struct intel_crtc_state
> > > *crtc_state,
> > > @@ -3857,7 +3855,14 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  	u32 pipe_width = 0, total_width = 0, width_before_pipe = 0;
> > >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > >  	u16 ddb_size;
> > > +	u32 ddb_range_size;
> > >  	u32 i;
> > > +	u32 dbuf_slice_mask;
> > > +	u32 active_pipes;
> > > +	u32 offset;
> > > +	u32 slice_size;
> > > +	u32 total_slice_mask;
> > > +	u32 start, end;
> > >  
> > >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> > >  		alloc->start = 0;
> > > @@ -3866,14 +3871,23 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  		return;
> > >  	}
> > >  
> > > -	if (intel_state->active_pipe_changes)
> > > +	if (intel_state->active_pipe_changes) {
> > >  		*num_active = hweight8(intel_state->active_pipes);
> > > -	else
> > > +		active_pipes = intel_state->active_pipes;
> > > +	} else {
> > >  		*num_active = hweight8(dev_priv->active_pipes);
> > > +		active_pipes = dev_priv->active_pipes;
> > > +	}
> > >  
> > >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > total_data_rate,
> > >  				      *num_active, ddb);
> > >  
> > > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > +
> > > +	slice_size = ddb_size / INTEL_INFO(dev_priv)->supported_slices;
> > > +
> > > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > +
> > >  	/*
> > >  	 * If the state doesn't change the active CRTC's or there is no
> > >  	 * modeset request, then there's no need to recalculate;
> > > @@ -3891,20 +3905,70 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  		return;
> > >  	}
> > >  
> > > +	/*
> > > +	 * Get allowed DBuf slices for correspondent pipe and platform.
> > > +	 */
> > > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > for_pipe,
> > > +						     active_pipes,
> > > crtc_state);
> > 
> > Quite a few redundant arguments. The function name looks alien.
> > 
> > skl_crtc_possible_dbuf_slices() or something perhaps?
> > 
> > > +
> > > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes %x\n",
> > > +		      dbuf_slice_mask,
> > > +		      for_pipe, active_pipes);
> > > +
> > > +	/*
> > > +	 * Figure out at which DBuf slice we start, i.e if we start at
> > > Dbuf S2
> > > +	 * and slice size is 1024, the offset would be 1024
> > > +	 */
> > > +	offset = skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > +						 slice_size, ddb_size);
> > > +
> > > +	/*
> > > +	 * Figure out total size of allowed DBuf slices, which is
> > > basically
> > > +	 * a number of allowed slices for that pipe multiplied by slice
> > > size.
> > > +	 * Inside of this
> > > +	 * range ddb entries are still allocated in proportion to
> > > display width.
> > > +	 */
> > > +	ddb_range_size = hweight8(dbuf_slice_mask) * slice_size;
> > > +
> > >  	/*
> > >  	 * Watermark/ddb requirement highly depends upon width of the
> > >  	 * framebuffer, So instead of allocating DDB equally among
> > > pipes
> > >  	 * distribute DDB based on resolution/width of the display.
> > >  	 */
> > > +	total_slice_mask = dbuf_slice_mask;
> > 
> > Isn't this already taken care by the loop below?
> > 
> > >  	for_each_new_intel_crtc_in_state(intel_state, crtc, crtc_state,
> > > i) {
> > >  		const struct drm_display_mode *adjusted_mode =
> > >  			&crtc_state->hw.adjusted_mode;
> > >  		enum pipe pipe = crtc->pipe;
> > >  		int hdisplay, vdisplay;
> > > +		u32 pipe_dbuf_slice_mask = \
> > > +			i915_get_allowed_dbuf_mask(dev_priv,
> > > +						pipe,
> > > +						active_pipes,
> > > +						crtc_state);
> > >  
> > >  		if (!crtc_state->hw.enable)
> > >  			continue;
> > >  
> > > +		/*
> > > +		 * According to BSpec pipe can share one dbuf slice
> > > with another
> > > +		 * pipes or pipe can use multiple dbufs, in both cases
> > > we
> > > +		 * account for other pipes only if they have exactly
> > > same mask.
> > > +		 * However we need to account how many slices we should
> > > enable
> > > +		 * in total.
> > > +		 */
> > > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > 
> > total_slice_mask will now account only the crtcs in the state. What
> > happens to the other pipes' slices?
> 
> Yes we need to iterate through all pipes/crtcs here, however
> for getting correct dbuf slice mask most important is active_pipes.
> However yes, I will change the loop for it to be more correct.
> 
> > 
> > > +
> > > +		/*
> > > +		 * Do not account pipes using other slice sets
> > > +		 * luckily as of current BSpec slice sets do not
> > > partially
> > > +		 * intersect(pipes share either same one slice or same
> > > slice set
> > > +		 * i.e no partial intersection), so it is enough to
> > > check for
> > > +		 * equality for now.
> > > +		 */
> > > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > +			continue;
> > > +
> > >  		drm_mode_get_hv_timing(adjusted_mode, &hdisplay,
> > > &vdisplay);
> > >  		total_width += hdisplay;
> > >  
> > > @@ -3914,8 +3978,19 @@ skl_ddb_get_pipe_allocation_limits(struct
> > > drm_i915_private *dev_priv,
> > >  			pipe_width = hdisplay;
> > >  	}
> > >  
> > > -	alloc->start = ddb_size * width_before_pipe / total_width;
> > > -	alloc->end = ddb_size * (width_before_pipe + pipe_width) /
> > > total_width;
> > > +	ddb->enabled_slices = total_slice_mask;
> > > +
> > > +	start = ddb_range_size * width_before_pipe / total_width;
> > > +	end = ddb_range_size * (width_before_pipe + pipe_width) /
> > > total_width;
> > > +
> > > +	alloc->start = offset + start;
> > > +	alloc->end = offset + end;
> > > +
> > > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > +		      alloc->start, alloc->end);
> > > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported %d\n",
> > > +		      ddb->enabled_slices,
> > > +		      INTEL_INFO(dev_priv)->supported_slices);
> > >  }
> > >  
> > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > *crtc_state,
> > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > intel_crtc *crtc,
> > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > >  			  struct skl_ddb_allocation *ddb /* out */)
> > >  {
> > > -	ddb->enabled_slices = intel_enabled_dbuf_slices_num(dev_priv);
> > > +	ddb->enabled_slices = intel_enabled_dbuf_slices(dev_priv);
> > > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > >enabled_slices);
> > >  }
> > >  
> > >  /*
> > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > intel_crtc_state *crtc_state,
> > >  	return total_data_rate;
> > >  }
> > >  
> > > +static uint_fixed_16_16_t
> > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	u32 src_w, src_h, dst_w, dst_h;
> > > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > > +	const struct drm_display_mode *adjusted_mode = &crtc_state-
> > > >hw.adjusted_mode;
> > > +
> > > +	src_w = crtc_state->pipe_src_w;
> > > +	src_h = crtc_state->pipe_src_h;
> > > +	dst_w = adjusted_mode->crtc_hdisplay;
> > > +	dst_h = adjusted_mode->crtc_vdisplay;
> > > +
> > > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > > +	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
> > > +	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
> > > +
> > > +	return mul_fixed16(downscale_w, downscale_h);
> > > +}
> > > +
> > > +uint_fixed_16_16_t
> > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > +{
> > > +	struct drm_plane *plane;
> > > +	const struct drm_plane_state *drm_plane_state;
> > > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > > +
> > > +	if (!crtc_state->uapi.enable)
> > > +		return max_downscale;
> > > +
> > > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > > drm_plane_state, &crtc_state->uapi) {
> > > +		uint_fixed_16_16_t plane_downscale;
> > > +		const struct intel_plane_state *plane_state =
> > > +			to_intel_plane_state(drm_plane_state);
> > > +
> > > +		if (!intel_wm_plane_visible(crtc_state, plane_state))
> > > +			continue;
> > > +
> > > +		plane_downscale =
> > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > +
> > > +		DRM_DEBUG_KMS("Plane %d downscale amount %d.%d\n",
> > > +			      to_intel_plane(plane)->id,
> > > +			      plane_downscale.val >> 16,
> > > +			      plane_downscale.val & 0xffff);
> > > +
> > > +		max_downscale = max_fixed16(plane_downscale,
> > > max_downscale);
> > > +	}
> > > +
> > > +
> > > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > +
> > > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > +		       crtc->pipe, pipe_downscale.val >> 16,
> > > +		       pipe_downscale.val & 0xffff);
> > > +
> > > +	pipe_downscale = mul_fixed16(pipe_downscale, max_downscale);
> > > +
> > > +	/*
> > > +	 * Convert result to percentage: get the actual ratio instead
> > > of rate,
> > > +	 * then multiply by 100 to get percentage.
> > > +	 */
> > > +	pipe_ratio = u32_to_fixed16(100 * div_round_up_u32_fixed16(1,
> > > pipe_downscale));
> > > +
> > > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > +		      pipe_ratio.val >> 16, pipe_ratio.val & 0xffff);
> > > +
> > > +	return pipe_ratio;
> > > +}
> > 
> > Yuck. I just killed these.
> 
> I still need pipe_ratio fro ICL as you know, this is dictated by BSpec,
> not by me.

That part of the spec is pretty crazy IMO. I think we should just ignore
it for now. Also nothing in there says we must use both slices for a
single pipe so ignoring doesn't even violate the spec in any way.

> 
> > 
> > > +
> > > +struct dbuf_slice_conf_entry {
> > > +	u32 dbuf_mask[I915_MAX_PIPES];
> > > +	u32 active_pipes;
> > > +};
> > > +
> > > +
> > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > > BIT(PIPE_C) }
> > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C) |
> > > BIT(PIPE_D) }
> > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C) |
> > > BIT(PIPE_D) }
> > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > > BIT(PIPE_D) }
> > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B) |
> > > BIT(PIPE_C) }
> > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3, DBuf4)  \
> > > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) | BIT(PIPE_B) | \
> > > +					  BIT(PIPE_C) | BIT(PIPE_D) }
> > > +
> > > +/*
> > > + * Table taken from Bspec 12716
> > > + * Pipes do have some preferred DBuf slice affinity,
> > > + * plus there are some hardcoded requirements on how
> > > + * those should be distributed for multipipe scenarios.
> > > + * For more DBuf slices algorithm can get even more messy
> > > + * and less readable, so decided to use a table almost
> > > + * as is from BSpec itself - that way it is at least easier
> > > + * to compare, change and check.
> > > + */
> > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > +{ { 0, 0, 0, 0 }, 0 },
> > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > > < 88.8 */
> > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > > ratio >= 88.8 */
> > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > > < 88.8 */
> > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for pipe
> > > ratio >= 88.8 */
> > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe ratio
> > > < 88.8 */
> > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for pipe
> > > ratio >= 88.8 */
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > > +};
> > > +
> > > +/*
> > > + * Table taken from Bspec 49255
> > > + * Pipes do have some preferred DBuf slice affinity,
> > > + * plus there are some hardcoded requirements on how
> > > + * those should be distributed for multipipe scenarios.
> > > + * For more DBuf slices algorithm can get even more messy
> > > + * and less readable, so decided to use a table almost
> > > + * as is from BSpec itself - that way it is at least easier
> > > + * to compare, change and check.
> > > + */
> > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > +{ { 0, 0, 0, 0 }, 0 },
> > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT, DBUF_S2_BIT),
> > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT, DBUF_S2_BIT,
> > > DBUF_S2_BIT),
> > > +};
> > 
> > My eyes!
> > 
> > I have to think we should be able to reduce all that to a handful
> > of lines of code.
> 
> Yeah, then we are going to have a huge function with lots of weird
> definitions, unfortunately BSpec table has quite strange DBuf
> assignments like 
> +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> 
> i.e slices can get mixed in a quite various ways. There is no
> rational pattern in those and could be even dangerous to try to 
> optimize it some way, as one day it might change again in some
> unpredictable way.
> 
> Would you prefer to have it like
> 
> if (pipes are A and B)
>     program S2 to A and S1 to B
> if (pipes are A and C)
>     program S1 to A and S2 to C
> ...
> 
> I would prefer at least to see it in a comparable way with the
> table we have in BSpec, rather than having lots of weird looking
> if-else statements, GEN checks and so on.
> 
> I knew this table was not going to look like it is done typically
> here - but really my opinion that it is way better than hardcoding
> it into some weird algorithm, which would be hard to compare to 
> initial table, which is already strange enough.
> 
> If you don't like it - could you please explain why this is exactly
> worse than having long functions with hardcoded checks?

I think we should be able to encode this simply using the
"Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
is the distance of each pipe from each dbuf slice. Then all we have
to do is minimize the total distance. The following tables I think
should encode that reasonably concisely:

/* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
static const u8 icl_dbuf_distance[][] {
	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
};

/* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
static const u8 tgl_dbuf_distance[][] {
	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
	[PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
};

/*
 * pipeA - DBUF_S0 - DBUF_S1 - pipeB
 * pipeC - DBUF_S2 - DBUF_S3 - pipeD
 */
static const u8 future_dbuf_distance[][] {
	[PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
	[PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
	[PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
	[PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
};

I suppose we could make that even more minimal by just
directly sticking that "Pipe and DBUF ordering" language
into a single array but and then calculate each distance
from it, but that is perhaps more work, and we'd need
some kind of enum just for that so that we have a single
namespace for the pipes and dbuf slices. So i have a
feeling having those distances pre-computed in a table
like thet is maybe a better way to go.

> 
> I think this is pretty stupid to fix stuff which looks new or unusual
> just because somebody doesn't like it, without even constructive
> arguing.
> 
> - Stan
> 
> > 
> > 
> > > +
> > > +/*
> > > + * This function
> > 
> > "This function" is totally redundant. Ditto elsewhere.
> > 
> > > finds an entry with same enabled pipe configuration and
> > > + * returns correspondent DBuf slice mask as stated in BSpec for
> > > particular
> > > + * platform.
> > > + */
> > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > +				     u32 active_pipes,
> > > +				     const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	int i;
> > > +	uint_fixed_16_16_t pipe_ratio;
> > > +
> > > +	/*
> > > +	 * Calculate pipe ratio as stated in BSpec 28692
> > > +	 */
> > > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > +		if (icl_allowed_dbufs[i].active_pipes == active_pipes)
> > > {
> > > +			if (hweight32(active_pipes) == 1) {
> > > +				/*
> > > +				 * According to BSpec 12716: if ratio
> > > >= 88.8
> > > +				 * always use single dbuf slice.
> > > +				 */
> > > +				if (pipe_ratio.val >= div_fixed16(888,
> > > 10).val)
> > > +					++i;
> > > +			}
> > > +			return icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > +		}
> > > +	}
> > > +	return 0;
> > > +}
> > > +
> > > +/*
> > > + * This function finds an entry with same enabled pipe
> > > configuration and
> > > + * returns correspondent DBuf slice mask as stated in BSpec for
> > > particular
> > > + * platform.
> > > + */
> > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > +				     u32 active_pipes,
> > > +				     const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	int i;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > +		if (tgl_allowed_dbufs[i].active_pipes == active_pipes)
> > > +			return tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > +	}
> > > +	return 0;
> > > +}
> > > +
> > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private *dev_priv,
> > > +				      int pipe, u32 active_pipes,
> > > +				      const struct intel_crtc_state
> > > *crtc_state)
> > > +{
> > > +	if (IS_GEN(dev_priv, 11))
> > > +		return icl_get_allowed_dbuf_mask(pipe,
> > > +						 active_pipes,
> > > +						 crtc_state);
> > > +	else if (IS_GEN(dev_priv, 12))
> > > +		return tgl_get_allowed_dbuf_mask(pipe,
> > > +						 active_pipes,
> > > +						 crtc_state);
> > > +	/*
> > > +	 * For anything else just return one slice yet.
> > > +	 * Should be extended for other platforms.
> > > +	 */
> > > +	return DBUF_S1_BIT;
> > > +}
> > > +
> > >  static u64
> > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > *crtc_state,
> > >  				 u64 *plane_data_rate)
> > > -- 
> > > 2.17.1
> > 
> > 

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 17:17         ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18 17:17 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > wrote:
> > > > Also implemented algorithm for choosing DBuf slice
> > > > configuration
> > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > 
> > > > Now pipe allocation still stays proportional to pipe width as
> > > > before,
> > > > however within allowed DBuf slice for this particular
> > > > configuration.
> > > > 
> > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > might
> > > >     now differ from hw state.
> > > > 
> > > > v3: - Added new field "supported_slices" to ddb to separate max
> > > >       supported slices vs currently enabled, to avoid
> > > > confusion.
> > > >     - Removed obsolete comments and code related to
> > > > DBuf(Matthew
> > > > Roper).
> > > >     - Some code style and long line removal(Matthew Roper).
> > > >     - Added WARN_ON to new ddb range offset calc
> > > > function(Matthew
> > > > Roper).
> > > >     - Removed platform specific call to calc pipe ratio from
> > > > ddb
> > > >       allocation function and fixed the return value(Matthew
> > > > Roper)
> > > >     - Refactored DBUF slice allocation table to improve
> > > > readability
> > > >     - Added DBUF slice allocation for TGL as well.
> > > >     - ICL(however not TGL) seems to voluntarily enable second
> > > > DBuf
> > > > slice
> > > >       after pm suspend/resume causing a mismatch failure,
> > > > because
> > > > we
> > > >       update DBuf slices only if we do a modeset, however this
> > > > check
> > > >       is done always. Fixed it to be done only when modeset for
> > > > ICL.
> > > > 
> > > > v4: - Now enabled slices is not actually a number, but a
> > > > bitmask,
> > > >       because we might need to enable second slice only and
> > > > number
> > > >       of slices would still 1 and that current code doesn't
> > > > allow.
> > > >     - Remove redundant duplicate code to have some unified way
> > > > of
> > > >       enabling dbuf slices instead of hardcoding.
> > > > 
> > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > > thinking
> > > >       that we have only one DBUF_CTL slice. Now another version
> > > > for
> > > >       gen11 will check both slices as only second one can be
> > > > enabled,
> > > >       to keep CI happy.
> > > > 
> > > > v6: - Removed enabled dbuf assertion completely. Previous code
> > > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > > called.
> > > >       Currently we enable/disable dbuf slices based on
> > > > requrement
> > > >       so no point in asserting this here.
> > > >     - Removed unnecessary modeset check from
> > > > verify_wm_state(Matthew Roper)
> > > >     - Slices intersection after union is same as final
> > > > result(Matthew Roper)
> > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > >     - Some macros added(Matthew Roper)
> > > >     - ddb range is now always less or equal to ddb size - no
> > > > need
> > > > for additional
> > > >       checks(previously needed as we had some bandwidth checks
> > > > in
> > > > there which
> > > >       could "suddenly" return ddb_size smaller than it is.
> > > > 
> > > > v7: Minor costemic changes:
> > > >     - Changed icl_dbuf_slices_restore name to
> > > > icl_program_dbuf_slices
> > > >       as it more corresponds to the actual functionality.
> > > >     - Some simplification with supported slices for BXT and GLK
> > > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > > >       this is not used for skl anymore.
> > > >     - Changed DBuf slice assignment order for TGL case
> > > > 
> > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > Signed-off-by: Stanislav Lisovskiy <
> > > > stanislav.lisovskiy@intel.com>
> > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > ++++++++++++++++--
> > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > index 876fc25968bf..bd7aff675198 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > intel_crtc *crtc,
> > > >  
> > > >  	if (INTEL_GEN(dev_priv) >= 11 &&
> > > >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > -		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > %u, got
> > > > %u)\n",
> > > > +		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > %x, got
> > > > %x)\n",
> > > >  			  sw_ddb->enabled_slices,
> > > >  			  hw->ddb.enabled_slices);
> > > >  
> > > > @@ -14614,15 +14614,24 @@ static void
> > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > >  	u8 hw_enabled_slices = dev_priv-
> > > > >wm.skl_hw.ddb.enabled_slices;
> > > >  	u8 required_slices = state-
> > > > >wm_results.ddb.enabled_slices;
> > > >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > +	u8 slices_union = hw_enabled_slices | required_slices;
> > > > +	u8 slices_intersection = required_slices;
> > > >  
> > > >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > old_crtc_state, new_crtc_state, i)
> > > >  		/* ignore allocations for crtc's that have been
> > > > turned
> > > > off. */
> > > >  		if (new_crtc_state->hw.active)
> > > >  			entries[i] = old_crtc_state-
> > > > >wm.skl.ddb;
> > > >  
> > > > -	/* If 2nd DBuf slice required, enable it here */
> > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > hw_enabled_slices)
> > > > -		icl_dbuf_slices_update(dev_priv,
> > > > required_slices);
> > > > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > +		      required_slices, hw_enabled_slices);
> > > > +
> > > > +	/*
> > > > +	 * If some other DBuf slice required, enable it here,
> > > > +	 * as here we only add more slices, but not remove to
> > > > prevent
> > > > +	 * issues if somebody still uses those.
> > > > +	 */
> > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > hw_enabled_slices)
> > > > +		icl_dbuf_slices_update(dev_priv, slices_union);
> > > >  
> > > >  	/*
> > > >  	 * Whenever the number of active pipes changes, we need
> > > > to make
> > > > sure we
> > > > @@ -14681,9 +14690,12 @@ static void
> > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > >  		}
> > > >  	} while (progress);
> > > >  
> > > > -	/* If 2nd DBuf slice is no more required disable it */
> > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > hw_enabled_slices)
> > > > -		icl_dbuf_slices_update(dev_priv,
> > > > required_slices);
> > > > +	/*
> > > > +	 * If some other DBuf slice is not needed, disable it
> > > > here,
> > > > +	 * as here we only remove slices, which are not needed
> > > > anymore.
> > > > +	 */
> > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > hw_enabled_slices)
> > > > +		icl_dbuf_slices_update(dev_priv,
> > > > slices_intersection);
> > > 
> > > This is still a totally wrong place for this. We need to be sure
> > > the
> > > planes have switched over to the new ddb layout before we power
> > > off
> > > anything.
> > 
> > I have just modified already existing line here, not added by me.
> > The point of this patch is starting to use second DBuf slice but
> > not fixing all kinds of issues related to DBuf code at the same
> > time.
> 
> You're re-enabling code I disabled on purpose. It has to be fixed
> properly or not touched at all. In hindsight I should have probably
> just ripped it all out to not give people bad ideas.

-	/* If 2nd DBuf slice is no more required disable it */
> > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > hw_enabled_slices)
> > > > -		icl_dbuf_slices_update(dev_priv,
> > > > required_slices);

I'm not reenabling it - this is code is out there right now.
This is all again about changing all kinds of stuff at the same time.

Why we can't just do things step by step? You have a point that
this is wrong place ok, but as we see current code causes no
regressions. Just tell me your idea where to put it and I will
do that, if you don't have it yet - let it stay there, if CI shows
no regressions currently. 

> 
> > 
> > Of course this is nice idea to fix it - if that is wrong plane,
> > but current code somehow lives with it and probably this should be
> > with some other patch.
> > 
> > > 
> > > >  }
> > > >  
> > > >  static void intel_atomic_helper_free_state(struct
> > > > drm_i915_private
> > > > *dev_priv)
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > @@ -1031,15 +1031,6 @@ static bool
> > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > *dev_priv,
> > > >  		(I915_READ(DC_STATE_EN) &
> > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > >  }
> > > >  
> > > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > > *dev_priv)
> > > > -{
> > > > -	u32 tmp = I915_READ(DBUF_CTL);
> > > > -
> > > > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > -	     "Unexpected DBuf power power state (0x%08x)\n",
> > > > tmp);
> > > > -}
> > > > -
> > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > *dev_priv)
> > > >  {
> > > >  	struct intel_cdclk_state cdclk_state = {};
> > > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > > drm_i915_private *dev_priv)
> > > >  	/* Can't read out voltage_level so can't use
> > > > intel_cdclk_changed() */
> > > >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > > &cdclk_state));
> > > >  
> > > > -	gen9_assert_dbuf_enabled(dev_priv);
> > > > -
> > > >  	if (IS_GEN9_LP(dev_priv))
> > > >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> > > >  
> > > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > > drm_i915_private *dev_priv)
> > > >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > >  }
> > > >  
> > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > *dev_priv)
> > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > >  {
> > > > -	if (INTEL_GEN(dev_priv) < 11)
> > > > -		return 1;
> > > > -	return 2;
> > > > +	return INTEL_INFO(dev_priv)->supported_slices;
> > > > +}
> > > 
> > > Kinda pointless function now. 'supported_slices' is not a very
> > > good
> > > name. Should be 'num_dbuf_slices' or something along those lines.
> > 
> > Supported slices really means how much slices we support in
> > hardware.
> > num_dbuf_slices will not reflect that, it will be more ambigious.
> > I don't like this name.
> 
> supported_slices doesn't even say "dbuf" anywhere. So it's even more
> ambiguous. Also we already have num_pipes, num_sprites, etc.

..and in current code we have... "enabled_slices", no? Ouch! :) IS this
really that important and worth arguing at the very last moment?
whether number of
supported_slices_in_hardware, should be named "supported slices" or
"num_dbuf_slices"? If yes, lets go on, miss some targets and continue
discussing this "soooo important" thing. 

Some one might come up that it is called "num_slices" or
"dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
keep everyones bloated egos happy??


> 
> > 
> > > 
> > > > +
> > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > *dev_priv)
> > > > +{
> > > > +	const u8 hw_enabled_slices = dev_priv-
> > > > > wm.skl_hw.ddb.enabled_slices;
> > > 
> > > Seems a rather pointless variable. const at least is pointless.
> > > 
> > > > +
> > > > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > >  }
> > > >  
> > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > >  			    u8 req_slices)
> > > >  {
> > > > -	const u8 hw_enabled_slices = dev_priv-
> > > > > wm.skl_hw.ddb.enabled_slices;
> > > > 
> > > > -	bool ret;
> > > > +	bool ret = true;
> > > > +	int i;
> > > > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> > > >  
> > > > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > +	if (hweight8(req_slices) >
> > > > intel_dbuf_max_slices(dev_priv)) {
> > > >  		DRM_ERROR("Invalid number of dbuf slices
> > > > requested\n");
> > > >  		return;
> > > >  	}
> > > >  
> > > > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > > > -		return;
> > > > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > req_slices);
> > > >  
> > > > -	if (req_slices > hw_enabled_slices)
> > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > DBUF_CTL_S2,
> > > > true);
> > > > -	else
> > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > DBUF_CTL_S2,
> > > > false);
> > > > +	for (i = 0; i < max_slices; i++) {
> > > > +		int slice_bit = BIT(i);
> > > > +		bool slice_set = (slice_bit & req_slices) != 0;
> > > > +
> > > > +		switch (slice_bit) {
> > > > +		case DBUF_S1_BIT:
> > > > +			ret = ret &&
> > > > intel_dbuf_slice_set(dev_priv,
> > > > +							  DBUF_
> > > > CTL_S1,
> > > > +							  slice
> > > > _set);
> > > > +			break;
> > > > +		case DBUF_S2_BIT:
> > > > +			ret = ret &&
> > > > intel_dbuf_slice_set(dev_priv,
> > > > +							  DBUF_
> > > > CTL_S2,
> > > > +							  slice
> > > > _set);
> > > 
> > > The return value on intel_dbuf_slice_set() seems pointless. I'd
> > > nuke
> > > it.
> > > 
> > > Also as long you have intel_dbuf_slice_set() I'd probably just
> > > pass
> > > the
> > > slice index there and let it deal with the register details.
> > > DBUF_CTL
> > > should probably be parametrized as well so you don't need
> > > annoying
> > > switch
> > > statements.
> > 
> > What if it fails? Shouldn't we still have some clue what state do
> > we
> > have/had at the moment?
> 
> It if fails you log an error and keep going.

Ah sure. And let hw/sw state be completely different.
Also see next comment about this.

> 
> > 
> > > 
> > > > +			break;
> > > > +		default:
> > > > +			MISSING_CASE(slice_bit);
> > > > +		}
> > > > +	}
> > > >  
> > > >  	if (ret)
> > > >  		dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > req_slices;
> > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > drm_i915_private *dev_priv,
> > > >  
> > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > >  {
> > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > DBUF_POWER_REQUEST);
> > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > DBUF_POWER_REQUEST);
> > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > -
> > > > -	udelay(10);
> > > > -
> > > > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > -		DRM_ERROR("DBuf power enable timeout\n");
> > > > -	else
> > > > -		/*
> > > > -		 * FIXME: for now pretend that we only have 1
> > > > slice,
> > > > see
> > > > -		 * intel_enabled_dbuf_slices_num().
> > > > -		 */
> > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > +	/*
> > > > +	 * Just power up 1 slice, we will
> > > > +	 * figure out later which slices we have and what we
> > > > need.
> > > > +	 */
> > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > 
> > > That stuff really shouldn't be modified by low level functions.
> > > The intial value should come from readout, and after that it
> > > should
> > > be adjusted as part of the overall atomic state.
> > 
> > Well, as you can see currently this is even worse - has some weird
> > hardcoded stuff and is modified like that. Also if we are enabling
> > the 
> > dbuf slices, shouldn't we really determine what value should be
> > there?
> > 
> > Also I wouldn't really always trust the readout, as to me it seems
> > icelake resets it constantly to 2 after suspend/resume. 
> 
> The point of readout is that we syncrhonize our software state with
> the
> current state of hardware without having to modify the hardware
> state.
> Once synchronized the driver takes over and tells the hw what to do.

That is especially nice after you say "report error and don't care"
when I really want to reflect hw state in sw state, by checking whether
we could set dbuf slices or not. 
Now you say that we need to synchronize software state with hw state,
however if I fail to modify it - then "who cares".


> 
> > 
> > > 
> > > > +	icl_program_dbuf_slices(dev_priv);
> > > >  }
> > > >  
> > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > *dev_priv)
> > > >  {
> > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > ~DBUF_POWER_REQUEST);
> > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > ~DBUF_POWER_REQUEST);
> > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > -
> > > > -	udelay(10);
> > > > -
> > > > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > -		DRM_ERROR("DBuf power disable timeout!\n");
> > > > -	else
> > > > -		/*
> > > > -		 * FIXME: for now pretend that the first slice
> > > > is
> > > > always
> > > > -		 * enabled, see
> > > > intel_enabled_dbuf_slices_num().
> > > > -		 */
> > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > +	/*
> > > > +	 * Disable all slices
> > > > +	 */
> > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > +	icl_program_dbuf_slices(dev_priv);
> > > >  }
> > > >  
> > > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > drm_i915_private *i915,
> > > >  	for ((wf) = intel_display_power_get((i915), (domain));
> > > > (wf); \
> > > >  	     intel_display_power_put_async((i915), (domain),
> > > > (wf)),
> > > > (wf) = 0)
> > > >  
> > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > >  			    u8 req_slices);
> > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > *dev_priv);
> > > >  
> > > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > > >  			     bool override, unsigned int mask);
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -51,6 +51,7 @@
> > > >  #include "display/intel_bw.h"
> > > >  #include "display/intel_cdclk.h"
> > > >  #include "display/intel_display_types.h"
> > > > +#include "display/intel_display_power.h"
> > > >  #include "display/intel_dp.h"
> > > >  #include "display/intel_fbdev.h"
> > > >  #include "display/intel_hotplug.h"
> > > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > > device *kdev)
> > > >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > > >  		ret = vlv_resume_prepare(dev_priv, true);
> > > >  
> > > > +	/* Weird hack to fix ICL hardware bug, as it resets
> > > > DBUF slices
> > > > reg */
> > > > +	if (INTEL_GEN(dev_priv) == 11)
> > > > +		icl_program_dbuf_slices(dev_priv);
> > > > +
> > > >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> > > >  
> > > >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -804,7 +804,7 @@ static inline bool
> > > > skl_ddb_entry_equal(const
> > > > struct skl_ddb_entry *e1,
> > > >  }
> > > >  
> > > >  struct skl_ddb_allocation {
> > > 
> > > This struct is quite pointless. Should be killed off.
> > 
> > Legacy stuff, not added by me. Quite fine with removing it - 
> > but should we really do all this at the same time and in same
> > patch? 
> 
> Not the same patch. But clean up the cobwebs first, then add the new
> stuff. Much easier that way around.

There is a shit ton load of cobwebs to clean. Personally my opinion is
that current dbuf controlling code is total garbage - which has lots of
hardcoded stuff. Good luck enabling only second dbuf slice, good luck
disabling/enabling dbuf properly....
Where you were, when this was committed then?

If we need to fix it - fine. But may be we should do it in stages?
Point of this patch is "enable second dbuf slice", next patch "remove 
unneeded structs" and so on. 
While you want me to shuffle everything at the same time, which is not
good idea, especially considering complexity and amount of legacy stuff
we currently have.

> 
> > 
> > > 
> > > > -	u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > */
> > > > +	u8 enabled_slices;   /* Bitmask of currently enabled
> > > > DBuf
> > > > slices */
> > > >  };
> > > >  
> > > >  struct skl_ddb_values {
> > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > index 1bb701d32a5d..8e86af505730 100644
> > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > intel_cherryview_info = {
> > > >  	.has_gt_uc = 1, \
> > > >  	.display.has_hdcp = 1, \
> > > >  	.display.has_ipc = 1, \
> > > > -	.ddb_size = 896
> > > > +	.ddb_size = 896, \
> > > > +	.supported_slices = 1
> > > >  
> > > >  #define SKL_PLATFORM \
> > > >  	GEN9_FEATURES, \
> > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > intel_skylake_gt4_info = {
> > > >  	GEN(9), \
> > > >  	.is_lp = 1, \
> > > >  	.display.has_hotplug = 1, \
> > > > +	.supported_slices = 1, \
> > > >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > BIT(VECS0),
> > > > \
> > > >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > > >  	.has_64bit_reloc = 1, \
> > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > intel_coffeelake_gt3_info = {
> > > >  	GEN9_FEATURES, \
> > > >  	GEN(10), \
> > > >  	.ddb_size = 1024, \
> > > > +	.supported_slices = 1, \
> > > >  	.display.has_dsc = 1, \
> > > >  	.has_coherent_ggtt = false, \
> > > >  	GLK_COLORS
> > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > intel_cannonlake_info = {
> > > >  	}, \
> > > >  	GEN(11), \
> > > >  	.ddb_size = 2048, \
> > > > +	.supported_slices = 2, \
> > > >  	.has_logical_ring_elsq = 1, \
> > > >  	.color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > 262145 }
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > index a607ea520829..fba5731063d8 100644
> > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > @@ -7734,9 +7734,11 @@ enum {
> > > >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> > > >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> > > >  #define  DISP_IPC_ENABLE		(1 << 3)
> > > > -#define DBUF_CTL	_MMIO(0x45008)
> > > > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > > > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > > > +#define DBUF_S1_BIT			BIT(0)
> > > > +#define DBUF_S2_BIT			BIT(1)
> > > 
> > > These are not bits inside DBUF_CTL so they should not be here.
> > > I'd just nuke them entirely. If you think we need more than an
> > > int to refer to a slice then IMO something simple like
> > > 
> > > enum dbuf_slice {
> > > 	DBUF_S1,
> > > 	DBUF_S2,
> > > };
> > > 
> > > would match the general apporach we use for other things much
> > > better
> > > rather than defining some shifted values.
> > 
> > We need to combine slices - so bitmask is quite suitable for that, 
> > I can of course use this enum, but then I'll have to re-encode it 
> > to bitmask anyways.
> 
> Which is pretty much how everything else works in the driver.

Another piece of incredibly important thing to argue..

> 
> > 
> > 
> > > 
> > > > +#define DBUF_CTL			_MMIO(0x45008)
> > > > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > > > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> > > >  #define  DBUF_POWER_REQUEST		(1 << 31)
> > > >  #define  DBUF_POWER_STATE		(1 << 30)
> > > >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > >  	} display;
> > > >  
> > > >  	u16 ddb_size; /* in blocks */
> > > > +	u8 supported_slices; /* number of DBuf slices */
> > > >  
> > > >  	/* Register offsets for the various display pipes and
> > > > transcoders */
> > > >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > index 2d389e437e87..1ea81ab92429 100644
> > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > drm_device
> > > > *dev)
> > > >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > >  }
> > > >  
> > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > drm_i915_private
> > > > *dev_priv)
> > > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > > *dev_priv)
> > > >  {
> > > > -	u8 enabled_slices;
> > > > -
> > > > -	/* Slice 1 will always be enabled */
> > > > -	enabled_slices = 1;
> > > > +	u8 enabled_slices = 0;
> > > >  
> > > >  	/* Gen prior to GEN11 have only one DBuf slice */
> > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > > -		return enabled_slices;
> > > > +		return DBUF_S1_BIT;
> > > 
> > > I'd nuke this special case entirely and just read everything from
> > > the
> > > hw. Assuming we need readout somewhere.
> > 
> > We have it that way hardcoded right now in drm-tip. I think this is
> > a
> > subject to other patch, but not this change. As I understand we
> > need 
> > to first figure out how to do it properly. 
> > This patch anyway makes it better by at least removing hardcoded
> > relation between number of slices and which slices are enabled.
> > With current code there is no way for example to enable second DBuf
> > slice only, which is required by BSpec.
> > 
> > > 
> > > >  
> > > > -	/*
> > > > -	 * FIXME: for now we'll only ever use 1 slice; pretend
> > > > that we
> > > > have
> > > > -	 * only that 1 slice enabled until we have a proper way
> > > > for on-
> > > > demand
> > > > -	 * toggling of the second slice.
> > > > -	 */
> > > > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > -		enabled_slices++;
> > > > +	/* Check if second DBuf slice is enabled */
> > > > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > +		enabled_slices |= DBUF_S1_BIT;
> > > > +
> > > > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > +		enabled_slices |= DBUF_S2_BIT;
> > > >  
> > > >  	return enabled_slices;
> > > >  }
> > > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > > drm_i915_private *dev_priv,
> > > >  			      const int num_active,
> > > >  			      struct skl_ddb_allocation *ddb)
> > > >  {
> > > > -	const struct drm_display_mode *adjusted_mode;
> > > > -	u64 total_data_bw;
> > > >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > -
> > > >  	WARN_ON(ddb_size == 0);
> > > >  
> > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > >  		return ddb_size - 4; /* 4 blocks for bypass
> > > > path
> > > > allocation */
> > > >  
> > > > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > -	total_data_bw = total_data_rate *
> > > > drm_mode_vrefresh(adjusted_mode);
> > > > +	return ddb_size;
> > > > +}
> > > >  
> > > > -	/*
> > > > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > > > -	 *
> > > > -	 * FIXME dbuf slice code is broken:
> > > > -	 * - must wait for planes to stop using the slice
> > > > before
> > > > powering it off
> > > > -	 * - plane straddling both slices is illegal in multi-
> > > > pipe
> > > > scenarios
> > > > -	 * - should validate we stay within the hw bandwidth
> > > > limits
> > > > -	 */
> > > > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12)))
> > > > {
> > > > -		ddb->enabled_slices = 2;
> > > > -	} else {
> > > > -		ddb->enabled_slices = 1;
> > > > -		ddb_size /= 2;
> > > > -	}
> > > > +/*
> > > > + * Calculate initial DBuf slice offset, based on slice size
> > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > enabled
> > > > + * offset would be 1024)
> > > > + */
> > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > dbuf_slice_mask,
> > > > +					   u32 slice_size, u32
> > > > ddb_size)
> > > > +{
> > > > +	u32 offset = 0;
> > > >  
> > > > -	return ddb_size;
> > > > +	if (!dbuf_slice_mask)
> > > > +		return 0;
> > > > +
> > > > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > +
> > > > +	WARN_ON(offset >= ddb_size);
> > > > +	return offset;
> > > >  }
> > > >  
> > > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > *dev_priv,
> > > > +				      int pipe, u32
> > > > active_pipes,
> > > > +				      const struct
> > > > intel_crtc_state
> > > > *crtc_state);
> > > > +
> > > >  static void
> > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > *dev_priv,
> > > >  				   const struct
> > > > intel_crtc_state
> > > > *crtc_state,
> > > > @@ -3857,7 +3855,14 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  	u32 pipe_width = 0, total_width = 0, width_before_pipe
> > > > = 0;
> > > >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > >  	u16 ddb_size;
> > > > +	u32 ddb_range_size;
> > > >  	u32 i;
> > > > +	u32 dbuf_slice_mask;
> > > > +	u32 active_pipes;
> > > > +	u32 offset;
> > > > +	u32 slice_size;
> > > > +	u32 total_slice_mask;
> > > > +	u32 start, end;
> > > >  
> > > >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > >  		alloc->start = 0;
> > > > @@ -3866,14 +3871,23 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  		return;
> > > >  	}
> > > >  
> > > > -	if (intel_state->active_pipe_changes)
> > > > +	if (intel_state->active_pipe_changes) {
> > > >  		*num_active = hweight8(intel_state-
> > > > >active_pipes);
> > > > -	else
> > > > +		active_pipes = intel_state->active_pipes;
> > > > +	} else {
> > > >  		*num_active = hweight8(dev_priv->active_pipes);
> > > > +		active_pipes = dev_priv->active_pipes;
> > > > +	}
> > > >  
> > > >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > total_data_rate,
> > > >  				      *num_active, ddb);
> > > >  
> > > > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > +
> > > > +	slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > >supported_slices;
> > > > +
> > > > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > > +
> > > >  	/*
> > > >  	 * If the state doesn't change the active CRTC's or
> > > > there is no
> > > >  	 * modeset request, then there's no need to
> > > > recalculate;
> > > > @@ -3891,20 +3905,70 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  		return;
> > > >  	}
> > > >  
> > > > +	/*
> > > > +	 * Get allowed DBuf slices for correspondent pipe and
> > > > platform.
> > > > +	 */
> > > > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > > for_pipe,
> > > > +						     active_pip
> > > > es,
> > > > crtc_state);
> > > 
> > > Quite a few redundant arguments. The function name looks alien.
> > > 
> > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > 
> > > > +
> > > > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes
> > > > %x\n",
> > > > +		      dbuf_slice_mask,
> > > > +		      for_pipe, active_pipes);
> > > > +
> > > > +	/*
> > > > +	 * Figure out at which DBuf slice we start, i.e if we
> > > > start at
> > > > Dbuf S2
> > > > +	 * and slice size is 1024, the offset would be 1024
> > > > +	 */
> > > > +	offset =
> > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > +						 slice_size,
> > > > ddb_size);
> > > > +
> > > > +	/*
> > > > +	 * Figure out total size of allowed DBuf slices, which
> > > > is
> > > > basically
> > > > +	 * a number of allowed slices for that pipe multiplied
> > > > by slice
> > > > size.
> > > > +	 * Inside of this
> > > > +	 * range ddb entries are still allocated in proportion
> > > > to
> > > > display width.
> > > > +	 */
> > > > +	ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > slice_size;
> > > > +
> > > >  	/*
> > > >  	 * Watermark/ddb requirement highly depends upon width
> > > > of the
> > > >  	 * framebuffer, So instead of allocating DDB equally
> > > > among
> > > > pipes
> > > >  	 * distribute DDB based on resolution/width of the
> > > > display.
> > > >  	 */
> > > > +	total_slice_mask = dbuf_slice_mask;
> > > 
> > > Isn't this already taken care by the loop below?
> > > 
> > > >  	for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > crtc_state,
> > > > i) {
> > > >  		const struct drm_display_mode *adjusted_mode =
> > > >  			&crtc_state->hw.adjusted_mode;
> > > >  		enum pipe pipe = crtc->pipe;
> > > >  		int hdisplay, vdisplay;
> > > > +		u32 pipe_dbuf_slice_mask = \
> > > > +			i915_get_allowed_dbuf_mask(dev_priv,
> > > > +						pipe,
> > > > +						active_pipes,
> > > > +						crtc_state);
> > > >  
> > > >  		if (!crtc_state->hw.enable)
> > > >  			continue;
> > > >  
> > > > +		/*
> > > > +		 * According to BSpec pipe can share one dbuf
> > > > slice
> > > > with another
> > > > +		 * pipes or pipe can use multiple dbufs, in
> > > > both cases
> > > > we
> > > > +		 * account for other pipes only if they have
> > > > exactly
> > > > same mask.
> > > > +		 * However we need to account how many slices
> > > > we should
> > > > enable
> > > > +		 * in total.
> > > > +		 */
> > > > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > > 
> > > total_slice_mask will now account only the crtcs in the state.
> > > What
> > > happens to the other pipes' slices?

Same as happens now btw. Right now it also does the same thing.

> > 
> > Yes we need to iterate through all pipes/crtcs here, however
> > for getting correct dbuf slice mask most important is active_pipes.
> > However yes, I will change the loop for it to be more correct.
> > 
> > > 
> > > > +
> > > > +		/*
> > > > +		 * Do not account pipes using other slice sets
> > > > +		 * luckily as of current BSpec slice sets do
> > > > not
> > > > partially
> > > > +		 * intersect(pipes share either same one slice
> > > > or same
> > > > slice set
> > > > +		 * i.e no partial intersection), so it is
> > > > enough to
> > > > check for
> > > > +		 * equality for now.
> > > > +		 */
> > > > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > +			continue;
> > > > +
> > > >  		drm_mode_get_hv_timing(adjusted_mode,
> > > > &hdisplay,
> > > > &vdisplay);
> > > >  		total_width += hdisplay;
> > > >  
> > > > @@ -3914,8 +3978,19 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  			pipe_width = hdisplay;
> > > >  	}
> > > >  
> > > > -	alloc->start = ddb_size * width_before_pipe /
> > > > total_width;
> > > > -	alloc->end = ddb_size * (width_before_pipe +
> > > > pipe_width) /
> > > > total_width;
> > > > +	ddb->enabled_slices = total_slice_mask;
> > > > +
> > > > +	start = ddb_range_size * width_before_pipe /
> > > > total_width;
> > > > +	end = ddb_range_size * (width_before_pipe + pipe_width)
> > > > /
> > > > total_width;
> > > > +
> > > > +	alloc->start = offset + start;
> > > > +	alloc->end = offset + end;
> > > > +
> > > > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > +		      alloc->start, alloc->end);
> > > > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported
> > > > %d\n",
> > > > +		      ddb->enabled_slices,
> > > > +		      INTEL_INFO(dev_priv)->supported_slices);
> > > >  }
> > > >  
> > > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > > *crtc_state,
> > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > intel_crtc *crtc,
> > > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > > >  			  struct skl_ddb_allocation *ddb /* out
> > > > */)
> > > >  {
> > > > -	ddb->enabled_slices =
> > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > +	ddb->enabled_slices =
> > > > intel_enabled_dbuf_slices(dev_priv);
> > > > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > enabled_slices);
> > > > 
> > > >  }
> > > >  
> > > >  /*
> > > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > > intel_crtc_state *crtc_state,
> > > >  	return total_data_rate;
> > > >  }
> > > >  
> > > > +static uint_fixed_16_16_t
> > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	u32 src_w, src_h, dst_w, dst_h;
> > > > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > > > +	const struct drm_display_mode *adjusted_mode =
> > > > &crtc_state-
> > > > > hw.adjusted_mode;
> > > > 
> > > > +
> > > > +	src_w = crtc_state->pipe_src_w;
> > > > +	src_h = crtc_state->pipe_src_h;
> > > > +	dst_w = adjusted_mode->crtc_hdisplay;
> > > > +	dst_h = adjusted_mode->crtc_vdisplay;
> > > > +
> > > > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > +	downscale_w = max_fixed16(fp_w_ratio,
> > > > u32_to_fixed16(1));
> > > > +	downscale_h = max_fixed16(fp_h_ratio,
> > > > u32_to_fixed16(1));
> > > > +
> > > > +	return mul_fixed16(downscale_w, downscale_h);
> > > > +}
> > > > +
> > > > +uint_fixed_16_16_t
> > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > +{
> > > > +	struct drm_plane *plane;
> > > > +	const struct drm_plane_state *drm_plane_state;
> > > > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > >uapi.crtc);
> > > > +
> > > > +	if (!crtc_state->uapi.enable)
> > > > +		return max_downscale;
> > > > +
> > > > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > drm_plane_state, &crtc_state->uapi) {
> > > > +		uint_fixed_16_16_t plane_downscale;
> > > > +		const struct intel_plane_state *plane_state =
> > > > +			to_intel_plane_state(drm_plane_state);
> > > > +
> > > > +		if (!intel_wm_plane_visible(crtc_state,
> > > > plane_state))
> > > > +			continue;
> > > > +
> > > > +		plane_downscale =
> > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > +
> > > > +		DRM_DEBUG_KMS("Plane %d downscale amount
> > > > %d.%d\n",
> > > > +			      to_intel_plane(plane)->id,
> > > > +			      plane_downscale.val >> 16,
> > > > +			      plane_downscale.val & 0xffff);
> > > > +
> > > > +		max_downscale = max_fixed16(plane_downscale,
> > > > max_downscale);
> > > > +	}
> > > > +
> > > > +
> > > > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > +
> > > > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > +		       crtc->pipe, pipe_downscale.val >> 16,
> > > > +		       pipe_downscale.val & 0xffff);
> > > > +
> > > > +	pipe_downscale = mul_fixed16(pipe_downscale,
> > > > max_downscale);
> > > > +
> > > > +	/*
> > > > +	 * Convert result to percentage: get the actual ratio
> > > > instead
> > > > of rate,
> > > > +	 * then multiply by 100 to get percentage.
> > > > +	 */
> > > > +	pipe_ratio = u32_to_fixed16(100 *
> > > > div_round_up_u32_fixed16(1,
> > > > pipe_downscale));
> > > > +
> > > > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > +		      pipe_ratio.val >> 16, pipe_ratio.val &
> > > > 0xffff);
> > > > +
> > > > +	return pipe_ratio;
> > > > +}
> > > 
> > > Yuck. I just killed these.
> > 
> > I still need pipe_ratio fro ICL as you know, this is dictated by
> > BSpec,
> > not by me.
> 
> That part of the spec is pretty crazy IMO. I think we should just
> ignore
> it for now. Also nothing in there says we must use both slices for a
> single pipe so ignoring doesn't even violate the spec in any way.

So you are instructing me now to ignore BSpec as "we are smarter"?
I don't think this is properly to do it this way. Lets first change the
BSpec, if we don't like pipe ratio here and then driver. Currently this
is written that way and if we are going to do everything our own way
ignoring specs, this is going to be chaos.

> 
> > 
> > > 
> > > > +
> > > > +struct dbuf_slice_conf_entry {
> > > > +	u32 dbuf_mask[I915_MAX_PIPES];
> > > > +	u32 active_pipes;
> > > > +};
> > > > +
> > > > +
> > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > |
> > > > BIT(PIPE_C) }
> > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C)
> > > > |
> > > > BIT(PIPE_D) }
> > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C)
> > > > |
> > > > BIT(PIPE_D) }
> > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > |
> > > > BIT(PIPE_D) }
> > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > |
> > > > BIT(PIPE_C) }
> > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > DBuf4)  \
> > > > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > BIT(PIPE_B) | \
> > > > +					  BIT(PIPE_C) |
> > > > BIT(PIPE_D) }
> > > > +
> > > > +/*
> > > > + * Table taken from Bspec 12716
> > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > + * plus there are some hardcoded requirements on how
> > > > + * those should be distributed for multipipe scenarios.
> > > > + * For more DBuf slices algorithm can get even more messy
> > > > + * and less readable, so decided to use a table almost
> > > > + * as is from BSpec itself - that way it is at least easier
> > > > + * to compare, change and check.
> > > > + */
> > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > ratio
> > > > < 88.8 */
> > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > pipe
> > > > ratio >= 88.8 */
> > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > ratio
> > > > < 88.8 */
> > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > pipe
> > > > ratio >= 88.8 */
> > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > ratio
> > > > < 88.8 */
> > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > pipe
> > > > ratio >= 88.8 */
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT),
> > > > +};
> > > > +
> > > > +/*
> > > > + * Table taken from Bspec 49255
> > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > + * plus there are some hardcoded requirements on how
> > > > + * those should be distributed for multipipe scenarios.
> > > > + * For more DBuf slices algorithm can get even more messy
> > > > + * and less readable, so decided to use a table almost
> > > > + * as is from BSpec itself - that way it is at least easier
> > > > + * to compare, change and check.
> > > > + */
> > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT,
> > > > DBUF_S2_BIT),
> > > > +};
> > > 
> > > My eyes!
> > > 
> > > I have to think we should be able to reduce all that to a handful
> > > of lines of code.
> > 
> > Yeah, then we are going to have a huge function with lots of weird
> > definitions, unfortunately BSpec table has quite strange DBuf
> > assignments like 
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > 
> > i.e slices can get mixed in a quite various ways. There is no
> > rational pattern in those and could be even dangerous to try to 
> > optimize it some way, as one day it might change again in some
> > unpredictable way.
> > 
> > Would you prefer to have it like
> > 
> > if (pipes are A and B)
> >     program S2 to A and S1 to B
> > if (pipes are A and C)
> >     program S1 to A and S2 to C
> > ...
> > 
> > I would prefer at least to see it in a comparable way with the
> > table we have in BSpec, rather than having lots of weird looking
> > if-else statements, GEN checks and so on.
> > 
> > I knew this table was not going to look like it is done typically
> > here - but really my opinion that it is way better than hardcoding
> > it into some weird algorithm, which would be hard to compare to 
> > initial table, which is already strange enough.
> > 
> > If you don't like it - could you please explain why this is exactly
> > worse than having long functions with hardcoded checks?
> 
> I think we should be able to encode this simply using the
> "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> is the distance of each pipe from each dbuf slice. Then all we have
> to do is minimize the total distance. The following tables I think
> should encode that reasonably concisely:
> 
> /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> static const u8 icl_dbuf_distance[][] {
> 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> };
> 
> /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> static const u8 tgl_dbuf_distance[][] {
> 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> 	[PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> };

I think you are missing my point. I will explain why.
My initial idea was similar, use some kind of distance
metrics and encode everything in a smart way.
Current BSpec table has sometimes slices swapped for no reason.

There are seem to be some hardware constraints we are not aware
of. For example for TGL pipes A,B get S2 and S1 slices,
but pipes A C get S1 and S2.

if you encode all this into some kind of distance algorithm,
those differences would be enourmously difficult to spot and
compare with actual table. That is why I decided to do it that
way.

Most likely yes you can solve it by using this metrics,
but will it really be then more readable and comparable
with reference table? 
What if some contraints will be added again, like pipe ratio?

> 
> /*
>  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
>  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
>  */
> static const u8 future_dbuf_distance[][] {
> 	[PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> 	[PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> 	[PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> 	[PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> };
> 
> I suppose we could make that even more minimal by just
> directly sticking that "Pipe and DBUF ordering" language
> into a single array but and then calculate each distance
> from it, but that is perhaps more work, and we'd need
> some kind of enum just for that so that we have a single
> namespace for the pipes and dbuf slices. So i have a
> feeling having those distances pre-computed in a table
> like thet is maybe a better way to go.

Good luck then spoting differences with the BSpec table once it
gets changed.

> 
> > 
> > I think this is pretty stupid to fix stuff which looks new or
> > unusual
> > just because somebody doesn't like it, without even constructive
> > arguing.
> > 
> > - Stan
> > 
> > > 
> > > 
> > > > +
> > > > +/*
> > > > + * This function
> > > 
> > > "This function" is totally redundant. Ditto elsewhere.
> > > 
> > > > finds an entry with same enabled pipe configuration and
> > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > for
> > > > particular
> > > > + * platform.
> > > > + */
> > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > +				     u32 active_pipes,
> > > > +				     const struct
> > > > intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	int i;
> > > > +	uint_fixed_16_16_t pipe_ratio;
> > > > +
> > > > +	/*
> > > > +	 * Calculate pipe ratio as stated in BSpec 28692
> > > > +	 */
> > > > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > +		if (icl_allowed_dbufs[i].active_pipes ==
> > > > active_pipes)
> > > > {
> > > > +			if (hweight32(active_pipes) == 1) {
> > > > +				/*
> > > > +				 * According to BSpec 12716: if
> > > > ratio
> > > > > = 88.8
> > > > 
> > > > +				 * always use single dbuf
> > > > slice.
> > > > +				 */
> > > > +				if (pipe_ratio.val >=
> > > > div_fixed16(888,
> > > > 10).val)
> > > > +					++i;
> > > > +			}
> > > > +			return
> > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > +		}
> > > > +	}
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * This function finds an entry with same enabled pipe
> > > > configuration and
> > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > for
> > > > particular
> > > > + * platform.
> > > > + */
> > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > +				     u32 active_pipes,
> > > > +				     const struct
> > > > intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	int i;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > +		if (tgl_allowed_dbufs[i].active_pipes ==
> > > > active_pipes)
> > > > +			return
> > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > +	}
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > *dev_priv,
> > > > +				      int pipe, u32
> > > > active_pipes,
> > > > +				      const struct
> > > > intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	if (IS_GEN(dev_priv, 11))
> > > > +		return icl_get_allowed_dbuf_mask(pipe,
> > > > +						 active_pipes,
> > > > +						 crtc_state);
> > > > +	else if (IS_GEN(dev_priv, 12))
> > > > +		return tgl_get_allowed_dbuf_mask(pipe,
> > > > +						 active_pipes,
> > > > +						 crtc_state);
> > > > +	/*
> > > > +	 * For anything else just return one slice yet.
> > > > +	 * Should be extended for other platforms.
> > > > +	 */
> > > > +	return DBUF_S1_BIT;
> > > > +}
> > > > +
> > > >  static u64
> > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > *crtc_state,
> > > >  				 u64 *plane_data_rate)
> > > > -- 
> > > > 2.17.1
> > > 
> > > 
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 17:17         ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18 17:17 UTC (permalink / raw)
  To: ville.syrjala; +Cc: intel-gfx

On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > wrote:
> > > > Also implemented algorithm for choosing DBuf slice
> > > > configuration
> > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > 
> > > > Now pipe allocation still stays proportional to pipe width as
> > > > before,
> > > > however within allowed DBuf slice for this particular
> > > > configuration.
> > > > 
> > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > might
> > > >     now differ from hw state.
> > > > 
> > > > v3: - Added new field "supported_slices" to ddb to separate max
> > > >       supported slices vs currently enabled, to avoid
> > > > confusion.
> > > >     - Removed obsolete comments and code related to
> > > > DBuf(Matthew
> > > > Roper).
> > > >     - Some code style and long line removal(Matthew Roper).
> > > >     - Added WARN_ON to new ddb range offset calc
> > > > function(Matthew
> > > > Roper).
> > > >     - Removed platform specific call to calc pipe ratio from
> > > > ddb
> > > >       allocation function and fixed the return value(Matthew
> > > > Roper)
> > > >     - Refactored DBUF slice allocation table to improve
> > > > readability
> > > >     - Added DBUF slice allocation for TGL as well.
> > > >     - ICL(however not TGL) seems to voluntarily enable second
> > > > DBuf
> > > > slice
> > > >       after pm suspend/resume causing a mismatch failure,
> > > > because
> > > > we
> > > >       update DBuf slices only if we do a modeset, however this
> > > > check
> > > >       is done always. Fixed it to be done only when modeset for
> > > > ICL.
> > > > 
> > > > v4: - Now enabled slices is not actually a number, but a
> > > > bitmask,
> > > >       because we might need to enable second slice only and
> > > > number
> > > >       of slices would still 1 and that current code doesn't
> > > > allow.
> > > >     - Remove redundant duplicate code to have some unified way
> > > > of
> > > >       enabling dbuf slices instead of hardcoding.
> > > > 
> > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > > thinking
> > > >       that we have only one DBUF_CTL slice. Now another version
> > > > for
> > > >       gen11 will check both slices as only second one can be
> > > > enabled,
> > > >       to keep CI happy.
> > > > 
> > > > v6: - Removed enabled dbuf assertion completely. Previous code
> > > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > > called.
> > > >       Currently we enable/disable dbuf slices based on
> > > > requrement
> > > >       so no point in asserting this here.
> > > >     - Removed unnecessary modeset check from
> > > > verify_wm_state(Matthew Roper)
> > > >     - Slices intersection after union is same as final
> > > > result(Matthew Roper)
> > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > >     - Some macros added(Matthew Roper)
> > > >     - ddb range is now always less or equal to ddb size - no
> > > > need
> > > > for additional
> > > >       checks(previously needed as we had some bandwidth checks
> > > > in
> > > > there which
> > > >       could "suddenly" return ddb_size smaller than it is.
> > > > 
> > > > v7: Minor costemic changes:
> > > >     - Changed icl_dbuf_slices_restore name to
> > > > icl_program_dbuf_slices
> > > >       as it more corresponds to the actual functionality.
> > > >     - Some simplification with supported slices for BXT and GLK
> > > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > > >       this is not used for skl anymore.
> > > >     - Changed DBuf slice assignment order for TGL case
> > > > 
> > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > Signed-off-by: Stanislav Lisovskiy <
> > > > stanislav.lisovskiy@intel.com>
> > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > ++++++++++++++++--
> > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > index 876fc25968bf..bd7aff675198 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > intel_crtc *crtc,
> > > >  
> > > >  	if (INTEL_GEN(dev_priv) >= 11 &&
> > > >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > -		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > %u, got
> > > > %u)\n",
> > > > +		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > %x, got
> > > > %x)\n",
> > > >  			  sw_ddb->enabled_slices,
> > > >  			  hw->ddb.enabled_slices);
> > > >  
> > > > @@ -14614,15 +14614,24 @@ static void
> > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > >  	u8 hw_enabled_slices = dev_priv-
> > > > >wm.skl_hw.ddb.enabled_slices;
> > > >  	u8 required_slices = state-
> > > > >wm_results.ddb.enabled_slices;
> > > >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > +	u8 slices_union = hw_enabled_slices | required_slices;
> > > > +	u8 slices_intersection = required_slices;
> > > >  
> > > >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > old_crtc_state, new_crtc_state, i)
> > > >  		/* ignore allocations for crtc's that have been
> > > > turned
> > > > off. */
> > > >  		if (new_crtc_state->hw.active)
> > > >  			entries[i] = old_crtc_state-
> > > > >wm.skl.ddb;
> > > >  
> > > > -	/* If 2nd DBuf slice required, enable it here */
> > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > hw_enabled_slices)
> > > > -		icl_dbuf_slices_update(dev_priv,
> > > > required_slices);
> > > > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > +		      required_slices, hw_enabled_slices);
> > > > +
> > > > +	/*
> > > > +	 * If some other DBuf slice required, enable it here,
> > > > +	 * as here we only add more slices, but not remove to
> > > > prevent
> > > > +	 * issues if somebody still uses those.
> > > > +	 */
> > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > hw_enabled_slices)
> > > > +		icl_dbuf_slices_update(dev_priv, slices_union);
> > > >  
> > > >  	/*
> > > >  	 * Whenever the number of active pipes changes, we need
> > > > to make
> > > > sure we
> > > > @@ -14681,9 +14690,12 @@ static void
> > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > >  		}
> > > >  	} while (progress);
> > > >  
> > > > -	/* If 2nd DBuf slice is no more required disable it */
> > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > hw_enabled_slices)
> > > > -		icl_dbuf_slices_update(dev_priv,
> > > > required_slices);
> > > > +	/*
> > > > +	 * If some other DBuf slice is not needed, disable it
> > > > here,
> > > > +	 * as here we only remove slices, which are not needed
> > > > anymore.
> > > > +	 */
> > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > hw_enabled_slices)
> > > > +		icl_dbuf_slices_update(dev_priv,
> > > > slices_intersection);
> > > 
> > > This is still a totally wrong place for this. We need to be sure
> > > the
> > > planes have switched over to the new ddb layout before we power
> > > off
> > > anything.
> > 
> > I have just modified already existing line here, not added by me.
> > The point of this patch is starting to use second DBuf slice but
> > not fixing all kinds of issues related to DBuf code at the same
> > time.
> 
> You're re-enabling code I disabled on purpose. It has to be fixed
> properly or not touched at all. In hindsight I should have probably
> just ripped it all out to not give people bad ideas.

-	/* If 2nd DBuf slice is no more required disable it */
> > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > hw_enabled_slices)
> > > > -		icl_dbuf_slices_update(dev_priv,
> > > > required_slices);

I'm not reenabling it - this is code is out there right now.
This is all again about changing all kinds of stuff at the same time.

Why we can't just do things step by step? You have a point that
this is wrong place ok, but as we see current code causes no
regressions. Just tell me your idea where to put it and I will
do that, if you don't have it yet - let it stay there, if CI shows
no regressions currently. 

> 
> > 
> > Of course this is nice idea to fix it - if that is wrong plane,
> > but current code somehow lives with it and probably this should be
> > with some other patch.
> > 
> > > 
> > > >  }
> > > >  
> > > >  static void intel_atomic_helper_free_state(struct
> > > > drm_i915_private
> > > > *dev_priv)
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > @@ -1031,15 +1031,6 @@ static bool
> > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > *dev_priv,
> > > >  		(I915_READ(DC_STATE_EN) &
> > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > >  }
> > > >  
> > > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > > *dev_priv)
> > > > -{
> > > > -	u32 tmp = I915_READ(DBUF_CTL);
> > > > -
> > > > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > -	     "Unexpected DBuf power power state (0x%08x)\n",
> > > > tmp);
> > > > -}
> > > > -
> > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > *dev_priv)
> > > >  {
> > > >  	struct intel_cdclk_state cdclk_state = {};
> > > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > > drm_i915_private *dev_priv)
> > > >  	/* Can't read out voltage_level so can't use
> > > > intel_cdclk_changed() */
> > > >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > > &cdclk_state));
> > > >  
> > > > -	gen9_assert_dbuf_enabled(dev_priv);
> > > > -
> > > >  	if (IS_GEN9_LP(dev_priv))
> > > >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> > > >  
> > > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > > drm_i915_private *dev_priv)
> > > >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > >  }
> > > >  
> > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > *dev_priv)
> > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > >  {
> > > > -	if (INTEL_GEN(dev_priv) < 11)
> > > > -		return 1;
> > > > -	return 2;
> > > > +	return INTEL_INFO(dev_priv)->supported_slices;
> > > > +}
> > > 
> > > Kinda pointless function now. 'supported_slices' is not a very
> > > good
> > > name. Should be 'num_dbuf_slices' or something along those lines.
> > 
> > Supported slices really means how much slices we support in
> > hardware.
> > num_dbuf_slices will not reflect that, it will be more ambigious.
> > I don't like this name.
> 
> supported_slices doesn't even say "dbuf" anywhere. So it's even more
> ambiguous. Also we already have num_pipes, num_sprites, etc.

..and in current code we have... "enabled_slices", no? Ouch! :) IS this
really that important and worth arguing at the very last moment?
whether number of
supported_slices_in_hardware, should be named "supported slices" or
"num_dbuf_slices"? If yes, lets go on, miss some targets and continue
discussing this "soooo important" thing. 

Some one might come up that it is called "num_slices" or
"dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
keep everyones bloated egos happy??


> 
> > 
> > > 
> > > > +
> > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > *dev_priv)
> > > > +{
> > > > +	const u8 hw_enabled_slices = dev_priv-
> > > > > wm.skl_hw.ddb.enabled_slices;
> > > 
> > > Seems a rather pointless variable. const at least is pointless.
> > > 
> > > > +
> > > > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > >  }
> > > >  
> > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > >  			    u8 req_slices)
> > > >  {
> > > > -	const u8 hw_enabled_slices = dev_priv-
> > > > > wm.skl_hw.ddb.enabled_slices;
> > > > 
> > > > -	bool ret;
> > > > +	bool ret = true;
> > > > +	int i;
> > > > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> > > >  
> > > > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > +	if (hweight8(req_slices) >
> > > > intel_dbuf_max_slices(dev_priv)) {
> > > >  		DRM_ERROR("Invalid number of dbuf slices
> > > > requested\n");
> > > >  		return;
> > > >  	}
> > > >  
> > > > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > > > -		return;
> > > > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > req_slices);
> > > >  
> > > > -	if (req_slices > hw_enabled_slices)
> > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > DBUF_CTL_S2,
> > > > true);
> > > > -	else
> > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > DBUF_CTL_S2,
> > > > false);
> > > > +	for (i = 0; i < max_slices; i++) {
> > > > +		int slice_bit = BIT(i);
> > > > +		bool slice_set = (slice_bit & req_slices) != 0;
> > > > +
> > > > +		switch (slice_bit) {
> > > > +		case DBUF_S1_BIT:
> > > > +			ret = ret &&
> > > > intel_dbuf_slice_set(dev_priv,
> > > > +							  DBUF_
> > > > CTL_S1,
> > > > +							  slice
> > > > _set);
> > > > +			break;
> > > > +		case DBUF_S2_BIT:
> > > > +			ret = ret &&
> > > > intel_dbuf_slice_set(dev_priv,
> > > > +							  DBUF_
> > > > CTL_S2,
> > > > +							  slice
> > > > _set);
> > > 
> > > The return value on intel_dbuf_slice_set() seems pointless. I'd
> > > nuke
> > > it.
> > > 
> > > Also as long you have intel_dbuf_slice_set() I'd probably just
> > > pass
> > > the
> > > slice index there and let it deal with the register details.
> > > DBUF_CTL
> > > should probably be parametrized as well so you don't need
> > > annoying
> > > switch
> > > statements.
> > 
> > What if it fails? Shouldn't we still have some clue what state do
> > we
> > have/had at the moment?
> 
> It if fails you log an error and keep going.

Ah sure. And let hw/sw state be completely different.
Also see next comment about this.

> 
> > 
> > > 
> > > > +			break;
> > > > +		default:
> > > > +			MISSING_CASE(slice_bit);
> > > > +		}
> > > > +	}
> > > >  
> > > >  	if (ret)
> > > >  		dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > req_slices;
> > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > drm_i915_private *dev_priv,
> > > >  
> > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > >  {
> > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > DBUF_POWER_REQUEST);
> > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > DBUF_POWER_REQUEST);
> > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > -
> > > > -	udelay(10);
> > > > -
> > > > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > -		DRM_ERROR("DBuf power enable timeout\n");
> > > > -	else
> > > > -		/*
> > > > -		 * FIXME: for now pretend that we only have 1
> > > > slice,
> > > > see
> > > > -		 * intel_enabled_dbuf_slices_num().
> > > > -		 */
> > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > +	/*
> > > > +	 * Just power up 1 slice, we will
> > > > +	 * figure out later which slices we have and what we
> > > > need.
> > > > +	 */
> > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > 
> > > That stuff really shouldn't be modified by low level functions.
> > > The intial value should come from readout, and after that it
> > > should
> > > be adjusted as part of the overall atomic state.
> > 
> > Well, as you can see currently this is even worse - has some weird
> > hardcoded stuff and is modified like that. Also if we are enabling
> > the 
> > dbuf slices, shouldn't we really determine what value should be
> > there?
> > 
> > Also I wouldn't really always trust the readout, as to me it seems
> > icelake resets it constantly to 2 after suspend/resume. 
> 
> The point of readout is that we syncrhonize our software state with
> the
> current state of hardware without having to modify the hardware
> state.
> Once synchronized the driver takes over and tells the hw what to do.

That is especially nice after you say "report error and don't care"
when I really want to reflect hw state in sw state, by checking whether
we could set dbuf slices or not. 
Now you say that we need to synchronize software state with hw state,
however if I fail to modify it - then "who cares".


> 
> > 
> > > 
> > > > +	icl_program_dbuf_slices(dev_priv);
> > > >  }
> > > >  
> > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > *dev_priv)
> > > >  {
> > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > ~DBUF_POWER_REQUEST);
> > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > ~DBUF_POWER_REQUEST);
> > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > -
> > > > -	udelay(10);
> > > > -
> > > > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > -		DRM_ERROR("DBuf power disable timeout!\n");
> > > > -	else
> > > > -		/*
> > > > -		 * FIXME: for now pretend that the first slice
> > > > is
> > > > always
> > > > -		 * enabled, see
> > > > intel_enabled_dbuf_slices_num().
> > > > -		 */
> > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > +	/*
> > > > +	 * Disable all slices
> > > > +	 */
> > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > +	icl_program_dbuf_slices(dev_priv);
> > > >  }
> > > >  
> > > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > drm_i915_private *i915,
> > > >  	for ((wf) = intel_display_power_get((i915), (domain));
> > > > (wf); \
> > > >  	     intel_display_power_put_async((i915), (domain),
> > > > (wf)),
> > > > (wf) = 0)
> > > >  
> > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > >  			    u8 req_slices);
> > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > *dev_priv);
> > > >  
> > > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > > >  			     bool override, unsigned int mask);
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -51,6 +51,7 @@
> > > >  #include "display/intel_bw.h"
> > > >  #include "display/intel_cdclk.h"
> > > >  #include "display/intel_display_types.h"
> > > > +#include "display/intel_display_power.h"
> > > >  #include "display/intel_dp.h"
> > > >  #include "display/intel_fbdev.h"
> > > >  #include "display/intel_hotplug.h"
> > > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > > device *kdev)
> > > >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > > >  		ret = vlv_resume_prepare(dev_priv, true);
> > > >  
> > > > +	/* Weird hack to fix ICL hardware bug, as it resets
> > > > DBUF slices
> > > > reg */
> > > > +	if (INTEL_GEN(dev_priv) == 11)
> > > > +		icl_program_dbuf_slices(dev_priv);
> > > > +
> > > >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> > > >  
> > > >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -804,7 +804,7 @@ static inline bool
> > > > skl_ddb_entry_equal(const
> > > > struct skl_ddb_entry *e1,
> > > >  }
> > > >  
> > > >  struct skl_ddb_allocation {
> > > 
> > > This struct is quite pointless. Should be killed off.
> > 
> > Legacy stuff, not added by me. Quite fine with removing it - 
> > but should we really do all this at the same time and in same
> > patch? 
> 
> Not the same patch. But clean up the cobwebs first, then add the new
> stuff. Much easier that way around.

There is a shit ton load of cobwebs to clean. Personally my opinion is
that current dbuf controlling code is total garbage - which has lots of
hardcoded stuff. Good luck enabling only second dbuf slice, good luck
disabling/enabling dbuf properly....
Where you were, when this was committed then?

If we need to fix it - fine. But may be we should do it in stages?
Point of this patch is "enable second dbuf slice", next patch "remove 
unneeded structs" and so on. 
While you want me to shuffle everything at the same time, which is not
good idea, especially considering complexity and amount of legacy stuff
we currently have.

> 
> > 
> > > 
> > > > -	u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > */
> > > > +	u8 enabled_slices;   /* Bitmask of currently enabled
> > > > DBuf
> > > > slices */
> > > >  };
> > > >  
> > > >  struct skl_ddb_values {
> > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > index 1bb701d32a5d..8e86af505730 100644
> > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > intel_cherryview_info = {
> > > >  	.has_gt_uc = 1, \
> > > >  	.display.has_hdcp = 1, \
> > > >  	.display.has_ipc = 1, \
> > > > -	.ddb_size = 896
> > > > +	.ddb_size = 896, \
> > > > +	.supported_slices = 1
> > > >  
> > > >  #define SKL_PLATFORM \
> > > >  	GEN9_FEATURES, \
> > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > intel_skylake_gt4_info = {
> > > >  	GEN(9), \
> > > >  	.is_lp = 1, \
> > > >  	.display.has_hotplug = 1, \
> > > > +	.supported_slices = 1, \
> > > >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > BIT(VECS0),
> > > > \
> > > >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > > >  	.has_64bit_reloc = 1, \
> > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > intel_coffeelake_gt3_info = {
> > > >  	GEN9_FEATURES, \
> > > >  	GEN(10), \
> > > >  	.ddb_size = 1024, \
> > > > +	.supported_slices = 1, \
> > > >  	.display.has_dsc = 1, \
> > > >  	.has_coherent_ggtt = false, \
> > > >  	GLK_COLORS
> > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > intel_cannonlake_info = {
> > > >  	}, \
> > > >  	GEN(11), \
> > > >  	.ddb_size = 2048, \
> > > > +	.supported_slices = 2, \
> > > >  	.has_logical_ring_elsq = 1, \
> > > >  	.color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > 262145 }
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > index a607ea520829..fba5731063d8 100644
> > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > @@ -7734,9 +7734,11 @@ enum {
> > > >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> > > >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> > > >  #define  DISP_IPC_ENABLE		(1 << 3)
> > > > -#define DBUF_CTL	_MMIO(0x45008)
> > > > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > > > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > > > +#define DBUF_S1_BIT			BIT(0)
> > > > +#define DBUF_S2_BIT			BIT(1)
> > > 
> > > These are not bits inside DBUF_CTL so they should not be here.
> > > I'd just nuke them entirely. If you think we need more than an
> > > int to refer to a slice then IMO something simple like
> > > 
> > > enum dbuf_slice {
> > > 	DBUF_S1,
> > > 	DBUF_S2,
> > > };
> > > 
> > > would match the general apporach we use for other things much
> > > better
> > > rather than defining some shifted values.
> > 
> > We need to combine slices - so bitmask is quite suitable for that, 
> > I can of course use this enum, but then I'll have to re-encode it 
> > to bitmask anyways.
> 
> Which is pretty much how everything else works in the driver.

Another piece of incredibly important thing to argue..

> 
> > 
> > 
> > > 
> > > > +#define DBUF_CTL			_MMIO(0x45008)
> > > > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > > > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> > > >  #define  DBUF_POWER_REQUEST		(1 << 31)
> > > >  #define  DBUF_POWER_STATE		(1 << 30)
> > > >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > >  	} display;
> > > >  
> > > >  	u16 ddb_size; /* in blocks */
> > > > +	u8 supported_slices; /* number of DBuf slices */
> > > >  
> > > >  	/* Register offsets for the various display pipes and
> > > > transcoders */
> > > >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > index 2d389e437e87..1ea81ab92429 100644
> > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > drm_device
> > > > *dev)
> > > >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > >  }
> > > >  
> > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > drm_i915_private
> > > > *dev_priv)
> > > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > > *dev_priv)
> > > >  {
> > > > -	u8 enabled_slices;
> > > > -
> > > > -	/* Slice 1 will always be enabled */
> > > > -	enabled_slices = 1;
> > > > +	u8 enabled_slices = 0;
> > > >  
> > > >  	/* Gen prior to GEN11 have only one DBuf slice */
> > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > > -		return enabled_slices;
> > > > +		return DBUF_S1_BIT;
> > > 
> > > I'd nuke this special case entirely and just read everything from
> > > the
> > > hw. Assuming we need readout somewhere.
> > 
> > We have it that way hardcoded right now in drm-tip. I think this is
> > a
> > subject to other patch, but not this change. As I understand we
> > need 
> > to first figure out how to do it properly. 
> > This patch anyway makes it better by at least removing hardcoded
> > relation between number of slices and which slices are enabled.
> > With current code there is no way for example to enable second DBuf
> > slice only, which is required by BSpec.
> > 
> > > 
> > > >  
> > > > -	/*
> > > > -	 * FIXME: for now we'll only ever use 1 slice; pretend
> > > > that we
> > > > have
> > > > -	 * only that 1 slice enabled until we have a proper way
> > > > for on-
> > > > demand
> > > > -	 * toggling of the second slice.
> > > > -	 */
> > > > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > -		enabled_slices++;
> > > > +	/* Check if second DBuf slice is enabled */
> > > > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > +		enabled_slices |= DBUF_S1_BIT;
> > > > +
> > > > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > +		enabled_slices |= DBUF_S2_BIT;
> > > >  
> > > >  	return enabled_slices;
> > > >  }
> > > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > > drm_i915_private *dev_priv,
> > > >  			      const int num_active,
> > > >  			      struct skl_ddb_allocation *ddb)
> > > >  {
> > > > -	const struct drm_display_mode *adjusted_mode;
> > > > -	u64 total_data_bw;
> > > >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > -
> > > >  	WARN_ON(ddb_size == 0);
> > > >  
> > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > >  		return ddb_size - 4; /* 4 blocks for bypass
> > > > path
> > > > allocation */
> > > >  
> > > > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > -	total_data_bw = total_data_rate *
> > > > drm_mode_vrefresh(adjusted_mode);
> > > > +	return ddb_size;
> > > > +}
> > > >  
> > > > -	/*
> > > > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > > > -	 *
> > > > -	 * FIXME dbuf slice code is broken:
> > > > -	 * - must wait for planes to stop using the slice
> > > > before
> > > > powering it off
> > > > -	 * - plane straddling both slices is illegal in multi-
> > > > pipe
> > > > scenarios
> > > > -	 * - should validate we stay within the hw bandwidth
> > > > limits
> > > > -	 */
> > > > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12)))
> > > > {
> > > > -		ddb->enabled_slices = 2;
> > > > -	} else {
> > > > -		ddb->enabled_slices = 1;
> > > > -		ddb_size /= 2;
> > > > -	}
> > > > +/*
> > > > + * Calculate initial DBuf slice offset, based on slice size
> > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > enabled
> > > > + * offset would be 1024)
> > > > + */
> > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > dbuf_slice_mask,
> > > > +					   u32 slice_size, u32
> > > > ddb_size)
> > > > +{
> > > > +	u32 offset = 0;
> > > >  
> > > > -	return ddb_size;
> > > > +	if (!dbuf_slice_mask)
> > > > +		return 0;
> > > > +
> > > > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > +
> > > > +	WARN_ON(offset >= ddb_size);
> > > > +	return offset;
> > > >  }
> > > >  
> > > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > *dev_priv,
> > > > +				      int pipe, u32
> > > > active_pipes,
> > > > +				      const struct
> > > > intel_crtc_state
> > > > *crtc_state);
> > > > +
> > > >  static void
> > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > *dev_priv,
> > > >  				   const struct
> > > > intel_crtc_state
> > > > *crtc_state,
> > > > @@ -3857,7 +3855,14 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  	u32 pipe_width = 0, total_width = 0, width_before_pipe
> > > > = 0;
> > > >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > >  	u16 ddb_size;
> > > > +	u32 ddb_range_size;
> > > >  	u32 i;
> > > > +	u32 dbuf_slice_mask;
> > > > +	u32 active_pipes;
> > > > +	u32 offset;
> > > > +	u32 slice_size;
> > > > +	u32 total_slice_mask;
> > > > +	u32 start, end;
> > > >  
> > > >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > >  		alloc->start = 0;
> > > > @@ -3866,14 +3871,23 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  		return;
> > > >  	}
> > > >  
> > > > -	if (intel_state->active_pipe_changes)
> > > > +	if (intel_state->active_pipe_changes) {
> > > >  		*num_active = hweight8(intel_state-
> > > > >active_pipes);
> > > > -	else
> > > > +		active_pipes = intel_state->active_pipes;
> > > > +	} else {
> > > >  		*num_active = hweight8(dev_priv->active_pipes);
> > > > +		active_pipes = dev_priv->active_pipes;
> > > > +	}
> > > >  
> > > >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > total_data_rate,
> > > >  				      *num_active, ddb);
> > > >  
> > > > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > +
> > > > +	slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > >supported_slices;
> > > > +
> > > > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > > +
> > > >  	/*
> > > >  	 * If the state doesn't change the active CRTC's or
> > > > there is no
> > > >  	 * modeset request, then there's no need to
> > > > recalculate;
> > > > @@ -3891,20 +3905,70 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  		return;
> > > >  	}
> > > >  
> > > > +	/*
> > > > +	 * Get allowed DBuf slices for correspondent pipe and
> > > > platform.
> > > > +	 */
> > > > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > > for_pipe,
> > > > +						     active_pip
> > > > es,
> > > > crtc_state);
> > > 
> > > Quite a few redundant arguments. The function name looks alien.
> > > 
> > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > 
> > > > +
> > > > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes
> > > > %x\n",
> > > > +		      dbuf_slice_mask,
> > > > +		      for_pipe, active_pipes);
> > > > +
> > > > +	/*
> > > > +	 * Figure out at which DBuf slice we start, i.e if we
> > > > start at
> > > > Dbuf S2
> > > > +	 * and slice size is 1024, the offset would be 1024
> > > > +	 */
> > > > +	offset =
> > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > +						 slice_size,
> > > > ddb_size);
> > > > +
> > > > +	/*
> > > > +	 * Figure out total size of allowed DBuf slices, which
> > > > is
> > > > basically
> > > > +	 * a number of allowed slices for that pipe multiplied
> > > > by slice
> > > > size.
> > > > +	 * Inside of this
> > > > +	 * range ddb entries are still allocated in proportion
> > > > to
> > > > display width.
> > > > +	 */
> > > > +	ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > slice_size;
> > > > +
> > > >  	/*
> > > >  	 * Watermark/ddb requirement highly depends upon width
> > > > of the
> > > >  	 * framebuffer, So instead of allocating DDB equally
> > > > among
> > > > pipes
> > > >  	 * distribute DDB based on resolution/width of the
> > > > display.
> > > >  	 */
> > > > +	total_slice_mask = dbuf_slice_mask;
> > > 
> > > Isn't this already taken care by the loop below?
> > > 
> > > >  	for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > crtc_state,
> > > > i) {
> > > >  		const struct drm_display_mode *adjusted_mode =
> > > >  			&crtc_state->hw.adjusted_mode;
> > > >  		enum pipe pipe = crtc->pipe;
> > > >  		int hdisplay, vdisplay;
> > > > +		u32 pipe_dbuf_slice_mask = \
> > > > +			i915_get_allowed_dbuf_mask(dev_priv,
> > > > +						pipe,
> > > > +						active_pipes,
> > > > +						crtc_state);
> > > >  
> > > >  		if (!crtc_state->hw.enable)
> > > >  			continue;
> > > >  
> > > > +		/*
> > > > +		 * According to BSpec pipe can share one dbuf
> > > > slice
> > > > with another
> > > > +		 * pipes or pipe can use multiple dbufs, in
> > > > both cases
> > > > we
> > > > +		 * account for other pipes only if they have
> > > > exactly
> > > > same mask.
> > > > +		 * However we need to account how many slices
> > > > we should
> > > > enable
> > > > +		 * in total.
> > > > +		 */
> > > > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > > 
> > > total_slice_mask will now account only the crtcs in the state.
> > > What
> > > happens to the other pipes' slices?

Same as happens now btw. Right now it also does the same thing.

> > 
> > Yes we need to iterate through all pipes/crtcs here, however
> > for getting correct dbuf slice mask most important is active_pipes.
> > However yes, I will change the loop for it to be more correct.
> > 
> > > 
> > > > +
> > > > +		/*
> > > > +		 * Do not account pipes using other slice sets
> > > > +		 * luckily as of current BSpec slice sets do
> > > > not
> > > > partially
> > > > +		 * intersect(pipes share either same one slice
> > > > or same
> > > > slice set
> > > > +		 * i.e no partial intersection), so it is
> > > > enough to
> > > > check for
> > > > +		 * equality for now.
> > > > +		 */
> > > > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > +			continue;
> > > > +
> > > >  		drm_mode_get_hv_timing(adjusted_mode,
> > > > &hdisplay,
> > > > &vdisplay);
> > > >  		total_width += hdisplay;
> > > >  
> > > > @@ -3914,8 +3978,19 @@
> > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > drm_i915_private *dev_priv,
> > > >  			pipe_width = hdisplay;
> > > >  	}
> > > >  
> > > > -	alloc->start = ddb_size * width_before_pipe /
> > > > total_width;
> > > > -	alloc->end = ddb_size * (width_before_pipe +
> > > > pipe_width) /
> > > > total_width;
> > > > +	ddb->enabled_slices = total_slice_mask;
> > > > +
> > > > +	start = ddb_range_size * width_before_pipe /
> > > > total_width;
> > > > +	end = ddb_range_size * (width_before_pipe + pipe_width)
> > > > /
> > > > total_width;
> > > > +
> > > > +	alloc->start = offset + start;
> > > > +	alloc->end = offset + end;
> > > > +
> > > > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > +		      alloc->start, alloc->end);
> > > > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported
> > > > %d\n",
> > > > +		      ddb->enabled_slices,
> > > > +		      INTEL_INFO(dev_priv)->supported_slices);
> > > >  }
> > > >  
> > > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > > *crtc_state,
> > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > intel_crtc *crtc,
> > > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > > >  			  struct skl_ddb_allocation *ddb /* out
> > > > */)
> > > >  {
> > > > -	ddb->enabled_slices =
> > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > +	ddb->enabled_slices =
> > > > intel_enabled_dbuf_slices(dev_priv);
> > > > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > enabled_slices);
> > > > 
> > > >  }
> > > >  
> > > >  /*
> > > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > > intel_crtc_state *crtc_state,
> > > >  	return total_data_rate;
> > > >  }
> > > >  
> > > > +static uint_fixed_16_16_t
> > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	u32 src_w, src_h, dst_w, dst_h;
> > > > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > > > +	const struct drm_display_mode *adjusted_mode =
> > > > &crtc_state-
> > > > > hw.adjusted_mode;
> > > > 
> > > > +
> > > > +	src_w = crtc_state->pipe_src_w;
> > > > +	src_h = crtc_state->pipe_src_h;
> > > > +	dst_w = adjusted_mode->crtc_hdisplay;
> > > > +	dst_h = adjusted_mode->crtc_vdisplay;
> > > > +
> > > > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > +	downscale_w = max_fixed16(fp_w_ratio,
> > > > u32_to_fixed16(1));
> > > > +	downscale_h = max_fixed16(fp_h_ratio,
> > > > u32_to_fixed16(1));
> > > > +
> > > > +	return mul_fixed16(downscale_w, downscale_h);
> > > > +}
> > > > +
> > > > +uint_fixed_16_16_t
> > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > +{
> > > > +	struct drm_plane *plane;
> > > > +	const struct drm_plane_state *drm_plane_state;
> > > > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > >uapi.crtc);
> > > > +
> > > > +	if (!crtc_state->uapi.enable)
> > > > +		return max_downscale;
> > > > +
> > > > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > drm_plane_state, &crtc_state->uapi) {
> > > > +		uint_fixed_16_16_t plane_downscale;
> > > > +		const struct intel_plane_state *plane_state =
> > > > +			to_intel_plane_state(drm_plane_state);
> > > > +
> > > > +		if (!intel_wm_plane_visible(crtc_state,
> > > > plane_state))
> > > > +			continue;
> > > > +
> > > > +		plane_downscale =
> > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > +
> > > > +		DRM_DEBUG_KMS("Plane %d downscale amount
> > > > %d.%d\n",
> > > > +			      to_intel_plane(plane)->id,
> > > > +			      plane_downscale.val >> 16,
> > > > +			      plane_downscale.val & 0xffff);
> > > > +
> > > > +		max_downscale = max_fixed16(plane_downscale,
> > > > max_downscale);
> > > > +	}
> > > > +
> > > > +
> > > > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > +
> > > > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > +		       crtc->pipe, pipe_downscale.val >> 16,
> > > > +		       pipe_downscale.val & 0xffff);
> > > > +
> > > > +	pipe_downscale = mul_fixed16(pipe_downscale,
> > > > max_downscale);
> > > > +
> > > > +	/*
> > > > +	 * Convert result to percentage: get the actual ratio
> > > > instead
> > > > of rate,
> > > > +	 * then multiply by 100 to get percentage.
> > > > +	 */
> > > > +	pipe_ratio = u32_to_fixed16(100 *
> > > > div_round_up_u32_fixed16(1,
> > > > pipe_downscale));
> > > > +
> > > > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > +		      pipe_ratio.val >> 16, pipe_ratio.val &
> > > > 0xffff);
> > > > +
> > > > +	return pipe_ratio;
> > > > +}
> > > 
> > > Yuck. I just killed these.
> > 
> > I still need pipe_ratio fro ICL as you know, this is dictated by
> > BSpec,
> > not by me.
> 
> That part of the spec is pretty crazy IMO. I think we should just
> ignore
> it for now. Also nothing in there says we must use both slices for a
> single pipe so ignoring doesn't even violate the spec in any way.

So you are instructing me now to ignore BSpec as "we are smarter"?
I don't think this is properly to do it this way. Lets first change the
BSpec, if we don't like pipe ratio here and then driver. Currently this
is written that way and if we are going to do everything our own way
ignoring specs, this is going to be chaos.

> 
> > 
> > > 
> > > > +
> > > > +struct dbuf_slice_conf_entry {
> > > > +	u32 dbuf_mask[I915_MAX_PIPES];
> > > > +	u32 active_pipes;
> > > > +};
> > > > +
> > > > +
> > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > |
> > > > BIT(PIPE_C) }
> > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C)
> > > > |
> > > > BIT(PIPE_D) }
> > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C)
> > > > |
> > > > BIT(PIPE_D) }
> > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > |
> > > > BIT(PIPE_D) }
> > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > |
> > > > BIT(PIPE_C) }
> > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > DBuf4)  \
> > > > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > BIT(PIPE_B) | \
> > > > +					  BIT(PIPE_C) |
> > > > BIT(PIPE_D) }
> > > > +
> > > > +/*
> > > > + * Table taken from Bspec 12716
> > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > + * plus there are some hardcoded requirements on how
> > > > + * those should be distributed for multipipe scenarios.
> > > > + * For more DBuf slices algorithm can get even more messy
> > > > + * and less readable, so decided to use a table almost
> > > > + * as is from BSpec itself - that way it is at least easier
> > > > + * to compare, change and check.
> > > > + */
> > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > ratio
> > > > < 88.8 */
> > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > pipe
> > > > ratio >= 88.8 */
> > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > ratio
> > > > < 88.8 */
> > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > pipe
> > > > ratio >= 88.8 */
> > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > ratio
> > > > < 88.8 */
> > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > pipe
> > > > ratio >= 88.8 */
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT),
> > > > +};
> > > > +
> > > > +/*
> > > > + * Table taken from Bspec 49255
> > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > + * plus there are some hardcoded requirements on how
> > > > + * those should be distributed for multipipe scenarios.
> > > > + * For more DBuf slices algorithm can get even more messy
> > > > + * and less readable, so decided to use a table almost
> > > > + * as is from BSpec itself - that way it is at least easier
> > > > + * to compare, change and check.
> > > > + */
> > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > DBUF_S2_BIT),
> > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > DBUF_S2_BIT,
> > > > DBUF_S2_BIT),
> > > > +};
> > > 
> > > My eyes!
> > > 
> > > I have to think we should be able to reduce all that to a handful
> > > of lines of code.
> > 
> > Yeah, then we are going to have a huge function with lots of weird
> > definitions, unfortunately BSpec table has quite strange DBuf
> > assignments like 
> > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > 
> > i.e slices can get mixed in a quite various ways. There is no
> > rational pattern in those and could be even dangerous to try to 
> > optimize it some way, as one day it might change again in some
> > unpredictable way.
> > 
> > Would you prefer to have it like
> > 
> > if (pipes are A and B)
> >     program S2 to A and S1 to B
> > if (pipes are A and C)
> >     program S1 to A and S2 to C
> > ...
> > 
> > I would prefer at least to see it in a comparable way with the
> > table we have in BSpec, rather than having lots of weird looking
> > if-else statements, GEN checks and so on.
> > 
> > I knew this table was not going to look like it is done typically
> > here - but really my opinion that it is way better than hardcoding
> > it into some weird algorithm, which would be hard to compare to 
> > initial table, which is already strange enough.
> > 
> > If you don't like it - could you please explain why this is exactly
> > worse than having long functions with hardcoded checks?
> 
> I think we should be able to encode this simply using the
> "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> is the distance of each pipe from each dbuf slice. Then all we have
> to do is minimize the total distance. The following tables I think
> should encode that reasonably concisely:
> 
> /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> static const u8 icl_dbuf_distance[][] {
> 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> };
> 
> /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> static const u8 tgl_dbuf_distance[][] {
> 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> 	[PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> };

I think you are missing my point. I will explain why.
My initial idea was similar, use some kind of distance
metrics and encode everything in a smart way.
Current BSpec table has sometimes slices swapped for no reason.

There are seem to be some hardware constraints we are not aware
of. For example for TGL pipes A,B get S2 and S1 slices,
but pipes A C get S1 and S2.

if you encode all this into some kind of distance algorithm,
those differences would be enourmously difficult to spot and
compare with actual table. That is why I decided to do it that
way.

Most likely yes you can solve it by using this metrics,
but will it really be then more readable and comparable
with reference table? 
What if some contraints will be added again, like pipe ratio?

> 
> /*
>  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
>  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
>  */
> static const u8 future_dbuf_distance[][] {
> 	[PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> 	[PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> 	[PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> 	[PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> };
> 
> I suppose we could make that even more minimal by just
> directly sticking that "Pipe and DBUF ordering" language
> into a single array but and then calculate each distance
> from it, but that is perhaps more work, and we'd need
> some kind of enum just for that so that we have a single
> namespace for the pipes and dbuf slices. So i have a
> feeling having those distances pre-computed in a table
> like thet is maybe a better way to go.

Good luck then spoting differences with the BSpec table once it
gets changed.

> 
> > 
> > I think this is pretty stupid to fix stuff which looks new or
> > unusual
> > just because somebody doesn't like it, without even constructive
> > arguing.
> > 
> > - Stan
> > 
> > > 
> > > 
> > > > +
> > > > +/*
> > > > + * This function
> > > 
> > > "This function" is totally redundant. Ditto elsewhere.
> > > 
> > > > finds an entry with same enabled pipe configuration and
> > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > for
> > > > particular
> > > > + * platform.
> > > > + */
> > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > +				     u32 active_pipes,
> > > > +				     const struct
> > > > intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	int i;
> > > > +	uint_fixed_16_16_t pipe_ratio;
> > > > +
> > > > +	/*
> > > > +	 * Calculate pipe ratio as stated in BSpec 28692
> > > > +	 */
> > > > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > +		if (icl_allowed_dbufs[i].active_pipes ==
> > > > active_pipes)
> > > > {
> > > > +			if (hweight32(active_pipes) == 1) {
> > > > +				/*
> > > > +				 * According to BSpec 12716: if
> > > > ratio
> > > > > = 88.8
> > > > 
> > > > +				 * always use single dbuf
> > > > slice.
> > > > +				 */
> > > > +				if (pipe_ratio.val >=
> > > > div_fixed16(888,
> > > > 10).val)
> > > > +					++i;
> > > > +			}
> > > > +			return
> > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > +		}
> > > > +	}
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +/*
> > > > + * This function finds an entry with same enabled pipe
> > > > configuration and
> > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > for
> > > > particular
> > > > + * platform.
> > > > + */
> > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > +				     u32 active_pipes,
> > > > +				     const struct
> > > > intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	int i;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > +		if (tgl_allowed_dbufs[i].active_pipes ==
> > > > active_pipes)
> > > > +			return
> > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > +	}
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > *dev_priv,
> > > > +				      int pipe, u32
> > > > active_pipes,
> > > > +				      const struct
> > > > intel_crtc_state
> > > > *crtc_state)
> > > > +{
> > > > +	if (IS_GEN(dev_priv, 11))
> > > > +		return icl_get_allowed_dbuf_mask(pipe,
> > > > +						 active_pipes,
> > > > +						 crtc_state);
> > > > +	else if (IS_GEN(dev_priv, 12))
> > > > +		return tgl_get_allowed_dbuf_mask(pipe,
> > > > +						 active_pipes,
> > > > +						 crtc_state);
> > > > +	/*
> > > > +	 * For anything else just return one slice yet.
> > > > +	 * Should be extended for other platforms.
> > > > +	 */
> > > > +	return DBUF_S1_BIT;
> > > > +}
> > > > +
> > > >  static u64
> > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > *crtc_state,
> > > >  				 u64 *plane_data_rate)
> > > > -- 
> > > > 2.17.1
> > > 
> > > 
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 18:18           ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-18 18:18 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

On Mon, Nov 18, 2019 at 05:17:51PM +0000, Lisovskiy, Stanislav wrote:
> On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> > On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> > > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > > wrote:
> > > > > Also implemented algorithm for choosing DBuf slice
> > > > > configuration
> > > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > > 
> > > > > Now pipe allocation still stays proportional to pipe width as
> > > > > before,
> > > > > however within allowed DBuf slice for this particular
> > > > > configuration.
> > > > > 
> > > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > > might
> > > > >     now differ from hw state.
> > > > > 
> > > > > v3: - Added new field "supported_slices" to ddb to separate max
> > > > >       supported slices vs currently enabled, to avoid
> > > > > confusion.
> > > > >     - Removed obsolete comments and code related to
> > > > > DBuf(Matthew
> > > > > Roper).
> > > > >     - Some code style and long line removal(Matthew Roper).
> > > > >     - Added WARN_ON to new ddb range offset calc
> > > > > function(Matthew
> > > > > Roper).
> > > > >     - Removed platform specific call to calc pipe ratio from
> > > > > ddb
> > > > >       allocation function and fixed the return value(Matthew
> > > > > Roper)
> > > > >     - Refactored DBUF slice allocation table to improve
> > > > > readability
> > > > >     - Added DBUF slice allocation for TGL as well.
> > > > >     - ICL(however not TGL) seems to voluntarily enable second
> > > > > DBuf
> > > > > slice
> > > > >       after pm suspend/resume causing a mismatch failure,
> > > > > because
> > > > > we
> > > > >       update DBuf slices only if we do a modeset, however this
> > > > > check
> > > > >       is done always. Fixed it to be done only when modeset for
> > > > > ICL.
> > > > > 
> > > > > v4: - Now enabled slices is not actually a number, but a
> > > > > bitmask,
> > > > >       because we might need to enable second slice only and
> > > > > number
> > > > >       of slices would still 1 and that current code doesn't
> > > > > allow.
> > > > >     - Remove redundant duplicate code to have some unified way
> > > > > of
> > > > >       enabling dbuf slices instead of hardcoding.
> > > > > 
> > > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > > > thinking
> > > > >       that we have only one DBUF_CTL slice. Now another version
> > > > > for
> > > > >       gen11 will check both slices as only second one can be
> > > > > enabled,
> > > > >       to keep CI happy.
> > > > > 
> > > > > v6: - Removed enabled dbuf assertion completely. Previous code
> > > > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > > > called.
> > > > >       Currently we enable/disable dbuf slices based on
> > > > > requrement
> > > > >       so no point in asserting this here.
> > > > >     - Removed unnecessary modeset check from
> > > > > verify_wm_state(Matthew Roper)
> > > > >     - Slices intersection after union is same as final
> > > > > result(Matthew Roper)
> > > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > > >     - Some macros added(Matthew Roper)
> > > > >     - ddb range is now always less or equal to ddb size - no
> > > > > need
> > > > > for additional
> > > > >       checks(previously needed as we had some bandwidth checks
> > > > > in
> > > > > there which
> > > > >       could "suddenly" return ddb_size smaller than it is.
> > > > > 
> > > > > v7: Minor costemic changes:
> > > > >     - Changed icl_dbuf_slices_restore name to
> > > > > icl_program_dbuf_slices
> > > > >       as it more corresponds to the actual functionality.
> > > > >     - Some simplification with supported slices for BXT and GLK
> > > > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > > > >       this is not used for skl anymore.
> > > > >     - Changed DBuf slice assignment order for TGL case
> > > > > 
> > > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Signed-off-by: Stanislav Lisovskiy <
> > > > > stanislav.lisovskiy@intel.com>
> > > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > > ++++++++++++++++--
> > > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > index 876fc25968bf..bd7aff675198 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > > intel_crtc *crtc,
> > > > >  
> > > > >  	if (INTEL_GEN(dev_priv) >= 11 &&
> > > > >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > > -		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %u, got
> > > > > %u)\n",
> > > > > +		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %x, got
> > > > > %x)\n",
> > > > >  			  sw_ddb->enabled_slices,
> > > > >  			  hw->ddb.enabled_slices);
> > > > >  
> > > > > @@ -14614,15 +14614,24 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >  	u8 hw_enabled_slices = dev_priv-
> > > > > >wm.skl_hw.ddb.enabled_slices;
> > > > >  	u8 required_slices = state-
> > > > > >wm_results.ddb.enabled_slices;
> > > > >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > > +	u8 slices_union = hw_enabled_slices | required_slices;
> > > > > +	u8 slices_intersection = required_slices;
> > > > >  
> > > > >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > > old_crtc_state, new_crtc_state, i)
> > > > >  		/* ignore allocations for crtc's that have been
> > > > > turned
> > > > > off. */
> > > > >  		if (new_crtc_state->hw.active)
> > > > >  			entries[i] = old_crtc_state-
> > > > > >wm.skl.ddb;
> > > > >  
> > > > > -	/* If 2nd DBuf slice required, enable it here */
> > > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > > hw_enabled_slices)
> > > > > -		icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > > +		      required_slices, hw_enabled_slices);
> > > > > +
> > > > > +	/*
> > > > > +	 * If some other DBuf slice required, enable it here,
> > > > > +	 * as here we only add more slices, but not remove to
> > > > > prevent
> > > > > +	 * issues if somebody still uses those.
> > > > > +	 */
> > > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +		icl_dbuf_slices_update(dev_priv, slices_union);
> > > > >  
> > > > >  	/*
> > > > >  	 * Whenever the number of active pipes changes, we need
> > > > > to make
> > > > > sure we
> > > > > @@ -14681,9 +14690,12 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >  		}
> > > > >  	} while (progress);
> > > > >  
> > > > > -	/* If 2nd DBuf slice is no more required disable it */
> > > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -		icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +	/*
> > > > > +	 * If some other DBuf slice is not needed, disable it
> > > > > here,
> > > > > +	 * as here we only remove slices, which are not needed
> > > > > anymore.
> > > > > +	 */
> > > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +		icl_dbuf_slices_update(dev_priv,
> > > > > slices_intersection);
> > > > 
> > > > This is still a totally wrong place for this. We need to be sure
> > > > the
> > > > planes have switched over to the new ddb layout before we power
> > > > off
> > > > anything.
> > > 
> > > I have just modified already existing line here, not added by me.
> > > The point of this patch is starting to use second DBuf slice but
> > > not fixing all kinds of issues related to DBuf code at the same
> > > time.
> > 
> > You're re-enabling code I disabled on purpose. It has to be fixed
> > properly or not touched at all. In hindsight I should have probably
> > just ripped it all out to not give people bad ideas.
> 
> -	/* If 2nd DBuf slice is no more required disable it */
> > > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -		icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> 
> I'm not reenabling it - this is code is out there right now.

It's _dead_ code, and it's like that on purpose. You're now attempting
to re-enable it without fixing the underlying issues.

> This is all again about changing all kinds of stuff at the same time.
> 
> Why we can't just do things step by step? You have a point that
> this is wrong place ok, but as we see current code causes no
> regressions. Just tell me your idea where to put it and I will
> do that, if you don't have it yet - let it stay there, if CI shows
> no regressions currently. 

The same global state tricks apply here as with the cdclk/sagv stuff.
1. enable new resources/bump up clocks
2. reconfigure planes/etc.
3. wait for vblanks
4. disable unused resources/drop the clocks

> 
> > 
> > > 
> > > Of course this is nice idea to fix it - if that is wrong plane,
> > > but current code somehow lives with it and probably this should be
> > > with some other patch.
> > > 
> > > > 
> > > > >  }
> > > > >  
> > > > >  static void intel_atomic_helper_free_state(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > @@ -1031,15 +1031,6 @@ static bool
> > > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > > *dev_priv,
> > > > >  		(I915_READ(DC_STATE_EN) &
> > > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > > >  }
> > > > >  
> > > > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > > > *dev_priv)
> > > > > -{
> > > > > -	u32 tmp = I915_READ(DBUF_CTL);
> > > > > -
> > > > > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > > > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > > -	     "Unexpected DBuf power power state (0x%08x)\n",
> > > > > tmp);
> > > > > -}
> > > > > -
> > > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > >  	struct intel_cdclk_state cdclk_state = {};
> > > > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > > > drm_i915_private *dev_priv)
> > > > >  	/* Can't read out voltage_level so can't use
> > > > > intel_cdclk_changed() */
> > > > >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > > > &cdclk_state));
> > > > >  
> > > > > -	gen9_assert_dbuf_enabled(dev_priv);
> > > > > -
> > > > >  	if (IS_GEN9_LP(dev_priv))
> > > > >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> > > > >  
> > > > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > > > drm_i915_private *dev_priv)
> > > > >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > > >  }
> > > > >  
> > > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -	if (INTEL_GEN(dev_priv) < 11)
> > > > > -		return 1;
> > > > > -	return 2;
> > > > > +	return INTEL_INFO(dev_priv)->supported_slices;
> > > > > +}
> > > > 
> > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > good
> > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > > 
> > > Supported slices really means how much slices we support in
> > > hardware.
> > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > I don't like this name.
> > 
> > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > ambiguous. Also we already have num_pipes, num_sprites, etc.
> 
> ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
> really that important and worth arguing at the very last moment?
> whether number of
> supported_slices_in_hardware, should be named "supported slices" or
> "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
> discussing this "soooo important" thing. 
> 
> Some one might come up that it is called "num_slices" or
> "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
> keep everyones bloated egos happy??
> 
> 
> > 
> > > 
> > > > 
> > > > > +
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +{
> > > > > +	const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > 
> > > > Seems a rather pointless variable. const at least is pointless.
> > > > 
> > > > > +
> > > > > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > > >  }
> > > > >  
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >  			    u8 req_slices)
> > > > >  {
> > > > > -	const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > 
> > > > > -	bool ret;
> > > > > +	bool ret = true;
> > > > > +	int i;
> > > > > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> > > > >  
> > > > > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > > +	if (hweight8(req_slices) >
> > > > > intel_dbuf_max_slices(dev_priv)) {
> > > > >  		DRM_ERROR("Invalid number of dbuf slices
> > > > > requested\n");
> > > > >  		return;
> > > > >  	}
> > > > >  
> > > > > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > > > > -		return;
> > > > > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > > req_slices);
> > > > >  
> > > > > -	if (req_slices > hw_enabled_slices)
> > > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > true);
> > > > > -	else
> > > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > false);
> > > > > +	for (i = 0; i < max_slices; i++) {
> > > > > +		int slice_bit = BIT(i);
> > > > > +		bool slice_set = (slice_bit & req_slices) != 0;
> > > > > +
> > > > > +		switch (slice_bit) {
> > > > > +		case DBUF_S1_BIT:
> > > > > +			ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +							  DBUF_
> > > > > CTL_S1,
> > > > > +							  slice
> > > > > _set);
> > > > > +			break;
> > > > > +		case DBUF_S2_BIT:
> > > > > +			ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +							  DBUF_
> > > > > CTL_S2,
> > > > > +							  slice
> > > > > _set);
> > > > 
> > > > The return value on intel_dbuf_slice_set() seems pointless. I'd
> > > > nuke
> > > > it.
> > > > 
> > > > Also as long you have intel_dbuf_slice_set() I'd probably just
> > > > pass
> > > > the
> > > > slice index there and let it deal with the register details.
> > > > DBUF_CTL
> > > > should probably be parametrized as well so you don't need
> > > > annoying
> > > > switch
> > > > statements.
> > > 
> > > What if it fails? Shouldn't we still have some clue what state do
> > > we
> > > have/had at the moment?
> > 
> > It if fails you log an error and keep going.
> 
> Ah sure. And let hw/sw state be completely different.
> Also see next comment about this.
> 
> > 
> > > 
> > > > 
> > > > > +			break;
> > > > > +		default:
> > > > > +			MISSING_CASE(slice_bit);
> > > > > +		}
> > > > > +	}
> > > > >  
> > > > >  	if (ret)
> > > > >  		dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > > req_slices;
> > > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  
> > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -	udelay(10);
> > > > > -
> > > > > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -		DRM_ERROR("DBuf power enable timeout\n");
> > > > > -	else
> > > > > -		/*
> > > > > -		 * FIXME: for now pretend that we only have 1
> > > > > slice,
> > > > > see
> > > > > -		 * intel_enabled_dbuf_slices_num().
> > > > > -		 */
> > > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +	/*
> > > > > +	 * Just power up 1 slice, we will
> > > > > +	 * figure out later which slices we have and what we
> > > > > need.
> > > > > +	 */
> > > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > 
> > > > That stuff really shouldn't be modified by low level functions.
> > > > The intial value should come from readout, and after that it
> > > > should
> > > > be adjusted as part of the overall atomic state.
> > > 
> > > Well, as you can see currently this is even worse - has some weird
> > > hardcoded stuff and is modified like that. Also if we are enabling
> > > the 
> > > dbuf slices, shouldn't we really determine what value should be
> > > there?
> > > 
> > > Also I wouldn't really always trust the readout, as to me it seems
> > > icelake resets it constantly to 2 after suspend/resume. 
> > 
> > The point of readout is that we syncrhonize our software state with
> > the
> > current state of hardware without having to modify the hardware
> > state.
> > Once synchronized the driver takes over and tells the hw what to do.
> 
> That is especially nice after you say "report error and don't care"
> when I really want to reflect hw state in sw state, by checking whether
> we could set dbuf slices or not. 
> Now you say that we need to synchronize software state with hw state,
> however if I fail to modify it - then "who cares".

Yes. If you fail to modify the hardware state the whole thing is
broken anyway. There's is *nothing* we can do at this stage in the
commit to fix it.

> 
> 
> > 
> > > 
> > > > 
> > > > > +	icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >  
> > > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -	udelay(10);
> > > > > -
> > > > > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -		DRM_ERROR("DBuf power disable timeout!\n");
> > > > > -	else
> > > > > -		/*
> > > > > -		 * FIXME: for now pretend that the first slice
> > > > > is
> > > > > always
> > > > > -		 * enabled, see
> > > > > intel_enabled_dbuf_slices_num().
> > > > > -		 */
> > > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +	/*
> > > > > +	 * Disable all slices
> > > > > +	 */
> > > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > > +	icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >  
> > > > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > > drm_i915_private *i915,
> > > > >  	for ((wf) = intel_display_power_get((i915), (domain));
> > > > > (wf); \
> > > > >  	     intel_display_power_put_async((i915), (domain),
> > > > > (wf)),
> > > > > (wf) = 0)
> > > > >  
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >  			    u8 req_slices);
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv);
> > > > >  
> > > > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > > > >  			     bool override, unsigned int mask);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -51,6 +51,7 @@
> > > > >  #include "display/intel_bw.h"
> > > > >  #include "display/intel_cdclk.h"
> > > > >  #include "display/intel_display_types.h"
> > > > > +#include "display/intel_display_power.h"
> > > > >  #include "display/intel_dp.h"
> > > > >  #include "display/intel_fbdev.h"
> > > > >  #include "display/intel_hotplug.h"
> > > > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > > > device *kdev)
> > > > >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > > > >  		ret = vlv_resume_prepare(dev_priv, true);
> > > > >  
> > > > > +	/* Weird hack to fix ICL hardware bug, as it resets
> > > > > DBUF slices
> > > > > reg */
> > > > > +	if (INTEL_GEN(dev_priv) == 11)
> > > > > +		icl_program_dbuf_slices(dev_priv);
> > > > > +
> > > > >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> > > > >  
> > > > >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -804,7 +804,7 @@ static inline bool
> > > > > skl_ddb_entry_equal(const
> > > > > struct skl_ddb_entry *e1,
> > > > >  }
> > > > >  
> > > > >  struct skl_ddb_allocation {
> > > > 
> > > > This struct is quite pointless. Should be killed off.
> > > 
> > > Legacy stuff, not added by me. Quite fine with removing it - 
> > > but should we really do all this at the same time and in same
> > > patch? 
> > 
> > Not the same patch. But clean up the cobwebs first, then add the new
> > stuff. Much easier that way around.
> 
> There is a shit ton load of cobwebs to clean. Personally my opinion is
> that current dbuf controlling code is total garbage - which has lots of
> hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> disabling/enabling dbuf properly....
> Where you were, when this was committed then?

Looking at something else unfortunately. Also it has just kept
bitrotting since because no one has cared.

> 
> If we need to fix it - fine. But may be we should do it in stages?
> Point of this patch is "enable second dbuf slice", next patch "remove 
> unneeded structs" and so on. 
> While you want me to shuffle everything at the same time, which is not
> good idea, especially considering complexity and amount of legacy stuff
> we currently have.

Building new stuff on top of a decomposing garbage pile is not
a good idea: a) it's harder than building on clean foundation,
b) it'll probably fall over and get buried in the old garbage
at which point you're likely going to be right back where you
started.

> 
> > 
> > > 
> > > > 
> > > > > -	u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > > */
> > > > > +	u8 enabled_slices;   /* Bitmask of currently enabled
> > > > > DBuf
> > > > > slices */
> > > > >  };
> > > > >  
> > > > >  struct skl_ddb_values {
> > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > > index 1bb701d32a5d..8e86af505730 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > > intel_cherryview_info = {
> > > > >  	.has_gt_uc = 1, \
> > > > >  	.display.has_hdcp = 1, \
> > > > >  	.display.has_ipc = 1, \
> > > > > -	.ddb_size = 896
> > > > > +	.ddb_size = 896, \
> > > > > +	.supported_slices = 1
> > > > >  
> > > > >  #define SKL_PLATFORM \
> > > > >  	GEN9_FEATURES, \
> > > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > > intel_skylake_gt4_info = {
> > > > >  	GEN(9), \
> > > > >  	.is_lp = 1, \
> > > > >  	.display.has_hotplug = 1, \
> > > > > +	.supported_slices = 1, \
> > > > >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > > BIT(VECS0),
> > > > > \
> > > > >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > > > >  	.has_64bit_reloc = 1, \
> > > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > > intel_coffeelake_gt3_info = {
> > > > >  	GEN9_FEATURES, \
> > > > >  	GEN(10), \
> > > > >  	.ddb_size = 1024, \
> > > > > +	.supported_slices = 1, \
> > > > >  	.display.has_dsc = 1, \
> > > > >  	.has_coherent_ggtt = false, \
> > > > >  	GLK_COLORS
> > > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > > intel_cannonlake_info = {
> > > > >  	}, \
> > > > >  	GEN(11), \
> > > > >  	.ddb_size = 2048, \
> > > > > +	.supported_slices = 2, \
> > > > >  	.has_logical_ring_elsq = 1, \
> > > > >  	.color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > > 262145 }
> > > > >  
> > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > > index a607ea520829..fba5731063d8 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > @@ -7734,9 +7734,11 @@ enum {
> > > > >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> > > > >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> > > > >  #define  DISP_IPC_ENABLE		(1 << 3)
> > > > > -#define DBUF_CTL	_MMIO(0x45008)
> > > > > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > > > > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > > > > +#define DBUF_S1_BIT			BIT(0)
> > > > > +#define DBUF_S2_BIT			BIT(1)
> > > > 
> > > > These are not bits inside DBUF_CTL so they should not be here.
> > > > I'd just nuke them entirely. If you think we need more than an
> > > > int to refer to a slice then IMO something simple like
> > > > 
> > > > enum dbuf_slice {
> > > > 	DBUF_S1,
> > > > 	DBUF_S2,
> > > > };
> > > > 
> > > > would match the general apporach we use for other things much
> > > > better
> > > > rather than defining some shifted values.
> > > 
> > > We need to combine slices - so bitmask is quite suitable for that, 
> > > I can of course use this enum, but then I'll have to re-encode it 
> > > to bitmask anyways.
> > 
> > Which is pretty much how everything else works in the driver.
> 
> Another piece of incredibly important thing to argue..

It is. Lot's of crap in the driver is simply caused by people not
following established style and coming up with something different..
Which means there is an extra cost for everyone else when reading the
code. And that probably means most people just don't bother to read it.

Granted, some of that is due to spurts in evolution where major overhauls
changed the architecture so significantly that the old established
conventions simply made node sense anymore. Atomic being the last big
one in that regard. And we're still cleaning up the resulting mess.

> 
> > 
> > > 
> > > 
> > > > 
> > > > > +#define DBUF_CTL			_MMIO(0x45008)
> > > > > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > > > > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> > > > >  #define  DBUF_POWER_REQUEST		(1 << 31)
> > > > >  #define  DBUF_POWER_STATE		(1 << 30)
> > > > >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > > >  	} display;
> > > > >  
> > > > >  	u16 ddb_size; /* in blocks */
> > > > > +	u8 supported_slices; /* number of DBuf slices */
> > > > >  
> > > > >  	/* Register offsets for the various display pipes and
> > > > > transcoders */
> > > > >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > index 2d389e437e87..1ea81ab92429 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > > drm_device
> > > > > *dev)
> > > > >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > > >  }
> > > > >  
> > > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -	u8 enabled_slices;
> > > > > -
> > > > > -	/* Slice 1 will always be enabled */
> > > > > -	enabled_slices = 1;
> > > > > +	u8 enabled_slices = 0;
> > > > >  
> > > > >  	/* Gen prior to GEN11 have only one DBuf slice */
> > > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > > > -		return enabled_slices;
> > > > > +		return DBUF_S1_BIT;
> > > > 
> > > > I'd nuke this special case entirely and just read everything from
> > > > the
> > > > hw. Assuming we need readout somewhere.
> > > 
> > > We have it that way hardcoded right now in drm-tip. I think this is
> > > a
> > > subject to other patch, but not this change. As I understand we
> > > need 
> > > to first figure out how to do it properly. 
> > > This patch anyway makes it better by at least removing hardcoded
> > > relation between number of slices and which slices are enabled.
> > > With current code there is no way for example to enable second DBuf
> > > slice only, which is required by BSpec.
> > > 
> > > > 
> > > > >  
> > > > > -	/*
> > > > > -	 * FIXME: for now we'll only ever use 1 slice; pretend
> > > > > that we
> > > > > have
> > > > > -	 * only that 1 slice enabled until we have a proper way
> > > > > for on-
> > > > > demand
> > > > > -	 * toggling of the second slice.
> > > > > -	 */
> > > > > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > -		enabled_slices++;
> > > > > +	/* Check if second DBuf slice is enabled */
> > > > > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > > +		enabled_slices |= DBUF_S1_BIT;
> > > > > +
> > > > > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > +		enabled_slices |= DBUF_S2_BIT;
> > > > >  
> > > > >  	return enabled_slices;
> > > > >  }
> > > > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  			      const int num_active,
> > > > >  			      struct skl_ddb_allocation *ddb)
> > > > >  {
> > > > > -	const struct drm_display_mode *adjusted_mode;
> > > > > -	u64 total_data_bw;
> > > > >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > > -
> > > > >  	WARN_ON(ddb_size == 0);
> > > > >  
> > > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > > >  		return ddb_size - 4; /* 4 blocks for bypass
> > > > > path
> > > > > allocation */
> > > > >  
> > > > > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > > -	total_data_bw = total_data_rate *
> > > > > drm_mode_vrefresh(adjusted_mode);
> > > > > +	return ddb_size;
> > > > > +}
> > > > >  
> > > > > -	/*
> > > > > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > > > > -	 *
> > > > > -	 * FIXME dbuf slice code is broken:
> > > > > -	 * - must wait for planes to stop using the slice
> > > > > before
> > > > > powering it off
> > > > > -	 * - plane straddling both slices is illegal in multi-
> > > > > pipe
> > > > > scenarios
> > > > > -	 * - should validate we stay within the hw bandwidth
> > > > > limits
> > > > > -	 */
> > > > > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12)))
> > > > > {
> > > > > -		ddb->enabled_slices = 2;
> > > > > -	} else {
> > > > > -		ddb->enabled_slices = 1;
> > > > > -		ddb_size /= 2;
> > > > > -	}
> > > > > +/*
> > > > > + * Calculate initial DBuf slice offset, based on slice size
> > > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > > enabled
> > > > > + * offset would be 1024)
> > > > > + */
> > > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > > dbuf_slice_mask,
> > > > > +					   u32 slice_size, u32
> > > > > ddb_size)
> > > > > +{
> > > > > +	u32 offset = 0;
> > > > >  
> > > > > -	return ddb_size;
> > > > > +	if (!dbuf_slice_mask)
> > > > > +		return 0;
> > > > > +
> > > > > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > > +
> > > > > +	WARN_ON(offset >= ddb_size);
> > > > > +	return offset;
> > > > >  }
> > > > >  
> > > > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +				      int pipe, u32
> > > > > active_pipes,
> > > > > +				      const struct
> > > > > intel_crtc_state
> > > > > *crtc_state);
> > > > > +
> > > > >  static void
> > > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > > *dev_priv,
> > > > >  				   const struct
> > > > > intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -3857,7 +3855,14 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  	u32 pipe_width = 0, total_width = 0, width_before_pipe
> > > > > = 0;
> > > > >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > > >  	u16 ddb_size;
> > > > > +	u32 ddb_range_size;
> > > > >  	u32 i;
> > > > > +	u32 dbuf_slice_mask;
> > > > > +	u32 active_pipes;
> > > > > +	u32 offset;
> > > > > +	u32 slice_size;
> > > > > +	u32 total_slice_mask;
> > > > > +	u32 start, end;
> > > > >  
> > > > >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > > >  		alloc->start = 0;
> > > > > @@ -3866,14 +3871,23 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  		return;
> > > > >  	}
> > > > >  
> > > > > -	if (intel_state->active_pipe_changes)
> > > > > +	if (intel_state->active_pipe_changes) {
> > > > >  		*num_active = hweight8(intel_state-
> > > > > >active_pipes);
> > > > > -	else
> > > > > +		active_pipes = intel_state->active_pipes;
> > > > > +	} else {
> > > > >  		*num_active = hweight8(dev_priv->active_pipes);
> > > > > +		active_pipes = dev_priv->active_pipes;
> > > > > +	}
> > > > >  
> > > > >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > > total_data_rate,
> > > > >  				      *num_active, ddb);
> > > > >  
> > > > > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > > +
> > > > > +	slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > > >supported_slices;
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > > > +
> > > > >  	/*
> > > > >  	 * If the state doesn't change the active CRTC's or
> > > > > there is no
> > > > >  	 * modeset request, then there's no need to
> > > > > recalculate;
> > > > > @@ -3891,20 +3905,70 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  		return;
> > > > >  	}
> > > > >  
> > > > > +	/*
> > > > > +	 * Get allowed DBuf slices for correspondent pipe and
> > > > > platform.
> > > > > +	 */
> > > > > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > > > for_pipe,
> > > > > +						     active_pip
> > > > > es,
> > > > > crtc_state);
> > > > 
> > > > Quite a few redundant arguments. The function name looks alien.
> > > > 
> > > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > > 
> > > > > +
> > > > > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes
> > > > > %x\n",
> > > > > +		      dbuf_slice_mask,
> > > > > +		      for_pipe, active_pipes);
> > > > > +
> > > > > +	/*
> > > > > +	 * Figure out at which DBuf slice we start, i.e if we
> > > > > start at
> > > > > Dbuf S2
> > > > > +	 * and slice size is 1024, the offset would be 1024
> > > > > +	 */
> > > > > +	offset =
> > > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > > +						 slice_size,
> > > > > ddb_size);
> > > > > +
> > > > > +	/*
> > > > > +	 * Figure out total size of allowed DBuf slices, which
> > > > > is
> > > > > basically
> > > > > +	 * a number of allowed slices for that pipe multiplied
> > > > > by slice
> > > > > size.
> > > > > +	 * Inside of this
> > > > > +	 * range ddb entries are still allocated in proportion
> > > > > to
> > > > > display width.
> > > > > +	 */
> > > > > +	ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > > slice_size;
> > > > > +
> > > > >  	/*
> > > > >  	 * Watermark/ddb requirement highly depends upon width
> > > > > of the
> > > > >  	 * framebuffer, So instead of allocating DDB equally
> > > > > among
> > > > > pipes
> > > > >  	 * distribute DDB based on resolution/width of the
> > > > > display.
> > > > >  	 */
> > > > > +	total_slice_mask = dbuf_slice_mask;
> > > > 
> > > > Isn't this already taken care by the loop below?
> > > > 
> > > > >  	for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > crtc_state,
> > > > > i) {
> > > > >  		const struct drm_display_mode *adjusted_mode =
> > > > >  			&crtc_state->hw.adjusted_mode;
> > > > >  		enum pipe pipe = crtc->pipe;
> > > > >  		int hdisplay, vdisplay;
> > > > > +		u32 pipe_dbuf_slice_mask = \
> > > > > +			i915_get_allowed_dbuf_mask(dev_priv,
> > > > > +						pipe,
> > > > > +						active_pipes,
> > > > > +						crtc_state);
> > > > >  
> > > > >  		if (!crtc_state->hw.enable)
> > > > >  			continue;
> > > > >  
> > > > > +		/*
> > > > > +		 * According to BSpec pipe can share one dbuf
> > > > > slice
> > > > > with another
> > > > > +		 * pipes or pipe can use multiple dbufs, in
> > > > > both cases
> > > > > we
> > > > > +		 * account for other pipes only if they have
> > > > > exactly
> > > > > same mask.
> > > > > +		 * However we need to account how many slices
> > > > > we should
> > > > > enable
> > > > > +		 * in total.
> > > > > +		 */
> > > > > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > > > 
> > > > total_slice_mask will now account only the crtcs in the state.
> > > > What
> > > > happens to the other pipes' slices?
> 
> Same as happens now btw. Right now it also does the same thing.

Right now it does absolutely nothing. It's all dead code.

> 
> > > 
> > > Yes we need to iterate through all pipes/crtcs here, however
> > > for getting correct dbuf slice mask most important is active_pipes.
> > > However yes, I will change the loop for it to be more correct.
> > > 
> > > > 
> > > > > +
> > > > > +		/*
> > > > > +		 * Do not account pipes using other slice sets
> > > > > +		 * luckily as of current BSpec slice sets do
> > > > > not
> > > > > partially
> > > > > +		 * intersect(pipes share either same one slice
> > > > > or same
> > > > > slice set
> > > > > +		 * i.e no partial intersection), so it is
> > > > > enough to
> > > > > check for
> > > > > +		 * equality for now.
> > > > > +		 */
> > > > > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > > +			continue;
> > > > > +
> > > > >  		drm_mode_get_hv_timing(adjusted_mode,
> > > > > &hdisplay,
> > > > > &vdisplay);
> > > > >  		total_width += hdisplay;
> > > > >  
> > > > > @@ -3914,8 +3978,19 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  			pipe_width = hdisplay;
> > > > >  	}
> > > > >  
> > > > > -	alloc->start = ddb_size * width_before_pipe /
> > > > > total_width;
> > > > > -	alloc->end = ddb_size * (width_before_pipe +
> > > > > pipe_width) /
> > > > > total_width;
> > > > > +	ddb->enabled_slices = total_slice_mask;
> > > > > +
> > > > > +	start = ddb_range_size * width_before_pipe /
> > > > > total_width;
> > > > > +	end = ddb_range_size * (width_before_pipe + pipe_width)
> > > > > /
> > > > > total_width;
> > > > > +
> > > > > +	alloc->start = offset + start;
> > > > > +	alloc->end = offset + end;
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > > +		      alloc->start, alloc->end);
> > > > > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported
> > > > > %d\n",
> > > > > +		      ddb->enabled_slices,
> > > > > +		      INTEL_INFO(dev_priv)->supported_slices);
> > > > >  }
> > > > >  
> > > > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > > intel_crtc *crtc,
> > > > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > > > >  			  struct skl_ddb_allocation *ddb /* out
> > > > > */)
> > > > >  {
> > > > > -	ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > > +	ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices(dev_priv);
> > > > > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > > enabled_slices);
> > > > > 
> > > > >  }
> > > > >  
> > > > >  /*
> > > > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > > > intel_crtc_state *crtc_state,
> > > > >  	return total_data_rate;
> > > > >  }
> > > > >  
> > > > > +static uint_fixed_16_16_t
> > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	u32 src_w, src_h, dst_w, dst_h;
> > > > > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > +	const struct drm_display_mode *adjusted_mode =
> > > > > &crtc_state-
> > > > > > hw.adjusted_mode;
> > > > > 
> > > > > +
> > > > > +	src_w = crtc_state->pipe_src_w;
> > > > > +	src_h = crtc_state->pipe_src_h;
> > > > > +	dst_w = adjusted_mode->crtc_hdisplay;
> > > > > +	dst_h = adjusted_mode->crtc_vdisplay;
> > > > > +
> > > > > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > +	downscale_w = max_fixed16(fp_w_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +	downscale_h = max_fixed16(fp_h_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +
> > > > > +	return mul_fixed16(downscale_w, downscale_h);
> > > > > +}
> > > > > +
> > > > > +uint_fixed_16_16_t
> > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > +{
> > > > > +	struct drm_plane *plane;
> > > > > +	const struct drm_plane_state *drm_plane_state;
> > > > > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > >uapi.crtc);
> > > > > +
> > > > > +	if (!crtc_state->uapi.enable)
> > > > > +		return max_downscale;
> > > > > +
> > > > > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > +		uint_fixed_16_16_t plane_downscale;
> > > > > +		const struct intel_plane_state *plane_state =
> > > > > +			to_intel_plane_state(drm_plane_state);
> > > > > +
> > > > > +		if (!intel_wm_plane_visible(crtc_state,
> > > > > plane_state))
> > > > > +			continue;
> > > > > +
> > > > > +		plane_downscale =
> > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > +
> > > > > +		DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > %d.%d\n",
> > > > > +			      to_intel_plane(plane)->id,
> > > > > +			      plane_downscale.val >> 16,
> > > > > +			      plane_downscale.val & 0xffff);
> > > > > +
> > > > > +		max_downscale = max_fixed16(plane_downscale,
> > > > > max_downscale);
> > > > > +	}
> > > > > +
> > > > > +
> > > > > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > +		       crtc->pipe, pipe_downscale.val >> 16,
> > > > > +		       pipe_downscale.val & 0xffff);
> > > > > +
> > > > > +	pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > max_downscale);
> > > > > +
> > > > > +	/*
> > > > > +	 * Convert result to percentage: get the actual ratio
> > > > > instead
> > > > > of rate,
> > > > > +	 * then multiply by 100 to get percentage.
> > > > > +	 */
> > > > > +	pipe_ratio = u32_to_fixed16(100 *
> > > > > div_round_up_u32_fixed16(1,
> > > > > pipe_downscale));
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > +		      pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > 0xffff);
> > > > > +
> > > > > +	return pipe_ratio;
> > > > > +}
> > > > 
> > > > Yuck. I just killed these.
> > > 
> > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > BSpec,
> > > not by me.
> > 
> > That part of the spec is pretty crazy IMO. I think we should just
> > ignore
> > it for now. Also nothing in there says we must use both slices for a
> > single pipe so ignoring doesn't even violate the spec in any way.
> 
> So you are instructing me now to ignore BSpec as "we are smarter"?
> I don't think this is properly to do it this way. Lets first change the
> BSpec, if we don't like pipe ratio here and then driver. Currently this
> is written that way and if we are going to do everything our own way
> ignoring specs, this is going to be chaos.

Like I sai you don't have to violate the spec at all. Just don't use
two slices for the same pipe and you can avoid the whole strange pipe
ratio limitation. We should file a bspec issue to get that clarified
though.

> 
> > 
> > > 
> > > > 
> > > > > +
> > > > > +struct dbuf_slice_conf_entry {
> > > > > +	u32 dbuf_mask[I915_MAX_PIPES];
> > > > > +	u32 active_pipes;
> > > > > +};
> > > > > +
> > > > > +
> > > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > > DBuf4)  \
> > > > > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > > BIT(PIPE_B) | \
> > > > > +					  BIT(PIPE_C) |
> > > > > BIT(PIPE_D) }
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 12716
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 49255
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > > 
> > > > My eyes!
> > > > 
> > > > I have to think we should be able to reduce all that to a handful
> > > > of lines of code.
> > > 
> > > Yeah, then we are going to have a huge function with lots of weird
> > > definitions, unfortunately BSpec table has quite strange DBuf
> > > assignments like 
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > 
> > > i.e slices can get mixed in a quite various ways. There is no
> > > rational pattern in those and could be even dangerous to try to 
> > > optimize it some way, as one day it might change again in some
> > > unpredictable way.
> > > 
> > > Would you prefer to have it like
> > > 
> > > if (pipes are A and B)
> > >     program S2 to A and S1 to B
> > > if (pipes are A and C)
> > >     program S1 to A and S2 to C
> > > ...
> > > 
> > > I would prefer at least to see it in a comparable way with the
> > > table we have in BSpec, rather than having lots of weird looking
> > > if-else statements, GEN checks and so on.
> > > 
> > > I knew this table was not going to look like it is done typically
> > > here - but really my opinion that it is way better than hardcoding
> > > it into some weird algorithm, which would be hard to compare to 
> > > initial table, which is already strange enough.
> > > 
> > > If you don't like it - could you please explain why this is exactly
> > > worse than having long functions with hardcoded checks?
> > 
> > I think we should be able to encode this simply using the
> > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > is the distance of each pipe from each dbuf slice. Then all we have
> > to do is minimize the total distance. The following tables I think
> > should encode that reasonably concisely:
> > 
> > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > static const u8 icl_dbuf_distance[][] {
> > 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > };
> > 
> > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > static const u8 tgl_dbuf_distance[][] {
> > 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > 	[PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > };
> 
> I think you are missing my point. I will explain why.
> My initial idea was similar, use some kind of distance
> metrics and encode everything in a smart way.
> Current BSpec table has sometimes slices swapped for no reason.
> 
> There are seem to be some hardware constraints we are not aware
> of. For example for TGL pipes A,B get S2 and S1 slices,

A/S1 + B/S2 = 1 + 4 = 5
A/S2 + B/S1 = 2 + 1 = 3

> but pipes A C get S1 and S2.

A/S1 + C/S2 = 1 + 1 = 2
A/S2 + C/S1 = 2 + 2 = 4

So it's exactly as my distance table says it should be.

> 
> if you encode all this into some kind of distance algorithm,
> those differences would be enourmously difficult to spot and
> compare with actual table. That is why I decided to do it that
> way.
> 
> Most likely yes you can solve it by using this metrics,
> but will it really be then more readable and comparable
> with reference table? 
> What if some contraints will be added again, like pipe ratio?
> 
> > 
> > /*
> >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> >  */
> > static const u8 future_dbuf_distance[][] {
> > 	[PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > 	[PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > 	[PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > 	[PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > };
> > 
> > I suppose we could make that even more minimal by just
> > directly sticking that "Pipe and DBUF ordering" language
> > into a single array but and then calculate each distance
> > from it, but that is perhaps more work, and we'd need
> > some kind of enum just for that so that we have a single
> > namespace for the pipes and dbuf slices. So i have a
> > feeling having those distances pre-computed in a table
> > like thet is maybe a better way to go.
> 
> Good luck then spoting differences with the BSpec table once it
> gets changed.

Well, I'd rather do that than decode through layers of cpp macros.
Also atm the table has no extra restrictions so it's clearly there
just because someone thought they'd write out the results of the
distance calculations. In general bspec style is somewhat annoying
where they try to write out things in the lowest level of absraction
possible (they are hardware people after all). I'd much prefer a
higher level description of the issue.

> 
> > 
> > > 
> > > I think this is pretty stupid to fix stuff which looks new or
> > > unusual
> > > just because somebody doesn't like it, without even constructive
> > > arguing.
> > > 
> > > - Stan
> > > 
> > > > 
> > > > 
> > > > > +
> > > > > +/*
> > > > > + * This function
> > > > 
> > > > "This function" is totally redundant. Ditto elsewhere.
> > > > 
> > > > > finds an entry with same enabled pipe configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > > +				     u32 active_pipes,
> > > > > +				     const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	int i;
> > > > > +	uint_fixed_16_16_t pipe_ratio;
> > > > > +
> > > > > +	/*
> > > > > +	 * Calculate pipe ratio as stated in BSpec 28692
> > > > > +	 */
> > > > > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > > +
> > > > > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > > +		if (icl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > {
> > > > > +			if (hweight32(active_pipes) == 1) {
> > > > > +				/*
> > > > > +				 * According to BSpec 12716: if
> > > > > ratio
> > > > > > = 88.8
> > > > > 
> > > > > +				 * always use single dbuf
> > > > > slice.
> > > > > +				 */
> > > > > +				if (pipe_ratio.val >=
> > > > > div_fixed16(888,
> > > > > 10).val)
> > > > > +					++i;
> > > > > +			}
> > > > > +			return
> > > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +		}
> > > > > +	}
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * This function finds an entry with same enabled pipe
> > > > > configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > > +				     u32 active_pipes,
> > > > > +				     const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	int i;
> > > > > +
> > > > > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > > +		if (tgl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > +			return
> > > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +	}
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +				      int pipe, u32
> > > > > active_pipes,
> > > > > +				      const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	if (IS_GEN(dev_priv, 11))
> > > > > +		return icl_get_allowed_dbuf_mask(pipe,
> > > > > +						 active_pipes,
> > > > > +						 crtc_state);
> > > > > +	else if (IS_GEN(dev_priv, 12))
> > > > > +		return tgl_get_allowed_dbuf_mask(pipe,
> > > > > +						 active_pipes,
> > > > > +						 crtc_state);
> > > > > +	/*
> > > > > +	 * For anything else just return one slice yet.
> > > > > +	 * Should be extended for other platforms.
> > > > > +	 */
> > > > > +	return DBUF_S1_BIT;
> > > > > +}
> > > > > +
> > > > >  static u64
> > > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > > *crtc_state,
> > > > >  				 u64 *plane_data_rate)
> > > > > -- 
> > > > > 2.17.1
> > > > 
> > > > 
> > 
> > 

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 18:18           ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-18 18:18 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

On Mon, Nov 18, 2019 at 05:17:51PM +0000, Lisovskiy, Stanislav wrote:
> On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> > On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> > > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > > wrote:
> > > > > Also implemented algorithm for choosing DBuf slice
> > > > > configuration
> > > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > > 
> > > > > Now pipe allocation still stays proportional to pipe width as
> > > > > before,
> > > > > however within allowed DBuf slice for this particular
> > > > > configuration.
> > > > > 
> > > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > > might
> > > > >     now differ from hw state.
> > > > > 
> > > > > v3: - Added new field "supported_slices" to ddb to separate max
> > > > >       supported slices vs currently enabled, to avoid
> > > > > confusion.
> > > > >     - Removed obsolete comments and code related to
> > > > > DBuf(Matthew
> > > > > Roper).
> > > > >     - Some code style and long line removal(Matthew Roper).
> > > > >     - Added WARN_ON to new ddb range offset calc
> > > > > function(Matthew
> > > > > Roper).
> > > > >     - Removed platform specific call to calc pipe ratio from
> > > > > ddb
> > > > >       allocation function and fixed the return value(Matthew
> > > > > Roper)
> > > > >     - Refactored DBUF slice allocation table to improve
> > > > > readability
> > > > >     - Added DBUF slice allocation for TGL as well.
> > > > >     - ICL(however not TGL) seems to voluntarily enable second
> > > > > DBuf
> > > > > slice
> > > > >       after pm suspend/resume causing a mismatch failure,
> > > > > because
> > > > > we
> > > > >       update DBuf slices only if we do a modeset, however this
> > > > > check
> > > > >       is done always. Fixed it to be done only when modeset for
> > > > > ICL.
> > > > > 
> > > > > v4: - Now enabled slices is not actually a number, but a
> > > > > bitmask,
> > > > >       because we might need to enable second slice only and
> > > > > number
> > > > >       of slices would still 1 and that current code doesn't
> > > > > allow.
> > > > >     - Remove redundant duplicate code to have some unified way
> > > > > of
> > > > >       enabling dbuf slices instead of hardcoding.
> > > > > 
> > > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > > > thinking
> > > > >       that we have only one DBUF_CTL slice. Now another version
> > > > > for
> > > > >       gen11 will check both slices as only second one can be
> > > > > enabled,
> > > > >       to keep CI happy.
> > > > > 
> > > > > v6: - Removed enabled dbuf assertion completely. Previous code
> > > > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > > > called.
> > > > >       Currently we enable/disable dbuf slices based on
> > > > > requrement
> > > > >       so no point in asserting this here.
> > > > >     - Removed unnecessary modeset check from
> > > > > verify_wm_state(Matthew Roper)
> > > > >     - Slices intersection after union is same as final
> > > > > result(Matthew Roper)
> > > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > > >     - Some macros added(Matthew Roper)
> > > > >     - ddb range is now always less or equal to ddb size - no
> > > > > need
> > > > > for additional
> > > > >       checks(previously needed as we had some bandwidth checks
> > > > > in
> > > > > there which
> > > > >       could "suddenly" return ddb_size smaller than it is.
> > > > > 
> > > > > v7: Minor costemic changes:
> > > > >     - Changed icl_dbuf_slices_restore name to
> > > > > icl_program_dbuf_slices
> > > > >       as it more corresponds to the actual functionality.
> > > > >     - Some simplification with supported slices for BXT and GLK
> > > > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > > > >       this is not used for skl anymore.
> > > > >     - Changed DBuf slice assignment order for TGL case
> > > > > 
> > > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Signed-off-by: Stanislav Lisovskiy <
> > > > > stanislav.lisovskiy@intel.com>
> > > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > > ++++++++++++++++--
> > > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > index 876fc25968bf..bd7aff675198 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > > intel_crtc *crtc,
> > > > >  
> > > > >  	if (INTEL_GEN(dev_priv) >= 11 &&
> > > > >  	    hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > > -		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %u, got
> > > > > %u)\n",
> > > > > +		DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %x, got
> > > > > %x)\n",
> > > > >  			  sw_ddb->enabled_slices,
> > > > >  			  hw->ddb.enabled_slices);
> > > > >  
> > > > > @@ -14614,15 +14614,24 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >  	u8 hw_enabled_slices = dev_priv-
> > > > > >wm.skl_hw.ddb.enabled_slices;
> > > > >  	u8 required_slices = state-
> > > > > >wm_results.ddb.enabled_slices;
> > > > >  	struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > > +	u8 slices_union = hw_enabled_slices | required_slices;
> > > > > +	u8 slices_intersection = required_slices;
> > > > >  
> > > > >  	for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > > old_crtc_state, new_crtc_state, i)
> > > > >  		/* ignore allocations for crtc's that have been
> > > > > turned
> > > > > off. */
> > > > >  		if (new_crtc_state->hw.active)
> > > > >  			entries[i] = old_crtc_state-
> > > > > >wm.skl.ddb;
> > > > >  
> > > > > -	/* If 2nd DBuf slice required, enable it here */
> > > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > > hw_enabled_slices)
> > > > > -		icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +	DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > > +		      required_slices, hw_enabled_slices);
> > > > > +
> > > > > +	/*
> > > > > +	 * If some other DBuf slice required, enable it here,
> > > > > +	 * as here we only add more slices, but not remove to
> > > > > prevent
> > > > > +	 * issues if somebody still uses those.
> > > > > +	 */
> > > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +		icl_dbuf_slices_update(dev_priv, slices_union);
> > > > >  
> > > > >  	/*
> > > > >  	 * Whenever the number of active pipes changes, we need
> > > > > to make
> > > > > sure we
> > > > > @@ -14681,9 +14690,12 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >  		}
> > > > >  	} while (progress);
> > > > >  
> > > > > -	/* If 2nd DBuf slice is no more required disable it */
> > > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -		icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +	/*
> > > > > +	 * If some other DBuf slice is not needed, disable it
> > > > > here,
> > > > > +	 * as here we only remove slices, which are not needed
> > > > > anymore.
> > > > > +	 */
> > > > > +	if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +		icl_dbuf_slices_update(dev_priv,
> > > > > slices_intersection);
> > > > 
> > > > This is still a totally wrong place for this. We need to be sure
> > > > the
> > > > planes have switched over to the new ddb layout before we power
> > > > off
> > > > anything.
> > > 
> > > I have just modified already existing line here, not added by me.
> > > The point of this patch is starting to use second DBuf slice but
> > > not fixing all kinds of issues related to DBuf code at the same
> > > time.
> > 
> > You're re-enabling code I disabled on purpose. It has to be fixed
> > properly or not touched at all. In hindsight I should have probably
> > just ripped it all out to not give people bad ideas.
> 
> -	/* If 2nd DBuf slice is no more required disable it */
> > > > > -	if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -		icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> 
> I'm not reenabling it - this is code is out there right now.

It's _dead_ code, and it's like that on purpose. You're now attempting
to re-enable it without fixing the underlying issues.

> This is all again about changing all kinds of stuff at the same time.
> 
> Why we can't just do things step by step? You have a point that
> this is wrong place ok, but as we see current code causes no
> regressions. Just tell me your idea where to put it and I will
> do that, if you don't have it yet - let it stay there, if CI shows
> no regressions currently. 

The same global state tricks apply here as with the cdclk/sagv stuff.
1. enable new resources/bump up clocks
2. reconfigure planes/etc.
3. wait for vblanks
4. disable unused resources/drop the clocks

> 
> > 
> > > 
> > > Of course this is nice idea to fix it - if that is wrong plane,
> > > but current code somehow lives with it and probably this should be
> > > with some other patch.
> > > 
> > > > 
> > > > >  }
> > > > >  
> > > > >  static void intel_atomic_helper_free_state(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > @@ -1031,15 +1031,6 @@ static bool
> > > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > > *dev_priv,
> > > > >  		(I915_READ(DC_STATE_EN) &
> > > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > > >  }
> > > > >  
> > > > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > > > *dev_priv)
> > > > > -{
> > > > > -	u32 tmp = I915_READ(DBUF_CTL);
> > > > > -
> > > > > -	WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > > > -	     (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > > -	     "Unexpected DBuf power power state (0x%08x)\n",
> > > > > tmp);
> > > > > -}
> > > > > -
> > > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > >  	struct intel_cdclk_state cdclk_state = {};
> > > > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > > > drm_i915_private *dev_priv)
> > > > >  	/* Can't read out voltage_level so can't use
> > > > > intel_cdclk_changed() */
> > > > >  	WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > > > &cdclk_state));
> > > > >  
> > > > > -	gen9_assert_dbuf_enabled(dev_priv);
> > > > > -
> > > > >  	if (IS_GEN9_LP(dev_priv))
> > > > >  		bxt_verify_ddi_phy_power_wells(dev_priv);
> > > > >  
> > > > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > > > drm_i915_private *dev_priv)
> > > > >  	intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > > >  }
> > > > >  
> > > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -	if (INTEL_GEN(dev_priv) < 11)
> > > > > -		return 1;
> > > > > -	return 2;
> > > > > +	return INTEL_INFO(dev_priv)->supported_slices;
> > > > > +}
> > > > 
> > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > good
> > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > > 
> > > Supported slices really means how much slices we support in
> > > hardware.
> > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > I don't like this name.
> > 
> > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > ambiguous. Also we already have num_pipes, num_sprites, etc.
> 
> ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
> really that important and worth arguing at the very last moment?
> whether number of
> supported_slices_in_hardware, should be named "supported slices" or
> "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
> discussing this "soooo important" thing. 
> 
> Some one might come up that it is called "num_slices" or
> "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
> keep everyones bloated egos happy??
> 
> 
> > 
> > > 
> > > > 
> > > > > +
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +{
> > > > > +	const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > 
> > > > Seems a rather pointless variable. const at least is pointless.
> > > > 
> > > > > +
> > > > > +	icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > > >  }
> > > > >  
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >  			    u8 req_slices)
> > > > >  {
> > > > > -	const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > 
> > > > > -	bool ret;
> > > > > +	bool ret = true;
> > > > > +	int i;
> > > > > +	int max_slices = intel_dbuf_max_slices(dev_priv);
> > > > >  
> > > > > -	if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > > +	if (hweight8(req_slices) >
> > > > > intel_dbuf_max_slices(dev_priv)) {
> > > > >  		DRM_ERROR("Invalid number of dbuf slices
> > > > > requested\n");
> > > > >  		return;
> > > > >  	}
> > > > >  
> > > > > -	if (req_slices == hw_enabled_slices || req_slices == 0)
> > > > > -		return;
> > > > > +	DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > > req_slices);
> > > > >  
> > > > > -	if (req_slices > hw_enabled_slices)
> > > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > true);
> > > > > -	else
> > > > > -		ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > false);
> > > > > +	for (i = 0; i < max_slices; i++) {
> > > > > +		int slice_bit = BIT(i);
> > > > > +		bool slice_set = (slice_bit & req_slices) != 0;
> > > > > +
> > > > > +		switch (slice_bit) {
> > > > > +		case DBUF_S1_BIT:
> > > > > +			ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +							  DBUF_
> > > > > CTL_S1,
> > > > > +							  slice
> > > > > _set);
> > > > > +			break;
> > > > > +		case DBUF_S2_BIT:
> > > > > +			ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +							  DBUF_
> > > > > CTL_S2,
> > > > > +							  slice
> > > > > _set);
> > > > 
> > > > The return value on intel_dbuf_slice_set() seems pointless. I'd
> > > > nuke
> > > > it.
> > > > 
> > > > Also as long you have intel_dbuf_slice_set() I'd probably just
> > > > pass
> > > > the
> > > > slice index there and let it deal with the register details.
> > > > DBUF_CTL
> > > > should probably be parametrized as well so you don't need
> > > > annoying
> > > > switch
> > > > statements.
> > > 
> > > What if it fails? Shouldn't we still have some clue what state do
> > > we
> > > have/had at the moment?
> > 
> > It if fails you log an error and keep going.
> 
> Ah sure. And let hw/sw state be completely different.
> Also see next comment about this.
> 
> > 
> > > 
> > > > 
> > > > > +			break;
> > > > > +		default:
> > > > > +			MISSING_CASE(slice_bit);
> > > > > +		}
> > > > > +	}
> > > > >  
> > > > >  	if (ret)
> > > > >  		dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > > req_slices;
> > > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  
> > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -	udelay(10);
> > > > > -
> > > > > -	if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -	    !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -		DRM_ERROR("DBuf power enable timeout\n");
> > > > > -	else
> > > > > -		/*
> > > > > -		 * FIXME: for now pretend that we only have 1
> > > > > slice,
> > > > > see
> > > > > -		 * intel_enabled_dbuf_slices_num().
> > > > > -		 */
> > > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +	/*
> > > > > +	 * Just power up 1 slice, we will
> > > > > +	 * figure out later which slices we have and what we
> > > > > need.
> > > > > +	 */
> > > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > 
> > > > That stuff really shouldn't be modified by low level functions.
> > > > The intial value should come from readout, and after that it
> > > > should
> > > > be adjusted as part of the overall atomic state.
> > > 
> > > Well, as you can see currently this is even worse - has some weird
> > > hardcoded stuff and is modified like that. Also if we are enabling
> > > the 
> > > dbuf slices, shouldn't we really determine what value should be
> > > there?
> > > 
> > > Also I wouldn't really always trust the readout, as to me it seems
> > > icelake resets it constantly to 2 after suspend/resume. 
> > 
> > The point of readout is that we syncrhonize our software state with
> > the
> > current state of hardware without having to modify the hardware
> > state.
> > Once synchronized the driver takes over and tells the hw what to do.
> 
> That is especially nice after you say "report error and don't care"
> when I really want to reflect hw state in sw state, by checking whether
> we could set dbuf slices or not. 
> Now you say that we need to synchronize software state with hw state,
> however if I fail to modify it - then "who cares".

Yes. If you fail to modify the hardware state the whole thing is
broken anyway. There's is *nothing* we can do at this stage in the
commit to fix it.

> 
> 
> > 
> > > 
> > > > 
> > > > > +	icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >  
> > > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -	I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -	I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -	POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -	udelay(10);
> > > > > -
> > > > > -	if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -	    (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -		DRM_ERROR("DBuf power disable timeout!\n");
> > > > > -	else
> > > > > -		/*
> > > > > -		 * FIXME: for now pretend that the first slice
> > > > > is
> > > > > always
> > > > > -		 * enabled, see
> > > > > intel_enabled_dbuf_slices_num().
> > > > > -		 */
> > > > > -		dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +	/*
> > > > > +	 * Disable all slices
> > > > > +	 */
> > > > > +	dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > > +	icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >  
> > > > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > > drm_i915_private *i915,
> > > > >  	for ((wf) = intel_display_power_get((i915), (domain));
> > > > > (wf); \
> > > > >  	     intel_display_power_put_async((i915), (domain),
> > > > > (wf)),
> > > > > (wf) = 0)
> > > > >  
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >  			    u8 req_slices);
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv);
> > > > >  
> > > > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > > > >  			     bool override, unsigned int mask);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -51,6 +51,7 @@
> > > > >  #include "display/intel_bw.h"
> > > > >  #include "display/intel_cdclk.h"
> > > > >  #include "display/intel_display_types.h"
> > > > > +#include "display/intel_display_power.h"
> > > > >  #include "display/intel_dp.h"
> > > > >  #include "display/intel_fbdev.h"
> > > > >  #include "display/intel_hotplug.h"
> > > > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > > > device *kdev)
> > > > >  	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > > > >  		ret = vlv_resume_prepare(dev_priv, true);
> > > > >  
> > > > > +	/* Weird hack to fix ICL hardware bug, as it resets
> > > > > DBUF slices
> > > > > reg */
> > > > > +	if (INTEL_GEN(dev_priv) == 11)
> > > > > +		icl_program_dbuf_slices(dev_priv);
> > > > > +
> > > > >  	intel_uncore_runtime_resume(&dev_priv->uncore);
> > > > >  
> > > > >  	intel_runtime_pm_enable_interrupts(dev_priv);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -804,7 +804,7 @@ static inline bool
> > > > > skl_ddb_entry_equal(const
> > > > > struct skl_ddb_entry *e1,
> > > > >  }
> > > > >  
> > > > >  struct skl_ddb_allocation {
> > > > 
> > > > This struct is quite pointless. Should be killed off.
> > > 
> > > Legacy stuff, not added by me. Quite fine with removing it - 
> > > but should we really do all this at the same time and in same
> > > patch? 
> > 
> > Not the same patch. But clean up the cobwebs first, then add the new
> > stuff. Much easier that way around.
> 
> There is a shit ton load of cobwebs to clean. Personally my opinion is
> that current dbuf controlling code is total garbage - which has lots of
> hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> disabling/enabling dbuf properly....
> Where you were, when this was committed then?

Looking at something else unfortunately. Also it has just kept
bitrotting since because no one has cared.

> 
> If we need to fix it - fine. But may be we should do it in stages?
> Point of this patch is "enable second dbuf slice", next patch "remove 
> unneeded structs" and so on. 
> While you want me to shuffle everything at the same time, which is not
> good idea, especially considering complexity and amount of legacy stuff
> we currently have.

Building new stuff on top of a decomposing garbage pile is not
a good idea: a) it's harder than building on clean foundation,
b) it'll probably fall over and get buried in the old garbage
at which point you're likely going to be right back where you
started.

> 
> > 
> > > 
> > > > 
> > > > > -	u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > > */
> > > > > +	u8 enabled_slices;   /* Bitmask of currently enabled
> > > > > DBuf
> > > > > slices */
> > > > >  };
> > > > >  
> > > > >  struct skl_ddb_values {
> > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > > index 1bb701d32a5d..8e86af505730 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > > intel_cherryview_info = {
> > > > >  	.has_gt_uc = 1, \
> > > > >  	.display.has_hdcp = 1, \
> > > > >  	.display.has_ipc = 1, \
> > > > > -	.ddb_size = 896
> > > > > +	.ddb_size = 896, \
> > > > > +	.supported_slices = 1
> > > > >  
> > > > >  #define SKL_PLATFORM \
> > > > >  	GEN9_FEATURES, \
> > > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > > intel_skylake_gt4_info = {
> > > > >  	GEN(9), \
> > > > >  	.is_lp = 1, \
> > > > >  	.display.has_hotplug = 1, \
> > > > > +	.supported_slices = 1, \
> > > > >  	.engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > > BIT(VECS0),
> > > > > \
> > > > >  	.pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > > > >  	.has_64bit_reloc = 1, \
> > > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > > intel_coffeelake_gt3_info = {
> > > > >  	GEN9_FEATURES, \
> > > > >  	GEN(10), \
> > > > >  	.ddb_size = 1024, \
> > > > > +	.supported_slices = 1, \
> > > > >  	.display.has_dsc = 1, \
> > > > >  	.has_coherent_ggtt = false, \
> > > > >  	GLK_COLORS
> > > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > > intel_cannonlake_info = {
> > > > >  	}, \
> > > > >  	GEN(11), \
> > > > >  	.ddb_size = 2048, \
> > > > > +	.supported_slices = 2, \
> > > > >  	.has_logical_ring_elsq = 1, \
> > > > >  	.color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > > 262145 }
> > > > >  
> > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > > index a607ea520829..fba5731063d8 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > @@ -7734,9 +7734,11 @@ enum {
> > > > >  #define DISP_ARB_CTL2	_MMIO(0x45004)
> > > > >  #define  DISP_DATA_PARTITION_5_6	(1 << 6)
> > > > >  #define  DISP_IPC_ENABLE		(1 << 3)
> > > > > -#define DBUF_CTL	_MMIO(0x45008)
> > > > > -#define DBUF_CTL_S1	_MMIO(0x45008)
> > > > > -#define DBUF_CTL_S2	_MMIO(0x44FE8)
> > > > > +#define DBUF_S1_BIT			BIT(0)
> > > > > +#define DBUF_S2_BIT			BIT(1)
> > > > 
> > > > These are not bits inside DBUF_CTL so they should not be here.
> > > > I'd just nuke them entirely. If you think we need more than an
> > > > int to refer to a slice then IMO something simple like
> > > > 
> > > > enum dbuf_slice {
> > > > 	DBUF_S1,
> > > > 	DBUF_S2,
> > > > };
> > > > 
> > > > would match the general apporach we use for other things much
> > > > better
> > > > rather than defining some shifted values.
> > > 
> > > We need to combine slices - so bitmask is quite suitable for that, 
> > > I can of course use this enum, but then I'll have to re-encode it 
> > > to bitmask anyways.
> > 
> > Which is pretty much how everything else works in the driver.
> 
> Another piece of incredibly important thing to argue..

It is. Lot's of crap in the driver is simply caused by people not
following established style and coming up with something different..
Which means there is an extra cost for everyone else when reading the
code. And that probably means most people just don't bother to read it.

Granted, some of that is due to spurts in evolution where major overhauls
changed the architecture so significantly that the old established
conventions simply made node sense anymore. Atomic being the last big
one in that regard. And we're still cleaning up the resulting mess.

> 
> > 
> > > 
> > > 
> > > > 
> > > > > +#define DBUF_CTL			_MMIO(0x45008)
> > > > > +#define DBUF_CTL_S1			_MMIO(0x45008)
> > > > > +#define DBUF_CTL_S2			_MMIO(0x44FE8)
> > > > >  #define  DBUF_POWER_REQUEST		(1 << 31)
> > > > >  #define  DBUF_POWER_STATE		(1 << 30)
> > > > >  #define GEN7_MSG_CTL	_MMIO(0x45010)
> > > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > > >  	} display;
> > > > >  
> > > > >  	u16 ddb_size; /* in blocks */
> > > > > +	u8 supported_slices; /* number of DBuf slices */
> > > > >  
> > > > >  	/* Register offsets for the various display pipes and
> > > > > transcoders */
> > > > >  	int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > index 2d389e437e87..1ea81ab92429 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > > drm_device
> > > > > *dev)
> > > > >  	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > > >  }
> > > > >  
> > > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -	u8 enabled_slices;
> > > > > -
> > > > > -	/* Slice 1 will always be enabled */
> > > > > -	enabled_slices = 1;
> > > > > +	u8 enabled_slices = 0;
> > > > >  
> > > > >  	/* Gen prior to GEN11 have only one DBuf slice */
> > > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > > > -		return enabled_slices;
> > > > > +		return DBUF_S1_BIT;
> > > > 
> > > > I'd nuke this special case entirely and just read everything from
> > > > the
> > > > hw. Assuming we need readout somewhere.
> > > 
> > > We have it that way hardcoded right now in drm-tip. I think this is
> > > a
> > > subject to other patch, but not this change. As I understand we
> > > need 
> > > to first figure out how to do it properly. 
> > > This patch anyway makes it better by at least removing hardcoded
> > > relation between number of slices and which slices are enabled.
> > > With current code there is no way for example to enable second DBuf
> > > slice only, which is required by BSpec.
> > > 
> > > > 
> > > > >  
> > > > > -	/*
> > > > > -	 * FIXME: for now we'll only ever use 1 slice; pretend
> > > > > that we
> > > > > have
> > > > > -	 * only that 1 slice enabled until we have a proper way
> > > > > for on-
> > > > > demand
> > > > > -	 * toggling of the second slice.
> > > > > -	 */
> > > > > -	if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > -		enabled_slices++;
> > > > > +	/* Check if second DBuf slice is enabled */
> > > > > +	if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > > +		enabled_slices |= DBUF_S1_BIT;
> > > > > +
> > > > > +	if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > +		enabled_slices |= DBUF_S2_BIT;
> > > > >  
> > > > >  	return enabled_slices;
> > > > >  }
> > > > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  			      const int num_active,
> > > > >  			      struct skl_ddb_allocation *ddb)
> > > > >  {
> > > > > -	const struct drm_display_mode *adjusted_mode;
> > > > > -	u64 total_data_bw;
> > > > >  	u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > > -
> > > > >  	WARN_ON(ddb_size == 0);
> > > > >  
> > > > >  	if (INTEL_GEN(dev_priv) < 11)
> > > > >  		return ddb_size - 4; /* 4 blocks for bypass
> > > > > path
> > > > > allocation */
> > > > >  
> > > > > -	adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > > -	total_data_bw = total_data_rate *
> > > > > drm_mode_vrefresh(adjusted_mode);
> > > > > +	return ddb_size;
> > > > > +}
> > > > >  
> > > > > -	/*
> > > > > -	 * 12GB/s is maximum BW supported by single DBuf slice.
> > > > > -	 *
> > > > > -	 * FIXME dbuf slice code is broken:
> > > > > -	 * - must wait for planes to stop using the slice
> > > > > before
> > > > > powering it off
> > > > > -	 * - plane straddling both slices is illegal in multi-
> > > > > pipe
> > > > > scenarios
> > > > > -	 * - should validate we stay within the hw bandwidth
> > > > > limits
> > > > > -	 */
> > > > > -	if (0 && (num_active > 1 || total_data_bw >= GBps(12)))
> > > > > {
> > > > > -		ddb->enabled_slices = 2;
> > > > > -	} else {
> > > > > -		ddb->enabled_slices = 1;
> > > > > -		ddb_size /= 2;
> > > > > -	}
> > > > > +/*
> > > > > + * Calculate initial DBuf slice offset, based on slice size
> > > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > > enabled
> > > > > + * offset would be 1024)
> > > > > + */
> > > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > > dbuf_slice_mask,
> > > > > +					   u32 slice_size, u32
> > > > > ddb_size)
> > > > > +{
> > > > > +	u32 offset = 0;
> > > > >  
> > > > > -	return ddb_size;
> > > > > +	if (!dbuf_slice_mask)
> > > > > +		return 0;
> > > > > +
> > > > > +	offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > > +
> > > > > +	WARN_ON(offset >= ddb_size);
> > > > > +	return offset;
> > > > >  }
> > > > >  
> > > > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +				      int pipe, u32
> > > > > active_pipes,
> > > > > +				      const struct
> > > > > intel_crtc_state
> > > > > *crtc_state);
> > > > > +
> > > > >  static void
> > > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > > *dev_priv,
> > > > >  				   const struct
> > > > > intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -3857,7 +3855,14 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  	u32 pipe_width = 0, total_width = 0, width_before_pipe
> > > > > = 0;
> > > > >  	enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > > >  	u16 ddb_size;
> > > > > +	u32 ddb_range_size;
> > > > >  	u32 i;
> > > > > +	u32 dbuf_slice_mask;
> > > > > +	u32 active_pipes;
> > > > > +	u32 offset;
> > > > > +	u32 slice_size;
> > > > > +	u32 total_slice_mask;
> > > > > +	u32 start, end;
> > > > >  
> > > > >  	if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > > >  		alloc->start = 0;
> > > > > @@ -3866,14 +3871,23 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  		return;
> > > > >  	}
> > > > >  
> > > > > -	if (intel_state->active_pipe_changes)
> > > > > +	if (intel_state->active_pipe_changes) {
> > > > >  		*num_active = hweight8(intel_state-
> > > > > >active_pipes);
> > > > > -	else
> > > > > +		active_pipes = intel_state->active_pipes;
> > > > > +	} else {
> > > > >  		*num_active = hweight8(dev_priv->active_pipes);
> > > > > +		active_pipes = dev_priv->active_pipes;
> > > > > +	}
> > > > >  
> > > > >  	ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > > total_data_rate,
> > > > >  				      *num_active, ddb);
> > > > >  
> > > > > +	DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > > +
> > > > > +	slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > > >supported_slices;
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > > > +
> > > > >  	/*
> > > > >  	 * If the state doesn't change the active CRTC's or
> > > > > there is no
> > > > >  	 * modeset request, then there's no need to
> > > > > recalculate;
> > > > > @@ -3891,20 +3905,70 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  		return;
> > > > >  	}
> > > > >  
> > > > > +	/*
> > > > > +	 * Get allowed DBuf slices for correspondent pipe and
> > > > > platform.
> > > > > +	 */
> > > > > +	dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > > > for_pipe,
> > > > > +						     active_pip
> > > > > es,
> > > > > crtc_state);
> > > > 
> > > > Quite a few redundant arguments. The function name looks alien.
> > > > 
> > > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > > 
> > > > > +
> > > > > +	DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes
> > > > > %x\n",
> > > > > +		      dbuf_slice_mask,
> > > > > +		      for_pipe, active_pipes);
> > > > > +
> > > > > +	/*
> > > > > +	 * Figure out at which DBuf slice we start, i.e if we
> > > > > start at
> > > > > Dbuf S2
> > > > > +	 * and slice size is 1024, the offset would be 1024
> > > > > +	 */
> > > > > +	offset =
> > > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > > +						 slice_size,
> > > > > ddb_size);
> > > > > +
> > > > > +	/*
> > > > > +	 * Figure out total size of allowed DBuf slices, which
> > > > > is
> > > > > basically
> > > > > +	 * a number of allowed slices for that pipe multiplied
> > > > > by slice
> > > > > size.
> > > > > +	 * Inside of this
> > > > > +	 * range ddb entries are still allocated in proportion
> > > > > to
> > > > > display width.
> > > > > +	 */
> > > > > +	ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > > slice_size;
> > > > > +
> > > > >  	/*
> > > > >  	 * Watermark/ddb requirement highly depends upon width
> > > > > of the
> > > > >  	 * framebuffer, So instead of allocating DDB equally
> > > > > among
> > > > > pipes
> > > > >  	 * distribute DDB based on resolution/width of the
> > > > > display.
> > > > >  	 */
> > > > > +	total_slice_mask = dbuf_slice_mask;
> > > > 
> > > > Isn't this already taken care by the loop below?
> > > > 
> > > > >  	for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > crtc_state,
> > > > > i) {
> > > > >  		const struct drm_display_mode *adjusted_mode =
> > > > >  			&crtc_state->hw.adjusted_mode;
> > > > >  		enum pipe pipe = crtc->pipe;
> > > > >  		int hdisplay, vdisplay;
> > > > > +		u32 pipe_dbuf_slice_mask = \
> > > > > +			i915_get_allowed_dbuf_mask(dev_priv,
> > > > > +						pipe,
> > > > > +						active_pipes,
> > > > > +						crtc_state);
> > > > >  
> > > > >  		if (!crtc_state->hw.enable)
> > > > >  			continue;
> > > > >  
> > > > > +		/*
> > > > > +		 * According to BSpec pipe can share one dbuf
> > > > > slice
> > > > > with another
> > > > > +		 * pipes or pipe can use multiple dbufs, in
> > > > > both cases
> > > > > we
> > > > > +		 * account for other pipes only if they have
> > > > > exactly
> > > > > same mask.
> > > > > +		 * However we need to account how many slices
> > > > > we should
> > > > > enable
> > > > > +		 * in total.
> > > > > +		 */
> > > > > +		total_slice_mask |= pipe_dbuf_slice_mask;
> > > > 
> > > > total_slice_mask will now account only the crtcs in the state.
> > > > What
> > > > happens to the other pipes' slices?
> 
> Same as happens now btw. Right now it also does the same thing.

Right now it does absolutely nothing. It's all dead code.

> 
> > > 
> > > Yes we need to iterate through all pipes/crtcs here, however
> > > for getting correct dbuf slice mask most important is active_pipes.
> > > However yes, I will change the loop for it to be more correct.
> > > 
> > > > 
> > > > > +
> > > > > +		/*
> > > > > +		 * Do not account pipes using other slice sets
> > > > > +		 * luckily as of current BSpec slice sets do
> > > > > not
> > > > > partially
> > > > > +		 * intersect(pipes share either same one slice
> > > > > or same
> > > > > slice set
> > > > > +		 * i.e no partial intersection), so it is
> > > > > enough to
> > > > > check for
> > > > > +		 * equality for now.
> > > > > +		 */
> > > > > +		if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > > +			continue;
> > > > > +
> > > > >  		drm_mode_get_hv_timing(adjusted_mode,
> > > > > &hdisplay,
> > > > > &vdisplay);
> > > > >  		total_width += hdisplay;
> > > > >  
> > > > > @@ -3914,8 +3978,19 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >  			pipe_width = hdisplay;
> > > > >  	}
> > > > >  
> > > > > -	alloc->start = ddb_size * width_before_pipe /
> > > > > total_width;
> > > > > -	alloc->end = ddb_size * (width_before_pipe +
> > > > > pipe_width) /
> > > > > total_width;
> > > > > +	ddb->enabled_slices = total_slice_mask;
> > > > > +
> > > > > +	start = ddb_range_size * width_before_pipe /
> > > > > total_width;
> > > > > +	end = ddb_range_size * (width_before_pipe + pipe_width)
> > > > > /
> > > > > total_width;
> > > > > +
> > > > > +	alloc->start = offset + start;
> > > > > +	alloc->end = offset + end;
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > > +		      alloc->start, alloc->end);
> > > > > +	DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported
> > > > > %d\n",
> > > > > +		      ddb->enabled_slices,
> > > > > +		      INTEL_INFO(dev_priv)->supported_slices);
> > > > >  }
> > > > >  
> > > > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > > intel_crtc *crtc,
> > > > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > > > >  			  struct skl_ddb_allocation *ddb /* out
> > > > > */)
> > > > >  {
> > > > > -	ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > > +	ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices(dev_priv);
> > > > > +	DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > > enabled_slices);
> > > > > 
> > > > >  }
> > > > >  
> > > > >  /*
> > > > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > > > intel_crtc_state *crtc_state,
> > > > >  	return total_data_rate;
> > > > >  }
> > > > >  
> > > > > +static uint_fixed_16_16_t
> > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	u32 src_w, src_h, dst_w, dst_h;
> > > > > +	uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > +	uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > +	const struct drm_display_mode *adjusted_mode =
> > > > > &crtc_state-
> > > > > > hw.adjusted_mode;
> > > > > 
> > > > > +
> > > > > +	src_w = crtc_state->pipe_src_w;
> > > > > +	src_h = crtc_state->pipe_src_h;
> > > > > +	dst_w = adjusted_mode->crtc_hdisplay;
> > > > > +	dst_h = adjusted_mode->crtc_vdisplay;
> > > > > +
> > > > > +	fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > +	fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > +	downscale_w = max_fixed16(fp_w_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +	downscale_h = max_fixed16(fp_h_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +
> > > > > +	return mul_fixed16(downscale_w, downscale_h);
> > > > > +}
> > > > > +
> > > > > +uint_fixed_16_16_t
> > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > +{
> > > > > +	struct drm_plane *plane;
> > > > > +	const struct drm_plane_state *drm_plane_state;
> > > > > +	uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > +	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > +	struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > >uapi.crtc);
> > > > > +
> > > > > +	if (!crtc_state->uapi.enable)
> > > > > +		return max_downscale;
> > > > > +
> > > > > +	drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > +		uint_fixed_16_16_t plane_downscale;
> > > > > +		const struct intel_plane_state *plane_state =
> > > > > +			to_intel_plane_state(drm_plane_state);
> > > > > +
> > > > > +		if (!intel_wm_plane_visible(crtc_state,
> > > > > plane_state))
> > > > > +			continue;
> > > > > +
> > > > > +		plane_downscale =
> > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > +
> > > > > +		DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > %d.%d\n",
> > > > > +			      to_intel_plane(plane)->id,
> > > > > +			      plane_downscale.val >> 16,
> > > > > +			      plane_downscale.val & 0xffff);
> > > > > +
> > > > > +		max_downscale = max_fixed16(plane_downscale,
> > > > > max_downscale);
> > > > > +	}
> > > > > +
> > > > > +
> > > > > +	pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > +		       crtc->pipe, pipe_downscale.val >> 16,
> > > > > +		       pipe_downscale.val & 0xffff);
> > > > > +
> > > > > +	pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > max_downscale);
> > > > > +
> > > > > +	/*
> > > > > +	 * Convert result to percentage: get the actual ratio
> > > > > instead
> > > > > of rate,
> > > > > +	 * then multiply by 100 to get percentage.
> > > > > +	 */
> > > > > +	pipe_ratio = u32_to_fixed16(100 *
> > > > > div_round_up_u32_fixed16(1,
> > > > > pipe_downscale));
> > > > > +
> > > > > +	DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > +		      pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > 0xffff);
> > > > > +
> > > > > +	return pipe_ratio;
> > > > > +}
> > > > 
> > > > Yuck. I just killed these.
> > > 
> > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > BSpec,
> > > not by me.
> > 
> > That part of the spec is pretty crazy IMO. I think we should just
> > ignore
> > it for now. Also nothing in there says we must use both slices for a
> > single pipe so ignoring doesn't even violate the spec in any way.
> 
> So you are instructing me now to ignore BSpec as "we are smarter"?
> I don't think this is properly to do it this way. Lets first change the
> BSpec, if we don't like pipe ratio here and then driver. Currently this
> is written that way and if we are going to do everything our own way
> ignoring specs, this is going to be chaos.

Like I sai you don't have to violate the spec at all. Just don't use
two slices for the same pipe and you can avoid the whole strange pipe
ratio limitation. We should file a bspec issue to get that clarified
though.

> 
> > 
> > > 
> > > > 
> > > > > +
> > > > > +struct dbuf_slice_conf_entry {
> > > > > +	u32 dbuf_mask[I915_MAX_PIPES];
> > > > > +	u32 active_pipes;
> > > > > +};
> > > > > +
> > > > > +
> > > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > > +	{ { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +	{ { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +	{ { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > > DBuf4)  \
> > > > > +	{ { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > > BIT(PIPE_B) | \
> > > > > +					  BIT(PIPE_C) |
> > > > > BIT(PIPE_D) }
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 12716
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 49255
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > > 
> > > > My eyes!
> > > > 
> > > > I have to think we should be able to reduce all that to a handful
> > > > of lines of code.
> > > 
> > > Yeah, then we are going to have a huge function with lots of weird
> > > definitions, unfortunately BSpec table has quite strange DBuf
> > > assignments like 
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > 
> > > i.e slices can get mixed in a quite various ways. There is no
> > > rational pattern in those and could be even dangerous to try to 
> > > optimize it some way, as one day it might change again in some
> > > unpredictable way.
> > > 
> > > Would you prefer to have it like
> > > 
> > > if (pipes are A and B)
> > >     program S2 to A and S1 to B
> > > if (pipes are A and C)
> > >     program S1 to A and S2 to C
> > > ...
> > > 
> > > I would prefer at least to see it in a comparable way with the
> > > table we have in BSpec, rather than having lots of weird looking
> > > if-else statements, GEN checks and so on.
> > > 
> > > I knew this table was not going to look like it is done typically
> > > here - but really my opinion that it is way better than hardcoding
> > > it into some weird algorithm, which would be hard to compare to 
> > > initial table, which is already strange enough.
> > > 
> > > If you don't like it - could you please explain why this is exactly
> > > worse than having long functions with hardcoded checks?
> > 
> > I think we should be able to encode this simply using the
> > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > is the distance of each pipe from each dbuf slice. Then all we have
> > to do is minimize the total distance. The following tables I think
> > should encode that reasonably concisely:
> > 
> > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > static const u8 icl_dbuf_distance[][] {
> > 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > };
> > 
> > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > static const u8 tgl_dbuf_distance[][] {
> > 	[PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > 	[PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > 	[PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > 	[PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > };
> 
> I think you are missing my point. I will explain why.
> My initial idea was similar, use some kind of distance
> metrics and encode everything in a smart way.
> Current BSpec table has sometimes slices swapped for no reason.
> 
> There are seem to be some hardware constraints we are not aware
> of. For example for TGL pipes A,B get S2 and S1 slices,

A/S1 + B/S2 = 1 + 4 = 5
A/S2 + B/S1 = 2 + 1 = 3

> but pipes A C get S1 and S2.

A/S1 + C/S2 = 1 + 1 = 2
A/S2 + C/S1 = 2 + 2 = 4

So it's exactly as my distance table says it should be.

> 
> if you encode all this into some kind of distance algorithm,
> those differences would be enourmously difficult to spot and
> compare with actual table. That is why I decided to do it that
> way.
> 
> Most likely yes you can solve it by using this metrics,
> but will it really be then more readable and comparable
> with reference table? 
> What if some contraints will be added again, like pipe ratio?
> 
> > 
> > /*
> >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> >  */
> > static const u8 future_dbuf_distance[][] {
> > 	[PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > 	[PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > 	[PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > 	[PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > };
> > 
> > I suppose we could make that even more minimal by just
> > directly sticking that "Pipe and DBUF ordering" language
> > into a single array but and then calculate each distance
> > from it, but that is perhaps more work, and we'd need
> > some kind of enum just for that so that we have a single
> > namespace for the pipes and dbuf slices. So i have a
> > feeling having those distances pre-computed in a table
> > like thet is maybe a better way to go.
> 
> Good luck then spoting differences with the BSpec table once it
> gets changed.

Well, I'd rather do that than decode through layers of cpp macros.
Also atm the table has no extra restrictions so it's clearly there
just because someone thought they'd write out the results of the
distance calculations. In general bspec style is somewhat annoying
where they try to write out things in the lowest level of absraction
possible (they are hardware people after all). I'd much prefer a
higher level description of the issue.

> 
> > 
> > > 
> > > I think this is pretty stupid to fix stuff which looks new or
> > > unusual
> > > just because somebody doesn't like it, without even constructive
> > > arguing.
> > > 
> > > - Stan
> > > 
> > > > 
> > > > 
> > > > > +
> > > > > +/*
> > > > > + * This function
> > > > 
> > > > "This function" is totally redundant. Ditto elsewhere.
> > > > 
> > > > > finds an entry with same enabled pipe configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > > +				     u32 active_pipes,
> > > > > +				     const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	int i;
> > > > > +	uint_fixed_16_16_t pipe_ratio;
> > > > > +
> > > > > +	/*
> > > > > +	 * Calculate pipe ratio as stated in BSpec 28692
> > > > > +	 */
> > > > > +	pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > > +
> > > > > +	for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > > +		if (icl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > {
> > > > > +			if (hweight32(active_pipes) == 1) {
> > > > > +				/*
> > > > > +				 * According to BSpec 12716: if
> > > > > ratio
> > > > > > = 88.8
> > > > > 
> > > > > +				 * always use single dbuf
> > > > > slice.
> > > > > +				 */
> > > > > +				if (pipe_ratio.val >=
> > > > > div_fixed16(888,
> > > > > 10).val)
> > > > > +					++i;
> > > > > +			}
> > > > > +			return
> > > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +		}
> > > > > +	}
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * This function finds an entry with same enabled pipe
> > > > > configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > > +				     u32 active_pipes,
> > > > > +				     const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	int i;
> > > > > +
> > > > > +	for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > > +		if (tgl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > +			return
> > > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +	}
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +				      int pipe, u32
> > > > > active_pipes,
> > > > > +				      const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +	if (IS_GEN(dev_priv, 11))
> > > > > +		return icl_get_allowed_dbuf_mask(pipe,
> > > > > +						 active_pipes,
> > > > > +						 crtc_state);
> > > > > +	else if (IS_GEN(dev_priv, 12))
> > > > > +		return tgl_get_allowed_dbuf_mask(pipe,
> > > > > +						 active_pipes,
> > > > > +						 crtc_state);
> > > > > +	/*
> > > > > +	 * For anything else just return one slice yet.
> > > > > +	 * Should be extended for other platforms.
> > > > > +	 */
> > > > > +	return DBUF_S1_BIT;
> > > > > +}
> > > > > +
> > > > >  static u64
> > > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > > *crtc_state,
> > > > >  				 u64 *plane_data_rate)
> > > > > -- 
> > > > > 2.17.1
> > > > 
> > > > 
> > 
> > 

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 20:25             ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18 20:25 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

Hi,

> > > > >
> > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +     /*
> > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > here,
> > > > > +      * as here we only remove slices, which are not needed
> > > > > anymore.
> > > > > +      */
> > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > slices_intersection);
> > > >
> > > > This is still a totally wrong place for this. We need to be sure
> > > > the
> > > > planes have switched over to the new ddb layout before we power
> > > > off
> > > > anything.
> > >
> > > I have just modified already existing line here, not added by me.
> > > The point of this patch is starting to use second DBuf slice but
> > > not fixing all kinds of issues related to DBuf code at the same
> > > time.
> >
> > You're re-enabling code I disabled on purpose. It has to be fixed
> > properly or not touched at all. In hindsight I should have probably
> > just ripped it all out to not give people bad ideas.
>
> -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
>
> I'm not reenabling it - this is code is out there right now.

> It's _dead_ code, and it's like that on purpose. You're now attempting
> to re-enable it without fixing the underlying issues.

Sounds like some sacred thing. Dead code which is not commented, neither
removed, but which you can't touch.  

> This is all again about changing all kinds of stuff at the same time.
>
> Why we can't just do things step by step? You have a point that
> this is wrong place ok, but as we see current code causes no
> regressions. Just tell me your idea where to put it and I will
> do that, if you don't have it yet - let it stay there, if CI shows
> no regressions currently.

>The same global state tricks apply here as with the cdclk/sagv stuff.
>1. enable new resources/bump up clocks
>2. reconfigure planes/etc.
>3. wait for vblanks
>4. disable unused resources/drop the clocks

It is different with SAGV, there we restrict whats needed first and relax
some limitations afterwards.
So all this was simply about that I should put this function call into
skl_commit_modeset_disables? 
I could just have said it straight away to me, without mentioned "dead" code
and other stuff..

> > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > +}
> > > >
> > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > good
> > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > >
> > > Supported slices really means how much slices we support in
> > > hardware.
> > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > I don't like this name.
> >
> > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > ambiguous. Also we already have num_pipes, num_sprites, etc.
>
 ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
 really that important and worth arguing at the very last moment?
 whether number of
 supported_slices_in_hardware, should be named "supported slices" or
 "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
 discussing this "soooo important" thing.

 Some one might come up that it is called "num_slices" or
 "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
 keep everyones bloated egos happy??

> > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -     udelay(10);
> > > > > -
> > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > -     else
> > > > > -             /*
> > > > > -              * FIXME: for now pretend that we only have 1
> > > > > slice,
> > > > > see
> > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > -              */
> > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +     /*
> > > > > +      * Just power up 1 slice, we will
> > > > > +      * figure out later which slices we have and what we
> > > > > need.
> > > > > +      */
> > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > >
> > > > That stuff really shouldn't be modified by low level functions.
> > > > The intial value should come from readout, and after that it
> > > > should
> > > > be adjusted as part of the overall atomic state.
> > >
> > > Well, as you can see currently this is even worse - has some weird
> > > hardcoded stuff and is modified like that. Also if we are enabling
> > > the
> > > dbuf slices, shouldn't we really determine what value should be
> > > there?
> > >
> > > Also I wouldn't really always trust the readout, as to me it seems
> > > icelake resets it constantly to 2 after suspend/resume.
> >
> > The point of readout is that we syncrhonize our software state with
> > the
> > current state of hardware without having to modify the hardware
> > state.
> > Once synchronized the driver takes over and tells the hw what to do.
>
> That is especially nice after you say "report error and don't care"
> when I really want to reflect hw state in sw state, by checking whether
> we could set dbuf slices or not.
> Now you say that we need to synchronize software state with hw state,
> however if I fail to modify it - then "who cares".

>Yes. If you fail to modify the hardware state the whole thing is
>broken anyway. There's is *nothing* we can do at this stage in the
>commit to fix it.

Ok, but why we can't at least keep the last successful state in sw
instead of something which is completely wrong. How does that harm?

> > > > >
> > > > >  struct skl_ddb_allocation {
> > > >
> > > > This struct is quite pointless. Should be killed off.
> > >
> > > Legacy stuff, not added by me. Quite fine with removing it -
> > > but should we really do all this at the same time and in same
> > > patch?
> >
> > Not the same patch. But clean up the cobwebs first, then add the new
> > stuff. Much easier that way around.
>
> There is a shit ton load of cobwebs to clean. Personally my opinion is
> that current dbuf controlling code is total garbage - which has lots of
> hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> disabling/enabling dbuf properly....
> Where you were, when this was committed then?

>Looking at something else unfortunately. Also it has just kept
>bitrotting since because no one has cared.

That is what exactly surprises me. While we have issues
like commit serialization, sagv, mst, half of display buffer only
enabled, bw, and we are arguing about how supported_slices
should be named. Or that I shouldn't refer to DBuf slices as bits
but define as values and then decode those as bits. Priorities.

>
> If we need to fix it - fine. But may be we should do it in stages?
> Point of this patch is "enable second dbuf slice", next patch "remove
> unneeded structs" and so on.
> While you want me to shuffle everything at the same time, which is not
> good idea, especially considering complexity and amount of legacy stuff
> we currently have.

>Building new stuff on top of a decomposing garbage pile is not
>a good idea: a) it's harder than building on clean foundation,
>b) it'll probably fall over and get buried in the old garbage
>at which point you're likely going to be right back where you
>started
.
Sure, let's change everything at the same time and then wonder
what had failed exactly in that huge series with lots of unrelated changes. 
Standard practice.

> > > >
> > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > crtc_state,
> > > > > i) {
> > > > >               const struct drm_display_mode *adjusted_mode =
> > > > >                       &crtc_state->hw.adjusted_mode;
> > > > >               enum pipe pipe = crtc->pipe;
> > > > >               int hdisplay, vdisplay;
> > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > +                                             pipe,
> > > > > +                                             active_pipes,
> > > > > +                                             crtc_state);
> > > > >
> > > > >               if (!crtc_state->hw.enable)
> > > > >                       continue;
> > > > >
> > > > > +             /*
> > > > > +              * According to BSpec pipe can share one dbuf
> > > > > slice
> > > > > with another
> > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > both cases
> > > > > we
> > > > > +              * account for other pipes only if they have
> > > > > exactly
> > > > > same mask.
> > > > > +              * However we need to account how many slices
> > > > > we should
> > > > > enable
> > > > > +              * in total.
> > > > > +              */
> > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > >
> > > > total_slice_mask will now account only the crtcs in the state.
> > > > What
> > > > happens to the other pipes' slices?
>
> Same as happens now btw. Right now it also does the same thing.

>Right now it does absolutely nothing. It's all dead code.

Right now, this "dead" code exactly iterates through crtcs in state and
allocates DBuf S1 in proportion to pipe width.
What I do is make it do the same thing but using either range of one
or two slices, depending on pipe configuration.

So should I have thrown away that code and rewritten this as well?

> > > > > +static uint_fixed_16_16_t
> > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > &crtc_state-
> > > > > > hw.adjusted_mode;
> > > > >
> > > > > +
> > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > +
> > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +
> > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > +}
> > > > > +
> > > > > +uint_fixed_16_16_t
> > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > +{
> > > > > +     struct drm_plane *plane;
> > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > >uapi.crtc);
> > > > > +
> > > > > +     if (!crtc_state->uapi.enable)
> > > > > +             return max_downscale;
> > > > > +
> > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > +             const struct intel_plane_state *plane_state =
> > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > +
> > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > plane_state))
> > > > > +                     continue;
> > > > > +
> > > > > +             plane_downscale =
> > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > +
> > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > %d.%d\n",
> > > > > +                           to_intel_plane(plane)->id,
> > > > > +                           plane_downscale.val >> 16,
> > > > > +                           plane_downscale.val & 0xffff);
> > > > > +
> > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > max_downscale);
> > > > > +     }
> > > > > +
> > > > > +
> > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > +                    pipe_downscale.val & 0xffff);
> > > > > +
> > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > max_downscale);
> > > > > +
> > > > > +     /*
> > > > > +      * Convert result to percentage: get the actual ratio
> > > > > instead
> > > > > of rate,
> > > > > +      * then multiply by 100 to get percentage.
> > > > > +      */
> > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > div_round_up_u32_fixed16(1,
> > > > > pipe_downscale));
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > 0xffff);
> > > > > +
> > > > > +     return pipe_ratio;
> > > > > +}
> > > >
> > > > Yuck. I just killed these.
> > >
> > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > BSpec,
> > > not by me.
> >
> > That part of the spec is pretty crazy IMO. I think we should just
> > ignore
> > it for now. Also nothing in there says we must use both slices for a
> > single pipe so ignoring doesn't even violate the spec in any way.
>
> So you are instructing me now to ignore BSpec as "we are smarter"?
> I don't think this is properly to do it this way. Lets first change the
> BSpec, if we don't like pipe ratio here and then driver. Currently this
> is written that way and if we are going to do everything our own way
> ignoring specs, this is going to be chaos.

>Like I sai you don't have to violate the spec at all. Just don't use
>two slices for the same pipe and you can avoid the whole strange pipe
>ratio limitation. We should file a bspec issue to get that clarified
>though.

That is actually a violation of BSpec, we are not using two slices for same
pipe while we could. We had already enough of bw issues, why should
we make our life even harder?

> > > > > +/*
> > > > > + * Table taken from Bspec 49255
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > >
> > > > My eyes!
> > > >
> > > > I have to think we should be able to reduce all that to a handful
> > > > of lines of code.
> > >
> > > Yeah, then we are going to have a huge function with lots of weird
> > > definitions, unfortunately BSpec table has quite strange DBuf
> > > assignments like
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > >
> > > i.e slices can get mixed in a quite various ways. There is no
> > > rational pattern in those and could be even dangerous to try to
> > > optimize it some way, as one day it might change again in some
> > > unpredictable way.
> > >
> > > Would you prefer to have it like
> > >
> > > if (pipes are A and B)
> > >     program S2 to A and S1 to B
> > > if (pipes are A and C)
> > >     program S1 to A and S2 to C
> > > ...
> > >
> > > I would prefer at least to see it in a comparable way with the
> > > table we have in BSpec, rather than having lots of weird looking
> > > if-else statements, GEN checks and so on.
> > >
> > > I knew this table was not going to look like it is done typically
> > > here - but really my opinion that it is way better than hardcoding
> > > it into some weird algorithm, which would be hard to compare to
> > > initial table, which is already strange enough.
> > >
> > > If you don't like it - could you please explain why this is exactly
> > > worse than having long functions with hardcoded checks?
> >
> > I think we should be able to encode this simply using the
> > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > is the distance of each pipe from each dbuf slice. Then all we have
> > to do is minimize the total distance. The following tables I think
> > should encode that reasonably concisely:
> >
> > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > static const u8 icl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > };
> >
> > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > static const u8 tgl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > };
>
> I think you are missing my point. I will explain why.
> My initial idea was similar, use some kind of distance
> metrics and encode everything in a smart way.
> Current BSpec table has sometimes slices swapped for no reason.
>
> There are seem to be some hardware constraints we are not aware
> of. For example for TGL pipes A,B get S2 and S1 slices,

>A/S1 + B/S2 = 1 + 4 = 5
>A/S2 + B/S1 = 2 + 1 = 3

> but pipes A C get S1 and S2.

>A/S1 + C/S2 = 1 + 1 = 2
>A/S2 + C/S1 = 2 + 2 = 4

>So it's exactly as my distance table says it should be.

>
> if you encode all this into some kind of distance algorithm,
> those differences would be enourmously difficult to spot and
> compare with actual table. That is why I decided to do it that
> way.
>
> Most likely yes you can solve it by using this metrics,
> but will it really be then more readable and comparable
> with reference table?
> What if some contraints will be added again, like pipe ratio?
>
> >
> > /*
> >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> >  */
> > static const u8 future_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > };
> >
> > I suppose we could make that even more minimal by just
> > directly sticking that "Pipe and DBUF ordering" language
> > into a single array but and then calculate each distance
> > from it, but that is perhaps more work, and we'd need
> > some kind of enum just for that so that we have a single
> > namespace for the pipes and dbuf slices. So i have a
> > feeling having those distances pre-computed in a table
> > like thet is maybe a better way to go.
>
> Good luck then spoting differences with the BSpec table once it
> gets changed.

>Well, I'd rather do that than decode through layers of cpp macros.
>Also atm the table has no extra restrictions so it's clearly there
>just because someone thought they'd write out the results of the
>distance calculations. In general bspec style is somewhat annoying
>where they try to write out things in the lowest level of absraction
>possible (they are hardware people after all). I'd much prefer a
>higher level description of the issue.

I would prefer to see immediately what is different as this table
makes it possible right away
Also to me this table simply intuitively more understandable.

> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};

We have similar CDCLK table in intel_cdclk, while you could 
also argue that we might had come up with some formula.

Could be we could make the table approach itself simpler - 
that is quite possible, but implementing a distance algorithm
seems to me to be a dangerous idea, despite that it of course
looks better to me(yes it is).

Best Regards,

Lisovskiy Stanislav

Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

________________________________________
From: Ville Syrjälä [ville.syrjala@linux.intel.com]
Sent: Monday, November 18, 2019 8:18 PM
To: Lisovskiy, Stanislav
Cc: Saarinen, Jani; intel-gfx@lists.freedesktop.org; Roper, Matthew D; maarten.lankhorst@linux.intel.com; Ausmus, James
Subject: Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL

On Mon, Nov 18, 2019 at 05:17:51PM +0000, Lisovskiy, Stanislav wrote:
> On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> > On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> > > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > > wrote:
> > > > > Also implemented algorithm for choosing DBuf slice
> > > > > configuration
> > > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > >
> > > > > Now pipe allocation still stays proportional to pipe width as
> > > > > before,
> > > > > however within allowed DBuf slice for this particular
> > > > > configuration.
> > > > >
> > > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > > might
> > > > >     now differ from hw state.
> > > > >
> > > > > v3: - Added new field "supported_slices" to ddb to separate max
> > > > >       supported slices vs currently enabled, to avoid
> > > > > confusion.
> > > > >     - Removed obsolete comments and code related to
> > > > > DBuf(Matthew
> > > > > Roper).
> > > > >     - Some code style and long line removal(Matthew Roper).
> > > > >     - Added WARN_ON to new ddb range offset calc
> > > > > function(Matthew
> > > > > Roper).
> > > > >     - Removed platform specific call to calc pipe ratio from
> > > > > ddb
> > > > >       allocation function and fixed the return value(Matthew
> > > > > Roper)
> > > > >     - Refactored DBUF slice allocation table to improve
> > > > > readability
> > > > >     - Added DBUF slice allocation for TGL as well.
> > > > >     - ICL(however not TGL) seems to voluntarily enable second
> > > > > DBuf
> > > > > slice
> > > > >       after pm suspend/resume causing a mismatch failure,
> > > > > because
> > > > > we
> > > > >       update DBuf slices only if we do a modeset, however this
> > > > > check
> > > > >       is done always. Fixed it to be done only when modeset for
> > > > > ICL.
> > > > >
> > > > > v4: - Now enabled slices is not actually a number, but a
> > > > > bitmask,
> > > > >       because we might need to enable second slice only and
> > > > > number
> > > > >       of slices would still 1 and that current code doesn't
> > > > > allow.
> > > > >     - Remove redundant duplicate code to have some unified way
> > > > > of
> > > > >       enabling dbuf slices instead of hardcoding.
> > > > >
> > > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > > > thinking
> > > > >       that we have only one DBUF_CTL slice. Now another version
> > > > > for
> > > > >       gen11 will check both slices as only second one can be
> > > > > enabled,
> > > > >       to keep CI happy.
> > > > >
> > > > > v6: - Removed enabled dbuf assertion completely. Previous code
> > > > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > > > called.
> > > > >       Currently we enable/disable dbuf slices based on
> > > > > requrement
> > > > >       so no point in asserting this here.
> > > > >     - Removed unnecessary modeset check from
> > > > > verify_wm_state(Matthew Roper)
> > > > >     - Slices intersection after union is same as final
> > > > > result(Matthew Roper)
> > > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > > >     - Some macros added(Matthew Roper)
> > > > >     - ddb range is now always less or equal to ddb size - no
> > > > > need
> > > > > for additional
> > > > >       checks(previously needed as we had some bandwidth checks
> > > > > in
> > > > > there which
> > > > >       could "suddenly" return ddb_size smaller than it is.
> > > > >
> > > > > v7: Minor costemic changes:
> > > > >     - Changed icl_dbuf_slices_restore name to
> > > > > icl_program_dbuf_slices
> > > > >       as it more corresponds to the actual functionality.
> > > > >     - Some simplification with supported slices for BXT and GLK
> > > > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > > > >       this is not used for skl anymore.
> > > > >     - Changed DBuf slice assignment order for TGL case
> > > > >
> > > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Signed-off-by: Stanislav Lisovskiy <
> > > > > stanislav.lisovskiy@intel.com>
> > > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > > ++++++++++++++++--
> > > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > index 876fc25968bf..bd7aff675198 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > > intel_crtc *crtc,
> > > > >
> > > > >       if (INTEL_GEN(dev_priv) >= 11 &&
> > > > >           hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > > -             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %u, got
> > > > > %u)\n",
> > > > > +             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %x, got
> > > > > %x)\n",
> > > > >                         sw_ddb->enabled_slices,
> > > > >                         hw->ddb.enabled_slices);
> > > > >
> > > > > @@ -14614,15 +14614,24 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >       u8 hw_enabled_slices = dev_priv-
> > > > > >wm.skl_hw.ddb.enabled_slices;
> > > > >       u8 required_slices = state-
> > > > > >wm_results.ddb.enabled_slices;
> > > > >       struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > > +     u8 slices_union = hw_enabled_slices | required_slices;
> > > > > +     u8 slices_intersection = required_slices;
> > > > >
> > > > >       for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > > old_crtc_state, new_crtc_state, i)
> > > > >               /* ignore allocations for crtc's that have been
> > > > > turned
> > > > > off. */
> > > > >               if (new_crtc_state->hw.active)
> > > > >                       entries[i] = old_crtc_state-
> > > > > >wm.skl.ddb;
> > > > >
> > > > > -     /* If 2nd DBuf slice required, enable it here */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +     DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > > +                   required_slices, hw_enabled_slices);
> > > > > +
> > > > > +     /*
> > > > > +      * If some other DBuf slice required, enable it here,
> > > > > +      * as here we only add more slices, but not remove to
> > > > > prevent
> > > > > +      * issues if somebody still uses those.
> > > > > +      */
> > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +             icl_dbuf_slices_update(dev_priv, slices_union);
> > > > >
> > > > >       /*
> > > > >        * Whenever the number of active pipes changes, we need
> > > > > to make
> > > > > sure we
> > > > > @@ -14681,9 +14690,12 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >               }
> > > > >       } while (progress);
> > > > >
> > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +     /*
> > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > here,
> > > > > +      * as here we only remove slices, which are not needed
> > > > > anymore.
> > > > > +      */
> > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > slices_intersection);
> > > >
> > > > This is still a totally wrong place for this. We need to be sure
> > > > the
> > > > planes have switched over to the new ddb layout before we power
> > > > off
> > > > anything.
> > >
> > > I have just modified already existing line here, not added by me.
> > > The point of this patch is starting to use second DBuf slice but
> > > not fixing all kinds of issues related to DBuf code at the same
> > > time.
> >
> > You're re-enabling code I disabled on purpose. It has to be fixed
> > properly or not touched at all. In hindsight I should have probably
> > just ripped it all out to not give people bad ideas.
>
> -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
>
> I'm not reenabling it - this is code is out there right now.

It's _dead_ code, and it's like that on purpose. You're now attempting
to re-enable it without fixing the underlying issues.

> This is all again about changing all kinds of stuff at the same time.
>
> Why we can't just do things step by step? You have a point that
> this is wrong place ok, but as we see current code causes no
> regressions. Just tell me your idea where to put it and I will
> do that, if you don't have it yet - let it stay there, if CI shows
> no regressions currently.

The same global state tricks apply here as with the cdclk/sagv stuff.
1. enable new resources/bump up clocks
2. reconfigure planes/etc.
3. wait for vblanks
4. disable unused resources/drop the clocks

>
> >
> > >
> > > Of course this is nice idea to fix it - if that is wrong plane,
> > > but current code somehow lives with it and probably this should be
> > > with some other patch.
> > >
> > > >
> > > > >  }
> > > > >
> > > > >  static void intel_atomic_helper_free_state(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > @@ -1031,15 +1031,6 @@ static bool
> > > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > > *dev_priv,
> > > > >               (I915_READ(DC_STATE_EN) &
> > > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > > >  }
> > > > >
> > > > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > > > *dev_priv)
> > > > > -{
> > > > > -     u32 tmp = I915_READ(DBUF_CTL);
> > > > > -
> > > > > -     WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > > > -          (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > > -          "Unexpected DBuf power power state (0x%08x)\n",
> > > > > tmp);
> > > > > -}
> > > > > -
> > > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > >       struct intel_cdclk_state cdclk_state = {};
> > > > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > > > drm_i915_private *dev_priv)
> > > > >       /* Can't read out voltage_level so can't use
> > > > > intel_cdclk_changed() */
> > > > >       WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > > > &cdclk_state));
> > > > >
> > > > > -     gen9_assert_dbuf_enabled(dev_priv);
> > > > > -
> > > > >       if (IS_GEN9_LP(dev_priv))
> > > > >               bxt_verify_ddi_phy_power_wells(dev_priv);
> > > > >
> > > > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > > > drm_i915_private *dev_priv)
> > > > >       intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > > >  }
> > > > >
> > > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -     if (INTEL_GEN(dev_priv) < 11)
> > > > > -             return 1;
> > > > > -     return 2;
> > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > +}
> > > >
> > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > good
> > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > >
> > > Supported slices really means how much slices we support in
> > > hardware.
> > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > I don't like this name.
> >
> > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > ambiguous. Also we already have num_pipes, num_sprites, etc.
>
> ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
> really that important and worth arguing at the very last moment?
> whether number of
> supported_slices_in_hardware, should be named "supported slices" or
> "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
> discussing this "soooo important" thing.
>
> Some one might come up that it is called "num_slices" or
> "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
> keep everyones bloated egos happy??
>
>
> >
> > >
> > > >
> > > > > +
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +{
> > > > > +     const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > >
> > > > Seems a rather pointless variable. const at least is pointless.
> > > >
> > > > > +
> > > > > +     icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > > >  }
> > > > >
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >                           u8 req_slices)
> > > > >  {
> > > > > -     const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > >
> > > > > -     bool ret;
> > > > > +     bool ret = true;
> > > > > +     int i;
> > > > > +     int max_slices = intel_dbuf_max_slices(dev_priv);
> > > > >
> > > > > -     if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > > +     if (hweight8(req_slices) >
> > > > > intel_dbuf_max_slices(dev_priv)) {
> > > > >               DRM_ERROR("Invalid number of dbuf slices
> > > > > requested\n");
> > > > >               return;
> > > > >       }
> > > > >
> > > > > -     if (req_slices == hw_enabled_slices || req_slices == 0)
> > > > > -             return;
> > > > > +     DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > > req_slices);
> > > > >
> > > > > -     if (req_slices > hw_enabled_slices)
> > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > true);
> > > > > -     else
> > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > false);
> > > > > +     for (i = 0; i < max_slices; i++) {
> > > > > +             int slice_bit = BIT(i);
> > > > > +             bool slice_set = (slice_bit & req_slices) != 0;
> > > > > +
> > > > > +             switch (slice_bit) {
> > > > > +             case DBUF_S1_BIT:
> > > > > +                     ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +                                                       DBUF_
> > > > > CTL_S1,
> > > > > +                                                       slice
> > > > > _set);
> > > > > +                     break;
> > > > > +             case DBUF_S2_BIT:
> > > > > +                     ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +                                                       DBUF_
> > > > > CTL_S2,
> > > > > +                                                       slice
> > > > > _set);
> > > >
> > > > The return value on intel_dbuf_slice_set() seems pointless. I'd
> > > > nuke
> > > > it.
> > > >
> > > > Also as long you have intel_dbuf_slice_set() I'd probably just
> > > > pass
> > > > the
> > > > slice index there and let it deal with the register details.
> > > > DBUF_CTL
> > > > should probably be parametrized as well so you don't need
> > > > annoying
> > > > switch
> > > > statements.
> > >
> > > What if it fails? Shouldn't we still have some clue what state do
> > > we
> > > have/had at the moment?
> >
> > It if fails you log an error and keep going.
>
> Ah sure. And let hw/sw state be completely different.
> Also see next comment about this.
>
> >
> > >
> > > >
> > > > > +                     break;
> > > > > +             default:
> > > > > +                     MISSING_CASE(slice_bit);
> > > > > +             }
> > > > > +     }
> > > > >
> > > > >       if (ret)
> > > > >               dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > > req_slices;
> > > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > > drm_i915_private *dev_priv,
> > > > >
> > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -     udelay(10);
> > > > > -
> > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > -     else
> > > > > -             /*
> > > > > -              * FIXME: for now pretend that we only have 1
> > > > > slice,
> > > > > see
> > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > -              */
> > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +     /*
> > > > > +      * Just power up 1 slice, we will
> > > > > +      * figure out later which slices we have and what we
> > > > > need.
> > > > > +      */
> > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > >
> > > > That stuff really shouldn't be modified by low level functions.
> > > > The intial value should come from readout, and after that it
> > > > should
> > > > be adjusted as part of the overall atomic state.
> > >
> > > Well, as you can see currently this is even worse - has some weird
> > > hardcoded stuff and is modified like that. Also if we are enabling
> > > the
> > > dbuf slices, shouldn't we really determine what value should be
> > > there?
> > >
> > > Also I wouldn't really always trust the readout, as to me it seems
> > > icelake resets it constantly to 2 after suspend/resume.
> >
> > The point of readout is that we syncrhonize our software state with
> > the
> > current state of hardware without having to modify the hardware
> > state.
> > Once synchronized the driver takes over and tells the hw what to do.
>
> That is especially nice after you say "report error and don't care"
> when I really want to reflect hw state in sw state, by checking whether
> we could set dbuf slices or not.
> Now you say that we need to synchronize software state with hw state,
> however if I fail to modify it - then "who cares".

Yes. If you fail to modify the hardware state the whole thing is
broken anyway. There's is *nothing* we can do at this stage in the
commit to fix it.

>
>
> >
> > >
> > > >
> > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >
> > > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -     udelay(10);
> > > > > -
> > > > > -     if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -         (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -             DRM_ERROR("DBuf power disable timeout!\n");
> > > > > -     else
> > > > > -             /*
> > > > > -              * FIXME: for now pretend that the first slice
> > > > > is
> > > > > always
> > > > > -              * enabled, see
> > > > > intel_enabled_dbuf_slices_num().
> > > > > -              */
> > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +     /*
> > > > > +      * Disable all slices
> > > > > +      */
> > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >
> > > > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > > drm_i915_private *i915,
> > > > >       for ((wf) = intel_display_power_get((i915), (domain));
> > > > > (wf); \
> > > > >            intel_display_power_put_async((i915), (domain),
> > > > > (wf)),
> > > > > (wf) = 0)
> > > > >
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >                           u8 req_slices);
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv);
> > > > >
> > > > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > > > >                            bool override, unsigned int mask);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -51,6 +51,7 @@
> > > > >  #include "display/intel_bw.h"
> > > > >  #include "display/intel_cdclk.h"
> > > > >  #include "display/intel_display_types.h"
> > > > > +#include "display/intel_display_power.h"
> > > > >  #include "display/intel_dp.h"
> > > > >  #include "display/intel_fbdev.h"
> > > > >  #include "display/intel_hotplug.h"
> > > > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > > > device *kdev)
> > > > >       if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > > > >               ret = vlv_resume_prepare(dev_priv, true);
> > > > >
> > > > > +     /* Weird hack to fix ICL hardware bug, as it resets
> > > > > DBUF slices
> > > > > reg */
> > > > > +     if (INTEL_GEN(dev_priv) == 11)
> > > > > +             icl_program_dbuf_slices(dev_priv);
> > > > > +
> > > > >       intel_uncore_runtime_resume(&dev_priv->uncore);
> > > > >
> > > > >       intel_runtime_pm_enable_interrupts(dev_priv);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -804,7 +804,7 @@ static inline bool
> > > > > skl_ddb_entry_equal(const
> > > > > struct skl_ddb_entry *e1,
> > > > >  }
> > > > >
> > > > >  struct skl_ddb_allocation {
> > > >
> > > > This struct is quite pointless. Should be killed off.
> > >
> > > Legacy stuff, not added by me. Quite fine with removing it -
> > > but should we really do all this at the same time and in same
> > > patch?
> >
> > Not the same patch. But clean up the cobwebs first, then add the new
> > stuff. Much easier that way around.
>
> There is a shit ton load of cobwebs to clean. Personally my opinion is
> that current dbuf controlling code is total garbage - which has lots of
> hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> disabling/enabling dbuf properly....
> Where you were, when this was committed then?

Looking at something else unfortunately. Also it has just kept
bitrotting since because no one has cared.

>
> If we need to fix it - fine. But may be we should do it in stages?
> Point of this patch is "enable second dbuf slice", next patch "remove
> unneeded structs" and so on.
> While you want me to shuffle everything at the same time, which is not
> good idea, especially considering complexity and amount of legacy stuff
> we currently have.

Building new stuff on top of a decomposing garbage pile is not
a good idea: a) it's harder than building on clean foundation,
b) it'll probably fall over and get buried in the old garbage
at which point you're likely going to be right back where you
started.

>
> >
> > >
> > > >
> > > > > -     u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > > */
> > > > > +     u8 enabled_slices;   /* Bitmask of currently enabled
> > > > > DBuf
> > > > > slices */
> > > > >  };
> > > > >
> > > > >  struct skl_ddb_values {
> > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > > index 1bb701d32a5d..8e86af505730 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > > intel_cherryview_info = {
> > > > >       .has_gt_uc = 1, \
> > > > >       .display.has_hdcp = 1, \
> > > > >       .display.has_ipc = 1, \
> > > > > -     .ddb_size = 896
> > > > > +     .ddb_size = 896, \
> > > > > +     .supported_slices = 1
> > > > >
> > > > >  #define SKL_PLATFORM \
> > > > >       GEN9_FEATURES, \
> > > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > > intel_skylake_gt4_info = {
> > > > >       GEN(9), \
> > > > >       .is_lp = 1, \
> > > > >       .display.has_hotplug = 1, \
> > > > > +     .supported_slices = 1, \
> > > > >       .engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > > BIT(VECS0),
> > > > > \
> > > > >       .pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > > > >       .has_64bit_reloc = 1, \
> > > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > > intel_coffeelake_gt3_info = {
> > > > >       GEN9_FEATURES, \
> > > > >       GEN(10), \
> > > > >       .ddb_size = 1024, \
> > > > > +     .supported_slices = 1, \
> > > > >       .display.has_dsc = 1, \
> > > > >       .has_coherent_ggtt = false, \
> > > > >       GLK_COLORS
> > > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > > intel_cannonlake_info = {
> > > > >       }, \
> > > > >       GEN(11), \
> > > > >       .ddb_size = 2048, \
> > > > > +     .supported_slices = 2, \
> > > > >       .has_logical_ring_elsq = 1, \
> > > > >       .color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > > 262145 }
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > > index a607ea520829..fba5731063d8 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > @@ -7734,9 +7734,11 @@ enum {
> > > > >  #define DISP_ARB_CTL2        _MMIO(0x45004)
> > > > >  #define  DISP_DATA_PARTITION_5_6     (1 << 6)
> > > > >  #define  DISP_IPC_ENABLE             (1 << 3)
> > > > > -#define DBUF_CTL     _MMIO(0x45008)
> > > > > -#define DBUF_CTL_S1  _MMIO(0x45008)
> > > > > -#define DBUF_CTL_S2  _MMIO(0x44FE8)
> > > > > +#define DBUF_S1_BIT                  BIT(0)
> > > > > +#define DBUF_S2_BIT                  BIT(1)
> > > >
> > > > These are not bits inside DBUF_CTL so they should not be here.
> > > > I'd just nuke them entirely. If you think we need more than an
> > > > int to refer to a slice then IMO something simple like
> > > >
> > > > enum dbuf_slice {
> > > >         DBUF_S1,
> > > >         DBUF_S2,
> > > > };
> > > >
> > > > would match the general apporach we use for other things much
> > > > better
> > > > rather than defining some shifted values.
> > >
> > > We need to combine slices - so bitmask is quite suitable for that,
> > > I can of course use this enum, but then I'll have to re-encode it
> > > to bitmask anyways.
> >
> > Which is pretty much how everything else works in the driver.
>
> Another piece of incredibly important thing to argue..

It is. Lot's of crap in the driver is simply caused by people not
following established style and coming up with something different..
Which means there is an extra cost for everyone else when reading the
code. And that probably means most people just don't bother to read it.

Granted, some of that is due to spurts in evolution where major overhauls
changed the architecture so significantly that the old established
conventions simply made node sense anymore. Atomic being the last big
one in that regard. And we're still cleaning up the resulting mess.

>
> >
> > >
> > >
> > > >
> > > > > +#define DBUF_CTL                     _MMIO(0x45008)
> > > > > +#define DBUF_CTL_S1                  _MMIO(0x45008)
> > > > > +#define DBUF_CTL_S2                  _MMIO(0x44FE8)
> > > > >  #define  DBUF_POWER_REQUEST          (1 << 31)
> > > > >  #define  DBUF_POWER_STATE            (1 << 30)
> > > > >  #define GEN7_MSG_CTL _MMIO(0x45010)
> > > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > > >       } display;
> > > > >
> > > > >       u16 ddb_size; /* in blocks */
> > > > > +     u8 supported_slices; /* number of DBuf slices */
> > > > >
> > > > >       /* Register offsets for the various display pipes and
> > > > > transcoders */
> > > > >       int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > index 2d389e437e87..1ea81ab92429 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > > drm_device
> > > > > *dev)
> > > > >       return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > > >  }
> > > > >
> > > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -     u8 enabled_slices;
> > > > > -
> > > > > -     /* Slice 1 will always be enabled */
> > > > > -     enabled_slices = 1;
> > > > > +     u8 enabled_slices = 0;
> > > > >
> > > > >       /* Gen prior to GEN11 have only one DBuf slice */
> > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > -             return enabled_slices;
> > > > > +             return DBUF_S1_BIT;
> > > >
> > > > I'd nuke this special case entirely and just read everything from
> > > > the
> > > > hw. Assuming we need readout somewhere.
> > >
> > > We have it that way hardcoded right now in drm-tip. I think this is
> > > a
> > > subject to other patch, but not this change. As I understand we
> > > need
> > > to first figure out how to do it properly.
> > > This patch anyway makes it better by at least removing hardcoded
> > > relation between number of slices and which slices are enabled.
> > > With current code there is no way for example to enable second DBuf
> > > slice only, which is required by BSpec.
> > >
> > > >
> > > > >
> > > > > -     /*
> > > > > -      * FIXME: for now we'll only ever use 1 slice; pretend
> > > > > that we
> > > > > have
> > > > > -      * only that 1 slice enabled until we have a proper way
> > > > > for on-
> > > > > demand
> > > > > -      * toggling of the second slice.
> > > > > -      */
> > > > > -     if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > -             enabled_slices++;
> > > > > +     /* Check if second DBuf slice is enabled */
> > > > > +     if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > > +             enabled_slices |= DBUF_S1_BIT;
> > > > > +
> > > > > +     if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > +             enabled_slices |= DBUF_S2_BIT;
> > > > >
> > > > >       return enabled_slices;
> > > > >  }
> > > > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > > > drm_i915_private *dev_priv,
> > > > >                             const int num_active,
> > > > >                             struct skl_ddb_allocation *ddb)
> > > > >  {
> > > > > -     const struct drm_display_mode *adjusted_mode;
> > > > > -     u64 total_data_bw;
> > > > >       u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > > -
> > > > >       WARN_ON(ddb_size == 0);
> > > > >
> > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > >               return ddb_size - 4; /* 4 blocks for bypass
> > > > > path
> > > > > allocation */
> > > > >
> > > > > -     adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > > -     total_data_bw = total_data_rate *
> > > > > drm_mode_vrefresh(adjusted_mode);
> > > > > +     return ddb_size;
> > > > > +}
> > > > >
> > > > > -     /*
> > > > > -      * 12GB/s is maximum BW supported by single DBuf slice.
> > > > > -      *
> > > > > -      * FIXME dbuf slice code is broken:
> > > > > -      * - must wait for planes to stop using the slice
> > > > > before
> > > > > powering it off
> > > > > -      * - plane straddling both slices is illegal in multi-
> > > > > pipe
> > > > > scenarios
> > > > > -      * - should validate we stay within the hw bandwidth
> > > > > limits
> > > > > -      */
> > > > > -     if (0 && (num_active > 1 || total_data_bw >= GBps(12)))
> > > > > {
> > > > > -             ddb->enabled_slices = 2;
> > > > > -     } else {
> > > > > -             ddb->enabled_slices = 1;
> > > > > -             ddb_size /= 2;
> > > > > -     }
> > > > > +/*
> > > > > + * Calculate initial DBuf slice offset, based on slice size
> > > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > > enabled
> > > > > + * offset would be 1024)
> > > > > + */
> > > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > > dbuf_slice_mask,
> > > > > +                                        u32 slice_size, u32
> > > > > ddb_size)
> > > > > +{
> > > > > +     u32 offset = 0;
> > > > >
> > > > > -     return ddb_size;
> > > > > +     if (!dbuf_slice_mask)
> > > > > +             return 0;
> > > > > +
> > > > > +     offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > > +
> > > > > +     WARN_ON(offset >= ddb_size);
> > > > > +     return offset;
> > > > >  }
> > > > >
> > > > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +                                   int pipe, u32
> > > > > active_pipes,
> > > > > +                                   const struct
> > > > > intel_crtc_state
> > > > > *crtc_state);
> > > > > +
> > > > >  static void
> > > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > > *dev_priv,
> > > > >                                  const struct
> > > > > intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -3857,7 +3855,14 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >       u32 pipe_width = 0, total_width = 0, width_before_pipe
> > > > > = 0;
> > > > >       enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > > >       u16 ddb_size;
> > > > > +     u32 ddb_range_size;
> > > > >       u32 i;
> > > > > +     u32 dbuf_slice_mask;
> > > > > +     u32 active_pipes;
> > > > > +     u32 offset;
> > > > > +     u32 slice_size;
> > > > > +     u32 total_slice_mask;
> > > > > +     u32 start, end;
> > > > >
> > > > >       if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > > >               alloc->start = 0;
> > > > > @@ -3866,14 +3871,23 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >               return;
> > > > >       }
> > > > >
> > > > > -     if (intel_state->active_pipe_changes)
> > > > > +     if (intel_state->active_pipe_changes) {
> > > > >               *num_active = hweight8(intel_state-
> > > > > >active_pipes);
> > > > > -     else
> > > > > +             active_pipes = intel_state->active_pipes;
> > > > > +     } else {
> > > > >               *num_active = hweight8(dev_priv->active_pipes);
> > > > > +             active_pipes = dev_priv->active_pipes;
> > > > > +     }
> > > > >
> > > > >       ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > > total_data_rate,
> > > > >                                     *num_active, ddb);
> > > > >
> > > > > +     DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > > +
> > > > > +     slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > > >supported_slices;
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > > > +
> > > > >       /*
> > > > >        * If the state doesn't change the active CRTC's or
> > > > > there is no
> > > > >        * modeset request, then there's no need to
> > > > > recalculate;
> > > > > @@ -3891,20 +3905,70 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >               return;
> > > > >       }
> > > > >
> > > > > +     /*
> > > > > +      * Get allowed DBuf slices for correspondent pipe and
> > > > > platform.
> > > > > +      */
> > > > > +     dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > > > for_pipe,
> > > > > +                                                  active_pip
> > > > > es,
> > > > > crtc_state);
> > > >
> > > > Quite a few redundant arguments. The function name looks alien.
> > > >
> > > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > >
> > > > > +
> > > > > +     DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes
> > > > > %x\n",
> > > > > +                   dbuf_slice_mask,
> > > > > +                   for_pipe, active_pipes);
> > > > > +
> > > > > +     /*
> > > > > +      * Figure out at which DBuf slice we start, i.e if we
> > > > > start at
> > > > > Dbuf S2
> > > > > +      * and slice size is 1024, the offset would be 1024
> > > > > +      */
> > > > > +     offset =
> > > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > > +                                              slice_size,
> > > > > ddb_size);
> > > > > +
> > > > > +     /*
> > > > > +      * Figure out total size of allowed DBuf slices, which
> > > > > is
> > > > > basically
> > > > > +      * a number of allowed slices for that pipe multiplied
> > > > > by slice
> > > > > size.
> > > > > +      * Inside of this
> > > > > +      * range ddb entries are still allocated in proportion
> > > > > to
> > > > > display width.
> > > > > +      */
> > > > > +     ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > > slice_size;
> > > > > +
> > > > >       /*
> > > > >        * Watermark/ddb requirement highly depends upon width
> > > > > of the
> > > > >        * framebuffer, So instead of allocating DDB equally
> > > > > among
> > > > > pipes
> > > > >        * distribute DDB based on resolution/width of the
> > > > > display.
> > > > >        */
> > > > > +     total_slice_mask = dbuf_slice_mask;
> > > >
> > > > Isn't this already taken care by the loop below?
> > > >
> > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > crtc_state,
> > > > > i) {
> > > > >               const struct drm_display_mode *adjusted_mode =
> > > > >                       &crtc_state->hw.adjusted_mode;
> > > > >               enum pipe pipe = crtc->pipe;
> > > > >               int hdisplay, vdisplay;
> > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > +                                             pipe,
> > > > > +                                             active_pipes,
> > > > > +                                             crtc_state);
> > > > >
> > > > >               if (!crtc_state->hw.enable)
> > > > >                       continue;
> > > > >
> > > > > +             /*
> > > > > +              * According to BSpec pipe can share one dbuf
> > > > > slice
> > > > > with another
> > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > both cases
> > > > > we
> > > > > +              * account for other pipes only if they have
> > > > > exactly
> > > > > same mask.
> > > > > +              * However we need to account how many slices
> > > > > we should
> > > > > enable
> > > > > +              * in total.
> > > > > +              */
> > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > >
> > > > total_slice_mask will now account only the crtcs in the state.
> > > > What
> > > > happens to the other pipes' slices?
>
> Same as happens now btw. Right now it also does the same thing.

Right now it does absolutely nothing. It's all dead code.

>
> > >
> > > Yes we need to iterate through all pipes/crtcs here, however
> > > for getting correct dbuf slice mask most important is active_pipes.
> > > However yes, I will change the loop for it to be more correct.
> > >
> > > >
> > > > > +
> > > > > +             /*
> > > > > +              * Do not account pipes using other slice sets
> > > > > +              * luckily as of current BSpec slice sets do
> > > > > not
> > > > > partially
> > > > > +              * intersect(pipes share either same one slice
> > > > > or same
> > > > > slice set
> > > > > +              * i.e no partial intersection), so it is
> > > > > enough to
> > > > > check for
> > > > > +              * equality for now.
> > > > > +              */
> > > > > +             if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > > +                     continue;
> > > > > +
> > > > >               drm_mode_get_hv_timing(adjusted_mode,
> > > > > &hdisplay,
> > > > > &vdisplay);
> > > > >               total_width += hdisplay;
> > > > >
> > > > > @@ -3914,8 +3978,19 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >                       pipe_width = hdisplay;
> > > > >       }
> > > > >
> > > > > -     alloc->start = ddb_size * width_before_pipe /
> > > > > total_width;
> > > > > -     alloc->end = ddb_size * (width_before_pipe +
> > > > > pipe_width) /
> > > > > total_width;
> > > > > +     ddb->enabled_slices = total_slice_mask;
> > > > > +
> > > > > +     start = ddb_range_size * width_before_pipe /
> > > > > total_width;
> > > > > +     end = ddb_range_size * (width_before_pipe + pipe_width)
> > > > > /
> > > > > total_width;
> > > > > +
> > > > > +     alloc->start = offset + start;
> > > > > +     alloc->end = offset + end;
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > > +                   alloc->start, alloc->end);
> > > > > +     DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported
> > > > > %d\n",
> > > > > +                   ddb->enabled_slices,
> > > > > +                   INTEL_INFO(dev_priv)->supported_slices);
> > > > >  }
> > > > >
> > > > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > > intel_crtc *crtc,
> > > > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > > > >                         struct skl_ddb_allocation *ddb /* out
> > > > > */)
> > > > >  {
> > > > > -     ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > > +     ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices(dev_priv);
> > > > > +     DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > > enabled_slices);
> > > > >
> > > > >  }
> > > > >
> > > > >  /*
> > > > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > > > intel_crtc_state *crtc_state,
> > > > >       return total_data_rate;
> > > > >  }
> > > > >
> > > > > +static uint_fixed_16_16_t
> > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > &crtc_state-
> > > > > > hw.adjusted_mode;
> > > > >
> > > > > +
> > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > +
> > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +
> > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > +}
> > > > > +
> > > > > +uint_fixed_16_16_t
> > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > +{
> > > > > +     struct drm_plane *plane;
> > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > >uapi.crtc);
> > > > > +
> > > > > +     if (!crtc_state->uapi.enable)
> > > > > +             return max_downscale;
> > > > > +
> > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > +             const struct intel_plane_state *plane_state =
> > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > +
> > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > plane_state))
> > > > > +                     continue;
> > > > > +
> > > > > +             plane_downscale =
> > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > +
> > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > %d.%d\n",
> > > > > +                           to_intel_plane(plane)->id,
> > > > > +                           plane_downscale.val >> 16,
> > > > > +                           plane_downscale.val & 0xffff);
> > > > > +
> > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > max_downscale);
> > > > > +     }
> > > > > +
> > > > > +
> > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > +                    pipe_downscale.val & 0xffff);
> > > > > +
> > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > max_downscale);
> > > > > +
> > > > > +     /*
> > > > > +      * Convert result to percentage: get the actual ratio
> > > > > instead
> > > > > of rate,
> > > > > +      * then multiply by 100 to get percentage.
> > > > > +      */
> > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > div_round_up_u32_fixed16(1,
> > > > > pipe_downscale));
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > 0xffff);
> > > > > +
> > > > > +     return pipe_ratio;
> > > > > +}
> > > >
> > > > Yuck. I just killed these.
> > >
> > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > BSpec,
> > > not by me.
> >
> > That part of the spec is pretty crazy IMO. I think we should just
> > ignore
> > it for now. Also nothing in there says we must use both slices for a
> > single pipe so ignoring doesn't even violate the spec in any way.
>
> So you are instructing me now to ignore BSpec as "we are smarter"?
> I don't think this is properly to do it this way. Lets first change the
> BSpec, if we don't like pipe ratio here and then driver. Currently this
> is written that way and if we are going to do everything our own way
> ignoring specs, this is going to be chaos.

Like I sai you don't have to violate the spec at all. Just don't use
two slices for the same pipe and you can avoid the whole strange pipe
ratio limitation. We should file a bspec issue to get that clarified
though.

>
> >
> > >
> > > >
> > > > > +
> > > > > +struct dbuf_slice_conf_entry {
> > > > > +     u32 dbuf_mask[I915_MAX_PIPES];
> > > > > +     u32 active_pipes;
> > > > > +};
> > > > > +
> > > > > +
> > > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > > +     { { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > > +     { { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > > +     { { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > > +     { { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > > DBuf4)  \
> > > > > +     { { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > > BIT(PIPE_B) | \
> > > > > +                                       BIT(PIPE_C) |
> > > > > BIT(PIPE_D) }
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 12716
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 49255
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > >
> > > > My eyes!
> > > >
> > > > I have to think we should be able to reduce all that to a handful
> > > > of lines of code.
> > >
> > > Yeah, then we are going to have a huge function with lots of weird
> > > definitions, unfortunately BSpec table has quite strange DBuf
> > > assignments like
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > >
> > > i.e slices can get mixed in a quite various ways. There is no
> > > rational pattern in those and could be even dangerous to try to
> > > optimize it some way, as one day it might change again in some
> > > unpredictable way.
> > >
> > > Would you prefer to have it like
> > >
> > > if (pipes are A and B)
> > >     program S2 to A and S1 to B
> > > if (pipes are A and C)
> > >     program S1 to A and S2 to C
> > > ...
> > >
> > > I would prefer at least to see it in a comparable way with the
> > > table we have in BSpec, rather than having lots of weird looking
> > > if-else statements, GEN checks and so on.
> > >
> > > I knew this table was not going to look like it is done typically
> > > here - but really my opinion that it is way better than hardcoding
> > > it into some weird algorithm, which would be hard to compare to
> > > initial table, which is already strange enough.
> > >
> > > If you don't like it - could you please explain why this is exactly
> > > worse than having long functions with hardcoded checks?
> >
> > I think we should be able to encode this simply using the
> > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > is the distance of each pipe from each dbuf slice. Then all we have
> > to do is minimize the total distance. The following tables I think
> > should encode that reasonably concisely:
> >
> > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > static const u8 icl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > };
> >
> > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > static const u8 tgl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > };
>
> I think you are missing my point. I will explain why.
> My initial idea was similar, use some kind of distance
> metrics and encode everything in a smart way.
> Current BSpec table has sometimes slices swapped for no reason.
>
> There are seem to be some hardware constraints we are not aware
> of. For example for TGL pipes A,B get S2 and S1 slices,

A/S1 + B/S2 = 1 + 4 = 5
A/S2 + B/S1 = 2 + 1 = 3

> but pipes A C get S1 and S2.

A/S1 + C/S2 = 1 + 1 = 2
A/S2 + C/S1 = 2 + 2 = 4

So it's exactly as my distance table says it should be.

>
> if you encode all this into some kind of distance algorithm,
> those differences would be enourmously difficult to spot and
> compare with actual table. That is why I decided to do it that
> way.
>
> Most likely yes you can solve it by using this metrics,
> but will it really be then more readable and comparable
> with reference table?
> What if some contraints will be added again, like pipe ratio?
>
> >
> > /*
> >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> >  */
> > static const u8 future_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > };
> >
> > I suppose we could make that even more minimal by just
> > directly sticking that "Pipe and DBUF ordering" language
> > into a single array but and then calculate each distance
> > from it, but that is perhaps more work, and we'd need
> > some kind of enum just for that so that we have a single
> > namespace for the pipes and dbuf slices. So i have a
> > feeling having those distances pre-computed in a table
> > like thet is maybe a better way to go.
>
> Good luck then spoting differences with the BSpec table once it
> gets changed.

Well, I'd rather do that than decode through layers of cpp macros.
Also atm the table has no extra restrictions so it's clearly there
just because someone thought they'd write out the results of the
distance calculations. In general bspec style is somewhat annoying
where they try to write out things in the lowest level of absraction
possible (they are hardware people after all). I'd much prefer a
higher level description of the issue.

>
> >
> > >
> > > I think this is pretty stupid to fix stuff which looks new or
> > > unusual
> > > just because somebody doesn't like it, without even constructive
> > > arguing.
> > >
> > > - Stan
> > >
> > > >
> > > >
> > > > > +
> > > > > +/*
> > > > > + * This function
> > > >
> > > > "This function" is totally redundant. Ditto elsewhere.
> > > >
> > > > > finds an entry with same enabled pipe configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > > +                                  u32 active_pipes,
> > > > > +                                  const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     int i;
> > > > > +     uint_fixed_16_16_t pipe_ratio;
> > > > > +
> > > > > +     /*
> > > > > +      * Calculate pipe ratio as stated in BSpec 28692
> > > > > +      */
> > > > > +     pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > > +
> > > > > +     for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > > +             if (icl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > {
> > > > > +                     if (hweight32(active_pipes) == 1) {
> > > > > +                             /*
> > > > > +                              * According to BSpec 12716: if
> > > > > ratio
> > > > > > = 88.8
> > > > >
> > > > > +                              * always use single dbuf
> > > > > slice.
> > > > > +                              */
> > > > > +                             if (pipe_ratio.val >=
> > > > > div_fixed16(888,
> > > > > 10).val)
> > > > > +                                     ++i;
> > > > > +                     }
> > > > > +                     return
> > > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +             }
> > > > > +     }
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * This function finds an entry with same enabled pipe
> > > > > configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > > +                                  u32 active_pipes,
> > > > > +                                  const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     int i;
> > > > > +
> > > > > +     for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > > +             if (tgl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > +                     return
> > > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +     }
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +                                   int pipe, u32
> > > > > active_pipes,
> > > > > +                                   const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     if (IS_GEN(dev_priv, 11))
> > > > > +             return icl_get_allowed_dbuf_mask(pipe,
> > > > > +                                              active_pipes,
> > > > > +                                              crtc_state);
> > > > > +     else if (IS_GEN(dev_priv, 12))
> > > > > +             return tgl_get_allowed_dbuf_mask(pipe,
> > > > > +                                              active_pipes,
> > > > > +                                              crtc_state);
> > > > > +     /*
> > > > > +      * For anything else just return one slice yet.
> > > > > +      * Should be extended for other platforms.
> > > > > +      */
> > > > > +     return DBUF_S1_BIT;
> > > > > +}
> > > > > +
> > > > >  static u64
> > > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > > *crtc_state,
> > > > >                                u64 *plane_data_rate)
> > > > > --
> > > > > 2.17.1
> > > >
> > > >
> >
> >

--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-18 20:25             ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-18 20:25 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

Hi,

> > > > >
> > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +     /*
> > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > here,
> > > > > +      * as here we only remove slices, which are not needed
> > > > > anymore.
> > > > > +      */
> > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > slices_intersection);
> > > >
> > > > This is still a totally wrong place for this. We need to be sure
> > > > the
> > > > planes have switched over to the new ddb layout before we power
> > > > off
> > > > anything.
> > >
> > > I have just modified already existing line here, not added by me.
> > > The point of this patch is starting to use second DBuf slice but
> > > not fixing all kinds of issues related to DBuf code at the same
> > > time.
> >
> > You're re-enabling code I disabled on purpose. It has to be fixed
> > properly or not touched at all. In hindsight I should have probably
> > just ripped it all out to not give people bad ideas.
>
> -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
>
> I'm not reenabling it - this is code is out there right now.

> It's _dead_ code, and it's like that on purpose. You're now attempting
> to re-enable it without fixing the underlying issues.

Sounds like some sacred thing. Dead code which is not commented, neither
removed, but which you can't touch.  

> This is all again about changing all kinds of stuff at the same time.
>
> Why we can't just do things step by step? You have a point that
> this is wrong place ok, but as we see current code causes no
> regressions. Just tell me your idea where to put it and I will
> do that, if you don't have it yet - let it stay there, if CI shows
> no regressions currently.

>The same global state tricks apply here as with the cdclk/sagv stuff.
>1. enable new resources/bump up clocks
>2. reconfigure planes/etc.
>3. wait for vblanks
>4. disable unused resources/drop the clocks

It is different with SAGV, there we restrict whats needed first and relax
some limitations afterwards.
So all this was simply about that I should put this function call into
skl_commit_modeset_disables? 
I could just have said it straight away to me, without mentioned "dead" code
and other stuff..

> > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > +}
> > > >
> > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > good
> > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > >
> > > Supported slices really means how much slices we support in
> > > hardware.
> > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > I don't like this name.
> >
> > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > ambiguous. Also we already have num_pipes, num_sprites, etc.
>
 ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
 really that important and worth arguing at the very last moment?
 whether number of
 supported_slices_in_hardware, should be named "supported slices" or
 "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
 discussing this "soooo important" thing.

 Some one might come up that it is called "num_slices" or
 "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
 keep everyones bloated egos happy??

> > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -     udelay(10);
> > > > > -
> > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > -     else
> > > > > -             /*
> > > > > -              * FIXME: for now pretend that we only have 1
> > > > > slice,
> > > > > see
> > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > -              */
> > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +     /*
> > > > > +      * Just power up 1 slice, we will
> > > > > +      * figure out later which slices we have and what we
> > > > > need.
> > > > > +      */
> > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > >
> > > > That stuff really shouldn't be modified by low level functions.
> > > > The intial value should come from readout, and after that it
> > > > should
> > > > be adjusted as part of the overall atomic state.
> > >
> > > Well, as you can see currently this is even worse - has some weird
> > > hardcoded stuff and is modified like that. Also if we are enabling
> > > the
> > > dbuf slices, shouldn't we really determine what value should be
> > > there?
> > >
> > > Also I wouldn't really always trust the readout, as to me it seems
> > > icelake resets it constantly to 2 after suspend/resume.
> >
> > The point of readout is that we syncrhonize our software state with
> > the
> > current state of hardware without having to modify the hardware
> > state.
> > Once synchronized the driver takes over and tells the hw what to do.
>
> That is especially nice after you say "report error and don't care"
> when I really want to reflect hw state in sw state, by checking whether
> we could set dbuf slices or not.
> Now you say that we need to synchronize software state with hw state,
> however if I fail to modify it - then "who cares".

>Yes. If you fail to modify the hardware state the whole thing is
>broken anyway. There's is *nothing* we can do at this stage in the
>commit to fix it.

Ok, but why we can't at least keep the last successful state in sw
instead of something which is completely wrong. How does that harm?

> > > > >
> > > > >  struct skl_ddb_allocation {
> > > >
> > > > This struct is quite pointless. Should be killed off.
> > >
> > > Legacy stuff, not added by me. Quite fine with removing it -
> > > but should we really do all this at the same time and in same
> > > patch?
> >
> > Not the same patch. But clean up the cobwebs first, then add the new
> > stuff. Much easier that way around.
>
> There is a shit ton load of cobwebs to clean. Personally my opinion is
> that current dbuf controlling code is total garbage - which has lots of
> hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> disabling/enabling dbuf properly....
> Where you were, when this was committed then?

>Looking at something else unfortunately. Also it has just kept
>bitrotting since because no one has cared.

That is what exactly surprises me. While we have issues
like commit serialization, sagv, mst, half of display buffer only
enabled, bw, and we are arguing about how supported_slices
should be named. Or that I shouldn't refer to DBuf slices as bits
but define as values and then decode those as bits. Priorities.

>
> If we need to fix it - fine. But may be we should do it in stages?
> Point of this patch is "enable second dbuf slice", next patch "remove
> unneeded structs" and so on.
> While you want me to shuffle everything at the same time, which is not
> good idea, especially considering complexity and amount of legacy stuff
> we currently have.

>Building new stuff on top of a decomposing garbage pile is not
>a good idea: a) it's harder than building on clean foundation,
>b) it'll probably fall over and get buried in the old garbage
>at which point you're likely going to be right back where you
>started
.
Sure, let's change everything at the same time and then wonder
what had failed exactly in that huge series with lots of unrelated changes. 
Standard practice.

> > > >
> > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > crtc_state,
> > > > > i) {
> > > > >               const struct drm_display_mode *adjusted_mode =
> > > > >                       &crtc_state->hw.adjusted_mode;
> > > > >               enum pipe pipe = crtc->pipe;
> > > > >               int hdisplay, vdisplay;
> > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > +                                             pipe,
> > > > > +                                             active_pipes,
> > > > > +                                             crtc_state);
> > > > >
> > > > >               if (!crtc_state->hw.enable)
> > > > >                       continue;
> > > > >
> > > > > +             /*
> > > > > +              * According to BSpec pipe can share one dbuf
> > > > > slice
> > > > > with another
> > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > both cases
> > > > > we
> > > > > +              * account for other pipes only if they have
> > > > > exactly
> > > > > same mask.
> > > > > +              * However we need to account how many slices
> > > > > we should
> > > > > enable
> > > > > +              * in total.
> > > > > +              */
> > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > >
> > > > total_slice_mask will now account only the crtcs in the state.
> > > > What
> > > > happens to the other pipes' slices?
>
> Same as happens now btw. Right now it also does the same thing.

>Right now it does absolutely nothing. It's all dead code.

Right now, this "dead" code exactly iterates through crtcs in state and
allocates DBuf S1 in proportion to pipe width.
What I do is make it do the same thing but using either range of one
or two slices, depending on pipe configuration.

So should I have thrown away that code and rewritten this as well?

> > > > > +static uint_fixed_16_16_t
> > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > &crtc_state-
> > > > > > hw.adjusted_mode;
> > > > >
> > > > > +
> > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > +
> > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +
> > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > +}
> > > > > +
> > > > > +uint_fixed_16_16_t
> > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > +{
> > > > > +     struct drm_plane *plane;
> > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > >uapi.crtc);
> > > > > +
> > > > > +     if (!crtc_state->uapi.enable)
> > > > > +             return max_downscale;
> > > > > +
> > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > +             const struct intel_plane_state *plane_state =
> > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > +
> > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > plane_state))
> > > > > +                     continue;
> > > > > +
> > > > > +             plane_downscale =
> > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > +
> > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > %d.%d\n",
> > > > > +                           to_intel_plane(plane)->id,
> > > > > +                           plane_downscale.val >> 16,
> > > > > +                           plane_downscale.val & 0xffff);
> > > > > +
> > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > max_downscale);
> > > > > +     }
> > > > > +
> > > > > +
> > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > +                    pipe_downscale.val & 0xffff);
> > > > > +
> > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > max_downscale);
> > > > > +
> > > > > +     /*
> > > > > +      * Convert result to percentage: get the actual ratio
> > > > > instead
> > > > > of rate,
> > > > > +      * then multiply by 100 to get percentage.
> > > > > +      */
> > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > div_round_up_u32_fixed16(1,
> > > > > pipe_downscale));
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > 0xffff);
> > > > > +
> > > > > +     return pipe_ratio;
> > > > > +}
> > > >
> > > > Yuck. I just killed these.
> > >
> > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > BSpec,
> > > not by me.
> >
> > That part of the spec is pretty crazy IMO. I think we should just
> > ignore
> > it for now. Also nothing in there says we must use both slices for a
> > single pipe so ignoring doesn't even violate the spec in any way.
>
> So you are instructing me now to ignore BSpec as "we are smarter"?
> I don't think this is properly to do it this way. Lets first change the
> BSpec, if we don't like pipe ratio here and then driver. Currently this
> is written that way and if we are going to do everything our own way
> ignoring specs, this is going to be chaos.

>Like I sai you don't have to violate the spec at all. Just don't use
>two slices for the same pipe and you can avoid the whole strange pipe
>ratio limitation. We should file a bspec issue to get that clarified
>though.

That is actually a violation of BSpec, we are not using two slices for same
pipe while we could. We had already enough of bw issues, why should
we make our life even harder?

> > > > > +/*
> > > > > + * Table taken from Bspec 49255
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > >
> > > > My eyes!
> > > >
> > > > I have to think we should be able to reduce all that to a handful
> > > > of lines of code.
> > >
> > > Yeah, then we are going to have a huge function with lots of weird
> > > definitions, unfortunately BSpec table has quite strange DBuf
> > > assignments like
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > >
> > > i.e slices can get mixed in a quite various ways. There is no
> > > rational pattern in those and could be even dangerous to try to
> > > optimize it some way, as one day it might change again in some
> > > unpredictable way.
> > >
> > > Would you prefer to have it like
> > >
> > > if (pipes are A and B)
> > >     program S2 to A and S1 to B
> > > if (pipes are A and C)
> > >     program S1 to A and S2 to C
> > > ...
> > >
> > > I would prefer at least to see it in a comparable way with the
> > > table we have in BSpec, rather than having lots of weird looking
> > > if-else statements, GEN checks and so on.
> > >
> > > I knew this table was not going to look like it is done typically
> > > here - but really my opinion that it is way better than hardcoding
> > > it into some weird algorithm, which would be hard to compare to
> > > initial table, which is already strange enough.
> > >
> > > If you don't like it - could you please explain why this is exactly
> > > worse than having long functions with hardcoded checks?
> >
> > I think we should be able to encode this simply using the
> > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > is the distance of each pipe from each dbuf slice. Then all we have
> > to do is minimize the total distance. The following tables I think
> > should encode that reasonably concisely:
> >
> > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > static const u8 icl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > };
> >
> > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > static const u8 tgl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > };
>
> I think you are missing my point. I will explain why.
> My initial idea was similar, use some kind of distance
> metrics and encode everything in a smart way.
> Current BSpec table has sometimes slices swapped for no reason.
>
> There are seem to be some hardware constraints we are not aware
> of. For example for TGL pipes A,B get S2 and S1 slices,

>A/S1 + B/S2 = 1 + 4 = 5
>A/S2 + B/S1 = 2 + 1 = 3

> but pipes A C get S1 and S2.

>A/S1 + C/S2 = 1 + 1 = 2
>A/S2 + C/S1 = 2 + 2 = 4

>So it's exactly as my distance table says it should be.

>
> if you encode all this into some kind of distance algorithm,
> those differences would be enourmously difficult to spot and
> compare with actual table. That is why I decided to do it that
> way.
>
> Most likely yes you can solve it by using this metrics,
> but will it really be then more readable and comparable
> with reference table?
> What if some contraints will be added again, like pipe ratio?
>
> >
> > /*
> >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> >  */
> > static const u8 future_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > };
> >
> > I suppose we could make that even more minimal by just
> > directly sticking that "Pipe and DBUF ordering" language
> > into a single array but and then calculate each distance
> > from it, but that is perhaps more work, and we'd need
> > some kind of enum just for that so that we have a single
> > namespace for the pipes and dbuf slices. So i have a
> > feeling having those distances pre-computed in a table
> > like thet is maybe a better way to go.
>
> Good luck then spoting differences with the BSpec table once it
> gets changed.

>Well, I'd rather do that than decode through layers of cpp macros.
>Also atm the table has no extra restrictions so it's clearly there
>just because someone thought they'd write out the results of the
>distance calculations. In general bspec style is somewhat annoying
>where they try to write out things in the lowest level of absraction
>possible (they are hardware people after all). I'd much prefer a
>higher level description of the issue.

I would prefer to see immediately what is different as this table
makes it possible right away
Also to me this table simply intuitively more understandable.

> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};

We have similar CDCLK table in intel_cdclk, while you could 
also argue that we might had come up with some formula.

Could be we could make the table approach itself simpler - 
that is quite possible, but implementing a distance algorithm
seems to me to be a dangerous idea, despite that it of course
looks better to me(yes it is).

Best Regards,

Lisovskiy Stanislav

Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

________________________________________
From: Ville Syrjälä [ville.syrjala@linux.intel.com]
Sent: Monday, November 18, 2019 8:18 PM
To: Lisovskiy, Stanislav
Cc: Saarinen, Jani; intel-gfx@lists.freedesktop.org; Roper, Matthew D; maarten.lankhorst@linux.intel.com; Ausmus, James
Subject: Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL

On Mon, Nov 18, 2019 at 05:17:51PM +0000, Lisovskiy, Stanislav wrote:
> On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> > On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav wrote:
> > > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > > wrote:
> > > > > Also implemented algorithm for choosing DBuf slice
> > > > > configuration
> > > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > >
> > > > > Now pipe allocation still stays proportional to pipe width as
> > > > > before,
> > > > > however within allowed DBuf slice for this particular
> > > > > configuration.
> > > > >
> > > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > > might
> > > > >     now differ from hw state.
> > > > >
> > > > > v3: - Added new field "supported_slices" to ddb to separate max
> > > > >       supported slices vs currently enabled, to avoid
> > > > > confusion.
> > > > >     - Removed obsolete comments and code related to
> > > > > DBuf(Matthew
> > > > > Roper).
> > > > >     - Some code style and long line removal(Matthew Roper).
> > > > >     - Added WARN_ON to new ddb range offset calc
> > > > > function(Matthew
> > > > > Roper).
> > > > >     - Removed platform specific call to calc pipe ratio from
> > > > > ddb
> > > > >       allocation function and fixed the return value(Matthew
> > > > > Roper)
> > > > >     - Refactored DBUF slice allocation table to improve
> > > > > readability
> > > > >     - Added DBUF slice allocation for TGL as well.
> > > > >     - ICL(however not TGL) seems to voluntarily enable second
> > > > > DBuf
> > > > > slice
> > > > >       after pm suspend/resume causing a mismatch failure,
> > > > > because
> > > > > we
> > > > >       update DBuf slices only if we do a modeset, however this
> > > > > check
> > > > >       is done always. Fixed it to be done only when modeset for
> > > > > ICL.
> > > > >
> > > > > v4: - Now enabled slices is not actually a number, but a
> > > > > bitmask,
> > > > >       because we might need to enable second slice only and
> > > > > number
> > > > >       of slices would still 1 and that current code doesn't
> > > > > allow.
> > > > >     - Remove redundant duplicate code to have some unified way
> > > > > of
> > > > >       enabling dbuf slices instead of hardcoding.
> > > > >
> > > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was naively
> > > > > thinking
> > > > >       that we have only one DBUF_CTL slice. Now another version
> > > > > for
> > > > >       gen11 will check both slices as only second one can be
> > > > > enabled,
> > > > >       to keep CI happy.
> > > > >
> > > > > v6: - Removed enabled dbuf assertion completely. Previous code
> > > > >       was keeping dbuf enabled, even(!) when _dbuf_disable is
> > > > > called.
> > > > >       Currently we enable/disable dbuf slices based on
> > > > > requrement
> > > > >       so no point in asserting this here.
> > > > >     - Removed unnecessary modeset check from
> > > > > verify_wm_state(Matthew Roper)
> > > > >     - Slices intersection after union is same as final
> > > > > result(Matthew Roper)
> > > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > > >     - Some macros added(Matthew Roper)
> > > > >     - ddb range is now always less or equal to ddb size - no
> > > > > need
> > > > > for additional
> > > > >       checks(previously needed as we had some bandwidth checks
> > > > > in
> > > > > there which
> > > > >       could "suddenly" return ddb_size smaller than it is.
> > > > >
> > > > > v7: Minor costemic changes:
> > > > >     - Changed icl_dbuf_slices_restore name to
> > > > > icl_program_dbuf_slices
> > > > >       as it more corresponds to the actual functionality.
> > > > >     - Some simplification with supported slices for BXT and GLK
> > > > >     - skl_pipe_downscale_amount -> icl_pipe_downscale_amount as
> > > > >       this is not used for skl anymore.
> > > > >     - Changed DBuf slice assignment order for TGL case
> > > > >
> > > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Signed-off-by: Stanislav Lisovskiy <
> > > > > stanislav.lisovskiy@intel.com>
> > > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > > ++++++++++++++++--
> > > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > index 876fc25968bf..bd7aff675198 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > > intel_crtc *crtc,
> > > > >
> > > > >       if (INTEL_GEN(dev_priv) >= 11 &&
> > > > >           hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > > -             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %u, got
> > > > > %u)\n",
> > > > > +             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > %x, got
> > > > > %x)\n",
> > > > >                         sw_ddb->enabled_slices,
> > > > >                         hw->ddb.enabled_slices);
> > > > >
> > > > > @@ -14614,15 +14614,24 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >       u8 hw_enabled_slices = dev_priv-
> > > > > >wm.skl_hw.ddb.enabled_slices;
> > > > >       u8 required_slices = state-
> > > > > >wm_results.ddb.enabled_slices;
> > > > >       struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > > +     u8 slices_union = hw_enabled_slices | required_slices;
> > > > > +     u8 slices_intersection = required_slices;
> > > > >
> > > > >       for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > > old_crtc_state, new_crtc_state, i)
> > > > >               /* ignore allocations for crtc's that have been
> > > > > turned
> > > > > off. */
> > > > >               if (new_crtc_state->hw.active)
> > > > >                       entries[i] = old_crtc_state-
> > > > > >wm.skl.ddb;
> > > > >
> > > > > -     /* If 2nd DBuf slice required, enable it here */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +     DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > > +                   required_slices, hw_enabled_slices);
> > > > > +
> > > > > +     /*
> > > > > +      * If some other DBuf slice required, enable it here,
> > > > > +      * as here we only add more slices, but not remove to
> > > > > prevent
> > > > > +      * issues if somebody still uses those.
> > > > > +      */
> > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +             icl_dbuf_slices_update(dev_priv, slices_union);
> > > > >
> > > > >       /*
> > > > >        * Whenever the number of active pipes changes, we need
> > > > > to make
> > > > > sure we
> > > > > @@ -14681,9 +14690,12 @@ static void
> > > > > skl_commit_modeset_enables(struct intel_atomic_state *state)
> > > > >               }
> > > > >       } while (progress);
> > > > >
> > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
> > > > > +     /*
> > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > here,
> > > > > +      * as here we only remove slices, which are not needed
> > > > > anymore.
> > > > > +      */
> > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > hw_enabled_slices)
> > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > slices_intersection);
> > > >
> > > > This is still a totally wrong place for this. We need to be sure
> > > > the
> > > > planes have switched over to the new ddb layout before we power
> > > > off
> > > > anything.
> > >
> > > I have just modified already existing line here, not added by me.
> > > The point of this patch is starting to use second DBuf slice but
> > > not fixing all kinds of issues related to DBuf code at the same
> > > time.
> >
> > You're re-enabling code I disabled on purpose. It has to be fixed
> > properly or not touched at all. In hindsight I should have probably
> > just ripped it all out to not give people bad ideas.
>
> -     /* If 2nd DBuf slice is no more required disable it */
> > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > hw_enabled_slices)
> > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > required_slices);
>
> I'm not reenabling it - this is code is out there right now.

It's _dead_ code, and it's like that on purpose. You're now attempting
to re-enable it without fixing the underlying issues.

> This is all again about changing all kinds of stuff at the same time.
>
> Why we can't just do things step by step? You have a point that
> this is wrong place ok, but as we see current code causes no
> regressions. Just tell me your idea where to put it and I will
> do that, if you don't have it yet - let it stay there, if CI shows
> no regressions currently.

The same global state tricks apply here as with the cdclk/sagv stuff.
1. enable new resources/bump up clocks
2. reconfigure planes/etc.
3. wait for vblanks
4. disable unused resources/drop the clocks

>
> >
> > >
> > > Of course this is nice idea to fix it - if that is wrong plane,
> > > but current code somehow lives with it and probably this should be
> > > with some other patch.
> > >
> > > >
> > > > >  }
> > > > >
> > > > >  static void intel_atomic_helper_free_state(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > @@ -1031,15 +1031,6 @@ static bool
> > > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > > *dev_priv,
> > > > >               (I915_READ(DC_STATE_EN) &
> > > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > > >  }
> > > > >
> > > > > -static void gen9_assert_dbuf_enabled(struct drm_i915_private
> > > > > *dev_priv)
> > > > > -{
> > > > > -     u32 tmp = I915_READ(DBUF_CTL);
> > > > > -
> > > > > -     WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST)) !=
> > > > > -          (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > > -          "Unexpected DBuf power power state (0x%08x)\n",
> > > > > tmp);
> > > > > -}
> > > > > -
> > > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > >       struct intel_cdclk_state cdclk_state = {};
> > > > > @@ -1055,8 +1046,6 @@ static void gen9_disable_dc_states(struct
> > > > > drm_i915_private *dev_priv)
> > > > >       /* Can't read out voltage_level so can't use
> > > > > intel_cdclk_changed() */
> > > > >       WARN_ON(intel_cdclk_needs_modeset(&dev_priv->cdclk.hw,
> > > > > &cdclk_state));
> > > > >
> > > > > -     gen9_assert_dbuf_enabled(dev_priv);
> > > > > -
> > > > >       if (IS_GEN9_LP(dev_priv))
> > > > >               bxt_verify_ddi_phy_power_wells(dev_priv);
> > > > >
> > > > > @@ -4254,31 +4243,51 @@ static void gen9_dbuf_disable(struct
> > > > > drm_i915_private *dev_priv)
> > > > >       intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > > >  }
> > > > >
> > > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -     if (INTEL_GEN(dev_priv) < 11)
> > > > > -             return 1;
> > > > > -     return 2;
> > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > +}
> > > >
> > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > good
> > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > >
> > > Supported slices really means how much slices we support in
> > > hardware.
> > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > I don't like this name.
> >
> > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > ambiguous. Also we already have num_pipes, num_sprites, etc.
>
> ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
> really that important and worth arguing at the very last moment?
> whether number of
> supported_slices_in_hardware, should be named "supported slices" or
> "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
> discussing this "soooo important" thing.
>
> Some one might come up that it is called "num_slices" or
> "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
> keep everyones bloated egos happy??
>
>
> >
> > >
> > > >
> > > > > +
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > > +{
> > > > > +     const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > >
> > > > Seems a rather pointless variable. const at least is pointless.
> > > >
> > > > > +
> > > > > +     icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > > >  }
> > > > >
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >                           u8 req_slices)
> > > > >  {
> > > > > -     const u8 hw_enabled_slices = dev_priv-
> > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > >
> > > > > -     bool ret;
> > > > > +     bool ret = true;
> > > > > +     int i;
> > > > > +     int max_slices = intel_dbuf_max_slices(dev_priv);
> > > > >
> > > > > -     if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > > +     if (hweight8(req_slices) >
> > > > > intel_dbuf_max_slices(dev_priv)) {
> > > > >               DRM_ERROR("Invalid number of dbuf slices
> > > > > requested\n");
> > > > >               return;
> > > > >       }
> > > > >
> > > > > -     if (req_slices == hw_enabled_slices || req_slices == 0)
> > > > > -             return;
> > > > > +     DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > > req_slices);
> > > > >
> > > > > -     if (req_slices > hw_enabled_slices)
> > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > true);
> > > > > -     else
> > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > DBUF_CTL_S2,
> > > > > false);
> > > > > +     for (i = 0; i < max_slices; i++) {
> > > > > +             int slice_bit = BIT(i);
> > > > > +             bool slice_set = (slice_bit & req_slices) != 0;
> > > > > +
> > > > > +             switch (slice_bit) {
> > > > > +             case DBUF_S1_BIT:
> > > > > +                     ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +                                                       DBUF_
> > > > > CTL_S1,
> > > > > +                                                       slice
> > > > > _set);
> > > > > +                     break;
> > > > > +             case DBUF_S2_BIT:
> > > > > +                     ret = ret &&
> > > > > intel_dbuf_slice_set(dev_priv,
> > > > > +                                                       DBUF_
> > > > > CTL_S2,
> > > > > +                                                       slice
> > > > > _set);
> > > >
> > > > The return value on intel_dbuf_slice_set() seems pointless. I'd
> > > > nuke
> > > > it.
> > > >
> > > > Also as long you have intel_dbuf_slice_set() I'd probably just
> > > > pass
> > > > the
> > > > slice index there and let it deal with the register details.
> > > > DBUF_CTL
> > > > should probably be parametrized as well so you don't need
> > > > annoying
> > > > switch
> > > > statements.
> > >
> > > What if it fails? Shouldn't we still have some clue what state do
> > > we
> > > have/had at the moment?
> >
> > It if fails you log an error and keep going.
>
> Ah sure. And let hw/sw state be completely different.
> Also see next comment about this.
>
> >
> > >
> > > >
> > > > > +                     break;
> > > > > +             default:
> > > > > +                     MISSING_CASE(slice_bit);
> > > > > +             }
> > > > > +     }
> > > > >
> > > > >       if (ret)
> > > > >               dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > > req_slices;
> > > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > > drm_i915_private *dev_priv,
> > > > >
> > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > >  {
> > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > DBUF_POWER_REQUEST);
> > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -     udelay(10);
> > > > > -
> > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > -     else
> > > > > -             /*
> > > > > -              * FIXME: for now pretend that we only have 1
> > > > > slice,
> > > > > see
> > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > -              */
> > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +     /*
> > > > > +      * Just power up 1 slice, we will
> > > > > +      * figure out later which slices we have and what we
> > > > > need.
> > > > > +      */
> > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > >
> > > > That stuff really shouldn't be modified by low level functions.
> > > > The intial value should come from readout, and after that it
> > > > should
> > > > be adjusted as part of the overall atomic state.
> > >
> > > Well, as you can see currently this is even worse - has some weird
> > > hardcoded stuff and is modified like that. Also if we are enabling
> > > the
> > > dbuf slices, shouldn't we really determine what value should be
> > > there?
> > >
> > > Also I wouldn't really always trust the readout, as to me it seems
> > > icelake resets it constantly to 2 after suspend/resume.
> >
> > The point of readout is that we syncrhonize our software state with
> > the
> > current state of hardware without having to modify the hardware
> > state.
> > Once synchronized the driver takes over and tells the hw what to do.
>
> That is especially nice after you say "report error and don't care"
> when I really want to reflect hw state in sw state, by checking whether
> we could set dbuf slices or not.
> Now you say that we need to synchronize software state with hw state,
> however if I fail to modify it - then "who cares".

Yes. If you fail to modify the hardware state the whole thing is
broken anyway. There's is *nothing* we can do at this stage in the
commit to fix it.

>
>
> >
> > >
> > > >
> > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >
> > > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > > ~DBUF_POWER_REQUEST);
> > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > -
> > > > > -     udelay(10);
> > > > > -
> > > > > -     if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > -         (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > -             DRM_ERROR("DBuf power disable timeout!\n");
> > > > > -     else
> > > > > -             /*
> > > > > -              * FIXME: for now pretend that the first slice
> > > > > is
> > > > > always
> > > > > -              * enabled, see
> > > > > intel_enabled_dbuf_slices_num().
> > > > > -              */
> > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > +     /*
> > > > > +      * Disable all slices
> > > > > +      */
> > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > >  }
> > > > >
> > > > >  static void icl_mbus_init(struct drm_i915_private *dev_priv)
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > > drm_i915_private *i915,
> > > > >       for ((wf) = intel_display_power_get((i915), (domain));
> > > > > (wf); \
> > > > >            intel_display_power_put_async((i915), (domain),
> > > > > (wf)),
> > > > > (wf) = 0)
> > > > >
> > > > > +int intel_dbuf_max_slices(struct drm_i915_private *dev_priv);
> > > > >  void icl_dbuf_slices_update(struct drm_i915_private *dev_priv,
> > > > >                           u8 req_slices);
> > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv);
> > > > >
> > > > >  void chv_phy_powergate_lanes(struct intel_encoder *encoder,
> > > > >                            bool override, unsigned int mask);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -51,6 +51,7 @@
> > > > >  #include "display/intel_bw.h"
> > > > >  #include "display/intel_cdclk.h"
> > > > >  #include "display/intel_display_types.h"
> > > > > +#include "display/intel_display_power.h"
> > > > >  #include "display/intel_dp.h"
> > > > >  #include "display/intel_fbdev.h"
> > > > >  #include "display/intel_hotplug.h"
> > > > > @@ -2588,6 +2589,10 @@ static int intel_runtime_resume(struct
> > > > > device *kdev)
> > > > >       if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
> > > > >               ret = vlv_resume_prepare(dev_priv, true);
> > > > >
> > > > > +     /* Weird hack to fix ICL hardware bug, as it resets
> > > > > DBUF slices
> > > > > reg */
> > > > > +     if (INTEL_GEN(dev_priv) == 11)
> > > > > +             icl_program_dbuf_slices(dev_priv);
> > > > > +
> > > > >       intel_uncore_runtime_resume(&dev_priv->uncore);
> > > > >
> > > > >       intel_runtime_pm_enable_interrupts(dev_priv);
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -804,7 +804,7 @@ static inline bool
> > > > > skl_ddb_entry_equal(const
> > > > > struct skl_ddb_entry *e1,
> > > > >  }
> > > > >
> > > > >  struct skl_ddb_allocation {
> > > >
> > > > This struct is quite pointless. Should be killed off.
> > >
> > > Legacy stuff, not added by me. Quite fine with removing it -
> > > but should we really do all this at the same time and in same
> > > patch?
> >
> > Not the same patch. But clean up the cobwebs first, then add the new
> > stuff. Much easier that way around.
>
> There is a shit ton load of cobwebs to clean. Personally my opinion is
> that current dbuf controlling code is total garbage - which has lots of
> hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> disabling/enabling dbuf properly....
> Where you were, when this was committed then?

Looking at something else unfortunately. Also it has just kept
bitrotting since because no one has cared.

>
> If we need to fix it - fine. But may be we should do it in stages?
> Point of this patch is "enable second dbuf slice", next patch "remove
> unneeded structs" and so on.
> While you want me to shuffle everything at the same time, which is not
> good idea, especially considering complexity and amount of legacy stuff
> we currently have.

Building new stuff on top of a decomposing garbage pile is not
a good idea: a) it's harder than building on clean foundation,
b) it'll probably fall over and get buried in the old garbage
at which point you're likely going to be right back where you
started.

>
> >
> > >
> > > >
> > > > > -     u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > > */
> > > > > +     u8 enabled_slices;   /* Bitmask of currently enabled
> > > > > DBuf
> > > > > slices */
> > > > >  };
> > > > >
> > > > >  struct skl_ddb_values {
> > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > > index 1bb701d32a5d..8e86af505730 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > > intel_cherryview_info = {
> > > > >       .has_gt_uc = 1, \
> > > > >       .display.has_hdcp = 1, \
> > > > >       .display.has_ipc = 1, \
> > > > > -     .ddb_size = 896
> > > > > +     .ddb_size = 896, \
> > > > > +     .supported_slices = 1
> > > > >
> > > > >  #define SKL_PLATFORM \
> > > > >       GEN9_FEATURES, \
> > > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > > intel_skylake_gt4_info = {
> > > > >       GEN(9), \
> > > > >       .is_lp = 1, \
> > > > >       .display.has_hotplug = 1, \
> > > > > +     .supported_slices = 1, \
> > > > >       .engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > > BIT(VECS0),
> > > > > \
> > > > >       .pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), \
> > > > >       .has_64bit_reloc = 1, \
> > > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > > intel_coffeelake_gt3_info = {
> > > > >       GEN9_FEATURES, \
> > > > >       GEN(10), \
> > > > >       .ddb_size = 1024, \
> > > > > +     .supported_slices = 1, \
> > > > >       .display.has_dsc = 1, \
> > > > >       .has_coherent_ggtt = false, \
> > > > >       GLK_COLORS
> > > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > > intel_cannonlake_info = {
> > > > >       }, \
> > > > >       GEN(11), \
> > > > >       .ddb_size = 2048, \
> > > > > +     .supported_slices = 2, \
> > > > >       .has_logical_ring_elsq = 1, \
> > > > >       .color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > > 262145 }
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > > index a607ea520829..fba5731063d8 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > @@ -7734,9 +7734,11 @@ enum {
> > > > >  #define DISP_ARB_CTL2        _MMIO(0x45004)
> > > > >  #define  DISP_DATA_PARTITION_5_6     (1 << 6)
> > > > >  #define  DISP_IPC_ENABLE             (1 << 3)
> > > > > -#define DBUF_CTL     _MMIO(0x45008)
> > > > > -#define DBUF_CTL_S1  _MMIO(0x45008)
> > > > > -#define DBUF_CTL_S2  _MMIO(0x44FE8)
> > > > > +#define DBUF_S1_BIT                  BIT(0)
> > > > > +#define DBUF_S2_BIT                  BIT(1)
> > > >
> > > > These are not bits inside DBUF_CTL so they should not be here.
> > > > I'd just nuke them entirely. If you think we need more than an
> > > > int to refer to a slice then IMO something simple like
> > > >
> > > > enum dbuf_slice {
> > > >         DBUF_S1,
> > > >         DBUF_S2,
> > > > };
> > > >
> > > > would match the general apporach we use for other things much
> > > > better
> > > > rather than defining some shifted values.
> > >
> > > We need to combine slices - so bitmask is quite suitable for that,
> > > I can of course use this enum, but then I'll have to re-encode it
> > > to bitmask anyways.
> >
> > Which is pretty much how everything else works in the driver.
>
> Another piece of incredibly important thing to argue..

It is. Lot's of crap in the driver is simply caused by people not
following established style and coming up with something different..
Which means there is an extra cost for everyone else when reading the
code. And that probably means most people just don't bother to read it.

Granted, some of that is due to spurts in evolution where major overhauls
changed the architecture so significantly that the old established
conventions simply made node sense anymore. Atomic being the last big
one in that regard. And we're still cleaning up the resulting mess.

>
> >
> > >
> > >
> > > >
> > > > > +#define DBUF_CTL                     _MMIO(0x45008)
> > > > > +#define DBUF_CTL_S1                  _MMIO(0x45008)
> > > > > +#define DBUF_CTL_S2                  _MMIO(0x44FE8)
> > > > >  #define  DBUF_POWER_REQUEST          (1 << 31)
> > > > >  #define  DBUF_POWER_STATE            (1 << 30)
> > > > >  #define GEN7_MSG_CTL _MMIO(0x45010)
> > > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > > >       } display;
> > > > >
> > > > >       u16 ddb_size; /* in blocks */
> > > > > +     u8 supported_slices; /* number of DBuf slices */
> > > > >
> > > > >       /* Register offsets for the various display pipes and
> > > > > transcoders */
> > > > >       int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > index 2d389e437e87..1ea81ab92429 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > > drm_device
> > > > > *dev)
> > > > >       return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > > >  }
> > > > >
> > > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > > drm_i915_private
> > > > > *dev_priv)
> > > > > +static u8 intel_enabled_dbuf_slices(struct drm_i915_private
> > > > > *dev_priv)
> > > > >  {
> > > > > -     u8 enabled_slices;
> > > > > -
> > > > > -     /* Slice 1 will always be enabled */
> > > > > -     enabled_slices = 1;
> > > > > +     u8 enabled_slices = 0;
> > > > >
> > > > >       /* Gen prior to GEN11 have only one DBuf slice */
> > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > -             return enabled_slices;
> > > > > +             return DBUF_S1_BIT;
> > > >
> > > > I'd nuke this special case entirely and just read everything from
> > > > the
> > > > hw. Assuming we need readout somewhere.
> > >
> > > We have it that way hardcoded right now in drm-tip. I think this is
> > > a
> > > subject to other patch, but not this change. As I understand we
> > > need
> > > to first figure out how to do it properly.
> > > This patch anyway makes it better by at least removing hardcoded
> > > relation between number of slices and which slices are enabled.
> > > With current code there is no way for example to enable second DBuf
> > > slice only, which is required by BSpec.
> > >
> > > >
> > > > >
> > > > > -     /*
> > > > > -      * FIXME: for now we'll only ever use 1 slice; pretend
> > > > > that we
> > > > > have
> > > > > -      * only that 1 slice enabled until we have a proper way
> > > > > for on-
> > > > > demand
> > > > > -      * toggling of the second slice.
> > > > > -      */
> > > > > -     if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > -             enabled_slices++;
> > > > > +     /* Check if second DBuf slice is enabled */
> > > > > +     if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > > +             enabled_slices |= DBUF_S1_BIT;
> > > > > +
> > > > > +     if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > +             enabled_slices |= DBUF_S2_BIT;
> > > > >
> > > > >       return enabled_slices;
> > > > >  }
> > > > > @@ -3812,36 +3808,38 @@ static u16 intel_get_ddb_size(struct
> > > > > drm_i915_private *dev_priv,
> > > > >                             const int num_active,
> > > > >                             struct skl_ddb_allocation *ddb)
> > > > >  {
> > > > > -     const struct drm_display_mode *adjusted_mode;
> > > > > -     u64 total_data_bw;
> > > > >       u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > > -
> > > > >       WARN_ON(ddb_size == 0);
> > > > >
> > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > >               return ddb_size - 4; /* 4 blocks for bypass
> > > > > path
> > > > > allocation */
> > > > >
> > > > > -     adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > > -     total_data_bw = total_data_rate *
> > > > > drm_mode_vrefresh(adjusted_mode);
> > > > > +     return ddb_size;
> > > > > +}
> > > > >
> > > > > -     /*
> > > > > -      * 12GB/s is maximum BW supported by single DBuf slice.
> > > > > -      *
> > > > > -      * FIXME dbuf slice code is broken:
> > > > > -      * - must wait for planes to stop using the slice
> > > > > before
> > > > > powering it off
> > > > > -      * - plane straddling both slices is illegal in multi-
> > > > > pipe
> > > > > scenarios
> > > > > -      * - should validate we stay within the hw bandwidth
> > > > > limits
> > > > > -      */
> > > > > -     if (0 && (num_active > 1 || total_data_bw >= GBps(12)))
> > > > > {
> > > > > -             ddb->enabled_slices = 2;
> > > > > -     } else {
> > > > > -             ddb->enabled_slices = 1;
> > > > > -             ddb_size /= 2;
> > > > > -     }
> > > > > +/*
> > > > > + * Calculate initial DBuf slice offset, based on slice size
> > > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > > enabled
> > > > > + * offset would be 1024)
> > > > > + */
> > > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > > dbuf_slice_mask,
> > > > > +                                        u32 slice_size, u32
> > > > > ddb_size)
> > > > > +{
> > > > > +     u32 offset = 0;
> > > > >
> > > > > -     return ddb_size;
> > > > > +     if (!dbuf_slice_mask)
> > > > > +             return 0;
> > > > > +
> > > > > +     offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > > +
> > > > > +     WARN_ON(offset >= ddb_size);
> > > > > +     return offset;
> > > > >  }
> > > > >
> > > > > +static u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +                                   int pipe, u32
> > > > > active_pipes,
> > > > > +                                   const struct
> > > > > intel_crtc_state
> > > > > *crtc_state);
> > > > > +
> > > > >  static void
> > > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > > *dev_priv,
> > > > >                                  const struct
> > > > > intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -3857,7 +3855,14 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >       u32 pipe_width = 0, total_width = 0, width_before_pipe
> > > > > = 0;
> > > > >       enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > > >       u16 ddb_size;
> > > > > +     u32 ddb_range_size;
> > > > >       u32 i;
> > > > > +     u32 dbuf_slice_mask;
> > > > > +     u32 active_pipes;
> > > > > +     u32 offset;
> > > > > +     u32 slice_size;
> > > > > +     u32 total_slice_mask;
> > > > > +     u32 start, end;
> > > > >
> > > > >       if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > > >               alloc->start = 0;
> > > > > @@ -3866,14 +3871,23 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >               return;
> > > > >       }
> > > > >
> > > > > -     if (intel_state->active_pipe_changes)
> > > > > +     if (intel_state->active_pipe_changes) {
> > > > >               *num_active = hweight8(intel_state-
> > > > > >active_pipes);
> > > > > -     else
> > > > > +             active_pipes = intel_state->active_pipes;
> > > > > +     } else {
> > > > >               *num_active = hweight8(dev_priv->active_pipes);
> > > > > +             active_pipes = dev_priv->active_pipes;
> > > > > +     }
> > > > >
> > > > >       ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > > total_data_rate,
> > > > >                                     *num_active, ddb);
> > > > >
> > > > > +     DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > > +
> > > > > +     slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > > >supported_slices;
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Got DBuf slice size %d\n", slice_size);
> > > > > +
> > > > >       /*
> > > > >        * If the state doesn't change the active CRTC's or
> > > > > there is no
> > > > >        * modeset request, then there's no need to
> > > > > recalculate;
> > > > > @@ -3891,20 +3905,70 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >               return;
> > > > >       }
> > > > >
> > > > > +     /*
> > > > > +      * Get allowed DBuf slices for correspondent pipe and
> > > > > platform.
> > > > > +      */
> > > > > +     dbuf_slice_mask = i915_get_allowed_dbuf_mask(dev_priv,
> > > > > for_pipe,
> > > > > +                                                  active_pip
> > > > > es,
> > > > > crtc_state);
> > > >
> > > > Quite a few redundant arguments. The function name looks alien.
> > > >
> > > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > >
> > > > > +
> > > > > +     DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active pipes
> > > > > %x\n",
> > > > > +                   dbuf_slice_mask,
> > > > > +                   for_pipe, active_pipes);
> > > > > +
> > > > > +     /*
> > > > > +      * Figure out at which DBuf slice we start, i.e if we
> > > > > start at
> > > > > Dbuf S2
> > > > > +      * and slice size is 1024, the offset would be 1024
> > > > > +      */
> > > > > +     offset =
> > > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > > +                                              slice_size,
> > > > > ddb_size);
> > > > > +
> > > > > +     /*
> > > > > +      * Figure out total size of allowed DBuf slices, which
> > > > > is
> > > > > basically
> > > > > +      * a number of allowed slices for that pipe multiplied
> > > > > by slice
> > > > > size.
> > > > > +      * Inside of this
> > > > > +      * range ddb entries are still allocated in proportion
> > > > > to
> > > > > display width.
> > > > > +      */
> > > > > +     ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > > slice_size;
> > > > > +
> > > > >       /*
> > > > >        * Watermark/ddb requirement highly depends upon width
> > > > > of the
> > > > >        * framebuffer, So instead of allocating DDB equally
> > > > > among
> > > > > pipes
> > > > >        * distribute DDB based on resolution/width of the
> > > > > display.
> > > > >        */
> > > > > +     total_slice_mask = dbuf_slice_mask;
> > > >
> > > > Isn't this already taken care by the loop below?
> > > >
> > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > crtc_state,
> > > > > i) {
> > > > >               const struct drm_display_mode *adjusted_mode =
> > > > >                       &crtc_state->hw.adjusted_mode;
> > > > >               enum pipe pipe = crtc->pipe;
> > > > >               int hdisplay, vdisplay;
> > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > +                                             pipe,
> > > > > +                                             active_pipes,
> > > > > +                                             crtc_state);
> > > > >
> > > > >               if (!crtc_state->hw.enable)
> > > > >                       continue;
> > > > >
> > > > > +             /*
> > > > > +              * According to BSpec pipe can share one dbuf
> > > > > slice
> > > > > with another
> > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > both cases
> > > > > we
> > > > > +              * account for other pipes only if they have
> > > > > exactly
> > > > > same mask.
> > > > > +              * However we need to account how many slices
> > > > > we should
> > > > > enable
> > > > > +              * in total.
> > > > > +              */
> > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > >
> > > > total_slice_mask will now account only the crtcs in the state.
> > > > What
> > > > happens to the other pipes' slices?
>
> Same as happens now btw. Right now it also does the same thing.

Right now it does absolutely nothing. It's all dead code.

>
> > >
> > > Yes we need to iterate through all pipes/crtcs here, however
> > > for getting correct dbuf slice mask most important is active_pipes.
> > > However yes, I will change the loop for it to be more correct.
> > >
> > > >
> > > > > +
> > > > > +             /*
> > > > > +              * Do not account pipes using other slice sets
> > > > > +              * luckily as of current BSpec slice sets do
> > > > > not
> > > > > partially
> > > > > +              * intersect(pipes share either same one slice
> > > > > or same
> > > > > slice set
> > > > > +              * i.e no partial intersection), so it is
> > > > > enough to
> > > > > check for
> > > > > +              * equality for now.
> > > > > +              */
> > > > > +             if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > > +                     continue;
> > > > > +
> > > > >               drm_mode_get_hv_timing(adjusted_mode,
> > > > > &hdisplay,
> > > > > &vdisplay);
> > > > >               total_width += hdisplay;
> > > > >
> > > > > @@ -3914,8 +3978,19 @@
> > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > drm_i915_private *dev_priv,
> > > > >                       pipe_width = hdisplay;
> > > > >       }
> > > > >
> > > > > -     alloc->start = ddb_size * width_before_pipe /
> > > > > total_width;
> > > > > -     alloc->end = ddb_size * (width_before_pipe +
> > > > > pipe_width) /
> > > > > total_width;
> > > > > +     ddb->enabled_slices = total_slice_mask;
> > > > > +
> > > > > +     start = ddb_range_size * width_before_pipe /
> > > > > total_width;
> > > > > +     end = ddb_range_size * (width_before_pipe + pipe_width)
> > > > > /
> > > > > total_width;
> > > > > +
> > > > > +     alloc->start = offset + start;
> > > > > +     alloc->end = offset + end;
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > > +                   alloc->start, alloc->end);
> > > > > +     DRM_DEBUG_KMS("Enabled ddb slices mask %x num supported
> > > > > %d\n",
> > > > > +                   ddb->enabled_slices,
> > > > > +                   INTEL_INFO(dev_priv)->supported_slices);
> > > > >  }
> > > > >
> > > > >  static int skl_compute_wm_params(const struct intel_crtc_state
> > > > > *crtc_state,
> > > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > > intel_crtc *crtc,
> > > > >  void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
> > > > >                         struct skl_ddb_allocation *ddb /* out
> > > > > */)
> > > > >  {
> > > > > -     ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > > +     ddb->enabled_slices =
> > > > > intel_enabled_dbuf_slices(dev_priv);
> > > > > +     DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > > enabled_slices);
> > > > >
> > > > >  }
> > > > >
> > > > >  /*
> > > > > @@ -4164,6 +4240,241 @@ skl_get_total_relative_data_rate(struct
> > > > > intel_crtc_state *crtc_state,
> > > > >       return total_data_rate;
> > > > >  }
> > > > >
> > > > > +static uint_fixed_16_16_t
> > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > &crtc_state-
> > > > > > hw.adjusted_mode;
> > > > >
> > > > > +
> > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > +
> > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > u32_to_fixed16(1));
> > > > > +
> > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > +}
> > > > > +
> > > > > +uint_fixed_16_16_t
> > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > +{
> > > > > +     struct drm_plane *plane;
> > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > >uapi.crtc);
> > > > > +
> > > > > +     if (!crtc_state->uapi.enable)
> > > > > +             return max_downscale;
> > > > > +
> > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > +             const struct intel_plane_state *plane_state =
> > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > +
> > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > plane_state))
> > > > > +                     continue;
> > > > > +
> > > > > +             plane_downscale =
> > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > +
> > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > %d.%d\n",
> > > > > +                           to_intel_plane(plane)->id,
> > > > > +                           plane_downscale.val >> 16,
> > > > > +                           plane_downscale.val & 0xffff);
> > > > > +
> > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > max_downscale);
> > > > > +     }
> > > > > +
> > > > > +
> > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > +                    pipe_downscale.val & 0xffff);
> > > > > +
> > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > max_downscale);
> > > > > +
> > > > > +     /*
> > > > > +      * Convert result to percentage: get the actual ratio
> > > > > instead
> > > > > of rate,
> > > > > +      * then multiply by 100 to get percentage.
> > > > > +      */
> > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > div_round_up_u32_fixed16(1,
> > > > > pipe_downscale));
> > > > > +
> > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > 0xffff);
> > > > > +
> > > > > +     return pipe_ratio;
> > > > > +}
> > > >
> > > > Yuck. I just killed these.
> > >
> > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > BSpec,
> > > not by me.
> >
> > That part of the spec is pretty crazy IMO. I think we should just
> > ignore
> > it for now. Also nothing in there says we must use both slices for a
> > single pipe so ignoring doesn't even violate the spec in any way.
>
> So you are instructing me now to ignore BSpec as "we are smarter"?
> I don't think this is properly to do it this way. Lets first change the
> BSpec, if we don't like pipe ratio here and then driver. Currently this
> is written that way and if we are going to do everything our own way
> ignoring specs, this is going to be chaos.

Like I sai you don't have to violate the spec at all. Just don't use
two slices for the same pipe and you can avoid the whole strange pipe
ratio limitation. We should file a bspec issue to get that clarified
though.

>
> >
> > >
> > > >
> > > > > +
> > > > > +struct dbuf_slice_conf_entry {
> > > > > +     u32 dbuf_mask[I915_MAX_PIPES];
> > > > > +     u32 active_pipes;
> > > > > +};
> > > > > +
> > > > > +
> > > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > > +     { { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > > +     { { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > > +     { { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > > +     { { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > +     { { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) | BIT(PIPE_C)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_D) }
> > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) | BIT(PIPE_B)
> > > > > |
> > > > > BIT(PIPE_C) }
> > > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > > DBuf4)  \
> > > > > +     { { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > > BIT(PIPE_B) | \
> > > > > +                                       BIT(PIPE_C) |
> > > > > BIT(PIPE_D) }
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 12716
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /* pipe
> > > > > ratio
> > > > > < 88.8 */
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > > pipe
> > > > > ratio >= 88.8 */
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * Table taken from Bspec 49255
> > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > + * plus there are some hardcoded requirements on how
> > > > > + * those should be distributed for multipipe scenarios.
> > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > + * and less readable, so decided to use a table almost
> > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > + * to compare, change and check.
> > > > > + */
> > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > DBUF_S2_BIT,
> > > > > DBUF_S2_BIT),
> > > > > +};
> > > >
> > > > My eyes!
> > > >
> > > > I have to think we should be able to reduce all that to a handful
> > > > of lines of code.
> > >
> > > Yeah, then we are going to have a huge function with lots of weird
> > > definitions, unfortunately BSpec table has quite strange DBuf
> > > assignments like
> > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > >
> > > i.e slices can get mixed in a quite various ways. There is no
> > > rational pattern in those and could be even dangerous to try to
> > > optimize it some way, as one day it might change again in some
> > > unpredictable way.
> > >
> > > Would you prefer to have it like
> > >
> > > if (pipes are A and B)
> > >     program S2 to A and S1 to B
> > > if (pipes are A and C)
> > >     program S1 to A and S2 to C
> > > ...
> > >
> > > I would prefer at least to see it in a comparable way with the
> > > table we have in BSpec, rather than having lots of weird looking
> > > if-else statements, GEN checks and so on.
> > >
> > > I knew this table was not going to look like it is done typically
> > > here - but really my opinion that it is way better than hardcoding
> > > it into some weird algorithm, which would be hard to compare to
> > > initial table, which is already strange enough.
> > >
> > > If you don't like it - could you please explain why this is exactly
> > > worse than having long functions with hardcoded checks?
> >
> > I think we should be able to encode this simply using the
> > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > is the distance of each pipe from each dbuf slice. Then all we have
> > to do is minimize the total distance. The following tables I think
> > should encode that reasonably concisely:
> >
> > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > static const u8 icl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > };
> >
> > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > static const u8 tgl_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > };
>
> I think you are missing my point. I will explain why.
> My initial idea was similar, use some kind of distance
> metrics and encode everything in a smart way.
> Current BSpec table has sometimes slices swapped for no reason.
>
> There are seem to be some hardware constraints we are not aware
> of. For example for TGL pipes A,B get S2 and S1 slices,

A/S1 + B/S2 = 1 + 4 = 5
A/S2 + B/S1 = 2 + 1 = 3

> but pipes A C get S1 and S2.

A/S1 + C/S2 = 1 + 1 = 2
A/S2 + C/S1 = 2 + 2 = 4

So it's exactly as my distance table says it should be.

>
> if you encode all this into some kind of distance algorithm,
> those differences would be enourmously difficult to spot and
> compare with actual table. That is why I decided to do it that
> way.
>
> Most likely yes you can solve it by using this metrics,
> but will it really be then more readable and comparable
> with reference table?
> What if some contraints will be added again, like pipe ratio?
>
> >
> > /*
> >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> >  */
> > static const u8 future_dbuf_distance[][] {
> >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > };
> >
> > I suppose we could make that even more minimal by just
> > directly sticking that "Pipe and DBUF ordering" language
> > into a single array but and then calculate each distance
> > from it, but that is perhaps more work, and we'd need
> > some kind of enum just for that so that we have a single
> > namespace for the pipes and dbuf slices. So i have a
> > feeling having those distances pre-computed in a table
> > like thet is maybe a better way to go.
>
> Good luck then spoting differences with the BSpec table once it
> gets changed.

Well, I'd rather do that than decode through layers of cpp macros.
Also atm the table has no extra restrictions so it's clearly there
just because someone thought they'd write out the results of the
distance calculations. In general bspec style is somewhat annoying
where they try to write out things in the lowest level of absraction
possible (they are hardware people after all). I'd much prefer a
higher level description of the issue.

>
> >
> > >
> > > I think this is pretty stupid to fix stuff which looks new or
> > > unusual
> > > just because somebody doesn't like it, without even constructive
> > > arguing.
> > >
> > > - Stan
> > >
> > > >
> > > >
> > > > > +
> > > > > +/*
> > > > > + * This function
> > > >
> > > > "This function" is totally redundant. Ditto elsewhere.
> > > >
> > > > > finds an entry with same enabled pipe configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > > +                                  u32 active_pipes,
> > > > > +                                  const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     int i;
> > > > > +     uint_fixed_16_16_t pipe_ratio;
> > > > > +
> > > > > +     /*
> > > > > +      * Calculate pipe ratio as stated in BSpec 28692
> > > > > +      */
> > > > > +     pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > > +
> > > > > +     for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > > +             if (icl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > {
> > > > > +                     if (hweight32(active_pipes) == 1) {
> > > > > +                             /*
> > > > > +                              * According to BSpec 12716: if
> > > > > ratio
> > > > > > = 88.8
> > > > >
> > > > > +                              * always use single dbuf
> > > > > slice.
> > > > > +                              */
> > > > > +                             if (pipe_ratio.val >=
> > > > > div_fixed16(888,
> > > > > 10).val)
> > > > > +                                     ++i;
> > > > > +                     }
> > > > > +                     return
> > > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +             }
> > > > > +     }
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * This function finds an entry with same enabled pipe
> > > > > configuration and
> > > > > + * returns correspondent DBuf slice mask as stated in BSpec
> > > > > for
> > > > > particular
> > > > > + * platform.
> > > > > + */
> > > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > > +                                  u32 active_pipes,
> > > > > +                                  const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     int i;
> > > > > +
> > > > > +     for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > > +             if (tgl_allowed_dbufs[i].active_pipes ==
> > > > > active_pipes)
> > > > > +                     return
> > > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > +     }
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > *dev_priv,
> > > > > +                                   int pipe, u32
> > > > > active_pipes,
> > > > > +                                   const struct
> > > > > intel_crtc_state
> > > > > *crtc_state)
> > > > > +{
> > > > > +     if (IS_GEN(dev_priv, 11))
> > > > > +             return icl_get_allowed_dbuf_mask(pipe,
> > > > > +                                              active_pipes,
> > > > > +                                              crtc_state);
> > > > > +     else if (IS_GEN(dev_priv, 12))
> > > > > +             return tgl_get_allowed_dbuf_mask(pipe,
> > > > > +                                              active_pipes,
> > > > > +                                              crtc_state);
> > > > > +     /*
> > > > > +      * For anything else just return one slice yet.
> > > > > +      * Should be extended for other platforms.
> > > > > +      */
> > > > > +     return DBUF_S1_BIT;
> > > > > +}
> > > > > +
> > > > >  static u64
> > > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > > *crtc_state,
> > > > >                                u64 *plane_data_rate)
> > > > > --
> > > > > 2.17.1
> > > >
> > > >
> >
> >

--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-19 13:24               ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-19 13:24 UTC (permalink / raw)
  To: ville.syrjala; +Cc: Nikula, Jani, intel-gfx

On Mon, 2019-11-18 at 20:25 +0000, Lisovskiy, Stanislav wrote:

P.S: (In addition to my last yesterday letter):

> 
> That is actually a violation of BSpec, we are not using two slices
> for same
> pipe while we could. We had already enough of bw issues, why should
> we make our life even harder?
> 
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > 
> > > > > My eyes!
> > > > > 
> > > > > I have to think we should be able to reduce all that to a
> > > > > handful
> > > > > of lines of code.
> > > > 
> > > > Yeah, then we are going to have a huge function with lots of
> > > > weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > 
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > > 
> > > > Would you prefer to have it like
> > > > 
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > > 
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird
> > > > looking
> > > > if-else statements, GEN checks and so on.
> > > > 
> > > > I knew this table was not going to look like it is done
> > > > typically
> > > > here - but really my opinion that it is way better than
> > > > hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > > 
> > > > If you don't like it - could you please explain why this is
> > > > exactly
> > > > worse than having long functions with hardcoded checks?
> > > 
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify
> > > what
> > > is the distance of each pipe from each dbuf slice. Then all we
> > > have
> > > to do is minimize the total distance. The following tables I
> > > think
> > > should encode that reasonably concisely:
> > > 
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > > 
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> > 
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> > 
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> > A/S1 + B/S2 = 1 + 4 = 5
> > A/S2 + B/S1 = 2 + 1 = 3
> > but pipes A C get S1 and S2.
> > A/S1 + C/S2 = 1 + 1 = 2
> > A/S2 + C/S1 = 2 + 2 = 4
> > So it's exactly as my distance table says it should be.
> > 
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> > 
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> > 
> > > 
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > > 

One more thing I didn't mention yesterday - this metric matrix simply
reflects the affinity/cost of using different DBuf slices by different
Pipes. Sure it will help you to figure out which slice is preferrable
to use.
However you will still have to map it properly to overall active 
pipes mask, which will require you to account which slices are already
used, how many pipes can use a single slice and so on.

See:
I.e if you have lets say pipes A,B(just looked down to my table to know
the expected conf - so convenient), you will need to get S1, S2 for
pipes A, B respectively.
So from your distance metric table for ICL you will first figure out
that for pipe A you prefer S1, and for pipe B you prefer... also S1
as it has lower metric than S2, despite that you should have used S1
and S2 here according to BSpec Then you will have to write some
function that will iterate and count somehow remaining unused slices,
taking into consideration that despite S1 is closer to pipe B, you have
to choose S2 if there are two pipes.
And if there are 3 pipes - you have to... choose again S1 for pipe A,
S1 for pipe B(so you can't even mark S1 as used to simplify things) and
S2 for pipe C.

Interesting to see how you will going to encode this check. Sure I know
you will find some short way. But this will be for sure some runtime
cycle(s).

While I'm just getting this mask immediately in compile time.
While I can immediately compare it to BSpec.

But sure metric matrix sounds more fancy( I like this graph theory
stuff)

What are we arguing here about? Even if my code is not perfect here
my approach at least makes sense and could be even better than yours,
got r-b, BAT/IGT green. 

So Ville, what is the actual point of this kind of unexpected code
review slamming? Could be I just annoyed you too much asking questions
:)

Do we actually share opinions here or there is only one single correct
opinion and perfect approach?

There are warnings yep(just as in many other patches here) - sure will
fix.

Best Regards,

Lisovskiy Stanislav

> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> > 
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> > Well, I'd rather do that than decode through layers of cpp macros.
> > Also atm the table has no extra restrictions so it's clearly there
> > just because someone thought they'd write out the results of the
> > distance calculations. In general bspec style is somewhat annoying
> > where they try to write out things in the lowest level of
> > absraction
> > possible (they are hardware people after all). I'd much prefer a
> > higher level description of the issue.
> 
> I would prefer to see immediately what is different as this table
> makes it possible right away
> Also to me this table simply intuitively more understandable.
> 
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> 
> We have similar CDCLK table in intel_cdclk, while you could 
> also argue that we might had come up with some formula.
> 
> Could be we could make the table approach itself simpler - 
> that is quite possible, but implementing a distance algorithm
> seems to me to be a dangerous idea, despite that it of course
> looks better to me(yes it is).
> 
> Best Regards,
> 
> Lisovskiy Stanislav
> 
> Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7,
> 02160 Espoo
> 
> ________________________________________
> From: Ville Syrjälä [ville.syrjala@linux.intel.com]
> Sent: Monday, November 18, 2019 8:18 PM
> To: Lisovskiy, Stanislav
> Cc: Saarinen, Jani; intel-gfx@lists.freedesktop.org; Roper, Matthew
> D; maarten.lankhorst@linux.intel.com; Ausmus, James
> Subject: Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL
> and TGL
> 
> On Mon, Nov 18, 2019 at 05:17:51PM +0000, Lisovskiy, Stanislav wrote:
> > On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> > > On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav
> > > wrote:
> > > > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > > > wrote:
> > > > > > Also implemented algorithm for choosing DBuf slice
> > > > > > configuration
> > > > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > > > 
> > > > > > Now pipe allocation still stays proportional to pipe width
> > > > > > as
> > > > > > before,
> > > > > > however within allowed DBuf slice for this particular
> > > > > > configuration.
> > > > > > 
> > > > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > > > might
> > > > > >     now differ from hw state.
> > > > > > 
> > > > > > v3: - Added new field "supported_slices" to ddb to separate
> > > > > > max
> > > > > >       supported slices vs currently enabled, to avoid
> > > > > > confusion.
> > > > > >     - Removed obsolete comments and code related to
> > > > > > DBuf(Matthew
> > > > > > Roper).
> > > > > >     - Some code style and long line removal(Matthew Roper).
> > > > > >     - Added WARN_ON to new ddb range offset calc
> > > > > > function(Matthew
> > > > > > Roper).
> > > > > >     - Removed platform specific call to calc pipe ratio
> > > > > > from
> > > > > > ddb
> > > > > >       allocation function and fixed the return
> > > > > > value(Matthew
> > > > > > Roper)
> > > > > >     - Refactored DBUF slice allocation table to improve
> > > > > > readability
> > > > > >     - Added DBUF slice allocation for TGL as well.
> > > > > >     - ICL(however not TGL) seems to voluntarily enable
> > > > > > second
> > > > > > DBuf
> > > > > > slice
> > > > > >       after pm suspend/resume causing a mismatch failure,
> > > > > > because
> > > > > > we
> > > > > >       update DBuf slices only if we do a modeset, however
> > > > > > this
> > > > > > check
> > > > > >       is done always. Fixed it to be done only when modeset
> > > > > > for
> > > > > > ICL.
> > > > > > 
> > > > > > v4: - Now enabled slices is not actually a number, but a
> > > > > > bitmask,
> > > > > >       because we might need to enable second slice only and
> > > > > > number
> > > > > >       of slices would still 1 and that current code doesn't
> > > > > > allow.
> > > > > >     - Remove redundant duplicate code to have some unified
> > > > > > way
> > > > > > of
> > > > > >       enabling dbuf slices instead of hardcoding.
> > > > > > 
> > > > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was
> > > > > > naively
> > > > > > thinking
> > > > > >       that we have only one DBUF_CTL slice. Now another
> > > > > > version
> > > > > > for
> > > > > >       gen11 will check both slices as only second one can
> > > > > > be
> > > > > > enabled,
> > > > > >       to keep CI happy.
> > > > > > 
> > > > > > v6: - Removed enabled dbuf assertion completely. Previous
> > > > > > code
> > > > > >       was keeping dbuf enabled, even(!) when _dbuf_disable
> > > > > > is
> > > > > > called.
> > > > > >       Currently we enable/disable dbuf slices based on
> > > > > > requrement
> > > > > >       so no point in asserting this here.
> > > > > >     - Removed unnecessary modeset check from
> > > > > > verify_wm_state(Matthew Roper)
> > > > > >     - Slices intersection after union is same as final
> > > > > > result(Matthew Roper)
> > > > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > > > >     - Some macros added(Matthew Roper)
> > > > > >     - ddb range is now always less or equal to ddb size -
> > > > > > no
> > > > > > need
> > > > > > for additional
> > > > > >       checks(previously needed as we had some bandwidth
> > > > > > checks
> > > > > > in
> > > > > > there which
> > > > > >       could "suddenly" return ddb_size smaller than it is.
> > > > > > 
> > > > > > v7: Minor costemic changes:
> > > > > >     - Changed icl_dbuf_slices_restore name to
> > > > > > icl_program_dbuf_slices
> > > > > >       as it more corresponds to the actual functionality.
> > > > > >     - Some simplification with supported slices for BXT and
> > > > > > GLK
> > > > > >     - skl_pipe_downscale_amount ->
> > > > > > icl_pipe_downscale_amount as
> > > > > >       this is not used for skl anymore.
> > > > > >     - Changed DBuf slice assignment order for TGL case
> > > > > > 
> > > > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > > > Signed-off-by: Stanislav Lisovskiy <
> > > > > > stanislav.lisovskiy@intel.com>
> > > > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > > > ++++++++++++++++--
> > > > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > index 876fc25968bf..bd7aff675198 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > > > intel_crtc *crtc,
> > > > > > 
> > > > > >       if (INTEL_GEN(dev_priv) >= 11 &&
> > > > > >           hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > > > -             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > > %u, got
> > > > > > %u)\n",
> > > > > > +             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > > %x, got
> > > > > > %x)\n",
> > > > > >                         sw_ddb->enabled_slices,
> > > > > >                         hw->ddb.enabled_slices);
> > > > > > 
> > > > > > @@ -14614,15 +14614,24 @@ static void
> > > > > > skl_commit_modeset_enables(struct intel_atomic_state
> > > > > > *state)
> > > > > >       u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > > 
> > > > > >       u8 required_slices = state-
> > > > > > > wm_results.ddb.enabled_slices;
> > > > > > 
> > > > > >       struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > > > +     u8 slices_union = hw_enabled_slices |
> > > > > > required_slices;
> > > > > > +     u8 slices_intersection = required_slices;
> > > > > > 
> > > > > >       for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > > > old_crtc_state, new_crtc_state, i)
> > > > > >               /* ignore allocations for crtc's that have
> > > > > > been
> > > > > > turned
> > > > > > off. */
> > > > > >               if (new_crtc_state->hw.active)
> > > > > >                       entries[i] = old_crtc_state-
> > > > > > > wm.skl.ddb;
> > > > > > 
> > > > > > -     /* If 2nd DBuf slice required, enable it here */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > > > +                   required_slices, hw_enabled_slices);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice required, enable it here,
> > > > > > +      * as here we only add more slices, but not remove to
> > > > > > prevent
> > > > > > +      * issues if somebody still uses those.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_union);
> > > > > > 
> > > > > >       /*
> > > > > >        * Whenever the number of active pipes changes, we
> > > > > > need
> > > > > > to make
> > > > > > sure we
> > > > > > @@ -14681,9 +14690,12 @@ static void
> > > > > > skl_commit_modeset_enables(struct intel_atomic_state
> > > > > > *state)
> > > > > >               }
> > > > > >       } while (progress);
> > > > > > 
> > > > > > -     /* If 2nd DBuf slice is no more required disable it
> > > > > > */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > > here,
> > > > > > +      * as here we only remove slices, which are not
> > > > > > needed
> > > > > > anymore.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_intersection);
> > > > > 
> > > > > This is still a totally wrong place for this. We need to be
> > > > > sure
> > > > > the
> > > > > planes have switched over to the new ddb layout before we
> > > > > power
> > > > > off
> > > > > anything.
> > > > 
> > > > I have just modified already existing line here, not added by
> > > > me.
> > > > The point of this patch is starting to use second DBuf slice
> > > > but
> > > > not fixing all kinds of issues related to DBuf code at the same
> > > > time.
> > > 
> > > You're re-enabling code I disabled on purpose. It has to be fixed
> > > properly or not touched at all. In hindsight I should have
> > > probably
> > > just ripped it all out to not give people bad ideas.
> > 
> > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > 
> > I'm not reenabling it - this is code is out there right now.
> 
> It's _dead_ code, and it's like that on purpose. You're now
> attempting
> to re-enable it without fixing the underlying issues.
> 
> > This is all again about changing all kinds of stuff at the same
> > time.
> > 
> > Why we can't just do things step by step? You have a point that
> > this is wrong place ok, but as we see current code causes no
> > regressions. Just tell me your idea where to put it and I will
> > do that, if you don't have it yet - let it stay there, if CI shows
> > no regressions currently.
> 
> The same global state tricks apply here as with the cdclk/sagv stuff.
> 1. enable new resources/bump up clocks
> 2. reconfigure planes/etc.
> 3. wait for vblanks
> 4. disable unused resources/drop the clocks
> 
> > 
> > > 
> > > > 
> > > > Of course this is nice idea to fix it - if that is wrong plane,
> > > > but current code somehow lives with it and probably this should
> > > > be
> > > > with some other patch.
> > > > 
> > > > > 
> > > > > >  }
> > > > > > 
> > > > > >  static void intel_atomic_helper_free_state(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > diff --git
> > > > > > a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > @@ -1031,15 +1031,6 @@ static bool
> > > > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >               (I915_READ(DC_STATE_EN) &
> > > > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > > > >  }
> > > > > > 
> > > > > > -static void gen9_assert_dbuf_enabled(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > -{
> > > > > > -     u32 tmp = I915_READ(DBUF_CTL);
> > > > > > -
> > > > > > -     WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST))
> > > > > > !=
> > > > > > -          (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > > > -          "Unexpected DBuf power power state (0x%08x)\n",
> > > > > > tmp);
> > > > > > -}
> > > > > > -
> > > > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > >       struct intel_cdclk_state cdclk_state = {};
> > > > > > @@ -1055,8 +1046,6 @@ static void
> > > > > > gen9_disable_dc_states(struct
> > > > > > drm_i915_private *dev_priv)
> > > > > >       /* Can't read out voltage_level so can't use
> > > > > > intel_cdclk_changed() */
> > > > > >       WARN_ON(intel_cdclk_needs_modeset(&dev_priv-
> > > > > > >cdclk.hw,
> > > > > > &cdclk_state));
> > > > > > 
> > > > > > -     gen9_assert_dbuf_enabled(dev_priv);
> > > > > > -
> > > > > >       if (IS_GEN9_LP(dev_priv))
> > > > > >               bxt_verify_ddi_phy_power_wells(dev_priv);
> > > > > > 
> > > > > > @@ -4254,31 +4243,51 @@ static void
> > > > > > gen9_dbuf_disable(struct
> > > > > > drm_i915_private *dev_priv)
> > > > > >       intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > > > >  }
> > > > > > 
> > > > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > +int intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     if (INTEL_GEN(dev_priv) < 11)
> > > > > > -             return 1;
> > > > > > -     return 2;
> > > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > > +}
> > > > > 
> > > > > Kinda pointless function now. 'supported_slices' is not a
> > > > > very
> > > > > good
> > > > > name. Should be 'num_dbuf_slices' or something along those
> > > > > lines.
> > > > 
> > > > Supported slices really means how much slices we support in
> > > > hardware.
> > > > num_dbuf_slices will not reflect that, it will be more
> > > > ambigious.
> > > > I don't like this name.
> > > 
> > > supported_slices doesn't even say "dbuf" anywhere. So it's even
> > > more
> > > ambiguous. Also we already have num_pipes, num_sprites, etc.
> > 
> > ..and in current code we have... "enabled_slices", no? Ouch! :) IS
> > this
> > really that important and worth arguing at the very last moment?
> > whether number of
> > supported_slices_in_hardware, should be named "supported slices" or
> > "num_dbuf_slices"? If yes, lets go on, miss some targets and
> > continue
> > discussing this "soooo important" thing.
> > 
> > Some one might come up that it is called "num_slices" or
> > "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
> > keep everyones bloated egos happy??
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > +{
> > > > > > +     const u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > 
> > > > > Seems a rather pointless variable. const at least is
> > > > > pointless.
> > > > > 
> > > > > > +
> > > > > > +     icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > > > >  }
> > > > > > 
> > > > > >  void icl_dbuf_slices_update(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                           u8 req_slices)
> > > > > >  {
> > > > > > -     const u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > > 
> > > > > > -     bool ret;
> > > > > > +     bool ret = true;
> > > > > > +     int i;
> > > > > > +     int max_slices = intel_dbuf_max_slices(dev_priv);
> > > > > > 
> > > > > > -     if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > > > +     if (hweight8(req_slices) >
> > > > > > intel_dbuf_max_slices(dev_priv)) {
> > > > > >               DRM_ERROR("Invalid number of dbuf slices
> > > > > > requested\n");
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > -     if (req_slices == hw_enabled_slices || req_slices ==
> > > > > > 0)
> > > > > > -             return;
> > > > > > +     DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > > > req_slices);
> > > > > > 
> > > > > > -     if (req_slices > hw_enabled_slices)
> > > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > > DBUF_CTL_S2,
> > > > > > true);
> > > > > > -     else
> > > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > > DBUF_CTL_S2,
> > > > > > false);
> > > > > > +     for (i = 0; i < max_slices; i++) {
> > > > > > +             int slice_bit = BIT(i);
> > > > > > +             bool slice_set = (slice_bit & req_slices) !=
> > > > > > 0;
> > > > > > +
> > > > > > +             switch (slice_bit) {
> > > > > > +             case DBUF_S1_BIT:
> > > > > > +                     ret = ret &&
> > > > > > intel_dbuf_slice_set(dev_priv,
> > > > > > +                                                       DBU
> > > > > > F_
> > > > > > CTL_S1,
> > > > > > +                                                       sli
> > > > > > ce
> > > > > > _set);
> > > > > > +                     break;
> > > > > > +             case DBUF_S2_BIT:
> > > > > > +                     ret = ret &&
> > > > > > intel_dbuf_slice_set(dev_priv,
> > > > > > +                                                       DBU
> > > > > > F_
> > > > > > CTL_S2,
> > > > > > +                                                       sli
> > > > > > ce
> > > > > > _set);
> > > > > 
> > > > > The return value on intel_dbuf_slice_set() seems pointless.
> > > > > I'd
> > > > > nuke
> > > > > it.
> > > > > 
> > > > > Also as long you have intel_dbuf_slice_set() I'd probably
> > > > > just
> > > > > pass
> > > > > the
> > > > > slice index there and let it deal with the register details.
> > > > > DBUF_CTL
> > > > > should probably be parametrized as well so you don't need
> > > > > annoying
> > > > > switch
> > > > > statements.
> > > > 
> > > > What if it fails? Shouldn't we still have some clue what state
> > > > do
> > > > we
> > > > have/had at the moment?
> > > 
> > > It if fails you log an error and keep going.
> > 
> > Ah sure. And let hw/sw state be completely different.
> > Also see next comment about this.
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +                     break;
> > > > > > +             default:
> > > > > > +                     MISSING_CASE(slice_bit);
> > > > > > +             }
> > > > > > +     }
> > > > > > 
> > > > > >       if (ret)
> > > > > >               dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > > > req_slices;
> > > > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > > 
> > > > > >  static void icl_dbuf_enable(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that we only have 1
> > > > > > slice,
> > > > > > see
> > > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Just power up 1 slice, we will
> > > > > > +      * figure out later which slices we have and what we
> > > > > > need.
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > > 
> > > > > That stuff really shouldn't be modified by low level
> > > > > functions.
> > > > > The intial value should come from readout, and after that it
> > > > > should
> > > > > be adjusted as part of the overall atomic state.
> > > > 
> > > > Well, as you can see currently this is even worse - has some
> > > > weird
> > > > hardcoded stuff and is modified like that. Also if we are
> > > > enabling
> > > > the
> > > > dbuf slices, shouldn't we really determine what value should be
> > > > there?
> > > > 
> > > > Also I wouldn't really always trust the readout, as to me it
> > > > seems
> > > > icelake resets it constantly to 2 after suspend/resume.
> > > 
> > > The point of readout is that we syncrhonize our software state
> > > with
> > > the
> > > current state of hardware without having to modify the hardware
> > > state.
> > > Once synchronized the driver takes over and tells the hw what to
> > > do.
> > 
> > That is especially nice after you say "report error and don't care"
> > when I really want to reflect hw state in sw state, by checking
> > whether
> > we could set dbuf slices or not.
> > Now you say that we need to synchronize software state with hw
> > state,
> > however if I fail to modify it - then "who cares".
> 
> Yes. If you fail to modify the hardware state the whole thing is
> broken anyway. There's is *nothing* we can do at this stage in the
> commit to fix it.
> 
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > > >  }
> > > > > > 
> > > > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > > > ~DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > > > ~DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power disable timeout!\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that the first
> > > > > > slice
> > > > > > is
> > > > > > always
> > > > > > -              * enabled, see
> > > > > > intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Disable all slices
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > > >  }
> > > > > > 
> > > > > >  static void icl_mbus_init(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > diff --git
> > > > > > a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > > > drm_i915_private *i915,
> > > > > >       for ((wf) = intel_display_power_get((i915),
> > > > > > (domain));
> > > > > > (wf); \
> > > > > >            intel_display_power_put_async((i915), (domain),
> > > > > > (wf)),
> > > > > > (wf) = 0)
> > > > > > 
> > > > > > +int intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv);
> > > > > >  void icl_dbuf_slices_update(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                           u8 req_slices);
> > > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > > *dev_priv);
> > > > > > 
> > > > > >  void chv_phy_powergate_lanes(struct intel_encoder
> > > > > > *encoder,
> > > > > >                            bool override, unsigned int
> > > > > > mask);
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > > @@ -51,6 +51,7 @@
> > > > > >  #include "display/intel_bw.h"
> > > > > >  #include "display/intel_cdclk.h"
> > > > > >  #include "display/intel_display_types.h"
> > > > > > +#include "display/intel_display_power.h"
> > > > > >  #include "display/intel_dp.h"
> > > > > >  #include "display/intel_fbdev.h"
> > > > > >  #include "display/intel_hotplug.h"
> > > > > > @@ -2588,6 +2589,10 @@ static int
> > > > > > intel_runtime_resume(struct
> > > > > > device *kdev)
> > > > > >       if (IS_VALLEYVIEW(dev_priv) ||
> > > > > > IS_CHERRYVIEW(dev_priv))
> > > > > >               ret = vlv_resume_prepare(dev_priv, true);
> > > > > > 
> > > > > > +     /* Weird hack to fix ICL hardware bug, as it resets
> > > > > > DBUF slices
> > > > > > reg */
> > > > > > +     if (INTEL_GEN(dev_priv) == 11)
> > > > > > +             icl_program_dbuf_slices(dev_priv);
> > > > > > +
> > > > > >       intel_uncore_runtime_resume(&dev_priv->uncore);
> > > > > > 
> > > > > >       intel_runtime_pm_enable_interrupts(dev_priv);
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > @@ -804,7 +804,7 @@ static inline bool
> > > > > > skl_ddb_entry_equal(const
> > > > > > struct skl_ddb_entry *e1,
> > > > > >  }
> > > > > > 
> > > > > >  struct skl_ddb_allocation {
> > > > > 
> > > > > This struct is quite pointless. Should be killed off.
> > > > 
> > > > Legacy stuff, not added by me. Quite fine with removing it -
> > > > but should we really do all this at the same time and in same
> > > > patch?
> > > 
> > > Not the same patch. But clean up the cobwebs first, then add the
> > > new
> > > stuff. Much easier that way around.
> > 
> > There is a shit ton load of cobwebs to clean. Personally my opinion
> > is
> > that current dbuf controlling code is total garbage - which has
> > lots of
> > hardcoded stuff. Good luck enabling only second dbuf slice, good
> > luck
> > disabling/enabling dbuf properly....
> > Where you were, when this was committed then?
> 
> Looking at something else unfortunately. Also it has just kept
> bitrotting since because no one has cared.
> 
> > 
> > If we need to fix it - fine. But may be we should do it in stages?
> > Point of this patch is "enable second dbuf slice", next patch
> > "remove
> > unneeded structs" and so on.
> > While you want me to shuffle everything at the same time, which is
> > not
> > good idea, especially considering complexity and amount of legacy
> > stuff
> > we currently have.
> 
> Building new stuff on top of a decomposing garbage pile is not
> a good idea: a) it's harder than building on clean foundation,
> b) it'll probably fall over and get buried in the old garbage
> at which point you're likely going to be right back where you
> started.
> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > -     u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > > > */
> > > > > > +     u8 enabled_slices;   /* Bitmask of currently enabled
> > > > > > DBuf
> > > > > > slices */
> > > > > >  };
> > > > > > 
> > > > > >  struct skl_ddb_values {
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > index 1bb701d32a5d..8e86af505730 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > > > intel_cherryview_info = {
> > > > > >       .has_gt_uc = 1, \
> > > > > >       .display.has_hdcp = 1, \
> > > > > >       .display.has_ipc = 1, \
> > > > > > -     .ddb_size = 896
> > > > > > +     .ddb_size = 896, \
> > > > > > +     .supported_slices = 1
> > > > > > 
> > > > > >  #define SKL_PLATFORM \
> > > > > >       GEN9_FEATURES, \
> > > > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > > > intel_skylake_gt4_info = {
> > > > > >       GEN(9), \
> > > > > >       .is_lp = 1, \
> > > > > >       .display.has_hotplug = 1, \
> > > > > > +     .supported_slices = 1, \
> > > > > >       .engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > > > BIT(VECS0),
> > > > > > \
> > > > > >       .pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
> > > > > > \
> > > > > >       .has_64bit_reloc = 1, \
> > > > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > > > intel_coffeelake_gt3_info = {
> > > > > >       GEN9_FEATURES, \
> > > > > >       GEN(10), \
> > > > > >       .ddb_size = 1024, \
> > > > > > +     .supported_slices = 1, \
> > > > > >       .display.has_dsc = 1, \
> > > > > >       .has_coherent_ggtt = false, \
> > > > > >       GLK_COLORS
> > > > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > > > intel_cannonlake_info = {
> > > > > >       }, \
> > > > > >       GEN(11), \
> > > > > >       .ddb_size = 2048, \
> > > > > > +     .supported_slices = 2, \
> > > > > >       .has_logical_ring_elsq = 1, \
> > > > > >       .color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > > > 262145 }
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > > > index a607ea520829..fba5731063d8 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > > @@ -7734,9 +7734,11 @@ enum {
> > > > > >  #define DISP_ARB_CTL2        _MMIO(0x45004)
> > > > > >  #define  DISP_DATA_PARTITION_5_6     (1 << 6)
> > > > > >  #define  DISP_IPC_ENABLE             (1 << 3)
> > > > > > -#define DBUF_CTL     _MMIO(0x45008)
> > > > > > -#define DBUF_CTL_S1  _MMIO(0x45008)
> > > > > > -#define DBUF_CTL_S2  _MMIO(0x44FE8)
> > > > > > +#define DBUF_S1_BIT                  BIT(0)
> > > > > > +#define DBUF_S2_BIT                  BIT(1)
> > > > > 
> > > > > These are not bits inside DBUF_CTL so they should not be
> > > > > here.
> > > > > I'd just nuke them entirely. If you think we need more than
> > > > > an
> > > > > int to refer to a slice then IMO something simple like
> > > > > 
> > > > > enum dbuf_slice {
> > > > >         DBUF_S1,
> > > > >         DBUF_S2,
> > > > > };
> > > > > 
> > > > > would match the general apporach we use for other things much
> > > > > better
> > > > > rather than defining some shifted values.
> > > > 
> > > > We need to combine slices - so bitmask is quite suitable for
> > > > that,
> > > > I can of course use this enum, but then I'll have to re-encode
> > > > it
> > > > to bitmask anyways.
> > > 
> > > Which is pretty much how everything else works in the driver.
> > 
> > Another piece of incredibly important thing to argue..
> 
> It is. Lot's of crap in the driver is simply caused by people not
> following established style and coming up with something different..
> Which means there is an extra cost for everyone else when reading the
> code. And that probably means most people just don't bother to read
> it.
> 
> Granted, some of that is due to spurts in evolution where major
> overhauls
> changed the architecture so significantly that the old established
> conventions simply made node sense anymore. Atomic being the last big
> one in that regard. And we're still cleaning up the resulting mess.
> 
> > 
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > > +#define DBUF_CTL                     _MMIO(0x45008)
> > > > > > +#define DBUF_CTL_S1                  _MMIO(0x45008)
> > > > > > +#define DBUF_CTL_S2                  _MMIO(0x44FE8)
> > > > > >  #define  DBUF_POWER_REQUEST          (1 << 31)
> > > > > >  #define  DBUF_POWER_STATE            (1 << 30)
> > > > > >  #define GEN7_MSG_CTL _MMIO(0x45010)
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > > > >       } display;
> > > > > > 
> > > > > >       u16 ddb_size; /* in blocks */
> > > > > > +     u8 supported_slices; /* number of DBuf slices */
> > > > > > 
> > > > > >       /* Register offsets for the various display pipes and
> > > > > > transcoders */
> > > > > >       int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > > index 2d389e437e87..1ea81ab92429 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > > > drm_device
> > > > > > *dev)
> > > > > >       return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > > > >  }
> > > > > > 
> > > > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > +static u8 intel_enabled_dbuf_slices(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     u8 enabled_slices;
> > > > > > -
> > > > > > -     /* Slice 1 will always be enabled */
> > > > > > -     enabled_slices = 1;
> > > > > > +     u8 enabled_slices = 0;
> > > > > > 
> > > > > >       /* Gen prior to GEN11 have only one DBuf slice */
> > > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > > -             return enabled_slices;
> > > > > > +             return DBUF_S1_BIT;
> > > > > 
> > > > > I'd nuke this special case entirely and just read everything
> > > > > from
> > > > > the
> > > > > hw. Assuming we need readout somewhere.
> > > > 
> > > > We have it that way hardcoded right now in drm-tip. I think
> > > > this is
> > > > a
> > > > subject to other patch, but not this change. As I understand we
> > > > need
> > > > to first figure out how to do it properly.
> > > > This patch anyway makes it better by at least removing
> > > > hardcoded
> > > > relation between number of slices and which slices are enabled.
> > > > With current code there is no way for example to enable second
> > > > DBuf
> > > > slice only, which is required by BSpec.
> > > > 
> > > > > 
> > > > > > 
> > > > > > -     /*
> > > > > > -      * FIXME: for now we'll only ever use 1 slice;
> > > > > > pretend
> > > > > > that we
> > > > > > have
> > > > > > -      * only that 1 slice enabled until we have a proper
> > > > > > way
> > > > > > for on-
> > > > > > demand
> > > > > > -      * toggling of the second slice.
> > > > > > -      */
> > > > > > -     if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > > -             enabled_slices++;
> > > > > > +     /* Check if second DBuf slice is enabled */
> > > > > > +     if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > > > +             enabled_slices |= DBUF_S1_BIT;
> > > > > > +
> > > > > > +     if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > > +             enabled_slices |= DBUF_S2_BIT;
> > > > > > 
> > > > > >       return enabled_slices;
> > > > > >  }
> > > > > > @@ -3812,36 +3808,38 @@ static u16
> > > > > > intel_get_ddb_size(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >                             const int num_active,
> > > > > >                             struct skl_ddb_allocation *ddb)
> > > > > >  {
> > > > > > -     const struct drm_display_mode *adjusted_mode;
> > > > > > -     u64 total_data_bw;
> > > > > >       u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > > > -
> > > > > >       WARN_ON(ddb_size == 0);
> > > > > > 
> > > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > >               return ddb_size - 4; /* 4 blocks for bypass
> > > > > > path
> > > > > > allocation */
> > > > > > 
> > > > > > -     adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > > > -     total_data_bw = total_data_rate *
> > > > > > drm_mode_vrefresh(adjusted_mode);
> > > > > > +     return ddb_size;
> > > > > > +}
> > > > > > 
> > > > > > -     /*
> > > > > > -      * 12GB/s is maximum BW supported by single DBuf
> > > > > > slice.
> > > > > > -      *
> > > > > > -      * FIXME dbuf slice code is broken:
> > > > > > -      * - must wait for planes to stop using the slice
> > > > > > before
> > > > > > powering it off
> > > > > > -      * - plane straddling both slices is illegal in
> > > > > > multi-
> > > > > > pipe
> > > > > > scenarios
> > > > > > -      * - should validate we stay within the hw bandwidth
> > > > > > limits
> > > > > > -      */
> > > > > > -     if (0 && (num_active > 1 || total_data_bw >=
> > > > > > GBps(12)))
> > > > > > {
> > > > > > -             ddb->enabled_slices = 2;
> > > > > > -     } else {
> > > > > > -             ddb->enabled_slices = 1;
> > > > > > -             ddb_size /= 2;
> > > > > > -     }
> > > > > > +/*
> > > > > > + * Calculate initial DBuf slice offset, based on slice
> > > > > > size
> > > > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > > > enabled
> > > > > > + * offset would be 1024)
> > > > > > + */
> > > > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > > > dbuf_slice_mask,
> > > > > > +                                        u32 slice_size,
> > > > > > u32
> > > > > > ddb_size)
> > > > > > +{
> > > > > > +     u32 offset = 0;
> > > > > > 
> > > > > > -     return ddb_size;
> > > > > > +     if (!dbuf_slice_mask)
> > > > > > +             return 0;
> > > > > > +
> > > > > > +     offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > > > +
> > > > > > +     WARN_ON(offset >= ddb_size);
> > > > > > +     return offset;
> > > > > >  }
> > > > > > 
> > > > > > +static u32 i915_get_allowed_dbuf_mask(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv,
> > > > > > +                                   int pipe, u32
> > > > > > active_pipes,
> > > > > > +                                   const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state);
> > > > > > +
> > > > > >  static void
> > > > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state,
> > > > > > @@ -3857,7 +3855,14 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >       u32 pipe_width = 0, total_width = 0,
> > > > > > width_before_pipe
> > > > > > = 0;
> > > > > >       enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > > > >       u16 ddb_size;
> > > > > > +     u32 ddb_range_size;
> > > > > >       u32 i;
> > > > > > +     u32 dbuf_slice_mask;
> > > > > > +     u32 active_pipes;
> > > > > > +     u32 offset;
> > > > > > +     u32 slice_size;
> > > > > > +     u32 total_slice_mask;
> > > > > > +     u32 start, end;
> > > > > > 
> > > > > >       if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > > > >               alloc->start = 0;
> > > > > > @@ -3866,14 +3871,23 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > -     if (intel_state->active_pipe_changes)
> > > > > > +     if (intel_state->active_pipe_changes) {
> > > > > >               *num_active = hweight8(intel_state-
> > > > > > > active_pipes);
> > > > > > 
> > > > > > -     else
> > > > > > +             active_pipes = intel_state->active_pipes;
> > > > > > +     } else {
> > > > > >               *num_active = hweight8(dev_priv-
> > > > > > >active_pipes);
> > > > > > +             active_pipes = dev_priv->active_pipes;
> > > > > > +     }
> > > > > > 
> > > > > >       ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > > > total_data_rate,
> > > > > >                                     *num_active, ddb);
> > > > > > 
> > > > > > +     DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > > > +
> > > > > > +     slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > > > > supported_slices;
> > > > > > 
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Got DBuf slice size %d\n",
> > > > > > slice_size);
> > > > > > +
> > > > > >       /*
> > > > > >        * If the state doesn't change the active CRTC's or
> > > > > > there is no
> > > > > >        * modeset request, then there's no need to
> > > > > > recalculate;
> > > > > > @@ -3891,20 +3905,70 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > +     /*
> > > > > > +      * Get allowed DBuf slices for correspondent pipe and
> > > > > > platform.
> > > > > > +      */
> > > > > > +     dbuf_slice_mask =
> > > > > > i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > for_pipe,
> > > > > > +                                                  active_p
> > > > > > ip
> > > > > > es,
> > > > > > crtc_state);
> > > > > 
> > > > > Quite a few redundant arguments. The function name looks
> > > > > alien.
> > > > > 
> > > > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > > > 
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active
> > > > > > pipes
> > > > > > %x\n",
> > > > > > +                   dbuf_slice_mask,
> > > > > > +                   for_pipe, active_pipes);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Figure out at which DBuf slice we start, i.e if we
> > > > > > start at
> > > > > > Dbuf S2
> > > > > > +      * and slice size is 1024, the offset would be 1024
> > > > > > +      */
> > > > > > +     offset =
> > > > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > > > +                                              slice_size,
> > > > > > ddb_size);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Figure out total size of allowed DBuf slices,
> > > > > > which
> > > > > > is
> > > > > > basically
> > > > > > +      * a number of allowed slices for that pipe
> > > > > > multiplied
> > > > > > by slice
> > > > > > size.
> > > > > > +      * Inside of this
> > > > > > +      * range ddb entries are still allocated in
> > > > > > proportion
> > > > > > to
> > > > > > display width.
> > > > > > +      */
> > > > > > +     ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > > > slice_size;
> > > > > > +
> > > > > >       /*
> > > > > >        * Watermark/ddb requirement highly depends upon
> > > > > > width
> > > > > > of the
> > > > > >        * framebuffer, So instead of allocating DDB equally
> > > > > > among
> > > > > > pipes
> > > > > >        * distribute DDB based on resolution/width of the
> > > > > > display.
> > > > > >        */
> > > > > > +     total_slice_mask = dbuf_slice_mask;
> > > > > 
> > > > > Isn't this already taken care by the loop below?
> > > > > 
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode
> > > > > > =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > > 
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > > 
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > > 
> > > > > total_slice_mask will now account only the crtcs in the
> > > > > state.
> > > > > What
> > > > > happens to the other pipes' slices?
> > 
> > Same as happens now btw. Right now it also does the same thing.
> 
> Right now it does absolutely nothing. It's all dead code.
> 
> > 
> > > > 
> > > > Yes we need to iterate through all pipes/crtcs here, however
> > > > for getting correct dbuf slice mask most important is
> > > > active_pipes.
> > > > However yes, I will change the loop for it to be more correct.
> > > > 
> > > > > 
> > > > > > +
> > > > > > +             /*
> > > > > > +              * Do not account pipes using other slice
> > > > > > sets
> > > > > > +              * luckily as of current BSpec slice sets do
> > > > > > not
> > > > > > partially
> > > > > > +              * intersect(pipes share either same one
> > > > > > slice
> > > > > > or same
> > > > > > slice set
> > > > > > +              * i.e no partial intersection), so it is
> > > > > > enough to
> > > > > > check for
> > > > > > +              * equality for now.
> > > > > > +              */
> > > > > > +             if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > > > +                     continue;
> > > > > > +
> > > > > >               drm_mode_get_hv_timing(adjusted_mode,
> > > > > > &hdisplay,
> > > > > > &vdisplay);
> > > > > >               total_width += hdisplay;
> > > > > > 
> > > > > > @@ -3914,8 +3978,19 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >                       pipe_width = hdisplay;
> > > > > >       }
> > > > > > 
> > > > > > -     alloc->start = ddb_size * width_before_pipe /
> > > > > > total_width;
> > > > > > -     alloc->end = ddb_size * (width_before_pipe +
> > > > > > pipe_width) /
> > > > > > total_width;
> > > > > > +     ddb->enabled_slices = total_slice_mask;
> > > > > > +
> > > > > > +     start = ddb_range_size * width_before_pipe /
> > > > > > total_width;
> > > > > > +     end = ddb_range_size * (width_before_pipe +
> > > > > > pipe_width)
> > > > > > /
> > > > > > total_width;
> > > > > > +
> > > > > > +     alloc->start = offset + start;
> > > > > > +     alloc->end = offset + end;
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > > > +                   alloc->start, alloc->end);
> > > > > > +     DRM_DEBUG_KMS("Enabled ddb slices mask %x num
> > > > > > supported
> > > > > > %d\n",
> > > > > > +                   ddb->enabled_slices,
> > > > > > +                   INTEL_INFO(dev_priv)-
> > > > > > >supported_slices);
> > > > > >  }
> > > > > > 
> > > > > >  static int skl_compute_wm_params(const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state,
> > > > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > > > intel_crtc *crtc,
> > > > > >  void skl_ddb_get_hw_state(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                         struct skl_ddb_allocation *ddb /*
> > > > > > out
> > > > > > */)
> > > > > >  {
> > > > > > -     ddb->enabled_slices =
> > > > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > > > +     ddb->enabled_slices =
> > > > > > intel_enabled_dbuf_slices(dev_priv);
> > > > > > +     DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > > > enabled_slices);
> > > > > > 
> > > > > >  }
> > > > > > 
> > > > > >  /*
> > > > > > @@ -4164,6 +4240,241 @@
> > > > > > skl_get_total_relative_data_rate(struct
> > > > > > intel_crtc_state *crtc_state,
> > > > > >       return total_data_rate;
> > > > > >  }
> > > > > > 
> > > > > > +static uint_fixed_16_16_t
> > > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > > &crtc_state-
> > > > > > > hw.adjusted_mode;
> > > > > > 
> > > > > > +
> > > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > > +
> > > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +
> > > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > > +}
> > > > > > +
> > > > > > +uint_fixed_16_16_t
> > > > > > +icl_get_pipe_ratio(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     struct drm_plane *plane;
> > > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > > > uapi.crtc);
> > > > > > 
> > > > > > +
> > > > > > +     if (!crtc_state->uapi.enable)
> > > > > > +             return max_downscale;
> > > > > > +
> > > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > > +             const struct intel_plane_state *plane_state =
> > > > > > +                     to_intel_plane_state(drm_plane_state)
> > > > > > ;
> > > > > > +
> > > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > > plane_state))
> > > > > > +                     continue;
> > > > > > +
> > > > > > +             plane_downscale =
> > > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > > +
> > > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > > %d.%d\n",
> > > > > > +                           to_intel_plane(plane)->id,
> > > > > > +                           plane_downscale.val >> 16,
> > > > > > +                           plane_downscale.val & 0xffff);
> > > > > > +
> > > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > > max_downscale);
> > > > > > +     }
> > > > > > +
> > > > > > +
> > > > > > +     pipe_downscale =
> > > > > > icl_pipe_downscale_amount(crtc_state);
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > > +                    pipe_downscale.val & 0xffff);
> > > > > > +
> > > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > > max_downscale);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Convert result to percentage: get the actual ratio
> > > > > > instead
> > > > > > of rate,
> > > > > > +      * then multiply by 100 to get percentage.
> > > > > > +      */
> > > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > > div_round_up_u32_fixed16(1,
> > > > > > pipe_downscale));
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > > 0xffff);
> > > > > > +
> > > > > > +     return pipe_ratio;
> > > > > > +}
> > > > > 
> > > > > Yuck. I just killed these.
> > > > 
> > > > I still need pipe_ratio fro ICL as you know, this is dictated
> > > > by
> > > > BSpec,
> > > > not by me.
> > > 
> > > That part of the spec is pretty crazy IMO. I think we should just
> > > ignore
> > > it for now. Also nothing in there says we must use both slices
> > > for a
> > > single pipe so ignoring doesn't even violate the spec in any way.
> > 
> > So you are instructing me now to ignore BSpec as "we are smarter"?
> > I don't think this is properly to do it this way. Lets first change
> > the
> > BSpec, if we don't like pipe ratio here and then driver. Currently
> > this
> > is written that way and if we are going to do everything our own
> > way
> > ignoring specs, this is going to be chaos.
> 
> Like I sai you don't have to violate the spec at all. Just don't use
> two slices for the same pipe and you can avoid the whole strange pipe
> ratio limitation. We should file a bspec issue to get that clarified
> though.
> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > > +struct dbuf_slice_conf_entry {
> > > > > > +     u32 dbuf_mask[I915_MAX_PIPES];
> > > > > > +     u32 active_pipes;
> > > > > > +};
> > > > > > +
> > > > > > +
> > > > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_C)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) |
> > > > > > BIT(PIPE_C)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > > > DBuf4)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B) | \
> > > > > > +                                       BIT(PIPE_C) |
> > > > > > BIT(PIPE_D) }
> > > > > > +
> > > > > > +/*
> > > > > > + * Table taken from Bspec 12716
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > > +
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > 
> > > > > My eyes!
> > > > > 
> > > > > I have to think we should be able to reduce all that to a
> > > > > handful
> > > > > of lines of code.
> > > > 
> > > > Yeah, then we are going to have a huge function with lots of
> > > > weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > 
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > > 
> > > > Would you prefer to have it like
> > > > 
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > > 
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird
> > > > looking
> > > > if-else statements, GEN checks and so on.
> > > > 
> > > > I knew this table was not going to look like it is done
> > > > typically
> > > > here - but really my opinion that it is way better than
> > > > hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > > 
> > > > If you don't like it - could you please explain why this is
> > > > exactly
> > > > worse than having long functions with hardcoded checks?
> > > 
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify
> > > what
> > > is the distance of each pipe from each dbuf slice. Then all we
> > > have
> > > to do is minimize the total distance. The following tables I
> > > think
> > > should encode that reasonably concisely:
> > > 
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > > 
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> > 
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> > 
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> 
> A/S1 + B/S2 = 1 + 4 = 5
> A/S2 + B/S1 = 2 + 1 = 3
> 
> > but pipes A C get S1 and S2.
> 
> A/S1 + C/S2 = 1 + 1 = 2
> A/S2 + C/S1 = 2 + 2 = 4
> 
> So it's exactly as my distance table says it should be.
> 
> > 
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> > 
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> > 
> > > 
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > > 
> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> > 
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> 
> Well, I'd rather do that than decode through layers of cpp macros.
> Also atm the table has no extra restrictions so it's clearly there
> just because someone thought they'd write out the results of the
> distance calculations. In general bspec style is somewhat annoying
> where they try to write out things in the lowest level of absraction
> possible (they are hardware people after all). I'd much prefer a
> higher level description of the issue.
> 
> > 
> > > 
> > > > 
> > > > I think this is pretty stupid to fix stuff which looks new or
> > > > unusual
> > > > just because somebody doesn't like it, without even
> > > > constructive
> > > > arguing.
> > > > 
> > > > - Stan
> > > > 
> > > > > 
> > > > > 
> > > > > > +
> > > > > > +/*
> > > > > > + * This function
> > > > > 
> > > > > "This function" is totally redundant. Ditto elsewhere.
> > > > > 
> > > > > > finds an entry with same enabled pipe configuration and
> > > > > > + * returns correspondent DBuf slice mask as stated in
> > > > > > BSpec
> > > > > > for
> > > > > > particular
> > > > > > + * platform.
> > > > > > + */
> > > > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > > > +                                  u32 active_pipes,
> > > > > > +                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     int i;
> > > > > > +     uint_fixed_16_16_t pipe_ratio;
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Calculate pipe ratio as stated in BSpec 28692
> > > > > > +      */
> > > > > > +     pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > > > +
> > > > > > +     for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > > > +             if (icl_allowed_dbufs[i].active_pipes ==
> > > > > > active_pipes)
> > > > > > {
> > > > > > +                     if (hweight32(active_pipes) == 1) {
> > > > > > +                             /*
> > > > > > +                              * According to BSpec 12716:
> > > > > > if
> > > > > > ratio
> > > > > > > = 88.8
> > > > > > 
> > > > > > +                              * always use single dbuf
> > > > > > slice.
> > > > > > +                              */
> > > > > > +                             if (pipe_ratio.val >=
> > > > > > div_fixed16(888,
> > > > > > 10).val)
> > > > > > +                                     ++i;
> > > > > > +                     }
> > > > > > +                     return
> > > > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > > +             }
> > > > > > +     }
> > > > > > +     return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * This function finds an entry with same enabled pipe
> > > > > > configuration and
> > > > > > + * returns correspondent DBuf slice mask as stated in
> > > > > > BSpec
> > > > > > for
> > > > > > particular
> > > > > > + * platform.
> > > > > > + */
> > > > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > > > +                                  u32 active_pipes,
> > > > > > +                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     int i;
> > > > > > +
> > > > > > +     for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > > > +             if (tgl_allowed_dbufs[i].active_pipes ==
> > > > > > active_pipes)
> > > > > > +                     return
> > > > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > > +     }
> > > > > > +     return 0;
> > > > > > +}
> > > > > > +
> > > > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > > +                                   int pipe, u32
> > > > > > active_pipes,
> > > > > > +                                   const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     if (IS_GEN(dev_priv, 11))
> > > > > > +             return icl_get_allowed_dbuf_mask(pipe,
> > > > > > +                                              active_pipes
> > > > > > ,
> > > > > > +                                              crtc_state);
> > > > > > +     else if (IS_GEN(dev_priv, 12))
> > > > > > +             return tgl_get_allowed_dbuf_mask(pipe,
> > > > > > +                                              active_pipes
> > > > > > ,
> > > > > > +                                              crtc_state);
> > > > > > +     /*
> > > > > > +      * For anything else just return one slice yet.
> > > > > > +      * Should be extended for other platforms.
> > > > > > +      */
> > > > > > +     return DBUF_S1_BIT;
> > > > > > +}
> > > > > > +
> > > > > >  static u64
> > > > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > > > *crtc_state,
> > > > > >                                u64 *plane_data_rate)
> > > > > > --
> > > > > > 2.17.1
> > > > > 
> > > > > 
> > > 
> > > 
> 
> --
> Ville Syrjälä
> Intel
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-19 13:24               ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-19 13:24 UTC (permalink / raw)
  To: ville.syrjala; +Cc: Nikula, Jani, intel-gfx

On Mon, 2019-11-18 at 20:25 +0000, Lisovskiy, Stanislav wrote:

P.S: (In addition to my last yesterday letter):

> 
> That is actually a violation of BSpec, we are not using two slices
> for same
> pipe while we could. We had already enough of bw issues, why should
> we make our life even harder?
> 
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > 
> > > > > My eyes!
> > > > > 
> > > > > I have to think we should be able to reduce all that to a
> > > > > handful
> > > > > of lines of code.
> > > > 
> > > > Yeah, then we are going to have a huge function with lots of
> > > > weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > 
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > > 
> > > > Would you prefer to have it like
> > > > 
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > > 
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird
> > > > looking
> > > > if-else statements, GEN checks and so on.
> > > > 
> > > > I knew this table was not going to look like it is done
> > > > typically
> > > > here - but really my opinion that it is way better than
> > > > hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > > 
> > > > If you don't like it - could you please explain why this is
> > > > exactly
> > > > worse than having long functions with hardcoded checks?
> > > 
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify
> > > what
> > > is the distance of each pipe from each dbuf slice. Then all we
> > > have
> > > to do is minimize the total distance. The following tables I
> > > think
> > > should encode that reasonably concisely:
> > > 
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > > 
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> > 
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> > 
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> > A/S1 + B/S2 = 1 + 4 = 5
> > A/S2 + B/S1 = 2 + 1 = 3
> > but pipes A C get S1 and S2.
> > A/S1 + C/S2 = 1 + 1 = 2
> > A/S2 + C/S1 = 2 + 2 = 4
> > So it's exactly as my distance table says it should be.
> > 
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> > 
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> > 
> > > 
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > > 

One more thing I didn't mention yesterday - this metric matrix simply
reflects the affinity/cost of using different DBuf slices by different
Pipes. Sure it will help you to figure out which slice is preferrable
to use.
However you will still have to map it properly to overall active 
pipes mask, which will require you to account which slices are already
used, how many pipes can use a single slice and so on.

See:
I.e if you have lets say pipes A,B(just looked down to my table to know
the expected conf - so convenient), you will need to get S1, S2 for
pipes A, B respectively.
So from your distance metric table for ICL you will first figure out
that for pipe A you prefer S1, and for pipe B you prefer... also S1
as it has lower metric than S2, despite that you should have used S1
and S2 here according to BSpec Then you will have to write some
function that will iterate and count somehow remaining unused slices,
taking into consideration that despite S1 is closer to pipe B, you have
to choose S2 if there are two pipes.
And if there are 3 pipes - you have to... choose again S1 for pipe A,
S1 for pipe B(so you can't even mark S1 as used to simplify things) and
S2 for pipe C.

Interesting to see how you will going to encode this check. Sure I know
you will find some short way. But this will be for sure some runtime
cycle(s).

While I'm just getting this mask immediately in compile time.
While I can immediately compare it to BSpec.

But sure metric matrix sounds more fancy( I like this graph theory
stuff)

What are we arguing here about? Even if my code is not perfect here
my approach at least makes sense and could be even better than yours,
got r-b, BAT/IGT green. 

So Ville, what is the actual point of this kind of unexpected code
review slamming? Could be I just annoyed you too much asking questions
:)

Do we actually share opinions here or there is only one single correct
opinion and perfect approach?

There are warnings yep(just as in many other patches here) - sure will
fix.

Best Regards,

Lisovskiy Stanislav

> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> > 
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> > Well, I'd rather do that than decode through layers of cpp macros.
> > Also atm the table has no extra restrictions so it's clearly there
> > just because someone thought they'd write out the results of the
> > distance calculations. In general bspec style is somewhat annoying
> > where they try to write out things in the lowest level of
> > absraction
> > possible (they are hardware people after all). I'd much prefer a
> > higher level description of the issue.
> 
> I would prefer to see immediately what is different as this table
> makes it possible right away
> Also to me this table simply intuitively more understandable.
> 
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> 
> We have similar CDCLK table in intel_cdclk, while you could 
> also argue that we might had come up with some formula.
> 
> Could be we could make the table approach itself simpler - 
> that is quite possible, but implementing a distance algorithm
> seems to me to be a dangerous idea, despite that it of course
> looks better to me(yes it is).
> 
> Best Regards,
> 
> Lisovskiy Stanislav
> 
> Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7,
> 02160 Espoo
> 
> ________________________________________
> From: Ville Syrjälä [ville.syrjala@linux.intel.com]
> Sent: Monday, November 18, 2019 8:18 PM
> To: Lisovskiy, Stanislav
> Cc: Saarinen, Jani; intel-gfx@lists.freedesktop.org; Roper, Matthew
> D; maarten.lankhorst@linux.intel.com; Ausmus, James
> Subject: Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL
> and TGL
> 
> On Mon, Nov 18, 2019 at 05:17:51PM +0000, Lisovskiy, Stanislav wrote:
> > On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> > > On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav
> > > wrote:
> > > > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > > > wrote:
> > > > > > Also implemented algorithm for choosing DBuf slice
> > > > > > configuration
> > > > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > > > 
> > > > > > Now pipe allocation still stays proportional to pipe width
> > > > > > as
> > > > > > before,
> > > > > > however within allowed DBuf slice for this particular
> > > > > > configuration.
> > > > > > 
> > > > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > > > might
> > > > > >     now differ from hw state.
> > > > > > 
> > > > > > v3: - Added new field "supported_slices" to ddb to separate
> > > > > > max
> > > > > >       supported slices vs currently enabled, to avoid
> > > > > > confusion.
> > > > > >     - Removed obsolete comments and code related to
> > > > > > DBuf(Matthew
> > > > > > Roper).
> > > > > >     - Some code style and long line removal(Matthew Roper).
> > > > > >     - Added WARN_ON to new ddb range offset calc
> > > > > > function(Matthew
> > > > > > Roper).
> > > > > >     - Removed platform specific call to calc pipe ratio
> > > > > > from
> > > > > > ddb
> > > > > >       allocation function and fixed the return
> > > > > > value(Matthew
> > > > > > Roper)
> > > > > >     - Refactored DBUF slice allocation table to improve
> > > > > > readability
> > > > > >     - Added DBUF slice allocation for TGL as well.
> > > > > >     - ICL(however not TGL) seems to voluntarily enable
> > > > > > second
> > > > > > DBuf
> > > > > > slice
> > > > > >       after pm suspend/resume causing a mismatch failure,
> > > > > > because
> > > > > > we
> > > > > >       update DBuf slices only if we do a modeset, however
> > > > > > this
> > > > > > check
> > > > > >       is done always. Fixed it to be done only when modeset
> > > > > > for
> > > > > > ICL.
> > > > > > 
> > > > > > v4: - Now enabled slices is not actually a number, but a
> > > > > > bitmask,
> > > > > >       because we might need to enable second slice only and
> > > > > > number
> > > > > >       of slices would still 1 and that current code doesn't
> > > > > > allow.
> > > > > >     - Remove redundant duplicate code to have some unified
> > > > > > way
> > > > > > of
> > > > > >       enabling dbuf slices instead of hardcoding.
> > > > > > 
> > > > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was
> > > > > > naively
> > > > > > thinking
> > > > > >       that we have only one DBUF_CTL slice. Now another
> > > > > > version
> > > > > > for
> > > > > >       gen11 will check both slices as only second one can
> > > > > > be
> > > > > > enabled,
> > > > > >       to keep CI happy.
> > > > > > 
> > > > > > v6: - Removed enabled dbuf assertion completely. Previous
> > > > > > code
> > > > > >       was keeping dbuf enabled, even(!) when _dbuf_disable
> > > > > > is
> > > > > > called.
> > > > > >       Currently we enable/disable dbuf slices based on
> > > > > > requrement
> > > > > >       so no point in asserting this here.
> > > > > >     - Removed unnecessary modeset check from
> > > > > > verify_wm_state(Matthew Roper)
> > > > > >     - Slices intersection after union is same as final
> > > > > > result(Matthew Roper)
> > > > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > > > >     - Some macros added(Matthew Roper)
> > > > > >     - ddb range is now always less or equal to ddb size -
> > > > > > no
> > > > > > need
> > > > > > for additional
> > > > > >       checks(previously needed as we had some bandwidth
> > > > > > checks
> > > > > > in
> > > > > > there which
> > > > > >       could "suddenly" return ddb_size smaller than it is.
> > > > > > 
> > > > > > v7: Minor costemic changes:
> > > > > >     - Changed icl_dbuf_slices_restore name to
> > > > > > icl_program_dbuf_slices
> > > > > >       as it more corresponds to the actual functionality.
> > > > > >     - Some simplification with supported slices for BXT and
> > > > > > GLK
> > > > > >     - skl_pipe_downscale_amount ->
> > > > > > icl_pipe_downscale_amount as
> > > > > >       this is not used for skl anymore.
> > > > > >     - Changed DBuf slice assignment order for TGL case
> > > > > > 
> > > > > > Reviewed-by: Matthew Roper <matthew.d.roper@intel.com>
> > > > > > Signed-off-by: Stanislav Lisovskiy <
> > > > > > stanislav.lisovskiy@intel.com>
> > > > > > Cc: Matthew Roper <matthew.d.roper@intel.com>
> > > > > > Cc: Ville Syrjälä <ville.syrjala@intel.com>
> > > > > > Cc: James Ausmus <james.ausmus@intel.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > > > ++++++++++++++++--
> > > > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > index 876fc25968bf..bd7aff675198 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > > > intel_crtc *crtc,
> > > > > > 
> > > > > >       if (INTEL_GEN(dev_priv) >= 11 &&
> > > > > >           hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > > > -             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > > %u, got
> > > > > > %u)\n",
> > > > > > +             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > > %x, got
> > > > > > %x)\n",
> > > > > >                         sw_ddb->enabled_slices,
> > > > > >                         hw->ddb.enabled_slices);
> > > > > > 
> > > > > > @@ -14614,15 +14614,24 @@ static void
> > > > > > skl_commit_modeset_enables(struct intel_atomic_state
> > > > > > *state)
> > > > > >       u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > > 
> > > > > >       u8 required_slices = state-
> > > > > > > wm_results.ddb.enabled_slices;
> > > > > > 
> > > > > >       struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > > > +     u8 slices_union = hw_enabled_slices |
> > > > > > required_slices;
> > > > > > +     u8 slices_intersection = required_slices;
> > > > > > 
> > > > > >       for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > > > old_crtc_state, new_crtc_state, i)
> > > > > >               /* ignore allocations for crtc's that have
> > > > > > been
> > > > > > turned
> > > > > > off. */
> > > > > >               if (new_crtc_state->hw.active)
> > > > > >                       entries[i] = old_crtc_state-
> > > > > > > wm.skl.ddb;
> > > > > > 
> > > > > > -     /* If 2nd DBuf slice required, enable it here */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > > > +                   required_slices, hw_enabled_slices);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice required, enable it here,
> > > > > > +      * as here we only add more slices, but not remove to
> > > > > > prevent
> > > > > > +      * issues if somebody still uses those.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_union);
> > > > > > 
> > > > > >       /*
> > > > > >        * Whenever the number of active pipes changes, we
> > > > > > need
> > > > > > to make
> > > > > > sure we
> > > > > > @@ -14681,9 +14690,12 @@ static void
> > > > > > skl_commit_modeset_enables(struct intel_atomic_state
> > > > > > *state)
> > > > > >               }
> > > > > >       } while (progress);
> > > > > > 
> > > > > > -     /* If 2nd DBuf slice is no more required disable it
> > > > > > */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > > here,
> > > > > > +      * as here we only remove slices, which are not
> > > > > > needed
> > > > > > anymore.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_intersection);
> > > > > 
> > > > > This is still a totally wrong place for this. We need to be
> > > > > sure
> > > > > the
> > > > > planes have switched over to the new ddb layout before we
> > > > > power
> > > > > off
> > > > > anything.
> > > > 
> > > > I have just modified already existing line here, not added by
> > > > me.
> > > > The point of this patch is starting to use second DBuf slice
> > > > but
> > > > not fixing all kinds of issues related to DBuf code at the same
> > > > time.
> > > 
> > > You're re-enabling code I disabled on purpose. It has to be fixed
> > > properly or not touched at all. In hindsight I should have
> > > probably
> > > just ripped it all out to not give people bad ideas.
> > 
> > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > 
> > I'm not reenabling it - this is code is out there right now.
> 
> It's _dead_ code, and it's like that on purpose. You're now
> attempting
> to re-enable it without fixing the underlying issues.
> 
> > This is all again about changing all kinds of stuff at the same
> > time.
> > 
> > Why we can't just do things step by step? You have a point that
> > this is wrong place ok, but as we see current code causes no
> > regressions. Just tell me your idea where to put it and I will
> > do that, if you don't have it yet - let it stay there, if CI shows
> > no regressions currently.
> 
> The same global state tricks apply here as with the cdclk/sagv stuff.
> 1. enable new resources/bump up clocks
> 2. reconfigure planes/etc.
> 3. wait for vblanks
> 4. disable unused resources/drop the clocks
> 
> > 
> > > 
> > > > 
> > > > Of course this is nice idea to fix it - if that is wrong plane,
> > > > but current code somehow lives with it and probably this should
> > > > be
> > > > with some other patch.
> > > > 
> > > > > 
> > > > > >  }
> > > > > > 
> > > > > >  static void intel_atomic_helper_free_state(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > diff --git
> > > > > > a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > @@ -1031,15 +1031,6 @@ static bool
> > > > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >               (I915_READ(DC_STATE_EN) &
> > > > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > > > >  }
> > > > > > 
> > > > > > -static void gen9_assert_dbuf_enabled(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > -{
> > > > > > -     u32 tmp = I915_READ(DBUF_CTL);
> > > > > > -
> > > > > > -     WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST))
> > > > > > !=
> > > > > > -          (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > > > -          "Unexpected DBuf power power state (0x%08x)\n",
> > > > > > tmp);
> > > > > > -}
> > > > > > -
> > > > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > >       struct intel_cdclk_state cdclk_state = {};
> > > > > > @@ -1055,8 +1046,6 @@ static void
> > > > > > gen9_disable_dc_states(struct
> > > > > > drm_i915_private *dev_priv)
> > > > > >       /* Can't read out voltage_level so can't use
> > > > > > intel_cdclk_changed() */
> > > > > >       WARN_ON(intel_cdclk_needs_modeset(&dev_priv-
> > > > > > >cdclk.hw,
> > > > > > &cdclk_state));
> > > > > > 
> > > > > > -     gen9_assert_dbuf_enabled(dev_priv);
> > > > > > -
> > > > > >       if (IS_GEN9_LP(dev_priv))
> > > > > >               bxt_verify_ddi_phy_power_wells(dev_priv);
> > > > > > 
> > > > > > @@ -4254,31 +4243,51 @@ static void
> > > > > > gen9_dbuf_disable(struct
> > > > > > drm_i915_private *dev_priv)
> > > > > >       intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > > > >  }
> > > > > > 
> > > > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > +int intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     if (INTEL_GEN(dev_priv) < 11)
> > > > > > -             return 1;
> > > > > > -     return 2;
> > > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > > +}
> > > > > 
> > > > > Kinda pointless function now. 'supported_slices' is not a
> > > > > very
> > > > > good
> > > > > name. Should be 'num_dbuf_slices' or something along those
> > > > > lines.
> > > > 
> > > > Supported slices really means how much slices we support in
> > > > hardware.
> > > > num_dbuf_slices will not reflect that, it will be more
> > > > ambigious.
> > > > I don't like this name.
> > > 
> > > supported_slices doesn't even say "dbuf" anywhere. So it's even
> > > more
> > > ambiguous. Also we already have num_pipes, num_sprites, etc.
> > 
> > ..and in current code we have... "enabled_slices", no? Ouch! :) IS
> > this
> > really that important and worth arguing at the very last moment?
> > whether number of
> > supported_slices_in_hardware, should be named "supported slices" or
> > "num_dbuf_slices"? If yes, lets go on, miss some targets and
> > continue
> > discussing this "soooo important" thing.
> > 
> > Some one might come up that it is called "num_slices" or
> > "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
> > keep everyones bloated egos happy??
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > +{
> > > > > > +     const u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > 
> > > > > Seems a rather pointless variable. const at least is
> > > > > pointless.
> > > > > 
> > > > > > +
> > > > > > +     icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > > > >  }
> > > > > > 
> > > > > >  void icl_dbuf_slices_update(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                           u8 req_slices)
> > > > > >  {
> > > > > > -     const u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > > 
> > > > > > -     bool ret;
> > > > > > +     bool ret = true;
> > > > > > +     int i;
> > > > > > +     int max_slices = intel_dbuf_max_slices(dev_priv);
> > > > > > 
> > > > > > -     if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > > > +     if (hweight8(req_slices) >
> > > > > > intel_dbuf_max_slices(dev_priv)) {
> > > > > >               DRM_ERROR("Invalid number of dbuf slices
> > > > > > requested\n");
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > -     if (req_slices == hw_enabled_slices || req_slices ==
> > > > > > 0)
> > > > > > -             return;
> > > > > > +     DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > > > req_slices);
> > > > > > 
> > > > > > -     if (req_slices > hw_enabled_slices)
> > > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > > DBUF_CTL_S2,
> > > > > > true);
> > > > > > -     else
> > > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > > DBUF_CTL_S2,
> > > > > > false);
> > > > > > +     for (i = 0; i < max_slices; i++) {
> > > > > > +             int slice_bit = BIT(i);
> > > > > > +             bool slice_set = (slice_bit & req_slices) !=
> > > > > > 0;
> > > > > > +
> > > > > > +             switch (slice_bit) {
> > > > > > +             case DBUF_S1_BIT:
> > > > > > +                     ret = ret &&
> > > > > > intel_dbuf_slice_set(dev_priv,
> > > > > > +                                                       DBU
> > > > > > F_
> > > > > > CTL_S1,
> > > > > > +                                                       sli
> > > > > > ce
> > > > > > _set);
> > > > > > +                     break;
> > > > > > +             case DBUF_S2_BIT:
> > > > > > +                     ret = ret &&
> > > > > > intel_dbuf_slice_set(dev_priv,
> > > > > > +                                                       DBU
> > > > > > F_
> > > > > > CTL_S2,
> > > > > > +                                                       sli
> > > > > > ce
> > > > > > _set);
> > > > > 
> > > > > The return value on intel_dbuf_slice_set() seems pointless.
> > > > > I'd
> > > > > nuke
> > > > > it.
> > > > > 
> > > > > Also as long you have intel_dbuf_slice_set() I'd probably
> > > > > just
> > > > > pass
> > > > > the
> > > > > slice index there and let it deal with the register details.
> > > > > DBUF_CTL
> > > > > should probably be parametrized as well so you don't need
> > > > > annoying
> > > > > switch
> > > > > statements.
> > > > 
> > > > What if it fails? Shouldn't we still have some clue what state
> > > > do
> > > > we
> > > > have/had at the moment?
> > > 
> > > It if fails you log an error and keep going.
> > 
> > Ah sure. And let hw/sw state be completely different.
> > Also see next comment about this.
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +                     break;
> > > > > > +             default:
> > > > > > +                     MISSING_CASE(slice_bit);
> > > > > > +             }
> > > > > > +     }
> > > > > > 
> > > > > >       if (ret)
> > > > > >               dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > > > req_slices;
> > > > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > > 
> > > > > >  static void icl_dbuf_enable(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that we only have 1
> > > > > > slice,
> > > > > > see
> > > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Just power up 1 slice, we will
> > > > > > +      * figure out later which slices we have and what we
> > > > > > need.
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > > 
> > > > > That stuff really shouldn't be modified by low level
> > > > > functions.
> > > > > The intial value should come from readout, and after that it
> > > > > should
> > > > > be adjusted as part of the overall atomic state.
> > > > 
> > > > Well, as you can see currently this is even worse - has some
> > > > weird
> > > > hardcoded stuff and is modified like that. Also if we are
> > > > enabling
> > > > the
> > > > dbuf slices, shouldn't we really determine what value should be
> > > > there?
> > > > 
> > > > Also I wouldn't really always trust the readout, as to me it
> > > > seems
> > > > icelake resets it constantly to 2 after suspend/resume.
> > > 
> > > The point of readout is that we syncrhonize our software state
> > > with
> > > the
> > > current state of hardware without having to modify the hardware
> > > state.
> > > Once synchronized the driver takes over and tells the hw what to
> > > do.
> > 
> > That is especially nice after you say "report error and don't care"
> > when I really want to reflect hw state in sw state, by checking
> > whether
> > we could set dbuf slices or not.
> > Now you say that we need to synchronize software state with hw
> > state,
> > however if I fail to modify it - then "who cares".
> 
> Yes. If you fail to modify the hardware state the whole thing is
> broken anyway. There's is *nothing* we can do at this stage in the
> commit to fix it.
> 
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > > >  }
> > > > > > 
> > > > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > > > ~DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > > > ~DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power disable timeout!\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that the first
> > > > > > slice
> > > > > > is
> > > > > > always
> > > > > > -              * enabled, see
> > > > > > intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Disable all slices
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > > >  }
> > > > > > 
> > > > > >  static void icl_mbus_init(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > diff --git
> > > > > > a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > > > drm_i915_private *i915,
> > > > > >       for ((wf) = intel_display_power_get((i915),
> > > > > > (domain));
> > > > > > (wf); \
> > > > > >            intel_display_power_put_async((i915), (domain),
> > > > > > (wf)),
> > > > > > (wf) = 0)
> > > > > > 
> > > > > > +int intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv);
> > > > > >  void icl_dbuf_slices_update(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                           u8 req_slices);
> > > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > > *dev_priv);
> > > > > > 
> > > > > >  void chv_phy_powergate_lanes(struct intel_encoder
> > > > > > *encoder,
> > > > > >                            bool override, unsigned int
> > > > > > mask);
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > > @@ -51,6 +51,7 @@
> > > > > >  #include "display/intel_bw.h"
> > > > > >  #include "display/intel_cdclk.h"
> > > > > >  #include "display/intel_display_types.h"
> > > > > > +#include "display/intel_display_power.h"
> > > > > >  #include "display/intel_dp.h"
> > > > > >  #include "display/intel_fbdev.h"
> > > > > >  #include "display/intel_hotplug.h"
> > > > > > @@ -2588,6 +2589,10 @@ static int
> > > > > > intel_runtime_resume(struct
> > > > > > device *kdev)
> > > > > >       if (IS_VALLEYVIEW(dev_priv) ||
> > > > > > IS_CHERRYVIEW(dev_priv))
> > > > > >               ret = vlv_resume_prepare(dev_priv, true);
> > > > > > 
> > > > > > +     /* Weird hack to fix ICL hardware bug, as it resets
> > > > > > DBUF slices
> > > > > > reg */
> > > > > > +     if (INTEL_GEN(dev_priv) == 11)
> > > > > > +             icl_program_dbuf_slices(dev_priv);
> > > > > > +
> > > > > >       intel_uncore_runtime_resume(&dev_priv->uncore);
> > > > > > 
> > > > > >       intel_runtime_pm_enable_interrupts(dev_priv);
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > @@ -804,7 +804,7 @@ static inline bool
> > > > > > skl_ddb_entry_equal(const
> > > > > > struct skl_ddb_entry *e1,
> > > > > >  }
> > > > > > 
> > > > > >  struct skl_ddb_allocation {
> > > > > 
> > > > > This struct is quite pointless. Should be killed off.
> > > > 
> > > > Legacy stuff, not added by me. Quite fine with removing it -
> > > > but should we really do all this at the same time and in same
> > > > patch?
> > > 
> > > Not the same patch. But clean up the cobwebs first, then add the
> > > new
> > > stuff. Much easier that way around.
> > 
> > There is a shit ton load of cobwebs to clean. Personally my opinion
> > is
> > that current dbuf controlling code is total garbage - which has
> > lots of
> > hardcoded stuff. Good luck enabling only second dbuf slice, good
> > luck
> > disabling/enabling dbuf properly....
> > Where you were, when this was committed then?
> 
> Looking at something else unfortunately. Also it has just kept
> bitrotting since because no one has cared.
> 
> > 
> > If we need to fix it - fine. But may be we should do it in stages?
> > Point of this patch is "enable second dbuf slice", next patch
> > "remove
> > unneeded structs" and so on.
> > While you want me to shuffle everything at the same time, which is
> > not
> > good idea, especially considering complexity and amount of legacy
> > stuff
> > we currently have.
> 
> Building new stuff on top of a decomposing garbage pile is not
> a good idea: a) it's harder than building on clean foundation,
> b) it'll probably fall over and get buried in the old garbage
> at which point you're likely going to be right back where you
> started.
> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > -     u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > > > */
> > > > > > +     u8 enabled_slices;   /* Bitmask of currently enabled
> > > > > > DBuf
> > > > > > slices */
> > > > > >  };
> > > > > > 
> > > > > >  struct skl_ddb_values {
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > index 1bb701d32a5d..8e86af505730 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > > > intel_cherryview_info = {
> > > > > >       .has_gt_uc = 1, \
> > > > > >       .display.has_hdcp = 1, \
> > > > > >       .display.has_ipc = 1, \
> > > > > > -     .ddb_size = 896
> > > > > > +     .ddb_size = 896, \
> > > > > > +     .supported_slices = 1
> > > > > > 
> > > > > >  #define SKL_PLATFORM \
> > > > > >       GEN9_FEATURES, \
> > > > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > > > intel_skylake_gt4_info = {
> > > > > >       GEN(9), \
> > > > > >       .is_lp = 1, \
> > > > > >       .display.has_hotplug = 1, \
> > > > > > +     .supported_slices = 1, \
> > > > > >       .engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > > > BIT(VECS0),
> > > > > > \
> > > > > >       .pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
> > > > > > \
> > > > > >       .has_64bit_reloc = 1, \
> > > > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > > > intel_coffeelake_gt3_info = {
> > > > > >       GEN9_FEATURES, \
> > > > > >       GEN(10), \
> > > > > >       .ddb_size = 1024, \
> > > > > > +     .supported_slices = 1, \
> > > > > >       .display.has_dsc = 1, \
> > > > > >       .has_coherent_ggtt = false, \
> > > > > >       GLK_COLORS
> > > > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > > > intel_cannonlake_info = {
> > > > > >       }, \
> > > > > >       GEN(11), \
> > > > > >       .ddb_size = 2048, \
> > > > > > +     .supported_slices = 2, \
> > > > > >       .has_logical_ring_elsq = 1, \
> > > > > >       .color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > > > 262145 }
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > > > index a607ea520829..fba5731063d8 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > > @@ -7734,9 +7734,11 @@ enum {
> > > > > >  #define DISP_ARB_CTL2        _MMIO(0x45004)
> > > > > >  #define  DISP_DATA_PARTITION_5_6     (1 << 6)
> > > > > >  #define  DISP_IPC_ENABLE             (1 << 3)
> > > > > > -#define DBUF_CTL     _MMIO(0x45008)
> > > > > > -#define DBUF_CTL_S1  _MMIO(0x45008)
> > > > > > -#define DBUF_CTL_S2  _MMIO(0x44FE8)
> > > > > > +#define DBUF_S1_BIT                  BIT(0)
> > > > > > +#define DBUF_S2_BIT                  BIT(1)
> > > > > 
> > > > > These are not bits inside DBUF_CTL so they should not be
> > > > > here.
> > > > > I'd just nuke them entirely. If you think we need more than
> > > > > an
> > > > > int to refer to a slice then IMO something simple like
> > > > > 
> > > > > enum dbuf_slice {
> > > > >         DBUF_S1,
> > > > >         DBUF_S2,
> > > > > };
> > > > > 
> > > > > would match the general apporach we use for other things much
> > > > > better
> > > > > rather than defining some shifted values.
> > > > 
> > > > We need to combine slices - so bitmask is quite suitable for
> > > > that,
> > > > I can of course use this enum, but then I'll have to re-encode
> > > > it
> > > > to bitmask anyways.
> > > 
> > > Which is pretty much how everything else works in the driver.
> > 
> > Another piece of incredibly important thing to argue..
> 
> It is. Lot's of crap in the driver is simply caused by people not
> following established style and coming up with something different..
> Which means there is an extra cost for everyone else when reading the
> code. And that probably means most people just don't bother to read
> it.
> 
> Granted, some of that is due to spurts in evolution where major
> overhauls
> changed the architecture so significantly that the old established
> conventions simply made node sense anymore. Atomic being the last big
> one in that regard. And we're still cleaning up the resulting mess.
> 
> > 
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > > +#define DBUF_CTL                     _MMIO(0x45008)
> > > > > > +#define DBUF_CTL_S1                  _MMIO(0x45008)
> > > > > > +#define DBUF_CTL_S2                  _MMIO(0x44FE8)
> > > > > >  #define  DBUF_POWER_REQUEST          (1 << 31)
> > > > > >  #define  DBUF_POWER_STATE            (1 << 30)
> > > > > >  #define GEN7_MSG_CTL _MMIO(0x45010)
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > > > >       } display;
> > > > > > 
> > > > > >       u16 ddb_size; /* in blocks */
> > > > > > +     u8 supported_slices; /* number of DBuf slices */
> > > > > > 
> > > > > >       /* Register offsets for the various display pipes and
> > > > > > transcoders */
> > > > > >       int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > > index 2d389e437e87..1ea81ab92429 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > > > drm_device
> > > > > > *dev)
> > > > > >       return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > > > >  }
> > > > > > 
> > > > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > +static u8 intel_enabled_dbuf_slices(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     u8 enabled_slices;
> > > > > > -
> > > > > > -     /* Slice 1 will always be enabled */
> > > > > > -     enabled_slices = 1;
> > > > > > +     u8 enabled_slices = 0;
> > > > > > 
> > > > > >       /* Gen prior to GEN11 have only one DBuf slice */
> > > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > > -             return enabled_slices;
> > > > > > +             return DBUF_S1_BIT;
> > > > > 
> > > > > I'd nuke this special case entirely and just read everything
> > > > > from
> > > > > the
> > > > > hw. Assuming we need readout somewhere.
> > > > 
> > > > We have it that way hardcoded right now in drm-tip. I think
> > > > this is
> > > > a
> > > > subject to other patch, but not this change. As I understand we
> > > > need
> > > > to first figure out how to do it properly.
> > > > This patch anyway makes it better by at least removing
> > > > hardcoded
> > > > relation between number of slices and which slices are enabled.
> > > > With current code there is no way for example to enable second
> > > > DBuf
> > > > slice only, which is required by BSpec.
> > > > 
> > > > > 
> > > > > > 
> > > > > > -     /*
> > > > > > -      * FIXME: for now we'll only ever use 1 slice;
> > > > > > pretend
> > > > > > that we
> > > > > > have
> > > > > > -      * only that 1 slice enabled until we have a proper
> > > > > > way
> > > > > > for on-
> > > > > > demand
> > > > > > -      * toggling of the second slice.
> > > > > > -      */
> > > > > > -     if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > > -             enabled_slices++;
> > > > > > +     /* Check if second DBuf slice is enabled */
> > > > > > +     if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > > > +             enabled_slices |= DBUF_S1_BIT;
> > > > > > +
> > > > > > +     if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > > +             enabled_slices |= DBUF_S2_BIT;
> > > > > > 
> > > > > >       return enabled_slices;
> > > > > >  }
> > > > > > @@ -3812,36 +3808,38 @@ static u16
> > > > > > intel_get_ddb_size(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >                             const int num_active,
> > > > > >                             struct skl_ddb_allocation *ddb)
> > > > > >  {
> > > > > > -     const struct drm_display_mode *adjusted_mode;
> > > > > > -     u64 total_data_bw;
> > > > > >       u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > > > -
> > > > > >       WARN_ON(ddb_size == 0);
> > > > > > 
> > > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > >               return ddb_size - 4; /* 4 blocks for bypass
> > > > > > path
> > > > > > allocation */
> > > > > > 
> > > > > > -     adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > > > -     total_data_bw = total_data_rate *
> > > > > > drm_mode_vrefresh(adjusted_mode);
> > > > > > +     return ddb_size;
> > > > > > +}
> > > > > > 
> > > > > > -     /*
> > > > > > -      * 12GB/s is maximum BW supported by single DBuf
> > > > > > slice.
> > > > > > -      *
> > > > > > -      * FIXME dbuf slice code is broken:
> > > > > > -      * - must wait for planes to stop using the slice
> > > > > > before
> > > > > > powering it off
> > > > > > -      * - plane straddling both slices is illegal in
> > > > > > multi-
> > > > > > pipe
> > > > > > scenarios
> > > > > > -      * - should validate we stay within the hw bandwidth
> > > > > > limits
> > > > > > -      */
> > > > > > -     if (0 && (num_active > 1 || total_data_bw >=
> > > > > > GBps(12)))
> > > > > > {
> > > > > > -             ddb->enabled_slices = 2;
> > > > > > -     } else {
> > > > > > -             ddb->enabled_slices = 1;
> > > > > > -             ddb_size /= 2;
> > > > > > -     }
> > > > > > +/*
> > > > > > + * Calculate initial DBuf slice offset, based on slice
> > > > > > size
> > > > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > > > enabled
> > > > > > + * offset would be 1024)
> > > > > > + */
> > > > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > > > dbuf_slice_mask,
> > > > > > +                                        u32 slice_size,
> > > > > > u32
> > > > > > ddb_size)
> > > > > > +{
> > > > > > +     u32 offset = 0;
> > > > > > 
> > > > > > -     return ddb_size;
> > > > > > +     if (!dbuf_slice_mask)
> > > > > > +             return 0;
> > > > > > +
> > > > > > +     offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > > > +
> > > > > > +     WARN_ON(offset >= ddb_size);
> > > > > > +     return offset;
> > > > > >  }
> > > > > > 
> > > > > > +static u32 i915_get_allowed_dbuf_mask(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv,
> > > > > > +                                   int pipe, u32
> > > > > > active_pipes,
> > > > > > +                                   const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state);
> > > > > > +
> > > > > >  static void
> > > > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state,
> > > > > > @@ -3857,7 +3855,14 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >       u32 pipe_width = 0, total_width = 0,
> > > > > > width_before_pipe
> > > > > > = 0;
> > > > > >       enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > > > >       u16 ddb_size;
> > > > > > +     u32 ddb_range_size;
> > > > > >       u32 i;
> > > > > > +     u32 dbuf_slice_mask;
> > > > > > +     u32 active_pipes;
> > > > > > +     u32 offset;
> > > > > > +     u32 slice_size;
> > > > > > +     u32 total_slice_mask;
> > > > > > +     u32 start, end;
> > > > > > 
> > > > > >       if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > > > >               alloc->start = 0;
> > > > > > @@ -3866,14 +3871,23 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > -     if (intel_state->active_pipe_changes)
> > > > > > +     if (intel_state->active_pipe_changes) {
> > > > > >               *num_active = hweight8(intel_state-
> > > > > > > active_pipes);
> > > > > > 
> > > > > > -     else
> > > > > > +             active_pipes = intel_state->active_pipes;
> > > > > > +     } else {
> > > > > >               *num_active = hweight8(dev_priv-
> > > > > > >active_pipes);
> > > > > > +             active_pipes = dev_priv->active_pipes;
> > > > > > +     }
> > > > > > 
> > > > > >       ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > > > total_data_rate,
> > > > > >                                     *num_active, ddb);
> > > > > > 
> > > > > > +     DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > > > +
> > > > > > +     slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > > > > supported_slices;
> > > > > > 
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Got DBuf slice size %d\n",
> > > > > > slice_size);
> > > > > > +
> > > > > >       /*
> > > > > >        * If the state doesn't change the active CRTC's or
> > > > > > there is no
> > > > > >        * modeset request, then there's no need to
> > > > > > recalculate;
> > > > > > @@ -3891,20 +3905,70 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > +     /*
> > > > > > +      * Get allowed DBuf slices for correspondent pipe and
> > > > > > platform.
> > > > > > +      */
> > > > > > +     dbuf_slice_mask =
> > > > > > i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > for_pipe,
> > > > > > +                                                  active_p
> > > > > > ip
> > > > > > es,
> > > > > > crtc_state);
> > > > > 
> > > > > Quite a few redundant arguments. The function name looks
> > > > > alien.
> > > > > 
> > > > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > > > 
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active
> > > > > > pipes
> > > > > > %x\n",
> > > > > > +                   dbuf_slice_mask,
> > > > > > +                   for_pipe, active_pipes);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Figure out at which DBuf slice we start, i.e if we
> > > > > > start at
> > > > > > Dbuf S2
> > > > > > +      * and slice size is 1024, the offset would be 1024
> > > > > > +      */
> > > > > > +     offset =
> > > > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > > > +                                              slice_size,
> > > > > > ddb_size);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Figure out total size of allowed DBuf slices,
> > > > > > which
> > > > > > is
> > > > > > basically
> > > > > > +      * a number of allowed slices for that pipe
> > > > > > multiplied
> > > > > > by slice
> > > > > > size.
> > > > > > +      * Inside of this
> > > > > > +      * range ddb entries are still allocated in
> > > > > > proportion
> > > > > > to
> > > > > > display width.
> > > > > > +      */
> > > > > > +     ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > > > slice_size;
> > > > > > +
> > > > > >       /*
> > > > > >        * Watermark/ddb requirement highly depends upon
> > > > > > width
> > > > > > of the
> > > > > >        * framebuffer, So instead of allocating DDB equally
> > > > > > among
> > > > > > pipes
> > > > > >        * distribute DDB based on resolution/width of the
> > > > > > display.
> > > > > >        */
> > > > > > +     total_slice_mask = dbuf_slice_mask;
> > > > > 
> > > > > Isn't this already taken care by the loop below?
> > > > > 
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode
> > > > > > =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > > 
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > > 
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > > 
> > > > > total_slice_mask will now account only the crtcs in the
> > > > > state.
> > > > > What
> > > > > happens to the other pipes' slices?
> > 
> > Same as happens now btw. Right now it also does the same thing.
> 
> Right now it does absolutely nothing. It's all dead code.
> 
> > 
> > > > 
> > > > Yes we need to iterate through all pipes/crtcs here, however
> > > > for getting correct dbuf slice mask most important is
> > > > active_pipes.
> > > > However yes, I will change the loop for it to be more correct.
> > > > 
> > > > > 
> > > > > > +
> > > > > > +             /*
> > > > > > +              * Do not account pipes using other slice
> > > > > > sets
> > > > > > +              * luckily as of current BSpec slice sets do
> > > > > > not
> > > > > > partially
> > > > > > +              * intersect(pipes share either same one
> > > > > > slice
> > > > > > or same
> > > > > > slice set
> > > > > > +              * i.e no partial intersection), so it is
> > > > > > enough to
> > > > > > check for
> > > > > > +              * equality for now.
> > > > > > +              */
> > > > > > +             if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > > > +                     continue;
> > > > > > +
> > > > > >               drm_mode_get_hv_timing(adjusted_mode,
> > > > > > &hdisplay,
> > > > > > &vdisplay);
> > > > > >               total_width += hdisplay;
> > > > > > 
> > > > > > @@ -3914,8 +3978,19 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >                       pipe_width = hdisplay;
> > > > > >       }
> > > > > > 
> > > > > > -     alloc->start = ddb_size * width_before_pipe /
> > > > > > total_width;
> > > > > > -     alloc->end = ddb_size * (width_before_pipe +
> > > > > > pipe_width) /
> > > > > > total_width;
> > > > > > +     ddb->enabled_slices = total_slice_mask;
> > > > > > +
> > > > > > +     start = ddb_range_size * width_before_pipe /
> > > > > > total_width;
> > > > > > +     end = ddb_range_size * (width_before_pipe +
> > > > > > pipe_width)
> > > > > > /
> > > > > > total_width;
> > > > > > +
> > > > > > +     alloc->start = offset + start;
> > > > > > +     alloc->end = offset + end;
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > > > +                   alloc->start, alloc->end);
> > > > > > +     DRM_DEBUG_KMS("Enabled ddb slices mask %x num
> > > > > > supported
> > > > > > %d\n",
> > > > > > +                   ddb->enabled_slices,
> > > > > > +                   INTEL_INFO(dev_priv)-
> > > > > > >supported_slices);
> > > > > >  }
> > > > > > 
> > > > > >  static int skl_compute_wm_params(const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state,
> > > > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > > > intel_crtc *crtc,
> > > > > >  void skl_ddb_get_hw_state(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                         struct skl_ddb_allocation *ddb /*
> > > > > > out
> > > > > > */)
> > > > > >  {
> > > > > > -     ddb->enabled_slices =
> > > > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > > > +     ddb->enabled_slices =
> > > > > > intel_enabled_dbuf_slices(dev_priv);
> > > > > > +     DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > > > enabled_slices);
> > > > > > 
> > > > > >  }
> > > > > > 
> > > > > >  /*
> > > > > > @@ -4164,6 +4240,241 @@
> > > > > > skl_get_total_relative_data_rate(struct
> > > > > > intel_crtc_state *crtc_state,
> > > > > >       return total_data_rate;
> > > > > >  }
> > > > > > 
> > > > > > +static uint_fixed_16_16_t
> > > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > > &crtc_state-
> > > > > > > hw.adjusted_mode;
> > > > > > 
> > > > > > +
> > > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > > +
> > > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +
> > > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > > +}
> > > > > > +
> > > > > > +uint_fixed_16_16_t
> > > > > > +icl_get_pipe_ratio(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     struct drm_plane *plane;
> > > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > > > uapi.crtc);
> > > > > > 
> > > > > > +
> > > > > > +     if (!crtc_state->uapi.enable)
> > > > > > +             return max_downscale;
> > > > > > +
> > > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > > +             const struct intel_plane_state *plane_state =
> > > > > > +                     to_intel_plane_state(drm_plane_state)
> > > > > > ;
> > > > > > +
> > > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > > plane_state))
> > > > > > +                     continue;
> > > > > > +
> > > > > > +             plane_downscale =
> > > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > > +
> > > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > > %d.%d\n",
> > > > > > +                           to_intel_plane(plane)->id,
> > > > > > +                           plane_downscale.val >> 16,
> > > > > > +                           plane_downscale.val & 0xffff);
> > > > > > +
> > > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > > max_downscale);
> > > > > > +     }
> > > > > > +
> > > > > > +
> > > > > > +     pipe_downscale =
> > > > > > icl_pipe_downscale_amount(crtc_state);
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > > +                    pipe_downscale.val & 0xffff);
> > > > > > +
> > > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > > max_downscale);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Convert result to percentage: get the actual ratio
> > > > > > instead
> > > > > > of rate,
> > > > > > +      * then multiply by 100 to get percentage.
> > > > > > +      */
> > > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > > div_round_up_u32_fixed16(1,
> > > > > > pipe_downscale));
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > > 0xffff);
> > > > > > +
> > > > > > +     return pipe_ratio;
> > > > > > +}
> > > > > 
> > > > > Yuck. I just killed these.
> > > > 
> > > > I still need pipe_ratio fro ICL as you know, this is dictated
> > > > by
> > > > BSpec,
> > > > not by me.
> > > 
> > > That part of the spec is pretty crazy IMO. I think we should just
> > > ignore
> > > it for now. Also nothing in there says we must use both slices
> > > for a
> > > single pipe so ignoring doesn't even violate the spec in any way.
> > 
> > So you are instructing me now to ignore BSpec as "we are smarter"?
> > I don't think this is properly to do it this way. Lets first change
> > the
> > BSpec, if we don't like pipe ratio here and then driver. Currently
> > this
> > is written that way and if we are going to do everything our own
> > way
> > ignoring specs, this is going to be chaos.
> 
> Like I sai you don't have to violate the spec at all. Just don't use
> two slices for the same pipe and you can avoid the whole strange pipe
> ratio limitation. We should file a bspec issue to get that clarified
> though.
> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > > +struct dbuf_slice_conf_entry {
> > > > > > +     u32 dbuf_mask[I915_MAX_PIPES];
> > > > > > +     u32 active_pipes;
> > > > > > +};
> > > > > > +
> > > > > > +
> > > > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_C)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) |
> > > > > > BIT(PIPE_C)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > > > DBuf4)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B) | \
> > > > > > +                                       BIT(PIPE_C) |
> > > > > > BIT(PIPE_D) }
> > > > > > +
> > > > > > +/*
> > > > > > + * Table taken from Bspec 12716
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > > +
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > 
> > > > > My eyes!
> > > > > 
> > > > > I have to think we should be able to reduce all that to a
> > > > > handful
> > > > > of lines of code.
> > > > 
> > > > Yeah, then we are going to have a huge function with lots of
> > > > weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > 
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > > 
> > > > Would you prefer to have it like
> > > > 
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > > 
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird
> > > > looking
> > > > if-else statements, GEN checks and so on.
> > > > 
> > > > I knew this table was not going to look like it is done
> > > > typically
> > > > here - but really my opinion that it is way better than
> > > > hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > > 
> > > > If you don't like it - could you please explain why this is
> > > > exactly
> > > > worse than having long functions with hardcoded checks?
> > > 
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify
> > > what
> > > is the distance of each pipe from each dbuf slice. Then all we
> > > have
> > > to do is minimize the total distance. The following tables I
> > > think
> > > should encode that reasonably concisely:
> > > 
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > > 
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> > 
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> > 
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> 
> A/S1 + B/S2 = 1 + 4 = 5
> A/S2 + B/S1 = 2 + 1 = 3
> 
> > but pipes A C get S1 and S2.
> 
> A/S1 + C/S2 = 1 + 1 = 2
> A/S2 + C/S1 = 2 + 2 = 4
> 
> So it's exactly as my distance table says it should be.
> 
> > 
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> > 
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> > 
> > > 
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > > 
> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> > 
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> 
> Well, I'd rather do that than decode through layers of cpp macros.
> Also atm the table has no extra restrictions so it's clearly there
> just because someone thought they'd write out the results of the
> distance calculations. In general bspec style is somewhat annoying
> where they try to write out things in the lowest level of absraction
> possible (they are hardware people after all). I'd much prefer a
> higher level description of the issue.
> 
> > 
> > > 
> > > > 
> > > > I think this is pretty stupid to fix stuff which looks new or
> > > > unusual
> > > > just because somebody doesn't like it, without even
> > > > constructive
> > > > arguing.
> > > > 
> > > > - Stan
> > > > 
> > > > > 
> > > > > 
> > > > > > +
> > > > > > +/*
> > > > > > + * This function
> > > > > 
> > > > > "This function" is totally redundant. Ditto elsewhere.
> > > > > 
> > > > > > finds an entry with same enabled pipe configuration and
> > > > > > + * returns correspondent DBuf slice mask as stated in
> > > > > > BSpec
> > > > > > for
> > > > > > particular
> > > > > > + * platform.
> > > > > > + */
> > > > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > > > +                                  u32 active_pipes,
> > > > > > +                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     int i;
> > > > > > +     uint_fixed_16_16_t pipe_ratio;
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Calculate pipe ratio as stated in BSpec 28692
> > > > > > +      */
> > > > > > +     pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > > > +
> > > > > > +     for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > > > +             if (icl_allowed_dbufs[i].active_pipes ==
> > > > > > active_pipes)
> > > > > > {
> > > > > > +                     if (hweight32(active_pipes) == 1) {
> > > > > > +                             /*
> > > > > > +                              * According to BSpec 12716:
> > > > > > if
> > > > > > ratio
> > > > > > > = 88.8
> > > > > > 
> > > > > > +                              * always use single dbuf
> > > > > > slice.
> > > > > > +                              */
> > > > > > +                             if (pipe_ratio.val >=
> > > > > > div_fixed16(888,
> > > > > > 10).val)
> > > > > > +                                     ++i;
> > > > > > +                     }
> > > > > > +                     return
> > > > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > > +             }
> > > > > > +     }
> > > > > > +     return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * This function finds an entry with same enabled pipe
> > > > > > configuration and
> > > > > > + * returns correspondent DBuf slice mask as stated in
> > > > > > BSpec
> > > > > > for
> > > > > > particular
> > > > > > + * platform.
> > > > > > + */
> > > > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > > > +                                  u32 active_pipes,
> > > > > > +                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     int i;
> > > > > > +
> > > > > > +     for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > > > +             if (tgl_allowed_dbufs[i].active_pipes ==
> > > > > > active_pipes)
> > > > > > +                     return
> > > > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > > +     }
> > > > > > +     return 0;
> > > > > > +}
> > > > > > +
> > > > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > > +                                   int pipe, u32
> > > > > > active_pipes,
> > > > > > +                                   const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     if (IS_GEN(dev_priv, 11))
> > > > > > +             return icl_get_allowed_dbuf_mask(pipe,
> > > > > > +                                              active_pipes
> > > > > > ,
> > > > > > +                                              crtc_state);
> > > > > > +     else if (IS_GEN(dev_priv, 12))
> > > > > > +             return tgl_get_allowed_dbuf_mask(pipe,
> > > > > > +                                              active_pipes
> > > > > > ,
> > > > > > +                                              crtc_state);
> > > > > > +     /*
> > > > > > +      * For anything else just return one slice yet.
> > > > > > +      * Should be extended for other platforms.
> > > > > > +      */
> > > > > > +     return DBUF_S1_BIT;
> > > > > > +}
> > > > > > +
> > > > > >  static u64
> > > > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > > > *crtc_state,
> > > > > >                                u64 *plane_data_rate)
> > > > > > --
> > > > > > 2.17.1
> > > > > 
> > > > > 
> > > 
> > > 
> 
> --
> Ville Syrjälä
> Intel
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-20 19:18               ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-20 19:18 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

On Mon, Nov 18, 2019 at 08:25:48PM +0000, Lisovskiy, Stanislav wrote:
> Hi,
> 
> > > > > >
> > > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > > here,
> > > > > > +      * as here we only remove slices, which are not needed
> > > > > > anymore.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_intersection);
> > > > >
> > > > > This is still a totally wrong place for this. We need to be sure
> > > > > the
> > > > > planes have switched over to the new ddb layout before we power
> > > > > off
> > > > > anything.
> > > >
> > > > I have just modified already existing line here, not added by me.
> > > > The point of this patch is starting to use second DBuf slice but
> > > > not fixing all kinds of issues related to DBuf code at the same
> > > > time.
> > >
> > > You're re-enabling code I disabled on purpose. It has to be fixed
> > > properly or not touched at all. In hindsight I should have probably
> > > just ripped it all out to not give people bad ideas.
> >
> > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> >
> > I'm not reenabling it - this is code is out there right now.
> 
> > It's _dead_ code, and it's like that on purpose. You're now attempting
> > to re-enable it without fixing the underlying issues.
> 
> Sounds like some sacred thing. Dead code which is not commented, neither
> removed, but which you can't touch.  

You can touch it by either removing it or fixing it. Anything else
is a bit pointless.

> 
> > This is all again about changing all kinds of stuff at the same time.
> >
> > Why we can't just do things step by step? You have a point that
> > this is wrong place ok, but as we see current code causes no
> > regressions. Just tell me your idea where to put it and I will
> > do that, if you don't have it yet - let it stay there, if CI shows
> > no regressions currently.
> 
> >The same global state tricks apply here as with the cdclk/sagv stuff.
> >1. enable new resources/bump up clocks
> >2. reconfigure planes/etc.
> >3. wait for vblanks
> >4. disable unused resources/drop the clocks
> 
> It is different with SAGV, there we restrict whats needed first and relax
> some limitations afterwards.

It's a global resource shared by the pipes. Hence it's quite similar
to SAGV and cdclk.

> So all this was simply about that I should put this function call into
> skl_commit_modeset_disables? 

It should be in intel_atomic_commit_tail() most likely, just like cdclk.

I cooked up a small test that I believe should end up ping-ponging
between one and two slices on icl, and so should end up exploding
when the second slice gets shut down prematurely:
git://github.com/vsyrjala/intel-gpu-tools.git dbuf_slice_test

> I could just have said it straight away to me, without mentioned "dead" code
> and other stuff..

I don't read minds either.

> 
> > > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > > +}
> > > > >
> > > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > > good
> > > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > > >
> > > > Supported slices really means how much slices we support in
> > > > hardware.
> > > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > > I don't like this name.
> > >
> > > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > > ambiguous. Also we already have num_pipes, num_sprites, etc.
> >
>  ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
>  really that important and worth arguing at the very last moment?
>  whether number of
>  supported_slices_in_hardware, should be named "supported slices" or
>  "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
>  discussing this "soooo important" thing.
> 
>  Some one might come up that it is called "num_slices" or
>  "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
>  keep everyones bloated egos happy??

Since we have ddb_size in device info already it should be next to that
actually. And we should probably unify the terminology a bit so we don't
mix dbuf and ddb in the same sentence.

> 
> > > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that we only have 1
> > > > > > slice,
> > > > > > see
> > > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Just power up 1 slice, we will
> > > > > > +      * figure out later which slices we have and what we
> > > > > > need.
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > >
> > > > > That stuff really shouldn't be modified by low level functions.
> > > > > The intial value should come from readout, and after that it
> > > > > should
> > > > > be adjusted as part of the overall atomic state.
> > > >
> > > > Well, as you can see currently this is even worse - has some weird
> > > > hardcoded stuff and is modified like that. Also if we are enabling
> > > > the
> > > > dbuf slices, shouldn't we really determine what value should be
> > > > there?
> > > >
> > > > Also I wouldn't really always trust the readout, as to me it seems
> > > > icelake resets it constantly to 2 after suspend/resume.
> > >
> > > The point of readout is that we syncrhonize our software state with
> > > the
> > > current state of hardware without having to modify the hardware
> > > state.
> > > Once synchronized the driver takes over and tells the hw what to do.
> >
> > That is especially nice after you say "report error and don't care"
> > when I really want to reflect hw state in sw state, by checking whether
> > we could set dbuf slices or not.
> > Now you say that we need to synchronize software state with hw state,
> > however if I fail to modify it - then "who cares".
> 
> >Yes. If you fail to modify the hardware state the whole thing is
> >broken anyway. There's is *nothing* we can do at this stage in the
> >commit to fix it.
> 
> Ok, but why we can't at least keep the last successful state in sw
> instead of something which is completely wrong. How does that harm?

It messes up the software state that we computed with something that
we didn't compute. So going forward the state will be inconsistent.

The core concept of atomic is that the state is precomputed and
validated ahead of time. The commit phase is supposed to be dumb and
isn't allowed to modify the state. This is also why we often pass
the states as const during the commit phase, though unfortunately
we can't do that always on account of the states being slightly
overloaded with other metadata.

> 
> > > > > >
> > > > > >  struct skl_ddb_allocation {
> > > > >
> > > > > This struct is quite pointless. Should be killed off.
> > > >
> > > > Legacy stuff, not added by me. Quite fine with removing it -
> > > > but should we really do all this at the same time and in same
> > > > patch?
> > >
> > > Not the same patch. But clean up the cobwebs first, then add the new
> > > stuff. Much easier that way around.
> >
> > There is a shit ton load of cobwebs to clean. Personally my opinion is
> > that current dbuf controlling code is total garbage - which has lots of
> > hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> > disabling/enabling dbuf properly....
> > Where you were, when this was committed then?
> 
> >Looking at something else unfortunately. Also it has just kept
> >bitrotting since because no one has cared.
> 
> That is what exactly surprises me. While we have issues
> like commit serialization, sagv, mst, half of display buffer only
> enabled, bw, and we are arguing about how supported_slices
> should be named. Or that I shouldn't refer to DBuf slices as bits
> but define as values and then decode those as bits. Priorities.

Usually people complain about reviewers not pointing out
all the issues in one sitting. 

> 
> >
> > If we need to fix it - fine. But may be we should do it in stages?
> > Point of this patch is "enable second dbuf slice", next patch "remove
> > unneeded structs" and so on.
> > While you want me to shuffle everything at the same time, which is not
> > good idea, especially considering complexity and amount of legacy stuff
> > we currently have.
> 
> >Building new stuff on top of a decomposing garbage pile is not
> >a good idea: a) it's harder than building on clean foundation,
> >b) it'll probably fall over and get buried in the old garbage
> >at which point you're likely going to be right back where you
> >started
> .
> Sure, let's change everything at the same time and then wonder
> what had failed exactly in that huge series with lots of unrelated changes. 
> Standard practice.

Series can be split up to smaller chunks. My usual MO is to split
prep stuff out to separate patches, and if it looks like the overall
series is getting too big I even post the prep patches separately.
In fact I often end up with more than one prep series for whatever
actual goal I was trying to reach.

> 
> > > > >
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > >
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > >
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > >
> > > > > total_slice_mask will now account only the crtcs in the state.
> > > > > What
> > > > > happens to the other pipes' slices?
> >
> > Same as happens now btw. Right now it also does the same thing.
> 
> >Right now it does absolutely nothing. It's all dead code.
> 
> Right now, this "dead" code exactly iterates through crtcs in state and
> allocates DBuf S1 in proportion to pipe width.
> What I do is make it do the same thing but using either range of one
> or two slices, depending on pipe configuration.

OK. So how does it deal with my original question?

> 
> So should I have thrown away that code and rewritten this as well?

Not if it works.

> 
> > > > > > +static uint_fixed_16_16_t
> > > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > > &crtc_state-
> > > > > > > hw.adjusted_mode;
> > > > > >
> > > > > > +
> > > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > > +
> > > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +
> > > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > > +}
> > > > > > +
> > > > > > +uint_fixed_16_16_t
> > > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > > +{
> > > > > > +     struct drm_plane *plane;
> > > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > > >uapi.crtc);
> > > > > > +
> > > > > > +     if (!crtc_state->uapi.enable)
> > > > > > +             return max_downscale;
> > > > > > +
> > > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > > +             const struct intel_plane_state *plane_state =
> > > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > > +
> > > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > > plane_state))
> > > > > > +                     continue;
> > > > > > +
> > > > > > +             plane_downscale =
> > > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > > +
> > > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > > %d.%d\n",
> > > > > > +                           to_intel_plane(plane)->id,
> > > > > > +                           plane_downscale.val >> 16,
> > > > > > +                           plane_downscale.val & 0xffff);
> > > > > > +
> > > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > > max_downscale);
> > > > > > +     }
> > > > > > +
> > > > > > +
> > > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > > +                    pipe_downscale.val & 0xffff);
> > > > > > +
> > > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > > max_downscale);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Convert result to percentage: get the actual ratio
> > > > > > instead
> > > > > > of rate,
> > > > > > +      * then multiply by 100 to get percentage.
> > > > > > +      */
> > > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > > div_round_up_u32_fixed16(1,
> > > > > > pipe_downscale));
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > > 0xffff);
> > > > > > +
> > > > > > +     return pipe_ratio;
> > > > > > +}
> > > > >
> > > > > Yuck. I just killed these.
> > > >
> > > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > > BSpec,
> > > > not by me.
> > >
> > > That part of the spec is pretty crazy IMO. I think we should just
> > > ignore
> > > it for now. Also nothing in there says we must use both slices for a
> > > single pipe so ignoring doesn't even violate the spec in any way.
> >
> > So you are instructing me now to ignore BSpec as "we are smarter"?
> > I don't think this is properly to do it this way. Lets first change the
> > BSpec, if we don't like pipe ratio here and then driver. Currently this
> > is written that way and if we are going to do everything our own way
> > ignoring specs, this is going to be chaos.
> 
> >Like I sai you don't have to violate the spec at all. Just don't use
> >two slices for the same pipe and you can avoid the whole strange pipe
> >ratio limitation. We should file a bspec issue to get that clarified
> >though.
> 
> That is actually a violation of BSpec, we are not using two slices for same
> pipe while we could. We had already enough of bw issues, why should
> we make our life even harder?

Using more slices is, for whatever reason, harder for the hardware.
You're not allowed to use two slices if you do any significant
downscaling. Which is a bit sad/odd because increasing downscaling
increases bandwidth utilization but at the same time also prevents
us from using two slices.

I think the main benefit we would get from two slices is better
power savings because we may be able to enable more watermark
levels on account of the larger ddb.

There is nothing in the spec that I can see that says we *must* use
two slices (unless we're about to exceed the CDCDLKx64 bandwidth
limit I guess). And since we haven't been using two slices so far
it seems safer to not jump in the deep end and enable two slices
per-pipe at the same time as we do the other things. Ie. there
should be a separate patch to enable the two slices per pipe usage.

> 
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > >
> > > > > My eyes!
> > > > >
> > > > > I have to think we should be able to reduce all that to a handful
> > > > > of lines of code.
> > > >
> > > > Yeah, then we are going to have a huge function with lots of weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > >
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > >
> > > > Would you prefer to have it like
> > > >
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > >
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird looking
> > > > if-else statements, GEN checks and so on.
> > > >
> > > > I knew this table was not going to look like it is done typically
> > > > here - but really my opinion that it is way better than hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > >
> > > > If you don't like it - could you please explain why this is exactly
> > > > worse than having long functions with hardcoded checks?
> > >
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > > is the distance of each pipe from each dbuf slice. Then all we have
> > > to do is minimize the total distance. The following tables I think
> > > should encode that reasonably concisely:
> > >
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > >
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> >
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> >
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> 
> >A/S1 + B/S2 = 1 + 4 = 5
> >A/S2 + B/S1 = 2 + 1 = 3
> 
> > but pipes A C get S1 and S2.
> 
> >A/S1 + C/S2 = 1 + 1 = 2
> >A/S2 + C/S1 = 2 + 2 = 4
> 
> >So it's exactly as my distance table says it should be.
> 
> >
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> >
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> >
> > >
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > >
> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> >
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> 
> >Well, I'd rather do that than decode through layers of cpp macros.
> >Also atm the table has no extra restrictions so it's clearly there
> >just because someone thought they'd write out the results of the
> >distance calculations. In general bspec style is somewhat annoying
> >where they try to write out things in the lowest level of absraction
> >possible (they are hardware people after all). I'd much prefer a
> >higher level description of the issue.
> 
> I would prefer to see immediately what is different as this table
> makes it possible right away
> Also to me this table simply intuitively more understandable.
> 
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> 
> We have similar CDCLK table in intel_cdclk, while you could 
> also argue that we might had come up with some formula.

The cdclk table has one redundancy that I'm not 100% fan of:
it specifies both the cdclk frequency and the cd2x divider.
We could make do with just either one and calculate the other
one.

On the flip side that redundancy would allow us to cross check
that the values produce the expected results. I actually have
a patch for that buried in some branch.

Apart from that I don't see what the cdclk table has to do
with this stuff. We can't derive that cdclk table from anything
much simpler.

Anyways, my preferred approach to this sort of stuff is to
make the computer do the calculations. That's what it's good
at. Humans, not so great at computations, or even copy pasting.
Though as with the cdclk table it may be possible to make
humans do the calculations/copy paste and then have the
computer double check it all makes sense.

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-20 19:18               ` Ville Syrjälä
  0 siblings, 0 replies; 34+ messages in thread
From: Ville Syrjälä @ 2019-11-20 19:18 UTC (permalink / raw)
  To: Lisovskiy, Stanislav; +Cc: intel-gfx

On Mon, Nov 18, 2019 at 08:25:48PM +0000, Lisovskiy, Stanislav wrote:
> Hi,
> 
> > > > > >
> > > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > > here,
> > > > > > +      * as here we only remove slices, which are not needed
> > > > > > anymore.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_intersection);
> > > > >
> > > > > This is still a totally wrong place for this. We need to be sure
> > > > > the
> > > > > planes have switched over to the new ddb layout before we power
> > > > > off
> > > > > anything.
> > > >
> > > > I have just modified already existing line here, not added by me.
> > > > The point of this patch is starting to use second DBuf slice but
> > > > not fixing all kinds of issues related to DBuf code at the same
> > > > time.
> > >
> > > You're re-enabling code I disabled on purpose. It has to be fixed
> > > properly or not touched at all. In hindsight I should have probably
> > > just ripped it all out to not give people bad ideas.
> >
> > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> >
> > I'm not reenabling it - this is code is out there right now.
> 
> > It's _dead_ code, and it's like that on purpose. You're now attempting
> > to re-enable it without fixing the underlying issues.
> 
> Sounds like some sacred thing. Dead code which is not commented, neither
> removed, but which you can't touch.  

You can touch it by either removing it or fixing it. Anything else
is a bit pointless.

> 
> > This is all again about changing all kinds of stuff at the same time.
> >
> > Why we can't just do things step by step? You have a point that
> > this is wrong place ok, but as we see current code causes no
> > regressions. Just tell me your idea where to put it and I will
> > do that, if you don't have it yet - let it stay there, if CI shows
> > no regressions currently.
> 
> >The same global state tricks apply here as with the cdclk/sagv stuff.
> >1. enable new resources/bump up clocks
> >2. reconfigure planes/etc.
> >3. wait for vblanks
> >4. disable unused resources/drop the clocks
> 
> It is different with SAGV, there we restrict whats needed first and relax
> some limitations afterwards.

It's a global resource shared by the pipes. Hence it's quite similar
to SAGV and cdclk.

> So all this was simply about that I should put this function call into
> skl_commit_modeset_disables? 

It should be in intel_atomic_commit_tail() most likely, just like cdclk.

I cooked up a small test that I believe should end up ping-ponging
between one and two slices on icl, and so should end up exploding
when the second slice gets shut down prematurely:
git://github.com/vsyrjala/intel-gpu-tools.git dbuf_slice_test

> I could just have said it straight away to me, without mentioned "dead" code
> and other stuff..

I don't read minds either.

> 
> > > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > > +}
> > > > >
> > > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > > good
> > > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > > >
> > > > Supported slices really means how much slices we support in
> > > > hardware.
> > > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > > I don't like this name.
> > >
> > > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > > ambiguous. Also we already have num_pipes, num_sprites, etc.
> >
>  ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
>  really that important and worth arguing at the very last moment?
>  whether number of
>  supported_slices_in_hardware, should be named "supported slices" or
>  "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
>  discussing this "soooo important" thing.
> 
>  Some one might come up that it is called "num_slices" or
>  "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
>  keep everyones bloated egos happy??

Since we have ddb_size in device info already it should be next to that
actually. And we should probably unify the terminology a bit so we don't
mix dbuf and ddb in the same sentence.

> 
> > > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that we only have 1
> > > > > > slice,
> > > > > > see
> > > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Just power up 1 slice, we will
> > > > > > +      * figure out later which slices we have and what we
> > > > > > need.
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > >
> > > > > That stuff really shouldn't be modified by low level functions.
> > > > > The intial value should come from readout, and after that it
> > > > > should
> > > > > be adjusted as part of the overall atomic state.
> > > >
> > > > Well, as you can see currently this is even worse - has some weird
> > > > hardcoded stuff and is modified like that. Also if we are enabling
> > > > the
> > > > dbuf slices, shouldn't we really determine what value should be
> > > > there?
> > > >
> > > > Also I wouldn't really always trust the readout, as to me it seems
> > > > icelake resets it constantly to 2 after suspend/resume.
> > >
> > > The point of readout is that we syncrhonize our software state with
> > > the
> > > current state of hardware without having to modify the hardware
> > > state.
> > > Once synchronized the driver takes over and tells the hw what to do.
> >
> > That is especially nice after you say "report error and don't care"
> > when I really want to reflect hw state in sw state, by checking whether
> > we could set dbuf slices or not.
> > Now you say that we need to synchronize software state with hw state,
> > however if I fail to modify it - then "who cares".
> 
> >Yes. If you fail to modify the hardware state the whole thing is
> >broken anyway. There's is *nothing* we can do at this stage in the
> >commit to fix it.
> 
> Ok, but why we can't at least keep the last successful state in sw
> instead of something which is completely wrong. How does that harm?

It messes up the software state that we computed with something that
we didn't compute. So going forward the state will be inconsistent.

The core concept of atomic is that the state is precomputed and
validated ahead of time. The commit phase is supposed to be dumb and
isn't allowed to modify the state. This is also why we often pass
the states as const during the commit phase, though unfortunately
we can't do that always on account of the states being slightly
overloaded with other metadata.

> 
> > > > > >
> > > > > >  struct skl_ddb_allocation {
> > > > >
> > > > > This struct is quite pointless. Should be killed off.
> > > >
> > > > Legacy stuff, not added by me. Quite fine with removing it -
> > > > but should we really do all this at the same time and in same
> > > > patch?
> > >
> > > Not the same patch. But clean up the cobwebs first, then add the new
> > > stuff. Much easier that way around.
> >
> > There is a shit ton load of cobwebs to clean. Personally my opinion is
> > that current dbuf controlling code is total garbage - which has lots of
> > hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> > disabling/enabling dbuf properly....
> > Where you were, when this was committed then?
> 
> >Looking at something else unfortunately. Also it has just kept
> >bitrotting since because no one has cared.
> 
> That is what exactly surprises me. While we have issues
> like commit serialization, sagv, mst, half of display buffer only
> enabled, bw, and we are arguing about how supported_slices
> should be named. Or that I shouldn't refer to DBuf slices as bits
> but define as values and then decode those as bits. Priorities.

Usually people complain about reviewers not pointing out
all the issues in one sitting. 

> 
> >
> > If we need to fix it - fine. But may be we should do it in stages?
> > Point of this patch is "enable second dbuf slice", next patch "remove
> > unneeded structs" and so on.
> > While you want me to shuffle everything at the same time, which is not
> > good idea, especially considering complexity and amount of legacy stuff
> > we currently have.
> 
> >Building new stuff on top of a decomposing garbage pile is not
> >a good idea: a) it's harder than building on clean foundation,
> >b) it'll probably fall over and get buried in the old garbage
> >at which point you're likely going to be right back where you
> >started
> .
> Sure, let's change everything at the same time and then wonder
> what had failed exactly in that huge series with lots of unrelated changes. 
> Standard practice.

Series can be split up to smaller chunks. My usual MO is to split
prep stuff out to separate patches, and if it looks like the overall
series is getting too big I even post the prep patches separately.
In fact I often end up with more than one prep series for whatever
actual goal I was trying to reach.

> 
> > > > >
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > >
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > >
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > >
> > > > > total_slice_mask will now account only the crtcs in the state.
> > > > > What
> > > > > happens to the other pipes' slices?
> >
> > Same as happens now btw. Right now it also does the same thing.
> 
> >Right now it does absolutely nothing. It's all dead code.
> 
> Right now, this "dead" code exactly iterates through crtcs in state and
> allocates DBuf S1 in proportion to pipe width.
> What I do is make it do the same thing but using either range of one
> or two slices, depending on pipe configuration.

OK. So how does it deal with my original question?

> 
> So should I have thrown away that code and rewritten this as well?

Not if it works.

> 
> > > > > > +static uint_fixed_16_16_t
> > > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > > &crtc_state-
> > > > > > > hw.adjusted_mode;
> > > > > >
> > > > > > +
> > > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > > +
> > > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +
> > > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > > +}
> > > > > > +
> > > > > > +uint_fixed_16_16_t
> > > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > > +{
> > > > > > +     struct drm_plane *plane;
> > > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > > >uapi.crtc);
> > > > > > +
> > > > > > +     if (!crtc_state->uapi.enable)
> > > > > > +             return max_downscale;
> > > > > > +
> > > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > > +             const struct intel_plane_state *plane_state =
> > > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > > +
> > > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > > plane_state))
> > > > > > +                     continue;
> > > > > > +
> > > > > > +             plane_downscale =
> > > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > > +
> > > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > > %d.%d\n",
> > > > > > +                           to_intel_plane(plane)->id,
> > > > > > +                           plane_downscale.val >> 16,
> > > > > > +                           plane_downscale.val & 0xffff);
> > > > > > +
> > > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > > max_downscale);
> > > > > > +     }
> > > > > > +
> > > > > > +
> > > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > > +                    pipe_downscale.val & 0xffff);
> > > > > > +
> > > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > > max_downscale);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Convert result to percentage: get the actual ratio
> > > > > > instead
> > > > > > of rate,
> > > > > > +      * then multiply by 100 to get percentage.
> > > > > > +      */
> > > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > > div_round_up_u32_fixed16(1,
> > > > > > pipe_downscale));
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > > 0xffff);
> > > > > > +
> > > > > > +     return pipe_ratio;
> > > > > > +}
> > > > >
> > > > > Yuck. I just killed these.
> > > >
> > > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > > BSpec,
> > > > not by me.
> > >
> > > That part of the spec is pretty crazy IMO. I think we should just
> > > ignore
> > > it for now. Also nothing in there says we must use both slices for a
> > > single pipe so ignoring doesn't even violate the spec in any way.
> >
> > So you are instructing me now to ignore BSpec as "we are smarter"?
> > I don't think this is properly to do it this way. Lets first change the
> > BSpec, if we don't like pipe ratio here and then driver. Currently this
> > is written that way and if we are going to do everything our own way
> > ignoring specs, this is going to be chaos.
> 
> >Like I sai you don't have to violate the spec at all. Just don't use
> >two slices for the same pipe and you can avoid the whole strange pipe
> >ratio limitation. We should file a bspec issue to get that clarified
> >though.
> 
> That is actually a violation of BSpec, we are not using two slices for same
> pipe while we could. We had already enough of bw issues, why should
> we make our life even harder?

Using more slices is, for whatever reason, harder for the hardware.
You're not allowed to use two slices if you do any significant
downscaling. Which is a bit sad/odd because increasing downscaling
increases bandwidth utilization but at the same time also prevents
us from using two slices.

I think the main benefit we would get from two slices is better
power savings because we may be able to enable more watermark
levels on account of the larger ddb.

There is nothing in the spec that I can see that says we *must* use
two slices (unless we're about to exceed the CDCDLKx64 bandwidth
limit I guess). And since we haven't been using two slices so far
it seems safer to not jump in the deep end and enable two slices
per-pipe at the same time as we do the other things. Ie. there
should be a separate patch to enable the two slices per pipe usage.

> 
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > >
> > > > > My eyes!
> > > > >
> > > > > I have to think we should be able to reduce all that to a handful
> > > > > of lines of code.
> > > >
> > > > Yeah, then we are going to have a huge function with lots of weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > >
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > >
> > > > Would you prefer to have it like
> > > >
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > >
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird looking
> > > > if-else statements, GEN checks and so on.
> > > >
> > > > I knew this table was not going to look like it is done typically
> > > > here - but really my opinion that it is way better than hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > >
> > > > If you don't like it - could you please explain why this is exactly
> > > > worse than having long functions with hardcoded checks?
> > >
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > > is the distance of each pipe from each dbuf slice. Then all we have
> > > to do is minimize the total distance. The following tables I think
> > > should encode that reasonably concisely:
> > >
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > >
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> >
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> >
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> 
> >A/S1 + B/S2 = 1 + 4 = 5
> >A/S2 + B/S1 = 2 + 1 = 3
> 
> > but pipes A C get S1 and S2.
> 
> >A/S1 + C/S2 = 1 + 1 = 2
> >A/S2 + C/S1 = 2 + 2 = 4
> 
> >So it's exactly as my distance table says it should be.
> 
> >
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> >
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> >
> > >
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > >
> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> >
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> 
> >Well, I'd rather do that than decode through layers of cpp macros.
> >Also atm the table has no extra restrictions so it's clearly there
> >just because someone thought they'd write out the results of the
> >distance calculations. In general bspec style is somewhat annoying
> >where they try to write out things in the lowest level of absraction
> >possible (they are hardware people after all). I'd much prefer a
> >higher level description of the issue.
> 
> I would prefer to see immediately what is different as this table
> makes it possible right away
> Also to me this table simply intuitively more understandable.
> 
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> 
> We have similar CDCLK table in intel_cdclk, while you could 
> also argue that we might had come up with some formula.

The cdclk table has one redundancy that I'm not 100% fan of:
it specifies both the cdclk frequency and the cd2x divider.
We could make do with just either one and calculate the other
one.

On the flip side that redundancy would allow us to cross check
that the values produce the expected results. I actually have
a patch for that buried in some branch.

Apart from that I don't see what the cdclk table has to do
with this stuff. We can't derive that cdclk table from anything
much simpler.

Anyways, my preferred approach to this sort of stuff is to
make the computer do the calculations. That's what it's good
at. Humans, not so great at computations, or even copy pasting.
Though as with the cdclk table it may be possible to make
humans do the calculations/copy paste and then have the
computer double check it all makes sense.

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-20 21:55                 ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-20 21:55 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Nov 20, 2019 at 09:18:48PM +0000, Ville Syrjälä wrote:

> >
> > I'm not reenabling it - this is code is out there right now.
>
> > It's _dead_ code, and it's like that on purpose. You're now attempting
> > to re-enable it without fixing the underlying issues.
>
> Sounds like some sacred thing. Dead code which is not commented, neither
> removed, but which you can't touch.

>You can touch it by either removing it or fixing it. Anything else
>is a bit pointless.

Ok, then instead of pointless arguing, just need idea where to put it.

Like there is a big difference between "this code is wrong.point." and 
and properly explained suggestion where/why it should be put.
Also yep to me any code which is not actually working properly or
should not be used, must be removed or commented out.

I don't really get this "dead" code concept. Sounds cool though.

>It's a global resource shared by the pipes. Hence it's quite similar
>to SAGV and cdclk.

>> So all this was simply about that I should put this function call into
>> skl_commit_modeset_disables?

>It should be in intel_atomic_commit_tail() most likely, just like cdclk.

Ok. 

>I cooked up a small test that I believe should end up ping-ponging
>between one and two slices on icl, and so should end up exploding
>when the second slice gets shut down prematurely:
>git://github.com/vsyrjala/intel-gpu-tools.git dbuf_slice_test

 Yep, because currently no tests explode on this. 

>>  Some one might come up that it is called "num_slices" or
>>  "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
>>  keep everyones bloated egos happy??

Since we have ddb_size in device info already it should be next to that
actually. And we should probably unify the terminology a bit so we don't
mix dbuf and ddb in the same sentence.

May be still use ddb allocation struct to group things together, similar
how it is done for bw info in dev_priv. This will make it easier at least
in a sense that we won't have to constantly prefix this with dbuf/ddb.

Not sure if this needs to be done all in one as this feels more like
some cleanup/prep stuff, which is evergoing anyway.

I will anyway rename the var, as we wasted already way too much 
time on this..

>It messes up the software state that we computed with something that
>we didn't compute. So going forward the state will be inconsistent.

>The core concept of atomic is that the state is precomputed and
>validated ahead of time. The commit phase is supposed to be dumb and
>isn't allowed to modify the state. This is also why we often pass
>the states as const during the commit phase, though unfortunately
>we can't do that always on account of the states being slightly
>overloaded with other metadata.

Sounds more philosophical question.  You either have inconsistent
with itself software state or the software which is consistent with
itself but not with the hardware.
My idea was always to reflect hardware state as much as possible,
however I get your point here.

>> >Looking at something else unfortunately. Also it has just kept
>> >bitrotting since because no one has cared.
>>
>> That is what exactly surprises me. While we have issues
>> like commit serialization, sagv, mst, half of display buffer only
>> enabled, bw, and we are arguing about how supported_slices
>> should be named. Or that I shouldn't refer to DBuf slices as bits
>> but define as values and then decode those as bits. Priorities.

>Usually people complain about reviewers not pointing out
>all the issues in one sitting.

It was not about complaining but more about prioritization.
While trying to reach as perfect code/variable naming/placement is great,
this process is evergoing, while some things might be more important.

I don't believe DBUF_S1_BIT defined as BIT(0) or DBUF_S1 defined as 0
and then used anyway used as BIT(0) will bring any difference.

We are just arguing whether A + B or B + A is better here. 

There are lots of assumptions, "dead" code and other stuff already.

And then there is a real important stuff like commit serialization problem,
the actual features implementation which have to wait, because of that.

>
> > > > >
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > >
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > >
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > >
> > > > > total_slice_mask will now account only the crtcs in the state.
> > > > > What
> > > > > happens to the other pipes' slices?
> >
>> > Same as happens now btw. Right now it also does the same thing.
>>
>> >Right now it does absolutely nothing. It's all dead code.
>>
>> Right now, this "dead" code exactly iterates through crtcs in state and
>> allocates DBuf S1 in proportion to pipe width.
>> What I do is make it do the same thing but using either range of one
>> or two slices, depending on pipe configuration.

>OK. So how does it deal with my original question?

Original question was about that there are might be not all crtcs in the state.
I told that I know as this is existing problem and actually not brought up
by my code even. 
Currently it works that way. Just as many other things related to commit serialization
thing. But should I really fix all this in my patch, which was just supposed to 
bring proper usage of the DBuf slices?
I think this should be fixed separately with some patch, saying "start using global
state instead of current atomic state when allocating DBuf for pipes".
Because I don't want to debug issues coming from enabling second DBuf slice
and those from that issue at the same time.

>>
>> So should I have thrown away that code and rewritten this as well?

>Not if it works.

So it works or not? Or this is a "dead" code as you mentioned, 
I'm confused now.


>The cdclk table has one redundancy that I'm not 100% fan of:
>it specifies both the cdclk frequency and the cd2x divider.
>We could make do with just either one and calculate the other
>one.

>On the flip side that redundancy would allow us to cross check
>that the values produce the expected results. I actually have
>a patch for that buried in some branch.

Exactly easier cross check.

>Apart from that I don't see what the cdclk table has to do
>with this stuff. We can't derive that cdclk table from anything
>much simpler

See above. Even if you could that would eliminate table but
reduce readability.

>Anyways, my preferred approach to this sort of stuff is to
>make the computer do the calculations. That's what it's good
>at. Humans, not so great at computations, or even copy pasting.
>Though as with the cdclk table it may be possible to make
>humans do the calculations/copy paste and then have the
>computer double check it all makes sense.

Calculations are great for something complex. You will not use
calculator every time you need to add 2+2. 
The table you proposing will require some kind of minimal spanning
tree algorithm.

I.e there are metrics about how far is each DBuf slice from
each Pipe. You build a graph from that and try to get a config
which corresponds to minimal spanning tree of this graph - 
that would be optimal DBuf assignment.

However this would have to be done in runtime, doing calculations
and lots of compares. 
           S1    S2
Pipe1   1      2
Pipe2    2     1

You basically then have 1+2, 2+1,2+2,1+1 where minimum is of course 1+1, 
i.e Pipe1(S1) and Pipe2(S2)
Even for this amount of pipes you have 4 combinations to check.
You can of course forbid to use same slice, thus leaving only 2 of those,
but that will violate BSpec as it allows to use same slice.
And how about 3,4 pipes? 

Sound almost like implementing a hash function for doing search
in array of 10 elements, just as we discussed at the meeting.
Fancy, interesting, but not redundant.

So basically we are loosing readability of the code and also add
more complexity to the code(means more possible errors).
What for?

While we can just have a compile time defined mask. In a small lookup table,
which is easy to read and compare. And no run-time algorithms.
No redundant complexity.

I admit my table could be not the best way to do it. OK.
But it makes sense and works. It is always possible to improve.

Stan



Best Regards,

Lisovskiy Stanislav

Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

________________________________________
From: Ville Syrjälä [ville.syrjala@linux.intel.com]
Sent: Wednesday, November 20, 2019 9:18 PM
To: Lisovskiy, Stanislav
Cc: Saarinen, Jani; intel-gfx@lists.freedesktop.org; Roper, Matthew D; maarten.lankhorst@linux.intel.com; Ausmus, James
Subject: Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL

On Mon, Nov 18, 2019 at 08:25:48PM +0000, Lisovskiy, Stanislav wrote:
> Hi,
>
> > > > > >
> > > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > > here,
> > > > > > +      * as here we only remove slices, which are not needed
> > > > > > anymore.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_intersection);
> > > > >
> > > > > This is still a totally wrong place for this. We need to be sure
> > > > > the
> > > > > planes have switched over to the new ddb layout before we power
> > > > > off
> > > > > anything.
> > > >
> > > > I have just modified already existing line here, not added by me.
> > > > The point of this patch is starting to use second DBuf slice but
> > > > not fixing all kinds of issues related to DBuf code at the same
> > > > time.
> > >
> > > You're re-enabling code I disabled on purpose. It has to be fixed
> > > properly or not touched at all. In hindsight I should have probably
> > > just ripped it all out to not give people bad ideas.
> >
> > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> >
> > I'm not reenabling it - this is code is out there right now.
>
> > It's _dead_ code, and it's like that on purpose. You're now attempting
> > to re-enable it without fixing the underlying issues.
>
> Sounds like some sacred thing. Dead code which is not commented, neither
> removed, but which you can't touch.

You can touch it by either removing it or fixing it. Anything else
is a bit pointless.

>
> > This is all again about changing all kinds of stuff at the same time.
> >
> > Why we can't just do things step by step? You have a point that
> > this is wrong place ok, but as we see current code causes no
> > regressions. Just tell me your idea where to put it and I will
> > do that, if you don't have it yet - let it stay there, if CI shows
> > no regressions currently.
>
> >The same global state tricks apply here as with the cdclk/sagv stuff.
> >1. enable new resources/bump up clocks
> >2. reconfigure planes/etc.
> >3. wait for vblanks
> >4. disable unused resources/drop the clocks
>
> It is different with SAGV, there we restrict whats needed first and relax
> some limitations afterwards.

It's a global resource shared by the pipes. Hence it's quite similar
to SAGV and cdclk.

> So all this was simply about that I should put this function call into
> skl_commit_modeset_disables?

It should be in intel_atomic_commit_tail() most likely, just like cdclk.

I cooked up a small test that I believe should end up ping-ponging
between one and two slices on icl, and so should end up exploding
when the second slice gets shut down prematurely:
git://github.com/vsyrjala/intel-gpu-tools.git dbuf_slice_test

> I could just have said it straight away to me, without mentioned "dead" code
> and other stuff..

I don't read minds either.

>
> > > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > > +}
> > > > >
> > > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > > good
> > > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > > >
> > > > Supported slices really means how much slices we support in
> > > > hardware.
> > > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > > I don't like this name.
> > >
> > > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > > ambiguous. Also we already have num_pipes, num_sprites, etc.
> >
>  ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
>  really that important and worth arguing at the very last moment?
>  whether number of
>  supported_slices_in_hardware, should be named "supported slices" or
>  "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
>  discussing this "soooo important" thing.
>
>  Some one might come up that it is called "num_slices" or
>  "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
>  keep everyones bloated egos happy??

Since we have ddb_size in device info already it should be next to that
actually. And we should probably unify the terminology a bit so we don't
mix dbuf and ddb in the same sentence.

>
> > > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that we only have 1
> > > > > > slice,
> > > > > > see
> > > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Just power up 1 slice, we will
> > > > > > +      * figure out later which slices we have and what we
> > > > > > need.
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > >
> > > > > That stuff really shouldn't be modified by low level functions.
> > > > > The intial value should come from readout, and after that it
> > > > > should
> > > > > be adjusted as part of the overall atomic state.
> > > >
> > > > Well, as you can see currently this is even worse - has some weird
> > > > hardcoded stuff and is modified like that. Also if we are enabling
> > > > the
> > > > dbuf slices, shouldn't we really determine what value should be
> > > > there?
> > > >
> > > > Also I wouldn't really always trust the readout, as to me it seems
> > > > icelake resets it constantly to 2 after suspend/resume.
> > >
> > > The point of readout is that we syncrhonize our software state with
> > > the
> > > current state of hardware without having to modify the hardware
> > > state.
> > > Once synchronized the driver takes over and tells the hw what to do.
> >
> > That is especially nice after you say "report error and don't care"
> > when I really want to reflect hw state in sw state, by checking whether
> > we could set dbuf slices or not.
> > Now you say that we need to synchronize software state with hw state,
> > however if I fail to modify it - then "who cares".
>
> >Yes. If you fail to modify the hardware state the whole thing is
> >broken anyway. There's is *nothing* we can do at this stage in the
> >commit to fix it.
>
> Ok, but why we can't at least keep the last successful state in sw
> instead of something which is completely wrong. How does that harm?

It messes up the software state that we computed with something that
we didn't compute. So going forward the state will be inconsistent.

The core concept of atomic is that the state is precomputed and
validated ahead of time. The commit phase is supposed to be dumb and
isn't allowed to modify the state. This is also why we often pass
the states as const during the commit phase, though unfortunately
we can't do that always on account of the states being slightly
overloaded with other metadata.

>
> > > > > >
> > > > > >  struct skl_ddb_allocation {
> > > > >
> > > > > This struct is quite pointless. Should be killed off.
> > > >
> > > > Legacy stuff, not added by me. Quite fine with removing it -
> > > > but should we really do all this at the same time and in same
> > > > patch?
> > >
> > > Not the same patch. But clean up the cobwebs first, then add the new
> > > stuff. Much easier that way around.
> >
> > There is a shit ton load of cobwebs to clean. Personally my opinion is
> > that current dbuf controlling code is total garbage - which has lots of
> > hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> > disabling/enabling dbuf properly....
> > Where you were, when this was committed then?
>
> >Looking at something else unfortunately. Also it has just kept
> >bitrotting since because no one has cared.
>
> That is what exactly surprises me. While we have issues
> like commit serialization, sagv, mst, half of display buffer only
> enabled, bw, and we are arguing about how supported_slices
> should be named. Or that I shouldn't refer to DBuf slices as bits
> but define as values and then decode those as bits. Priorities.

Usually people complain about reviewers not pointing out
all the issues in one sitting.

>
> >
> > If we need to fix it - fine. But may be we should do it in stages?
> > Point of this patch is "enable second dbuf slice", next patch "remove
> > unneeded structs" and so on.
> > While you want me to shuffle everything at the same time, which is not
> > good idea, especially considering complexity and amount of legacy stuff
> > we currently have.
>
> >Building new stuff on top of a decomposing garbage pile is not
> >a good idea: a) it's harder than building on clean foundation,
> >b) it'll probably fall over and get buried in the old garbage
> >at which point you're likely going to be right back where you
> >started
> .
> Sure, let's change everything at the same time and then wonder
> what had failed exactly in that huge series with lots of unrelated changes.
> Standard practice.

Series can be split up to smaller chunks. My usual MO is to split
prep stuff out to separate patches, and if it looks like the overall
series is getting too big I even post the prep patches separately.
In fact I often end up with more than one prep series for whatever
actual goal I was trying to reach.

>
> > > > >
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > >
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > >
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > >
> > > > > total_slice_mask will now account only the crtcs in the state.
> > > > > What
> > > > > happens to the other pipes' slices?
> >
> > Same as happens now btw. Right now it also does the same thing.
>
> >Right now it does absolutely nothing. It's all dead code.
>
> Right now, this "dead" code exactly iterates through crtcs in state and
> allocates DBuf S1 in proportion to pipe width.
> What I do is make it do the same thing but using either range of one
> or two slices, depending on pipe configuration.

OK. So how does it deal with my original question?

>
> So should I have thrown away that code and rewritten this as well?

Not if it works.

>
> > > > > > +static uint_fixed_16_16_t
> > > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > > &crtc_state-
> > > > > > > hw.adjusted_mode;
> > > > > >
> > > > > > +
> > > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > > +
> > > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +
> > > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > > +}
> > > > > > +
> > > > > > +uint_fixed_16_16_t
> > > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > > +{
> > > > > > +     struct drm_plane *plane;
> > > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > > >uapi.crtc);
> > > > > > +
> > > > > > +     if (!crtc_state->uapi.enable)
> > > > > > +             return max_downscale;
> > > > > > +
> > > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > > +             const struct intel_plane_state *plane_state =
> > > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > > +
> > > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > > plane_state))
> > > > > > +                     continue;
> > > > > > +
> > > > > > +             plane_downscale =
> > > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > > +
> > > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > > %d.%d\n",
> > > > > > +                           to_intel_plane(plane)->id,
> > > > > > +                           plane_downscale.val >> 16,
> > > > > > +                           plane_downscale.val & 0xffff);
> > > > > > +
> > > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > > max_downscale);
> > > > > > +     }
> > > > > > +
> > > > > > +
> > > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > > +                    pipe_downscale.val & 0xffff);
> > > > > > +
> > > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > > max_downscale);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Convert result to percentage: get the actual ratio
> > > > > > instead
> > > > > > of rate,
> > > > > > +      * then multiply by 100 to get percentage.
> > > > > > +      */
> > > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > > div_round_up_u32_fixed16(1,
> > > > > > pipe_downscale));
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > > 0xffff);
> > > > > > +
> > > > > > +     return pipe_ratio;
> > > > > > +}
> > > > >
> > > > > Yuck. I just killed these.
> > > >
> > > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > > BSpec,
> > > > not by me.
> > >
> > > That part of the spec is pretty crazy IMO. I think we should just
> > > ignore
> > > it for now. Also nothing in there says we must use both slices for a
> > > single pipe so ignoring doesn't even violate the spec in any way.
> >
> > So you are instructing me now to ignore BSpec as "we are smarter"?
> > I don't think this is properly to do it this way. Lets first change the
> > BSpec, if we don't like pipe ratio here and then driver. Currently this
> > is written that way and if we are going to do everything our own way
> > ignoring specs, this is going to be chaos.
>
> >Like I sai you don't have to violate the spec at all. Just don't use
> >two slices for the same pipe and you can avoid the whole strange pipe
> >ratio limitation. We should file a bspec issue to get that clarified
> >though.
>
> That is actually a violation of BSpec, we are not using two slices for same
> pipe while we could. We had already enough of bw issues, why should
> we make our life even harder?

Using more slices is, for whatever reason, harder for the hardware.
You're not allowed to use two slices if you do any significant
downscaling. Which is a bit sad/odd because increasing downscaling
increases bandwidth utilization but at the same time also prevents
us from using two slices.

I think the main benefit we would get from two slices is better
power savings because we may be able to enable more watermark
levels on account of the larger ddb.

There is nothing in the spec that I can see that says we *must* use
two slices (unless we're about to exceed the CDCDLKx64 bandwidth
limit I guess). And since we haven't been using two slices so far
it seems safer to not jump in the deep end and enable two slices
per-pipe at the same time as we do the other things. Ie. there
should be a separate patch to enable the two slices per pipe usage.

>
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > >
> > > > > My eyes!
> > > > >
> > > > > I have to think we should be able to reduce all that to a handful
> > > > > of lines of code.
> > > >
> > > > Yeah, then we are going to have a huge function with lots of weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > >
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > >
> > > > Would you prefer to have it like
> > > >
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > >
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird looking
> > > > if-else statements, GEN checks and so on.
> > > >
> > > > I knew this table was not going to look like it is done typically
> > > > here - but really my opinion that it is way better than hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > >
> > > > If you don't like it - could you please explain why this is exactly
> > > > worse than having long functions with hardcoded checks?
> > >
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > > is the distance of each pipe from each dbuf slice. Then all we have
> > > to do is minimize the total distance. The following tables I think
> > > should encode that reasonably concisely:
> > >
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > >
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> >
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> >
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
>
> >A/S1 + B/S2 = 1 + 4 = 5
> >A/S2 + B/S1 = 2 + 1 = 3
>
> > but pipes A C get S1 and S2.
>
> >A/S1 + C/S2 = 1 + 1 = 2
> >A/S2 + C/S1 = 2 + 2 = 4
>
> >So it's exactly as my distance table says it should be.
>
> >
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> >
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> >
> > >
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > >
> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> >
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
>
> >Well, I'd rather do that than decode through layers of cpp macros.
> >Also atm the table has no extra restrictions so it's clearly there
> >just because someone thought they'd write out the results of the
> >distance calculations. In general bspec style is somewhat annoying
> >where they try to write out things in the lowest level of absraction
> >possible (they are hardware people after all). I'd much prefer a
> >higher level description of the issue.
>
> I would prefer to see immediately what is different as this table
> makes it possible right away
> Also to me this table simply intuitively more understandable.
>
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
>
> We have similar CDCLK table in intel_cdclk, while you could
> also argue that we might had come up with some formula.

The cdclk table has one redundancy that I'm not 100% fan of:
it specifies both the cdclk frequency and the cd2x divider.
We could make do with just either one and calculate the other
one.

On the flip side that redundancy would allow us to cross check
that the values produce the expected results. I actually have
a patch for that buried in some branch.

Apart from that I don't see what the cdclk table has to do
with this stuff. We can't derive that cdclk table from anything
much simpler.

Anyways, my preferred approach to this sort of stuff is to
make the computer do the calculations. That's what it's good
at. Humans, not so great at computations, or even copy pasting.
Though as with the cdclk table it may be possible to make
humans do the calculations/copy paste and then have the
computer double check it all makes sense.

--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [Intel-gfx] [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL
@ 2019-11-20 21:55                 ` Lisovskiy, Stanislav
  0 siblings, 0 replies; 34+ messages in thread
From: Lisovskiy, Stanislav @ 2019-11-20 21:55 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On Mon, Nov 20, 2019 at 09:18:48PM +0000, Ville Syrjälä wrote:

> >
> > I'm not reenabling it - this is code is out there right now.
>
> > It's _dead_ code, and it's like that on purpose. You're now attempting
> > to re-enable it without fixing the underlying issues.
>
> Sounds like some sacred thing. Dead code which is not commented, neither
> removed, but which you can't touch.

>You can touch it by either removing it or fixing it. Anything else
>is a bit pointless.

Ok, then instead of pointless arguing, just need idea where to put it.

Like there is a big difference between "this code is wrong.point." and 
and properly explained suggestion where/why it should be put.
Also yep to me any code which is not actually working properly or
should not be used, must be removed or commented out.

I don't really get this "dead" code concept. Sounds cool though.

>It's a global resource shared by the pipes. Hence it's quite similar
>to SAGV and cdclk.

>> So all this was simply about that I should put this function call into
>> skl_commit_modeset_disables?

>It should be in intel_atomic_commit_tail() most likely, just like cdclk.

Ok. 

>I cooked up a small test that I believe should end up ping-ponging
>between one and two slices on icl, and so should end up exploding
>when the second slice gets shut down prematurely:
>git://github.com/vsyrjala/intel-gpu-tools.git dbuf_slice_test

 Yep, because currently no tests explode on this. 

>>  Some one might come up that it is called "num_slices" or
>>  "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
>>  keep everyones bloated egos happy??

Since we have ddb_size in device info already it should be next to that
actually. And we should probably unify the terminology a bit so we don't
mix dbuf and ddb in the same sentence.

May be still use ddb allocation struct to group things together, similar
how it is done for bw info in dev_priv. This will make it easier at least
in a sense that we won't have to constantly prefix this with dbuf/ddb.

Not sure if this needs to be done all in one as this feels more like
some cleanup/prep stuff, which is evergoing anyway.

I will anyway rename the var, as we wasted already way too much 
time on this..

>It messes up the software state that we computed with something that
>we didn't compute. So going forward the state will be inconsistent.

>The core concept of atomic is that the state is precomputed and
>validated ahead of time. The commit phase is supposed to be dumb and
>isn't allowed to modify the state. This is also why we often pass
>the states as const during the commit phase, though unfortunately
>we can't do that always on account of the states being slightly
>overloaded with other metadata.

Sounds more philosophical question.  You either have inconsistent
with itself software state or the software which is consistent with
itself but not with the hardware.
My idea was always to reflect hardware state as much as possible,
however I get your point here.

>> >Looking at something else unfortunately. Also it has just kept
>> >bitrotting since because no one has cared.
>>
>> That is what exactly surprises me. While we have issues
>> like commit serialization, sagv, mst, half of display buffer only
>> enabled, bw, and we are arguing about how supported_slices
>> should be named. Or that I shouldn't refer to DBuf slices as bits
>> but define as values and then decode those as bits. Priorities.

>Usually people complain about reviewers not pointing out
>all the issues in one sitting.

It was not about complaining but more about prioritization.
While trying to reach as perfect code/variable naming/placement is great,
this process is evergoing, while some things might be more important.

I don't believe DBUF_S1_BIT defined as BIT(0) or DBUF_S1 defined as 0
and then used anyway used as BIT(0) will bring any difference.

We are just arguing whether A + B or B + A is better here. 

There are lots of assumptions, "dead" code and other stuff already.

And then there is a real important stuff like commit serialization problem,
the actual features implementation which have to wait, because of that.

>
> > > > >
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > >
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > >
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > >
> > > > > total_slice_mask will now account only the crtcs in the state.
> > > > > What
> > > > > happens to the other pipes' slices?
> >
>> > Same as happens now btw. Right now it also does the same thing.
>>
>> >Right now it does absolutely nothing. It's all dead code.
>>
>> Right now, this "dead" code exactly iterates through crtcs in state and
>> allocates DBuf S1 in proportion to pipe width.
>> What I do is make it do the same thing but using either range of one
>> or two slices, depending on pipe configuration.

>OK. So how does it deal with my original question?

Original question was about that there are might be not all crtcs in the state.
I told that I know as this is existing problem and actually not brought up
by my code even. 
Currently it works that way. Just as many other things related to commit serialization
thing. But should I really fix all this in my patch, which was just supposed to 
bring proper usage of the DBuf slices?
I think this should be fixed separately with some patch, saying "start using global
state instead of current atomic state when allocating DBuf for pipes".
Because I don't want to debug issues coming from enabling second DBuf slice
and those from that issue at the same time.

>>
>> So should I have thrown away that code and rewritten this as well?

>Not if it works.

So it works or not? Or this is a "dead" code as you mentioned, 
I'm confused now.


>The cdclk table has one redundancy that I'm not 100% fan of:
>it specifies both the cdclk frequency and the cd2x divider.
>We could make do with just either one and calculate the other
>one.

>On the flip side that redundancy would allow us to cross check
>that the values produce the expected results. I actually have
>a patch for that buried in some branch.

Exactly easier cross check.

>Apart from that I don't see what the cdclk table has to do
>with this stuff. We can't derive that cdclk table from anything
>much simpler

See above. Even if you could that would eliminate table but
reduce readability.

>Anyways, my preferred approach to this sort of stuff is to
>make the computer do the calculations. That's what it's good
>at. Humans, not so great at computations, or even copy pasting.
>Though as with the cdclk table it may be possible to make
>humans do the calculations/copy paste and then have the
>computer double check it all makes sense.

Calculations are great for something complex. You will not use
calculator every time you need to add 2+2. 
The table you proposing will require some kind of minimal spanning
tree algorithm.

I.e there are metrics about how far is each DBuf slice from
each Pipe. You build a graph from that and try to get a config
which corresponds to minimal spanning tree of this graph - 
that would be optimal DBuf assignment.

However this would have to be done in runtime, doing calculations
and lots of compares. 
           S1    S2
Pipe1   1      2
Pipe2    2     1

You basically then have 1+2, 2+1,2+2,1+1 where minimum is of course 1+1, 
i.e Pipe1(S1) and Pipe2(S2)
Even for this amount of pipes you have 4 combinations to check.
You can of course forbid to use same slice, thus leaving only 2 of those,
but that will violate BSpec as it allows to use same slice.
And how about 3,4 pipes? 

Sound almost like implementing a hash function for doing search
in array of 10 elements, just as we discussed at the meeting.
Fancy, interesting, but not redundant.

So basically we are loosing readability of the code and also add
more complexity to the code(means more possible errors).
What for?

While we can just have a compile time defined mask. In a small lookup table,
which is easy to read and compare. And no run-time algorithms.
No redundant complexity.

I admit my table could be not the best way to do it. OK.
But it makes sense and works. It is always possible to improve.

Stan



Best Regards,

Lisovskiy Stanislav

Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

________________________________________
From: Ville Syrjälä [ville.syrjala@linux.intel.com]
Sent: Wednesday, November 20, 2019 9:18 PM
To: Lisovskiy, Stanislav
Cc: Saarinen, Jani; intel-gfx@lists.freedesktop.org; Roper, Matthew D; maarten.lankhorst@linux.intel.com; Ausmus, James
Subject: Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL

On Mon, Nov 18, 2019 at 08:25:48PM +0000, Lisovskiy, Stanislav wrote:
> Hi,
>
> > > > > >
> > > > > > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > > here,
> > > > > > +      * as here we only remove slices, which are not needed
> > > > > > anymore.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_intersection);
> > > > >
> > > > > This is still a totally wrong place for this. We need to be sure
> > > > > the
> > > > > planes have switched over to the new ddb layout before we power
> > > > > off
> > > > > anything.
> > > >
> > > > I have just modified already existing line here, not added by me.
> > > > The point of this patch is starting to use second DBuf slice but
> > > > not fixing all kinds of issues related to DBuf code at the same
> > > > time.
> > >
> > > You're re-enabling code I disabled on purpose. It has to be fixed
> > > properly or not touched at all. In hindsight I should have probably
> > > just ripped it all out to not give people bad ideas.
> >
> > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> >
> > I'm not reenabling it - this is code is out there right now.
>
> > It's _dead_ code, and it's like that on purpose. You're now attempting
> > to re-enable it without fixing the underlying issues.
>
> Sounds like some sacred thing. Dead code which is not commented, neither
> removed, but which you can't touch.

You can touch it by either removing it or fixing it. Anything else
is a bit pointless.

>
> > This is all again about changing all kinds of stuff at the same time.
> >
> > Why we can't just do things step by step? You have a point that
> > this is wrong place ok, but as we see current code causes no
> > regressions. Just tell me your idea where to put it and I will
> > do that, if you don't have it yet - let it stay there, if CI shows
> > no regressions currently.
>
> >The same global state tricks apply here as with the cdclk/sagv stuff.
> >1. enable new resources/bump up clocks
> >2. reconfigure planes/etc.
> >3. wait for vblanks
> >4. disable unused resources/drop the clocks
>
> It is different with SAGV, there we restrict whats needed first and relax
> some limitations afterwards.

It's a global resource shared by the pipes. Hence it's quite similar
to SAGV and cdclk.

> So all this was simply about that I should put this function call into
> skl_commit_modeset_disables?

It should be in intel_atomic_commit_tail() most likely, just like cdclk.

I cooked up a small test that I believe should end up ping-ponging
between one and two slices on icl, and so should end up exploding
when the second slice gets shut down prematurely:
git://github.com/vsyrjala/intel-gpu-tools.git dbuf_slice_test

> I could just have said it straight away to me, without mentioned "dead" code
> and other stuff..

I don't read minds either.

>
> > > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > > +}
> > > > >
> > > > > Kinda pointless function now. 'supported_slices' is not a very
> > > > > good
> > > > > name. Should be 'num_dbuf_slices' or something along those lines.
> > > >
> > > > Supported slices really means how much slices we support in
> > > > hardware.
> > > > num_dbuf_slices will not reflect that, it will be more ambigious.
> > > > I don't like this name.
> > >
> > > supported_slices doesn't even say "dbuf" anywhere. So it's even more
> > > ambiguous. Also we already have num_pipes, num_sprites, etc.
> >
>  ..and in current code we have... "enabled_slices", no? Ouch! :) IS this
>  really that important and worth arguing at the very last moment?
>  whether number of
>  supported_slices_in_hardware, should be named "supported slices" or
>  "num_dbuf_slices"? If yes, lets go on, miss some targets and continue
>  discussing this "soooo important" thing.
>
>  Some one might come up that it is called "num_slices" or
>  "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
>  keep everyones bloated egos happy??

Since we have ddb_size in device info already it should be next to that
actually. And we should probably unify the terminology a bit so we don't
mix dbuf and ddb in the same sentence.

>
> > > > > >  static void icl_dbuf_enable(struct drm_i915_private *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that we only have 1
> > > > > > slice,
> > > > > > see
> > > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Just power up 1 slice, we will
> > > > > > +      * figure out later which slices we have and what we
> > > > > > need.
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > >
> > > > > That stuff really shouldn't be modified by low level functions.
> > > > > The intial value should come from readout, and after that it
> > > > > should
> > > > > be adjusted as part of the overall atomic state.
> > > >
> > > > Well, as you can see currently this is even worse - has some weird
> > > > hardcoded stuff and is modified like that. Also if we are enabling
> > > > the
> > > > dbuf slices, shouldn't we really determine what value should be
> > > > there?
> > > >
> > > > Also I wouldn't really always trust the readout, as to me it seems
> > > > icelake resets it constantly to 2 after suspend/resume.
> > >
> > > The point of readout is that we syncrhonize our software state with
> > > the
> > > current state of hardware without having to modify the hardware
> > > state.
> > > Once synchronized the driver takes over and tells the hw what to do.
> >
> > That is especially nice after you say "report error and don't care"
> > when I really want to reflect hw state in sw state, by checking whether
> > we could set dbuf slices or not.
> > Now you say that we need to synchronize software state with hw state,
> > however if I fail to modify it - then "who cares".
>
> >Yes. If you fail to modify the hardware state the whole thing is
> >broken anyway. There's is *nothing* we can do at this stage in the
> >commit to fix it.
>
> Ok, but why we can't at least keep the last successful state in sw
> instead of something which is completely wrong. How does that harm?

It messes up the software state that we computed with something that
we didn't compute. So going forward the state will be inconsistent.

The core concept of atomic is that the state is precomputed and
validated ahead of time. The commit phase is supposed to be dumb and
isn't allowed to modify the state. This is also why we often pass
the states as const during the commit phase, though unfortunately
we can't do that always on account of the states being slightly
overloaded with other metadata.

>
> > > > > >
> > > > > >  struct skl_ddb_allocation {
> > > > >
> > > > > This struct is quite pointless. Should be killed off.
> > > >
> > > > Legacy stuff, not added by me. Quite fine with removing it -
> > > > but should we really do all this at the same time and in same
> > > > patch?
> > >
> > > Not the same patch. But clean up the cobwebs first, then add the new
> > > stuff. Much easier that way around.
> >
> > There is a shit ton load of cobwebs to clean. Personally my opinion is
> > that current dbuf controlling code is total garbage - which has lots of
> > hardcoded stuff. Good luck enabling only second dbuf slice, good luck
> > disabling/enabling dbuf properly....
> > Where you were, when this was committed then?
>
> >Looking at something else unfortunately. Also it has just kept
> >bitrotting since because no one has cared.
>
> That is what exactly surprises me. While we have issues
> like commit serialization, sagv, mst, half of display buffer only
> enabled, bw, and we are arguing about how supported_slices
> should be named. Or that I shouldn't refer to DBuf slices as bits
> but define as values and then decode those as bits. Priorities.

Usually people complain about reviewers not pointing out
all the issues in one sitting.

>
> >
> > If we need to fix it - fine. But may be we should do it in stages?
> > Point of this patch is "enable second dbuf slice", next patch "remove
> > unneeded structs" and so on.
> > While you want me to shuffle everything at the same time, which is not
> > good idea, especially considering complexity and amount of legacy stuff
> > we currently have.
>
> >Building new stuff on top of a decomposing garbage pile is not
> >a good idea: a) it's harder than building on clean foundation,
> >b) it'll probably fall over and get buried in the old garbage
> >at which point you're likely going to be right back where you
> >started
> .
> Sure, let's change everything at the same time and then wonder
> what had failed exactly in that huge series with lots of unrelated changes.
> Standard practice.

Series can be split up to smaller chunks. My usual MO is to split
prep stuff out to separate patches, and if it looks like the overall
series is getting too big I even post the prep patches separately.
In fact I often end up with more than one prep series for whatever
actual goal I was trying to reach.

>
> > > > >
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > >
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > >
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > >
> > > > > total_slice_mask will now account only the crtcs in the state.
> > > > > What
> > > > > happens to the other pipes' slices?
> >
> > Same as happens now btw. Right now it also does the same thing.
>
> >Right now it does absolutely nothing. It's all dead code.
>
> Right now, this "dead" code exactly iterates through crtcs in state and
> allocates DBuf S1 in proportion to pipe width.
> What I do is make it do the same thing but using either range of one
> or two slices, depending on pipe configuration.

OK. So how does it deal with my original question?

>
> So should I have thrown away that code and rewritten this as well?

Not if it works.

>
> > > > > > +static uint_fixed_16_16_t
> > > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > > &crtc_state-
> > > > > > > hw.adjusted_mode;
> > > > > >
> > > > > > +
> > > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > > +
> > > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +
> > > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > > +}
> > > > > > +
> > > > > > +uint_fixed_16_16_t
> > > > > > +icl_get_pipe_ratio(const struct intel_crtc_state *crtc_state)
> > > > > > +{
> > > > > > +     struct drm_plane *plane;
> > > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > > >uapi.crtc);
> > > > > > +
> > > > > > +     if (!crtc_state->uapi.enable)
> > > > > > +             return max_downscale;
> > > > > > +
> > > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > > +             const struct intel_plane_state *plane_state =
> > > > > > +                     to_intel_plane_state(drm_plane_state);
> > > > > > +
> > > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > > plane_state))
> > > > > > +                     continue;
> > > > > > +
> > > > > > +             plane_downscale =
> > > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > > +
> > > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > > %d.%d\n",
> > > > > > +                           to_intel_plane(plane)->id,
> > > > > > +                           plane_downscale.val >> 16,
> > > > > > +                           plane_downscale.val & 0xffff);
> > > > > > +
> > > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > > max_downscale);
> > > > > > +     }
> > > > > > +
> > > > > > +
> > > > > > +     pipe_downscale = icl_pipe_downscale_amount(crtc_state);
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > > +                    pipe_downscale.val & 0xffff);
> > > > > > +
> > > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > > max_downscale);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Convert result to percentage: get the actual ratio
> > > > > > instead
> > > > > > of rate,
> > > > > > +      * then multiply by 100 to get percentage.
> > > > > > +      */
> > > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > > div_round_up_u32_fixed16(1,
> > > > > > pipe_downscale));
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > > 0xffff);
> > > > > > +
> > > > > > +     return pipe_ratio;
> > > > > > +}
> > > > >
> > > > > Yuck. I just killed these.
> > > >
> > > > I still need pipe_ratio fro ICL as you know, this is dictated by
> > > > BSpec,
> > > > not by me.
> > >
> > > That part of the spec is pretty crazy IMO. I think we should just
> > > ignore
> > > it for now. Also nothing in there says we must use both slices for a
> > > single pipe so ignoring doesn't even violate the spec in any way.
> >
> > So you are instructing me now to ignore BSpec as "we are smarter"?
> > I don't think this is properly to do it this way. Lets first change the
> > BSpec, if we don't like pipe ratio here and then driver. Currently this
> > is written that way and if we are going to do everything our own way
> > ignoring specs, this is going to be chaos.
>
> >Like I sai you don't have to violate the spec at all. Just don't use
> >two slices for the same pipe and you can avoid the whole strange pipe
> >ratio limitation. We should file a bspec issue to get that clarified
> >though.
>
> That is actually a violation of BSpec, we are not using two slices for same
> pipe while we could. We had already enough of bw issues, why should
> we make our life even harder?

Using more slices is, for whatever reason, harder for the hardware.
You're not allowed to use two slices if you do any significant
downscaling. Which is a bit sad/odd because increasing downscaling
increases bandwidth utilization but at the same time also prevents
us from using two slices.

I think the main benefit we would get from two slices is better
power savings because we may be able to enable more watermark
levels on account of the larger ddb.

There is nothing in the spec that I can see that says we *must* use
two slices (unless we're about to exceed the CDCDLKx64 bandwidth
limit I guess). And since we haven't been using two slices so far
it seems safer to not jump in the deep end and enable two slices
per-pipe at the same time as we do the other things. Ie. there
should be a separate patch to enable the two slices per pipe usage.

>
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > >
> > > > > My eyes!
> > > > >
> > > > > I have to think we should be able to reduce all that to a handful
> > > > > of lines of code.
> > > >
> > > > Yeah, then we are going to have a huge function with lots of weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > >
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > >
> > > > Would you prefer to have it like
> > > >
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > >
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird looking
> > > > if-else statements, GEN checks and so on.
> > > >
> > > > I knew this table was not going to look like it is done typically
> > > > here - but really my opinion that it is way better than hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > >
> > > > If you don't like it - could you please explain why this is exactly
> > > > worse than having long functions with hardcoded checks?
> > >
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify what
> > > is the distance of each pipe from each dbuf slice. Then all we have
> > > to do is minimize the total distance. The following tables I think
> > > should encode that reasonably concisely:
> > >
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > >
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> >
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> >
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
>
> >A/S1 + B/S2 = 1 + 4 = 5
> >A/S2 + B/S1 = 2 + 1 = 3
>
> > but pipes A C get S1 and S2.
>
> >A/S1 + C/S2 = 1 + 1 = 2
> >A/S2 + C/S1 = 2 + 2 = 4
>
> >So it's exactly as my distance table says it should be.
>
> >
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> >
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> >
> > >
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > >
> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> >
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
>
> >Well, I'd rather do that than decode through layers of cpp macros.
> >Also atm the table has no extra restrictions so it's clearly there
> >just because someone thought they'd write out the results of the
> >distance calculations. In general bspec style is somewhat annoying
> >where they try to write out things in the lowest level of absraction
> >possible (they are hardware people after all). I'd much prefer a
> >higher level description of the issue.
>
> I would prefer to see immediately what is different as this table
> makes it possible right away
> Also to me this table simply intuitively more understandable.
>
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
>
> We have similar CDCLK table in intel_cdclk, while you could
> also argue that we might had come up with some formula.

The cdclk table has one redundancy that I'm not 100% fan of:
it specifies both the cdclk frequency and the cd2x divider.
We could make do with just either one and calculate the other
one.

On the flip side that redundancy would allow us to cross check
that the values produce the expected results. I actually have
a patch for that buried in some branch.

Apart from that I don't see what the cdclk table has to do
with this stuff. We can't derive that cdclk table from anything
much simpler.

Anyways, my preferred approach to this sort of stuff is to
make the computer do the calculations. That's what it's good
at. Humans, not so great at computations, or even copy pasting.
Though as with the cdclk table it may be possible to make
humans do the calculations/copy paste and then have the
computer double check it all makes sense.

--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2019-11-20 21:56 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-14 12:24 [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL Stanislav Lisovskiy
2019-11-14 12:24 ` [Intel-gfx] " Stanislav Lisovskiy
2019-11-14 13:49 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Enable second dbuf slice for ICL and TGL (rev4) Patchwork
2019-11-14 13:49   ` [Intel-gfx] " Patchwork
2019-11-14 13:50 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-11-14 13:50   ` [Intel-gfx] " Patchwork
2019-11-14 14:18 ` ✓ Fi.CI.BAT: success " Patchwork
2019-11-14 14:18   ` [Intel-gfx] " Patchwork
2019-11-15  9:59 ` ✓ Fi.CI.IGT: " Patchwork
2019-11-15  9:59   ` [Intel-gfx] " Patchwork
2019-11-15 20:19 ` [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL Ville Syrjälä
2019-11-15 20:19   ` [Intel-gfx] " Ville Syrjälä
2019-11-18  9:19   ` Lisovskiy, Stanislav
2019-11-18  9:19     ` [Intel-gfx] " Lisovskiy, Stanislav
2019-11-18 15:27     ` Ville Syrjälä
2019-11-18 15:27       ` [Intel-gfx] " Ville Syrjälä
2019-11-18 17:17       ` Lisovskiy, Stanislav
2019-11-18 17:17         ` [Intel-gfx] " Lisovskiy, Stanislav
2019-11-18 18:18         ` Ville Syrjälä
2019-11-18 18:18           ` [Intel-gfx] " Ville Syrjälä
2019-11-18 20:25           ` Lisovskiy, Stanislav
2019-11-18 20:25             ` [Intel-gfx] " Lisovskiy, Stanislav
2019-11-19 13:24             ` Lisovskiy, Stanislav
2019-11-19 13:24               ` [Intel-gfx] " Lisovskiy, Stanislav
2019-11-20 19:18             ` Ville Syrjälä
2019-11-20 19:18               ` [Intel-gfx] " Ville Syrjälä
2019-11-20 21:55               ` Lisovskiy, Stanislav
2019-11-20 21:55                 ` [Intel-gfx] " Lisovskiy, Stanislav
2019-11-18 14:22 ` Jani Nikula
2019-11-18 14:22   ` [Intel-gfx] " Jani Nikula
2019-11-18 14:32   ` Lisovskiy, Stanislav
2019-11-18 14:32     ` [Intel-gfx] " Lisovskiy, Stanislav
2019-11-18 15:26     ` Jani Nikula
2019-11-18 15:26       ` [Intel-gfx] " Jani Nikula

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.