[PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC
@ 2016-11-24  4:01 Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This series implements following set of functionality
	Implement IPC WA's for Broxton/KBL
	Enable IPC in supported platforms
	Convert WM calculation to fixed point calculation
	Calculation of System memory Bandwidth for SKL/KBL/BXT
	Implementation of Arbitrated memory Bandwidth related WM WA's


Mahesh Kumar (8):
  drm/i915/skl: Add variables to check x_tile and y_tile
  drm/i915/bxt: IPC WA for Broxton
  drm/i915/kbl: IPC workaround for kabylake
  drm/i915/bxt: Enable IPC support
  drm/i915/skl+: change WM calc to fixed point 16.16
  drm/i915: Add intel_atomic_get_existing_crtc_state function
  drm/i915: Decode system memory bandwidth
  drm/i915/gen9: WM memory bandwidth related workaround

 drivers/gpu/drm/i915/i915_drv.c  | 175 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h  | 107 ++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h  |  38 ++++++
 drivers/gpu/drm/i915/intel_drv.h |  15 +++
 drivers/gpu/drm/i915/intel_pm.c  | 265 ++++++++++++++++++++++++++++++++-------
 5 files changed, 553 insertions(+), 47 deletions(-)

-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 1/8] drm/i915/skl: Add variables to check x_tile and y_tile
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton Mahesh Kumar
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch adds variable to check for X_tiled & y_tiled planes, instead
of always checking against framebuffer-modifiers.

Changes:
 - Created separate patch as per Paulo's comment
 - Added x_tiled variable as well
Changes since V2:
 - Incorporate Paulo's comments
 - Rebase

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index e54708d..8908736 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3592,13 +3592,18 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct intel_atomic_state *state =
 		to_intel_atomic_state(cstate->base.state);
 	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
+	bool y_tiled, x_tiled;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
 		*enabled = false;
 		return 0;
 	}
 
-	if (apply_memory_bw_wa && fb->modifier == I915_FORMAT_MOD_X_TILED)
+	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
+				fb->modifier == I915_FORMAT_MOD_Yf_TILED;
+	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
+
+	if (apply_memory_bw_wa && x_tiled)
 		latency += 15;
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
@@ -3637,16 +3642,15 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		y_min_scanlines *= 2;
 
 	plane_bytes_per_line = width * cpp;
-	if (fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-	    fb->modifier == I915_FORMAT_MOD_Yf_TILED) {
+	if (y_tiled) {
 		plane_blocks_per_line =
 		      DIV_ROUND_UP(plane_bytes_per_line * y_min_scanlines, 512);
 		plane_blocks_per_line /= y_min_scanlines;
-	} else if (fb->modifier == DRM_FORMAT_MOD_NONE) {
+	} else if (x_tiled) {
+		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
+	} else {
 		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512)
 					+ 1;
-	} else {
-		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
 	}
 
 	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
@@ -3657,8 +3661,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
 
-	if (fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-	    fb->modifier == I915_FORMAT_MOD_Yf_TILED) {
+	if (y_tiled) {
 		selected_result = max(method2, y_tile_minimum);
 	} else {
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
@@ -3674,8 +3677,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	res_lines = DIV_ROUND_UP(selected_result, plane_blocks_per_line);
 
 	if (level >= 1 && level <= 7) {
-		if (fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-		    fb->modifier == I915_FORMAT_MOD_Yf_TILED) {
+		if (y_tiled) {
 			res_blocks += y_tile_minimum;
 			res_lines += y_min_scanlines;
 		} else {
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24 12:51   ` Lankhorst, Maarten
  2016-11-24  4:01 ` [PATCH v6 3/8] drm/i915/kbl: IPC workaround for kabylake Mahesh Kumar
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

If IPC is enabled in BXT, display underruns are observed.
WA: The Line Time programmed in the WM_LINETIME register should be
half of the actual calculated Line Time.

Programmed Line Time = 1/2*Calculated Line Time

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |  2 ++
 drivers/gpu/drm/i915/i915_drv.h |  2 ++
 drivers/gpu/drm/i915/intel_pm.c | 11 +++++++++--
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 445fec9..1b0a589 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1243,6 +1243,8 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	intel_runtime_pm_enable(dev_priv);
 
+	dev_priv->ipc_enabled = false;
+
 	/* Everything is in place, we can now relax! */
 	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
 		 driver.name, driver.major, driver.minor, driver.patchlevel,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 63c0ea0..6691a4e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2128,6 +2128,8 @@ struct drm_i915_private {
 	/* perform PHY state sanity checks? */
 	bool chv_phy_assert[2];
 
+	bool ipc_enabled;
+
 	/* Used to save the pipe-to-encoder mapping for audio */
 	struct intel_encoder *av_enc_map[I915_MAX_PIPES];
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8908736..7090a7c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3769,7 +3769,10 @@ skl_compute_wm_level(const struct drm_i915_private *dev_priv,
 static uint32_t
 skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 {
+	struct drm_atomic_state *state = cstate->base.state;
+	struct drm_i915_private *dev_priv = to_i915(state->dev);
 	uint32_t pixel_rate;
+	uint32_t linetime_wm;
 
 	if (!cstate->base.active)
 		return 0;
@@ -3779,8 +3782,12 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 	if (WARN_ON(pixel_rate == 0))
 		return 0;
 
-	return DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal * 1000,
-			    pixel_rate);
+	linetime_wm = DIV_ROUND_UP(8 * cstate->base.adjusted_mode.crtc_htotal *
+						1000, pixel_rate);
+	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
+		linetime_wm = DIV_ROUND_UP(linetime_wm, 2);
+
+	return linetime_wm;
 }
 
 static void skl_compute_transition_wm(struct intel_crtc_state *cstate,
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 3/8] drm/i915/kbl: IPC workaround for kabylake
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 4/8] drm/i915/bxt: Enable IPC support Mahesh Kumar
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

IPC (Isoch Priority Control) may cause underflows.

KBL WA: When IPC is enabled, watermark latency values must be increased
by 4us across all levels. This brings level 0 up to 6us.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 7090a7c..33d22cc 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3603,6 +3603,10 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				fb->modifier == I915_FORMAT_MOD_Yf_TILED;
 	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
 
+	/* IPC WA for kabylake */
+	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
+		latency += 4;
+
 	if (apply_memory_bw_wa && x_tiled)
 		latency += 15;
 
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 4/8] drm/i915/bxt: Enable IPC support
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (2 preceding siblings ...)
  2016-11-24  4:01 ` [PATCH v6 3/8] drm/i915/kbl: IPC workaround for kabylake Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 5/8] drm/i915/skl+: change WM calc to fixed point 16.16 Mahesh Kumar
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch adds IPC support for platforms. This patch enables IPC
only for BXT/KBL platform as for SKL recommendation is to keep is disabled.
IPC (Isochronous Priority Control) is the hardware feature, which
dynamically controles the memory read priority of Display.

When IPC is enabled, plane read requests are sent at high priority until
filling above the transition watermark, then the requests are sent at
lower priority until dropping below the level 0 watermark.
The lower priority requests allow other memory clients to have better
memory access. When IPC is disabled, all plane read requests are sent at
high priority.

Changes since V1:
 - Remove commandline parameter to disable ipc
 - Address Paulo's comments
Changes since V2:
 - Address review comments
 - Set ipc_enabled flag
Changes since V3:
 - move ipc_enabled flag assignment inside intel_ipc_enable function

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c  |  2 +-
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 drivers/gpu/drm/i915/intel_drv.h |  1 +
 drivers/gpu/drm/i915/intel_pm.c  | 16 ++++++++++++++++
 4 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1b0a589..b83e84b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1243,7 +1243,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	intel_runtime_pm_enable(dev_priv);
 
-	dev_priv->ipc_enabled = false;
+	intel_enable_ipc(dev_priv);
 
 	/* Everything is in place, we can now relax! */
 	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index c70c07a..300418a 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6076,6 +6076,7 @@ enum {
 #define  DISP_FBC_WM_DIS		(1<<15)
 #define DISP_ARB_CTL2	_MMIO(0x45004)
 #define  DISP_DATA_PARTITION_5_6	(1<<6)
+#define  DISP_IPC_ENABLE		(1<<3)
 #define DBUF_CTL	_MMIO(0x45008)
 #define  DBUF_POWER_REQUEST		(1<<31)
 #define  DBUF_POWER_STATE		(1<<30)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index cd132c2..ad542a2 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1745,6 +1745,7 @@ bool skl_ddb_allocation_overlaps(const struct skl_ddb_entry **entries,
 uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
+void intel_enable_ipc(struct drm_i915_private *dev_priv);
 static inline int intel_enable_rc6(void)
 {
 	return i915.enable_rc6;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 33d22cc..df6d231 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4682,6 +4682,22 @@ void intel_update_watermarks(struct intel_crtc *crtc)
 		dev_priv->display.update_wm(crtc);
 }
 
+void intel_enable_ipc(struct drm_i915_private *dev_priv)
+{
+	u32 val;
+
+	dev_priv->ipc_enabled = false;
+	if (!(IS_BROXTON(dev_priv) || IS_KABYLAKE(dev_priv)))
+		return;
+
+	val = I915_READ(DISP_ARB_CTL2);
+
+	val |= DISP_IPC_ENABLE;
+
+	I915_WRITE(DISP_ARB_CTL2, val);
+	dev_priv->ipc_enabled = true;
+}
+
 /*
  * Lock protecting IPS related data structures
  */
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 5/8] drm/i915/skl+: change WM calc to fixed point 16.16
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (3 preceding siblings ...)
  2016-11-24  4:01 ` [PATCH v6 4/8] drm/i915/bxt: Enable IPC support Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function Mahesh Kumar
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch changes Watermak calculation to fixed point calculation.
Problem with current calculation is during plane_blocks_per_line
calculation we divide intermediate blocks with min_scanlines and
takes floor of the result because of integer operation.
hence we end-up assigning less blocks than required. Which leads to
flickers.

Changes since V1:
 - Add fixed point data type as per Paulo's review
Changes since V1:
 - use fixed_point instead of fp_16_16

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h | 84 +++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_pm.c | 69 +++++++++++++++++++--------------
 2 files changed, 124 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6691a4e..e332f04 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -119,6 +119,90 @@ bool __i915_inject_load_failure(const char *func, int line);
 #define i915_inject_load_failure() \
 	__i915_inject_load_failure(__func__, __LINE__)
 
+typedef struct {
+	uint32_t val;
+} uint_fixed_16_16_t;
+
+#define FP_16_16_MAX ({ \
+		uint_fixed_16_16_t fp; \
+		fp.val = UINT_MAX; \
+		fp; \
+})
+
+static inline uint_fixed_16_16_t u32_to_fixed_point(uint32_t val)
+{
+	uint_fixed_16_16_t fp;
+
+	WARN_ON(val >> 16);
+
+	fp.val = val << 16;
+	return fp;
+}
+
+static inline uint32_t fixed_point_to_u32_round_up(uint_fixed_16_16_t fp)
+{
+	return DIV_ROUND_UP(fp.val, 1 << 16);
+}
+
+static inline uint32_t fixed_point_to_u32(uint_fixed_16_16_t fp)
+{
+	return (fp.val / (1 << 16));
+}
+
+static inline uint_fixed_16_16_t min_fixed_point(uint_fixed_16_16_t min1,
+					uint_fixed_16_16_t min2)
+{
+	uint_fixed_16_16_t min;
+
+	min.val = min(min1.val, min2.val);
+	return min;
+}
+
+static inline uint_fixed_16_16_t max_fixed_point(uint_fixed_16_16_t max1,
+					uint_fixed_16_16_t max2)
+{
+	uint_fixed_16_16_t max;
+
+	max.val = max(max1.val, max2.val);
+	return max;
+}
+
+static inline uint_fixed_16_16_t fixed_point_div_round_up(uint32_t val,
+					uint32_t d)
+{
+	uint_fixed_16_16_t fp, res;
+
+	fp = u32_to_fixed_point(val);
+	res.val = DIV_ROUND_UP(fp.val, d);
+	return res;
+}
+
+static inline uint_fixed_16_16_t fixed_point_div_round_up_u64(uint32_t val,
+								uint32_t d)
+{
+	uint_fixed_16_16_t res;
+	uint64_t interm_val;
+
+	interm_val = (uint64_t)val << 16;
+	interm_val = DIV_ROUND_UP_ULL(interm_val, d);
+	WARN_ON(interm_val >> 32);
+	res.val = (uint32_t) interm_val;
+
+	return res;
+}
+
+static inline uint_fixed_16_16_t mul_fixed_point_u32(uint32_t val,
+						uint_fixed_16_16_t mul)
+{
+	uint64_t intermediate_val;
+	uint_fixed_16_16_t fp;
+
+	intermediate_val = (uint64_t) val * mul.val;
+	WARN_ON(intermediate_val >> 32);
+	fp.val = (uint32_t) intermediate_val;
+	return fp;
+}
+
 static inline const char *yesno(bool v)
 {
 	return v ? "yes" : "no";
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index df6d231..3e2dd8f 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3516,32 +3516,35 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
  * should allow pixel_rate up to ~2 GHz which seems sufficient since max
  * 2xcdclk is 1350 MHz and the pixel rate should never exceed that.
 */
-static uint32_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp, uint32_t latency)
+static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
+					 uint32_t latency)
 {
-	uint32_t wm_intermediate_val, ret;
+	uint32_t wm_intermediate_val;
+	uint_fixed_16_16_t ret;
 
 	if (latency == 0)
-		return UINT_MAX;
-
-	wm_intermediate_val = latency * pixel_rate * cpp / 512;
-	ret = DIV_ROUND_UP(wm_intermediate_val, 1000);
+		return FP_16_16_MAX;
 
+	wm_intermediate_val = latency * pixel_rate * cpp;
+	ret = fixed_point_div_round_up_u64(wm_intermediate_val, 1000 * 512);
 	return ret;
 }
 
-static uint32_t skl_wm_method2(uint32_t pixel_rate, uint32_t pipe_htotal,
-			       uint32_t latency, uint32_t plane_blocks_per_line)
+static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
+			uint32_t pipe_htotal,
+			uint32_t latency,
+			uint_fixed_16_16_t plane_blocks_per_line)
 {
-	uint32_t ret;
 	uint32_t wm_intermediate_val;
+	uint_fixed_16_16_t ret;
 
 	if (latency == 0)
-		return UINT_MAX;
+		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate;
-	ret = DIV_ROUND_UP(wm_intermediate_val, pipe_htotal * 1000) *
-				plane_blocks_per_line;
-
+	wm_intermediate_val = DIV_ROUND_UP(wm_intermediate_val,
+							pipe_htotal * 1000);
+	ret = mul_fixed_point_u32(wm_intermediate_val, plane_blocks_per_line);
 	return ret;
 }
 
@@ -3581,14 +3584,17 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	struct drm_plane_state *pstate = &intel_pstate->base;
 	struct drm_framebuffer *fb = pstate->fb;
 	uint32_t latency = dev_priv->wm.skl_latency[level];
-	uint32_t method1, method2;
-	uint32_t plane_bytes_per_line, plane_blocks_per_line;
+	uint_fixed_16_16_t method1, method2;
+	uint_fixed_16_16_t plane_blocks_per_line;
+	uint_fixed_16_16_t selected_result;
+	uint32_t interm_pbpl;
+	uint32_t plane_bytes_per_line;
 	uint32_t res_blocks, res_lines;
-	uint32_t selected_result;
 	uint8_t cpp;
 	uint32_t width = 0, height = 0;
 	uint32_t plane_pixel_rate;
-	uint32_t y_tile_minimum, y_min_scanlines;
+	uint_fixed_16_16_t y_tile_minimum;
+	uint32_t y_min_scanlines;
 	struct intel_atomic_state *state =
 		to_intel_atomic_state(cstate->base.state);
 	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
@@ -3647,14 +3653,16 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	plane_bytes_per_line = width * cpp;
 	if (y_tiled) {
+		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
+						y_min_scanlines, 512);
 		plane_blocks_per_line =
-		      DIV_ROUND_UP(plane_bytes_per_line * y_min_scanlines, 512);
-		plane_blocks_per_line /= y_min_scanlines;
+		      fixed_point_div_round_up(interm_pbpl, y_min_scanlines);
 	} else if (x_tiled) {
-		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512);
+		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
+		plane_blocks_per_line = u32_to_fixed_point(interm_pbpl);
 	} else {
-		plane_blocks_per_line = DIV_ROUND_UP(plane_bytes_per_line, 512)
-					+ 1;
+		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512) + 1;
+		plane_blocks_per_line = u32_to_fixed_point(interm_pbpl);
 	}
 
 	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
@@ -3663,26 +3671,29 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 				 latency,
 				 plane_blocks_per_line);
 
-	y_tile_minimum = plane_blocks_per_line * y_min_scanlines;
+	y_tile_minimum = mul_fixed_point_u32(y_min_scanlines,
+						plane_blocks_per_line);
 
 	if (y_tiled) {
-		selected_result = max(method2, y_tile_minimum);
+		selected_result = max_fixed_point(method2, y_tile_minimum);
 	} else {
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
-		else if ((ddb_allocation / plane_blocks_per_line) >= 1)
-			selected_result = min(method1, method2);
+		else if ((ddb_allocation /
+			fixed_point_to_u32_round_up(plane_blocks_per_line)) >= 1)
+			selected_result = min_fixed_point(method1, method2);
 		else
 			selected_result = method1;
 	}
 
-	res_blocks = selected_result + 1;
-	res_lines = DIV_ROUND_UP(selected_result, plane_blocks_per_line);
+	res_blocks = fixed_point_to_u32_round_up(selected_result) + 1;
+	res_lines = DIV_ROUND_UP(selected_result.val,
+					plane_blocks_per_line.val);
 
 	if (level >= 1 && level <= 7) {
 		if (y_tiled) {
-			res_blocks += y_tile_minimum;
+			res_blocks += fixed_point_to_u32_round_up(y_tile_minimum);
 			res_lines += y_min_scanlines;
 		} else {
 			res_blocks++;
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (4 preceding siblings ...)
  2016-11-24  4:01 ` [PATCH v6 5/8] drm/i915/skl+: change WM calc to fixed point 16.16 Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24  4:01 ` [PATCH v6 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch Adds a function to extract intel_crtc_state from the
atomic_state, if not available it returns NULL.

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/intel_drv.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index ad542a2..db75773 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1794,6 +1794,20 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
 	return to_intel_crtc_state(crtc_state);
 }
 
+static inline struct intel_crtc_state *
+intel_atomic_get_existing_crtc_state(struct drm_atomic_state *state,
+				      struct intel_crtc *crtc)
+{
+	struct drm_crtc_state *crtc_state;
+
+	crtc_state = drm_atomic_get_existing_crtc_state(state, &crtc->base);
+
+	if (crtc_state)
+		return to_intel_crtc_state(crtc_state);
+	else
+		return NULL;
+}
+
 static inline struct intel_plane_state *
 intel_atomic_get_existing_plane_state(struct drm_atomic_state *state,
 				      struct intel_plane *plane)
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 7/8] drm/i915: Decode system memory bandwidth
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (5 preceding siblings ...)
  2016-11-24  4:01 ` [PATCH v6 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24 10:10   ` kbuild test robot
  2016-11-24  4:01 ` [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
  2016-11-24  4:45 ` ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev2) Patchwork
  8 siblings, 1 reply; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch adds support to decode system memory bandwidth
which will be used for arbitrated display memory percentage
calculation in GEN9 based system.

Changes from v1:
 - Address comments from Paulo
 - implement decode function for SKL/KBL also
Changes from v2:
 - Rewrite the code as per HW team inputs
 - Addresses review comments

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 173 ++++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h |  12 +++
 drivers/gpu/drm/i915/i915_reg.h |  37 +++++++++
 3 files changed, 222 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b83e84b..ded1d05 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -976,6 +976,173 @@ static void intel_sanitize_options(struct drm_i915_private *dev_priv)
 	DRM_DEBUG_DRIVER("use GPU sempahores? %s\n", yesno(i915.semaphores));
 }
 
+static inline enum rank skl_memdev_get_channel_rank(uint32_t val)
+{
+	uint8_t l_rank, s_rank;
+	uint8_t l_size, s_size;
+	enum rank ch_rank = DRAM_RANK_SINGLE;
+
+	l_size = (val >> SKL_DRAM_SIZE_L_SHIFT) & SKL_DRAM_SIZE_MASK;
+	s_size = (val >> SKL_DRAM_SIZE_S_SHIFT) & SKL_DRAM_SIZE_MASK;
+	l_rank = (val >> SKL_DRAM_RANK_L_SHIFT) & SKL_DRAM_RANK_MASK;
+	s_rank = (val >> SKL_DRAM_RANK_S_SHIFT) & SKL_DRAM_RANK_MASK;
+
+	/*
+	 * If any of the slot has dual rank memory consider
+	 * dual rank memory channel
+	 */
+	if (l_rank == SKL_DRAM_RANK_DUAL || s_rank == SKL_DRAM_RANK_DUAL)
+		ch_rank = DRAM_RANK_DUAL;
+
+	/*
+	 * If both the slot has single rank memory then configuration
+	 * is dual rank memory
+	 */
+	if ((l_size && l_rank == SKL_DRAM_RANK_SINGLE) &&
+		(s_size && s_rank == SKL_DRAM_RANK_SINGLE))
+		ch_rank = DRAM_RANK_DUAL;
+	return ch_rank;
+}
+
+static int
+skl_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	uint32_t mem_freq_khz;
+	uint32_t val;
+	enum rank ch0_rank, ch1_rank;
+
+	val = I915_READ(SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
+	mem_freq_khz = (val & SKL_REQ_DATA_MASK) *
+				SKL_MEMORY_FREQ_MULTIPLIER_KHZ;
+
+	val = I915_READ(SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN);
+	if (val != 0x0) {
+		memdev_info->num_channels++;
+		ch0_rank = skl_memdev_get_channel_rank(val);
+	}
+
+	val = I915_READ(SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN);
+	if (val != 0x0) {
+		memdev_info->num_channels++;
+		ch1_rank = skl_memdev_get_channel_rank(val);
+	}
+
+	if (memdev_info->num_channels == 0) {
+		DRM_ERROR("Number of mem channels are zero\n");
+		return -EINVAL;
+	}
+
+	memdev_info->bandwidth_kbps = (memdev_info->num_channels *
+							mem_freq_khz * 8);
+
+	if (memdev_info->bandwidth_kbps == 0) {
+		DRM_ERROR("Couldn't get system memory bandwidth\n");
+		return -EINVAL;
+	}
+	memdev_info->valid = true;
+
+	/*
+	 * If any of channel is single rank channel,
+	 * consider single rank memory
+	 */
+	if (ch0_rank == DRAM_RANK_SINGLE || ch1_rank == DRAM_RANK_SINGLE)
+		memdev_info->rank = DRAM_RANK_SINGLE;
+	else
+		memdev_info->rank = max(ch0_rank, ch1_rank);
+
+	return 0;
+}
+
+static int
+bxt_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	uint32_t dram_channels;
+	uint32_t mem_freq_khz, val;
+	uint8_t num_active_channels;
+	int i;
+
+	val = I915_READ(BXT_P_CR_MC_BIOS_REQ_0_0_0);
+	mem_freq_khz = ((val & BXT_REQ_DATA_MASK) *
+				BXT_MEMORY_FREQ_MULTIPLIER_KHZ);
+
+	dram_channels = (val >> BXT_DRAM_CHANNEL_ACTIVE_SHIFT) &
+					BXT_DRAM_CHANNEL_ACTIVE_MASK;
+	num_active_channels = hweight32(dram_channels);
+
+	memdev_info->bandwidth_kbps = (mem_freq_khz * num_active_channels * 4);
+
+	if (memdev_info->bandwidth_kbps == 0) {
+		DRM_ERROR("Couldn't get system memory bandwidth\n");
+		return -EINVAL;
+	}
+	memdev_info->valid = true;
+
+	/*
+	 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
+	 */
+	for (i = 0; i < BXT_D_CR_DRP0_DUNIT_MAX; i++) {
+		val = I915_READ(BXT_D_CR_DRP0_DUNIT(i));
+		if (val != 0xFFFFFFFF) {
+			uint8_t rank;
+			enum rank ch_rank;
+
+			memdev_info->num_channels++;
+			rank = val & BXT_DRAM_RANK_MASK;
+			if (rank == BXT_DRAM_RANK_SINGLE)
+				ch_rank = DRAM_RANK_SINGLE;
+			else if (rank == BXT_DRAM_RANK_DUAL)
+				ch_rank = DRAM_RANK_DUAL;
+			else
+				ch_rank = DRAM_RANK_INVALID;
+
+			/*
+			 * If any of channel is having single rank memory
+			 * consider memory as single rank
+			 */
+			if (memdev_info->rank == DRAM_RANK_INVALID)
+				memdev_info->rank = ch_rank;
+			else if (ch_rank == DRAM_RANK_SINGLE)
+				memdev_info->rank = DRAM_RANK_SINGLE;
+		}
+	}
+	return 0;
+}
+
+static void
+intel_get_memdev_info(struct drm_i915_private *dev_priv)
+{
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	int ret;
+
+	memdev_info->valid = false;
+	memdev_info->rank = DRAM_RANK_INVALID;
+	memdev_info->num_channels = 0;
+
+	if (!IS_GEN9(dev_priv))
+		return;
+
+	if (IS_BROXTON(dev_priv))
+		ret = bxt_get_memdev_info(dev_priv);
+	else
+		ret = skl_get_memdev_info(dev_priv);
+	if (ret)
+		return;
+
+	DRM_DEBUG_DRIVER("DRAM bandwidth: %u KBps total-channels: %u\n",
+				memdev_info->bandwidth_kbps,
+				memdev_info->num_channels);
+	if (memdev_info->rank == DRAM_RANK_INVALID)
+		DRM_INFO("Counld not get memory rank info\n");
+	else {
+		DRM_DEBUG_DRIVER("DRAM rank: %s rank\n",
+				(memdev_info->rank == DRAM_RANK_DUAL) ?
+						"dual" : "single");
+	}
+}
+
+
 /**
  * i915_driver_init_hw - setup state requiring device access
  * @dev_priv: device private
@@ -1078,6 +1245,12 @@ static int i915_driver_init_hw(struct drm_i915_private *dev_priv)
 			DRM_DEBUG_DRIVER("can't enable MSI");
 	}
 
+	/*
+	 * Fill the memdev structure to get the system raw bandwidth
+	 * This will be used by WM algorithm, to implement GEN9 based WA
+	 */
+	intel_get_memdev_info(dev_priv);
+
 	return 0;
 
 out_ggtt:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e332f04..4e2f17f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2168,6 +2168,18 @@ struct drm_i915_private {
 		bool distrust_bios_wm;
 	} wm;
 
+	struct memdev_info {
+		bool valid;
+		uint32_t bandwidth_kbps;
+		uint8_t num_channels;
+		enum rank {
+			DRAM_RANK_INVALID = 0,
+			DRAM_RANK_SINGLE,
+			DRAM_RANK_DUAL
+		} rank;
+	} memdev_info;
+
+
 	struct i915_runtime_pm pm;
 
 	/* Abstract the submission mechanism (legacy ringbuffer or execlists) away */
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 300418a..9a06810 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7672,6 +7672,43 @@ enum {
 #define  DC_STATE_DEBUG_MASK_CORES	(1<<0)
 #define  DC_STATE_DEBUG_MASK_MEMORY_UP	(1<<1)
 
+#define BXT_P_CR_MC_BIOS_REQ_0_0_0	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
+#define BXT_REQ_DATA_MASK			0x3F
+#define BXT_DRAM_ACTIVE_CHANNEL_SHIFT		12
+#define BXT_DRAM_ACTIVE_CHANNEL_MASK		0xF
+/*
+ * BIOS programs this field of REQ_DATA [5:0] in integer
+ * multiple of 133333 KHz (133.33MHz)
+ */
+#define	BXT_MEMORY_FREQ_MULTIPLIER_KHZ		133333
+#define BXT_D_CR_DRP0_DUNIT8			0x1000
+#define BXT_D_CR_DRP0_DUNIT9			0x1200
+#define BXT_D_CR_DRP0_DUNIT_MAX			4
+#define _MMIO_MCHBAR_DUNIT(x, a, b) _MMIO(MCHBAR_MIRROR_BASE_SNB + (a) + (x)*((b)-(a)))
+#define BXT_D_CR_DRP0_DUNIT(x)	_MMIO_MCHBAR_DUNIT(x, BXT_D_CR_DRP0_DUNIT8, BXT_D_CR_DRP0_DUNIT9)
+#define BXT_DRAM_CHANNEL_ACTIVE_SHIFT		12
+#define BXT_DRAM_CHANNEL_ACTIVE_MASK		0xF
+#define BXT_DRAM_RANK_MASK			0x3
+#define BXT_DRAM_RANK_SINGLE			0x1
+#define BXT_DRAM_RANK_DUAL			0x3
+
+/*
+ * SKL memory frequeny multiplier is 266667 KHz (266.67 MHz)
+ */
+#define	SKL_MEMORY_FREQ_MULTIPLIER_KHZ		266667
+#define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5E04)
+#define SKL_REQ_DATA_MASK			(0xF << 0)
+#define SKL_MAD_DIMM_CH0_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x500C)
+#define SKL_MAD_DIMM_CH1_0_0_0_MCHBAR_MCMAIN	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5010)
+#define SKL_DRAM_SIZE_MASK			0x1F
+#define SKL_DRAM_SIZE_L_SHIFT			0
+#define SKL_DRAM_SIZE_S_SHIFT			16
+#define SKL_DRAM_RANK_MASK			0x1
+#define SKL_DRAM_RANK_L_SHIFT			10
+#define SKL_DRAM_RANK_S_SHIFT			26
+#define SKL_DRAM_RANK_SINGLE			0x0
+#define SKL_DRAM_RANK_DUAL			0x1
+
 /* Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register,
  * since on HSW we can't write to it using I915_WRITE. */
 #define D_COMP_HSW			_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5F0C)
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (6 preceding siblings ...)
  2016-11-24  4:01 ` [PATCH v6 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
@ 2016-11-24  4:01 ` Mahesh Kumar
  2016-11-24 12:51   ` Lankhorst, Maarten
  2016-11-24  4:45 ` ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev2) Patchwork
  8 siblings, 1 reply; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-24  4:01 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This patch implemnets Workariunds related to display arbitrated memory
bandwidth. These WA are applicabe for all gen-9 based platforms.

Changes since v1:
 - Rebase on top of Paulo's patch series
Changes since v2:
 - Address review comments
 - Rebase/rework as per other patch changes in series

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |   9 +++
 drivers/gpu/drm/i915/intel_pm.c | 149 +++++++++++++++++++++++++++++++++++++---
 2 files changed, 149 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4e2f17f..2b673c6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1193,6 +1193,13 @@ enum intel_sbi_destination {
 	SBI_MPHY,
 };
 
+/* SKL+ Watermark arbitrated display bandwidth Workarounds */
+enum watermark_memory_wa {
+	WATERMARK_WA_NONE,
+	WATERMARK_WA_X_TILED,
+	WATERMARK_WA_Y_TILED,
+};
+
 #define QUIRK_PIPEA_FORCE (1<<0)
 #define QUIRK_LVDS_SSC_DISABLE (1<<1)
 #define QUIRK_INVERT_BRIGHTNESS (1<<2)
@@ -1764,6 +1771,8 @@ struct skl_ddb_allocation {
 
 struct skl_wm_values {
 	unsigned dirty_pipes;
+	/* any WaterMark memory workaround Required */
+	enum watermark_memory_wa mem_wa;
 	struct skl_ddb_allocation ddb;
 };
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3e2dd8f..547bbda 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2889,11 +2889,7 @@ skl_wm_plane_id(const struct intel_plane *plane)
 	}
 }
 
-/*
- * FIXME: We still don't have the proper code detect if we need to apply the WA,
- * so assume we'll always need it in order to avoid underruns.
- */
-static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
+static bool intel_needs_memory_bw_wa(struct intel_atomic_state *state)
 {
 	struct drm_i915_private *dev_priv = to_i915(state->base.dev);
 
@@ -3067,7 +3063,7 @@ bool intel_can_enable_sagv(struct drm_atomic_state *state)
 
 		latency = dev_priv->wm.skl_latency[level];
 
-		if (skl_needs_memory_bw_wa(intel_state) &&
+		if (intel_needs_memory_bw_wa(intel_state) &&
 		    plane->base.state->fb->modifier ==
 		    I915_FORMAT_MOD_X_TILED)
 			latency += 15;
@@ -3597,7 +3593,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint32_t y_min_scanlines;
 	struct intel_atomic_state *state =
 		to_intel_atomic_state(cstate->base.state);
-	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
+	enum watermark_memory_wa mem_wa;
 	bool y_tiled, x_tiled;
 
 	if (latency == 0 || !cstate->base.active || !intel_pstate->base.visible) {
@@ -3613,7 +3609,8 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
 		latency += 4;
 
-	if (apply_memory_bw_wa && x_tiled)
+	mem_wa = state->wm_results.mem_wa;
+	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
 		latency += 15;
 
 	width = drm_rect_width(&intel_pstate->base.src) >> 16;
@@ -3648,7 +3645,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		y_min_scanlines = 4;
 	}
 
-	if (apply_memory_bw_wa)
+	if (mem_wa == WATERMARK_WA_Y_TILED)
 		y_min_scanlines *= 2;
 
 	plane_bytes_per_line = width * cpp;
@@ -4077,6 +4074,15 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	}
 
 	/*
+	 * If Watermark workaround is changed we need to recalculate
+	 * watermark values for all active pipes
+	 */
+	if (intel_state->wm_results.mem_wa != dev_priv->wm.skl_hw.mem_wa) {
+		realloc_pipes = ~0;
+		intel_state->wm_results.dirty_pipes = ~0;
+	}
+
+	/*
 	 * We're not recomputing for the pipes not included in the commit, so
 	 * make sure we start with the current state.
 	 */
@@ -4102,6 +4108,129 @@ skl_compute_ddb(struct drm_atomic_state *state)
 }
 
 static void
+skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
+{
+	struct drm_device *dev = state->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_crtc *intel_crtc;
+	struct intel_plane_state *intel_pstate;
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct memdev_info *memdev_info = &dev_priv->memdev_info;
+	int num_active_pipes;
+	uint32_t max_pipe_bw_kbps, total_pipe_bw_kbps;
+	int display_bw_percentage;
+	bool y_tile_enabled = false;
+
+	if (!intel_needs_memory_bw_wa(intel_state)) {
+		intel_state->wm_results.mem_wa = WATERMARK_WA_NONE;
+		return;
+	}
+
+	if (!memdev_info->valid)
+		goto exit;
+
+	max_pipe_bw_kbps = 0;
+	num_active_pipes = 0;
+	for_each_intel_crtc(dev, intel_crtc) {
+		struct intel_crtc_state *cstate;
+		struct intel_plane *plane;
+		int num_active_planes;
+		uint32_t  max_plane_bw_kbps, pipe_bw_kbps;
+
+		/*
+		 * If CRTC is part of current atomic commit, get crtc state from
+		 * existing CRTC state. else take the cached CRTC state
+		 */
+		cstate = NULL;
+		if (state)
+			cstate = intel_atomic_get_existing_crtc_state(state,
+					intel_crtc);
+		if (!cstate)
+			cstate = to_intel_crtc_state(intel_crtc->base.state);
+
+		if (!cstate->base.active)
+			continue;
+
+		num_active_planes = 0;
+		max_plane_bw_kbps = 0;
+		for_each_intel_plane_mask(dev, plane, cstate->base.plane_mask) {
+			struct drm_framebuffer *fb;
+			uint32_t plane_bw_kbps;
+			uint32_t id = skl_wm_plane_id(plane);
+
+			intel_pstate = NULL;
+			intel_pstate =
+				intel_atomic_get_existing_plane_state(state,
+									plane);
+			if (!intel_pstate)
+				intel_pstate =
+					to_intel_plane_state(plane->base.state);
+
+			if (!intel_pstate->base.visible)
+				continue;
+
+			if (WARN_ON(!intel_pstate->base.fb))
+				continue;
+
+			if (id == PLANE_CURSOR)
+				continue;
+
+			fb = intel_pstate->base.fb;
+			if ((fb->modifier == I915_FORMAT_MOD_Y_TILED ||
+				fb->modifier == I915_FORMAT_MOD_Yf_TILED))
+				y_tile_enabled = true;
+
+			plane_bw_kbps = skl_adjusted_plane_pixel_rate(cstate,
+								intel_pstate);
+			max_plane_bw_kbps = max(plane_bw_kbps,
+							max_plane_bw_kbps);
+			num_active_planes++;
+		}
+		pipe_bw_kbps = max_plane_bw_kbps * num_active_planes;
+		max_pipe_bw_kbps = max(pipe_bw_kbps, max_pipe_bw_kbps);
+		num_active_pipes++;
+	}
+
+	total_pipe_bw_kbps = max_pipe_bw_kbps * num_active_pipes;
+
+	display_bw_percentage = DIV_ROUND_UP_ULL(total_pipe_bw_kbps * 100,
+						memdev_info->bandwidth_kbps);
+
+	/*
+	 * If there is any Ytile plane enabled and arbitrated display
+	 * bandwidth > 20% of raw system memory bandwidth
+	 * Enale Y-tile related WA
+	 *
+	 * If memory is dual channel single rank, Xtile limit = 35%, else Xtile
+	 * limit = 60%
+	 * If there is no Ytile plane enabled and
+	 * arbitrated display bandwidth > Xtile limit
+	 * Enable X-tile realated WA
+	 */
+	if (y_tile_enabled && (display_bw_percentage > 20))
+		intel_state->wm_results.mem_wa = WATERMARK_WA_Y_TILED;
+	else {
+		int x_tile_percentage = 60;
+		enum rank rank = DRAM_RANK_SINGLE;
+
+		if (memdev_info->rank != DRAM_RANK_INVALID)
+			rank = memdev_info->rank;
+
+		if ((rank == DRAM_RANK_SINGLE) &&
+					(memdev_info->num_channels == 2))
+			x_tile_percentage = 35;
+
+		if (display_bw_percentage > x_tile_percentage)
+			intel_state->wm_results.mem_wa = WATERMARK_WA_X_TILED;
+	}
+	return;
+
+exit:
+	intel_state->wm_results.mem_wa = WATERMARK_WA_Y_TILED;
+}
+
+
+static void
 skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 		     struct skl_wm_values *src,
 		     enum pipe pipe)
@@ -4177,6 +4306,8 @@ skl_compute_wm(struct drm_atomic_state *state)
 	/* Clear all dirty flags */
 	results->dirty_pipes = 0;
 
+	skl_compute_memory_bandwidth_wm_wa(state);
+
 	ret = skl_compute_ddb(state);
 	if (ret)
 		return ret;
-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev2)
  2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
                   ` (7 preceding siblings ...)
  2016-11-24  4:01 ` [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
@ 2016-11-24  4:45 ` Patchwork
  8 siblings, 0 replies; 20+ messages in thread
From: Patchwork @ 2016-11-24  4:45 UTC (permalink / raw)
  To: Kumar, Mahesh; +Cc: intel-gfx

== Series Details ==

Series: GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev2)
URL   : https://patchwork.freedesktop.org/series/15562/
State : success

== Summary ==

Series 15562v2 GEN-9 Arbitrated Bandwidth WM WA's & IPC
https://patchwork.freedesktop.org/api/1.0/series/15562/revisions/2/mbox/


fi-bdw-5557u     total:244  pass:229  dwarn:0   dfail:0   fail:0   skip:15 
fi-bsw-n3050     total:244  pass:204  dwarn:0   dfail:0   fail:0   skip:40 
fi-byt-j1900     total:244  pass:216  dwarn:0   dfail:0   fail:0   skip:28 
fi-byt-n2820     total:244  pass:212  dwarn:0   dfail:0   fail:0   skip:32 
fi-hsw-4770      total:244  pass:224  dwarn:0   dfail:0   fail:0   skip:20 
fi-hsw-4770r     total:244  pass:224  dwarn:0   dfail:0   fail:0   skip:20 
fi-ilk-650       total:244  pass:191  dwarn:0   dfail:0   fail:0   skip:53 
fi-ivb-3520m     total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-ivb-3770      total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-kbl-7200u     total:244  pass:222  dwarn:0   dfail:0   fail:0   skip:22 
fi-skl-6260u     total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
fi-skl-6700hq    total:244  pass:223  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6700k     total:244  pass:222  dwarn:1   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:244  pass:230  dwarn:0   dfail:0   fail:0   skip:14 
fi-snb-2520m     total:244  pass:212  dwarn:0   dfail:0   fail:0   skip:32 
fi-snb-2600      total:244  pass:211  dwarn:0   dfail:0   fail:0   skip:33 

a08f9fc55a649334a2788a804904d920f8b2fda2 drm-intel-nightly: 2016y-11m-23d-20h-25m-10s UTC integration manifest
7ee0d75 drm/i915: Add intel_atomic_get_existing_crtc_state function
8472737 drm/i915/skl+: change WM calc to fixed point 16.16
bc8dd25 drm/i915/bxt: Enable IPC support
4a59efc drm/i915/kbl: IPC workaround for kabylake
e3fe641 drm/i915/bxt: IPC WA for Broxton
667f684 drm/i915/skl: Add variables to check x_tile and y_tile

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3101/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 7/8] drm/i915: Decode system memory bandwidth
  2016-11-24  4:01 ` [PATCH v6 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
@ 2016-11-24 10:10   ` kbuild test robot
  0 siblings, 0 replies; 20+ messages in thread
From: kbuild test robot @ 2016-11-24 10:10 UTC (permalink / raw)
  To: Mahesh Kumar; +Cc: intel-gfx, kbuild-all, maarten.lankhorst, paulo.r.zanoni

[-- Attachment #1: Type: text/plain, Size: 2511 bytes --]

Hi Mahesh,

[auto build test WARNING on next-20161123]
[cannot apply to drm-intel/for-linux-next v4.9-rc6 v4.9-rc5 v4.9-rc4 v4.9-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Mahesh-Kumar/GEN-9-Arbitrated-Bandwidth-WM-WA-s-IPC/20161124-172826
config: x86_64-randconfig-x012-201647 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_drv.c: In function 'i915_driver_load':
>> drivers/gpu/drm/i915/i915_drv.c:1049:47: warning: 'ch1_rank' may be used uninitialized in this function [-Wmaybe-uninitialized]
     if (ch0_rank == DRAM_RANK_SINGLE || ch1_rank == DRAM_RANK_SINGLE)
                                         ~~~~~~~~~^~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.c:1013:22: note: 'ch1_rank' was declared here
     enum rank ch0_rank, ch1_rank;
                         ^~~~~~~~
>> drivers/gpu/drm/i915/i915_drv.c:1049:15: warning: 'ch0_rank' may be used uninitialized in this function [-Wmaybe-uninitialized]
     if (ch0_rank == DRAM_RANK_SINGLE || ch1_rank == DRAM_RANK_SINGLE)
         ~~~~~~~~~^~~~~~~~~~~~~~~~~~~
   drivers/gpu/drm/i915/i915_drv.c:1013:12: note: 'ch0_rank' was declared here
     enum rank ch0_rank, ch1_rank;
               ^~~~~~~~

vim +/ch1_rank +1049 drivers/gpu/drm/i915/i915_drv.c

  1033			return -EINVAL;
  1034		}
  1035	
  1036		memdev_info->bandwidth_kbps = (memdev_info->num_channels *
  1037								mem_freq_khz * 8);
  1038	
  1039		if (memdev_info->bandwidth_kbps == 0) {
  1040			DRM_ERROR("Couldn't get system memory bandwidth\n");
  1041			return -EINVAL;
  1042		}
  1043		memdev_info->valid = true;
  1044	
  1045		/*
  1046		 * If any of channel is single rank channel,
  1047		 * consider single rank memory
  1048		 */
> 1049		if (ch0_rank == DRAM_RANK_SINGLE || ch1_rank == DRAM_RANK_SINGLE)
  1050			memdev_info->rank = DRAM_RANK_SINGLE;
  1051		else
  1052			memdev_info->rank = max(ch0_rank, ch1_rank);
  1053	
  1054		return 0;
  1055	}
  1056	
  1057	static int

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30897 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-11-24  4:01 ` [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
@ 2016-11-24 12:51   ` Lankhorst, Maarten
  2016-11-29  5:42     ` Mahesh Kumar
  0 siblings, 1 reply; 20+ messages in thread
From: Lankhorst, Maarten @ 2016-11-24 12:51 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Zanoni, Paulo R

Mahesh Kumar schreef op do 24-11-2016 om 09:31 [+0530]:
> This patch implemnets Workariunds related to display arbitrated
> memory
> bandwidth. These WA are applicabe for all gen-9 based platforms.
> 
> Changes since v1:
>  - Rebase on top of Paulo's patch series
> Changes since v2:
>  - Address review comments
>  - Rebase/rework as per other patch changes in series
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |   9 +++
>  drivers/gpu/drm/i915/intel_pm.c | 149
> +++++++++++++++++++++++++++++++++++++---
>  2 files changed, 149 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index 4e2f17f..2b673c6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1193,6 +1193,13 @@ enum intel_sbi_destination {
>  	SBI_MPHY,
>  };
>  
> +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
> +enum watermark_memory_wa {
> +	WATERMARK_WA_NONE,
> +	WATERMARK_WA_X_TILED,
> +	WATERMARK_WA_Y_TILED,
> +};
> +
>  #define QUIRK_PIPEA_FORCE (1<<0)
>  #define QUIRK_LVDS_SSC_DISABLE (1<<1)
>  #define QUIRK_INVERT_BRIGHTNESS (1<<2)
> @@ -1764,6 +1771,8 @@ struct skl_ddb_allocation {
>  
>  struct skl_wm_values {
>  	unsigned dirty_pipes;
> +	/* any WaterMark memory workaround Required */
> +	enum watermark_memory_wa mem_wa;
>  	struct skl_ddb_allocation ddb;
>  };
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 3e2dd8f..547bbda 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -2889,11 +2889,7 @@ skl_wm_plane_id(const struct intel_plane
> *plane)
>  	}
>  }
>  
> -/*
> - * FIXME: We still don't have the proper code detect if we need to
> apply the WA,
> - * so assume we'll always need it in order to avoid underruns.
> - */
> -static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
> +static bool intel_needs_memory_bw_wa(struct intel_atomic_state
> *state)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(state-
> >base.dev);
>  
> @@ -3067,7 +3063,7 @@ bool intel_can_enable_sagv(struct
> drm_atomic_state *state)
>  
>  		latency = dev_priv->wm.skl_latency[level];
>  
> -		if (skl_needs_memory_bw_wa(intel_state) &&
> +		if (intel_needs_memory_bw_wa(intel_state) &&
>  		    plane->base.state->fb->modifier ==
>  		    I915_FORMAT_MOD_X_TILED)
>  			latency += 15;
> @@ -3597,7 +3593,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	uint32_t y_min_scanlines;
>  	struct intel_atomic_state *state =
>  		to_intel_atomic_state(cstate->base.state);
> -	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
> +	enum watermark_memory_wa mem_wa;
>  	bool y_tiled, x_tiled;
>  
>  	if (latency == 0 || !cstate->base.active || !intel_pstate-
> >base.visible) {
> @@ -3613,7 +3609,8 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
>  		latency += 4;
>  
> -	if (apply_memory_bw_wa && x_tiled)
> +	mem_wa = state->wm_results.mem_wa;
> +	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
>  		latency += 15;
>  
>  	width = drm_rect_width(&intel_pstate->base.src) >> 16;
> @@ -3648,7 +3645,7 @@ static int skl_compute_plane_wm(const struct
> drm_i915_private *dev_priv,
>  		y_min_scanlines = 4;
>  	}
>  
> -	if (apply_memory_bw_wa)
> +	if (mem_wa == WATERMARK_WA_Y_TILED)
>  		y_min_scanlines *= 2;
>  
>  	plane_bytes_per_line = width * cpp;
> @@ -4077,6 +4074,15 @@ skl_compute_ddb(struct drm_atomic_state
> *state)
>  	}
>  
>  	/*
> +	 * If Watermark workaround is changed we need to recalculate
> +	 * watermark values for all active pipes
> +	 */
> +	if (intel_state->wm_results.mem_wa != dev_priv-
> >wm.skl_hw.mem_wa) {
> +		realloc_pipes = ~0;
> +		intel_state->wm_results.dirty_pipes = ~0;
> +	}
> +
> +	/*
>  	 * We're not recomputing for the pipes not included in the
> commit, so
>  	 * make sure we start with the current state.
>  	 */
> @@ -4102,6 +4108,129 @@ skl_compute_ddb(struct drm_atomic_state
> *state)
>  }
>  
>  static void
> +skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
> +{
> +	struct drm_device *dev = state->dev;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
> +	struct intel_crtc *intel_crtc;
> +	struct intel_plane_state *intel_pstate;
> +	struct intel_atomic_state *intel_state =
> to_intel_atomic_state(state);
> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
> +	int num_active_pipes;
> +	uint32_t max_pipe_bw_kbps, total_pipe_bw_kbps;
> +	int display_bw_percentage;
> +	bool y_tile_enabled = false;
> +
> +	if (!intel_needs_memory_bw_wa(intel_state)) {
> +		intel_state->wm_results.mem_wa = WATERMARK_WA_NONE;
> +		return;
> +	}
> +
> +	if (!memdev_info->valid)
> +		goto exit;
> +
> +	max_pipe_bw_kbps = 0;
> +	num_active_pipes = 0;
> +	for_each_intel_crtc(dev, intel_crtc) {
> +		struct intel_crtc_state *cstate;
> +		struct intel_plane *plane;
> +		int num_active_planes;
> +		uint32_t  max_plane_bw_kbps, pipe_bw_kbps;
> +
> +		/*
> +		 * If CRTC is part of current atomic commit, get
> crtc state from
> +		 * existing CRTC state. else take the cached CRTC
> state
> +		 */
> +		cstate = NULL;
> +		if (state)
> +			cstate =
> intel_atomic_get_existing_crtc_state(state,
> +					intel_crtc);
> +		if (!cstate)
> +			cstate = to_intel_crtc_state(intel_crtc-
> >base.state);
This is really really not allowed. If you want to access
crtc->state, you need to get the state.

Since a w/a forces a full recalculation, best you can do is put the per
pipe status in crtc_state->wm.skl as well. If it changes you could add
all other pipes.

> +		if (!cstate->base.active)
> +			continue;
> +
> +		num_active_planes = 0;
> +		max_plane_bw_kbps = 0;
> +		for_each_intel_plane_mask(dev, plane, cstate-
> >base.plane_mask) {
This is very much not allowed either!

If you want to do this safely and you hold crtc_state, use
drm_atomic_crtc_state_for_each_plane_state.
> +			struct drm_framebuffer *fb;
> +			uint32_t plane_bw_kbps;
> +			uint32_t id = skl_wm_plane_id(plane);
> +
> +			intel_pstate = NULL;
> +			intel_pstate =
> +				intel_atomic_get_existing_plane_stat
> e(state,
> +									
> plane);
> +			if (!intel_pstate)
> +				intel_pstate =
> +					to_intel_plane_state(plane-
> >base.state);
> +
> +			if (!intel_pstate->base.visible)
> +				continue;
> +
> +			if (WARN_ON(!intel_pstate->base.fb))
> +				continue;
> +
> +			if (id == PLANE_CURSOR)
> +				continue;
> +
> +			fb = intel_pstate->base.fb;
> +			if ((fb->modifier == I915_FORMAT_MOD_Y_TILED
> ||
> +				fb->modifier ==
> I915_FORMAT_MOD_Yf_TILED))
> +				y_tile_enabled = true;
> +
> +			plane_bw_kbps =
> skl_adjusted_plane_pixel_rate(cstate,
> +								inte
> l_pstate);
> +			max_plane_bw_kbps = max(plane_bw_kbps,
> +							max_plane_bw
> _kbps);
> +			num_active_planes++;
> +		}
> +		pipe_bw_kbps = max_plane_bw_kbps *
> num_active_planes;
> +		max_pipe_bw_kbps = max(pipe_bw_kbps,
> max_pipe_bw_kbps);
> +		num_active_pipes++;
> +	}
> +
> +	total_pipe_bw_kbps = max_pipe_bw_kbps * num_active_pipes;
> +
> +	display_bw_percentage = DIV_ROUND_UP_ULL(total_pipe_bw_kbps
> * 100,
> +						memdev_info-
> >bandwidth_kbps);
> +
> +	/*
> +	 * If there is any Ytile plane enabled and arbitrated
> display
> +	 * bandwidth > 20% of raw system memory bandwidth
> +	 * Enale Y-tile related WA
> +	 *
> +	 * If memory is dual channel single rank, Xtile limit = 35%,
> else Xtile
> +	 * limit = 60%
> +	 * If there is no Ytile plane enabled and
> +	 * arbitrated display bandwidth > Xtile limit
> +	 * Enable X-tile realated WA
> +	 */
> +	if (y_tile_enabled && (display_bw_percentage > 20))
> +		intel_state->wm_results.mem_wa =
> WATERMARK_WA_Y_TILED;
> +	else {
> +		int x_tile_percentage = 60;
> +		enum rank rank = DRAM_RANK_SINGLE;
> +
> +		if (memdev_info->rank != DRAM_RANK_INVALID)
> +			rank = memdev_info->rank;
> +
> +		if ((rank == DRAM_RANK_SINGLE) &&
> +					(memdev_info->num_channels
> == 2))
> +			x_tile_percentage = 35;
> +
> +		if (display_bw_percentage > x_tile_percentage)
> +			intel_state->wm_results.mem_wa =
> WATERMARK_WA_X_TILED;
> +	}
> +	return;
> +
> +exit:
> +	intel_state->wm_results.mem_wa = WATERMARK_WA_Y_TILED;
> +}
> +
> +
> +static void
>  skl_copy_wm_for_pipe(struct skl_wm_values *dst,
>  		     struct skl_wm_values *src,
>  		     enum pipe pipe)
> @@ -4177,6 +4306,8 @@ skl_compute_wm(struct drm_atomic_state *state)
>  	/* Clear all dirty flags */
>  	results->dirty_pipes = 0;
>  
> +	skl_compute_memory_bandwidth_wm_wa(state);
> +
>  	ret = skl_compute_ddb(state);
>  	if (ret)
>  		return ret;
---------------------------------------------------------------------
Intel International B.V.
Registered in The Netherlands under number 34098535
Statutory seat: Haarlemmermeer
Registered address: Capronilaan 37, 1119NG Schiphol-Rijk

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton
  2016-11-24  4:01 ` [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton Mahesh Kumar
@ 2016-11-24 12:51   ` Lankhorst, Maarten
  2016-11-28 13:07     ` Mahesh Kumar
  0 siblings, 1 reply; 20+ messages in thread
From: Lankhorst, Maarten @ 2016-11-24 12:51 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Zanoni, Paulo R

Mahesh Kumar schreef op do 24-11-2016 om 09:31 [+0530]:
> If IPC is enabled in BXT, display underruns are observed.
> WA: The Line Time programmed in the WM_LINETIME register should be
> half of the actual calculated Line Time.
> 
> Programmed Line Time = 1/2*Calculated Line Time
> 
> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c |  2 ++
>  drivers/gpu/drm/i915/i915_drv.h |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c | 11 +++++++++--
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c
> b/drivers/gpu/drm/i915/i915_drv.c
> index 445fec9..1b0a589 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1243,6 +1243,8 @@ int i915_driver_load(struct pci_dev *pdev,
> const struct pci_device_id *ent)
>  
>  	intel_runtime_pm_enable(dev_priv);
>  
> +	dev_priv->ipc_enabled = false;
> +
>  	/* Everything is in place, we can now relax! */
>  	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
>  		 driver.name, driver.major, driver.minor,
> driver.patchlevel,
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h
> index 63c0ea0..6691a4e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2128,6 +2128,8 @@ struct drm_i915_private {
>  	/* perform PHY state sanity checks? */
>  	bool chv_phy_assert[2];
>  
> +	bool ipc_enabled;
> +
>  	/* Used to save the pipe-to-encoder mapping for audio */
>  	struct intel_encoder *av_enc_map[I915_MAX_PIPES];
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c
> b/drivers/gpu/drm/i915/intel_pm.c
> index 8908736..7090a7c 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3769,7 +3769,10 @@ skl_compute_wm_level(const struct
> drm_i915_private *dev_priv,
>  static uint32_t
>  skl_compute_linetime_wm(struct intel_crtc_state *cstate)
>  {
> +	struct drm_atomic_state *state = cstate->base.state;
> +	struct drm_i915_private *dev_priv = to_i915(state->dev);
>  	uint32_t pixel_rate;
> +	uint32_t linetime_wm;
>  
>  	if (!cstate->base.active)
>  		return 0;
> @@ -3779,8 +3782,12 @@ skl_compute_linetime_wm(struct
> intel_crtc_state *cstate)
>  	if (WARN_ON(pixel_rate == 0))
>  		return 0;
>  
> -	return DIV_ROUND_UP(8 * cstate-
> >base.adjusted_mode.crtc_htotal * 1000,
> -			    pixel_rate);
> +	linetime_wm = DIV_ROUND_UP(8 * cstate-
> >base.adjusted_mode.crtc_htotal *
> +						1000, pixel_rate);
> +	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
> +		linetime_wm = DIV_ROUND_UP(linetime_wm, 2);
I've asked on irc to be sure, but this needs slightly more info in the
code as well. Same for 3/8 and 8/8.

12:47 < mlankhorst> danvet: if we add hw workarounds that don't have a
name, do we still have to write comments about them in the code?
12:48 < danvet> at least a bspec reference
12:48 < danvet> some hint to make it possible to find where it's from
12:49 < danvet> e.g. mesa has traditionally just pasted the entire
bspec paragraph
12:49 < mlankhorst> ok
12:49 < danvet> we've done that too in the past, but then the wa db
mostly made that redundant
12:49 < danvet> but if there's no wa entry, then full deal
12:49 < danvet> also, maybe a bspec bug notice to pls add one
12:49 < danvet> you should be able to do that over the web, and afaik
there should be one for every w/a
---------------------------------------------------------------------
Intel International B.V.
Registered in The Netherlands under number 34098535
Statutory seat: Haarlemmermeer
Registered address: Capronilaan 37, 1119NG Schiphol-Rijk

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton
  2016-11-24 12:51   ` Lankhorst, Maarten
@ 2016-11-28 13:07     ` Mahesh Kumar
  2016-12-01 11:02       ` Lankhorst, Maarten
  0 siblings, 1 reply; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-28 13:07 UTC (permalink / raw)
  To: Lankhorst, Maarten, intel-gfx; +Cc: Zanoni, Paulo R

Hi,

Will keep WA number in commit message/WA location.
thanks,

Regards,

-Mahesh

On Thursday 24 November 2016 06:21 PM, Lankhorst, Maarten wrote:
> Mahesh Kumar schreef op do 24-11-2016 om 09:31 [+0530]:
>> If IPC is enabled in BXT, display underruns are observed.
>> WA: The Line Time programmed in the WM_LINETIME register should be
>> half of the actual calculated Line Time.
>>
>> Programmed Line Time = 1/2*Calculated Line Time
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c |  2 ++
>>   drivers/gpu/drm/i915/i915_drv.h |  2 ++
>>   drivers/gpu/drm/i915/intel_pm.c | 11 +++++++++--
>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c
>> b/drivers/gpu/drm/i915/i915_drv.c
>> index 445fec9..1b0a589 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -1243,6 +1243,8 @@ int i915_driver_load(struct pci_dev *pdev,
>> const struct pci_device_id *ent)
>>   
>>   	intel_runtime_pm_enable(dev_priv);
>>   
>> +	dev_priv->ipc_enabled = false;
>> +
>>   	/* Everything is in place, we can now relax! */
>>   	DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
>>   		 driver.name, driver.major, driver.minor,
>> driver.patchlevel,
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 63c0ea0..6691a4e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2128,6 +2128,8 @@ struct drm_i915_private {
>>   	/* perform PHY state sanity checks? */
>>   	bool chv_phy_assert[2];
>>   
>> +	bool ipc_enabled;
>> +
>>   	/* Used to save the pipe-to-encoder mapping for audio */
>>   	struct intel_encoder *av_enc_map[I915_MAX_PIPES];
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>> b/drivers/gpu/drm/i915/intel_pm.c
>> index 8908736..7090a7c 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -3769,7 +3769,10 @@ skl_compute_wm_level(const struct
>> drm_i915_private *dev_priv,
>>   static uint32_t
>>   skl_compute_linetime_wm(struct intel_crtc_state *cstate)
>>   {
>> +	struct drm_atomic_state *state = cstate->base.state;
>> +	struct drm_i915_private *dev_priv = to_i915(state->dev);
>>   	uint32_t pixel_rate;
>> +	uint32_t linetime_wm;
>>   
>>   	if (!cstate->base.active)
>>   		return 0;
>> @@ -3779,8 +3782,12 @@ skl_compute_linetime_wm(struct
>> intel_crtc_state *cstate)
>>   	if (WARN_ON(pixel_rate == 0))
>>   		return 0;
>>   
>> -	return DIV_ROUND_UP(8 * cstate-
>>> base.adjusted_mode.crtc_htotal * 1000,
>> -			    pixel_rate);
>> +	linetime_wm = DIV_ROUND_UP(8 * cstate-
>>> base.adjusted_mode.crtc_htotal *
>> +						1000, pixel_rate);
>> +	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
>> +		linetime_wm = DIV_ROUND_UP(linetime_wm, 2);
> I've asked on irc to be sure, but this needs slightly more info in the
> code as well. Same for 3/8 and 8/8.
>
> 12:47 < mlankhorst> danvet: if we add hw workarounds that don't have a
> name, do we still have to write comments about them in the code?
> 12:48 < danvet> at least a bspec reference
> 12:48 < danvet> some hint to make it possible to find where it's from
> 12:49 < danvet> e.g. mesa has traditionally just pasted the entire
> bspec paragraph
> 12:49 < mlankhorst> ok
> 12:49 < danvet> we've done that too in the past, but then the wa db
> mostly made that redundant
> 12:49 < danvet> but if there's no wa entry, then full deal
> 12:49 < danvet> also, maybe a bspec bug notice to pls add one
> 12:49 < danvet> you should be able to do that over the web, and afaik
> there should be one for every w/a

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-11-24 12:51   ` Lankhorst, Maarten
@ 2016-11-29  5:42     ` Mahesh Kumar
  2016-11-29  9:46       ` Lankhorst, Maarten
  0 siblings, 1 reply; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-29  5:42 UTC (permalink / raw)
  To: Lankhorst, Maarten, intel-gfx; +Cc: Zanoni, Paulo R

Hi,


On Thursday 24 November 2016 06:21 PM, Lankhorst, Maarten wrote:
> Mahesh Kumar schreef op do 24-11-2016 om 09:31 [+0530]:
>> This patch implemnets Workariunds related to display arbitrated
>> memory
>> bandwidth. These WA are applicabe for all gen-9 based platforms.
>>
>> Changes since v1:
>>   - Rebase on top of Paulo's patch series
>> Changes since v2:
>>   - Address review comments
>>   - Rebase/rework as per other patch changes in series
>>
>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h |   9 +++
>>   drivers/gpu/drm/i915/intel_pm.c | 149
>> +++++++++++++++++++++++++++++++++++++---
>>   2 files changed, 149 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index 4e2f17f..2b673c6 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1193,6 +1193,13 @@ enum intel_sbi_destination {
>>   	SBI_MPHY,
>>   };
>>   
>> +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
>> +enum watermark_memory_wa {
>> +	WATERMARK_WA_NONE,
>> +	WATERMARK_WA_X_TILED,
>> +	WATERMARK_WA_Y_TILED,
>> +};
>> +
>>   #define QUIRK_PIPEA_FORCE (1<<0)
>>   #define QUIRK_LVDS_SSC_DISABLE (1<<1)
>>   #define QUIRK_INVERT_BRIGHTNESS (1<<2)
>> @@ -1764,6 +1771,8 @@ struct skl_ddb_allocation {
>>   
>>   struct skl_wm_values {
>>   	unsigned dirty_pipes;
>> +	/* any WaterMark memory workaround Required */
>> +	enum watermark_memory_wa mem_wa;
>>   	struct skl_ddb_allocation ddb;
>>   };
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>> b/drivers/gpu/drm/i915/intel_pm.c
>> index 3e2dd8f..547bbda 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -2889,11 +2889,7 @@ skl_wm_plane_id(const struct intel_plane
>> *plane)
>>   	}
>>   }
>>   
>> -/*
>> - * FIXME: We still don't have the proper code detect if we need to
>> apply the WA,
>> - * so assume we'll always need it in order to avoid underruns.
>> - */
>> -static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
>> +static bool intel_needs_memory_bw_wa(struct intel_atomic_state
>> *state)
>>   {
>>   	struct drm_i915_private *dev_priv = to_i915(state-
>>> base.dev);
>>   
>> @@ -3067,7 +3063,7 @@ bool intel_can_enable_sagv(struct
>> drm_atomic_state *state)
>>   
>>   		latency = dev_priv->wm.skl_latency[level];
>>   
>> -		if (skl_needs_memory_bw_wa(intel_state) &&
>> +		if (intel_needs_memory_bw_wa(intel_state) &&
>>   		    plane->base.state->fb->modifier ==
>>   		    I915_FORMAT_MOD_X_TILED)
>>   			latency += 15;
>> @@ -3597,7 +3593,7 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   	uint32_t y_min_scanlines;
>>   	struct intel_atomic_state *state =
>>   		to_intel_atomic_state(cstate->base.state);
>> -	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
>> +	enum watermark_memory_wa mem_wa;
>>   	bool y_tiled, x_tiled;
>>   
>>   	if (latency == 0 || !cstate->base.active || !intel_pstate-
>>> base.visible) {
>> @@ -3613,7 +3609,8 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
>>   		latency += 4;
>>   
>> -	if (apply_memory_bw_wa && x_tiled)
>> +	mem_wa = state->wm_results.mem_wa;
>> +	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
>>   		latency += 15;
>>   
>>   	width = drm_rect_width(&intel_pstate->base.src) >> 16;
>> @@ -3648,7 +3645,7 @@ static int skl_compute_plane_wm(const struct
>> drm_i915_private *dev_priv,
>>   		y_min_scanlines = 4;
>>   	}
>>   
>> -	if (apply_memory_bw_wa)
>> +	if (mem_wa == WATERMARK_WA_Y_TILED)
>>   		y_min_scanlines *= 2;
>>   
>>   	plane_bytes_per_line = width * cpp;
>> @@ -4077,6 +4074,15 @@ skl_compute_ddb(struct drm_atomic_state
>> *state)
>>   	}
>>   
>>   	/*
>> +	 * If Watermark workaround is changed we need to recalculate
>> +	 * watermark values for all active pipes
>> +	 */
>> +	if (intel_state->wm_results.mem_wa != dev_priv-
>>> wm.skl_hw.mem_wa) {
>> +		realloc_pipes = ~0;
>> +		intel_state->wm_results.dirty_pipes = ~0;
>> +	}
>> +
>> +	/*
>>   	 * We're not recomputing for the pipes not included in the
>> commit, so
>>   	 * make sure we start with the current state.
>>   	 */
>> @@ -4102,6 +4108,129 @@ skl_compute_ddb(struct drm_atomic_state
>> *state)
>>   }
>>   
>>   static void
>> +skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state *state)
>> +{
>> +	struct drm_device *dev = state->dev;
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>> +	struct intel_crtc *intel_crtc;
>> +	struct intel_plane_state *intel_pstate;
>> +	struct intel_atomic_state *intel_state =
>> to_intel_atomic_state(state);
>> +	struct memdev_info *memdev_info = &dev_priv->memdev_info;
>> +	int num_active_pipes;
>> +	uint32_t max_pipe_bw_kbps, total_pipe_bw_kbps;
>> +	int display_bw_percentage;
>> +	bool y_tile_enabled = false;
>> +
>> +	if (!intel_needs_memory_bw_wa(intel_state)) {
>> +		intel_state->wm_results.mem_wa = WATERMARK_WA_NONE;
>> +		return;
>> +	}
>> +
>> +	if (!memdev_info->valid)
>> +		goto exit;
>> +
>> +	max_pipe_bw_kbps = 0;
>> +	num_active_pipes = 0;
>> +	for_each_intel_crtc(dev, intel_crtc) {
>> +		struct intel_crtc_state *cstate;
>> +		struct intel_plane *plane;
>> +		int num_active_planes;
>> +		uint32_t  max_plane_bw_kbps, pipe_bw_kbps;
>> +
>> +		/*
>> +		 * If CRTC is part of current atomic commit, get
>> crtc state from
>> +		 * existing CRTC state. else take the cached CRTC
>> state
>> +		 */
>> +		cstate = NULL;
>> +		if (state)
>> +			cstate =
>> intel_atomic_get_existing_crtc_state(state,
>> +					intel_crtc);
>> +		if (!cstate)
>> +			cstate = to_intel_crtc_state(intel_crtc-
>>> base.state);
> This is really really not allowed. If you want to access
> crtc->state, you need to get the state.
>
> Since a w/a forces a full recalculation, best you can do is put the per
> pipe status in crtc_state->wm.skl as well. If it changes you could add
> all other pipes.
I really don't want to add all pipes in "state" until that is necessary 
otherwise it'll make our commit huge, that's why getting the cached state.
Same goes for plane. Later if w/a changes then I'm  adding all the 
CRTC's in relloc_pipes mask.

	if (intel_state->wm_results.mem_wa != dev_priv->wm.skl_hw.mem_wa)

If I hold the crtc->mutex or plane->mutex lock, It may cause deadlock 
with other commits.
I can keep a cached copy of plane memory bandwidth requirement in 
"dev_priv->wm.skl_hw", but again if there are two parallel commits it 
may cause race.

Can you please suggest what is the correct way of getting plane/crtc 
state during check phase without have to actually commit them (add in 
global-state),
If any such thing is available in atomic framework?

Regards,
-Mahesh

>> +		if (!cstate->base.active)
>> +			continue;
>> +
>> +		num_active_planes = 0;
>> +		max_plane_bw_kbps = 0;
>> +		for_each_intel_plane_mask(dev, plane, cstate-
>>> base.plane_mask) {
> This is very much not allowed either!
>
> If you want to do this safely and you hold crtc_state, use
> drm_atomic_crtc_state_for_each_plane_state.
>> +			struct drm_framebuffer *fb;
>> +			uint32_t plane_bw_kbps;
>> +			uint32_t id = skl_wm_plane_id(plane);
>> +
>> +			intel_pstate = NULL;
>> +			intel_pstate =
>> +				intel_atomic_get_existing_plane_stat
>> e(state,
>> +									
>> plane);
>> +			if (!intel_pstate)
>> +				intel_pstate =
>> +					to_intel_plane_state(plane-
>>> base.state);
>> +
>> +			if (!intel_pstate->base.visible)
>> +				continue;
>> +
>> +			if (WARN_ON(!intel_pstate->base.fb))
>> +				continue;
>> +
>> +			if (id == PLANE_CURSOR)
>> +				continue;
>> +
>> +			fb = intel_pstate->base.fb;
>> +			if ((fb->modifier == I915_FORMAT_MOD_Y_TILED
>> ||
>> +				fb->modifier ==
>> I915_FORMAT_MOD_Yf_TILED))
>> +				y_tile_enabled = true;
>> +
>> +			plane_bw_kbps =
>> skl_adjusted_plane_pixel_rate(cstate,
>> +								inte
>> l_pstate);
>> +			max_plane_bw_kbps = max(plane_bw_kbps,
>> +							max_plane_bw
>> _kbps);
>> +			num_active_planes++;
>> +		}
>> +		pipe_bw_kbps = max_plane_bw_kbps *
>> num_active_planes;
>> +		max_pipe_bw_kbps = max(pipe_bw_kbps,
>> max_pipe_bw_kbps);
>> +		num_active_pipes++;
>> +	}
>> +
>> +	total_pipe_bw_kbps = max_pipe_bw_kbps * num_active_pipes;
>> +
>> +	display_bw_percentage = DIV_ROUND_UP_ULL(total_pipe_bw_kbps
>> * 100,
>> +						memdev_info-
>>> bandwidth_kbps);
>> +
>> +	/*
>> +	 * If there is any Ytile plane enabled and arbitrated
>> display
>> +	 * bandwidth > 20% of raw system memory bandwidth
>> +	 * Enale Y-tile related WA
>> +	 *
>> +	 * If memory is dual channel single rank, Xtile limit = 35%,
>> else Xtile
>> +	 * limit = 60%
>> +	 * If there is no Ytile plane enabled and
>> +	 * arbitrated display bandwidth > Xtile limit
>> +	 * Enable X-tile realated WA
>> +	 */
>> +	if (y_tile_enabled && (display_bw_percentage > 20))
>> +		intel_state->wm_results.mem_wa =
>> WATERMARK_WA_Y_TILED;
>> +	else {
>> +		int x_tile_percentage = 60;
>> +		enum rank rank = DRAM_RANK_SINGLE;
>> +
>> +		if (memdev_info->rank != DRAM_RANK_INVALID)
>> +			rank = memdev_info->rank;
>> +
>> +		if ((rank == DRAM_RANK_SINGLE) &&
>> +					(memdev_info->num_channels
>> == 2))
>> +			x_tile_percentage = 35;
>> +
>> +		if (display_bw_percentage > x_tile_percentage)
>> +			intel_state->wm_results.mem_wa =
>> WATERMARK_WA_X_TILED;
>> +	}
>> +	return;
>> +
>> +exit:
>> +	intel_state->wm_results.mem_wa = WATERMARK_WA_Y_TILED;
>> +}
>> +
>> +
>> +static void
>>   skl_copy_wm_for_pipe(struct skl_wm_values *dst,
>>   		     struct skl_wm_values *src,
>>   		     enum pipe pipe)
>> @@ -4177,6 +4306,8 @@ skl_compute_wm(struct drm_atomic_state *state)
>>   	/* Clear all dirty flags */
>>   	results->dirty_pipes = 0;
>>   
>> +	skl_compute_memory_bandwidth_wm_wa(state);
>> +
>>   	ret = skl_compute_ddb(state);
>>   	if (ret)
>>   		return ret;

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-11-29  5:42     ` Mahesh Kumar
@ 2016-11-29  9:46       ` Lankhorst, Maarten
  2016-11-29 13:47         ` Mahesh Kumar
  0 siblings, 1 reply; 20+ messages in thread
From: Lankhorst, Maarten @ 2016-11-29  9:46 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Zanoni, Paulo R

Mahesh Kumar schreef op di 29-11-2016 om 11:12 [+0530]:
> Hi,
> 
> 
> On Thursday 24 November 2016 06:21 PM, Lankhorst, Maarten wrote:
> > 
> > Mahesh Kumar schreef op do 24-11-2016 om 09:31 [+0530]:
> > > 
> > > This patch implemnets Workariunds related to display arbitrated
> > > memory
> > > bandwidth. These WA are applicabe for all gen-9 based platforms.
> > > 
> > > Changes since v1:
> > >   - Rebase on top of Paulo's patch series
> > > Changes since v2:
> > >   - Address review comments
> > >   - Rebase/rework as per other patch changes in series
> > > 
> > > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/i915_drv.h |   9 +++
> > >   drivers/gpu/drm/i915/intel_pm.c | 149
> > > +++++++++++++++++++++++++++++++++++++---
> > >   2 files changed, 149 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 4e2f17f..2b673c6 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1193,6 +1193,13 @@ enum intel_sbi_destination {
> > >   	SBI_MPHY,
> > >   };
> > >   
> > > +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
> > > +enum watermark_memory_wa {
> > > +	WATERMARK_WA_NONE,
> > > +	WATERMARK_WA_X_TILED,
> > > +	WATERMARK_WA_Y_TILED,
> > > +};
> > > +
> > >   #define QUIRK_PIPEA_FORCE (1<<0)
> > >   #define QUIRK_LVDS_SSC_DISABLE (1<<1)
> > >   #define QUIRK_INVERT_BRIGHTNESS (1<<2)
> > > @@ -1764,6 +1771,8 @@ struct skl_ddb_allocation {
> > >   
> > >   struct skl_wm_values {
> > >   	unsigned dirty_pipes;
> > > +	/* any WaterMark memory workaround Required */
> > > +	enum watermark_memory_wa mem_wa;
> > >   	struct skl_ddb_allocation ddb;
> > >   };
> > >   
> > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > b/drivers/gpu/drm/i915/intel_pm.c
> > > index 3e2dd8f..547bbda 100644
> > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > @@ -2889,11 +2889,7 @@ skl_wm_plane_id(const struct intel_plane
> > > *plane)
> > >   	}
> > >   }
> > >   
> > > -/*
> > > - * FIXME: We still don't have the proper code detect if we need
> > > to
> > > apply the WA,
> > > - * so assume we'll always need it in order to avoid underruns.
> > > - */
> > > -static bool skl_needs_memory_bw_wa(struct intel_atomic_state
> > > *state)
> > > +static bool intel_needs_memory_bw_wa(struct intel_atomic_state
> > > *state)
> > >   {
> > >   	struct drm_i915_private *dev_priv = to_i915(state-
> > > > 
> > > > base.dev);
> > >   
> > > @@ -3067,7 +3063,7 @@ bool intel_can_enable_sagv(struct
> > > drm_atomic_state *state)
> > >   
> > >   		latency = dev_priv->wm.skl_latency[level];
> > >   
> > > -		if (skl_needs_memory_bw_wa(intel_state) &&
> > > +		if (intel_needs_memory_bw_wa(intel_state) &&
> > >   		    plane->base.state->fb->modifier ==
> > >   		    I915_FORMAT_MOD_X_TILED)
> > >   			latency += 15;
> > > @@ -3597,7 +3593,7 @@ static int skl_compute_plane_wm(const
> > > struct
> > > drm_i915_private *dev_priv,
> > >   	uint32_t y_min_scanlines;
> > >   	struct intel_atomic_state *state =
> > >   		to_intel_atomic_state(cstate->base.state);
> > > -	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
> > > +	enum watermark_memory_wa mem_wa;
> > >   	bool y_tiled, x_tiled;
> > >   
> > >   	if (latency == 0 || !cstate->base.active ||
> > > !intel_pstate-
> > > > 
> > > > base.visible) {
> > > @@ -3613,7 +3609,8 @@ static int skl_compute_plane_wm(const
> > > struct
> > > drm_i915_private *dev_priv,
> > >   	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
> > >   		latency += 4;
> > >   
> > > -	if (apply_memory_bw_wa && x_tiled)
> > > +	mem_wa = state->wm_results.mem_wa;
> > > +	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
> > >   		latency += 15;
> > >   
> > >   	width = drm_rect_width(&intel_pstate->base.src) >> 16;
> > > @@ -3648,7 +3645,7 @@ static int skl_compute_plane_wm(const
> > > struct
> > > drm_i915_private *dev_priv,
> > >   		y_min_scanlines = 4;
> > >   	}
> > >   
> > > -	if (apply_memory_bw_wa)
> > > +	if (mem_wa == WATERMARK_WA_Y_TILED)
> > >   		y_min_scanlines *= 2;
> > >   
> > >   	plane_bytes_per_line = width * cpp;
> > > @@ -4077,6 +4074,15 @@ skl_compute_ddb(struct drm_atomic_state
> > > *state)
> > >   	}
> > >   
> > >   	/*
> > > +	 * If Watermark workaround is changed we need to
> > > recalculate
> > > +	 * watermark values for all active pipes
> > > +	 */
> > > +	if (intel_state->wm_results.mem_wa != dev_priv-
> > > > 
> > > > wm.skl_hw.mem_wa) {
> > > +		realloc_pipes = ~0;
> > > +		intel_state->wm_results.dirty_pipes = ~0;
> > > +	}
> > > +
> > > +	/*
> > >   	 * We're not recomputing for the pipes not included in
> > > the
> > > commit, so
> > >   	 * make sure we start with the current state.
> > >   	 */
> > > @@ -4102,6 +4108,129 @@ skl_compute_ddb(struct drm_atomic_state
> > > *state)
> > >   }
> > >   
> > >   static void
> > > +skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state
> > > *state)
> > > +{
> > > +	struct drm_device *dev = state->dev;
> > > +	struct drm_i915_private *dev_priv = to_i915(dev);
> > > +	struct intel_crtc *intel_crtc;
> > > +	struct intel_plane_state *intel_pstate;
> > > +	struct intel_atomic_state *intel_state =
> > > to_intel_atomic_state(state);
> > > +	struct memdev_info *memdev_info = &dev_priv-
> > > >memdev_info;
> > > +	int num_active_pipes;
> > > +	uint32_t max_pipe_bw_kbps, total_pipe_bw_kbps;
> > > +	int display_bw_percentage;
> > > +	bool y_tile_enabled = false;
> > > +
> > > +	if (!intel_needs_memory_bw_wa(intel_state)) {
> > > +		intel_state->wm_results.mem_wa =
> > > WATERMARK_WA_NONE;
> > > +		return;
> > > +	}
> > > +
> > > +	if (!memdev_info->valid)
> > > +		goto exit;
> > > +
> > > +	max_pipe_bw_kbps = 0;
> > > +	num_active_pipes = 0;
> > > +	for_each_intel_crtc(dev, intel_crtc) {
> > > +		struct intel_crtc_state *cstate;
> > > +		struct intel_plane *plane;
> > > +		int num_active_planes;
> > > +		uint32_t  max_plane_bw_kbps, pipe_bw_kbps;
> > > +
> > > +		/*
> > > +		 * If CRTC is part of current atomic commit, get
> > > crtc state from
> > > +		 * existing CRTC state. else take the cached
> > > CRTC
> > > state
> > > +		 */
> > > +		cstate = NULL;
> > > +		if (state)
> > > +			cstate =
> > > intel_atomic_get_existing_crtc_state(state,
> > > +					intel_crtc);
> > > +		if (!cstate)
> > > +			cstate = to_intel_crtc_state(intel_crtc-
> > > > 
> > > > base.state);
> > This is really really not allowed. If you want to access
> > crtc->state, you need to get the state.
> > 
> > Since a w/a forces a full recalculation, best you can do is put the
> > per
> > pipe status in crtc_state->wm.skl as well. If it changes you could
> > add
> > all other pipes.
> I really don't want to add all pipes in "state" until that is
> necessary 
> otherwise it'll make our commit huge, that's why getting the cached
> state.
> Same goes for plane. Later if w/a changes then I'm  adding all the 
> CRTC's in relloc_pipes mask.
> 
> 	if (intel_state->wm_results.mem_wa != dev_priv-
> >wm.skl_hw.mem_wa)
> 
> If I hold the crtc->mutex or plane->mutex lock, It may cause
> deadlock 
> with other commits.
> I can keep a cached copy of plane memory bandwidth requirement in 
> "dev_priv->wm.skl_hw", but again if there are two parallel commits
> it 
> may cause race.
> 
> Can you please suggest what is the correct way of getting plane/crtc 
> state during check phase without have to actually commit them (add
> in 
> global-state),
> If any such thing is available in atomic framework?
How often will the workaround status change in practice?

You could do same as active_crtcs, protect changing the workaround
current crtc's workaround mode in dev_priv->wm.skl.wa_state[pipe] with
connection_mutex. If this means changing that the global workaround
setting is changed, then also acquire all other crtc states.

I can't say it's pretty though, it would need a good justification for
how bad the problem is in practice plus pretty good documentation on
why it works.

~Maarten
---------------------------------------------------------------------
Intel International B.V.
Registered in The Netherlands under number 34098535
Statutory seat: Haarlemmermeer
Registered address: Capronilaan 37, 1119NG Schiphol-Rijk

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-11-29  9:46       ` Lankhorst, Maarten
@ 2016-11-29 13:47         ` Mahesh Kumar
  2016-11-29 15:10           ` Lankhorst, Maarten
  0 siblings, 1 reply; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-29 13:47 UTC (permalink / raw)
  To: Lankhorst, Maarten, intel-gfx; +Cc: Zanoni, Paulo R



On Tuesday 29 November 2016 03:16 PM, Lankhorst, Maarten wrote:
> Mahesh Kumar schreef op di 29-11-2016 om 11:12 [+0530]:
>> Hi,
>>
>>
>> On Thursday 24 November 2016 06:21 PM, Lankhorst, Maarten wrote:
>>> Mahesh Kumar schreef op do 24-11-2016 om 09:31 [+0530]:
>>>> This patch implemnets Workariunds related to display arbitrated
>>>> memory
>>>> bandwidth. These WA are applicabe for all gen-9 based platforms.
>>>>
>>>> Changes since v1:
>>>>    - Rebase on top of Paulo's patch series
>>>> Changes since v2:
>>>>    - Address review comments
>>>>    - Rebase/rework as per other patch changes in series
>>>>
>>>> Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_drv.h |   9 +++
>>>>    drivers/gpu/drm/i915/intel_pm.c | 149
>>>> +++++++++++++++++++++++++++++++++++++---
>>>>    2 files changed, 149 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>> index 4e2f17f..2b673c6 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -1193,6 +1193,13 @@ enum intel_sbi_destination {
>>>>    	SBI_MPHY,
>>>>    };
>>>>    
>>>> +/* SKL+ Watermark arbitrated display bandwidth Workarounds */
>>>> +enum watermark_memory_wa {
>>>> +	WATERMARK_WA_NONE,
>>>> +	WATERMARK_WA_X_TILED,
>>>> +	WATERMARK_WA_Y_TILED,
>>>> +};
>>>> +
>>>>    #define QUIRK_PIPEA_FORCE (1<<0)
>>>>    #define QUIRK_LVDS_SSC_DISABLE (1<<1)
>>>>    #define QUIRK_INVERT_BRIGHTNESS (1<<2)
>>>> @@ -1764,6 +1771,8 @@ struct skl_ddb_allocation {
>>>>    
>>>>    struct skl_wm_values {
>>>>    	unsigned dirty_pipes;
>>>> +	/* any WaterMark memory workaround Required */
>>>> +	enum watermark_memory_wa mem_wa;
>>>>    	struct skl_ddb_allocation ddb;
>>>>    };
>>>>    
>>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c
>>>> b/drivers/gpu/drm/i915/intel_pm.c
>>>> index 3e2dd8f..547bbda 100644
>>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>>> @@ -2889,11 +2889,7 @@ skl_wm_plane_id(const struct intel_plane
>>>> *plane)
>>>>    	}
>>>>    }
>>>>    
>>>> -/*
>>>> - * FIXME: We still don't have the proper code detect if we need
>>>> to
>>>> apply the WA,
>>>> - * so assume we'll always need it in order to avoid underruns.
>>>> - */
>>>> -static bool skl_needs_memory_bw_wa(struct intel_atomic_state
>>>> *state)
>>>> +static bool intel_needs_memory_bw_wa(struct intel_atomic_state
>>>> *state)
>>>>    {
>>>>    	struct drm_i915_private *dev_priv = to_i915(state-
>>>>> base.dev);
>>>>    
>>>> @@ -3067,7 +3063,7 @@ bool intel_can_enable_sagv(struct
>>>> drm_atomic_state *state)
>>>>    
>>>>    		latency = dev_priv->wm.skl_latency[level];
>>>>    
>>>> -		if (skl_needs_memory_bw_wa(intel_state) &&
>>>> +		if (intel_needs_memory_bw_wa(intel_state) &&
>>>>    		    plane->base.state->fb->modifier ==
>>>>    		    I915_FORMAT_MOD_X_TILED)
>>>>    			latency += 15;
>>>> @@ -3597,7 +3593,7 @@ static int skl_compute_plane_wm(const
>>>> struct
>>>> drm_i915_private *dev_priv,
>>>>    	uint32_t y_min_scanlines;
>>>>    	struct intel_atomic_state *state =
>>>>    		to_intel_atomic_state(cstate->base.state);
>>>> -	bool apply_memory_bw_wa = skl_needs_memory_bw_wa(state);
>>>> +	enum watermark_memory_wa mem_wa;
>>>>    	bool y_tiled, x_tiled;
>>>>    
>>>>    	if (latency == 0 || !cstate->base.active ||
>>>> !intel_pstate-
>>>>> base.visible) {
>>>> @@ -3613,7 +3609,8 @@ static int skl_compute_plane_wm(const
>>>> struct
>>>> drm_i915_private *dev_priv,
>>>>    	if (IS_KABYLAKE(dev_priv) && dev_priv->ipc_enabled)
>>>>    		latency += 4;
>>>>    
>>>> -	if (apply_memory_bw_wa && x_tiled)
>>>> +	mem_wa = state->wm_results.mem_wa;
>>>> +	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
>>>>    		latency += 15;
>>>>    
>>>>    	width = drm_rect_width(&intel_pstate->base.src) >> 16;
>>>> @@ -3648,7 +3645,7 @@ static int skl_compute_plane_wm(const
>>>> struct
>>>> drm_i915_private *dev_priv,
>>>>    		y_min_scanlines = 4;
>>>>    	}
>>>>    
>>>> -	if (apply_memory_bw_wa)
>>>> +	if (mem_wa == WATERMARK_WA_Y_TILED)
>>>>    		y_min_scanlines *= 2;
>>>>    
>>>>    	plane_bytes_per_line = width * cpp;
>>>> @@ -4077,6 +4074,15 @@ skl_compute_ddb(struct drm_atomic_state
>>>> *state)
>>>>    	}
>>>>    
>>>>    	/*
>>>> +	 * If Watermark workaround is changed we need to
>>>> recalculate
>>>> +	 * watermark values for all active pipes
>>>> +	 */
>>>> +	if (intel_state->wm_results.mem_wa != dev_priv-
>>>>> wm.skl_hw.mem_wa) {
>>>> +		realloc_pipes = ~0;
>>>> +		intel_state->wm_results.dirty_pipes = ~0;
>>>> +	}
>>>> +
>>>> +	/*
>>>>    	 * We're not recomputing for the pipes not included in
>>>> the
>>>> commit, so
>>>>    	 * make sure we start with the current state.
>>>>    	 */
>>>> @@ -4102,6 +4108,129 @@ skl_compute_ddb(struct drm_atomic_state
>>>> *state)
>>>>    }
>>>>    
>>>>    static void
>>>> +skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state
>>>> *state)
>>>> +{
>>>> +	struct drm_device *dev = state->dev;
>>>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>>>> +	struct intel_crtc *intel_crtc;
>>>> +	struct intel_plane_state *intel_pstate;
>>>> +	struct intel_atomic_state *intel_state =
>>>> to_intel_atomic_state(state);
>>>> +	struct memdev_info *memdev_info = &dev_priv-
>>>>> memdev_info;
>>>> +	int num_active_pipes;
>>>> +	uint32_t max_pipe_bw_kbps, total_pipe_bw_kbps;
>>>> +	int display_bw_percentage;
>>>> +	bool y_tile_enabled = false;
>>>> +
>>>> +	if (!intel_needs_memory_bw_wa(intel_state)) {
>>>> +		intel_state->wm_results.mem_wa =
>>>> WATERMARK_WA_NONE;
>>>> +		return;
>>>> +	}
>>>> +
>>>> +	if (!memdev_info->valid)
>>>> +		goto exit;
>>>> +
>>>> +	max_pipe_bw_kbps = 0;
>>>> +	num_active_pipes = 0;
>>>> +	for_each_intel_crtc(dev, intel_crtc) {
>>>> +		struct intel_crtc_state *cstate;
>>>> +		struct intel_plane *plane;
>>>> +		int num_active_planes;
>>>> +		uint32_t  max_plane_bw_kbps, pipe_bw_kbps;
>>>> +
>>>> +		/*
>>>> +		 * If CRTC is part of current atomic commit, get
>>>> crtc state from
>>>> +		 * existing CRTC state. else take the cached
>>>> CRTC
>>>> state
>>>> +		 */
>>>> +		cstate = NULL;
>>>> +		if (state)
>>>> +			cstate =
>>>> intel_atomic_get_existing_crtc_state(state,
>>>> +					intel_crtc);
>>>> +		if (!cstate)
>>>> +			cstate = to_intel_crtc_state(intel_crtc-
>>>>> base.state);
>>> This is really really not allowed. If you want to access
>>> crtc->state, you need to get the state.
>>>
>>> Since a w/a forces a full recalculation, best you can do is put the
>>> per
>>> pipe status in crtc_state->wm.skl as well. If it changes you could
>>> add
>>> all other pipes.
>> I really don't want to add all pipes in "state" until that is
>> necessary
>> otherwise it'll make our commit huge, that's why getting the cached
>> state.
>> Same goes for plane. Later if w/a changes then I'm  adding all the
>> CRTC's in relloc_pipes mask.
>>
>> 	if (intel_state->wm_results.mem_wa != dev_priv-
>>> wm.skl_hw.mem_wa)
>> If I hold the crtc->mutex or plane->mutex lock, It may cause
>> deadlock
>> with other commits.
>> I can keep a cached copy of plane memory bandwidth requirement in
>> "dev_priv->wm.skl_hw", but again if there are two parallel commits
>> it
>> may cause race.
>>
>> Can you please suggest what is the correct way of getting plane/crtc
>> state during check phase without have to actually commit them (add
>> in
>> global-state),
>> If any such thing is available in atomic framework?
> How often will the workaround status change in practice?
In Linux use-case status changes very rarely, If we connect multiple 
high resolution display & our DRAM's bandwidth is less we may have to 
enable the WA.
& it'll remain there until any display in unplugged OR any modeset with 
lower resolution.
On the other hand, In OS with multi-plane enabled, WA may change if any 
plane is enabled/disabled (based on total bandwidth requirement for display)
> You could do same as active_crtcs, protect changing the workaround
> current crtc's workaround mode in dev_priv->wm.skl.wa_state[pipe] with
> connection_mutex. If this means changing that the global workaround
> setting is changed, then also acquire all other crtc states.
This is not per-CRTC WA, this is complete display-system level WA. & 
depends upon total memory requirement of display.
As you suggested I can keep the status(bandwidth requirement) for each 
pipe in dev_priv->wm.skl.wa_state/bandwidth[pipe], and access it with 
some mutex.
Is connection mutex is right mutex to hold, what about wm_mutex?


Regards,
-Mahesh
>
> I can't say it's pretty though, it would need a good justification for
> how bad the problem is in practice plus pretty good documentation on
> why it works.
>
> ~Maarten

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround
  2016-11-29 13:47         ` Mahesh Kumar
@ 2016-11-29 15:10           ` Lankhorst, Maarten
  0 siblings, 0 replies; 20+ messages in thread
From: Lankhorst, Maarten @ 2016-11-29 15:10 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Zanoni, Paulo R

Mahesh Kumar schreef op di 29-11-2016 om 19:17 [+0530]:
> 
> On Tuesday 29 November 2016 03:16 PM, Lankhorst, Maarten wrote:
> > 
> > Mahesh Kumar schreef op di 29-11-2016 om 11:12 [+0530]:
> > > 
> > > Hi,
> > > 
> > > 
> > > On Thursday 24 November 2016 06:21 PM, Lankhorst, Maarten wrote:
> > > > 
> > > > Mahesh Kumar schreef op do 24-11-2016 om 09:31 [+0530]:
> > > > > 
> > > > > This patch implemnets Workariunds related to display
> > > > > arbitrated
> > > > > memory
> > > > > bandwidth. These WA are applicabe for all gen-9 based
> > > > > platforms.
> > > > > 
> > > > > Changes since v1:
> > > > >    - Rebase on top of Paulo's patch series
> > > > > Changes since v2:
> > > > >    - Address review comments
> > > > >    - Rebase/rework as per other patch changes in series
> > > > > 
> > > > > Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
> > > > > ---
> > > > >    drivers/gpu/drm/i915/i915_drv.h |   9 +++
> > > > >    drivers/gpu/drm/i915/intel_pm.c | 149
> > > > > +++++++++++++++++++++++++++++++++++++---
> > > > >    2 files changed, 149 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > index 4e2f17f..2b673c6 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > @@ -1193,6 +1193,13 @@ enum intel_sbi_destination {
> > > > >    	SBI_MPHY,
> > > > >    };
> > > > >    
> > > > > +/* SKL+ Watermark arbitrated display bandwidth Workarounds
> > > > > */
> > > > > +enum watermark_memory_wa {
> > > > > +	WATERMARK_WA_NONE,
> > > > > +	WATERMARK_WA_X_TILED,
> > > > > +	WATERMARK_WA_Y_TILED,
> > > > > +};
> > > > > +
> > > > >    #define QUIRK_PIPEA_FORCE (1<<0)
> > > > >    #define QUIRK_LVDS_SSC_DISABLE (1<<1)
> > > > >    #define QUIRK_INVERT_BRIGHTNESS (1<<2)
> > > > > @@ -1764,6 +1771,8 @@ struct skl_ddb_allocation {
> > > > >    
> > > > >    struct skl_wm_values {
> > > > >    	unsigned dirty_pipes;
> > > > > +	/* any WaterMark memory workaround Required */
> > > > > +	enum watermark_memory_wa mem_wa;
> > > > >    	struct skl_ddb_allocation ddb;
> > > > >    };
> > > > >    
> > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > index 3e2dd8f..547bbda 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > @@ -2889,11 +2889,7 @@ skl_wm_plane_id(const struct
> > > > > intel_plane
> > > > > *plane)
> > > > >    	}
> > > > >    }
> > > > >    
> > > > > -/*
> > > > > - * FIXME: We still don't have the proper code detect if we
> > > > > need
> > > > > to
> > > > > apply the WA,
> > > > > - * so assume we'll always need it in order to avoid
> > > > > underruns.
> > > > > - */
> > > > > -static bool skl_needs_memory_bw_wa(struct intel_atomic_state
> > > > > *state)
> > > > > +static bool intel_needs_memory_bw_wa(struct
> > > > > intel_atomic_state
> > > > > *state)
> > > > >    {
> > > > >    	struct drm_i915_private *dev_priv = to_i915(state-
> > > > > > 
> > > > > > base.dev);
> > > > >    
> > > > > @@ -3067,7 +3063,7 @@ bool intel_can_enable_sagv(struct
> > > > > drm_atomic_state *state)
> > > > >    
> > > > >    		latency = dev_priv->wm.skl_latency[level];
> > > > >    
> > > > > -		if (skl_needs_memory_bw_wa(intel_state) &&
> > > > > +		if (intel_needs_memory_bw_wa(intel_state) &&
> > > > >    		    plane->base.state->fb->modifier ==
> > > > >    		    I915_FORMAT_MOD_X_TILED)
> > > > >    			latency += 15;
> > > > > @@ -3597,7 +3593,7 @@ static int skl_compute_plane_wm(const
> > > > > struct
> > > > > drm_i915_private *dev_priv,
> > > > >    	uint32_t y_min_scanlines;
> > > > >    	struct intel_atomic_state *state =
> > > > >    		to_intel_atomic_state(cstate->base.state);
> > > > > -	bool apply_memory_bw_wa =
> > > > > skl_needs_memory_bw_wa(state);
> > > > > +	enum watermark_memory_wa mem_wa;
> > > > >    	bool y_tiled, x_tiled;
> > > > >    
> > > > >    	if (latency == 0 || !cstate->base.active ||
> > > > > !intel_pstate-
> > > > > > 
> > > > > > base.visible) {
> > > > > @@ -3613,7 +3609,8 @@ static int skl_compute_plane_wm(const
> > > > > struct
> > > > > drm_i915_private *dev_priv,
> > > > >    	if (IS_KABYLAKE(dev_priv) && dev_priv-
> > > > > >ipc_enabled)
> > > > >    		latency += 4;
> > > > >    
> > > > > -	if (apply_memory_bw_wa && x_tiled)
> > > > > +	mem_wa = state->wm_results.mem_wa;
> > > > > +	if (mem_wa != WATERMARK_WA_NONE && x_tiled)
> > > > >    		latency += 15;
> > > > >    
> > > > >    	width = drm_rect_width(&intel_pstate->base.src) >>
> > > > > 16;
> > > > > @@ -3648,7 +3645,7 @@ static int skl_compute_plane_wm(const
> > > > > struct
> > > > > drm_i915_private *dev_priv,
> > > > >    		y_min_scanlines = 4;
> > > > >    	}
> > > > >    
> > > > > -	if (apply_memory_bw_wa)
> > > > > +	if (mem_wa == WATERMARK_WA_Y_TILED)
> > > > >    		y_min_scanlines *= 2;
> > > > >    
> > > > >    	plane_bytes_per_line = width * cpp;
> > > > > @@ -4077,6 +4074,15 @@ skl_compute_ddb(struct
> > > > > drm_atomic_state
> > > > > *state)
> > > > >    	}
> > > > >    
> > > > >    	/*
> > > > > +	 * If Watermark workaround is changed we need to
> > > > > recalculate
> > > > > +	 * watermark values for all active pipes
> > > > > +	 */
> > > > > +	if (intel_state->wm_results.mem_wa != dev_priv-
> > > > > > 
> > > > > > wm.skl_hw.mem_wa) {
> > > > > +		realloc_pipes = ~0;
> > > > > +		intel_state->wm_results.dirty_pipes = ~0;
> > > > > +	}
> > > > > +
> > > > > +	/*
> > > > >    	 * We're not recomputing for the pipes not
> > > > > included in
> > > > > the
> > > > > commit, so
> > > > >    	 * make sure we start with the current state.
> > > > >    	 */
> > > > > @@ -4102,6 +4108,129 @@ skl_compute_ddb(struct
> > > > > drm_atomic_state
> > > > > *state)
> > > > >    }
> > > > >    
> > > > >    static void
> > > > > +skl_compute_memory_bandwidth_wm_wa(struct drm_atomic_state
> > > > > *state)
> > > > > +{
> > > > > +	struct drm_device *dev = state->dev;
> > > > > +	struct drm_i915_private *dev_priv = to_i915(dev);
> > > > > +	struct intel_crtc *intel_crtc;
> > > > > +	struct intel_plane_state *intel_pstate;
> > > > > +	struct intel_atomic_state *intel_state =
> > > > > to_intel_atomic_state(state);
> > > > > +	struct memdev_info *memdev_info = &dev_priv-
> > > > > > 
> > > > > > memdev_info;
> > > > > +	int num_active_pipes;
> > > > > +	uint32_t max_pipe_bw_kbps, total_pipe_bw_kbps;
> > > > > +	int display_bw_percentage;
> > > > > +	bool y_tile_enabled = false;
> > > > > +
> > > > > +	if (!intel_needs_memory_bw_wa(intel_state)) {
> > > > > +		intel_state->wm_results.mem_wa =
> > > > > WATERMARK_WA_NONE;
> > > > > +		return;
> > > > > +	}
> > > > > +
> > > > > +	if (!memdev_info->valid)
> > > > > +		goto exit;
> > > > > +
> > > > > +	max_pipe_bw_kbps = 0;
> > > > > +	num_active_pipes = 0;
> > > > > +	for_each_intel_crtc(dev, intel_crtc) {
> > > > > +		struct intel_crtc_state *cstate;
> > > > > +		struct intel_plane *plane;
> > > > > +		int num_active_planes;
> > > > > +		uint32_t  max_plane_bw_kbps, pipe_bw_kbps;
> > > > > +
> > > > > +		/*
> > > > > +		 * If CRTC is part of current atomic commit,
> > > > > get
> > > > > crtc state from
> > > > > +		 * existing CRTC state. else take the cached
> > > > > CRTC
> > > > > state
> > > > > +		 */
> > > > > +		cstate = NULL;
> > > > > +		if (state)
> > > > > +			cstate =
> > > > > intel_atomic_get_existing_crtc_state(state,
> > > > > +					intel_crtc);
> > > > > +		if (!cstate)
> > > > > +			cstate =
> > > > > to_intel_crtc_state(intel_crtc-
> > > > > > 
> > > > > > base.state);
> > > > This is really really not allowed. If you want to access
> > > > crtc->state, you need to get the state.
> > > > 
> > > > Since a w/a forces a full recalculation, best you can do is put
> > > > the
> > > > per
> > > > pipe status in crtc_state->wm.skl as well. If it changes you
> > > > could
> > > > add
> > > > all other pipes.
> > > I really don't want to add all pipes in "state" until that is
> > > necessary
> > > otherwise it'll make our commit huge, that's why getting the
> > > cached
> > > state.
> > > Same goes for plane. Later if w/a changes then I'm  adding all
> > > the
> > > CRTC's in relloc_pipes mask.
> > > 
> > > 	if (intel_state->wm_results.mem_wa != dev_priv-
> > > > 
> > > > wm.skl_hw.mem_wa)
> > > If I hold the crtc->mutex or plane->mutex lock, It may cause
> > > deadlock
> > > with other commits.
> > > I can keep a cached copy of plane memory bandwidth requirement in
> > > "dev_priv->wm.skl_hw", but again if there are two parallel
> > > commits
> > > it
> > > may cause race.
> > > 
> > > Can you please suggest what is the correct way of getting
> > > plane/crtc
> > > state during check phase without have to actually commit them
> > > (add
> > > in
> > > global-state),
> > > If any such thing is available in atomic framework?
> > How often will the workaround status change in practice?
> In Linux use-case status changes very rarely, If we connect multiple 
> high resolution display & our DRAM's bandwidth is less we may have
> to 
> enable the WA.
> & it'll remain there until any display in unplugged OR any modeset
> with 
> lower resolution.
> On the other hand, In OS with multi-plane enabled, WA may change if
> any 
> plane is enabled/disabled (based on total bandwidth requirement for
> display)
> > 
> > You could do same as active_crtcs, protect changing the workaround
> > current crtc's workaround mode in dev_priv->wm.skl.wa_state[pipe]
> > with
> > connection_mutex. If this means changing that the global workaround
> > setting is changed, then also acquire all other crtc states.
> This is not per-CRTC WA, this is complete display-system level WA. & 
> depends upon total memory requirement of display.
> As you suggested I can keep the status(bandwidth requirement) for
> each 
> pipe in dev_priv->wm.skl.wa_state/bandwidth[pipe], and access it
> with 
> some mutex.
> Is connection mutex is right mutex to hold, what about wm_mutex?

It's nasty to do so, but it would work. connection_mutex is a ww_mutex.
But I don't think this situation will happen often, so if the w/a for
current crtc changes, won't it be better to just grab all states
anyway? I don't think the optimization is worth the complexity it
introduces you run the numbers in normal use.

~Maarten
---------------------------------------------------------------------
Intel International B.V.
Registered in The Netherlands under number 34098535
Statutory seat: Haarlemmermeer
Registered address: Capronilaan 37, 1119NG Schiphol-Rijk

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton
  2016-11-28 13:07     ` Mahesh Kumar
@ 2016-12-01 11:02       ` Lankhorst, Maarten
  0 siblings, 0 replies; 20+ messages in thread
From: Lankhorst, Maarten @ 2016-12-01 11:02 UTC (permalink / raw)
  To: intel-gfx, Kumar, Mahesh1; +Cc: Zanoni, Paulo R

Hey,

Mahesh Kumar schreef op ma 28-11-2016 om 18:37 [+0530]:
> Hi,
> 
> Will keep WA number in commit message/WA location.
> thanks,

Sounds good, with that fixed patches 1-5 and 7 look good to me.
I think patch 6 will no longer be required since the workaround status
will also be kept inside intel_state.

~Maarten
---------------------------------------------------------------------
Intel International B.V.
Registered in The Netherlands under number 34098535
Statutory seat: Haarlemmermeer
Registered address: Capronilaan 37, 1119NG Schiphol-Rijk

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC
@ 2016-11-18 15:09 Mahesh Kumar
  0 siblings, 0 replies; 20+ messages in thread
From: Mahesh Kumar @ 2016-11-18 15:09 UTC (permalink / raw)
  To: intel-gfx; +Cc: paulo.r.zanoni, maarten.lankhorst

This series implements following set of functionality
    Implement IPC WA's for Broxton/KBL
    Enable IPC in supported platforms
    Convert WM calculation to fixed point calculation
    Calculation of System memory Bandwidth for SKL/KBL/BXT
    Implementation of Arbitrated memory Bandwidth related WM WA's

Mahesh Kumar (8):
  drm/i915/skl: Add variables to check x_tile and y_tile
  drm/i915/bxt: IPC WA for Broxton
  drm/i915/kbl: IPC workaround for kabylake
  drm/i915/bxt: Enable IPC support
  drm/i915/skl+: change WM calc to fixed point 16.16
  drm/i915: Add intel_atomic_get_existing_crtc_state function
  drm/i915: Decode system memory bandwidth
  drm/i915/gen9: WM memory bandwidth related workaround

 drivers/gpu/drm/i915/i915_drv.c  | 176 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h  | 108 ++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h  |  38 ++++++
 drivers/gpu/drm/i915/intel_drv.h |  15 +++
 drivers/gpu/drm/i915/intel_pm.c  | 265 ++++++++++++++++++++++++++++++++-------
 5 files changed, 555 insertions(+), 47 deletions(-)

-- 
2.10.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-12-01 11:03 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-24  4:01 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar
2016-11-24  4:01 ` [PATCH v6 1/8] drm/i915/skl: Add variables to check x_tile and y_tile Mahesh Kumar
2016-11-24  4:01 ` [PATCH v6 2/8] drm/i915/bxt: IPC WA for Broxton Mahesh Kumar
2016-11-24 12:51   ` Lankhorst, Maarten
2016-11-28 13:07     ` Mahesh Kumar
2016-12-01 11:02       ` Lankhorst, Maarten
2016-11-24  4:01 ` [PATCH v6 3/8] drm/i915/kbl: IPC workaround for kabylake Mahesh Kumar
2016-11-24  4:01 ` [PATCH v6 4/8] drm/i915/bxt: Enable IPC support Mahesh Kumar
2016-11-24  4:01 ` [PATCH v6 5/8] drm/i915/skl+: change WM calc to fixed point 16.16 Mahesh Kumar
2016-11-24  4:01 ` [PATCH v6 6/8] drm/i915: Add intel_atomic_get_existing_crtc_state function Mahesh Kumar
2016-11-24  4:01 ` [PATCH v6 7/8] drm/i915: Decode system memory bandwidth Mahesh Kumar
2016-11-24 10:10   ` kbuild test robot
2016-11-24  4:01 ` [PATCH v6 8/8] drm/i915/gen9: WM memory bandwidth related workaround Mahesh Kumar
2016-11-24 12:51   ` Lankhorst, Maarten
2016-11-29  5:42     ` Mahesh Kumar
2016-11-29  9:46       ` Lankhorst, Maarten
2016-11-29 13:47         ` Mahesh Kumar
2016-11-29 15:10           ` Lankhorst, Maarten
2016-11-24  4:45 ` ✓ Fi.CI.BAT: success for GEN-9 Arbitrated Bandwidth WM WA's & IPC (rev2) Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2016-11-18 15:09 [PATCH v5 0/8] GEN-9 Arbitrated Bandwidth WM WA's & IPC Mahesh Kumar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.