* [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl.
@ 2015-08-25 20:06 Animesh Manna
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading Animesh Manna
                   ` (5 more replies)
  0 siblings, 6 replies; 51+ messages in thread
From: Animesh Manna @ 2015-08-25 20:06 UTC (permalink / raw)
  To: intel-gfx

The following patches help to solve the PC10 entry issue on SKL.
A detailed description of the changes made to solve the issue
is given in the commit message of each patch.

v1: http://lists.freedesktop.org/archives/intel-gfx/2015-August/072870.html

v2: Changes made in this version are based on review comments from Daniel.

The DMC redesign patch series depends on the current patch series. A few
of its patches need rework; I plan to send them after the initial review
feedback on the current patch series.
http://lists.freedesktop.org/archives/intel-gfx/2015-August/072921.html
 

Animesh Manna (5):
  drm/i915/skl: Added a check for the hardware status of csr fw before
    loading.
  drm/i915/skl Remove the call for csr uninitialization from suspend
    path
  drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
  drm/i915/skl: Block disable call for pw1 if dmc firmware is present.

 drivers/gpu/drm/i915/i915_drv.c         | 19 +++++++++++++------
 drivers/gpu/drm/i915/intel_csr.c        |  9 +++++++++
 drivers/gpu/drm/i915/intel_display.c    | 14 ++++++++++----
 drivers/gpu/drm/i915/intel_drv.h        |  2 ++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 31 ++++++++++++++++---------------
 5 files changed, 50 insertions(+), 25 deletions(-)

-- 
2.0.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-08-25 20:06 [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Animesh Manna
@ 2015-08-25 20:06 ` Animesh Manna
  2015-08-26 13:10   ` Daniel Vetter
  2015-09-07 11:04   ` Sunil Kamath
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 2/5] drm/i915/skl Remove the call for csr uninitialization from suspend path Animesh Manna
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 51+ messages in thread
From: Animesh Manna @ 2015-08-25 20:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

The DMC restores the csr program in all cases except DC9, cold boot,
warm reset, PCI function level reset, and hibernate/suspend.

The intel_csr_load_program() function is used to load the firmware
data from kernel memory to the csr address space.

All values in the csr address space are zero after a reset, and
the first byte of the csr program is always non-zero if the firmware
was loaded successfully. The firmware is loaded based on this
hardware status.

Without this check, reloading would overwrite the firmware data and
the counters exposed for dc5/dc6 (helpful for debugging) would be
reset to zero.
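
The check can be sketched outside the kernel as follows. This is only an
illustration under stated assumptions, not the driver code: fake_csr_space,
CSR_WORDS and csr_load_program() are hypothetical stand-ins for the csr
mmio range and for intel_csr_load_program(), and a plain array replaces
the register accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CSR_WORDS 8  /* hypothetical size of the simulated csr space */

/* Simulated csr address space; all zeros models a reset or never-loaded state. */
static uint32_t fake_csr_space[CSR_WORDS];

/*
 * Copy the firmware into the simulated csr space unless a program is
 * already present. Returns 1 if the payload was written, 0 if skipped.
 */
static int csr_load_program(const uint32_t *payload, size_t n_words)
{
	size_t i;

	assert(payload != NULL && n_words > 0 && n_words <= CSR_WORDS);

	/* A non-zero first word means the program survived: do not overwrite. */
	if (fake_csr_space[0] != 0)
		return 0;

	for (i = 0; i < n_words; i++)
		fake_csr_space[i] = payload[i];
	return 1;
}
```

In this model a second call leaves the array untouched, which is the
property that keeps firmware-maintained state such as the counters intact.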

v1: Initial version.

v2: Based on review comments from Daniel,
- Added a check to know hardware status and load the firmware if not loaded.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
---
 drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
index ba1ae03..682cc26 100644
--- a/drivers/gpu/drm/i915/intel_csr.c
+++ b/drivers/gpu/drm/i915/intel_csr.c
@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
 		return;
 	}
 
+	/*
+	 * Dmc will restore the csr program except DC9, cold boot,
+	 * warm reset, PCI function level reset, and hibernate/suspend.
+	 * This condition will help to check if csr address space is reset/
+	 * not loaded.
+	 */
+	if (I915_READ(CSR_PROGRAM_BASE))
+		return;
+
 	mutex_lock(&dev_priv->csr_lock);
 	fw_size = dev_priv->csr.dmc_fw_size;
 	for (i = 0; i < fw_size; i++)
-- 
2.0.2


* [DMC_BUGFIX_SKL_V2 2/5] drm/i915/skl Remove the call for csr uninitialization from suspend path
  2015-08-25 20:06 [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Animesh Manna
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading Animesh Manna
@ 2015-08-25 20:06 ` Animesh Manna
  2015-09-07 11:05   ` Sunil Kamath
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow Animesh Manna
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-08-25 20:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

This patch removes the function call that sets the firmware
loading status to uninitialized during suspend.

The DMC restores the firmware in a normal suspend. The previous
patch added a check that reads the hardware status directly and loads
the firmware if it was reset during resume from suspend-hibernation.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1d88745..478101c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1015,12 +1015,6 @@ static int skl_suspend_complete(struct drm_i915_private *dev_priv)
 {
 	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
 
-	/*
-	 * This is to ensure that CSR isn't identified as loaded before
-	 * CSR-loading program is called during runtime-resume.
-	 */
-	intel_csr_load_status_set(dev_priv, FW_UNINITIALIZED);
-
 	skl_uninit_cdclk(dev_priv);
 
 	return 0;
-- 
2.0.2


* [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-08-25 20:06 [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Animesh Manna
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading Animesh Manna
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 2/5] drm/i915/skl Remove the call for csr uninitialization from suspend path Animesh Manna
@ 2015-08-25 20:06 ` Animesh Manna
  2015-09-07 11:06   ` Sunil Kamath
  2015-09-28  7:21   ` Daniel Vetter
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present Animesh Manna
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 51+ messages in thread
From: Animesh Manna @ 2015-08-25 20:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rajneesh Bhardwaj, Daniel Vetter

According to bspec (bspec-id 0527), mmio register access is not
allowed after dc5/dc6 entry when the DC6 power state is enabled,
so dc6 is now enabled as the last call in the suspend flow.
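
The ordering constraint can be illustrated with a small stand-alone model.
This is a sketch, not driver code: struct seq_dev, the seq_* helpers and
the SEQ_FW_* states are hypothetical stand-ins for the suspend/resume
hooks and the csr load status, and a counter replaces real mmio access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum seq_fw_state { SEQ_FW_UNINITIALIZED, SEQ_FW_LOADED, SEQ_FW_FAILED };

struct seq_dev {
	enum seq_fw_state fw;
	bool dc6_enabled;
	int mmio_after_dc6;	/* accesses that violate the ordering rule */
};

/* Every modelled register access goes through here so violations are counted. */
static void seq_mmio_access(struct seq_dev *d)
{
	assert(d != NULL);
	if (d->dc6_enabled)
		d->mmio_after_dc6++;
}

static void seq_suspend_complete(struct seq_dev *d)
{
	seq_mmio_access(d);		/* models the cdclk uninit step */
	if (d->fw == SEQ_FW_LOADED) {
		seq_mmio_access(d);	/* the DC6 enable write itself */
		d->dc6_enabled = true;	/* last step: nothing may follow */
	}
}

static void seq_resume_prepare(struct seq_dev *d)
{
	if (d->fw == SEQ_FW_LOADED)
		d->dc6_enabled = false;	/* models leaving DC6 first */
	seq_mmio_access(d);		/* models cdclk init and firmware reload */
}
```

With DC6 enabled last on suspend and disabled first on resume, the
violation counter in this model stays at zero.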

v1: Initial version.

v2: Based on review comment from Daniel,
- created a separate patch for csr uninitialization set call.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
 drivers/gpu/drm/i915/intel_drv.h        |  2 ++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 478101c..fa66162 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1013,10 +1013,20 @@ static int i915_pm_resume(struct device *dev)
 
 static int skl_suspend_complete(struct drm_i915_private *dev_priv)
 {
+	enum csr_state state;
 	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
 
 	skl_uninit_cdclk(dev_priv);
 
+	/* TODO: wait for a completion event or
+	 * similar here instead of busy
+	 * waiting using wait_for function.
+	 */
+	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
+			FW_UNINITIALIZED, 1000);
+	if (state == FW_LOADED)
+		skl_enable_dc6(dev_priv);
+
 	return 0;
 }
 
@@ -1063,6 +1073,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
 {
 	struct drm_device *dev = dev_priv->dev;
 
+	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
+		skl_disable_dc6(dev_priv);
+
 	skl_init_cdclk(dev_priv);
 	intel_csr_load_program(dev);
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 81b7d77..9cb7d4e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1118,6 +1118,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
 void bxt_disable_dc9(struct drm_i915_private *dev_priv);
 void skl_init_cdclk(struct drm_i915_private *dev_priv);
 void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
+void skl_enable_dc6(struct drm_i915_private *dev_priv);
+void skl_disable_dc6(struct drm_i915_private *dev_priv);
 void intel_dp_get_m_n(struct intel_crtc *crtc,
 		      struct intel_crtc_state *pipe_config);
 void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 821644d..23a3aa3 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -548,7 +548,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
 		"DC6 already programmed to be disabled.\n");
 }
 
-static void skl_enable_dc6(struct drm_i915_private *dev_priv)
+void skl_enable_dc6(struct drm_i915_private *dev_priv)
 {
 	uint32_t val;
 
@@ -565,7 +565,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
 	POSTING_READ(DC_STATE_EN);
 }
 
-static void skl_disable_dc6(struct drm_i915_private *dev_priv)
+void skl_disable_dc6(struct drm_i915_private *dev_priv)
 {
 	uint32_t val;
 
@@ -626,10 +626,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 				!I915_READ(HSW_PWR_WELL_BIOS),
 				"Invalid for power well status to be enabled, unless done by the BIOS, \
 				when request is to disable!\n");
-			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
-				power_well->data == SKL_DISP_PW_2) {
+			if (power_well->data == SKL_DISP_PW_2) {
+				if (GEN9_ENABLE_DC5(dev))
+					gen9_disable_dc5(dev_priv);
 				if (SKL_ENABLE_DC6(dev)) {
-					skl_disable_dc6(dev_priv);
 					/*
 					 * DDI buffer programming unnecessary during driver-load/resume
 					 * as it's already done during modeset initialization then.
@@ -637,8 +637,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 					 */
 					if (!dev_priv->power_domains.initializing)
 						intel_prepare_ddi(dev);
-				} else {
-					gen9_disable_dc5(dev_priv);
 				}
 			}
 			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
@@ -658,7 +656,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 			POSTING_READ(HSW_PWR_WELL_DRIVER);
 			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
 
-			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
+			if (GEN9_ENABLE_DC5(dev) &&
 				power_well->data == SKL_DISP_PW_2) {
 				enum csr_state state;
 				/* TODO: wait for a completion event or
@@ -671,10 +669,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 					DRM_ERROR("CSR firmware not ready (%d)\n",
 							state);
 				else
-					if (SKL_ENABLE_DC6(dev))
-						skl_enable_dc6(dev_priv);
-					else
-						gen9_enable_dc5(dev_priv);
+					gen9_enable_dc5(dev_priv);
 			}
 		}
 	}
-- 
2.0.2


* [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
  2015-08-25 20:06 [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Animesh Manna
                   ` (2 preceding siblings ...)
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow Animesh Manna
@ 2015-08-25 20:06 ` Animesh Manna
  2015-08-26 13:11   ` Daniel Vetter
  2015-09-07 11:07   ` Sunil Kamath
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 5/5] drm/i915/skl: Block disable call for pw1 if dmc " Animesh Manna
  2015-10-09 13:58 ` [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Imre Deak
  5 siblings, 2 replies; 51+ messages in thread
From: Animesh Manna @ 2015-08-25 20:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rajneesh Bhardwaj, Daniel Vetter

When the display engine enters a low power state there is no need to
disable the cdclk pll, as the CSR firmware of the dmc takes care of it.
If the pll is already enabled, the firmware execution sequence will be
blocked. This is one of the criteria for the dmc to work properly.

v1: Initial version.

v2: Based on review comment from Daniel, added code comment.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f604ce1..b6bef20 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -5687,10 +5687,16 @@ void skl_uninit_cdclk(struct drm_i915_private *dev_priv)
 	if (I915_READ(DBUF_CTL) & DBUF_POWER_STATE)
 		DRM_ERROR("DBuf power disable timeout\n");
 
-	/* disable DPLL0 */
-	I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) & ~LCPLL_PLL_ENABLE);
-	if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
-		DRM_ERROR("Couldn't disable DPLL0\n");
+	/*
+	 * DMC assumes ownership of LCPLL and will get confused if we touch it.
+	 */
+	if (dev_priv->csr.dmc_payload) {
+		/* disable DPLL0 */
+		I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) &
+					~LCPLL_PLL_ENABLE);
+		if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
+			DRM_ERROR("Couldn't disable DPLL0\n");
+	}
 
 	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
 }
-- 
2.0.2


* [DMC_BUGFIX_SKL_V2 5/5] drm/i915/skl: Block disable call for pw1 if dmc firmware is present.
  2015-08-25 20:06 [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Animesh Manna
                   ` (3 preceding siblings ...)
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present Animesh Manna
@ 2015-08-25 20:06 ` Animesh Manna
  2015-09-07 11:09   ` Sunil Kamath
  2015-10-09 13:58 ` [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Imre Deak
  5 siblings, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-08-25 20:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Another criterion for the dmc to work as expected is that pw1 is left
enabled by the driver; the dmc shuts it off as part of its execution
sequence. If the driver has already disabled it, the dmc gets confused
and behaves differently than expected, as was found while debugging
the pc10 entry issue for skl.

So before disabling power well 1, check whether the dmc firmware is
present; if it is, the driver does not disable power well 1. If for any
reason the firmware is not present or failed to load, the driver can
shut off power well 1, which saves some power.

skl currently depends fully on the dmc to reach its lowest possible
power state (dc6), but the same does not apply to bxt: there the display
engine can enter dc9 without the dmc, so the disable call is not blocked.
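
The resulting policy can be written as a single predicate. The sketch
below is a stand-alone model under stated assumptions: the pw_* enums and
pw_driver_may_disable() are hypothetical names that stand in for the
platform check, the power well id and the csr load status.

```c
#include <assert.h>
#include <stdbool.h>

enum pw_platform { PW_PLAT_SKL, PW_PLAT_BXT };
enum pw_well { PW_WELL_1, PW_WELL_2 };
enum pw_fw_state { PW_FW_UNINITIALIZED, PW_FW_LOADED, PW_FW_FAILED };

/*
 * Returns false only when the dmc is expected to shut the well off
 * itself: skl, power well 1, firmware loaded. In every other case the
 * driver disables the well as before.
 */
static bool pw_driver_may_disable(enum pw_platform plat, enum pw_well well,
				  enum pw_fw_state fw)
{
	if (plat == PW_PLAT_SKL && well == PW_WELL_1 && fw == PW_FW_LOADED)
		return false;
	return true;
}

/* Usage example: the cases described in the commit message. */
static void pw_policy_selfcheck(void)
{
	assert(!pw_driver_may_disable(PW_PLAT_SKL, PW_WELL_1, PW_FW_LOADED));
	assert(pw_driver_may_disable(PW_PLAT_SKL, PW_WELL_1, PW_FW_FAILED));
	assert(pw_driver_may_disable(PW_PLAT_BXT, PW_WELL_1, PW_FW_LOADED));
}
```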

v1: Initial version.

v2: Rebased as per current patch series.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
---
 drivers/gpu/drm/i915/intel_runtime_pm.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 23a3aa3..340f386 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -652,9 +652,15 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 		}
 	} else {
 		if (enable_requested) {
-			I915_WRITE(HSW_PWR_WELL_DRIVER,	tmp & ~req_mask);
-			POSTING_READ(HSW_PWR_WELL_DRIVER);
-			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
+			if (IS_SKYLAKE(dev) &&
+				(power_well->data == SKL_DISP_PW_1) &&
+				(intel_csr_load_status_get(dev_priv) == FW_LOADED))
+				DRM_DEBUG_KMS("Not Disabling PW1, dmc will handle\n");
+			else {
+				I915_WRITE(HSW_PWR_WELL_DRIVER,	tmp & ~req_mask);
+				POSTING_READ(HSW_PWR_WELL_DRIVER);
+				DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
+			}
 
 			if (GEN9_ENABLE_DC5(dev) &&
 				power_well->data == SKL_DISP_PW_2) {
-- 
2.0.2


* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading Animesh Manna
@ 2015-08-26 13:10   ` Daniel Vetter
  2015-08-26 14:10     ` Animesh Manna
  2015-09-07 11:04   ` Sunil Kamath
  1 sibling, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-08-26 13:10 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx

On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
> Dmc will restore the csr program except DC9, cold boot,
> warm reset, PCI function level reset, and hibernate/suspend.
> 
> intel_csr_load_program() function is used to load the firmware
> data from kernel memory to csr address space.
> 
> All values of csr address space will be zero if it got reset and
> the first byte of csr program is always a non-zero if firmware
> is loaded successfully. Based on hardware status will load the
> firmware.
> 
> Without this condition check if we overwrite the firmware data the
> counters exposed for dc5/dc6 (help for debugging) will be nullified.
> 
> v1: Initial version.
> 
> v2: Based on review comments from Daniel,
> - Added a check to know hardware status and load the firmware if not loaded.
> 
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> index ba1ae03..682cc26 100644
> --- a/drivers/gpu/drm/i915/intel_csr.c
> +++ b/drivers/gpu/drm/i915/intel_csr.c
> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>  		return;
>  	}
>  
> +	/*
> +	 * Dmc will restore the csr the program except DC9, cold boot,
> +	 * warm reset, PCI function level reset, and hibernate/suspend.
> +	 * This condition will help to check if csr address space is reset/
> +	 * not loaded.
> +	 */

Atm we call this from driver load and resume, which doesn't seem to cover
all the cases you mention in the comment. Should this be a WARN_ON
instead? Or do we have troubles in our init sequence where we load too
many times?

Either way I can't reconcile your commit message with the comment here.
-Daniel

> +	if (I915_READ(CSR_PROGRAM_BASE))
> +		return;
> +
>  	mutex_lock(&dev_priv->csr_lock);
>  	fw_size = dev_priv->csr.dmc_fw_size;
>  	for (i = 0; i < fw_size; i++)
> -- 
> 2.0.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present Animesh Manna
@ 2015-08-26 13:11   ` Daniel Vetter
  2015-08-26 14:31     ` Animesh Manna
  2015-08-31  1:03     ` Hindman, Gavin
  2015-09-07 11:07   ` Sunil Kamath
  1 sibling, 2 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-08-26 13:11 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter

On Wed, Aug 26, 2015 at 01:36:08AM +0530, Animesh Manna wrote:
> While display engine entering into low power state no need to disable
> cdclk pll as CSR firmware of dmc will take care. If pll is already
> enabled firmware execution sequence will be blocked. This is one
> of the criteria for dmc to work properly.
> 
> v1: Initial version.
> 
> v2: Based on review comment from Daniel added code comment.
> 
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f604ce1..b6bef20 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -5687,10 +5687,16 @@ void skl_uninit_cdclk(struct drm_i915_private *dev_priv)
>  	if (I915_READ(DBUF_CTL) & DBUF_POWER_STATE)
>  		DRM_ERROR("DBuf power disable timeout\n");
>  
> -	/* disable DPLL0 */
> -	I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) & ~LCPLL_PLL_ENABLE);
> -	if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> -		DRM_ERROR("Couldn't disable DPLL0\n");
> +	/*
> +	 * DMC assumes ownership of LCPLL and will get confused if we touch it.

This should get a FIXME - once we have dmc loading fixed up we require the
firmware and there's no point in this check any more. Flexibility just
because is something we simply don't have the developer and validation
resources for.
-Daniel

> +	 */
> +	if (dev_priv->csr.dmc_payload) {
> +		/* disable DPLL0 */
> +		I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) &
> +					~LCPLL_PLL_ENABLE);
> +		if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> +			DRM_ERROR("Couldn't disable DPLL0\n");
> +	}
>  
>  	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
>  }
> -- 
> 2.0.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-08-26 13:10   ` Daniel Vetter
@ 2015-08-26 14:10     ` Animesh Manna
  2015-09-02  8:54       ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-08-26 14:10 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx



On 8/26/2015 6:40 PM, Daniel Vetter wrote:
> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>> Dmc will restore the csr program except DC9, cold boot,
>> warm reset, PCI function level reset, and hibernate/suspend.
>>
>> intel_csr_load_program() function is used to load the firmware
>> data from kernel memory to csr address space.
>>
>> All values of csr address space will be zero if it got reset and
>> the first byte of csr program is always a non-zero if firmware
>> is loaded successfully. Based on hardware status will load the
>> firmware.
>>
>> Without this condition check if we overwrite the firmware data the
>> counters exposed for dc5/dc6 (help for debugging) will be nullified.

Because of the reason mentioned just above, we need to block loading the firmware again,
so a WARN_ON alone will not help.


>>
>> v1: Initial version.
>>
>> v2: Based on review comments from Daniel,
>> - Added a check to know hardware status and load the firmware if not loaded.
>>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>> Cc: Imre Deak <imre.deak@intel.com>
>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>> index ba1ae03..682cc26 100644
>> --- a/drivers/gpu/drm/i915/intel_csr.c
>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>   		return;
>>   	}
>>   
>> +	/*
>> +	 * Dmc will restore the csr the program except DC9, cold boot,
>> +	 * warm reset, PCI function level reset, and hibernate/suspend.
>> +	 * This condition will help to check if csr address space is reset/
>> +	 * not loaded.
>> +	 */
> Atm we call this from driver load and resume, which doesn't seem to cover
> all the cases you mention in the comment. Should this be a WARN_ON
> instead? Or do we have troubles in our init sequence where we load too
> many times?

Yes, the above statement is taken from bspec and describes the special cases in which the dmc will not restore the firmware.
Agreed. In our case we mainly need to load the firmware again after cold boot and hibernate/suspend, so my
second sentence was meant to say that this check is added for suspend-hibernate (reset)
and cold boot (not loaded).

Anyway, the same api can later be used to load the firmware from anywhere, so my intention is to check whether the firmware is loaded or not.
If it is already loaded, the csr address space is not overwritten, which preserves the dc5/dc6 counter values.

Is the comment below clearer to you?

	/*
	 * Dmc will restore the csr program except DC9, cold boot,
	 * warm reset, PCI function level reset, and hibernate/suspend.
	 * If firmware is restored by dmc then no need to load again which
	 * will keep the dc5/dc6 counter exposed by firmware.
	 */

No issue in init sequence.

-Animesh


>
> Either way I can't reconcile your commit message with the comment here.
> -Daniel
>
>> +	if (I915_READ(CSR_PROGRAM_BASE))
>> +		return;
>> +
>>   	mutex_lock(&dev_priv->csr_lock);
>>   	fw_size = dev_priv->csr.dmc_fw_size;
>>   	for (i = 0; i < fw_size; i++)
>> -- 
>> 2.0.2
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
  2015-08-26 13:11   ` Daniel Vetter
@ 2015-08-26 14:31     ` Animesh Manna
  2015-08-31  1:03     ` Hindman, Gavin
  1 sibling, 0 replies; 51+ messages in thread
From: Animesh Manna @ 2015-08-26 14:31 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter



On 8/26/2015 6:41 PM, Daniel Vetter wrote:
> On Wed, Aug 26, 2015 at 01:36:08AM +0530, Animesh Manna wrote:
>> While display engine entering into low power state no need to disable
>> cdclk pll as CSR firmware of dmc will take care. If pll is already
>> enabled firmware execution sequence will be blocked. This is one
>> of the criteria for dmc to work properly.
>>
>> v1: Initial version.
>>
>> v2: Based on review comment from Daniel added code comment.
>>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>> Cc: Imre Deak <imre.deak@intel.com>
>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_display.c | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index f604ce1..b6bef20 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -5687,10 +5687,16 @@ void skl_uninit_cdclk(struct drm_i915_private *dev_priv)
>>   	if (I915_READ(DBUF_CTL) & DBUF_POWER_STATE)
>>   		DRM_ERROR("DBuf power disable timeout\n");
>>   
>> -	/* disable DPLL0 */
>> -	I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) & ~LCPLL_PLL_ENABLE);
>> -	if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
>> -		DRM_ERROR("Couldn't disable DPLL0\n");
>> +	/*
>> +	 * DMC assumes ownership of LCPLL and will get confused if we touch it.
> This should get a FIXME - once we have dmc loading fixed up we require the
> firmware and there's no point in this check any more. Flexibility just
> because is something we simply don't have the developer and validation
> resources for.
> -Daniel

I am not sure if the dmc firmware is mandatory for all users.
But the code is written based on the design principle published in the series below:
http://www.spinics.net/lists/intel-gfx/msg72399.html

The job of the dmc firmware is to disable pw1 and the cdclk pll during suspend.
If the dmc firmware is not present, the driver can disable the cdclk pll to save some more power.

Am I missing anything?

-Animesh

>
>> +	 */
>> +	if (dev_priv->csr.dmc_payload) {
>> +		/* disable DPLL0 */
>> +		I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) &
>> +					~LCPLL_PLL_ENABLE);
>> +		if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
>> +			DRM_ERROR("Couldn't disable DPLL0\n");
>> +	}
>>   
>>   	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
>>   }
>> -- 
>> 2.0.2
>>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
  2015-08-26 13:11   ` Daniel Vetter
  2015-08-26 14:31     ` Animesh Manna
@ 2015-08-31  1:03     ` Hindman, Gavin
  2015-09-02  8:58       ` Daniel Vetter
  1 sibling, 1 reply; 51+ messages in thread
From: Hindman, Gavin @ 2015-08-31  1:03 UTC (permalink / raw)
  To: Daniel Vetter, Manna, Animesh
  Cc: Bhardwaj, Rajneesh, intel-gfx, Vetter, Daniel

Unless I'm misreading, that would imply that we are moving away from our previous position that DMC FW is optional, correct? Would this not render power-sequencing broken if a distro chose not to include DMC FW?

Gavin Hindman
Senior Program Manager
SSG/OTC – Open Source Technology Center


-----Original Message-----
From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of Daniel Vetter
Sent: Wednesday, August 26, 2015 6:12 AM
To: Manna, Animesh
Cc: Bhardwaj, Rajneesh; intel-gfx@lists.freedesktop.org; Vetter, Daniel
Subject: Re: [Intel-gfx] [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present

On Wed, Aug 26, 2015 at 01:36:08AM +0530, Animesh Manna wrote:
> When the display engine enters a low power state there is no need to
> disable the cdclk pll, as the CSR firmware of dmc will take care of it.
> If the pll is already enabled, the firmware execution sequence will be
> blocked. This is one of the criteria for dmc to work properly.
> 
> v1: Initial version.
> 
> v2: Based on review comment from Daniel, added code comment.
> 
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f604ce1..b6bef20 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -5687,10 +5687,16 @@ void skl_uninit_cdclk(struct drm_i915_private *dev_priv)
>  	if (I915_READ(DBUF_CTL) & DBUF_POWER_STATE)
>  		DRM_ERROR("DBuf power disable timeout\n");
>  
> -	/* disable DPLL0 */
> -	I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) & ~LCPLL_PLL_ENABLE);
> -	if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> -		DRM_ERROR("Couldn't disable DPLL0\n");
> +	/*
> +	 * DMC assumes ownership of LCPLL and will get confused if we touch it.

This should get a FIXME - once we have dmc loading fixed up we require the firmware and there's no point in this check any more. Flexibility just because is something we simply don't have the developer and validation resources for.
-Daniel

> +	 */
> +	if (dev_priv->csr.dmc_payload) {
> +		/* disable DPLL0 */
> +		I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) &
> +					~LCPLL_PLL_ENABLE);
> +		if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> +			DRM_ERROR("Couldn't disable DPLL0\n");
> +	}
>  
>  	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
>  }
> --
> 2.0.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-08-26 14:10     ` Animesh Manna
@ 2015-09-02  8:54       ` Daniel Vetter
  2015-09-09 20:28         ` Animesh Manna
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-02  8:54 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx

On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
> 
> 
> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
> >On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
> >>Dmc will restore the csr program except DC9, cold boot,
> >>warm reset, PCI function level reset, and hibernate/suspend.
> >>
> >>intel_csr_load_program() function is used to load the firmware
> >>data from kernel memory to csr address space.
> >>
> >>All values of csr address space will be zero if it got reset and
> >>the first byte of csr program is always a non-zero if firmware
> >>is loaded successfully. Based on hardware status will load the
> >>firmware.
> >>
> >>Without this condition check if we overwrite the firmware data the
> >>counters exposed for dc5/dc6 (help for debugging) will be nullified.
> 
> Because of the reason mentioned just above, we need to block loading the firmware again.
> So a WARN_ON alone will not help.
> 
> 
> >>
> >>v1: Initial version.
> >>
> >>v2: Based on review comments from Daniel,
> >>- Added a check to know hardware status and load the firmware if not loaded.
> >>
> >>Cc: Daniel Vetter <daniel.vetter@intel.com>
> >>Cc: Damien Lespiau <damien.lespiau@intel.com>
> >>Cc: Imre Deak <imre.deak@intel.com>
> >>Cc: Sunil Kamath <sunil.kamath@intel.com>
> >>Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
> >>  1 file changed, 9 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> >>index ba1ae03..682cc26 100644
> >>--- a/drivers/gpu/drm/i915/intel_csr.c
> >>+++ b/drivers/gpu/drm/i915/intel_csr.c
> >>@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
> >>  		return;
> >>  	}
> >>+	/*
> >>+	 * Dmc will restore the csr the program except DC9, cold boot,
> >>+	 * warm reset, PCI function level reset, and hibernate/suspend.
> >>+	 * This condition will help to check if csr address space is reset/
> >>+	 * not loaded.
> >>+	 */
> >Atm we call this from driver load and resume, which doesn seem to cover
> >all the cases you mention in the comment. Should this be a WARN_ON
> >instead? Or do we have troubles in our init sequence where we load too
> >many times?
> 
> Yes, the above statement is taken from bspec to describe the special cases in which dmc will not restore the firmware.
> Agreed. In our case it is mainly cold boot and hibernate/suspend where we need to load the firmware again, so in my
> second sentence I mainly wanted to explain that this condition check was added for suspend-hibernate (reset)
> and cold boot (not loaded).
> 
> Anyway, the same api can later be used to load the firmware from anywhere, so my intention is to check whether the firmware is loaded or not.
> If it is already loaded, we do not overwrite the csr address space, in order to preserve the dc5/dc6 counter values.
> 
> Is the comment below clearer?
> 
> 	/*
> 	 * Dmc will restore the csr program except DC9, cold boot,
> 	 * warm reset, PCI function level reset, and hibernate/suspend.
> 	 * If firmware is restored by dmc then no need to load again which
> 	 * will keep the dc5/dc6 counter exposed by firmware.
> 	 */
> 
> No issue in init sequence.

That seems to still cover all the callers of the function afaics - we do
pci resets over suspend resume unconditionally. So I still don't
understand where exactly we try to load the dmc firmware in i915.ko when
it's already loaded.
-Daniel
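
For readers following the thread, the check under discussion boils down to the following. This is a minimal user-space sketch, not the i915 code: the struct, the array size and the function name are assumptions made up for illustration. It only models the idea from the patch description, namely that a reset leaves the csr address space zeroed, that a loaded program starts with a non-zero value, and that an already loaded program is not overwritten.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size of the fake program memory. */
#define SKETCH_CSR_WORDS 8

/* Hypothetical model of the csr address space. */
struct sketch_csr {
	uint32_t program[SKETCH_CSR_WORDS];
};

/*
 * Copies fw into the fake program memory unless the first word is
 * already non-zero, which is taken to mean "firmware still loaded".
 * Returns true if the program memory was written.
 */
static bool sketch_load_program(struct sketch_csr *csr,
				const uint32_t *fw, size_t n)
{
	size_t i;

	if (csr->program[0] != 0)
		return false;  /* already loaded, keep state intact */

	if (n > SKETCH_CSR_WORDS)
		n = SKETCH_CSR_WORDS;
	for (i = 0; i < n; i++)
		csr->program[i] = fw[i];
	return true;
}
```

A second call after a successful load returns false and leaves the memory untouched, which is the behaviour the patch relies on to keep the dc5/dc6 counters.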
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
  2015-08-31  1:03     ` Hindman, Gavin
@ 2015-09-02  8:58       ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-09-02  8:58 UTC (permalink / raw)
  To: Hindman, Gavin; +Cc: Bhardwaj, Rajneesh, intel-gfx, Vetter, Daniel

On Mon, Aug 31, 2015 at 01:03:03AM +0000, Hindman, Gavin wrote:
> Unless I'm misreading that would imply that we are moving away from our
> previous position that DMC FW is optional, correct?    Would this not
> render power-sequencing broken if a distro chose not to include DMC FW?

For upstream we never had the stance (and I wouldn't accept the patches
really without a really good reason) that we'll support runtime pm without
DMC firmware.

We already have a really bad time just trying to keep things working in 1
configuration (there's a patch from me to enable runtime pm by default
blocked on unfixed regressions which have been open for months and no one's
working on them), we absolutely don't have the resources to support any
crazy configurations that are theoretically possible. If a distro won't
include DMC then they won't get proper runtime pm and that's it.

Yes the original DMC patches (and the code still in tree) had code to
handle that case, but follow-up patches rip that complexity out.

If we are in a position where we can actually ship all the power features we
develop then we could consider supporting more combinations of features,
but right now I think that's in the very distant future.

Thanks, Daniel

> 
> Gavin Hindman
> Senior Program Manager
> SSG/OTC – Open Source Technology Center
> 
> 
> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of Daniel Vetter
> Sent: Wednesday, August 26, 2015 6:12 AM
> To: Manna, Animesh
> Cc: Bhardwaj, Rajneesh; intel-gfx@lists.freedesktop.org; Vetter, Daniel
> Subject: Re: [Intel-gfx] [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
> 
> On Wed, Aug 26, 2015 at 01:36:08AM +0530, Animesh Manna wrote:
> > When the display engine enters a low power state there is no need to
> > disable the cdclk pll, as the CSR firmware of dmc will take care of it.
> > If the pll is already enabled, the firmware execution sequence will be
> > blocked. This is one of the criteria for dmc to work properly.
> > 
> > v1: Initial version.
> > 
> > v2: Based on review comment from Daniel, added code comment.
> > 
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Damien Lespiau <damien.lespiau@intel.com>
> > Cc: Imre Deak <imre.deak@intel.com>
> > Cc: Sunil Kamath <sunil.kamath@intel.com>
> > Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> > Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> > Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_display.c | 14 ++++++++++----
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index f604ce1..b6bef20 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -5687,10 +5687,16 @@ void skl_uninit_cdclk(struct drm_i915_private *dev_priv)
> >  	if (I915_READ(DBUF_CTL) & DBUF_POWER_STATE)
> >  		DRM_ERROR("DBuf power disable timeout\n");
> >  
> > -	/* disable DPLL0 */
> > -	I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) & ~LCPLL_PLL_ENABLE);
> > -	if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> > -		DRM_ERROR("Couldn't disable DPLL0\n");
> > +	/*
> > +	 * DMC assumes ownership of LCPLL and will get confused if we touch it.
> 
> This should get a FIXME - once we have dmc loading fixed up we require the firmware and there's no point in this check any more. Flexibility just because is something we simply don't have the developer and validation resources for.
> -Daniel
> 
> > +	 */
> > +	if (dev_priv->csr.dmc_payload) {
> > +		/* disable DPLL0 */
> > +		I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) &
> > +					~LCPLL_PLL_ENABLE);
> > +		if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> > +			DRM_ERROR("Couldn't disable DPLL0\n");
> > +	}
> >  
> >  	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
> >  }
> > --
> > 2.0.2
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading Animesh Manna
  2015-08-26 13:10   ` Daniel Vetter
@ 2015-09-07 11:04   ` Sunil Kamath
  2015-09-07 16:22     ` Daniel Vetter
  2015-09-28  7:03     ` Daniel Vetter
  1 sibling, 2 replies; 51+ messages in thread
From: Sunil Kamath @ 2015-09-07 11:04 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2033 bytes --]

On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> Dmc will restore the csr program except DC9, cold boot,
> warm reset, PCI function level reset, and hibernate/suspend.
>
> intel_csr_load_program() function is used to load the firmware
> data from kernel memory to csr address space.
>
> All values of csr address space will be zero if it got reset and
> the first byte of csr program is always a non-zero if firmware
> is loaded successfully. Based on hardware status will load the
> firmware.
>
> Without this condition check if we overwrite the firmware data the
> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>
> v1: Initial version.
>
> v2: Based on review comments from Daniel,
> - Added a check to know hardware status and load the firmware if not loaded.
>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>   1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> index ba1ae03..682cc26 100644
> --- a/drivers/gpu/drm/i915/intel_csr.c
> +++ b/drivers/gpu/drm/i915/intel_csr.c
> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>   		return;
>   	}
>   
> +	/*
> +	 * Dmc will restore the csr the program except DC9, cold boot,
> +	 * warm reset, PCI function level reset, and hibernate/suspend.
> +	 * This condition will help to check if csr address space is reset/
> +	 * not loaded.
> +	 */
> +	if (I915_READ(CSR_PROGRAM_BASE))
> +		return;
> +
>   	mutex_lock(&dev_priv->csr_lock);
>   	fw_size = dev_priv->csr.dmc_fw_size;
>   	for (i = 0; i < fw_size; i++)

Valid fix and patch is ready for merge now.

Reviewed-by: A.Sunil Kamath <sunil.kamath@intel.com>


[-- Attachment #1.2: Type: text/html, Size: 3210 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 2/5] drm/i915/skl Remove the call for csr uninitialization from suspend path
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 2/5] drm/i915/skl Remove the call for csr uninitialization from suspend path Animesh Manna
@ 2015-09-07 11:05   ` Sunil Kamath
  0 siblings, 0 replies; 51+ messages in thread
From: Sunil Kamath @ 2015-09-07 11:05 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 1395 bytes --]

On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> This patch removes the function call that sets the firmware
> loading status to uninitialized during suspend.
>
> Dmc will restore the firmware in normal suspend. The previous
> patch added a check that reads the hardware status directly and loads
> the firmware if it got reset during resume from suspend-hibernation.
>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c | 6 ------
>   1 file changed, 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 1d88745..478101c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1015,12 +1015,6 @@ static int skl_suspend_complete(struct drm_i915_private *dev_priv)
>   {
>   	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
>   
> -	/*
> -	 * This is to ensure that CSR isn't identified as loaded before
> -	 * CSR-loading program is called during runtime-resume.
> -	 */
> -	intel_csr_load_status_set(dev_priv, FW_UNINITIALIZED);
> -
>   	skl_uninit_cdclk(dev_priv);
>   
>   	return 0;

Valid fix and patch is ready for merge now.

Reviewed-by: A.Sunil Kamath <sunil.kamath@intel.com>


[-- Attachment #1.2: Type: text/html, Size: 2330 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow Animesh Manna
@ 2015-09-07 11:06   ` Sunil Kamath
  2015-09-28  7:21   ` Daniel Vetter
  1 sibling, 0 replies; 51+ messages in thread
From: Sunil Kamath @ 2015-09-07 11:06 UTC (permalink / raw)
  To: Animesh Manna; +Cc: intel-gfx, Rajneesh Bhardwaj, Daniel Vetter


[-- Attachment #1.1: Type: text/plain, Size: 5561 bytes --]

On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> Mmio register access after dc6/dc5 entry is not allowed when
> DC6 power states are enabled according to bspec (bspec-id 0527),
> so enabling dc6 as the last call in suspend flow.
>
> v1: Initial version.
>
> v2: Based on review comment from Daniel,
> - created a separate patch for csr uninitialization set call.
>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
>   drivers/gpu/drm/i915/intel_drv.h        |  2 ++
>   drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
>   3 files changed, 22 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 478101c..fa66162 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1013,10 +1013,20 @@ static int i915_pm_resume(struct device *dev)
>   
>   static int skl_suspend_complete(struct drm_i915_private *dev_priv)
>   {
> +	enum csr_state state;
>   	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
>   
>   	skl_uninit_cdclk(dev_priv);
>   
> +	/* TODO: wait for a completion event or
> +	 * similar here instead of busy
> +	 * waiting using wait_for function.
> +	 */
> +	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
> +			FW_UNINITIALIZED, 1000);
> +	if (state == FW_LOADED)
> +		skl_enable_dc6(dev_priv);
> +
>   	return 0;
>   }
>   
> @@ -1063,6 +1073,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
>   {
>   	struct drm_device *dev = dev_priv->dev;
>   
> +	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
> +		skl_disable_dc6(dev_priv);
> +
>   	skl_init_cdclk(dev_priv);
>   	intel_csr_load_program(dev);
>   
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 81b7d77..9cb7d4e 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1118,6 +1118,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
>   void bxt_disable_dc9(struct drm_i915_private *dev_priv);
>   void skl_init_cdclk(struct drm_i915_private *dev_priv);
>   void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
> +void skl_enable_dc6(struct drm_i915_private *dev_priv);
> +void skl_disable_dc6(struct drm_i915_private *dev_priv);
>   void intel_dp_get_m_n(struct intel_crtc *crtc,
>   		      struct intel_crtc_state *pipe_config);
>   void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 821644d..23a3aa3 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -548,7 +548,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
>   		"DC6 already programmed to be disabled.\n");
>   }
>   
> -static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> +void skl_enable_dc6(struct drm_i915_private *dev_priv)
>   {
>   	uint32_t val;
>   
> @@ -565,7 +565,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
>   	POSTING_READ(DC_STATE_EN);
>   }
>   
> -static void skl_disable_dc6(struct drm_i915_private *dev_priv)
> +void skl_disable_dc6(struct drm_i915_private *dev_priv)
>   {
>   	uint32_t val;
>   
> @@ -626,10 +626,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>   				!I915_READ(HSW_PWR_WELL_BIOS),
>   				"Invalid for power well status to be enabled, unless done by the BIOS, \
>   				when request is to disable!\n");
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> -				power_well->data == SKL_DISP_PW_2) {
> +			if (power_well->data == SKL_DISP_PW_2) {
> +				if (GEN9_ENABLE_DC5(dev))
> +					gen9_disable_dc5(dev_priv);
>   				if (SKL_ENABLE_DC6(dev)) {
> -					skl_disable_dc6(dev_priv);
>   					/*
>   					 * DDI buffer programming unnecessary during driver-load/resume
>   					 * as it's already done during modeset initialization then.
> @@ -637,8 +637,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>   					 */
>   					if (!dev_priv->power_domains.initializing)
>   						intel_prepare_ddi(dev);
> -				} else {
> -					gen9_disable_dc5(dev_priv);
>   				}
>   			}
>   			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
> @@ -658,7 +656,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>   			POSTING_READ(HSW_PWR_WELL_DRIVER);
>   			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
>   
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> +			if (GEN9_ENABLE_DC5(dev) &&
>   				power_well->data == SKL_DISP_PW_2) {
>   				enum csr_state state;
>   				/* TODO: wait for a completion event or
> @@ -671,10 +669,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>   					DRM_ERROR("CSR firmware not ready (%d)\n",
>   							state);
>   				else
> -					if (SKL_ENABLE_DC6(dev))
> -						skl_enable_dc6(dev_priv);
> -					else
> -						gen9_enable_dc5(dev_priv);
> +					gen9_enable_dc5(dev_priv);
>   			}
>   		}
>   	}

Valid fix and patch is ready for merge now.
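
The ordering constraint from the commit message (no mmio access once DC6 is enabled, so DC6 entry goes last in the suspend flow) can be modelled with a small user-space sketch. Everything below is hypothetical: the struct, the state enum and the helper names are invented for illustration, and the polling wait on the firmware load status from the patch is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical firmware load states for this sketch. */
enum sketch_fw_state {
	SKETCH_FW_UNINITIALIZED,
	SKETCH_FW_LOADED,
	SKETCH_FW_FAILED
};

/* Hypothetical power-management model. */
struct sketch_pm {
	enum sketch_fw_state fw_state;
	bool cdclk_on;
	bool dc6_enabled;
	int mmio_after_dc6;  /* register accesses seen after DC6 enable */
};

/* Records a violation if a register is touched while DC6 is enabled. */
static void sketch_mmio_access(struct sketch_pm *pm)
{
	if (pm->dc6_enabled)
		pm->mmio_after_dc6++;
}

static void sketch_uninit_cdclk(struct sketch_pm *pm)
{
	sketch_mmio_access(pm);
	pm->cdclk_on = false;
}

static void sketch_enable_dc6(struct sketch_pm *pm)
{
	sketch_mmio_access(pm);  /* the enable write happens before DC6 is on */
	pm->dc6_enabled = true;
}

/* DC6 entry is the last step, and only if the firmware is loaded. */
static int sketch_suspend_complete(struct sketch_pm *pm)
{
	sketch_uninit_cdclk(pm);
	if (pm->fw_state == SKETCH_FW_LOADED)
		sketch_enable_dc6(pm);
	return 0;
}
```

Swapping the two calls in sketch_suspend_complete() would make mmio_after_dc6 non-zero, which is the kind of access the reordering is meant to avoid.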

Reviewed-by: A.Sunil Kamath <sunil.kamath@intel.com>


[-- Attachment #1.2: Type: text/html, Size: 6620 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present Animesh Manna
  2015-08-26 13:11   ` Daniel Vetter
@ 2015-09-07 11:07   ` Sunil Kamath
  1 sibling, 0 replies; 51+ messages in thread
From: Sunil Kamath @ 2015-09-07 11:07 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx, Rajneesh Bhardwaj


[-- Attachment #1.1: Type: text/plain, Size: 2042 bytes --]

On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> When the display engine enters a low power state there is no need to
> disable the cdclk pll, as the CSR firmware of dmc will take care of it.
> If the pll is already enabled, the firmware execution sequence will be
> blocked. This is one of the criteria for dmc to work properly.
>
> v1: Initial version.
>
> v2: Based on review comment from Daniel, added code comment.
>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_display.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f604ce1..b6bef20 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -5687,10 +5687,16 @@ void skl_uninit_cdclk(struct drm_i915_private *dev_priv)
>   	if (I915_READ(DBUF_CTL) & DBUF_POWER_STATE)
>   		DRM_ERROR("DBuf power disable timeout\n");
>   
> -	/* disable DPLL0 */
> -	I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) & ~LCPLL_PLL_ENABLE);
> -	if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> -		DRM_ERROR("Couldn't disable DPLL0\n");
> +	/*
> +	 * DMC assumes ownership of LCPLL and will get confused if we touch it.
> +	 */
> +	if (dev_priv->csr.dmc_payload) {
> +		/* disable DPLL0 */
> +		I915_WRITE(LCPLL1_CTL, I915_READ(LCPLL1_CTL) &
> +					~LCPLL_PLL_ENABLE);
> +		if (wait_for(!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_LOCK), 1))
> +			DRM_ERROR("Couldn't disable DPLL0\n");
> +	}
>   
>   	intel_display_power_put(dev_priv, POWER_DOMAIN_PLLS);
>   }

Valid fix and patch is ready for merge now.

Reviewed-by: A.Sunil Kamath <sunil.kamath@intel.com>


[-- Attachment #1.2: Type: text/html, Size: 3319 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 5/5] drm/i915/skl: Block disable call for pw1 if dmc firmware is present.
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 5/5] drm/i915/skl: Block disable call for pw1 if dmc " Animesh Manna
@ 2015-09-07 11:09   ` Sunil Kamath
  2015-09-28  7:24     ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Sunil Kamath @ 2015-09-07 11:09 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 2400 bytes --]

On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> Another criterion for dmc to work as expected is that pw1 must be
> enabled by the driver; dmc will shut it off in its execution
> sequence. If it is already disabled by the driver, dmc gets confused
> and behaves differently than expected, as found during the pc10 entry
> issue on skl.
>
> So before we disable power well 1, check whether the dmc firmware is
> present; if it is, the driver will not disable power well 1. If for
> any reason the firmware is not present or failed to load, we can shut
> off power well 1, which will save some power.
>
> skl is currently fully dependent on dmc to reach the lowest possible
> power state (dc6), but the same is not true for bxt. The display
> engine can enter dc9 without dmc, hence the disable call is not
> blocked on bxt.
>
> v1: Initial version.
>
> v2: Rebased as per current patch series.
>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_runtime_pm.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 23a3aa3..340f386 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -652,9 +652,15 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>   		}
>   	} else {
>   		if (enable_requested) {
> -			I915_WRITE(HSW_PWR_WELL_DRIVER,	tmp & ~req_mask);
> -			POSTING_READ(HSW_PWR_WELL_DRIVER);
> -			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
> +			if (IS_SKYLAKE(dev) &&
> +				(power_well->data == SKL_DISP_PW_1) &&
> +				(intel_csr_load_status_get(dev_priv) == FW_LOADED))
> +				DRM_DEBUG_KMS("Not Disabling PW1, dmc will handle\n");
> +			else {
> +				I915_WRITE(HSW_PWR_WELL_DRIVER,	tmp & ~req_mask);
> +				POSTING_READ(HSW_PWR_WELL_DRIVER);
> +				DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
> +			}
>   
>   			if (GEN9_ENABLE_DC5(dev) &&
>   				power_well->data == SKL_DISP_PW_2) {

Valid fix and patch is ready for merge now.
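
The condition added by this patch can be read as a small predicate. The sketch below is user-space code with hypothetical names (the enums and the function are invented for illustration); it only mirrors the three-way test visible in the hunk: Skylake, power well 1, firmware loaded.

```c
#include <stdbool.h>

/* Hypothetical identifiers for this sketch, not the i915 definitions. */
enum sketch_well { SKETCH_PW_1, SKETCH_PW_2 };
enum sketch_fw { SKETCH_FW_UNINITIALIZED, SKETCH_FW_LOADED, SKETCH_FW_FAILED };

/*
 * Returns true if the driver should write the disable request itself,
 * false if it should leave the power well for dmc to shut off.
 */
static bool sketch_driver_disables_well(bool is_skl, enum sketch_well well,
					enum sketch_fw fw)
{
	if (is_skl && well == SKETCH_PW_1 && fw == SKETCH_FW_LOADED)
		return false;  /* dmc will handle power well 1 */
	return true;
}
```

Only the combination of skl, power well 1 and a loaded firmware skips the driver write; every other case, including bxt, keeps the existing disable path.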

Reviewed-by: A.Sunil Kamath <sunil.kamath@intel.com>


[-- Attachment #1.2: Type: text/html, Size: 3601 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-07 11:04   ` Sunil Kamath
@ 2015-09-07 16:22     ` Daniel Vetter
  2015-09-09 20:33       ` Animesh Manna
  2015-09-28  7:03     ` Daniel Vetter
  1 sibling, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-07 16:22 UTC (permalink / raw)
  To: Sunil Kamath; +Cc: Daniel Vetter, intel-gfx

On Mon, Sep 07, 2015 at 04:34:30PM +0530, Sunil Kamath wrote:
> On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> >Dmc will restore the csr program except DC9, cold boot,
> >warm reset, PCI function level reset, and hibernate/suspend.
> >
> >intel_csr_load_program() function is used to load the firmware
> >data from kernel memory to csr address space.
> >
> >All values of csr address space will be zero if it got reset and
> >the first byte of csr program is always a non-zero if firmware
> >is loaded successfuly. Based on hardware status will load the
> >firmware.
> >
> >Without this condition check if we overwrite the firmware data the
> >counters exposed for dc5/dc6 (help for debugging) will be nullified.
> >
> >v1: Initial version.
> >
> >v2: Based on review comments from Daniel,
> >- Added a check to know hardware status and load the firmware if not loaded.
> >
> >Cc: Daniel Vetter <daniel.vetter@intel.com>
> >Cc: Damien Lespiau <damien.lespiau@intel.com>
> >Cc: Imre Deak <imre.deak@intel.com>
> >Cc: Sunil Kamath <sunil.kamath@intel.com>
> >Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >---
> >  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> >index ba1ae03..682cc26 100644
> >--- a/drivers/gpu/drm/i915/intel_csr.c
> >+++ b/drivers/gpu/drm/i915/intel_csr.c
> >@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
> >  		return;
> >  	}
> >+	/*
> >+	 * Dmc will restore the csr the program except DC9, cold boot,
> >+	 * warm reset, PCI function level reset, and hibernate/suspend.
> >+	 * This condition will help to check if csr address space is reset/
> >+	 * not loaded.
> >+	 */
> >+	if (I915_READ(CSR_PROGRAM_BASE))
> >+		return;
> >+
> >  	mutex_lock(&dev_priv->csr_lock);
> >  	fw_size = dev_priv->csr.dmc_fw_size;
> >  	for (i = 0; i < fw_size; i++)
> 
> Valid fix and patch is ready for merge now.

I still have a question open, so I'll wait until that's answered ...

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-02  8:54       ` Daniel Vetter
@ 2015-09-09 20:28         ` Animesh Manna
  2015-09-10 14:45           ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-09-09 20:28 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx



On 9/2/2015 2:24 PM, Daniel Vetter wrote:
> On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>>
>> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>>> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>>>> Dmc will restore the csr program except DC9, cold boot,
>>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>>
>>>> intel_csr_load_program() function is used to load the firmware
>>>> data from kernel memory to csr address space.
>>>>
>>>> All values of csr address space will be zero if it got reset and
>>>> the first byte of csr program is always a non-zero if firmware
>>>> is loaded successfuly. Based on hardware status will load the
>>>> firmware.
>>>>
>>>> Without this condition check if we overwrite the firmware data the
>>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>> Bacause of the above reason mentioned just above we need to block firmware loading again.
>> So only WARN_ON will not help.
>>
>>
>>>> v1: Initial version.
>>>>
>>>> v2: Based on review comments from Daniel,
>>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>>
>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>>   1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>> index ba1ae03..682cc26 100644
>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>>   		return;
>>>>   	}
>>>> +	/*
>>>> +	 * Dmc will restore the csr the program except DC9, cold boot,
>>>> +	 * warm reset, PCI function level reset, and hibernate/suspend.
>>>> +	 * This condition will help to check if csr address space is reset/
>>>> +	 * not loaded.
>>>> +	 */
>>> Atm we call this from driver load and resume, which doesn seem to cover
>>> all the cases you mention in the comment. Should this be a WARN_ON
>>> instead? Or do we have troubles in our init sequence where we load too
>>> many times?
>> Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
>> Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
>> second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
>> and cold boot(not loaded).
>>
>> Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
>> If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
>>
>> Can the below comment more clear to you.
>>
>> 	/*
>> 	 * Dmc will restore the csr the program except DC9, cold boot,
>> 	 * warm reset, PCI function level reset, and hibernate/suspend.
>> 	 * If firmware is restored by dmc then no need to load again which
>> 	 * will keep the dc5/dc6 counter exposed by firmware.
>> 	 */
>>
>> No issue in init sequence.
> That seems to still cover all the callers of the function afaics - we do
> pci resets over suspend resume unconditionally. So I still don't
> understand where exactly we try to load the dmc firmware in i915.ko when
> it's already loaded.

During resume, intel_csr_load_program() is called from
intel_runtime_resume():

intel_runtime_resume() -> skl_resume_prepare() -> intel_csr_load_program()

During PC10 entry testing I can see that the DMC always restores the
firmware. As you mentioned, a PCI reset can happen unconditionally, but
even then intel_runtime_resume() is called during resume, and the
firmware is loaded based on the register read at the CSR base address.
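
As an illustration of the guard described above, here is a minimal sketch that models the CSR program memory as a plain array. It is not the driver code: struct csr_model, CSR_WORDS and csr_load_if_reset() are hypothetical names, and the real driver reads and writes MMIO registers instead. It relies on the assumption stated in the commit message that a reset CSR space reads as zero and that the first word of a loaded program is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CSR_WORDS 8 /* illustrative size, not the real CSR window */

/* Hypothetical model of the CSR program memory. */
struct csr_model {
	uint32_t mem[CSR_WORDS];
};

/*
 * Copies the firmware into the model unless the first word is already
 * non-zero, which is taken to mean the program is still present.
 * Returns true if a copy happened.
 */
static bool csr_load_if_reset(struct csr_model *csr,
			      const uint32_t *fw, size_t fw_words)
{
	assert(fw_words <= CSR_WORDS);

	if (csr->mem[0] != 0)
		return false; /* already loaded: keep existing contents */

	for (size_t i = 0; i < fw_words; i++)
		csr->mem[i] = fw[i];

	return true;
}
```

In this model a second call leaves the memory untouched, which is the property the patch relies on to keep the dc5/dc6 counters intact.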

-Animesh

> -Daniel



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-07 16:22     ` Daniel Vetter
@ 2015-09-09 20:33       ` Animesh Manna
  0 siblings, 0 replies; 51+ messages in thread
From: Animesh Manna @ 2015-09-09 20:33 UTC (permalink / raw)
  To: Daniel Vetter, Sunil Kamath; +Cc: Daniel Vetter, intel-gfx



On 9/7/2015 9:52 PM, Daniel Vetter wrote:
> On Mon, Sep 07, 2015 at 04:34:30PM +0530, Sunil Kamath wrote:
>> On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
>>> Dmc will restore the csr program except DC9, cold boot,
>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>
>>> intel_csr_load_program() function is used to load the firmware
>>> data from kernel memory to csr address space.
>>>
>>> All values of csr address space will be zero if it got reset and
>>> the first byte of csr program is always a non-zero if firmware
>>> is loaded successfuly. Based on hardware status will load the
>>> firmware.
>>>
>>> Without this condition check if we overwrite the firmware data the
>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>>>
>>> v1: Initial version.
>>>
>>> v2: Based on review comments from Daniel,
>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>
>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>> Cc: Imre Deak <imre.deak@intel.com>
>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>   1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>> index ba1ae03..682cc26 100644
>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>   		return;
>>>   	}
>>> +	/*
>>> +	 * Dmc will restore the csr the program except DC9, cold boot,
>>> +	 * warm reset, PCI function level reset, and hibernate/suspend.
>>> +	 * This condition will help to check if csr address space is reset/
>>> +	 * not loaded.
>>> +	 */
>>> +	if (I915_READ(CSR_PROGRAM_BASE))
>>> +		return;
>>> +
>>>   	mutex_lock(&dev_priv->csr_lock);
>>>   	fw_size = dev_priv->csr.dmc_fw_size;
>>>   	for (i = 0; i < fw_size; i++)
>> Valid fix and patch is ready for merge now.
> I still have a question open so will wait until that's anserwered ...
>
> Thanks, Daniel

I have shared my view in the other thread.

-Animesh



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-09 20:28         ` Animesh Manna
@ 2015-09-10 14:45           ` Daniel Vetter
  2015-09-10 19:05             ` Animesh Manna
  2015-09-10 19:06             ` Animesh Manna
  0 siblings, 2 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-09-10 14:45 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx

On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
> 
> 
> On 9/2/2015 2:24 PM, Daniel Vetter wrote:
> >On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
> >>
> >>On 8/26/2015 6:40 PM, Daniel Vetter wrote:
> >>>On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
> >>>>Dmc will restore the csr program except DC9, cold boot,
> >>>>warm reset, PCI function level reset, and hibernate/suspend.
> >>>>
> >>>>intel_csr_load_program() function is used to load the firmware
> >>>>data from kernel memory to csr address space.
> >>>>
> >>>>All values of csr address space will be zero if it got reset and
> >>>>the first byte of csr program is always a non-zero if firmware
> >>>>is loaded successfuly. Based on hardware status will load the
> >>>>firmware.
> >>>>
> >>>>Without this condition check if we overwrite the firmware data the
> >>>>counters exposed for dc5/dc6 (help for debugging) will be nullified.
> >>Bacause of the above reason mentioned just above we need to block firmware loading again.
> >>So only WARN_ON will not help.
> >>
> >>
> >>>>v1: Initial version.
> >>>>
> >>>>v2: Based on review comments from Daniel,
> >>>>- Added a check to know hardware status and load the firmware if not loaded.
> >>>>
> >>>>Cc: Daniel Vetter <daniel.vetter@intel.com>
> >>>>Cc: Damien Lespiau <damien.lespiau@intel.com>
> >>>>Cc: Imre Deak <imre.deak@intel.com>
> >>>>Cc: Sunil Kamath <sunil.kamath@intel.com>
> >>>>Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >>>>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >>>>---
> >>>>  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
> >>>>  1 file changed, 9 insertions(+)
> >>>>
> >>>>diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> >>>>index ba1ae03..682cc26 100644
> >>>>--- a/drivers/gpu/drm/i915/intel_csr.c
> >>>>+++ b/drivers/gpu/drm/i915/intel_csr.c
> >>>>@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
> >>>>  		return;
> >>>>  	}
> >>>>+	/*
> >>>>+	 * Dmc will restore the csr the program except DC9, cold boot,
> >>>>+	 * warm reset, PCI function level reset, and hibernate/suspend.
> >>>>+	 * This condition will help to check if csr address space is reset/
> >>>>+	 * not loaded.
> >>>>+	 */
> >>>Atm we call this from driver load and resume, which doesn seem to cover
> >>>all the cases you mention in the comment. Should this be a WARN_ON
> >>>instead? Or do we have troubles in our init sequence where we load too
> >>>many times?
> >>Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
> >>Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
> >>second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
> >>and cold boot(not loaded).
> >>
> >>Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
> >>If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
> >>
> >>Can the below comment more clear to you.
> >>
> >>	/*
> >>	 * Dmc will restore the csr the program except DC9, cold boot,
> >>	 * warm reset, PCI function level reset, and hibernate/suspend.
> >>	 * If firmware is restored by dmc then no need to load again which
> >>	 * will keep the dc5/dc6 counter exposed by firmware.
> >>	 */
> >>
> >>No issue in init sequence.
> >That seems to still cover all the callers of the function afaics - we do
> >pci resets over suspend resume unconditionally. So I still don't
> >understand where exactly we try to load the dmc firmware in i915.ko when
> >it's already loaded.
> 
> During resume intel_csr_load_program() will be called from
> intel_runtime_resume().
> 
> intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
> 
> During Pc10 entry testing I can see dmc is restoring back the firmware always,
> but as you mentioned pci-reset can happen unconditionally, but still then
> also during resume intel_runtime_resume() will be called and based on
> register read of csr-base-address firmware loading will happen.

But in your comment you're saying it won't get restored in case of dc9 and
suspend. So that seems to mismatch what you're saying here (and what the
commit message says) and what the code does. And this function here is
called for resume after suspend/hibernate only.

So I still see a mismatch between what you're saying here and what the
code does vs. what the comment above says. It would be clearer if there
were just no comment. And we don't call intel_csr_load_program() from
random places just for luck; if there are unnecessary calls of it, that's
kind of a bug in the code (and we should just remove those calls).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-10 14:45           ` Daniel Vetter
@ 2015-09-10 19:05             ` Animesh Manna
  2015-09-10 19:06             ` Animesh Manna
  1 sibling, 0 replies; 51+ messages in thread
From: Animesh Manna @ 2015-09-10 19:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx



On 9/10/2015 8:15 PM, Daniel Vetter wrote:
> On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
>>
>> On 9/2/2015 2:24 PM, Daniel Vetter wrote:
>>> On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>>>> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>>>>> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>>>>>> Dmc will restore the csr program except DC9, cold boot,
>>>>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>
>>>>>> intel_csr_load_program() function is used to load the firmware
>>>>>> data from kernel memory to csr address space.
>>>>>>
>>>>>> All values of csr address space will be zero if it got reset and
>>>>>> the first byte of csr program is always a non-zero if firmware
>>>>>> is loaded successfuly. Based on hardware status will load the
>>>>>> firmware.
>>>>>>
>>>>>> Without this condition check if we overwrite the firmware data the
>>>>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>>>> Bacause of the above reason mentioned just above we need to block firmware loading again.
>>>> So only WARN_ON will not help.
>>>>
>>>>
>>>>>> v1: Initial version.
>>>>>>
>>>>>> v2: Based on review comments from Daniel,
>>>>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>>>>
>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>>>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>>>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>>>>   1 file changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>>>> index ba1ae03..682cc26 100644
>>>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>>>>   		return;
>>>>>>   	}
>>>>>> +	/*
>>>>>> +	 * Dmc will restore the csr the program except DC9, cold boot,
>>>>>> +	 * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>> +	 * This condition will help to check if csr address space is reset/
>>>>>> +	 * not loaded.
>>>>>> +	 */
>>>>> Atm we call this from driver load and resume, which doesn seem to cover
>>>>> all the cases you mention in the comment. Should this be a WARN_ON
>>>>> instead? Or do we have troubles in our init sequence where we load too
>>>>> many times?
>>>> Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
>>>> Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
>>>> second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
>>>> and cold boot(not loaded).
>>>>
>>>> Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
>>>> If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
>>>>
>>>> Can the below comment more clear to you.
>>>>
>>>> 	/*
>>>> 	 * Dmc will restore the csr the program except DC9, cold boot,
>>>> 	 * warm reset, PCI function level reset, and hibernate/suspend.
>>>> 	 * If firmware is restored by dmc then no need to load again which
>>>> 	 * will keep the dc5/dc6 counter exposed by firmware.
>>>> 	 */
>>>>
>>>> No issue in init sequence.
>>> That seems to still cover all the callers of the function afaics - we do
>>> pci resets over suspend resume unconditionally. So I still don't
>>> understand where exactly we try to load the dmc firmware in i915.ko when
>>> it's already loaded.
>> During resume intel_csr_load_program() will be called from
>> intel_runtime_resume().
>>
>> intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
>>
>> During Pc10 entry testing I can see dmc is restoring back the firmware always,
>> but as you mentioned pci-reset can happen unconditionally, but still then
>> also during resume intel_runtime_resume() will be called and based on
>> register read of csr-base-address firmware loading will happen.
> But in your comment you're saying it won't get restored in case of dc9 and
> suspend. So that seems to mismatch what you're saying here (and what the
> commit message says) and what the code does. And this function here is
> called for resume after suspend/hibernate only.

The PC10 entry explanation I gave is for Skylake. DC9 is not possible on
Skylake. I think you are confusing DC6 with DC9. On Skylake, PC10 is
reached by entering DC6 (not DC9). DC9 is the lowest possible state on
Broxton and is not present on Skylake.

Here intel_csr_load_program() will be used for both Skylake and Broxton, and
the execution flow will differ in the suspend/resume case, which I think is
what is confusing you.

I am ready to explain in detail. It would be good to discuss a specific
use-case scenario and its software design for a specific platform. Another
point: as the DMC-related code for Broxton is not merged yet, it is better to
first close the design for Skylake. I added the DC9 description to the
comment with the future in mind. If you want, I can remove it for now and
add it later in the BXT patch series for enabling DMC. I will wait for your
reply.
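
To make the platform distinction concrete, here is a small sketch of the mapping described above. The enum and function names are hypothetical and are not the i915 identifiers. It covers only DC states: cold boot, resets and system suspend/hibernate are separate cases in the quoted comment and are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers for illustration; not the i915 names. */
enum platform { PLATFORM_SKYLAKE, PLATFORM_BROXTON };
enum dc_state { DC_STATE_DC5 = 5, DC_STATE_DC6 = 6, DC_STATE_DC9 = 9 };

/* Deepest DC state each platform can reach, per the discussion above. */
static enum dc_state deepest_dc_state(enum platform p)
{
	assert(p == PLATFORM_SKYLAKE || p == PLATFORM_BROXTON);

	return p == PLATFORM_BROXTON ? DC_STATE_DC9 : DC_STATE_DC6;
}

/*
 * The quoted comment lists DC9 among the cases where the DMC does not
 * restore the CSR program, so the driver would have to reload it after
 * leaving that state. DC5 and DC6 are not in that list.
 */
static bool driver_must_reload_after(enum dc_state s)
{
	return s == DC_STATE_DC9;
}
```

Under these assumptions, a reload after the deepest DC state is needed only on Broxton, which is why the DC9 wording in the comment matters for the BXT series rather than for Skylake.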

-Animesh

>
> So I still see a mismatch between what you're saying here and what the
> code does vs. what the comment above says. It would be clear if there's
> just no comment. And we don't call intel_csr_load_program from random
> places just for luck, if there's unecessary calls of that its kinda a bug
> in the code (and we should just remove it).
> -Daniel

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-10 14:45           ` Daniel Vetter
  2015-09-10 19:05             ` Animesh Manna
@ 2015-09-10 19:06             ` Animesh Manna
  2015-09-14  7:46               ` Daniel Vetter
  1 sibling, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-09-10 19:06 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx



On 9/10/2015 8:15 PM, Daniel Vetter wrote:
> On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
>>
>> On 9/2/2015 2:24 PM, Daniel Vetter wrote:
>>> On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>>>> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>>>>> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>>>>>> Dmc will restore the csr program except DC9, cold boot,
>>>>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>
>>>>>> intel_csr_load_program() function is used to load the firmware
>>>>>> data from kernel memory to csr address space.
>>>>>>
>>>>>> All values of csr address space will be zero if it got reset and
>>>>>> the first byte of csr program is always a non-zero if firmware
>>>>>> is loaded successfuly. Based on hardware status will load the
>>>>>> firmware.
>>>>>>
>>>>>> Without this condition check if we overwrite the firmware data the
>>>>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>>>> Bacause of the above reason mentioned just above we need to block firmware loading again.
>>>> So only WARN_ON will not help.
>>>>
>>>>
>>>>>> v1: Initial version.
>>>>>>
>>>>>> v2: Based on review comments from Daniel,
>>>>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>>>>
>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>>>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>>>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>>>>   1 file changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>>>> index ba1ae03..682cc26 100644
>>>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>>>>   		return;
>>>>>>   	}
>>>>>> +	/*
>>>>>> +	 * Dmc will restore the csr the program except DC9, cold boot,
>>>>>> +	 * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>> +	 * This condition will help to check if csr address space is reset/
>>>>>> +	 * not loaded.
>>>>>> +	 */
>>>>> Atm we call this from driver load and resume, which doesn seem to cover
>>>>> all the cases you mention in the comment. Should this be a WARN_ON
>>>>> instead? Or do we have troubles in our init sequence where we load too
>>>>> many times?
>>>> Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
>>>> Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
>>>> second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
>>>> and cold boot(not loaded).
>>>>
>>>> Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
>>>> If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
>>>>
>>>> Can the below comment more clear to you.
>>>>
>>>> 	/*
>>>> 	 * Dmc will restore the csr the program except DC9, cold boot,
>>>> 	 * warm reset, PCI function level reset, and hibernate/suspend.
>>>> 	 * If firmware is restored by dmc then no need to load again which
>>>> 	 * will keep the dc5/dc6 counter exposed by firmware.
>>>> 	 */
>>>>
>>>> No issue in init sequence.
>>> That seems to still cover all the callers of the function afaics - we do
>>> pci resets over suspend resume unconditionally. So I still don't
>>> understand where exactly we try to load the dmc firmware in i915.ko when
>>> it's already loaded.
>> During resume intel_csr_load_program() will be called from
>> intel_runtime_resume().
>>
>> intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
>>
>> During Pc10 entry testing I can see dmc is restoring back the firmware always,
>> but as you mentioned pci-reset can happen unconditionally, but still then
>> also during resume intel_runtime_resume() will be called and based on
>> register read of csr-base-address firmware loading will happen.
> But in your comment you're saying it won't get restored in case of dc9 and
> suspend. So that seems to mismatch what you're saying here (and what the
> commit message says) and what the code does. And this function here is
> called for resume after suspend/hibernate only.

The PC10 entry explanation I gave is for Skylake. DC9 is not possible on
Skylake. I think you are confusing DC6 with DC9. On Skylake, PC10 is
reached by entering DC6 (not DC9). DC9 is the lowest possible state on
Broxton and is not present on Skylake.

Here intel_csr_load_program() will be used for both Skylake and Broxton, and
the execution flow will differ in the suspend/resume case, which I think is
what is confusing you.

I am ready to explain in detail. It would be good to discuss a specific
use-case scenario and its software design for a specific platform. Another
point: as the DMC-related code for Broxton is not merged yet, it is better to
first close the design for Skylake. I added the DC9 description to the
comment with the future in mind. If you want, I can remove it for now and
add it later in the BXT patch series for enabling DMC. I will wait for your
reply.

-Animesh

>
> So I still see a mismatch between what you're saying here and what the
> code does vs. what the comment above says. It would be clear if there's
> just no comment. And we don't call intel_csr_load_program from random
> places just for luck, if there's unecessary calls of that its kinda a bug
> in the code (and we should just remove it).
> -Daniel

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-10 19:06             ` Animesh Manna
@ 2015-09-14  7:46               ` Daniel Vetter
  2015-09-16 19:23                 ` Animesh Manna
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-14  7:46 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx

On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
> 
> 
> On 9/10/2015 8:15 PM, Daniel Vetter wrote:
> >On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
> >>
> >>On 9/2/2015 2:24 PM, Daniel Vetter wrote:
> >>>On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
> >>>>On 8/26/2015 6:40 PM, Daniel Vetter wrote:
> >>>>>On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
> >>>>>>Dmc will restore the csr program except DC9, cold boot,
> >>>>>>warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>
> >>>>>>intel_csr_load_program() function is used to load the firmware
> >>>>>>data from kernel memory to csr address space.
> >>>>>>
> >>>>>>All values of csr address space will be zero if it got reset and
> >>>>>>the first byte of csr program is always a non-zero if firmware
> >>>>>>is loaded successfuly. Based on hardware status will load the
> >>>>>>firmware.
> >>>>>>
> >>>>>>Without this condition check if we overwrite the firmware data the
> >>>>>>counters exposed for dc5/dc6 (help for debugging) will be nullified.
> >>>>Bacause of the above reason mentioned just above we need to block firmware loading again.
> >>>>So only WARN_ON will not help.
> >>>>
> >>>>
> >>>>>>v1: Initial version.
> >>>>>>
> >>>>>>v2: Based on review comments from Daniel,
> >>>>>>- Added a check to know hardware status and load the firmware if not loaded.
> >>>>>>
> >>>>>>Cc: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>Cc: Damien Lespiau <damien.lespiau@intel.com>
> >>>>>>Cc: Imre Deak <imre.deak@intel.com>
> >>>>>>Cc: Sunil Kamath <sunil.kamath@intel.com>
> >>>>>>Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >>>>>>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >>>>>>---
> >>>>>>  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
> >>>>>>  1 file changed, 9 insertions(+)
> >>>>>>
> >>>>>>diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>index ba1ae03..682cc26 100644
> >>>>>>--- a/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>+++ b/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
> >>>>>>  		return;
> >>>>>>  	}
> >>>>>>+	/*
> >>>>>>+	 * Dmc will restore the csr the program except DC9, cold boot,
> >>>>>>+	 * warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>+	 * This condition will help to check if csr address space is reset/
> >>>>>>+	 * not loaded.
> >>>>>>+	 */
> >>>>>Atm we call this from driver load and resume, which doesn seem to cover
> >>>>>all the cases you mention in the comment. Should this be a WARN_ON
> >>>>>instead? Or do we have troubles in our init sequence where we load too
> >>>>>many times?
> >>>>Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
> >>>>Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
> >>>>second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
> >>>>and cold boot(not loaded).
> >>>>
> >>>>Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
> >>>>If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
> >>>>
> >>>>Can the below comment more clear to you.
> >>>>
> >>>>	/*
> >>>>	 * Dmc will restore the csr the program except DC9, cold boot,
> >>>>	 * warm reset, PCI function level reset, and hibernate/suspend.
> >>>>	 * If firmware is restored by dmc then no need to load again which
> >>>>	 * will keep the dc5/dc6 counter exposed by firmware.
> >>>>	 */
> >>>>
> >>>>No issue in init sequence.
> >>>That seems to still cover all the callers of the function afaics - we do
> >>>pci resets over suspend resume unconditionally. So I still don't
> >>>understand where exactly we try to load the dmc firmware in i915.ko when
> >>>it's already loaded.
> >>During resume intel_csr_load_program() will be called from
> >>intel_runtime_resume().
> >>
> >>intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
> >>
> >>During Pc10 entry testing I can see dmc is restoring back the firmware always,
> >>but as you mentioned pci-reset can happen unconditionally, but still then
> >>also during resume intel_runtime_resume() will be called and based on
> >>register read of csr-base-address firmware loading will happen.
> >But in your comment you're saying it won't get restored in case of dc9 and
> >suspend. So that seems to mismatch what you're saying here (and what the
> >commit message says) and what the code does. And this function here is
> >called for resume after suspend/hibernate only.
> 
> pc10 entry explanation I told is for skylake. dc9 in skylake is not possible.
> I think you are confusing between dc6 and dc9. Pc10 can be achieved by
> entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state
> for broxton which is not present for skylake.

I have no idea at all about different pc levels on skl. What I'm talking
about is system suspend/resume and driver load, which are the places this
function gets called. At least afaics.

> Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
> execution flow will be different in case of suspend/resume which I think is confusing
> you.

That seems like really important information. What's different on bxt?
These are the kind of details you should explain in the commit message ...

> I am ready explain you in detail. It will be good if we discuss specific use-case scenario
> and itz software design for specific platform. Another point - as dmc related code for
> broxton is not merged better first we close design for skylake. Now, I have added dc9
> description in comment thinking of future. If you want I can remove for now and later
> can add in bxt patch series for enabling dmc. Will wait for your reply.

This question here isn't about the overall design and how to handle power
wells in skl/bxt. That's a separate discussion and tracked somewhere else.
I'm really just confused about when exactly we need to reload the firmware,
and why we need a runtime check for that. Normally we should know when to
reload the firmware and just either reload or not, without checking hw
state. And I don't like checking for hw state since at least in the past
that kind of code ended up being fragile - it's an illusion that it does
the right thing no matter what, since often there's other tricky ordering
constraints. And if you have automatic duct-tape like this then no one will
ever spot those other, harder to spot issues, until an expensive customer
escalation happens.

So what I want to know here is:
- When exactly do we need to reload dmc firmware.
- What exactly is the reason why we can't make that decision statically in
  the code (by calling csr_load at the right spots).

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-14  7:46               ` Daniel Vetter
@ 2015-09-16 19:23                 ` Animesh Manna
  2015-09-23  7:57                   ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-09-16 19:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx



On 9/14/2015 1:16 PM, Daniel Vetter wrote:
> On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
>>
>> On 9/10/2015 8:15 PM, Daniel Vetter wrote:
>>> On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
>>>> On 9/2/2015 2:24 PM, Daniel Vetter wrote:
>>>>> On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>>>>>> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>>>>>>> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>>>>>>>> Dmc will restore the csr program except DC9, cold boot,
>>>>>>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>
>>>>>>>> intel_csr_load_program() function is used to load the firmware
>>>>>>>> data from kernel memory to csr address space.
>>>>>>>>
>>>>>>>> All values of csr address space will be zero if it got reset and
>>>>>>>> the first byte of csr program is always a non-zero if firmware
>>>>>>>> is loaded successfuly. Based on hardware status will load the
>>>>>>>> firmware.
>>>>>>>>
>>>>>>>> Without this condition check if we overwrite the firmware data the
>>>>>>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>>>>>> Bacause of the above reason mentioned just above we need to block firmware loading again.
>>>>>> So only WARN_ON will not help.
>>>>>>
>>>>>>
>>>>>>>> v1: Initial version.
>>>>>>>>
>>>>>>>> v2: Based on review comments from Daniel,
>>>>>>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>>>>>>
>>>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>>>>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>>>>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>>>>>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>>>>>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>>>>>>> ---
>>>>>>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>>>>>>   1 file changed, 9 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>> index ba1ae03..682cc26 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>>>>>>   		return;
>>>>>>>>   	}
>>>>>>>> +	/*
>>>>>>>> +	 * Dmc will restore the csr the program except DC9, cold boot,
>>>>>>>> +	 * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>> +	 * This condition will help to check if csr address space is reset/
>>>>>>>> +	 * not loaded.
>>>>>>>> +	 */
>>>>>>> Atm we call this from driver load and resume, which doesn seem to cover
>>>>>>> all the cases you mention in the comment. Should this be a WARN_ON
>>>>>>> instead? Or do we have troubles in our init sequence where we load too
>>>>>>> many times?
>>>>>> Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
>>>>>> Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
>>>>>> second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
>>>>>> and cold boot(not loaded).
>>>>>>
>>>>>> Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
>>>>>> If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
>>>>>>
>>>>>> Can the below comment more clear to you.
>>>>>>
>>>>>> 	/*
>>>>>> 	 * Dmc will restore the csr the program except DC9, cold boot,
>>>>>> 	 * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>> 	 * If firmware is restored by dmc then no need to load again which
>>>>>> 	 * will keep the dc5/dc6 counter exposed by firmware.
>>>>>> 	 */
>>>>>>
>>>>>> No issue in init sequence.
>>>>> That seems to still cover all the callers of the function afaics - we do
>>>>> pci resets over suspend resume unconditionally. So I still don't
>>>>> understand where exactly we try to load the dmc firmware in i915.ko when
>>>>> it's already loaded.
>>>> During resume intel_csr_load_program() will be called from
>>>> intel_runtime_resume().
>>>>
>>>> intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
>>>>
>>>> During Pc10 entry testing I can see dmc is restoring back the firmware always,
>>>> but as you mentioned pci-reset can happen unconditionally, but still then
>>>> also during resume intel_runtime_resume() will be called and based on
>>>> register read of csr-base-address firmware loading will happen.
>>> But in your comment you're saying it won't get restored in case of dc9 and
>>> suspend. So that seems to mismatch what you're saying here (and what the
>>> commit message says) and what the code does. And this function here is
>>> called for resume after suspend/hibernate only.
>> pc10 entry explanation I told is for skylake. dc9 in skylake is not possible.
>> I think you are confusing between dc6 and dc9. Pc10 can be achieved by
>> entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state
>> for broxton which is not present for skylake.
> I have no idea at all about different pc levels on skl. What I'm talking
> about is system suspend/resume and driver load, which are the places this
> function gets called. At least afaics.
>
>> Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
>> execution flow will be different in case of suspend/resume which I think is confusing
>> you.
> That seems like really important information. What's different on bxt?
> These are the kind of details you should explain in the commit message ...
>
>> I am ready explain you in detail. It will be good if we discuss specific use-case scenario
>> and itz software design for specific platform. Another point - as dmc related code for
>> broxton is not merged better first we close design for skylake. Now, I have added dc9
>> description in comment thinking of future. If you want I can remove for now and later
>> can add in bxt patch series for enabling dmc. Will wait for your reply.
> This question here isn't about the overall design and how to handle power
> wells in skl/bxt. That's a separate discussion and tracked somewhere else.
> I'm really just confused about when exactly we need to reload the firmware,
> and why we need a runtime check for that. Normally we should know when to
> reload the firmware and just either reload or not, without checking hw
> state. And I don't like checking for hw state since at least in the past
> that kind of code ended up being fragile - it's an illusion that it does
> the right thing no matter what, since often there's other tricky ordering
> constraints. And if you have automatic duct-tape like this then no one will
> ever spot those other, harder to spot issues, until an expensive customer
> escalation happens.
>
> So what I want to know here is:
> - When exactly do we need to reload dmc firmware.
On skl, we load the firmware the first time during driver load. During
normal suspend-resume (dc6 entry/exit) there is no need to reload the
firmware, as dmc will take care of restoring it. But during
suspend/hibernation dmc will not restore the firmware, so the driver
needs to reload it. I do not know how to differentiate pm-suspend from
suspend-hibernation, and thought that in both cases
intel_runtime_resume() will be called, where we can check the h/w state
and reload the firmware if dmc has not restored it.

On bxt, we load the firmware the first time during driver load. During
normal suspend-resume the display engine will enter dc9 and dmc will
not restore the firmware, so we need to reload the firmware on every
suspend-resume.
> - What exactly is the reason why we can't make that decision statically in
>    the code (by calling csr_load at the right spots).
As I mentioned before, in the case of skylake, can we differentiate
between "resume from pm-suspend" and "resume from suspend-hibernation"
inside the driver?

In the case of broxton we need to reload every time, so we can decide
statically.
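
The cases above can be written down as a small decision function. This
is only a sketch of what is described in this mail, not driver code:
the enum names and csr_reload_needed() are made up for illustration,
and "normal suspend-resume" means dc6 entry/exit on skl and dc9
entry/exit on bxt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; these are not i915 identifiers. */
enum platform { PLATFORM_SKL, PLATFORM_BXT };

enum wake_event {
	WAKE_DRIVER_LOAD,           /* first load of the firmware */
	WAKE_NORMAL_SUSPEND_RESUME, /* dc6 exit on skl, dc9 exit on bxt */
	WAKE_SUSPEND_HIBERNATE,     /* dmc does not restore the firmware */
};

/* Returns true when the driver itself has to (re)load the dmc firmware. */
bool csr_reload_needed(enum platform p, enum wake_event ev)
{
	assert(p == PLATFORM_SKL || p == PLATFORM_BXT);

	switch (ev) {
	case WAKE_DRIVER_LOAD:
	case WAKE_SUSPEND_HIBERNATE:
		return true;
	case WAKE_NORMAL_SUSPEND_RESUME:
		/* skl: dmc restores after dc6. bxt: dmc does not after dc9. */
		return p == PLATFORM_BXT;
	}

	/* Unknown event: reloading is the conservative choice (assumption). */
	return true;
}
```

With this, csr_reload_needed(PLATFORM_SKL, WAKE_NORMAL_SUSPEND_RESUME)
is the only combination that skips the reload.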

-Animesh
>
> Thanks, Daniel


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-16 19:23                 ` Animesh Manna
@ 2015-09-23  7:57                   ` Daniel Vetter
  2015-09-23 16:27                     ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-23  7:57 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx

On Thu, Sep 17, 2015 at 12:53:21AM +0530, Animesh Manna wrote:
> 
> 
> On 9/14/2015 1:16 PM, Daniel Vetter wrote:
> >On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
> >>
> >>On 9/10/2015 8:15 PM, Daniel Vetter wrote:
> >>>On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
> >>>>On 9/2/2015 2:24 PM, Daniel Vetter wrote:
> >>>>>On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
> >>>>>>On 8/26/2015 6:40 PM, Daniel Vetter wrote:
> >>>>>>>On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
> >>>>>>>>Dmc will restore the csr program except DC9, cold boot,
> >>>>>>>>warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>>>
> >>>>>>>>intel_csr_load_program() function is used to load the firmware
> >>>>>>>>data from kernel memory to csr address space.
> >>>>>>>>
> >>>>>>>>All values of csr address space will be zero if it got reset and
> >>>>>>>>the first byte of csr program is always a non-zero if firmware
> >>>>>>>>is loaded successfuly. Based on hardware status will load the
> >>>>>>>>firmware.
> >>>>>>>>
> >>>>>>>>Without this condition check if we overwrite the firmware data the
> >>>>>>>>counters exposed for dc5/dc6 (help for debugging) will be nullified.
> >>>>>>Bacause of the above reason mentioned just above we need to block firmware loading again.
> >>>>>>So only WARN_ON will not help.
> >>>>>>
> >>>>>>
> >>>>>>>>v1: Initial version.
> >>>>>>>>
> >>>>>>>>v2: Based on review comments from Daniel,
> >>>>>>>>- Added a check to know hardware status and load the firmware if not loaded.
> >>>>>>>>
> >>>>>>>>Cc: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>>>Cc: Damien Lespiau <damien.lespiau@intel.com>
> >>>>>>>>Cc: Imre Deak <imre.deak@intel.com>
> >>>>>>>>Cc: Sunil Kamath <sunil.kamath@intel.com>
> >>>>>>>>Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >>>>>>>>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >>>>>>>>---
> >>>>>>>>  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
> >>>>>>>>  1 file changed, 9 insertions(+)
> >>>>>>>>
> >>>>>>>>diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>>>index ba1ae03..682cc26 100644
> >>>>>>>>--- a/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>>>+++ b/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>>>@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
> >>>>>>>>  		return;
> >>>>>>>>  	}
> >>>>>>>>+	/*
> >>>>>>>>+	 * Dmc will restore the csr the program except DC9, cold boot,
> >>>>>>>>+	 * warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>>>+	 * This condition will help to check if csr address space is reset/
> >>>>>>>>+	 * not loaded.
> >>>>>>>>+	 */
> >>>>>>>Atm we call this from driver load and resume, which doesn seem to cover
> >>>>>>>all the cases you mention in the comment. Should this be a WARN_ON
> >>>>>>>instead? Or do we have troubles in our init sequence where we load too
> >>>>>>>many times?
> >>>>>>Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
> >>>>>>Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
> >>>>>>second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
> >>>>>>and cold boot(not loaded).
> >>>>>>
> >>>>>>Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
> >>>>>>If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
> >>>>>>
> >>>>>>Can the below comment more clear to you.
> >>>>>>
> >>>>>>	/*
> >>>>>>	 * Dmc will restore the csr the program except DC9, cold boot,
> >>>>>>	 * warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>	 * If firmware is restored by dmc then no need to load again which
> >>>>>>	 * will keep the dc5/dc6 counter exposed by firmware.
> >>>>>>	 */
> >>>>>>
> >>>>>>No issue in init sequence.
> >>>>>That seems to still cover all the callers of the function afaics - we do
> >>>>>pci resets over suspend resume unconditionally. So I still don't
> >>>>>understand where exactly we try to load the dmc firmware in i915.ko when
> >>>>>it's already loaded.
> >>>>During resume intel_csr_load_program() will be called from
> >>>>intel_runtime_resume().
> >>>>
> >>>>intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
> >>>>
> >>>>During Pc10 entry testing I can see dmc is restoring back the firmware always,
> >>>>but as you mentioned pci-reset can happen unconditionally, but still then
> >>>>also during resume intel_runtime_resume() will be called and based on
> >>>>register read of csr-base-address firmware loading will happen.
> >>>But in your comment you're saying it won't get restored in case of dc9 and
> >>>suspend. So that seems to mismatch what you're saying here (and what the
> >>>commit message says) and what the code does. And this function here is
> >>>called for resume after suspend/hibernate only.
> >>pc10 entry explanation I told is for skylake. dc9 in skylake is not possible.
> >>I think you are confusing between dc6 and dc9. Pc10 can be achieved by
> >>entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state
> >>for broxton which is not present for skylake.
> >I have no idea at all about different pc levels on skl. What I'm talking
> >about is system suspend/resume and driver load, which are the places this
> >function gets called. At least afaics.
> >
> >>Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
> >>execution flow will be different in case of suspend/resume which I think is confusing
> >>you.
> >That seems like really important information. What's different on bxt?
> >These are the kind of details you should explain in the commit message ...
> >
> >>I am ready explain you in detail. It will be good if we discuss specific use-case scenario
> >>and itz software design for specific platform. Another point - as dmc related code for
> >>broxton is not merged better first we close design for skylake. Now, I have added dc9
> >>description in comment thinking of future. If you want I can remove for now and later
> >>can add in bxt patch series for enabling dmc. Will wait for your reply.
> >This question here isn't about the overall design and how to handle power
> >wells in skl/bxt. That's a separate discussion and tracked somewhere else.
> >I'm really just confused about when exactly we need to reload the firmware,
> >and why we need a runtime check for that. Normally we should know when to
> >reload the firmware and just either reload or not, without checking hw
> >state. And I don't like checking for hw state since at least in the past
> >that kind of code ended up being fragile - it's an illusion that it does
> >the right thing no matter what, since often there's other tricky ordering
> >constraints. And if you have automatic duct-tape like this then no one will
> >ever spot those other, harder to spot issues, until an expensive customer
> >escalation happens.
> >
> >So what I want to know here is:
> >- When exactly do we need to reload dmc firmware.
> On skl, we load the firmware the first time during driver load. During
> normal suspend-resume (dc6 entry/exit) there is no need to reload the
> firmware, as dmc will take care of restoring it. But during
> suspend/hibernation dmc will not restore the firmware, so the driver
> needs to reload it. I do not know how to differentiate pm-suspend from
> suspend-hibernation, and thought that in both cases
> intel_runtime_resume() will be called, where we can check the h/w state
> and reload the firmware if dmc has not restored it.
> 
> On bxt, we load the firmware the first time during driver load. During
> normal suspend-resume the display engine will enter dc9 and dmc will
> not restore the firmware, so we need to reload the firmware on every
> suspend-resume.
> >- What exactly is the reason why we can't make that decision statically in
> >   the code (by calling csr_load at the right spots).
> As I mentioned before, in the case of skylake, can we differentiate
> between "resume from pm-suspend" and "resume from suspend-hibernation"
> inside the driver?
> 
> In the case of broxton we need to reload every time, so we can decide
> statically.

Of course we can differentiate between all the different resume paths, and
we also have a per-platform split to take care of bxt vs. skl. And there
are actually 3 different resume paths:

- runtime PM resume. This calls the runtime_resume hook. It sounds like on
  skl we should _not_ load the csr firmware, but on bxt we should load it.
  This can be fixed by removing the intel_csr_load_program call from
  skl_resume_prepare.
- resume from hibernate-to-disk (i.e. system completely off, state stored
  on the swap partition) is done by calling the thaw callbacks.
- resume from suspend-to-mem (i.e. system in low-power with only memory
  in self-refresh, all state stored in memory) is done by calling the
  resume callbacks.

For i915 we use unified handlers in our dev_pm_ops for both thaw and
resume, but it sounds like that won't be a problem for skl/bxt since we
need to reload the csr firmware in all cases. Although I'm not perfectly
sure since you don't explain what kind of resume you mean exactly (since
you don't use the linux names for them).
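
To make the split concrete, here is a userspace sketch of how the three
paths could be wired up. It is a mock, not i915 code: struct mock_pm_ops
only borrows the member names resume, thaw and runtime_resume from
struct dev_pm_ops, every mock_* identifier is made up, and the
"reload on every system resume and thaw" behaviour is the assumption
stated above, not something that has been verified on hardware.

```c
#include <assert.h>
#include <stddef.h>

enum platform { PLATFORM_SKL, PLATFORM_BXT };

/* Mock device: counts firmware loads instead of touching hardware. */
struct mock_dev {
	enum platform platform;
	int csr_loads;
};

void mock_csr_load(struct mock_dev *dev)
{
	assert(dev != NULL);
	dev->csr_loads++;
}

/* Unified handler for suspend-to-mem resume and hibernate thaw. */
int mock_system_resume(struct mock_dev *dev)
{
	mock_csr_load(dev); /* assumed: reload needed in all cases */
	return 0;
}

/* Runtime PM resume: the decision is made statically per platform. */
int mock_runtime_resume(struct mock_dev *dev)
{
	if (dev->platform == PLATFORM_BXT)
		mock_csr_load(dev); /* dmc does not restore after dc9 */
	/* skl: dmc restores the firmware itself, so no load here */
	return 0;
}

/* Member names mirror struct dev_pm_ops; the struct itself is a mock. */
struct mock_pm_ops {
	int (*resume)(struct mock_dev *dev);
	int (*thaw)(struct mock_dev *dev);
	int (*runtime_resume)(struct mock_dev *dev);
};

const struct mock_pm_ops mock_i915_pm_ops = {
	.resume = mock_system_resume,
	.thaw = mock_system_resume,
	.runtime_resume = mock_runtime_resume,
};
```

The point of the table is that each resume path knows which kind of
resume it is, so no hw state check is needed to pick the behaviour.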

Anyway it sounds like we can replace this patch by one where we remove
that erroneous csr load call from skl runtime pm resume and that's all.
But to make sure we get this right, I suggest we keep the check you're
adding here, but wrap it in a WARN_ON. Then we'll get a backtrace when
this is going wrong again. Like this:

	if (WARN_ON(csr_loaded_already()))
		return;
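
A slightly fuller userspace sketch of the same idea, for illustration
only. csr_loaded_already() above is a placeholder name, and here it is
assumed to test the first byte of the csr program, following the commit
message (all zero after a reset, non-zero first byte once loaded). The
mock_* names, the byte array standing in for the csr address space and
the fprintf-based replacement for WARN_ON are all made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MOCK_CSR_BYTES 16

/* Stand-in for the csr address space; real code would use MMIO reads. */
struct mock_csr_hw {
	uint8_t program[MOCK_CSR_BYTES];
};

/* Userspace stand-in for WARN_ON(): report, then pass the condition on. */
bool mock_warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/* Assumed check: a loaded program always starts with a non-zero byte. */
bool csr_loaded_already(const struct mock_csr_hw *hw)
{
	return hw->program[0] != 0;
}

/* Returns true only if this call actually wrote the firmware. */
bool mock_csr_load_program(struct mock_csr_hw *hw,
			   const uint8_t *fw, size_t len)
{
	size_t i;

	assert(len <= MOCK_CSR_BYTES);

	/* Loading twice is a caller bug: warn and keep the counters. */
	if (mock_warn_on(csr_loaded_already(hw),
			 "csr firmware already loaded"))
		return false;

	for (i = 0; i < len; i++)
		hw->program[i] = fw[i];
	return true;
}
```

A second call on already loaded hardware then leaves the csr address
space untouched and produces a warning instead of silently papering
over the wrong call site.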

Also when redoing the commits please explain in detail what exactly are
the requirements like you've done above, but please use the standard linux
names, i.e. "runtime PM" and "hibernate-to-disk" and "suspend-to-mem".

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-23  7:57                   ` Daniel Vetter
@ 2015-09-23 16:27                     ` Daniel Vetter
  2015-09-23 16:28                       ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-23 16:27 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Daniel Vetter, intel-gfx

On Wed, Sep 23, 2015 at 9:57 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Sep 17, 2015 at 12:53:21AM +0530, Animesh Manna wrote:
>>
>>
>> On 9/14/2015 1:16 PM, Daniel Vetter wrote:
>> >On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
>> >>
>> >>On 9/10/2015 8:15 PM, Daniel Vetter wrote:
>> >>>On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
>> >>>>On 9/2/2015 2:24 PM, Daniel Vetter wrote:
>> >>>>>On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>> >>>>>>On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>> >>>>>>>On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>> >>>>>>>>Dmc will restore the csr program except DC9, cold boot,
>> >>>>>>>>warm reset, PCI function level reset, and hibernate/suspend.
>> >>>>>>>>
>> >>>>>>>>intel_csr_load_program() function is used to load the firmware
>> >>>>>>>>data from kernel memory to csr address space.
>> >>>>>>>>
>> >>>>>>>>All values of csr address space will be zero if it got reset and
>> >>>>>>>>the first byte of csr program is always a non-zero if firmware
>> >>>>>>>>is loaded successfuly. Based on hardware status will load the
>> >>>>>>>>firmware.
>> >>>>>>>>
>> >>>>>>>>Without this condition check if we overwrite the firmware data the
>> >>>>>>>>counters exposed for dc5/dc6 (help for debugging) will be nullified.
>> >>>>>>Bacause of the above reason mentioned just above we need to block firmware loading again.
>> >>>>>>So only WARN_ON will not help.
>> >>>>>>
>> >>>>>>
>> >>>>>>>>v1: Initial version.
>> >>>>>>>>
>> >>>>>>>>v2: Based on review comments from Daniel,
>> >>>>>>>>- Added a check to know hardware status and load the firmware if not loaded.
>> >>>>>>>>
>> >>>>>>>>Cc: Daniel Vetter <daniel.vetter@intel.com>
>> >>>>>>>>Cc: Damien Lespiau <damien.lespiau@intel.com>
>> >>>>>>>>Cc: Imre Deak <imre.deak@intel.com>
>> >>>>>>>>Cc: Sunil Kamath <sunil.kamath@intel.com>
>> >>>>>>>>Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>> >>>>>>>>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>> >>>>>>>>---
>> >>>>>>>>  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>> >>>>>>>>  1 file changed, 9 insertions(+)
>> >>>>>>>>
>> >>>>>>>>diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>> >>>>>>>>index ba1ae03..682cc26 100644
>> >>>>>>>>--- a/drivers/gpu/drm/i915/intel_csr.c
>> >>>>>>>>+++ b/drivers/gpu/drm/i915/intel_csr.c
>> >>>>>>>>@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>> >>>>>>>>              return;
>> >>>>>>>>      }
>> >>>>>>>>+     /*
>> >>>>>>>>+      * Dmc will restore the csr the program except DC9, cold boot,
>> >>>>>>>>+      * warm reset, PCI function level reset, and hibernate/suspend.
>> >>>>>>>>+      * This condition will help to check if csr address space is reset/
>> >>>>>>>>+      * not loaded.
>> >>>>>>>>+      */
>> >>>>>>>Atm we call this from driver load and resume, which doesn seem to cover
>> >>>>>>>all the cases you mention in the comment. Should this be a WARN_ON
>> >>>>>>>instead? Or do we have troubles in our init sequence where we load too
>> >>>>>>>many times?
>> >>>>>>Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
>> >>>>>>Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
>> >>>>>>second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
>> >>>>>>and cold boot(not loaded).
>> >>>>>>
>> >>>>>>Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
>> >>>>>>If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
>> >>>>>>
>> >>>>>>Can the below comment more clear to you.
>> >>>>>>
>> >>>>>>        /*
>> >>>>>>         * Dmc will restore the csr the program except DC9, cold boot,
>> >>>>>>         * warm reset, PCI function level reset, and hibernate/suspend.
>> >>>>>>         * If firmware is restored by dmc then no need to load again which
>> >>>>>>         * will keep the dc5/dc6 counter exposed by firmware.
>> >>>>>>         */
>> >>>>>>
>> >>>>>>No issue in init sequence.
>> >>>>>That seems to still cover all the callers of the function afaics - we do
>> >>>>>pci resets over suspend resume unconditionally. So I still don't
>> >>>>>understand where exactly we try to load the dmc firmware in i915.ko when
>> >>>>>it's already loaded.
>> >>>>During resume intel_csr_load_program() will be called from
>> >>>>intel_runtime_resume().
>> >>>>
>> >>>>intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
>> >>>>
>> >>>>During Pc10 entry testing I can see dmc is restoring back the firmware always,
>> >>>>but as you mentioned pci-reset can happen unconditionally, but still then
>> >>>>also during resume intel_runtime_resume() will be called and based on
>> >>>>register read of csr-base-address firmware loading will happen.
>> >>>But in your comment you're saying it won't get restored in case of dc9 and
>> >>>suspend. So that seems to mismatch what you're saying here (and what the
>> >>>commit message says) and what the code does. And this function here is
>> >>>called for resume after suspend/hibernate only.
>> >>pc10 entry explanation I told is for skylake. dc9 in skylake is not possible.
>> >>I think you are confusing between dc6 and dc9. Pc10 can be achieved by
>> >>entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state
>> >>for broxton which is not present for skylake.
>> >I have no idea at all about different pc levels on skl. What I'm talking
>> >about is system suspend/resume and driver load, which are the places this
>> >function gets called. At least afaics.
>> >
>> >>Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
>> >>execution flow will be different in case of suspend/resume which I think is confusing
>> >>you.
>> >That seems like really important information. What's different on bxt?
>> >These are the kind of details you should explain in the commit message ...
>> >
>> >>I am ready explain you in detail. It will be good if we discuss specific use-case scenario
>> >>and itz software design for specific platform. Another point - as dmc related code for
>> >>broxton is not merged better first we close design for skylake. Now, I have added dc9
>> >>description in comment thinking of future. If you want I can remove for now and later
>> >>can add in bxt patch series for enabling dmc. Will wait for your reply.
>> >This question here isn't about the overall design and how to handle power
>> >wells in skl/bxt. That's a separate discussion and tracked somewhere else.
>> >I'm really just confused about when exactly we need to reload the firmware,
>> >and why we need a runtime check for that. Normally we should know when to
>> >reload the firmware and just either reload or not, without checking hw
>> >state. And I don't like checking for hw state since at least in the past
>> >that kind of code ended up being fragile - it's an illusion that it does
>> >the right thing no matter what, since often there's other tricky ordering
>> >constraints. And if you have automatic duct-tape like this then no one will
>> >ever spot those other, harder to spot issues, until an expensive customer
>> >escalation happens.
>> >
>> >So what I want to know here is:
>> >- When exactly do we need to reload dmc firmware.
>> On skl, we load the firmware the first time during driver load. During
>> normal suspend-resume (dc6 entry/exit) there is no need to reload the
>> firmware, as dmc will take care of restoring it. But during
>> suspend/hibernation dmc will not restore the firmware, so the driver
>> needs to reload it. I do not know how to differentiate pm-suspend from
>> suspend-hibernation, and thought that in both cases
>> intel_runtime_resume() will be called, where we can check the h/w state
>> and reload the firmware if dmc has not restored it.
>>
>> On bxt, we load the firmware the first time during driver load. During
>> normal suspend-resume the display engine will enter dc9 and dmc will
>> not restore the firmware, so we need to reload the firmware on every
>> suspend-resume.
>> >- What exactly is the reason why we can't make that decision statically in
>> >   the code (by calling csr_load at the right spots).
>> As I mentioned before, in the case of skylake, can we differentiate between
>> "resume from pm-suspend" and "resume from suspend-hibernation" inside the
>> driver?
>>
>> In the case of broxton we need to reload every time, so we can decide
>> statically.
>
> Of course we can differentiate between all the different resume paths, and
> we also have a per-platform split to take care of bxt vs. skl. And there
> are actually 3 different resume paths:
>
> - runtime PM resume. This calls the runtime_resume hook. It sounds like on
>   skl we should _not_ load the csr firmware, but on bxt we should load it.
>   This can be fixed by removing the intel_csr_load_program call from
>   skl_resume_prepare.
> - resume from hibernate-to-disk (i.e. system completely off, state stored
>   on the swap partition) is done by calling the thaw callbacks.
> - resume from suspend-to-mem (i.e. system in low-power with only memory
>   in self-refresh, all state stored in memory) is done by calling the
>   resume callbacks.
>
> For i915 we use unified handlers in our dev_pm_ops for both thaw and
> resume, but it sounds like that won't be a problem for skl/bxt since we
> need to reload the csr firmware in all cases. Although I'm not perfectly
> sure since you don't explain what kind of resume you mean exactly (since
> you don't use the linux names for them).
>
> Anyway it sounds like we can replace this patch by one where we remove
> that erroneous csr load call from skl runtime pm resume and that's all.
> But to make sure we get this right, I suggest we keep the check you're
> adding here, but wrap it in a WARN_ON. Then we'll get a backtrace when
> this is going wrong again. Like this:
>
>         if (WARN_ON(csr_loaded_already()))
>                 return;
>
> Also when redoing the commits please explain in detail what exactly are
> the requirements like you've done above, but please use the standard linux
> names, i.e. "runtime PM" and "hibernate-to-disk" and "suspend-to-mem".
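
The static decision and the WARN_ON guard described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: every name below (csr_model, needs_csr_reload, csr_load_program and so on) is hypothetical and not an i915 symbol, the csr address space is modelled as a four-word array, and the reload policy is simply the one read from this thread (skl: no reload on runtime PM resume; bxt: reload; both reload on suspend-to-mem and hibernate-to-disk resume), which has not been checked against bspec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace model; none of these names are i915 symbols. */
#define CSR_WORDS 4

enum platform { PLATFORM_SKL, PLATFORM_BXT };

enum resume_path {
	RESUME_RUNTIME_PM,        /* runtime_resume hook */
	RESUME_SUSPEND_TO_MEM,    /* resume callbacks */
	RESUME_HIBERNATE_TO_DISK, /* thaw callbacks */
};

struct csr_model {
	uint32_t program[CSR_WORDS]; /* stands in for the csr address space */
	unsigned int warnings;       /* how often the guard fired */
};

/* A reset address space reads as zero; a loaded program starts non-zero. */
bool csr_loaded_already(const struct csr_model *csr)
{
	return csr->program[0] != 0;
}

/* Static policy as read from the thread (an assumption, not verified). */
bool needs_csr_reload(enum platform p, enum resume_path path)
{
	if (path == RESUME_RUNTIME_PM)
		return p == PLATFORM_BXT;
	return true; /* system resume and thaw: reload on both platforms */
}

void csr_load_program(struct csr_model *csr, const uint32_t *fw, size_t words)
{
	assert(words <= CSR_WORDS);
	if (csr_loaded_already(csr)) {
		/* Stand-in for WARN_ON(): record and report, then bail out. */
		csr->warnings++;
		fprintf(stderr, "WARN: csr firmware already loaded\n");
		return;
	}
	memcpy(csr->program, fw, words * sizeof(*fw));
}
```

In this model a caller consults needs_csr_reload() on each resume path, so the guard in csr_load_program() should never fire; if it does, the warning points at a caller that loads at the wrong spot instead of silently papering over it.
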

Ok hooray there are more suspend-to-something variants I've totally missed:
- suspend-to-idle (done by echo freeze > /sys/power/state) and
- suspend (done by echo suspend > /sys/power/state)

And apparently there's really no way for drivers to tell them apart.
Rafael, is there really no way for drivers to take different paths for
these 3 suspend cases? I tried grepping for PM_SUSPEND_ON/STANDBY/MEM
and didn't spot anything.

Also we're completely missing test coverage for that in igt. That is
something that needs to be fixed asap (yet another case of
combinatorial explosion in igt tests, yay). And at least one of those
suspend-to-idle testcases had better be in the BAT.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-23 16:27                     ` Daniel Vetter
@ 2015-09-23 16:28                       ` Daniel Vetter
  2015-09-23 17:17                         ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-23 16:28 UTC (permalink / raw)
  To: Animesh Manna, Wysocki, Rafael J; +Cc: Daniel Vetter, intel-gfx

Actually add Rafael this time around ...
-Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-23 16:28                       ` Daniel Vetter
@ 2015-09-23 17:17                         ` Daniel Vetter
  2015-09-23 20:49                           ` Rafael J. Wysocki
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-23 17:17 UTC (permalink / raw)
  To: Animesh Manna, Wysocki, Rafael J; +Cc: Daniel Vetter, intel-gfx

acpi_target_system_state() seems to be almost the thing we're looking
for, except that it's only valid in the suspend callbacks since it
gets reset to ACPI_STATE_S0 when resuming. So probably we want
something else ...
-Daniel
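
The limitation described above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: model_target_system_state() and the other names are hypothetical stand-ins (the real acpi_target_system_state() is not called here), and caching the value in the suspend callback is just one conceivable workaround, not something this thread settled on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; these names are not kernel or ACPI symbols. */
enum sleep_state { STATE_S0 = 0, STATE_S3 = 3, STATE_S4 = 4 };

/* Models the core's notion of the target sleep state. */
static enum sleep_state core_target = STATE_S0;

enum sleep_state model_target_system_state(void)
{
	return core_target;
}

/* The core sets the target before the suspend callbacks run ... */
void model_core_suspend_begin(enum sleep_state target)
{
	core_target = target;
}

/* ... and resets it to S0 before the resume callbacks run. */
void model_core_resume_begin(void)
{
	core_target = STATE_S0;
}

struct driver_model {
	enum sleep_state suspended_to; /* cached while the query is valid */
};

void driver_suspend(struct driver_model *drv)
{
	drv->suspended_to = model_target_system_state();
}

enum sleep_state driver_resume(const struct driver_model *drv)
{
	/* Querying here would only ever report S0, so use the cache. */
	assert(model_target_system_state() == STATE_S0);
	return drv->suspended_to;
}
```

The model shows why the query alone is not enough on the resume side: by the time driver_resume() runs, the only place the original target state still exists is whatever the driver saved during suspend.
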

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-23 17:17                         ` Daniel Vetter
@ 2015-09-23 20:49                           ` Rafael J. Wysocki
  2015-09-28  6:52                             ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Rafael J. Wysocki @ 2015-09-23 20:49 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On 9/23/2015 7:17 PM, Daniel Vetter wrote:
> acpi_target_system_state() seems to be almost the thing we're looking
> for, except that it's only valid in the suspend callbacks since it
> gets reset to ACPI_STATE_S0 when resuming. So probably we want
> something else ...

Right.

The idea is to add a way for drivers to check if
(a) suspend is going to enter the BIOS
(b) resume has been triggered by the BIOS
and that's really what drivers need to know.

For suspend-to-idle those two will return false and for S3 they'll 
return true.

Would that help?

Thanks,
Rafael
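
The two checks proposed above can be modelled in userspace as follows. This is a minimal sketch with assumed names: suspend_enters_bios(), resume_from_bios() and the model_* helpers are hypothetical, and the driver policy shown (reload firmware only when the resume came through the BIOS) is one possible use of the checks, not something the proposal itself prescribes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names are existing kernel symbols. */
enum suspend_kind { SUSPEND_TO_IDLE, SUSPEND_S3 };

static bool suspend_via_bios;
static bool resume_via_bios;

/* Core side: record whether this suspend will enter the BIOS. */
void model_suspend_begin(enum suspend_kind kind)
{
	suspend_via_bios = (kind == SUSPEND_S3);
	resume_via_bios = false;
}

/* Core side: a resume is BIOS-triggered only if the suspend entered it. */
void model_resume_begin(void)
{
	resume_via_bios = suspend_via_bios;
	suspend_via_bios = false;
}

/* (a) is suspend going to enter the BIOS? */
bool suspend_enters_bios(void)
{
	return suspend_via_bios;
}

/* (b) has resume been triggered by the BIOS? */
bool resume_from_bios(void)
{
	return resume_via_bios;
}

struct driver_model {
	unsigned int fw_loads; /* counts firmware reloads */
};

void driver_resume(struct driver_model *drv)
{
	assert(!suspend_enters_bios()); /* suspend flag is cleared by now */
	if (resume_from_bios())
		drv->fw_loads++; /* state was lost, so reload */
}
```

With these two predicates a suspend-to-idle cycle leaves fw_loads untouched, while an S3 cycle increments it once, which matches the false/true behaviour described for the two cases.
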


> On Wed, Sep 23, 2015 at 6:28 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> Actually add Rafael this time around ...
>> -Daniel
>>
>> On Wed, Sep 23, 2015 at 6:27 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>> On Wed, Sep 23, 2015 at 9:57 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>> On Thu, Sep 17, 2015 at 12:53:21AM +0530, Animesh Manna wrote:
>>>>>
>>>>> On 9/14/2015 1:16 PM, Daniel Vetter wrote:
>>>>>> On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
>>>>>>> On 9/10/2015 8:15 PM, Daniel Vetter wrote:
>>>>>>>> On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
>>>>>>>>> On 9/2/2015 2:24 PM, Daniel Vetter wrote:
>>>>>>>>>> On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>>>>>>>>>>> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>>>>>>>>>>>> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>>>>>>>>>>>>> Dmc will restore the csr program except DC9, cold boot,
>>>>>>>>>>>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>>
>>>>>>>>>>>>> intel_csr_load_program() function is used to load the firmware
>>>>>>>>>>>>> data from kernel memory to csr address space.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All values of csr address space will be zero if it got reset and
>>>>>>>>>>>>> the first byte of csr program is always a non-zero if firmware
>>>>>>>>>>>>> is loaded successfully. Based on hardware status will load the
>>>>>>>>>>>>> firmware.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Without this condition check if we overwrite the firmware data the
>>>>>>>>>>>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>>>>>>>>>>> Because of the reason mentioned just above, we need to block loading the firmware again.
>>>>>>>>>>> So a WARN_ON alone will not help.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>> v1: Initial version.
>>>>>>>>>>>>>
>>>>>>>>>>>>> v2: Based on review comments from Daniel,
>>>>>>>>>>>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>>>>>>>>>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>>>>>>>>>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>>>>>>>>>>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>>>>>>>>>>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>>>>>>>>>>>   1 file changed, 9 insertions(+)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>> index ba1ae03..682cc26 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>>>>>>>>>>>               return;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>> +     /*
>>>>>>>>>>>>> +      * Dmc will restore the csr the program except DC9, cold boot,
>>>>>>>>>>>>> +      * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>> +      * This condition will help to check if csr address space is reset/
>>>>>>>>>>>>> +      * not loaded.
>>>>>>>>>>>>> +      */
>>>>>>>>>>>> Atm we call this from driver load and resume, which doesn't seem to cover
>>>>>>>>>>>> all the cases you mention in the comment. Should this be a WARN_ON
>>>>>>>>>>>> instead? Or do we have troubles in our init sequence where we load too
>>>>>>>>>>>> many times?
>>>>>>>>>>> Yes, the above statement is taken from bspec to describe the special cases where dmc will not restore the firmware.
>>>>>>>>>>> Agreed, in our case it is mainly cold boot and hibernate/suspend where we need to load the firmware again, so in my
>>>>>>>>>>> second sentence I mainly wanted to say that this condition check is added for suspend-hibernate (reset)
>>>>>>>>>>> and cold boot (not loaded).
>>>>>>>>>>>
>>>>>>>>>>> Anyway, the same api can later be used to load the firmware from anywhere, so my intention is to check whether the firmware is loaded or not.
>>>>>>>>>>> If it is already loaded then we do not overwrite the csr address space, to maintain the dc5/dc6 counter values.
>>>>>>>>>>>
>>>>>>>>>>> Is the comment below clearer to you?
>>>>>>>>>>>
>>>>>>>>>>>         /*
>>>>>>>>>>>          * Dmc will restore the csr the program except DC9, cold boot,
>>>>>>>>>>>          * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>          * If firmware is restored by dmc then no need to load again which
>>>>>>>>>>>          * will keep the dc5/dc6 counter exposed by firmware.
>>>>>>>>>>>          */
>>>>>>>>>>>
>>>>>>>>>>> No issue in init sequence.
>>>>>>>>>> That seems to still cover all the callers of the function afaics - we do
>>>>>>>>>> pci resets over suspend resume unconditionally. So I still don't
>>>>>>>>>> understand where exactly we try to load the dmc firmware in i915.ko when
>>>>>>>>>> it's already loaded.
>>>>>>>>> During resume intel_csr_load_program() will be called from
>>>>>>>>> intel_runtime_resume().
>>>>>>>>>
>>>>>>>>> intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
>>>>>>>>>
>>>>>>>>> During Pc10 entry testing I can see dmc always restores the firmware.
>>>>>>>>> As you mentioned a pci reset can happen unconditionally, but even then
>>>>>>>>> intel_runtime_resume() will be called during resume and firmware loading
>>>>>>>>> will happen based on a register read of the csr base address.
>>>>>>>> But in your comment you're saying it won't get restored in case of dc9 and
>>>>>>>> suspend. So that seems to mismatch what you're saying here (and what the
>>>>>>>> commit message says) and what the code does. And this function here is
>>>>>>>> called for resume after suspend/hibernate only.
>>>>>>> The pc10 entry explanation I gave is for skylake. dc9 is not possible on skylake.
>>>>>>> I think you are confusing dc6 and dc9. Pc10 can be achieved by
>>>>>>> entering dc6 (not dc9) on skylake. dc9 is the lowest possible state
>>>>>>> for broxton and is not present on skylake.
>>>>>> I have no idea at all about different pc levels on skl. What I'm talking
>>>>>> about is system suspend/resume and driver load, which are the places this
>>>>>> function gets called. At least afaics.
>>>>>>
>>>>>>> Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
>>>>>>> execution flow will be different in case of suspend/resume which I think is confusing
>>>>>>> you.
>>>>>> That seems like really important information. What's different on bxt?
>>>>>> These are the kind of details you should explain in the commit message ...
>>>>>>
>>>>>>> I am ready to explain it to you in detail. It would be good if we discuss a specific use-case scenario
>>>>>>> and its software design for a specific platform. Another point: as the dmc related code for
>>>>>>> broxton is not merged, it is better to first close the design for skylake. For now I have added the dc9
>>>>>>> description in the comment thinking of the future. If you want I can remove it for now and add it later
>>>>>>> in the bxt patch series for enabling dmc. I will wait for your reply.
>>>>>> This question here isn't about the overall design and how to handle power
>>>>>> wells in skl/bxt. That's a separate discussion and tracked somewhere else.
>>>>>> I'm really just confused about when exactly we need to reload the firmware,
>>>>>> and why we need a runtime check for that. Normally we should know when to
>>>>>> reload the firmware and just either reload or not, without checking hw
>>>>>> state. And I don't like checking for hw state since at least in the past
>>>>>> that kind of code ended up being fragile - it's an illusion that it does
>>>>>> the right thing no matter what, since often there's other tricky ordering
>>>>>> constraints. And if you have automatic duct-tape like that then no one will
>>>>>> ever spot those other, harder to spot issues, until an expensive customer
>>>>>> escalation happens.
>>>>>>
>>>>>> So what I want to know here is:
>>>>>> - When exactly do we need to reload dmc firmware.
>>>>> In skl, during driver load first time we load the firmware, during normal
>>>>> suspend-resume (dc6 entry/exit) no need to reload the firmware again as dmc
>>>>> will take care of it. But during suspend/hibernation dmc will not restore
>>>>> the firmware. In that case driver need to reload it again. I do not know
>>>>> how to differentiate pm-suspend and suspend-hibernation and thought both the
>>>>> cases intel_runtime_resume() will be called where we can check the h/w state
>>>>> and reload the firmware if dmc is not restored.
>>>>>
>>>>> In bxt, during driver load first time we load the firmware, during normal
>>>>> suspend-resume display engine will enter into dc9 and dmc will not restore
>>>>> the firmware. So every suspend-resume we need to reload the firmware.
>>>>>> - What exactly is the reason why we can't make that decision statically in
>>>>>>    the code (by calling csr_load at the right spots).
>>>>> As I mentioned before in case of skylake can we differentiate between
>>>>> "resume from pm-suspend" with "resume from suspend-hibernation" inside
>>>>> driver?
>>>>>
>>>>> In case of broxton, every time we need to reload, so we can decide
>>>>> statically.
>>>> Of course we can differentiate between all the different resume paths, and
>>>> we also have a per-platform split to take care of bxt vs. skl. And there
>>>> are actually 3 different resume paths:
>>>>
>>>> - runtime PM resume. This calls the runtime_resume hook. It sounds like on
>>>>    skl we should _not_ load the csr firmware, but on bxt we should load it.
>>>>    This can be fixed by removing the intel_csr_load_program call from
>>>>    skl_resume_prepare.
>>>> - resume from hibernate-to-disk (i.e. system completely off, state stored
>>>>    on the swap partition) is done by calling the thaw callbacks.
>>>> - resume from suspend-to-mem (i.e. system in low-power with only memory
>>>>    in self-refresh, all state stored in memory) is done by calling the
>>>>    resume callbacks.
>>>>
>>>> For i915 we use unified handlers in our dev_pm_ops for both thaw and
>>>> resume, but it sounds like that won't be a problem for skl/bxt since we
>>>> need to reload the csr firmware in all cases. Although I'm not perfectly
>>>> sure since you don't explain what kind of resume you mean exactly (since
>>>> you don't use the linux names for them).
>>>>
>>>> Anyway it sounds like we can replace this patch by one where we remove
>>>> that erroneous csr load call from skl runtime pm resume and that's all.
>>>> But I suggest to make sure we get this right we keep the check you're
>>>> adding here, but wrap it in a WARN_ON. Then we'll get a backtrace when
>>>> this is going wrong again. Like this:
>>>>
>>>>          if (WARN_ON(csr_loaded_already()))
>>>>                  return;
>>>>
>>>> Also when redoing the commits please explain in detail what exactly are
>>>> the requirements like you've done above, but please use the standard linux
>>>> names, i.e. "runtime PM" and "hibernate-to-disk" and "suspend-to-mem".
>>> Ok hooray there's more suspend-to-something things I've totally missed:
>>> - suspend-to-idle (done by cat freeze > /sys/power/state) and
>>> - suspend (done by cat suspend > /sys/power/state)
>>>
>>> And apparently there's really no way for drivers to tell them apart.
>>> Rafael, is there really no way for drivers to take different paths for
>>> these 3 suspend cases? I tried grepping for PM_SUSPEND_ON/STANDBY/MEM
>>> and didn't spot anything.
>>>
>>> Also we're completely missing test coverage for that in igt. That is
>>> something that needs to be fixed asap (yet another case of
>>> combinatorial explosion in igt tests, yay). And at least one of those
>>> suspend-to-idle testcases better be in the BAT.
>>> -Daniel
>>> --
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
>


* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-23 20:49                           ` Rafael J. Wysocki
@ 2015-09-28  6:52                             ` Daniel Vetter
  2015-09-28 23:54                               ` Rafael J. Wysocki
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-28  6:52 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Daniel Vetter, intel-gfx

On Wed, Sep 23, 2015 at 10:49:36PM +0200, Rafael J. Wysocki wrote:
> On 9/23/2015 7:17 PM, Daniel Vetter wrote:
> >acpi_target_system_state() seems to be almost the thing we're looking
> >for, except that it's only valid in the suspend callbacks since it
> >gets reset to ACPI_STATE_S0 when resuming. So probably we want
> >something else ...
> 
> Right.
> 
> The idea is to add a way for drivers to check if
> (a) suspend is going to enter the BIOS
> (b) resume has been triggered by the BIOS
> and that's really what drivers need to know.
> 
> For suspend-to-idle those two will return false and for S3 they'll return
> true.
> 
> Would that help?

Not sure that matches exactly what we'd need here ... Essentially we need
to know whether we've been in S3/S4 (firmware has been eaten) or in one of
the higher suspend-to-idle/standby states (firmware still alive, don't
disturb it). Additional fun that just crossed my mind is that if the
suspend-to-mem is aborted (some other driver failed) then that function
should _not_ indicate that we've been in S3. So maybe something like

acpi_source_system_state() which usually is S0 and only when acpi
successfully went into the suspend state in platform_suspend_ops->enter it
gets set to the value of acpi_target_system_state. And then reset once the
resume has completed. I think that would be what we'd want here.
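
The bookkeeping described above can be sketched as a small standalone model. This is only an illustration under assumptions: acpi_source_system_state() does not exist in the kernel, and every name below (pm_tracker, pm_begin, pm_entered, pm_end, firmware_lost, the STATE_* values) is hypothetical and not an existing API. The point is that the "source" state only takes the target value once the platform has really entered the sleep state, so an aborted suspend still reads as S0.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the sleep states; not the kernel's definitions. */
enum sleep_state { STATE_S0 = 0, STATE_S3 = 3, STATE_S4 = 4 };

struct pm_tracker {
	enum sleep_state target; /* state the suspend is heading for */
	enum sleep_state source; /* state we actually came back from */
};

/* Start of a suspend transition: only the target is known so far. */
static void pm_begin(struct pm_tracker *t, enum sleep_state target)
{
	t->target = target;
	t->source = STATE_S0;
}

/* Called only when the platform has really entered the sleep state. */
static void pm_entered(struct pm_tracker *t)
{
	t->source = t->target;
}

/* Called once resume (or an aborted suspend) has fully completed. */
static void pm_end(struct pm_tracker *t)
{
	t->target = STATE_S0;
	t->source = STATE_S0;
}

/* Driver-side query: did we come back from a state that eats the firmware? */
static bool firmware_lost(const struct pm_tracker *t)
{
	return t->source == STATE_S3 || t->source == STATE_S4;
}
```

In this model a driver would call firmware_lost() from its resume path, before pm_end() runs, and reload the firmware only when it returns true.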

Anyway I'll pull in Animesh's series meanwhile, amended with a FIXME
comment.
-Daniel

> 
> Thanks,
> Rafael
> 
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-07 11:04   ` Sunil Kamath
  2015-09-07 16:22     ` Daniel Vetter
@ 2015-09-28  7:03     ` Daniel Vetter
  1 sibling, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-09-28  7:03 UTC (permalink / raw)
  To: Sunil Kamath; +Cc: Daniel Vetter, intel-gfx

On Mon, Sep 07, 2015 at 04:34:30PM +0530, Sunil Kamath wrote:
> On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> >Dmc will restore the csr program except DC9, cold boot,
> >warm reset, PCI function level reset, and hibernate/suspend.
> >
> >intel_csr_load_program() function is used to load the firmware
> >data from kernel memory to csr address space.
> >
> >All values of csr address space will be zero if it got reset and
> >the first byte of csr program is always non-zero if firmware
> >is loaded successfully. Based on the hardware status we will load the
> >firmware.
> >
> >Without this condition check if we overwrite the firmware data the
> >counters exposed for dc5/dc6 (help for debugging) will be nullified.
> >
> >v1: Initial version.
> >
> >v2: Based on review comments from Daniel,
> >- Added a check to know hardware status and load the firmware if not loaded.
> >
> >Cc: Daniel Vetter <daniel.vetter@intel.com>
> >Cc: Damien Lespiau <damien.lespiau@intel.com>
> >Cc: Imre Deak <imre.deak@intel.com>
> >Cc: Sunil Kamath <sunil.kamath@intel.com>
> >Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >---
> >  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> >index ba1ae03..682cc26 100644
> >--- a/drivers/gpu/drm/i915/intel_csr.c
> >+++ b/drivers/gpu/drm/i915/intel_csr.c
> >@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
> >  		return;
> >  	}
> >+	/*
> >+	 * Dmc will restore the csr the program except DC9, cold boot,
> >+	 * warm reset, PCI function level reset, and hibernate/suspend.
> >+	 * This condition will help to check if csr address space is reset/
> >+	 * not loaded.
> >+	 */
> >+	if (I915_READ(CSR_PROGRAM_BASE))
> >+		return;
> >+
> >  	mutex_lock(&dev_priv->csr_lock);
> >  	fw_size = dev_priv->csr.dmc_fw_size;
> >  	for (i = 0; i < fw_size; i++)
> 
> Valid fix and patch is ready for merge now.
> 
> Reviewed-by: A.Sunil Kamath <sunil.kamath@intel.com>

Ok I changed the code comment to a FIXME to explain what's going on (DC9
is totally irrelevant if I understand this correctly now) and amended the
commit message with a note about what I think is really going on. Please
double-check.
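
For readers following along, the effect of the check in the patch can be modelled outside the driver. This is a minimal sketch, not the i915 code: the fake_csr structure, its size and the helper names are hypothetical, and it assumes (as the commit message states) that a reset leaves the program area all zero and that a loaded program starts with a non-zero value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_CSR_WORDS 8 /* hypothetical size of the modelled program area */

/* Stand-in for the csr address space; the real one is reached through mmio. */
struct fake_csr {
	uint32_t program[FAKE_CSR_WORDS];
};

/* Model of a reset that wipes the program area to zero. */
static void fake_csr_reset(struct fake_csr *csr)
{
	memset(csr->program, 0, sizeof(csr->program));
}

/*
 * Copy the firmware only if the program area still looks reset.
 * Returns true if the firmware was written, false if the load was skipped.
 */
static bool fake_csr_load(struct fake_csr *csr, const uint32_t *fw, size_t n)
{
	size_t i;

	/* Reject images this model cannot represent. */
	if (n == 0 || n > FAKE_CSR_WORDS || fw[0] == 0)
		return false;

	/* Non-zero first word: firmware is already there, leave it alone. */
	if (csr->program[0] != 0)
		return false;

	for (i = 0; i < n; i++)
		csr->program[i] = fw[i];

	return true;
}
```

A second fake_csr_load() without an intervening fake_csr_reset() is skipped, which is the behaviour the patch relies on to keep the dc5/dc6 counters intact.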

Also I realized that we have 0 test coverage for suspend-to-idle (an
oversight from merging this feature a few years back). Please add a new igt
testcase (maybe a subtest of drv_suspend) to exercise this feature.

Making sure that we have sufficient igt coverage is part of the review and
especially something that _must_ be checked for bugfixes like this.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow Animesh Manna
  2015-09-07 11:06   ` Sunil Kamath
@ 2015-09-28  7:21   ` Daniel Vetter
  2015-09-28 18:49     ` Hindman, Gavin
                       ` (2 more replies)
  1 sibling, 3 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-09-28  7:21 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter

On Wed, Aug 26, 2015 at 01:36:07AM +0530, Animesh Manna wrote:
> Mmio register access after dc6/dc5 entry is not allowed when
> DC6 power states are enabled according to bspec (bspec-id 0527),
> so enabling dc6 as the last call in suspend flow.

We unconditionally grab a power well reference for everything in our
suspend/resume functions, which means dc6/5 should be prevented for long
enough.

Also since dc6/5 are part of the power well framework you can't just
enable/disable them directly, instead you need to get/put the
corresponding power wells.
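
The get/put idea can be illustrated with a tiny reference-counting model. This is a sketch under assumptions, not the i915 power well framework: fake_power_well and both helpers are hypothetical names, and the single dc_allowed flag stands in for whatever the real code programs when the last reference goes away.

```c
#include <stdbool.h>

/* Hypothetical model of a power well that gates the deeper DC states. */
struct fake_power_well {
	int refcount;    /* number of users currently holding the well */
	bool dc_allowed; /* true only while nobody holds a reference */
};

/* First reference blocks the DC states; later ones just bump the count. */
static void fake_well_get(struct fake_power_well *w)
{
	if (w->refcount++ == 0)
		w->dc_allowed = false;
}

/* Dropping the last reference lets the hardware enter the DC states again. */
static void fake_well_put(struct fake_power_well *w)
{
	if (w->refcount > 0 && --w->refcount == 0)
		w->dc_allowed = true;
}
```

In this model a suspend or resume function that holds a reference for its whole duration never needs to toggle the DC state itself, since the state can only be entered after the final put.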

Finally this patch doesn't just do that, but also frobs around a lot in
set power well code itself, and I have no idea what it does there and why.
It does smell a bit like you're just breaking dc6 for runtime pm though,
which wouldn't be good.

Anyway I decided to just merge this since this patch series has been
floating around since forever, but then the patch didn't apply cleanly and
so I dropped it.
-Daniel

> 
> v1: Initial version.
> 
> v2: Based on review comment from Daniel,
> - created a separate patch for csr uninitialization set call.
> 
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h        |  2 ++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
>  3 files changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 478101c..fa66162 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1013,10 +1013,20 @@ static int i915_pm_resume(struct device *dev)
>  
>  static int skl_suspend_complete(struct drm_i915_private *dev_priv)
>  {
> +	enum csr_state state;
>  	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
>  
>  	skl_uninit_cdclk(dev_priv);
>  
> +	/* TODO: wait for a completion event or
> +	 * similar here instead of busy
> +	 * waiting using wait_for function.
> +	 */
> +	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
> +			FW_UNINITIALIZED, 1000);
> +	if (state == FW_LOADED)
> +		skl_enable_dc6(dev_priv);
> +
>  	return 0;
>  }
>  
> @@ -1063,6 +1073,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
>  {
>  	struct drm_device *dev = dev_priv->dev;
>  
> +	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
> +		skl_disable_dc6(dev_priv);
> +
>  	skl_init_cdclk(dev_priv);
>  	intel_csr_load_program(dev);
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 81b7d77..9cb7d4e 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1118,6 +1118,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
>  void bxt_disable_dc9(struct drm_i915_private *dev_priv);
>  void skl_init_cdclk(struct drm_i915_private *dev_priv);
>  void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
> +void skl_enable_dc6(struct drm_i915_private *dev_priv);
> +void skl_disable_dc6(struct drm_i915_private *dev_priv);
>  void intel_dp_get_m_n(struct intel_crtc *crtc,
>  		      struct intel_crtc_state *pipe_config);
>  void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 821644d..23a3aa3 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -548,7 +548,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
>  		"DC6 already programmed to be disabled.\n");
>  }
>  
> -static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> +void skl_enable_dc6(struct drm_i915_private *dev_priv)
>  {
>  	uint32_t val;
>  
> @@ -565,7 +565,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
>  	POSTING_READ(DC_STATE_EN);
>  }
>  
> -static void skl_disable_dc6(struct drm_i915_private *dev_priv)
> +void skl_disable_dc6(struct drm_i915_private *dev_priv)
>  {
>  	uint32_t val;
>  
> @@ -626,10 +626,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  				!I915_READ(HSW_PWR_WELL_BIOS),
>  				"Invalid for power well status to be enabled, unless done by the BIOS, \
>  				when request is to disable!\n");
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> -				power_well->data == SKL_DISP_PW_2) {
> +			if (power_well->data == SKL_DISP_PW_2) {
> +				if (GEN9_ENABLE_DC5(dev))
> +					gen9_disable_dc5(dev_priv);
>  				if (SKL_ENABLE_DC6(dev)) {
> -					skl_disable_dc6(dev_priv);
>  					/*
>  					 * DDI buffer programming unnecessary during driver-load/resume
>  					 * as it's already done during modeset initialization then.
> @@ -637,8 +637,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  					 */
>  					if (!dev_priv->power_domains.initializing)
>  						intel_prepare_ddi(dev);
> -				} else {
> -					gen9_disable_dc5(dev_priv);
>  				}
>  			}
>  			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
> @@ -658,7 +656,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  			POSTING_READ(HSW_PWR_WELL_DRIVER);
>  			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
>  
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> +			if (GEN9_ENABLE_DC5(dev) &&
>  				power_well->data == SKL_DISP_PW_2) {
>  				enum csr_state state;
>  				/* TODO: wait for a completion event or
> @@ -671,10 +669,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  					DRM_ERROR("CSR firmware not ready (%d)\n",
>  							state);
>  				else
> -					if (SKL_ENABLE_DC6(dev))
> -						skl_enable_dc6(dev_priv);
> -					else
> -						gen9_enable_dc5(dev_priv);
> +					gen9_enable_dc5(dev_priv);
>  			}
>  		}
>  	}
> -- 
> 2.0.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 5/5] drm/i915/skl: Block disable call for pw1 if dmc firmware is present.
  2015-09-07 11:09   ` Sunil Kamath
@ 2015-09-28  7:24     ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-09-28  7:24 UTC (permalink / raw)
  To: Sunil Kamath; +Cc: Daniel Vetter, intel-gfx

On Mon, Sep 07, 2015 at 04:39:19PM +0530, Sunil Kamath wrote:
> On Wednesday 26 August 2015 01:36 AM, Animesh Manna wrote:
> >Another criterion for dmc to work as expected is that pw1 must be
> >enabled by the driver; dmc will shut it off in its execution
> >sequence. If the driver has already disabled it, dmc gets confused and
> >behaves differently than expected, as found during the pc10 entry
> >issue for skl.
> >
> >So before we disable power well 1, check whether dmc firmware is
> >present; if it is, the driver will not disable power well 1. If for
> >any reason the firmware is not present or failed to load, we can shut
> >off power well 1, which will save some power.
> >
> >skl is currently fully dependent on dmc to reach its lowest possible
> >power state (dc6), but the same does not apply to bxt: its display
> >engine can enter dc9 without dmc, hence the disable call is not blocked.
> >
> >v1: Initial version.
> >
> >v2: Rebased as per current patch series.
> >
> >Cc: Daniel Vetter <daniel.vetter@intel.com>
> >Cc: Damien Lespiau <damien.lespiau@intel.com>
> >Cc: Imre Deak <imre.deak@intel.com>
> >Cc: Sunil Kamath <sunil.kamath@intel.com>
> >Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >---
> >  drivers/gpu/drm/i915/intel_runtime_pm.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> >index 23a3aa3..340f386 100644
> >--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> >+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> >@@ -652,9 +652,15 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >  		}
> >  	} else {
> >  		if (enable_requested) {
> >-			I915_WRITE(HSW_PWR_WELL_DRIVER,	tmp & ~req_mask);
> >-			POSTING_READ(HSW_PWR_WELL_DRIVER);
> >-			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
> >+			if (IS_SKYLAKE(dev) &&
> >+				(power_well->data == SKL_DISP_PW_1) &&
> >+				(intel_csr_load_status_get(dev_priv) == FW_LOADED))
> >+				DRM_DEBUG_KMS("Not Disabling PW1, dmc will handle\n");
> >+			else {
> >+				I915_WRITE(HSW_PWR_WELL_DRIVER,	tmp & ~req_mask);
> >+				POSTING_READ(HSW_PWR_WELL_DRIVER);
> >+				DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
> >+			}
> >  			if (GEN9_ENABLE_DC5(dev) &&
> >  				power_well->data == SKL_DISP_PW_2) {
> 
> Valid fix and patch is ready for merge now.
> 
> Reviewed-by: A.Sunil Kamath <sunil.kamath@intel.com>

Ok I pulled them all in except patch 3.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-28  7:21   ` Daniel Vetter
@ 2015-09-28 18:49     ` Hindman, Gavin
  2015-09-29  5:31     ` [DMC_BUGFIX_V3] " Animesh Manna
  2015-09-29  5:38     ` [DMC_BUGFIX_SKL_V2 3/5] " Animesh Manna
  2 siblings, 0 replies; 51+ messages in thread
From: Hindman, Gavin @ 2015-09-28 18:49 UTC (permalink / raw)
  To: Daniel Vetter, Manna, Animesh
  Cc: Bhardwaj, Rajneesh, intel-gfx, Vetter, Daniel

Do you believe suspend with DC6 should work because of the unconditional power-well reference, or are you unsure and simply couldn't merge because the patch didn't apply?

Gavin Hindman
Senior Program Manager
SSG/OTC – Open Source Technology Center

-----Original Message-----
From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of Daniel Vetter
Sent: Monday, September 28, 2015 12:22 AM
To: Manna, Animesh <animesh.manna@intel.com>
Cc: Bhardwaj, Rajneesh <rajneesh.bhardwaj@intel.com>; intel-gfx@lists.freedesktop.org; Vetter, Daniel <daniel.vetter@intel.com>
Subject: Re: [Intel-gfx] [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.

On Wed, Aug 26, 2015 at 01:36:07AM +0530, Animesh Manna wrote:
> Mmio register access after dc6/dc5 entry is not allowed when
> DC6 power states are enabled according to bspec (bspec-id 0527), so 
> enabling dc6 as the last call in suspend flow.

We unconditionally grab a power well reference for everything in our suspend/resume functions, which means dc6/5 should be prevented for long enough.

Also since dc6/5 are part of the power well framework you can't just enable/disable them directly, instead you need to get/put the corresponding power wells.
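
The get/put rule described above can be pictured with a small reference-counting model. This is only a sketch that runs outside the kernel: `struct sketch_power_well`, `sketch_power_well_get()` and `sketch_power_well_put()` are hypothetical names invented for illustration, not the i915 power-domain API, and the `dc6_allowed` flag is an assumed simplification of what the real framework tracks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one power well that gates DC6 entry. */
struct sketch_power_well {
	int refcount;     /* number of users currently holding the well */
	bool dc6_allowed; /* true only while nobody holds a reference */
};

/* Take a reference; the first user keeps the hardware out of DC6. */
static void sketch_power_well_get(struct sketch_power_well *pw)
{
	assert(pw->refcount >= 0);
	if (pw->refcount++ == 0)
		pw->dc6_allowed = false;
}

/* Drop a reference; only the last user going away permits DC6 again. */
static void sketch_power_well_put(struct sketch_power_well *pw)
{
	assert(pw->refcount > 0);
	if (--pw->refcount == 0)
		pw->dc6_allowed = true;
}
```

In this model a suspend path that holds its own reference cannot see DC6 entered underneath it, which is the point being made about not toggling DC6 directly.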

Finally this patch doesn't just do that, but also frobs around a lot in set power well code itself, and I have no idea what it does there and why.
It does smell a bit like you're just breaking dc6 for runtime pm though, which wouldn't be good.

Anyway I decided to just merge this since this patch series has been floating around since forever, but then the patch didn't apply cleanly, so I dropped it.
-Daniel

> 
> v1: Initial version.
> 
> v2: Based on review comment from Daniel,
> - created a separate patch for csr uninitialization set call.
> 
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h        |  2 ++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
>  3 files changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 478101c..fa66162 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1013,10 +1013,20 @@ static int i915_pm_resume(struct device *dev)
>  
>  static int skl_suspend_complete(struct drm_i915_private *dev_priv)
>  {
> +	enum csr_state state;
>  	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
>  
>  	skl_uninit_cdclk(dev_priv);
>  
> +	/* TODO: wait for a completion event or
> +	 * similar here instead of busy
> +	 * waiting using wait_for function.
> +	 */
> +	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
> +			FW_UNINITIALIZED, 1000);
> +	if (state == FW_LOADED)
> +		skl_enable_dc6(dev_priv);
> +
>  	return 0;
>  }
>  
> @@ -1063,6 +1073,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
>  {
>  	struct drm_device *dev = dev_priv->dev;
>  
> +	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
> +		skl_disable_dc6(dev_priv);
> +
>  	skl_init_cdclk(dev_priv);
>  	intel_csr_load_program(dev);
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index 81b7d77..9cb7d4e 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1118,6 +1118,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
>  void bxt_disable_dc9(struct drm_i915_private *dev_priv);
>  void skl_init_cdclk(struct drm_i915_private *dev_priv);
>  void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
> +void skl_enable_dc6(struct drm_i915_private *dev_priv);
> +void skl_disable_dc6(struct drm_i915_private *dev_priv);
>  void intel_dp_get_m_n(struct intel_crtc *crtc,
>  		      struct intel_crtc_state *pipe_config);
>  void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 821644d..23a3aa3 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -548,7 +548,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
>  		"DC6 already programmed to be disabled.\n");
>  }
>  
> -static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> +void skl_enable_dc6(struct drm_i915_private *dev_priv)
>  {
>  	uint32_t val;
>  
> @@ -565,7 +565,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
>  	POSTING_READ(DC_STATE_EN);
>  }
>  
> -static void skl_disable_dc6(struct drm_i915_private *dev_priv)
> +void skl_disable_dc6(struct drm_i915_private *dev_priv)
>  {
>  	uint32_t val;
>  
> @@ -626,10 +626,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  				!I915_READ(HSW_PWR_WELL_BIOS),
>  				"Invalid for power well status to be enabled, unless done by the BIOS, \
>  				when request is to disable!\n");
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> -				power_well->data == SKL_DISP_PW_2) {
> +			if (power_well->data == SKL_DISP_PW_2) {
> +				if (GEN9_ENABLE_DC5(dev))
> +					gen9_disable_dc5(dev_priv);
>  				if (SKL_ENABLE_DC6(dev)) {
> -					skl_disable_dc6(dev_priv);
>  					/*
>  					 * DDI buffer programming unnecessary during driver-load/resume
>  					 * as it's already done during modeset initialization then.
> @@ -637,8 +637,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  					 */
>  					if (!dev_priv->power_domains.initializing)
>  						intel_prepare_ddi(dev);
> -				} else {
> -					gen9_disable_dc5(dev_priv);
>  				}
>  			}
>  			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
> @@ -658,7 +656,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  			POSTING_READ(HSW_PWR_WELL_DRIVER);
>  			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
>  
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> +			if (GEN9_ENABLE_DC5(dev) &&
>  				power_well->data == SKL_DISP_PW_2) {
>  				enum csr_state state;
>  				/* TODO: wait for a completion event or
> @@ -671,10 +669,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  					DRM_ERROR("CSR firmware not ready (%d)\n",
>  							state);
>  				else
> -					if (SKL_ENABLE_DC6(dev))
> -						skl_enable_dc6(dev_priv);
> -					else
> -						gen9_enable_dc5(dev_priv);
> +					gen9_enable_dc5(dev_priv);
>  			}
>  		}
>  	}
> --
> 2.0.2
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-28  6:52                             ` Daniel Vetter
@ 2015-09-28 23:54                               ` Rafael J. Wysocki
  2015-09-29  8:51                                 ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Rafael J. Wysocki @ 2015-09-28 23:54 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On 9/28/2015 8:52 AM, Daniel Vetter wrote:
> On Wed, Sep 23, 2015 at 10:49:36PM +0200, Rafael J. Wysocki wrote:
>> On 9/23/2015 7:17 PM, Daniel Vetter wrote:
>>> acpi_target_system_state() seems to be almost the thing we're looking
>>> for, except that it's only valid in the suspend callbacks since it
>>> gets reset to ACPI_STATE_S0 when resuming. So probably we want
>>> something else ...
>> Right.
>>
>> The idea is to add a way for drivers to check if
>> (a) suspend is going to enter the BIOS
>> (b) resume has been triggered by the BIOS
>> and that's really what drivers need to know.
>>
>> For suspend-to-idle those two will return false and for S3 they'll return
>> true.
>>
>> Would that help?
> Not sure that matches exactly what we'd need here ... Essentially we need
> to know whether we've been in S3/S4 (firmware has been eaten) or in one of
> the higher suspend-to-idle/standby states (firmware still alive, don't
> disturb it). Additional fun that just crossed my mind is that if the
> suspend-to-mem is aborted (some other driver failed) then that function
> should _not_ indicate that we've been in S3. So maybe something like

So it really looks like the interface I was talking about would be suitable:

pm_suspend_via_firmware() == true -> firmware is going to be eaten (use 
that during suspend if needed)
pm_resume_via_firmware() == true -> firmware was eaten

The latter will only return 'true' if we really have entered the BIOS 
(platform firmware).
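
To make the proposal concrete, here is a minimal sketch of how a driver might consume the two predicates. It is an assumption-laden model, not kernel code: `pm_suspend_via_firmware()` and `pm_resume_via_firmware()` are stubbed locally so the example runs in user space, and `struct sketch_dmc`, `sketch_expect_payload_loss()` and `sketch_resume()` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the two proposed predicates. They only return
 * flags set by the caller so the sketch can run outside the kernel. */
static bool suspend_via_firmware;
static bool resume_via_firmware;

static bool pm_suspend_via_firmware(void) { return suspend_via_firmware; }
static bool pm_resume_via_firmware(void) { return resume_via_firmware; }

/* Hypothetical bookkeeping for the DMC firmware payload. */
struct sketch_dmc {
	bool loaded;          /* payload present in the CSR address space */
	unsigned int reloads; /* how often the driver had to reload it */
};

/* Suspend side: tells the driver whether the payload is about to be lost. */
static bool sketch_expect_payload_loss(void)
{
	return pm_suspend_via_firmware();
}

/* Resume side: reload only if the platform firmware was really entered.
 * An aborted suspend-to-mem or a suspend-to-idle cycle leaves the
 * payload alone. */
static void sketch_resume(struct sketch_dmc *dmc)
{
	assert(dmc);
	if (pm_resume_via_firmware()) {
		dmc->loaded = true;
		dmc->reloads++;
	}
}
```

With both flags false (suspend-to-idle) the payload is left untouched; with both true (S3) exactly one reload happens on resume.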

> acpi_source_system_state() which usually is S0 and only when acpi
> successfully went into the suspend state in platform_suspend_ops->enter it
> gets set to the value of acpi_target_system_state. And then reset once the
> resume has completed. I think that would be what we'd want here.

We need two new functions like the above, because some things already
depend on acpi_target_system_state working the way it does currently.

I see no reason to make that ACPI-specific, though, in principle.

> Anyway I'll pull in Animesh series meanwhile, amended with a FIXME comment.

Fine by me.

Thanks,
Rafael


>>> On Wed, Sep 23, 2015 at 6:28 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>> Actually add Rafael this time around ...
>>>> -Daniel
>>>>
>>>> On Wed, Sep 23, 2015 at 6:27 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>> On Wed, Sep 23, 2015 at 9:57 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>> On Thu, Sep 17, 2015 at 12:53:21AM +0530, Animesh Manna wrote:
>>>>>>> On 9/14/2015 1:16 PM, Daniel Vetter wrote:
>>>>>>>> On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
>>>>>>>>> On 9/10/2015 8:15 PM, Daniel Vetter wrote:
>>>>>>>>>> On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
>>>>>>>>>>> On 9/2/2015 2:24 PM, Daniel Vetter wrote:
>>>>>>>>>>>> On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>>>>>>>>>>>>> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>>>>>>>>>>>>>> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>>>>>>>>>>>>>>> Dmc will restore the csr program except DC9, cold boot,
>>>>>>>>>>>>>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> intel_csr_load_program() function is used to load the firmware
>>>>>>>>>>>>>>> data from kernel memory to csr address space.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> All values of csr address space will be zero if it got reset and
>>>>>>>>>>>>>>> the first byte of csr program is always a non-zero if firmware
>>>>>>>>>>>>>>> is loaded successfuly. Based on hardware status will load the
>>>>>>>>>>>>>>> firmware.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Without this condition check if we overwrite the firmware data the
>>>>>>>>>>>>>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>>>>>>>>>>>>> Because of the reason mentioned just above, we need to block loading the firmware again.
>>>>>>>>>>>>> So only WARN_ON will not help.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> v1: Initial version.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> v2: Based on review comments from Daniel,
>>>>>>>>>>>>>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>>>>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>>>>>>>>>>>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>>>>>>>>>>>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>>>>>>>>>>>>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>>>>>>>>>>>>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>>>>>>>>>>>>>   1 file changed, 9 insertions(+)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>>>> index ba1ae03..682cc26 100644
>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>>>>>>>>>>>>>               return;
>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>> +     /*
>>>>>>>>>>>>>>> +      * Dmc will restore the csr the program except DC9, cold boot,
>>>>>>>>>>>>>>> +      * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>>>> +      * This condition will help to check if csr address space is reset/
>>>>>>>>>>>>>>> +      * not loaded.
>>>>>>>>>>>>>>> +      */
>>>>>>>>>>>>>> Atm we call this from driver load and resume, which doesn't seem to cover
>>>>>>>>>>>>>> all the cases you mention in the comment. Should this be a WARN_ON
>>>>>>>>>>>>>> instead? Or do we have troubles in our init sequence where we load too
>>>>>>>>>>>>>> many times?
>>>>>>>>>>>>> Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
>>>>>>>>>>>>> Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
>>>>>>>>>>>>> second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
>>>>>>>>>>>>> and cold boot(not loaded).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
>>>>>>>>>>>>> If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can the below comment more clear to you.
>>>>>>>>>>>>>
>>>>>>>>>>>>>         /*
>>>>>>>>>>>>>          * Dmc will restore the csr the program except DC9, cold boot,
>>>>>>>>>>>>>          * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>>          * If firmware is restored by dmc then no need to load again which
>>>>>>>>>>>>>          * will keep the dc5/dc6 counter exposed by firmware.
>>>>>>>>>>>>>          */
>>>>>>>>>>>>>
>>>>>>>>>>>>> No issue in init sequence.
>>>>>>>>>>>> That seems to still cover all the callers of the function afaics - we do
>>>>>>>>>>>> pci resets over suspend resume unconditionally. So I still don't
>>>>>>>>>>>> understand where exactly we try to load the dmc firmware in i915.ko when
>>>>>>>>>>>> it's already loaded.
>>>>>>>>>>> During resume intel_csr_load_program() will be called from
>>>>>>>>>>> intel_runtime_resume().
>>>>>>>>>>>
>>>>>>>>>>> intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
>>>>>>>>>>>
>>>>>>>>>>> During Pc10 entry testing I can see dmc is restoring back the firmware always,
>>>>>>>>>>> but as you mentioned pci-reset can happen unconditionally, but still then
>>>>>>>>>>> also during resume intel_runtime_resume() will be called and based on
>>>>>>>>>>> register read of csr-base-address firmware loading will happen.
>>>>>>>>>> But in your comment you're saying it won't get restored in case of dc9 and
>>>>>>>>>> suspend. So that seems to mismatch what you're saying here (and what the
>>>>>>>>>> commit message says) and what the code does. And this function here is
>>>>>>>>>> called for resume after suspend/hibernate only.
>>>>>>>>> pc10 entry explanation I told is for skylake. dc9 in skylake is not possible.
>>>>>>>>> I think you are confusing between dc6 and dc9. Pc10 can be achieved by
>>>>>>>>> entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state
>>>>>>>>> for broxton which is not present for skylake.
>>>>>>>> I have no idea at all about different pc levels on skl. What I'm talking
>>>>>>>> about is system suspend/resume and driver load, which are the places this
>>>>>>>> function gets called. At least afaics.
>>>>>>>>
>>>>>>>>> Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
>>>>>>>>> execution flow will be different in case of suspend/resume which I think is confusing
>>>>>>>>> you.
>>>>>>>> That seems like really important information. What's different on bxt?
>>>>>>>> These are the kind of details you should explain in the commit message ...
>>>>>>>>
>>>>>>>>> I am ready explain you in detail. It will be good if we discuss specific use-case scenario
>>>>>>>>> and itz software design for specific platform. Another point - as dmc related code for
>>>>>>>>> broxton is not merged better first we close design for skylake. Now, I have added dc9
>>>>>>>>> description in comment thinking of future. If you want I can remove for now and later
>>>>>>>>> can add in bxt patch series for enabling dmc. Will wait for your reply.
>>>>>>>> This question here isn't about the overall design and how to handle power
>>>>>>>> wells in skl/bxt. That's a separate discussion and tracked somewhere else.
>>>>>>>> I'm really just confused about when exactly we need to reload to firmware,
>>>>>>>> and why we need a runtime check for that. Normally we should know when to
>>>>>>>> reload the firmware and just either reload or not, without checking hw
>>>>>>>> state. And I don't like checking for hw state since at least in the past
>>>>>>>> that kind of code ended up being fragile - it's an illusion that it does
>>>>>>>> the right thing no matter what, since often there's other tricky ordering
>>>>>>>> constraints. And if you have automatic duct-tape like then no one will
>>>>>>>> ever spot those other, harder to spot issues, until an expensive customer
>>>>>>>> escalation happens.
>>>>>>>>
>>>>>>>> So what I want to know here is:
>>>>>>>> - When exactly do we need to reload dmc firmware.
>>>>>>> In skl, during driver load first time we load the firmware, during normal
>>>>>>> suspend-resume (dc6 entry/exit)
>>>>>>> no need to reload the firmware again as dmc will take care of it. But during
>>>>>>> suspend/hibernation
>>>>>>> dmc will not restore the firmware. In that case driver need to reload it
>>>>>>> again. I do not know
>>>>>>> how to differentiate pm-suspend and suspend-hibernation and thought both the
>>>>>>> cases
>>>>>>> intel_runtime_resume() will be called where we can check the h/w state and
>>>>>>> reload the
>>>>>>> firmware if dmc is not restored.
>>>>>>>
>>>>>>> In bxt, during driver load first time we load the firmware, during normal
>>>>>>> suspend-resume
>>>>>>> display engine will enter into dc9 and dmc will not restore the firmware. So
>>>>>>> every
>>>>>>> suspend-resume we need to reload the firmware.
>>>>>>>> - What exactly is the reason why we can't make that decision statically in
>>>>>>>>    the code (by calling csr_load at the right spots).
>>>>>>> As I mentioned before in case of skylake can we differentiate between
>>>>>>> "resume from pm-suspend" with "resume from suspend-hibernation" inside
>>>>>>> driver?
>>>>>>>
>>>>>>> In case of broxton, every time we need to reload, so we can decide
>>>>>>> statically.
>>>>>> Of course we can differentiate between all the different resume paths, and
>>>>>> we also have a per-platform split to take care of bxt vs. skl. And there
>>>>>> are actually 3 different resume paths:
>>>>>>
>>>>>> - runtime PM resume. This calls the runtime_resume hook. It sounds like on
>>>>>>    skl we should _not_ load the csr firmware, but on bxt we should load it.
>>>>>>    This can be fixed by removing the intel_csr_load_program call from
>>>>>>    skl_resume_prepare.
>>>>>> - resume from hibernate-to-disk (i.e. system completely off, state stored
>>>>>>    on the swap partition) is done by calling the thaw callbacks.
>>>>>> - resume from suspend-to-mem (i.e. system in low-power with only memory
>>>>>>    in self-refresh, all state stored in memory) is done by calling the
>>>>>>    resume callbacks.
>>>>>>
>>>>>> For i915 we use unified handlers in our dev_pm_ops for both thaw and
>>>>>> resume, but it sounds like that won't be a problem for skl/bxt since we
>>>>>> need to reload the csr firmware in all cases. Although I'm not perfectly
>>>>>> sure since you don't explain what kind of resume you mean exactly (since
>>>>>> you don't use the linux names for them).
>>>>>>
>>>>>> Anyway it sounds like we can replace this patch by one where we remove
>>>>>> that erroneous csr load call from skl runtime pm resume and that's all.
>>>>>> But I suggest to make sure we get this right we keep the check you're
>>>>>> adding here, but wrap it in a WARN_ON. Then we'll get a backtrace when
>>>>>> this is going wrong again. Like this:
>>>>>>
>>>>>>          if (WARN_ON(csr_loaded_already()))
>>>>>>                  return;
>>>>>>
>>>>>> Also when redoing the commits please explain in detail what exactly are
>>>>>> the requirements like you've done above, but please use the standard linux
>>>>>> names, i.e. "runtime PM" and "hibernate-to-disk" and "suspend-to-mem".
>>>>> Ok hooray there's more suspend-to-something things I've totally missed:
>>>>> - suspend-to-idle (done by cat freeze > /sys/power/state) and
>>>>> - suspend (done by cat suspend > /sys/power/state)
>>>>>
>>>>> And apparently there's really no way for drivers to tell them apart.
>>>>> Rafael, is there really no way for drivers to take different paths for
>>>>> these 3 suspend cases? I tried grepping for PM_SUSPEND_ON/STANDBY/MEM
>>>>> and didn't spot anything.
>>>>>
>>>>> Also we're completely missing test coverage for that in igt. That is
>>>>> something that needs to be fixed asap (yet another case of
>>>>> combinatorial explosion in igt tests, yay). And at least one of those
>>>>> suspend-to-idle testcase better be in the BAT.
>>>>> -Daniel
>>>>> --
>>>>> Daniel Vetter
>>>>> Software Engineer, Intel Corporation
>>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>>>
>>>> --
>>>> Daniel Vetter
>>>> Software Engineer, Intel Corporation
>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>>


^ permalink raw reply	[flat|nested] 51+ messages in thread

* [DMC_BUGFIX_V3] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-28  7:21   ` Daniel Vetter
  2015-09-28 18:49     ` Hindman, Gavin
@ 2015-09-29  5:31     ` Animesh Manna
  2015-10-16 12:22       ` Imre Deak
  2015-09-29  5:38     ` [DMC_BUGFIX_SKL_V2 3/5] " Animesh Manna
  2 siblings, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-09-29  5:31 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rajneesh Bhardwaj, Daniel Vetter

According to bspec (bspec-id 0527), MMIO register access is not allowed
after DC5/DC6 entry when DC6 power states are enabled, so enable DC6 as
the last call in the suspend flow.
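
The ordering constraint can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the driver code: every `sketch_*` name is hypothetical, register writes are reduced to a counter, and the firmware-load wait from the patch is replaced by a plain `fw_loaded` flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; register access is reduced to bookkeeping. */
struct sketch_dev {
	bool fw_loaded;            /* stands in for the FW_LOADED state */
	bool dc6_enabled;
	unsigned int illegal_mmio; /* accesses made while DC6 was enabled */
};

/* Every modelled register access is checked against the DC6 state. */
static void sketch_mmio_write(struct sketch_dev *d)
{
	if (d->dc6_enabled)
		d->illegal_mmio++;
}

static void sketch_uninit_cdclk(struct sketch_dev *d)
{
	sketch_mmio_write(d);
}

/* Programming the DC6 enable is itself the final register write. */
static void sketch_enable_dc6(struct sketch_dev *d)
{
	sketch_mmio_write(d);
	d->dc6_enabled = true;
}

/* Suspend tail: all other register work first, DC6 enable last. */
static int sketch_suspend_complete(struct sketch_dev *d)
{
	assert(d);
	sketch_uninit_cdclk(d);
	if (d->fw_loaded)
		sketch_enable_dc6(d);
	return 0;
}
```

Swapping the two calls in `sketch_suspend_complete()` would make the counter non-zero, which is the class of violation the reordering is meant to avoid.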

v1: Initial version.

v2: Based on review comment from Daniel,
- created a separate patch for csr uninitialization set call.

v3: Rebased on top of latest code.

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
 drivers/gpu/drm/i915/intel_drv.h        |  2 ++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1cb6b82..51075d5 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1049,10 +1049,20 @@ static int i915_pm_resume(struct device *dev)
 
 static int skl_suspend_complete(struct drm_i915_private *dev_priv)
 {
+	enum csr_state state;
 	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
 
 	skl_uninit_cdclk(dev_priv);
 
+	/* TODO: wait for a completion event or
+	 * similar here instead of busy
+	 * waiting using wait_for function.
+	 */
+	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
+			FW_UNINITIALIZED, 1000);
+	if (state == FW_LOADED)
+		skl_enable_dc6(dev_priv);
+
 	return 0;
 }
 
@@ -1099,6 +1109,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
 {
 	struct drm_device *dev = dev_priv->dev;
 
+	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
+		skl_disable_dc6(dev_priv);
+
 	skl_init_cdclk(dev_priv);
 	intel_csr_load_program(dev);
 
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index c96289d..990161d 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1143,6 +1143,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
 void bxt_disable_dc9(struct drm_i915_private *dev_priv);
 void skl_init_cdclk(struct drm_i915_private *dev_priv);
 void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
+void skl_enable_dc6(struct drm_i915_private *dev_priv);
+void skl_disable_dc6(struct drm_i915_private *dev_priv);
 void intel_dp_get_m_n(struct intel_crtc *crtc,
 		      struct intel_crtc_state *pipe_config);
 void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index d8e9416..d6b4f61 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -551,7 +551,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
 		  "DC6 already programmed to be disabled.\n");
 }
 
-static void skl_enable_dc6(struct drm_i915_private *dev_priv)
+void skl_enable_dc6(struct drm_i915_private *dev_priv)
 {
 	uint32_t val;
 
@@ -568,7 +568,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
 	POSTING_READ(DC_STATE_EN);
 }
 
-static void skl_disable_dc6(struct drm_i915_private *dev_priv)
+void skl_disable_dc6(struct drm_i915_private *dev_priv)
 {
 	uint32_t val;
 
@@ -629,10 +629,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 				!I915_READ(HSW_PWR_WELL_BIOS),
 				"Invalid for power well status to be enabled, unless done by the BIOS, \
 				when request is to disable!\n");
-			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
-				power_well->data == SKL_DISP_PW_2) {
+			if (power_well->data == SKL_DISP_PW_2) {
+				if (GEN9_ENABLE_DC5(dev))
+					gen9_disable_dc5(dev_priv);
 				if (SKL_ENABLE_DC6(dev)) {
-					skl_disable_dc6(dev_priv);
 					/*
 					 * DDI buffer programming unnecessary during driver-load/resume
 					 * as it's already done during modeset initialization then.
@@ -640,8 +640,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 					 */
 					if (!dev_priv->power_domains.initializing)
 						intel_prepare_ddi(dev);
-				} else {
-					gen9_disable_dc5(dev_priv);
 				}
 			}
 			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
@@ -667,7 +665,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 				DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
 			}
 
-			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
+			if (GEN9_ENABLE_DC5(dev) &&
 				power_well->data == SKL_DISP_PW_2) {
 				enum csr_state state;
 				/* TODO: wait for a completion event or
@@ -680,10 +678,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
 					DRM_DEBUG("CSR firmware not ready (%d)\n",
 							state);
 				else
-					if (SKL_ENABLE_DC6(dev))
-						skl_enable_dc6(dev_priv);
-					else
-						gen9_enable_dc5(dev_priv);
+					gen9_enable_dc5(dev_priv);
 			}
 		}
 	}
-- 
2.0.2

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-28  7:21   ` Daniel Vetter
  2015-09-28 18:49     ` Hindman, Gavin
  2015-09-29  5:31     ` [DMC_BUGFIX_V3] " Animesh Manna
@ 2015-09-29  5:38     ` Animesh Manna
  2015-09-29  9:01       ` Daniel Vetter
  2 siblings, 1 reply; 51+ messages in thread
From: Animesh Manna @ 2015-09-29  5:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter



On 09/28/2015 12:51 PM, Daniel Vetter wrote:
> On Wed, Aug 26, 2015 at 01:36:07AM +0530, Animesh Manna wrote:
>> Mmio register access after dc6/dc5 entry is not allowed when
>> DC6 power states are enabled according to bspec (bspec-id 0527),
>> so enabling dc6 as the last call in suspend flow.
> We unconditionally grab a power well reference for everything in our
> suspend/resume functions, which means dc6/5 should be prevented for long
> enough.
>
> Also since dc6/5 are part of the power well framework you can't just
> enable/disable them directly, instead you need to get/put the
> corresponding power wells.
>
> Finally this patch doesn't just do that, but also frobs around a lot in
> set power well code itself, and I have no idea what it does there and why.
> It does smell a bit like you're just breaking dc6 for runtime pm though,
> which wouldn't be good.
>
> Anyway I decided to just merge this since this patch series has been
> floating around since forever, but then the patch didn't apply cleanly and
> so dropped it.

I mentioned in my commit message that we have a h/w workaround.
"Mmio register access after dc6/dc5 entry is not allowed when
DC6 power states are enabled"

This patch is mandatory to solve the pc10 entry issue for skylake.
I plan to resend it after rebasing on top of the current tree.

-Animesh
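
A minimal sketch of the ordering constraint described above, written as a
userspace model rather than driver code. The struct display_model and the
model_* helpers are hypothetical names used only for illustration; they do
not exist in i915. The model counts register accesses that happen after DC6
has been enabled, which is what making DC6 entry the last call is meant to
avoid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the display state; not an i915 structure. */
struct display_model {
	bool dc6_enabled;
	int illegal_mmio;	/* register accesses made while DC6 was enabled */
};

/* Stands in for any MMIO access made by the suspend path. */
void model_mmio_write(struct display_model *d)
{
	if (d->dc6_enabled)
		d->illegal_mmio++;
}

/* Stands in for the final DC6 enable; nothing may touch MMIO afterwards. */
void model_enable_dc6(struct display_model *d)
{
	d->dc6_enabled = true;
}

/* Suspend sequence with DC6 entry as the final step. */
void model_suspend(struct display_model *d)
{
	assert(d != NULL);
	model_mmio_write(d);	/* e.g. the work done by skl_uninit_cdclk() */
	model_enable_dc6(d);	/* last call in the suspend flow */
}
```

In this model any register access issued after model_suspend() returns is
counted as illegal, so reordering the two calls inside model_suspend() would
show up as a non-zero counter.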

> -Daniel
>
>> v1: Initial version.
>>
>> v2: Based on review comment from Daniel,
>> - created a separate patch for csr uninitialization set call.
>>
>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>> Cc: Imre Deak <imre.deak@intel.com>
>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
>>   drivers/gpu/drm/i915/intel_drv.h        |  2 ++
>>   drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
>>   3 files changed, 22 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index 478101c..fa66162 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -1013,10 +1013,20 @@ static int i915_pm_resume(struct device *dev)
>>   
>>   static int skl_suspend_complete(struct drm_i915_private *dev_priv)
>>   {
>> +	enum csr_state state;
>>   	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
>>   
>>   	skl_uninit_cdclk(dev_priv);
>>   
>> +	/* TODO: wait for a completion event or
>> +	 * similar here instead of busy
>> +	 * waiting using wait_for function.
>> +	 */
>> +	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
>> +			FW_UNINITIALIZED, 1000);
>> +	if (state == FW_LOADED)
>> +		skl_enable_dc6(dev_priv);
>> +
>>   	return 0;
>>   }
>>   
>> @@ -1063,6 +1073,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
>>   {
>>   	struct drm_device *dev = dev_priv->dev;
>>   
>> +	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
>> +		skl_disable_dc6(dev_priv);
>> +
>>   	skl_init_cdclk(dev_priv);
>>   	intel_csr_load_program(dev);
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
>> index 81b7d77..9cb7d4e 100644
>> --- a/drivers/gpu/drm/i915/intel_drv.h
>> +++ b/drivers/gpu/drm/i915/intel_drv.h
>> @@ -1118,6 +1118,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
>>   void bxt_disable_dc9(struct drm_i915_private *dev_priv);
>>   void skl_init_cdclk(struct drm_i915_private *dev_priv);
>>   void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
>> +void skl_enable_dc6(struct drm_i915_private *dev_priv);
>> +void skl_disable_dc6(struct drm_i915_private *dev_priv);
>>   void intel_dp_get_m_n(struct intel_crtc *crtc,
>>   		      struct intel_crtc_state *pipe_config);
>>   void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> index 821644d..23a3aa3 100644
>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> @@ -548,7 +548,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
>>   		"DC6 already programmed to be disabled.\n");
>>   }
>>   
>> -static void skl_enable_dc6(struct drm_i915_private *dev_priv)
>> +void skl_enable_dc6(struct drm_i915_private *dev_priv)
>>   {
>>   	uint32_t val;
>>   
>> @@ -565,7 +565,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
>>   	POSTING_READ(DC_STATE_EN);
>>   }
>>   
>> -static void skl_disable_dc6(struct drm_i915_private *dev_priv)
>> +void skl_disable_dc6(struct drm_i915_private *dev_priv)
>>   {
>>   	uint32_t val;
>>   
>> @@ -626,10 +626,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>>   				!I915_READ(HSW_PWR_WELL_BIOS),
>>   				"Invalid for power well status to be enabled, unless done by the BIOS, \
>>   				when request is to disable!\n");
>> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
>> -				power_well->data == SKL_DISP_PW_2) {
>> +			if (power_well->data == SKL_DISP_PW_2) {
>> +				if (GEN9_ENABLE_DC5(dev))
>> +					gen9_disable_dc5(dev_priv);
>>   				if (SKL_ENABLE_DC6(dev)) {
>> -					skl_disable_dc6(dev_priv);
>>   					/*
>>   					 * DDI buffer programming unnecessary during driver-load/resume
>>   					 * as it's already done during modeset initialization then.
>> @@ -637,8 +637,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>>   					 */
>>   					if (!dev_priv->power_domains.initializing)
>>   						intel_prepare_ddi(dev);
>> -				} else {
>> -					gen9_disable_dc5(dev_priv);
>>   				}
>>   			}
>>   			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
>> @@ -658,7 +656,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>>   			POSTING_READ(HSW_PWR_WELL_DRIVER);
>>   			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
>>   
>> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
>> +			if (GEN9_ENABLE_DC5(dev) &&
>>   				power_well->data == SKL_DISP_PW_2) {
>>   				enum csr_state state;
>>   				/* TODO: wait for a completion event or
>> @@ -671,10 +669,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>>   					DRM_ERROR("CSR firmware not ready (%d)\n",
>>   							state);
>>   				else
>> -					if (SKL_ENABLE_DC6(dev))
>> -						skl_enable_dc6(dev_priv);
>> -					else
>> -						gen9_enable_dc5(dev_priv);
>> +					gen9_enable_dc5(dev_priv);
>>   			}
>>   		}
>>   	}
>> -- 
>> 2.0.2
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-28 23:54                               ` Rafael J. Wysocki
@ 2015-09-29  8:51                                 ` Daniel Vetter
  2015-09-30  0:50                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-29  8:51 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Daniel Vetter, intel-gfx

On Tue, Sep 29, 2015 at 01:54:53AM +0200, Rafael J. Wysocki wrote:
> On 9/28/2015 8:52 AM, Daniel Vetter wrote:
> >On Wed, Sep 23, 2015 at 10:49:36PM +0200, Rafael J. Wysocki wrote:
> >>On 9/23/2015 7:17 PM, Daniel Vetter wrote:
> >>>acpi_target_system_state() seems to be almost the thing we're looking
> >>>for, except that it's only valid in the suspend callbacks since it
> >>>gets reset to ACPI_STATE_S0 when resuming. So probably we want
> >>>something else ...
> >>Right.
> >>
> >>The idea is to add a way for drivers to check if
> >>(a) suspend is going to enter the BIOS
> >>(b) resume has been triggered by the BIOS
> >>and that's really what drivers need to know.
> >>
> >>For suspend-to-idle those two will return false and for S3 they'll return
> >>true.
> >>
> >>Would that help?
> >Not sure that matches exactly what we'd need here ... Essentially we need
> >to know whether we've been in S3/S4 (firmware has been eaten) or in one of
> >the higher suspend-to-idle/standby states (firmware still alive, don't
> >disturb it). Additional fun that just crossed my mind is that if the
> >suspend-to-mem is aborted (some other driver failed) then that function
> >should _not_ indicate that we've been in S3. So maybe something like
> 
> So it really looks like the interface I was talking about would be suitable:
> 
> pm_suspend_via_firmware() == true -> firmware is going to be eaten (use that
> during suspend if needed)
> pm_resume_via_firmware() == true -> firmware was eaten
> 
> The latter will only return 'true' if we really have entered the BIOS
> (platform firmware).

Yeah that seems to fit the bill. We already have a check in our suspend
paths to figure out whether we'll suspend to idle or go into full S3, so
i915 could use them both. And making them generic would make sense I
guess.
-Daniel
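
A minimal sketch of the semantics discussed above, written as a userspace
model and not as kernel code. The struct pm_model, the pm_model_* helpers and
csr_needs_reload() are hypothetical names; the sketch only assumes the
behaviour described in this thread: the "suspend via firmware" flag is set
when the transition targets S3, and the "resume via firmware" flag is set
only if the platform firmware was really entered, so an aborted suspend does
not report a firmware resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sleep targets for the model. */
enum sleep_kind { SLEEP_TO_IDLE, SLEEP_S3 };

/* Hypothetical model of the two proposed flags. */
struct pm_model {
	bool suspend_via_firmware;	/* firmware is going to be eaten */
	bool resume_via_firmware;	/* firmware was eaten */
};

/* Called when a system sleep transition starts. */
void pm_model_begin(struct pm_model *pm, enum sleep_kind kind)
{
	assert(pm != NULL);
	pm->suspend_via_firmware = (kind == SLEEP_S3);
	pm->resume_via_firmware = false;
}

/* Called only when the platform firmware (BIOS) was actually entered. */
void pm_model_firmware_entered(struct pm_model *pm)
{
	assert(pm != NULL);
	pm->resume_via_firmware = true;
}

/* Driver-side decision: reload the CSR firmware only if it was lost. */
bool csr_needs_reload(const struct pm_model *pm)
{
	assert(pm != NULL);
	return pm->resume_via_firmware;
}
```

With this split a driver can decide statically in its resume path whether to
reload, instead of probing hardware state.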

> >acpi_source_system_state() which usually is S0 and only when acpi
> >successfully went into the suspend state in platform_suspend_ops->enter it
> >gets set to the value of acpi_target_system_state. And then reset once the
> >resume has completed. I think that would be what we'd want here.
> 
> We need two new functions like the above, because some things already depend
> on acpi_target_system_state working the way it does currently.
> 
> I see no reason to make that ACPI-specific, though, in principle.
> 
> >Anyway I'll pull in Animesh series meanwhile, amended with a FIXME comment.
> 
> Fine by me.
> 
> Thanks,
> Rafael
> 
> 
> >>>On Wed, Sep 23, 2015 at 6:28 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>Actually add Rafael this time around ...
> >>>>-Daniel
> >>>>
> >>>>On Wed, Sep 23, 2015 at 6:27 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>>On Wed, Sep 23, 2015 at 9:57 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> >>>>>>On Thu, Sep 17, 2015 at 12:53:21AM +0530, Animesh Manna wrote:
> >>>>>>>On 9/14/2015 1:16 PM, Daniel Vetter wrote:
> >>>>>>>>On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
> >>>>>>>>>On 9/10/2015 8:15 PM, Daniel Vetter wrote:
> >>>>>>>>>>On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
> >>>>>>>>>>>On 9/2/2015 2:24 PM, Daniel Vetter wrote:
> >>>>>>>>>>>>On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
> >>>>>>>>>>>>>On 8/26/2015 6:40 PM, Daniel Vetter wrote:
> >>>>>>>>>>>>>>On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
> >>>>>>>>>>>>>>>Dmc will restore the csr program except DC9, cold boot,
> >>>>>>>>>>>>>>>warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>intel_csr_load_program() function is used to load the firmware
> >>>>>>>>>>>>>>>data from kernel memory to csr address space.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>All values of csr address space will be zero if it got reset and
> >>>>>>>>>>>>>>>the first byte of csr program is always a non-zero if firmware
> >>>>>>>>>>>>>>>is loaded successfully. Based on hardware status will load the
> >>>>>>>>>>>>>>>firmware.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>Without this condition check if we overwrite the firmware data the
> >>>>>>>>>>>>>>>counters exposed for dc5/dc6 (help for debugging) will be nullified.
> >>>>>>>>>>>>>Because of the reason mentioned just above, we need to block the firmware from being loaded again.
> >>>>>>>>>>>>>So a WARN_ON alone will not help.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>>v1: Initial version.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>v2: Based on review comments from Daniel,
> >>>>>>>>>>>>>>>- Added a check to know hardware status and load the firmware if not loaded.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>Cc: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>>>>>>>>>>Cc: Damien Lespiau <damien.lespiau@intel.com>
> >>>>>>>>>>>>>>>Cc: Imre Deak <imre.deak@intel.com>
> >>>>>>>>>>>>>>>Cc: Sunil Kamath <sunil.kamath@intel.com>
> >>>>>>>>>>>>>>>Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >>>>>>>>>>>>>>>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >>>>>>>>>>>>>>>---
> >>>>>>>>>>>>>>>  drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
> >>>>>>>>>>>>>>>  1 file changed, 9 insertions(+)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>>>>>>>>>>index ba1ae03..682cc26 100644
> >>>>>>>>>>>>>>>--- a/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>>>>>>>>>>+++ b/drivers/gpu/drm/i915/intel_csr.c
> >>>>>>>>>>>>>>>@@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
> >>>>>>>>>>>>>>>              return;
> >>>>>>>>>>>>>>>      }
> >>>>>>>>>>>>>>>+     /*
> >>>>>>>>>>>>>>>+      * Dmc will restore the csr the program except DC9, cold boot,
> >>>>>>>>>>>>>>>+      * warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>>>>>>>>>>+      * This condition will help to check if csr address space is reset/
> >>>>>>>>>>>>>>>+      * not loaded.
> >>>>>>>>>>>>>>>+      */
> >>>>>>>>>>>>>>Atm we call this from driver load and resume, which doesn't seem to cover
> >>>>>>>>>>>>>>all the cases you mention in the comment. Should this be a WARN_ON
> >>>>>>>>>>>>>>instead? Or do we have troubles in our init sequence where we load too
> >>>>>>>>>>>>>>many times?
> >>>>>>>>>>>>>Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
> >>>>>>>>>>>>>Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
> >>>>>>>>>>>>>second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
> >>>>>>>>>>>>>and cold boot(not loaded).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
> >>>>>>>>>>>>>If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>Is the comment below clearer to you?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>        /*
> >>>>>>>>>>>>>         * Dmc will restore the csr the program except DC9, cold boot,
> >>>>>>>>>>>>>         * warm reset, PCI function level reset, and hibernate/suspend.
> >>>>>>>>>>>>>         * If firmware is restored by dmc then no need to load again which
> >>>>>>>>>>>>>         * will keep the dc5/dc6 counter exposed by firmware.
> >>>>>>>>>>>>>         */
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>No issue in init sequence.
> >>>>>>>>>>>>That seems to still cover all the callers of the function afaics - we do
> >>>>>>>>>>>>pci resets over suspend resume unconditionally. So I still don't
> >>>>>>>>>>>>understand where exactly we try to load the dmc firmware in i915.ko when
> >>>>>>>>>>>>it's already loaded.
> >>>>>>>>>>>During resume intel_csr_load_program() will be called from
> >>>>>>>>>>>intel_runtime_resume().
> >>>>>>>>>>>
> >>>>>>>>>>>intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
> >>>>>>>>>>>
> >>>>>>>>>>>During Pc10 entry testing I can see dmc is restoring back the firmware always,
> >>>>>>>>>>>but as you mentioned pci-reset can happen unconditionally, but still then
> >>>>>>>>>>>also during resume intel_runtime_resume() will be called and based on
> >>>>>>>>>>>register read of csr-base-address firmware loading will happen.
> >>>>>>>>>>But in your comment you're saying it won't get restored in case of dc9 and
> >>>>>>>>>>suspend. So that seems to mismatch what you're saying here (and what the
> >>>>>>>>>>commit message says) and what the code does. And this function here is
> >>>>>>>>>>called for resume after suspend/hibernate only.
> >>>>>>>>>pc10 entry explanation I told is for skylake. dc9 in skylake is not possible.
> >>>>>>>>>I think you are confusing between dc6 and dc9. Pc10 can be achieved by
> >>>>>>>>>entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state
> >>>>>>>>>for broxton which is not present for skylake.
> >>>>>>>>I have no idea at all about different pc levels on skl. What I'm talking
> >>>>>>>>about is system suspend/resume and driver load, which are the places this
> >>>>>>>>function gets called. At least afaics.
> >>>>>>>>
> >>>>>>>>>Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
> >>>>>>>>>execution flow will be different in case of suspend/resume which I think is confusing
> >>>>>>>>>you.
> >>>>>>>>That seems like really important information. What's different on bxt?
> >>>>>>>>These are the kind of details you should explain in the commit message ...
> >>>>>>>>
> >>>>>>>>>I am ready to explain in detail. It will be good if we discuss a specific use-case scenario
> >>>>>>>>>and its software design for a specific platform. Another point - as dmc related code for
> >>>>>>>>>broxton is not merged better first we close design for skylake. Now, I have added dc9
> >>>>>>>>>description in comment thinking of future. If you want I can remove for now and later
> >>>>>>>>>can add in bxt patch series for enabling dmc. Will wait for your reply.
> >>>>>>>>This question here isn't about the overall design and how to handle power
> >>>>>>>>wells in skl/bxt. That's a separate discussion and tracked somewhere else.
> >>>>>>>>I'm really just confused about when exactly we need to reload to firmware,
> >>>>>>>>and why we need a runtime check for that. Normally we should know when to
> >>>>>>>>reload the firmware and just either reload or not, without checking hw
> >>>>>>>>state. And I don't like checking for hw state since at least in the past
> >>>>>>>>that kind of code ended up being fragile - it's an illusion that it does
> >>>>>>>>the right thing no matter what, since often there's other tricky ordering
> >>>>>>>>constraints. And if you have automatic duct-tape like then no one will
> >>>>>>>>ever spot those other, harder to spot issues, until an expensive customer
> >>>>>>>>escalation happens.
> >>>>>>>>
> >>>>>>>>So what I want to know here is:
> >>>>>>>>- When exactly do we need to reload dmc firmware.
> >>>>>>>In skl, we load the firmware the first time during driver load. During
> >>>>>>>normal suspend-resume (dc6 entry/exit) there is no need to reload the
> >>>>>>>firmware, as dmc will take care of it. But during suspend/hibernation
> >>>>>>>dmc will not restore the firmware, so the driver needs to reload it.
> >>>>>>>I do not know how to differentiate pm-suspend and suspend-hibernation,
> >>>>>>>and thought that in both cases intel_runtime_resume() will be called,
> >>>>>>>where we can check the h/w state and reload the firmware if dmc has
> >>>>>>>not restored it.
> >>>>>>>
> >>>>>>>In bxt, we load the firmware the first time during driver load. During
> >>>>>>>normal suspend-resume the display engine will enter dc9 and dmc will
> >>>>>>>not restore the firmware, so we need to reload the firmware on every
> >>>>>>>suspend-resume.
> >>>>>>>>- What exactly is the reason why we can't make that decision statically in
> >>>>>>>>   the code (by calling csr_load at the right spots).
> >>>>>>>As I mentioned before, in the case of skylake, can we differentiate
> >>>>>>>between "resume from pm-suspend" and "resume from suspend-hibernation"
> >>>>>>>inside the driver?
> >>>>>>>
> >>>>>>>In the case of broxton we need to reload every time, so we can decide
> >>>>>>>statically.
> >>>>>>Of course we can differentiate between all the different resume paths, and
> >>>>>>we also have a per-platform split to take care of bxt vs. skl. And there
> >>>>>>are actually 3 different resume paths:
> >>>>>>
> >>>>>>- runtime PM resume. This calls the runtime_resume hook. It sounds like on
> >>>>>>   skl we should _not_ load the csr firmware, but on bxt we should load it.
> >>>>>>   This can be fixed by removing the intel_csr_load_program call from
> >>>>>>   skl_resume_prepare.
> >>>>>>- resume from hibernate-to-disk (i.e. system completely off, state stored
> >>>>>>   on the swap partition) is done by calling the thaw callbacks.
> >>>>>>- resume from suspend-to-mem (i.e. system in low-power with only memory
> >>>>>>   in self-refresh, all state stored in memory) is done by calling the
> >>>>>>   resume callbacks.
> >>>>>>
> >>>>>>For i915 we use unified handlers in our dev_pm_ops for both thaw and
> >>>>>>resume, but it sounds like that won't be a problem for skl/bxt since we
> >>>>>>need to reload the csr firmware in all cases. Although I'm not perfectly
> >>>>>>sure since you don't explain what kind of resume you mean exactly (since
> >>>>>>you don't use the linux names for them).
> >>>>>>
> >>>>>>Anyway it sounds like we can replace this patch by one where we remove
> >>>>>>that erroneous csr load call from skl runtime pm resume and that's all.
> >>>>>>But I suggest to make sure we get this right we keep the check you're
> >>>>>>adding here, but wrap it in a WARN_ON. Then we'll get a backtrace when
> >>>>>>this is going wrong again. Like this:
> >>>>>>
> >>>>>>         if (WARN_ON(csr_loaded_already()))
> >>>>>>                 return;
> >>>>>>
> >>>>>>Also when redoing the commits please explain in detail what exactly are
> >>>>>>the requirements like you've done above, but please use the standard linux
> >>>>>>names, i.e. "runtime PM" and "hibernate-to-disk" and "suspend-to-mem".
> >>>>>Ok hooray there's more suspend-to-something things I've totally missed:
> >>>>>- suspend-to-idle (done by echo freeze > /sys/power/state) and
> >>>>>- standby (done by echo standby > /sys/power/state)
> >>>>>
> >>>>>And apparently there's really no way to drivers to tell them apart.
> >>>>>Rafael, is there really no way for drivers to take different paths for
> >>>>>these 3 suspend cases? I tried grepping for PM_SUSPEND_ON/STANDBY/MEM
> >>>>>and didn't spot anything.
> >>>>>
> >>>>>Also we're completely missing test coverage for that in igt. That is
> >>>>>something that needs to be fixed asap (yet another case of
> >>>>>combinatorial explosion in igt tests, yay). And at least one of those
> >>>>>suspend-to-idle testcase better be in the BAT.
> >>>>>-Daniel
> >>>>>--
> >>>>>Daniel Vetter
> >>>>>Software Engineer, Intel Corporation
> >>>>>+41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >>>>
> >>>>--
> >>>>Daniel Vetter
> >>>>Software Engineer, Intel Corporation
> >>>>+41 (0) 79 365 57 48 - http://blog.ffwll.ch
> >>>
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-29  5:38     ` [DMC_BUGFIX_SKL_V2 3/5] " Animesh Manna
@ 2015-09-29  9:01       ` Daniel Vetter
  2015-09-29 12:35         ` Patrik Jakobsson
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-29  9:01 UTC (permalink / raw)
  To: Animesh Manna; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter

On Tue, Sep 29, 2015 at 11:08:25AM +0530, Animesh Manna wrote:
> 
> 
> On 09/28/2015 12:51 PM, Daniel Vetter wrote:
> >On Wed, Aug 26, 2015 at 01:36:07AM +0530, Animesh Manna wrote:
> >>Mmio register access after dc6/dc5 entry is not allowed when
> >>DC6 power states are enabled according to bspec (bspec-id 0527),
> >>so enabling dc6 as the last call in suspend flow.
> >We unconditionally grab a power well reference for everything in our
> >suspend/resume functions, which means dc6/5 should be prevented for long
> >enough.
> >
> >Also since dc6/5 are part of the power well framework you can't just
> >enable/disable them directly, instead you need to get/put the
> >corresponding power wells.
> >
> >Finally this patch doesn't just do that, but also frobs around a lot in
> >set power well code itself, and I have no idea what it does there and why.
> >It does smell a bit like you're just breaking dc6 for runtime pm though,
> >which wouldn't be good.
> >
> >Anyway I decided to just merge this since this patch series has been
> >floating around since forever, but then the patch didn't apply cleanly and
> >so dropped it.
> 
> I mentioned in my commit message that we have a h/w workaround.
> "Mmio register access after dc6/dc5 entry is not allowed when
> DC6 power states are enabled"
> 
> This patch is mandatory to solve the pc10 entry issue for skylake.
> Planning to send again after rebase on top of tree.

The problem isn't that the patch didn't apply cleanly, but that it seems
to have fundamental issues:
- We already grab power well references around suspend/resume code, see
  the call to intel_power_domains_init_hw and the calls to
  intel_display_set_init_power. That code is supposed to ensure that while
  suspend/resume is going on _all_ power wells are on (and hence dc6/5 are
  forbidden). It's a bit of a big hammer, but we already have the code to do
  exactly what you're trying to do here. It could be though that for skl
  it's in the wrong order (but then the commit message must state very
  clearly what the exact dependency is; the gpu is assembled from a pile of
  various different blocks which are all mostly independent).

- Your patch directly calls skl_enable/disable_dc6 instead of going
  through the power well abstraction. Breaking this abstraction needs a
  really good reason, adequate assurance that the refcounts are ok and
  nothing breaks (using WARN_ON) and an explanation why the refcount
  interface for display power management doesn't work.

- You also make changes to the power well code itself, which looks like it
  might break dc6 for runtime pm (by only going into dc5 at most). That
  needs to be a separate patch or have a much better explanation.

We probably need to set up a meeting to resolve this sooner than the few
months it took to figure out the resume issue ...

Thanks, Daniel
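
A minimal sketch of the refcount idea in the points above, written as a
userspace model and not as i915 code. The struct power_well_model and the
well_get()/well_put() helpers are hypothetical names; the sketch assumes only
what is stated in this mail: while any reference to the power well is held,
dc6 is forbidden, and it becomes permitted again once the last reference is
dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one refcounted power well; not an i915 structure. */
struct power_well_model {
	int refcount;
	bool dc6_allowed;
};

/* Take a reference: the first user forbids dc6. */
void well_get(struct power_well_model *w)
{
	assert(w != NULL);
	if (w->refcount++ == 0)
		w->dc6_allowed = false;
}

/* Drop a reference: the last user permits dc6 again. */
void well_put(struct power_well_model *w)
{
	assert(w != NULL);
	assert(w->refcount > 0);	/* catches unbalanced put calls */
	if (--w->refcount == 0)
		w->dc6_allowed = true;
}
```

Callers in this model never toggle dc6 directly; they only get and put
references, and the assert on the refcount plays the role a WARN_ON would
play in the driver.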
> >>
> >>v2: Based on review comment from Daniel,
> >>- created a separate patch for csr uninitialization set call.
> >>
> >>Cc: Daniel Vetter <daniel.vetter@intel.com>
> >>Cc: Damien Lespiau <damien.lespiau@intel.com>
> >>Cc: Imre Deak <imre.deak@intel.com>
> >>Cc: Sunil Kamath <sunil.kamath@intel.com>
> >>Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> >>Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> >>Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
> >>  drivers/gpu/drm/i915/intel_drv.h        |  2 ++
> >>  drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
> >>  3 files changed, 22 insertions(+), 12 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> >>index 478101c..fa66162 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.c
> >>+++ b/drivers/gpu/drm/i915/i915_drv.c
> >>@@ -1013,10 +1013,20 @@ static int i915_pm_resume(struct device *dev)
> >>  static int skl_suspend_complete(struct drm_i915_private *dev_priv)
> >>  {
> >>+	enum csr_state state;
> >>  	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
> >>  	skl_uninit_cdclk(dev_priv);
> >>+	/* TODO: wait for a completion event or
> >>+	 * similar here instead of busy
> >>+	 * waiting using wait_for function.
> >>+	 */
> >>+	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
> >>+			FW_UNINITIALIZED, 1000);
> >>+	if (state == FW_LOADED)
> >>+		skl_enable_dc6(dev_priv);
> >>+
> >>  	return 0;
> >>  }
> >>@@ -1063,6 +1073,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
> >>  {
> >>  	struct drm_device *dev = dev_priv->dev;
> >>+	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
> >>+		skl_disable_dc6(dev_priv);
> >>+
> >>  	skl_init_cdclk(dev_priv);
> >>  	intel_csr_load_program(dev);
> >>diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> >>index 81b7d77..9cb7d4e 100644
> >>--- a/drivers/gpu/drm/i915/intel_drv.h
> >>+++ b/drivers/gpu/drm/i915/intel_drv.h
> >>@@ -1118,6 +1118,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
> >>  void bxt_disable_dc9(struct drm_i915_private *dev_priv);
> >>  void skl_init_cdclk(struct drm_i915_private *dev_priv);
> >>  void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
> >>+void skl_enable_dc6(struct drm_i915_private *dev_priv);
> >>+void skl_disable_dc6(struct drm_i915_private *dev_priv);
> >>  void intel_dp_get_m_n(struct intel_crtc *crtc,
> >>  		      struct intel_crtc_state *pipe_config);
> >>  void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
> >>diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> >>index 821644d..23a3aa3 100644
> >>--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> >>+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> >>@@ -548,7 +548,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
> >>  		"DC6 already programmed to be disabled.\n");
> >>  }
> >>-static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> >>+void skl_enable_dc6(struct drm_i915_private *dev_priv)
> >>  {
> >>  	uint32_t val;
> >>@@ -565,7 +565,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> >>  	POSTING_READ(DC_STATE_EN);
> >>  }
> >>-static void skl_disable_dc6(struct drm_i915_private *dev_priv)
> >>+void skl_disable_dc6(struct drm_i915_private *dev_priv)
> >>  {
> >>  	uint32_t val;
> >>@@ -626,10 +626,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >>  				!I915_READ(HSW_PWR_WELL_BIOS),
> >>  				"Invalid for power well status to be enabled, unless done by the BIOS, \
> >>  				when request is to disable!\n");
> >>-			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> >>-				power_well->data == SKL_DISP_PW_2) {
> >>+			if (power_well->data == SKL_DISP_PW_2) {
> >>+				if (GEN9_ENABLE_DC5(dev))
> >>+					gen9_disable_dc5(dev_priv);
> >>  				if (SKL_ENABLE_DC6(dev)) {
> >>-					skl_disable_dc6(dev_priv);
> >>  					/*
> >>  					 * DDI buffer programming unnecessary during driver-load/resume
> >>  					 * as it's already done during modeset initialization then.
> >>@@ -637,8 +637,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >>  					 */
> >>  					if (!dev_priv->power_domains.initializing)
> >>  						intel_prepare_ddi(dev);
> >>-				} else {
> >>-					gen9_disable_dc5(dev_priv);
> >>  				}
> >>  			}
> >>  			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
> >>@@ -658,7 +656,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >>  			POSTING_READ(HSW_PWR_WELL_DRIVER);
> >>  			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
> >>-			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> >>+			if (GEN9_ENABLE_DC5(dev) &&
> >>  				power_well->data == SKL_DISP_PW_2) {
> >>  				enum csr_state state;
> >>  				/* TODO: wait for a completion event or
> >>@@ -671,10 +669,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >>  					DRM_ERROR("CSR firmware not ready (%d)\n",
> >>  							state);
> >>  				else
> >>-					if (SKL_ENABLE_DC6(dev))
> >>-						skl_enable_dc6(dev_priv);
> >>-					else
> >>-						gen9_enable_dc5(dev_priv);
> >>+					gen9_enable_dc5(dev_priv);
> >>  			}
> >>  		}
> >>  	}
> >>-- 
> >>2.0.2
> >>
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-29  9:01       ` Daniel Vetter
@ 2015-09-29 12:35         ` Patrik Jakobsson
  2015-09-29 13:01           ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Patrik Jakobsson @ 2015-09-29 12:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter

On Tue, Sep 29, 2015 at 11:01:35AM +0200, Daniel Vetter wrote:
> On Tue, Sep 29, 2015 at 11:08:25AM +0530, Animesh Manna wrote:
> > 
> > 
> > On 09/28/2015 12:51 PM, Daniel Vetter wrote:
> > >On Wed, Aug 26, 2015 at 01:36:07AM +0530, Animesh Manna wrote:
> > >>Mmio register access after dc6/dc5 entry is not allowed when
> > >>DC6 power states are enabled according to bspec (bspec-id 0527),
> > >>so enabling dc6 as the last call in suspend flow.
> > >We unconditionally grab a power well reference for everything in our
> > >suspend/resume functions, which means dc6/5 should be prevented for long
> > >enough.
> > >
> > >Also since dc6/5 are part of the power well framework you can't just
> > >enable/disable them directly, instead you need to get/put the
> > >corresponding power wells.
> > >
> > >Finally this patch doesn't just do that, but also frobs around a lot in
> > >set power well code itself, and I have no idea what it does there and why.
> > >It does smell a bit like you're just breaking dc6 for runtime pm though,
> > >which wouldn't be good.
> > >
> > >Anyway I decided to just merge this since this patch series has been
> > >floating around since forever, but then the patch didn't apply cleanly and
> > >so dropped it.
> > 
> > I mentioned in my commit message that we have a h/w workaround.
> > "Mmio register access after dc6/dc5 entry is not allowed when
> > DC6 power states are enabled"
> > 
> > This patch is mandatory to solve the pc10 entry issue for skylake.
> > Planning to send again after rebase on top of tree.

Hi Animesh and Daniel

I'm trying to shed some light on this. It seems rather confusing atm. Animesh,
please correct me if I'm wrong below.

> The problem isn't that the patch didn't apply cleanly, but that it seems
> to have fundamental issues:
> - We already grab power well references around suspend/resume code, see
>   the call to intel_power_domains_init_hw and the calls to
>   intel_display_set_init_power. That code is supposed to ensure that while
>   suspend/resume is going on _all_ power wells are on (and hence dc6/5 are
>   forbidden). It's a bit of a big hammer, but we already have the code to do
>   exactly what you're trying to do here. It could be though that for skl
>   it's in the wrong order (but then the commit message must state very
>   clearly what's the exact dependency, the gpu is assembled from a pile of
>   various different blocks which are all mostly independent).

DC6 can only be enabled when _all_ power wells are disabled. Hence this patch
moves the DC6 enable/disable calls to the runtime suspend/resume callbacks.
Previously we only disabled DC6 when PW2 was enabled.

The DC state table says:
- Everything off: DC6 enabled
- PW0 enabled: DC5 enabled but DC6 is disabled
- PW1 enabled: DC5 and DC6 disabled 
- PW2 enabled: DC5 and DC6 disabled 
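
The table above can be read as a function from the set of enabled power
wells to the deepest DC state that is still allowed. A minimal sketch in C,
assuming exactly the four rows listed; the SKETCH_* names are made up for
illustration and are not the i915 definitions:

```c
/* Hypothetical flags for this sketch only; not the i915 definitions. */
#define SKETCH_PW0 (1u << 0)
#define SKETCH_PW1 (1u << 1)
#define SKETCH_PW2 (1u << 2)

enum sketch_dc_state { SKETCH_DC_NONE, SKETCH_DC5, SKETCH_DC6 };

/* Deepest DC state the table permits for a bitmask of enabled wells. */
static enum sketch_dc_state sketch_deepest_dc(unsigned int enabled_wells)
{
	/* PW1 or PW2 enabled: neither DC5 nor DC6 is allowed. */
	if (enabled_wells & (SKETCH_PW1 | SKETCH_PW2))
		return SKETCH_DC_NONE;
	/* Only PW0 enabled: DC5 is allowed, DC6 is not. */
	if (enabled_wells & SKETCH_PW0)
		return SKETCH_DC5;
	/* Everything off: DC6 is allowed. */
	return SKETCH_DC6;
}
```

For example, sketch_deepest_dc(0) yields SKETCH_DC6, while any mask that
contains SKETCH_PW2 yields SKETCH_DC_NONE.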

> - Your patch directly calls skl_enable/disable_dc6 instead of going
>   through the power well abstraction. Breaking this abstraction needs a
>   really good reason, adequate assurance that the refcounts are ok and
>   nothing breaks (using WARN_ON) and an explanation why the refcount
>   interface for display power management doesn't work.

This is indeed a hack but since all we need is the guarantee that a power well
is in fact enabled we can see the operation as: enable pw + disable dc state.
This is currently hacked into skl_set_power_well(). It's not nice but it takes
away the complexity of modeling DC states as power wells.

The alternative would be to turn DC5 and DC6 into power wells and add proper
dependencies. What I've discovered is that it's quite a tight fit.
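
For reference, the refcount idea behind the power well interface can be
sketched roughly as below. This is a simplified userspace model, not the
i915 code: the struct, the function names and the rule that DC6 is allowed
only while no well holds a reference are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of one refcounted power well; not the i915 structure. */
struct sketch_power_well {
	const char *name;
	int refcount;
	bool hw_enabled;
};

/* Take a reference; the hardware is switched on only on the 0 -> 1 transition. */
static void sketch_power_well_get(struct sketch_power_well *pw)
{
	if (pw->refcount++ == 0)
		pw->hw_enabled = true;
}

/* Drop a reference; the hardware is switched off only on the 1 -> 0 transition. */
static void sketch_power_well_put(struct sketch_power_well *pw)
{
	if (pw->refcount <= 0)
		return;	/* unbalanced put: ignore rather than underflow */
	if (--pw->refcount == 0)
		pw->hw_enabled = false;
}

/* DC6 modeled as a state that is allowed only while no well is referenced. */
static bool sketch_dc6_allowed(const struct sketch_power_well *wells, int count)
{
	for (int i = 0; i < count; i++)
		if (wells[i].refcount > 0)
			return false;
	return true;
}
```

With this shape, a DC state that depends on several wells falls out of the
reference counts instead of being toggled by hand in the suspend path.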

> - You also do changes to the power well code itself, which looks like it
>   might break dc6 for runtime pm (by only going into dc5 at most). That
>   needs to be a separate patch or have a much better explanation.

If I understood correctly, that is the entire point of this patch. Don't allow
DC6 unless all power wells are disabled. My question though is why we don't
disable DC5 for PW1 as well?

> We probably need to set up a meeting to resolve this sooner than the few
> months it took to figure out the resume issue ...
> 
> Thanks, Daniel

I'd like to join if possible.

Thanks
Patrik


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-29 12:35         ` Patrik Jakobsson
@ 2015-09-29 13:01           ` Daniel Vetter
  2015-09-29 13:23             ` Ville Syrjälä
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-29 13:01 UTC (permalink / raw)
  To: Daniel Vetter, Animesh Manna, Rajneesh Bhardwaj, intel-gfx,
	Daniel Vetter

On Tue, Sep 29, 2015 at 2:35 PM, Patrik Jakobsson
<patrik.jakobsson@linux.intel.com> wrote:
> On Tue, Sep 29, 2015 at 11:01:35AM +0200, Daniel Vetter wrote:
>> On Tue, Sep 29, 2015 at 11:08:25AM +0530, Animesh Manna wrote:
>> >
>> >
>> > On 09/28/2015 12:51 PM, Daniel Vetter wrote:
>> > >On Wed, Aug 26, 2015 at 01:36:07AM +0530, Animesh Manna wrote:
>> > >>Mmio register access after dc6/dc5 entry is not allowed when
>> > >>DC6 power states are enabled according to bspec (bspec-id 0527),
>> > >>so enabling dc6 as the last call in suspend flow.
>> > >We unconditionally grab a power well reference for everything in our
>> > >suspend/resume functions, which means dc6/5 should be prevented for long
>> > >enough.
>> > >
>> > >Also since dc6/5 are part of the power well framework you can't just
>> > >enable/disable them directly, instead you need to get/put the
>> > >corresponding power wells.
>> > >
>> > >Finally this patch doesn't just do that, but also frobs around a lot in
>> > >set power well code itself, and I have no idea what it does there and why.
>> > >It does smell a bit like you're just breaking dc6 for runtime pm though,
>> > >which wouldn't be good.
>> > >
>> > >Anyway I decided to just merge this since this patch series has been
>> > >floating around since forever, but then the patch didn't apply cleanly and
>> > >so dropped it.
>> >
>> > I mentioned in my commit message that we have a h/w workaround.
>> > "Mmio register access after dc6/dc5 entry is not allowed when
>> > DC6 power states are enabled"
>> >
>> > This patch is mandatory to solve the pc10 entry issue for skylake.
>> > Planning to send again after rebase on top of tree.
>
> Hi Animesh and Daniel
>
> I'm trying to shed some light on this. It seems rather confusing atm. Animesh,
> please correct me if I'm wrong below.
>
>> The problem isn't that the patch didn't apply cleanly, but that it seems
>> to have fundamental issues:
>> - We already grab power well references around suspend/resume code, see
>>   the call to intel_power_domains_init_hw and the calls to
>>   intel_display_set_init_power. That code is supposed to ensure that while
>>   suspend/resume is going on _all_ power wells are on (and hence dc6/5 are
>>   forbidden). It's a bit of a big hammer, but we already have the code to do
>>   exactly what you're trying to do here. It could be though that for skl
>>   it's in the wrong order (but then the commit message must state very
>>   clearly what's the exact dependency, the gpu is assembled from a pile of
>>   various different blocks which are all mostly independent).
>
> DC6 can only be enabled when _all_ power wells are disabled. Hence this patch
> moves DC6 enabled/disabled to the runtime suspend/resume callbacks. Previously
> we only disabled DC6 when PW2 was enabled.
>
> The DC state table says:
> - Everything off: DC6 enabled
> - PW0 enabled: DC5 enabled but DC6 is disabled
> - PW1 enabled: DC5 and DC6 disabled
> - PW2 enabled: DC5 and DC6 disabled

Ok, so what the patch does (and that is at least self-consistent) is
to move the DC6 enable/disable from PW2 to the top-level domain. That
doesn't quite match what you're describing here since we still have
PW1 managed by the driver, but that's short-circuited now.

>> - Your patch directly calls skl_enable/disable_dc6 instead of going
>>   through the power well abstraction. Breaking this abstraction needs a
>>   really good reason, adequate assurance that the refcounts are ok and
>>   nothing breaks (using WARN_ON) and an explanation why the refcount
>>   interface for display power management doesn't work.
>
> This is indeed a hack but since all we need is the guarantee that a power well
> is in fact enabled we can see the operation as: enable pw + disable dc state.
> This is currently hacked into skl_set_power_well(). It's not nice but it takes
> away the complexity of modeling DC states as power wells.
>
> The alternative would be to turn DC5 and DC6 into power wells and add proper
> dependencies. What I've discovered is that it's quite a tight fit.

Nah, if the goal is to move just DC6 the patch seems fine. I was just
really confused that the commit message talked about both dc6 _and_
dc5 being impossible to do, and then only moving half of the story
around. And I was also a bit confused with runtime pm vs. system
suspend (again).

So if we can confirm that this is really only about DC6 and not also
about dc5 like the commit message claims, then I can clean up the
commit message and apply the patch.

>> - You also do changes to the power well code itself, which looks like it
>>   might break dc6 for runtime pm (by only going into dc5 at most). That
>>   needs to be a separate patch or have a much better explanation.
>
> If I understood correctly, that is the entire point of this patch. Don't allow
> DC6 unless all power wells are disabled. My question though is why we don't
> disable DC5 for PW1 as well?

Afaiui dmc manages pw1 for us and we shouldn't ever touch it from the
driver side. I'm not entirely sure about that though since the current
fix is a hack. If we really shouldn't touch pw1 on skl then we should
remove that power domain from our skl code entirely.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-29 13:01           ` Daniel Vetter
@ 2015-09-29 13:23             ` Ville Syrjälä
  2015-09-29 14:00               ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Ville Syrjälä @ 2015-09-29 13:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter

On Tue, Sep 29, 2015 at 03:01:57PM +0200, Daniel Vetter wrote:
> On Tue, Sep 29, 2015 at 2:35 PM, Patrik Jakobsson
> <patrik.jakobsson@linux.intel.com> wrote:
> > On Tue, Sep 29, 2015 at 11:01:35AM +0200, Daniel Vetter wrote:
> >> On Tue, Sep 29, 2015 at 11:08:25AM +0530, Animesh Manna wrote:
> >> >
> >> >
> >> > On 09/28/2015 12:51 PM, Daniel Vetter wrote:
> >> > >On Wed, Aug 26, 2015 at 01:36:07AM +0530, Animesh Manna wrote:
> >> > >>Mmio register access after dc6/dc5 entry is not allowed when
> >> > >>DC6 power states are enabled according to bspec (bspec-id 0527),
> >> > >>so enabling dc6 as the last call in suspend flow.
> >> > >We unconditionally grab a power well reference for everything in our
> >> > >suspend/resume functions, which means dc6/5 should be prevented for long
> >> > >enough.
> >> > >
> >> > >Also since dc6/5 are part of the power well framework you can't just
> >> > >enable/disable them directly, instead you need to get/put the
> >> > >corresponding power wells.
> >> > >
> >> > >Finally this patch doesn't just do that, but also frobs around a lot in
> >> > >set power well code itself, and I have no idea what it does there and why.
> >> > >It does smell a bit like you're just breaking dc6 for runtime pm though,
> >> > >which wouldn't be good.
> >> > >
> >> > >Anyway I decided to just merge this since this patch series has been
> >> > >floating around since forever, but then the patch didn't apply cleanly and
> >> > >so dropped it.
> >> >
> >> > I mentioned in my commit message that we have a h/w workaround.
> >> > "Mmio register access after dc6/dc5 entry is not allowed when
> >> > DC6 power states are enabled"
> >> >
> >> > This patch is mandatory to solve the pc10 entry issue for skylake.
> >> > Planning to send again after rebase on top of tree.
> >
> > Hi Animesh and Daniel
> >
> > I'm trying to shed some light on this. It seems rather confusing atm. Animesh,
> > please correct me if I'm wrong below.
> >
> >> The problem isn't that the patch didn't apply cleanly, but that it seems
> >> to have fundamental issues:
> >> - We already grab power well references around suspend/resume code, see
> >>   the call to intel_power_domains_init_hw and the calls to
> >>   intel_display_set_init_power. That code is supposed to ensure that while
> >>   suspend/resume is going on _all_ power wells are on (and hence dc6/5 are
> >>   forbidden). It's a bit of a big hammer, but we already have the code to do
> >>   exactly what you're trying to do here. It could be though that for skl
> >>   it's in the wrong order (but then the commit message must state very
> >>   clearly what's the exact dependency, the gpu is assembled from a pile of
> >>   various different blocks which are all mostly independent).
> >
> > DC6 can only be enabled when _all_ power wells are disabled. Hence this patch
> > moves DC6 enabled/disabled to the runtime suspend/resume callbacks. Previously
> > we only disabled DC6 when PW2 was enabled.
> >
> > The DC state table says:
> > - Everything off: DC6 enabled
> > - PW0 enabled: DC5 enabled but DC6 is disabled
> > - PW1 enabled: DC5 and DC6 disabled
> > - PW2 enabled: DC5 and DC6 disabled
> 
> Ok, so what the patch does (and that is at least self-consistent) is
> to move the DC6 enable/disable from PW2 to the top-level domain. That
> doesn't quite match what you're describing here since we still have
> PW1 managed by the driver, but that's short-circuited now.

Hmm. Why are we going back and forth on this all the time? Was there
some problem with the plan [1] that Imre and I hatched?

[1] http://lists.freedesktop.org/archives/intel-gfx/2015-September/076612.html


-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-29 13:23             ` Ville Syrjälä
@ 2015-09-29 14:00               ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-09-29 14:00 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter

On Tue, Sep 29, 2015 at 3:23 PM, Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> Hmm. Why are we going back and forth on this all the time? Was there
> some problem with the plan [1] that Imre and I hatched?
>
> [1] http://lists.freedesktop.org/archives/intel-gfx/2015-September/076612.html

Well this is just the interim bugfix with a confusing commit message.
Compared to your overall plan we now have dc5&pw2 fused together, pw1
is still there but a no-op and dc6 is now (with this patch) fused with
pw0 and overall device d3 (or whatever we use to kill the entire
thing and still appease the firmware).

So at least it matches the overall plan.
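
The interim arrangement described above can be sketched as a small state
model in C. It is a simplification under stated assumptions: all sketch_*
names are hypothetical, DC5 simply follows PW2 going off, PW1 requests are
ignored, and DC6 follows the device-level suspend/resume hooks, each gated
on the firmware being loaded.

```c
#include <stdbool.h>

/* Hypothetical model of the display power state; not the i915 structures. */
struct sketch_display {
	bool fw_loaded;
	bool pw2_on;
	bool dc5_enabled;
	bool dc6_enabled;
};

/* DC5 is fused with PW2: leave DC5 before PW2 goes on, enter it after PW2 goes off. */
static void sketch_set_pw2(struct sketch_display *d, bool enable)
{
	if (enable) {
		d->dc5_enabled = false;
		d->pw2_on = true;
	} else {
		d->pw2_on = false;
		d->dc5_enabled = d->fw_loaded;	/* DC5 needs the firmware */
	}
}

/* PW1 is still present in the model but requests are a no-op. */
static void sketch_set_pw1(struct sketch_display *d, bool enable)
{
	(void)d;
	(void)enable;
}

/* DC6 is fused with the device-level suspend: last step of the suspend flow. */
static void sketch_suspend_complete(struct sketch_display *d)
{
	if (d->fw_loaded)
		d->dc6_enabled = true;
}

/* First step of the resume flow: leave DC6 before anything else is touched. */
static void sketch_resume_prepare(struct sketch_display *d)
{
	d->dc6_enabled = false;
}
```

Turning PW2 off with firmware loaded sets dc5_enabled, and only
sketch_suspend_complete() ever sets dc6_enabled.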
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-29  8:51                                 ` Daniel Vetter
@ 2015-09-30  0:50                                   ` Rafael J. Wysocki
  2015-09-30 12:14                                     ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Rafael J. Wysocki @ 2015-09-30  0:50 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On 9/29/2015 10:51 AM, Daniel Vetter wrote:
> On Tue, Sep 29, 2015 at 01:54:53AM +0200, Rafael J. Wysocki wrote:
>> On 9/28/2015 8:52 AM, Daniel Vetter wrote:
>>> On Wed, Sep 23, 2015 at 10:49:36PM +0200, Rafael J. Wysocki wrote:
>>>> On 9/23/2015 7:17 PM, Daniel Vetter wrote:
>>>>> acpi_target_system_state() seems to be almost the thing we're looking
>>>>> for, except that it's only valid in the suspend callbacks since it
>>>>> gets reset to ACPI_STATE_S0 when resuming. So probably we want
>>>>> something else ...
>>>> Right.
>>>>
>>>> The idea is to add a way for drivers to check if
>>>> (a) suspend is going to enter the BIOS
>>>> (b) resume has been triggered by the BIOS
>>>> and that's really what drivers need to know.
>>>>
>>>> For suspend-to-idle those two will return false and for S3 they'll return
>>>> true.
>>>>
>>>> Would that help?
>>> Not sure that matches exactly what we'd need here ... Essentially we need
>>> to know whether we've been in S3/S4 (firmware has been eaten) or in one of
>>> the higher suspend-to-idle/standby states (firmware still alive, don't
>>> disturb it). Additional fun that just crossed my mind is that if the
>>> suspend-to-mem is aborted (some other driver failed) then that function
>>> should _not_ indicate that we've been in S3. So maybe something like
>> So it really looks like the interface I was talking about would be suitable:
>>
>> pm_suspend_via_firmware() == true -> firmware is going to be eaten (use that
>> during suspend if needed)
>> pm_resume_via_firmware() == true -> firmware was eaten
>>
>> The latter will only return 'true' if we really have entered the BIOS
>> (platform firmware).
> Yeah that seems to fit the bill. We already have a check in our suspend
> paths to figure out whether we'll suspend to idle or go into full S3, so
> i915 could use them both. And making them generic would make sense I
> guess.

OK, sent patches (CCed you), please have a look.
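
The semantics discussed in the quoted text can be modeled in userspace
roughly as follows. This is only a sketch of the intended behaviour (false
for suspend-to-idle, true for S3, and no "resumed via firmware" indication
when the suspend is aborted before the firmware is entered); the sketch_*
names are hypothetical and this is not the code from the patches.

```c
#include <stdbool.h>

/* Hypothetical sleep kinds for this sketch. */
enum sketch_sleep_kind { SKETCH_SLEEP_TO_IDLE, SKETCH_SLEEP_S3 };

struct sketch_pm {
	bool suspend_via_fw;	/* this suspend is going to enter the firmware */
	bool resume_via_fw;	/* the firmware was really entered and has returned */
};

/* Start of a suspend transition: record the target and clear stale resume state. */
static void sketch_suspend_begin(struct sketch_pm *pm, enum sketch_sleep_kind kind)
{
	pm->suspend_via_fw = (kind == SKETCH_SLEEP_S3);
	pm->resume_via_fw = false;
}

/* Called only after control has come back from the platform firmware. */
static void sketch_firmware_returned(struct sketch_pm *pm)
{
	if (pm->suspend_via_fw)
		pm->resume_via_fw = true;
}

/* Driver-side decision: reload device firmware only if the platform ate it. */
static bool sketch_must_reload_fw(const struct sketch_pm *pm)
{
	return pm->resume_via_fw;
}
```

An aborted S3 attempt never reaches sketch_firmware_returned(), so the
driver-side check stays false and the device firmware is left alone.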

Thanks,
Rafael


>>> acpi_source_system_state() which usually is S0 and only when acpi
>>> successfully went into the suspend state in platform_suspend_ops->enter it
>>> gets set to the value of acpi_target_system_state. And then reset once the
>>> resume has completed. I think that would be what we'd want here.
>> We need two new functions like the above, because some things already depend
>> on acpi_target_system_state working the way it does currently.
>>
>> I see no reason to make that ACPI-specific, though, in principle.
>>
>>> Anyway I'll pull in Animesh series meanwhile, amended with a FIXME comment.
>> Fine by me.
>>
>> Thanks,
>> Rafael


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-30  0:50                                   ` Rafael J. Wysocki
@ 2015-09-30 12:14                                     ` Daniel Vetter
  2015-09-30 23:34                                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2015-09-30 12:14 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Daniel Vetter, intel-gfx

On Wed, Sep 30, 2015 at 02:50:40AM +0200, Rafael J. Wysocki wrote:
> On 9/29/2015 10:51 AM, Daniel Vetter wrote:
> >On Tue, Sep 29, 2015 at 01:54:53AM +0200, Rafael J. Wysocki wrote:
> >>On 9/28/2015 8:52 AM, Daniel Vetter wrote:
> >>>On Wed, Sep 23, 2015 at 10:49:36PM +0200, Rafael J. Wysocki wrote:
> >>>>On 9/23/2015 7:17 PM, Daniel Vetter wrote:
> >>>>>acpi_target_system_state() seems to be almost the thing we're looking
> >>>>>for, except that it's only valid in the suspend callbacks since it
> >>>>>gets reset to ACPI_STATE_S0 when resuming. So probably we want
> >>>>>something else ...
> >>>>Right.
> >>>>
> >>>>The idea is to add a way for drivers to check if
> >>>>(a) suspend is going to enter the BIOS
> >>>>(b) resume has been triggered by the BIOS
> >>>>and that's really what drivers need to know.
> >>>>
> >>>>For suspend-to-idle those two will return false and for S3 they'll return
> >>>>true.
> >>>>
> >>>>Would that help?
> >>>Not sure that matches exactly what we'd need here ... Essentially we need
> >>>to know whether we've been in S3/S4 (firmware has been eaten) or in one of
> >>>the higher suspend-to-idle/standby states (firmware still alive, don't
> >>>disturb it). Additional fun that just crossed my mind is that if the
> >>>suspend-to-mem is aborted (some other driver failed) then that function
> >>>should _not_ indicate that we've been in S3. So maybe something like
> >>So it really looks like the interface I was talking about would be suitable:
> >>
> >>pm_suspend_via_firmware() == true -> firmware is going to be eaten (use that
> >>during suspend if needed)
> >>pm_resume_via_firmware() == true -> firmware was eaten
> >>
> >>The latter will only return 'true' if we really have entered the BIOS
> >>(platform firmware).
> >Yeah that seems to fit the bill. We already have a check in our suspend
> >paths to figure out whether we'll suspend to idle or go into full S3, so
> >i915 could use them both. And making them generic would make sense I
> >guess.
> 
> OK, sent patches (CCed you), please have a look.

EXPORT_SYMBOL seems missing, but lgtm otherwise. I haven't tried to use
them for the two cases in i915 though.
-Daniel
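
For readers following the thread, the semantics agreed on above can be
modelled in a few lines of plain C. This is only a userspace sketch under
stated assumptions: every model_* name is hypothetical and this is not the
kernel implementation from Rafael's patches. It just encodes the three cases
discussed: suspend-to-idle, a completed S3 cycle, and an S3 attempt that was
aborted because some other driver failed.

```c
#include <stdbool.h>

/* Hypothetical userspace model; the model_* names are not kernel symbols. */
enum model_sleep { MODEL_S2IDLE, MODEL_S3 };

struct model_pm {
	bool fw_suspend; /* this suspend is going to hand over to platform firmware */
	bool fw_resume;  /* platform firmware has really been entered */
};

/* Start of a suspend transition: record the target, clear the resume flag. */
static void model_suspend_begin(struct model_pm *pm, enum model_sleep target)
{
	pm->fw_suspend = (target == MODEL_S3);
	pm->fw_resume = false;
}

/* Platform "enter" step: only reached if no driver aborted the suspend. */
static void model_platform_enter(struct model_pm *pm, enum model_sleep target)
{
	if (target == MODEL_S3)
		pm->fw_resume = true;
}

/* Model of "firmware is going to be eaten" (valid during suspend). */
static bool model_suspend_via_firmware(const struct model_pm *pm)
{
	return pm->fw_suspend;
}

/* Model of "firmware was eaten" (valid during resume). */
static bool model_resume_via_firmware(const struct model_pm *pm)
{
	return pm->fw_resume;
}

/* Run one suspend attempt; 'aborted' models another driver failing. */
static struct model_pm model_cycle(enum model_sleep target, bool aborted)
{
	struct model_pm pm = { false, false };

	model_suspend_begin(&pm, target);
	if (!aborted)
		model_platform_enter(&pm, target);
	return pm;
}
```

In this model an aborted S3 attempt still reports "suspend via firmware" while
suspending, but never reports "resume via firmware", which is the property
asked for earlier in the thread.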

> 
> Thanks,
> Rafael
> 
> 
> >>>acpi_source_system_state() which usually is S0 and only when acpi
> >>>successfully went into the suspend state in platform_suspend_ops->enter it
> >>>gets set to the value of acpi_target_system_state. And then reset once the
> >>>resume has completed. I think that would be what we'd want here.
> >>We need two new functions like the above, because some things already depend
> >>on acpi_target_system_state working the way it does currently.
> >>
> >>I see no reason to make that ACPI-specific, though, in principle.
> >>
> >>>Anyway I'll pull in Animesh's series meanwhile, amended with a FIXME comment.
> >>Fine by me.
> >>
> >>Thanks,
> >>Rafael

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading.
  2015-09-30 12:14                                     ` Daniel Vetter
@ 2015-09-30 23:34                                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 51+ messages in thread
From: Rafael J. Wysocki @ 2015-09-30 23:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On 9/30/2015 2:14 PM, Daniel Vetter wrote:
> On Wed, Sep 30, 2015 at 02:50:40AM +0200, Rafael J. Wysocki wrote:
>> On 9/29/2015 10:51 AM, Daniel Vetter wrote:
>>> On Tue, Sep 29, 2015 at 01:54:53AM +0200, Rafael J. Wysocki wrote:
>>>> On 9/28/2015 8:52 AM, Daniel Vetter wrote:
>>>>> On Wed, Sep 23, 2015 at 10:49:36PM +0200, Rafael J. Wysocki wrote:
>>>>>> On 9/23/2015 7:17 PM, Daniel Vetter wrote:
>>>>>>> acpi_target_system_state() seems to be almost the thing we're looking
>>>>>>> for, except that it's only valid in the suspend callbacks since it
>>>>>>> gets reset to ACPI_STATE_S0 when resuming. So probably we want
>>>>>>> something else ...
>>>>>> Right.
>>>>>>
>>>>>> The idea is to add a way for drivers to check if
>>>>>> (a) suspend is going to enter the BIOS
>>>>>> (b) resume has been triggered by the BIOS
>>>>>> and that's really what drivers need to know.
>>>>>>
>>>>>> For suspend-to-idle those two will return false and for S3 they'll return
>>>>>> true.
>>>>>>
>>>>>> Would that help?
>>>>> Not sure that matches exactly what we'd need here ... Essentially we need
>>>>> to know whether we've been in S3/S4 (firmware has been eaten) or in one of
>>>>> the higher suspend-to-idle/standby states (firmware still alive, don't
>>>>> disturb it). Additional fun that just crossed my mind is that if the
>>>>> suspend-to-mem is aborted (some other driver failed) then that function
>>>>> should _not_ indicate that we've been in S3. So maybe something like
>>>> So it really looks like the interface I was talking about would be suitable:
>>>>
>>>> pm_suspend_via_firmware() == true -> firmware is going to be eaten (use that
>>>> during suspend if needed)
>>>> pm_resume_via_firmware() == true -> firmware was eaten
>>>>
>>>> The latter will only return 'true' if we really have entered the BIOS
>>>> (platform firmware).
>>> Yeah that seems to fit the bill. We already have a check in our suspend
>>> paths to figure out whether we'll suspend to idle or go into full S3, so
>>> i915 could use them both. And making them generic would make sense I
>>> guess.
>> OK, sent patches (CCed you), please have a look.
> EXPORT_SYMBOL seems missing,

Good point, will fix.

>   but lgtm otherwise. I haven't tried to use
> them for the two cases in i915 though.

OK, thanks!

Rafael
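
As a rough illustration of the "two cases in i915" mentioned above, the
following userspace sketch shows how a driver's resume path might combine a
"resumed via firmware" answer with the warn-and-return check suggested
earlier in the thread. It is an assumption-laden model, not i915 code: the
model_* names, the four-word csr array and the payload values are all made
up, and only the rule "first word non-zero means firmware is loaded" is
taken from the patch description.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MODEL_CSR_WORDS 4

/* Hypothetical device model; not the real i915 structures. */
struct model_dev {
	uint32_t csr[MODEL_CSR_WORDS]; /* stand-in for the csr address space */
	unsigned int loads;            /* successful firmware loads */
	unsigned int warnings;         /* redundant load attempts caught */
};

/* Arbitrary payload; only "first word is non-zero" matters here. */
static const uint32_t model_fw[MODEL_CSR_WORDS] = { 0xA5, 1, 2, 3 };

/* Per the patch description: a reset csr space reads back as zero. */
static bool model_csr_loaded(const struct model_dev *dev)
{
	return dev->csr[0] != 0;
}

/* Load firmware, but warn and bail out if it is already present. */
static void model_csr_load(struct model_dev *dev)
{
	if (model_csr_loaded(dev)) {
		dev->warnings++;
		fprintf(stderr, "model: csr firmware already loaded\n");
		return;
	}
	memcpy(dev->csr, model_fw, sizeof(model_fw));
	dev->loads++;
}

/* Models the platform wiping the csr space across S3/S4. */
static void model_platform_reset(struct model_dev *dev)
{
	memset(dev->csr, 0, sizeof(dev->csr));
}

/* Resume path: reload only when platform firmware was really entered. */
static void model_resume(struct model_dev *dev, bool resumed_via_firmware)
{
	if (resumed_via_firmware)
		model_csr_load(dev);
}
```

The point of the warning path is the one made earlier in the thread: the
decision to reload is made statically by the caller, and the hardware check
only flags a caller that got it wrong.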


>
>>>>> acpi_source_system_state() which usually is S0 and only when acpi
>>>>> successfully went into the suspend state in platform_suspend_ops->enter it
>>>>> gets set to the value of acpi_target_system_state. And then reset once the
>>>>> resume has completed. I think that would be what we'd want here.
>>>> We need two new functions like the above, because some things already depend
>>>> on acpi_target_system_state working the way it does currently.
>>>>
>>>> I see no reason to make that ACPI-specific, though, in principle.
>>>>
>>>>> Anyway I'll pull in Animesh series meanwhile, amended with a FIXME comment.
>>>> Fine by me.
>>>>
>>>> Thanks,
>>>> Rafael
>>>>
>>>>
>>>>>>> On Wed, Sep 23, 2015 at 6:28 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>> Actually add Rafael this time around ...
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>> On Wed, Sep 23, 2015 at 6:27 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>>> On Wed, Sep 23, 2015 at 9:57 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>>>>>>>>> On Thu, Sep 17, 2015 at 12:53:21AM +0530, Animesh Manna wrote:
>>>>>>>>>>> On 9/14/2015 1:16 PM, Daniel Vetter wrote:
>>>>>>>>>>>> On Fri, Sep 11, 2015 at 12:36:24AM +0530, Animesh Manna wrote:
>>>>>>>>>>>>> On 9/10/2015 8:15 PM, Daniel Vetter wrote:
>>>>>>>>>>>>>> On Thu, Sep 10, 2015 at 01:58:54AM +0530, Animesh Manna wrote:
>>>>>>>>>>>>>>> On 9/2/2015 2:24 PM, Daniel Vetter wrote:
>>>>>>>>>>>>>>>> On Wed, Aug 26, 2015 at 07:40:54PM +0530, Animesh Manna wrote:
>>>>>>>>>>>>>>>>> On 8/26/2015 6:40 PM, Daniel Vetter wrote:
>>>>>>>>>>>>>>>>>> On Wed, Aug 26, 2015 at 01:36:05AM +0530, Animesh Manna wrote:
>>>>>>>>>>>>>>>>>>> Dmc will restore the csr program except DC9, cold boot,
>>>>>>>>>>>>>>>>>>> warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> intel_csr_load_program() function is used to load the firmware
>>>>>>>>>>>>>>>>>>> data from kernel memory to csr address space.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> All values of csr address space will be zero if it got reset and
>>>>>>>>>>>>>>>>>>> the first byte of csr program is always a non-zero if firmware
>>>>>>>>>>>>>>>>>>> is loaded successfuly. Based on hardware status will load the
>>>>>>>>>>>>>>>>>>> firmware.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Without this condition check if we overwrite the firmware data the
>>>>>>>>>>>>>>>>>>> counters exposed for dc5/dc6 (help for debugging) will be nullified.
>>>>>>>>>>>>>>>>> Because of the reason mentioned just above, we need to block loading the firmware again,
>>>>>>>>>>>>>>>>> so a WARN_ON alone will not help.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> v1: Initial version.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> v2: Based on review comments from Daniel,
>>>>>>>>>>>>>>>>>>> - Added a check to know hardware status and load the firmware if not loaded.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Cc: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>>>>>>>>>> Cc: Damien Lespiau <damien.lespiau@intel.com>
>>>>>>>>>>>>>>>>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>>>>>>>>>>>>>>>>> Cc: Sunil Kamath <sunil.kamath@intel.com>
>>>>>>>>>>>>>>>>>>> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
>>>>>>>>>>>>>>>>>>> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
>>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>>   drivers/gpu/drm/i915/intel_csr.c | 9 +++++++++
>>>>>>>>>>>>>>>>>>>   1 file changed, 9 insertions(+)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>>>>>>>> index ba1ae03..682cc26 100644
>>>>>>>>>>>>>>>>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>>>>>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>>>>>>>>>>>>>>>>> @@ -252,6 +252,15 @@ void intel_csr_load_program(struct drm_device *dev)
>>>>>>>>>>>>>>>>>>>               return;
>>>>>>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>>>>>> +     /*
>>>>>>>>>>>>>>>>>>> +      * Dmc will restore the csr the program except DC9, cold boot,
>>>>>>>>>>>>>>>>>>> +      * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>>>>>>>> +      * This condition will help to check if csr address space is reset/
>>>>>>>>>>>>>>>>>>> +      * not loaded.
>>>>>>>>>>>>>>>>>>> +      */
>>>>>>>>>>>>>>>>>> Atm we call this from driver load and resume, which doesn't seem to cover
>>>>>>>>>>>>>>>>>> all the cases you mention in the comment. Should this be a WARN_ON
>>>>>>>>>>>>>>>>>> instead? Or do we have troubles in our init sequence where we load too
>>>>>>>>>>>>>>>>>> many times?
>>>>>>>>>>>>>>>>> Yes, the above statement taken from bspec to describe about the special cases dmc will not restore the firmware.
>>>>>>>>>>>>>>>>> Agree, In our cases cold boot and hibernate/suspend mainly we need to load the firmware again, so in my
>>>>>>>>>>>>>>>>> second sentence I wanted to comment mainly regarding this condition check added for suspend-hibernate(reset)
>>>>>>>>>>>>>>>>> and cold boot(not loaded).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Anyways the same api later can be used to load the firmware from anywhere, so my intention to check firmware loaded or not.
>>>>>>>>>>>>>>>>> If already loaded then not to overwrite the csr address space to maintain the dc5/dc6 counter value.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is the comment below clearer to you?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>         /*
>>>>>>>>>>>>>>>>>          * Dmc will restore the csr the program except DC9, cold boot,
>>>>>>>>>>>>>>>>>          * warm reset, PCI function level reset, and hibernate/suspend.
>>>>>>>>>>>>>>>>>          * If firmware is restored by dmc then no need to load again which
>>>>>>>>>>>>>>>>>          * will keep the dc5/dc6 counter exposed by firmware.
>>>>>>>>>>>>>>>>>          */
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> No issue in init sequence.
>>>>>>>>>>>>>>>> That seems to still cover all the callers of the function afaics - we do
>>>>>>>>>>>>>>>> pci resets over suspend resume unconditionally. So I still don't
>>>>>>>>>>>>>>>> understand where exactly we try to load the dmc firmware in i915.ko when
>>>>>>>>>>>>>>>> it's already loaded.
>>>>>>>>>>>>>>> During resume intel_csr_load_program() will be called from
>>>>>>>>>>>>>>> intel_runtime_resume().
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> intel_runtime_resume()-> skl_resume_prepare()-> intel_csr_load_program()
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> During Pc10 entry testing I can see dmc is restoring back the firmware always,
>>>>>>>>>>>>>>> but as you mentioned pci-reset can happen unconditionally, but still then
>>>>>>>>>>>>>>> also during resume intel_runtime_resume() will be called and based on
>>>>>>>>>>>>>>> register read of csr-base-address firmware loading will happen.
>>>>>>>>>>>>>> But in your comment you're saying it won't get restored in case of dc9 and
>>>>>>>>>>>>>> suspend. So that seems to mismatch what you're saying here (and what the
>>>>>>>>>>>>>> commit message says) and what the code does. And this function here is
>>>>>>>>>>>>>> called for resume after suspend/hibernate only.
>>>>>>>>>>>>> pc10 entry explanation I told is for skylake. dc9 in skylake is not possible.
>>>>>>>>>>>>> I think you are confusing between dc6 and dc9. Pc10 can be achieved by
>>>>>>>>>>>>> entering into dc6 (not dc9) for skylake. dc9 is the lowest possible state
>>>>>>>>>>>>> for broxton which is not present for skylake.
>>>>>>>>>>>> I have no idea at all about different pc levels on skl. What I'm talking
>>>>>>>>>>>> about is system suspend/resume and driver load, which are the places this
>>>>>>>>>>>> function gets called. At least afaics.
>>>>>>>>>>>>
>>>>>>>>>>>>> Here intel_csr_load_program() will be used for both skylake and broxton, and instruction
>>>>>>>>>>>>> execution flow will be different in case of suspend/resume which I think is confusing
>>>>>>>>>>>>> you.
>>>>>>>>>>>> That seems like really important information. What's different on bxt?
>>>>>>>>>>>> These are the kind of details you should explain in the commit message ...
>>>>>>>>>>>>
>>>>>>>>>>>>> I am ready to explain it to you in detail. It will be good if we discuss specific use-case scenarios
>>>>>>>>>>>>> and their software design for a specific platform. Another point - as dmc related code for
>>>>>>>>>>>>> broxton is not merged better first we close design for skylake. Now, I have added dc9
>>>>>>>>>>>>> description in comment thinking of future. If you want I can remove for now and later
>>>>>>>>>>>>> can add in bxt patch series for enabling dmc. Will wait for your reply.
>>>>>>>>>>>> This question here isn't about the overall design and how to handle power
>>>>>>>>>>>> wells in skl/bxt. That's a separate discussion and tracked somewhere else.
>>>>>>>>>>>> I'm really just confused about when exactly we need to reload to firmware,
>>>>>>>>>>>> and why we need a runtime check for that. Normally we should know when to
>>>>>>>>>>>> reload the firmware and just either reload or not, without checking hw
>>>>>>>>>>>> state. And I don't like checking for hw state since at least in the past
>>>>>>>>>>>> that kind of code ended up being fragile - it's an illusion that it does
>>>>>>>>>>>> the right thing no matter what, since often there's other tricky ordering
>>>>>>>>>>>> constraints. And if you have automatic duct-tape like then no one will
>>>>>>>>>>>> ever spot those other, harder to spot issues, until an expensive customer
>>>>>>>>>>>> escalation happens.
>>>>>>>>>>>>
>>>>>>>>>>>> So what I want to know here is:
>>>>>>>>>>>> - When exactly do we need to reload dmc firmware.
>>>>>>>>>>> In skl, during driver load first time we load the firmware, during normal
>>>>>>>>>>> suspend-resume (dc6 entry/exit)
>>>>>>>>>>> no need to reload the firmware again as dmc will take care of it. But during
>>>>>>>>>>> suspend/hibernation
>>>>>>>>>>> dmc will not restore the firmware. In that case driver need to reload it
>>>>>>>>>>> again. I do not know
>>>>>>>>>>> how to differentiate pm-suspend and suspend-hibernation and thought both the
>>>>>>>>>>> cases
>>>>>>>>>>> intel_runtime_resume() will be called where we can check the h/w state and
>>>>>>>>>>> reload the
>>>>>>>>>>> firmware if dmc is not restored.
>>>>>>>>>>>
>>>>>>>>>>> In bxt, during driver load first time we load the firmware, during normal
>>>>>>>>>>> suspend-resume
>>>>>>>>>>> display engine will enter into dc9 and dmc will not restore the firmware. So
>>>>>>>>>>> every
>>>>>>>>>>> suspend-resume we need to reload the firmware.
>>>>>>>>>>>> - What exactly is the reason why we can't make that decision statically in
>>>>>>>>>>>>    the code (by calling csr_load at the right spots).
>>>>>>>>>>> As I mentioned before in case of skylake can we differentiate between
>>>>>>>>>>> "resume from pm-suspend" with "resume from suspend-hibernation" inside
>>>>>>>>>>> driver?
>>>>>>>>>>>
>>>>>>>>>>> In case of broxton, every time we need to reload, so we can decide
>>>>>>>>>>> statically.
>>>>>>>>>> Of course we can differentiate between all the different resume paths, and
>>>>>>>>>> we also have a per-platform split to take care of bxt vs. skl. And there
>>>>>>>>>> are actually 3 different resume paths:
>>>>>>>>>>
>>>>>>>>>> - runtime PM resume. This calls the runtime_resume hook. It sounds like on
>>>>>>>>>>    skl we should _not_ load the csr firmware, but on bxt we should load it.
>>>>>>>>>>    This can be fixed by removing the intel_csr_load_program call from
>>>>>>>>>>    skl_resume_prepare.
>>>>>>>>>> - resume from hibernate-to-disk (i.e. system completely off, state stored
>>>>>>>>>>    on the swap partition) is done by calling the thaw callbacks.
>>>>>>>>>> - resume from suspend-to-mem (i.e. system in low-power with only memory
>>>>>>>>>>    in self-refresh, all state stored in memory) is done by calling the
>>>>>>>>>>    resume callbacks.
>>>>>>>>>>
>>>>>>>>>> For i915 we use unified handlers in our dev_pm_ops for both thaw and
>>>>>>>>>> resume, but it sounds like that won't be a problem for skl/bxt since we
>>>>>>>>>> need to reload the csr firmware in all cases. Although I'm not perfectly
>>>>>>>>>> sure since you don't explain what kind of resume you mean exactly (since
>>>>>>>>>> you don't use the linux names for them).
>>>>>>>>>>
>>>>>>>>>> Anyway it sounds like we can replace this patch by one where we remove
>>>>>>>>>> that erroneous csr load call from skl runtime pm resume and that's all.
>>>>>>>>>> But I suggest to make sure we get this right we keep the check you're
>>>>>>>>>> adding here, but wrap it in a WARN_ON. Then we'll get a backtrace when
>>>>>>>>>> this is going wrong again. Like this:
>>>>>>>>>>
>>>>>>>>>>          if (WARN_ON(csr_loaded_already()))
>>>>>>>>>>                  return;
>>>>>>>>>>
>>>>>>>>>> Also when redoing the commits please explain in detail what exactly are
>>>>>>>>>> the requirements like you've done above, but please use the standard linux
>>>>>>>>>> names, i.e. "runtime PM" and "hibernate-to-disk" and "suspend-to-mem".
>>>>>>>>> Ok hooray there's more suspend-to-something things I've totally missed:
>>>>>>>>> - suspend-to-idle (done by echo freeze > /sys/power/state) and
>>>>>>>>> - standby (done by echo standby > /sys/power/state)
>>>>>>>>>
>>>>>>>>> And apparently there's really no way for drivers to tell them apart.
>>>>>>>>> Rafael, is there really no way for drivers to take different paths for
>>>>>>>>> these 3 suspend cases? I tried grepping for PM_SUSPEND_ON/STANDBY/MEM
>>>>>>>>> and didn't spot anything.
>>>>>>>>>
>>>>>>>>> Also we're completely missing test coverage for that in igt. That is
>>>>>>>>> something that needs to be fixed asap (yet another case of
>>>>>>>>> combinatorial explosion in igt tests, yay). And at least one of those
>>>>>>>>> suspend-to-idle testcase better be in the BAT.
>>>>>>>>> -Daniel
>>>>>>>>> --
>>>>>>>>> Daniel Vetter
>>>>>>>>> Software Engineer, Intel Corporation
>>>>>>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>>>>>>> --
>>>>>>>> Daniel Vetter
>>>>>>>> Software Engineer, Intel Corporation
>>>>>>>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
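
As a rough summary of the reload requirements quoted above, the static
decision could be sketched as a small lookup like the one below. This is only
a userspace illustration: enum platform, enum wake_path and
csr_needs_reload() are hypothetical names, and two entries are assumptions
rather than statements from the thread (that suspend-to-idle leaves the SKL
firmware alive, following the "firmware still alive" remark, and that BXT
needs a reload after suspend-to-idle as well, following "every
suspend-resume").

```c
#include <stdbool.h>

/* Hypothetical platform selector; only the two platforms discussed. */
enum platform {
	PLATFORM_SKL,
	PLATFORM_BXT,
};

/* Hypothetical names for the ways the driver can (re)start the display. */
enum wake_path {
	WAKE_DRIVER_LOAD,
	WAKE_RUNTIME_PM,	/* runtime PM resume */
	WAKE_SUSPEND_TO_IDLE,	/* "freeze" */
	WAKE_SUSPEND_TO_MEM,	/* S3 */
	WAKE_HIBERNATE,		/* S4, hibernate-to-disk */
};

/*
 * Returns true when the driver itself has to load the DMC/CSR firmware,
 * false when the DMC is expected to have kept or restored it.
 */
bool csr_needs_reload(enum platform platform, enum wake_path path)
{
	/* The first load at driver load time is always needed. */
	if (path == WAKE_DRIVER_LOAD)
		return true;

	/* Per the thread, BXT enters DC9 and loses the firmware every time. */
	if (platform == PLATFORM_BXT)
		return true;

	/* SKL: only the paths that go through platform firmware lose it. */
	switch (path) {
	case WAKE_SUSPEND_TO_MEM:
	case WAKE_HIBERNATE:
		return true;
	case WAKE_RUNTIME_PM:
	case WAKE_SUSPEND_TO_IDLE:
	default:
		return false;
	}
}
```

With a table like this the hardware check would no longer decide anything and
could stay behind a WARN_ON as a pure consistency check.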


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl.
  2015-08-25 20:06 [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Animesh Manna
                   ` (4 preceding siblings ...)
  2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 5/5] drm/i915/skl: Block disable call for pw1 if dmc " Animesh Manna
@ 2015-10-09 13:58 ` Imre Deak
  5 siblings, 0 replies; 51+ messages in thread
From: Imre Deak @ 2015-10-09 13:58 UTC (permalink / raw)
  To: Animesh Manna; +Cc: intel-gfx, Runyan, Arthur J

After discussions with Art, Ville and others internally on how DC5/6
should normally work and existing open issues related to the HW
automatic DC5/6 functionality, I have the following understanding.

In the future we will require the firmware for any display side runtime
power management (and possibly some power savings during system power
states like S0ix and S3), so in the following we can ignore the case
when the firmware is not available.

Normally we need to run the BSpec display init sequence only when
loading the driver and resuming from S4 if the BIOS hasn't done so
already. PW1 and CDCLK are enabled as part of this sequence and will be
toggled by the DMC firmware on demand, so we must not toggle these
manually during runtime, since that would race with the FW. The same
goes for the other resources enabled/reset in the init sequence: PCH
reset handshake, DBUF, Misc IO. The manual controls are DC5, DC6 and PW2
which the driver needs to toggle based on the active outputs. We also
need to disable DC5/6 while doing modeset and, according to our tests,
during DPAUX transfers.
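
To make the split above concrete, here is a small sketch of it as a lookup.
It only restates the paragraph; enum disp_resource and
driver_may_toggle_at_runtime() are hypothetical names for illustration and
are not existing i915 symbols.

```c
#include <stdbool.h>

/* Hypothetical labels for the resources named in the paragraph above. */
enum disp_resource {
	RES_PW1,
	RES_CDCLK,
	RES_PCH_RESET_HANDSHAKE,
	RES_DBUF,
	RES_MISC_IO,
	RES_PW2,
	RES_DC5,
	RES_DC6,
};

/*
 * Returns true for the manual controls the driver toggles at runtime, and
 * false for resources that are set up once in the init sequence and then
 * left to the DMC firmware (touching them at runtime would race with it).
 */
bool driver_may_toggle_at_runtime(enum disp_resource res)
{
	switch (res) {
	case RES_PW2:
	case RES_DC5:
	case RES_DC6:
		return true;
	case RES_PW1:
	case RES_CDCLK:
	case RES_PCH_RESET_HANDSHAKE:
	case RES_DBUF:
	case RES_MISC_IO:
		return false;
	}
	return false;
}
```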

When HW enters DC6 and the GT side is idle too (in RC6) the system as a
whole can enter a deeper power state (PC9/10).

Due to a HW issue we currently cannot rely on the HW transitioning
automatically to DC6, so we need to use a manual way instead. We do this
by keeping DC6 disabled all the time and running a special
deinitialization sequence when both the display and GT side are idle, and
the reverse when something becomes active (that is, in the driver's
runtime suspend/resume hooks).

Currently the driver is not following the normal HW assisted flow, since
it toggles DBUF, CDCLK and PW1 manually during runtime. It also lacks
the manual sequence to enable PC9/10.

This patchset solves some of the above issues, by removing the manual
CDCLK and PW1 control and disabling DC6. This will solve some of the
stability issues, but leave PC9/10 disabled. As a follow-up we need to
- move the rest of the display init/uninit sequence out of the
power_well/suspend/resume path and call it only during driver loading/S4
resume as necessary
- add separate DC5/DC6 power wells
- add the manual sequence to enable PC9/10
- add a module parameter to enable either HW assisted DC6, or the manual
way, or disable DC6
- disable DC5/6 during modeset and other places like DPAUX as needed

I'll comment on the individual patches based on the above. I think with
those addressed we could merge this patchset and do the rest as a
follow-up.

--Imre

On ke, 2015-08-26 at 01:36 +0530, Animesh Manna wrote:
> The following patches helps to solve PC10 entry issue for SKL.
> Detailed description about the changes done to solve the issue
> is mentioned in commit message of each patch.
> 
> v1: http://lists.freedesktop.org/archives/intel-gfx/2015-August/072870.html
> 
> v2: Based on review comments from Daniel, changes made in the current version.
> 
> DMC redesign patch series has dependencies with current patch series. Need
> to rework on few patches, planning to send after initial review feedback
> of the current patch series.
> http://lists.freedesktop.org/archives/intel-gfx/2015-August/072921.html
>  
> 
> Animesh Manna (5):
>   drm/i915/skl: Added a check for the hardware status of csr fw before
>     loading.
>   drm/i915/skl Remove the call for csr uninitialization from suspend
>     path
>   drm/i915/skl: Making DC6 entry is the last call in suspend flow.
>   drm/i915/skl: Do not disable cdclk PLL if csr firmware is present
>   drm/i915/skl: Block disable call for pw1 if dmc firmware is present.
> 
>  drivers/gpu/drm/i915/i915_drv.c         | 19 +++++++++++++------
>  drivers/gpu/drm/i915/intel_csr.c        |  9 +++++++++
>  drivers/gpu/drm/i915/intel_display.c    | 14 ++++++++++----
>  drivers/gpu/drm/i915/intel_drv.h        |  2 ++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 31 ++++++++++++++++---------------
>  5 files changed, 50 insertions(+), 25 deletions(-)
> 



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_V3] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-09-29  5:31     ` [DMC_BUGFIX_V3] " Animesh Manna
@ 2015-10-16 12:22       ` Imre Deak
  2015-10-19  9:26         ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Imre Deak @ 2015-10-16 12:22 UTC (permalink / raw)
  To: Animesh Manna; +Cc: intel-gfx, Rajneesh Bhardwaj, Daniel Vetter

On ti, 2015-09-29 at 11:01 +0530, Animesh Manna wrote:
> Mmio register access after dc6/dc5 entry is not allowed when
> DC6 power states are enabled according to bspec (bspec-id 0527),
> so enabling dc6 as the last call in suspend flow.
> 
> v1: Initial version.
> 
> v2: Based on review comment from Daniel,
> - created a separate patch for the csr uninitialization set call.
> 
> v3: Rebased on top of latest code.
> 
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Imre Deak <imre.deak@intel.com>
> Cc: Sunil Kamath <sunil.kamath@intel.com>
> Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>

Acked-by: Imre Deak <imre.deak@intel.com>

Suggestion for the commit message:

Currently we keep DC6 enabled during modesets and DPAUX transfers, which
is not allowed according to the specification. This can lead at least to
PLL locking failures, DPAUX timeouts and prevent deeper package power
states (PC9/10). Fix this for now by enabling DC6 only when we know the
above events (modeset, DPAUX) can't happen.

This is a temporary solution as some issues are still unsolved, as described
in [1] and [2]; we'll address those as a follow-up.

[1]
http://lists.freedesktop.org/archives/intel-gfx/2015-October/077669.html
[2]
http://lists.freedesktop.org/archives/intel-gfx/2015-October/077787.html
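
To illustrate the rule in the suggested commit message (DC6 only while no
modeset or DPAUX transfer can happen), here is a small userspace sketch of a
counting gate. This is not what the patch does, since the patch simply
enables DC6 at the end of the suspend flow; struct dc6_gate and the
dc6_gate_* helpers are hypothetical and only model one possible way to
express the constraint.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for when DC6 may be enabled. */
struct dc6_gate {
	unsigned int blockers;	/* in-flight modesets and DPAUX transfers */
	bool fw_loaded;		/* DMC firmware present and loaded */
	bool dc6_enabled;	/* what would be programmed into the HW */
};

/* DC6 is allowed only with firmware loaded and nothing blocking it. */
static void dc6_gate_update(struct dc6_gate *gate)
{
	gate->dc6_enabled = gate->fw_loaded && gate->blockers == 0;
}

/* Record the firmware load status and recompute the DC6 state. */
void dc6_gate_set_fw_loaded(struct dc6_gate *gate, bool loaded)
{
	gate->fw_loaded = loaded;
	dc6_gate_update(gate);
}

/* Call before starting a modeset or a DPAUX transfer. */
void dc6_gate_block(struct dc6_gate *gate)
{
	gate->blockers++;
	dc6_gate_update(gate);
}

/* Call once the modeset or DPAUX transfer has finished. */
void dc6_gate_unblock(struct dc6_gate *gate)
{
	assert(gate->blockers > 0);
	gate->blockers--;
	dc6_gate_update(gate);
}
```

The counter is there because a modeset and a DPAUX transfer can overlap, so
DC6 may only come back once the last of them has finished.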


> ---
>  drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
>  drivers/gpu/drm/i915/intel_drv.h        |  2 ++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
>  3 files changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 1cb6b82..51075d5 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1049,10 +1049,20 @@ static int i915_pm_resume(struct device *dev)
>  
>  static int skl_suspend_complete(struct drm_i915_private *dev_priv)
>  {
> +	enum csr_state state;
>  	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
>  
>  	skl_uninit_cdclk(dev_priv);
>  
> +	/* TODO: wait for a completion event or
> +	 * similar here instead of busy
> +	 * waiting using wait_for function.
> +	 */
> +	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
> +			FW_UNINITIALIZED, 1000);
> +	if (state == FW_LOADED)
> +		skl_enable_dc6(dev_priv);
> +
>  	return 0;
>  }
>  
> @@ -1099,6 +1109,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
>  {
>  	struct drm_device *dev = dev_priv->dev;
>  
> +	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
> +		skl_disable_dc6(dev_priv);
> +
>  	skl_init_cdclk(dev_priv);
>  	intel_csr_load_program(dev);
>  
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index c96289d..990161d 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1143,6 +1143,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
>  void bxt_disable_dc9(struct drm_i915_private *dev_priv);
>  void skl_init_cdclk(struct drm_i915_private *dev_priv);
>  void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
> +void skl_enable_dc6(struct drm_i915_private *dev_priv);
> +void skl_disable_dc6(struct drm_i915_private *dev_priv);
>  void intel_dp_get_m_n(struct intel_crtc *crtc,
>  		      struct intel_crtc_state *pipe_config);
>  void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index d8e9416..d6b4f61 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -551,7 +551,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
>  		  "DC6 already programmed to be disabled.\n");
>  }
>  
> -static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> +void skl_enable_dc6(struct drm_i915_private *dev_priv)
>  {
>  	uint32_t val;
>  
> @@ -568,7 +568,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
>  	POSTING_READ(DC_STATE_EN);
>  }
>  
> -static void skl_disable_dc6(struct drm_i915_private *dev_priv)
> +void skl_disable_dc6(struct drm_i915_private *dev_priv)
>  {
>  	uint32_t val;
>  
> @@ -629,10 +629,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  				!I915_READ(HSW_PWR_WELL_BIOS),
>  				"Invalid for power well status to be enabled, unless done by the BIOS, \
>  				when request is to disable!\n");
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> -				power_well->data == SKL_DISP_PW_2) {
> +			if (power_well->data == SKL_DISP_PW_2) {
> +				if (GEN9_ENABLE_DC5(dev))
> +					gen9_disable_dc5(dev_priv);
>  				if (SKL_ENABLE_DC6(dev)) {
> -					skl_disable_dc6(dev_priv);
>  					/*
>  					 * DDI buffer programming unnecessary during driver-load/resume
>  					 * as it's already done during modeset initialization then.
> @@ -640,8 +640,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  					 */
>  					if (!dev_priv->power_domains.initializing)
>  						intel_prepare_ddi(dev);
> -				} else {
> -					gen9_disable_dc5(dev_priv);
>  				}
>  			}
>  			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
> @@ -667,7 +665,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  				DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
>  			}
>  
> -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> +			if (GEN9_ENABLE_DC5(dev) &&
>  				power_well->data == SKL_DISP_PW_2) {
>  				enum csr_state state;
>  				/* TODO: wait for a completion event or
> @@ -680,10 +678,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
>  					DRM_DEBUG("CSR firmware not ready (%d)\n",
>  							state);
>  				else
> -					if (SKL_ENABLE_DC6(dev))
> -						skl_enable_dc6(dev_priv);
> -					else
> -						gen9_enable_dc5(dev_priv);
> +					gen9_enable_dc5(dev_priv);
>  			}
>  		}
>  	}
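
As a reading aid for the skl_suspend_complete() hunk quoted above, here is a
userspace sketch of the "poll the load status, then enable DC6 only if the
firmware is loaded" step. The quoted code uses the driver's wait_for() with a
timeout of 1000; the sketch replaces that with a plain poll count. The
callback type, csr_wait_settled(), suspend_should_enable_dc6(), struct
fake_loader and fake_status() are hypothetical, and FW_FAILED is an assumed
third state that is not visible in the quoted hunk.

```c
#include <stdbool.h>

/* FW_FAILED is an assumption; the hunk only shows the first two states. */
enum csr_state {
	FW_UNINITIALIZED = 0,
	FW_LOADED,
	FW_FAILED,
};

/* Hypothetical status getter, standing in for the driver's status query. */
typedef enum csr_state (*csr_status_fn)(void *ctx);

/*
 * Poll until the status leaves FW_UNINITIALIZED or max_polls is used up.
 * Returns the last status seen.
 */
enum csr_state csr_wait_settled(csr_status_fn get, void *ctx,
				unsigned int max_polls)
{
	enum csr_state state = FW_UNINITIALIZED;
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		state = get(ctx);
		if (state != FW_UNINITIALIZED)
			break;
	}
	return state;
}

/* DC6 is enabled as the last suspend step only with loaded firmware. */
bool suspend_should_enable_dc6(csr_status_fn get, void *ctx,
			       unsigned int max_polls)
{
	return csr_wait_settled(get, ctx, max_polls) == FW_LOADED;
}

/* Hypothetical test double: reports final_state after ready_after polls. */
struct fake_loader {
	unsigned int ready_after;
	enum csr_state final_state;
	unsigned int polls;
};

enum csr_state fake_status(void *ctx)
{
	struct fake_loader *loader = ctx;

	if (loader->polls++ < loader->ready_after)
		return FW_UNINITIALIZED;
	return loader->final_state;
}
```

A timeout and a failed load both end up skipping DC6, which matches the
comment in the hunk that enabling DC6 is not a hard requirement for entering
runtime D3.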



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [DMC_BUGFIX_V3] drm/i915/skl: Making DC6 entry is the last call in suspend flow.
  2015-10-16 12:22       ` Imre Deak
@ 2015-10-19  9:26         ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2015-10-19  9:26 UTC (permalink / raw)
  To: Imre Deak; +Cc: Rajneesh Bhardwaj, intel-gfx, Daniel Vetter

On Fri, Oct 16, 2015 at 03:22:45PM +0300, Imre Deak wrote:
> On ti, 2015-09-29 at 11:01 +0530, Animesh Manna wrote:
> > Mmio register access after dc6/dc5 entry is not allowed when
> > DC6 power states are enabled according to bspec (bspec-id 0527),
> > so enabling dc6 as the last call in suspend flow.
> > 
> > v1: Initial version.
> > 
> > v2: Based on review comment from Daniel,
> > - created a separate patch for the csr uninitialization set call.
> > 
> > v3: Rebased on top of latest code.
> > 
> > Cc: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Damien Lespiau <damien.lespiau@intel.com>
> > Cc: Imre Deak <imre.deak@intel.com>
> > Cc: Sunil Kamath <sunil.kamath@intel.com>
> > Signed-off-by: Animesh Manna <animesh.manna@intel.com>
> > Signed-off-by: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
> > Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@intel.com>
> 
> Acked-by: Imre Deak <imre.deak@intel.com>
> 
> Suggestion for the commit message:
> 
> Currently we keep DC6 enabled during modesets and DPAUX transfers, which
> is not allowed according to the specification. This can lead at least to
> PLL locking failures, DPAUX timeouts and prevent deeper package power
> states (PC9/10). Fix this for now by enabling DC6 only when we know the
> above events (modeset, DPAUX) can't happen.
> 
> This is a temporary solution as some issues are still unsolved, as described
> in [1] and [2]; we'll address those as a follow-up.
> 
> [1]
> http://lists.freedesktop.org/archives/intel-gfx/2015-October/077669.html
> [2]
> http://lists.freedesktop.org/archives/intel-gfx/2015-October/077787.html

Queued for -next, thanks for the patch.
-Daniel

> 
> 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c         | 13 +++++++++++++
> >  drivers/gpu/drm/i915/intel_drv.h        |  2 ++
> >  drivers/gpu/drm/i915/intel_runtime_pm.c | 19 +++++++------------
> >  3 files changed, 22 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 1cb6b82..51075d5 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1049,10 +1049,20 @@ static int i915_pm_resume(struct device *dev)
> >  
> >  static int skl_suspend_complete(struct drm_i915_private *dev_priv)
> >  {
> > +	enum csr_state state;
> >  	/* Enabling DC6 is not a hard requirement to enter runtime D3 */
> >  
> >  	skl_uninit_cdclk(dev_priv);
> >  
> > +	/* TODO: wait for a completion event or
> > +	 * similar here instead of busy
> > +	 * waiting using wait_for function.
> > +	 */
> > +	wait_for((state = intel_csr_load_status_get(dev_priv)) !=
> > +			FW_UNINITIALIZED, 1000);
> > +	if (state == FW_LOADED)
> > +		skl_enable_dc6(dev_priv);
> > +
> >  	return 0;
> >  }
> >  
> > @@ -1099,6 +1109,9 @@ static int skl_resume_prepare(struct drm_i915_private *dev_priv)
> >  {
> >  	struct drm_device *dev = dev_priv->dev;
> >  
> > +	if (intel_csr_load_status_get(dev_priv) == FW_LOADED)
> > +		skl_disable_dc6(dev_priv);
> > +
> >  	skl_init_cdclk(dev_priv);
> >  	intel_csr_load_program(dev);
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> > index c96289d..990161d 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -1143,6 +1143,8 @@ void bxt_enable_dc9(struct drm_i915_private *dev_priv);
> >  void bxt_disable_dc9(struct drm_i915_private *dev_priv);
> >  void skl_init_cdclk(struct drm_i915_private *dev_priv);
> >  void skl_uninit_cdclk(struct drm_i915_private *dev_priv);
> > +void skl_enable_dc6(struct drm_i915_private *dev_priv);
> > +void skl_disable_dc6(struct drm_i915_private *dev_priv);
> >  void intel_dp_get_m_n(struct intel_crtc *crtc,
> >  		      struct intel_crtc_state *pipe_config);
> >  void intel_dp_set_m_n(struct intel_crtc *crtc, enum link_m_n_set m_n);
> > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > index d8e9416..d6b4f61 100644
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -551,7 +551,7 @@ static void assert_can_disable_dc6(struct drm_i915_private *dev_priv)
> >  		  "DC6 already programmed to be disabled.\n");
> >  }
> >  
> > -static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> > +void skl_enable_dc6(struct drm_i915_private *dev_priv)
> >  {
> >  	uint32_t val;
> >  
> > @@ -568,7 +568,7 @@ static void skl_enable_dc6(struct drm_i915_private *dev_priv)
> >  	POSTING_READ(DC_STATE_EN);
> >  }
> >  
> > -static void skl_disable_dc6(struct drm_i915_private *dev_priv)
> > +void skl_disable_dc6(struct drm_i915_private *dev_priv)
> >  {
> >  	uint32_t val;
> >  
> > @@ -629,10 +629,10 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >  				!I915_READ(HSW_PWR_WELL_BIOS),
> >  				"Invalid for power well status to be enabled, unless done by the BIOS, \
> >  				when request is to disable!\n");
> > -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> > -				power_well->data == SKL_DISP_PW_2) {
> > +			if (power_well->data == SKL_DISP_PW_2) {
> > +				if (GEN9_ENABLE_DC5(dev))
> > +					gen9_disable_dc5(dev_priv);
> >  				if (SKL_ENABLE_DC6(dev)) {
> > -					skl_disable_dc6(dev_priv);
> >  					/*
> >  					 * DDI buffer programming unnecessary during driver-load/resume
> >  					 * as it's already done during modeset initialization then.
> > @@ -640,8 +640,6 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >  					 */
> >  					if (!dev_priv->power_domains.initializing)
> >  						intel_prepare_ddi(dev);
> > -				} else {
> > -					gen9_disable_dc5(dev_priv);
> >  				}
> >  			}
> >  			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
> > @@ -667,7 +665,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >  				DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
> >  			}
> >  
> > -			if ((GEN9_ENABLE_DC5(dev) || SKL_ENABLE_DC6(dev)) &&
> > +			if (GEN9_ENABLE_DC5(dev) &&
> >  				power_well->data == SKL_DISP_PW_2) {
> >  				enum csr_state state;
> >  				/* TODO: wait for a completion event or
> > @@ -680,10 +678,7 @@ static void skl_set_power_well(struct drm_i915_private *dev_priv,
> >  					DRM_DEBUG("CSR firmware not ready (%d)\n",
> >  							state);
> >  				else
> > -					if (SKL_ENABLE_DC6(dev))
> > -						skl_enable_dc6(dev_priv);
> > -					else
> > -						gen9_enable_dc5(dev_priv);
> > +					gen9_enable_dc5(dev_priv);
> >  			}
> >  		}
> >  	}
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


end of thread, other threads:[~2015-10-19  9:26 UTC | newest]

Thread overview: 51+ messages
2015-08-25 20:06 [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Animesh Manna
2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 1/5] drm/i915/skl: Added a check for the hardware status of csr fw before loading Animesh Manna
2015-08-26 13:10   ` Daniel Vetter
2015-08-26 14:10     ` Animesh Manna
2015-09-02  8:54       ` Daniel Vetter
2015-09-09 20:28         ` Animesh Manna
2015-09-10 14:45           ` Daniel Vetter
2015-09-10 19:05             ` Animesh Manna
2015-09-10 19:06             ` Animesh Manna
2015-09-14  7:46               ` Daniel Vetter
2015-09-16 19:23                 ` Animesh Manna
2015-09-23  7:57                   ` Daniel Vetter
2015-09-23 16:27                     ` Daniel Vetter
2015-09-23 16:28                       ` Daniel Vetter
2015-09-23 17:17                         ` Daniel Vetter
2015-09-23 20:49                           ` Rafael J. Wysocki
2015-09-28  6:52                             ` Daniel Vetter
2015-09-28 23:54                               ` Rafael J. Wysocki
2015-09-29  8:51                                 ` Daniel Vetter
2015-09-30  0:50                                   ` Rafael J. Wysocki
2015-09-30 12:14                                     ` Daniel Vetter
2015-09-30 23:34                                       ` Rafael J. Wysocki
2015-09-07 11:04   ` Sunil Kamath
2015-09-07 16:22     ` Daniel Vetter
2015-09-09 20:33       ` Animesh Manna
2015-09-28  7:03     ` Daniel Vetter
2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 2/5] drm/i915/skl Remove the call for csr uninitialization from suspend path Animesh Manna
2015-09-07 11:05   ` Sunil Kamath
2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 3/5] drm/i915/skl: Making DC6 entry is the last call in suspend flow Animesh Manna
2015-09-07 11:06   ` Sunil Kamath
2015-09-28  7:21   ` Daniel Vetter
2015-09-28 18:49     ` Hindman, Gavin
2015-09-29  5:31     ` [DMC_BUGFIX_V3] " Animesh Manna
2015-10-16 12:22       ` Imre Deak
2015-10-19  9:26         ` Daniel Vetter
2015-09-29  5:38     ` [DMC_BUGFIX_SKL_V2 3/5] " Animesh Manna
2015-09-29  9:01       ` Daniel Vetter
2015-09-29 12:35         ` Patrik Jakobsson
2015-09-29 13:01           ` Daniel Vetter
2015-09-29 13:23             ` Ville Syrjälä
2015-09-29 14:00               ` Daniel Vetter
2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 4/5] drm/i915/skl: Do not disable cdclk PLL if csr firmware is present Animesh Manna
2015-08-26 13:11   ` Daniel Vetter
2015-08-26 14:31     ` Animesh Manna
2015-08-31  1:03     ` Hindman, Gavin
2015-09-02  8:58       ` Daniel Vetter
2015-09-07 11:07   ` Sunil Kamath
2015-08-25 20:06 ` [DMC_BUGFIX_SKL_V2 5/5] drm/i915/skl: Block disable call for pw1 if dmc " Animesh Manna
2015-09-07 11:09   ` Sunil Kamath
2015-09-28  7:24     ` Daniel Vetter
2015-10-09 13:58 ` [DMC_BUGFIX_SKL_V2 0/5] pc10 entry fixes for skl Imre Deak
