All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] drm/i915: improve the RPM device suspended assert
@ 2015-11-09 18:20 Imre Deak
  2015-11-09 18:20 ` [PATCH 1/4] drm/i915: export assert_device_not_suspended Imre Deak
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-09 18:20 UTC (permalink / raw)
  To: intel-gfx

Motivated by the discussions with Chris about imrpoving our RPM ref
get/put logic and seeing that we are still missing RPM refs from
places I improved a bit the assert that checks if the device is powered
on while we are accessing the HW. This produced at least one additional
assert on the atomic commit path, so I think it's useful.

This is based on Patrik's DC rework patches [1], but just to avoid merge
conflicts, functionally it doesn't depend on that.

Tested on SKL.

[1]
http://lists.freedesktop.org/archives/intel-gfx/2015-November/079758.html

Imre Deak (4):
  drm/i915: export assert_device_not_suspended
  drm/i915: use assert_device_not_suspended instead of opencoding it
  drm/i915: make assert_device_not_suspended more precise
  drm/i915: add assert_device_not_suspended to GGTT PTE updaters

 drivers/gpu/drm/i915/i915_drv.c         |  5 +++++
 drivers/gpu/drm/i915/i915_drv.h         |  5 +++++
 drivers/gpu/drm/i915/i915_gem_gtt.c     |  8 ++++++++
 drivers/gpu/drm/i915/intel_drv.h        |  2 ++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 26 ++++++++++++++++++++------
 drivers/gpu/drm/i915/intel_uncore.c     |  9 +--------
 6 files changed, 41 insertions(+), 14 deletions(-)

-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] drm/i915: export assert_device_not_suspended
  2015-11-09 18:20 [PATCH 0/4] drm/i915: improve the RPM device suspended assert Imre Deak
@ 2015-11-09 18:20 ` Imre Deak
  2015-11-09 18:30   ` Ville Syrjälä
  2015-11-09 20:52   ` Chris Wilson
  2015-11-09 18:20 ` [PATCH 2/4] drm/i915: use assert_device_not_suspended instead of opencoding it Imre Deak
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-09 18:20 UTC (permalink / raw)
  To: intel-gfx

We should use the same assert in more places, so export it. Move it to
intel_runtime_pm.c at the same time, since that's a more logical place
for it.

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_drv.h        | 2 ++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 6 ++++++
 drivers/gpu/drm/i915/intel_uncore.c     | 7 -------
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index f9089df..373cc07 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1415,6 +1415,8 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
 			     enum intel_display_power_domain domain);
 void intel_display_power_put(struct drm_i915_private *dev_priv,
 			     enum intel_display_power_domain domain);
+
+void assert_device_not_suspended(struct drm_i915_private *dev_priv);
 void intel_runtime_pm_get(struct drm_i915_private *dev_priv);
 void intel_runtime_pm_get_noresume(struct drm_i915_private *dev_priv);
 void intel_runtime_pm_put(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index b42506b..0d4a03b 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -2120,6 +2120,12 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
 	power_domains->initializing = false;
 }
 
+void assert_device_not_suspended(struct drm_i915_private *dev_priv)
+{
+	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
+		  "Device suspended\n");
+}
+
 /**
  * intel_power_domains_suspend - suspend power domain state
  * @dev_priv: i915 device instance
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 5bb269c..2b93a68 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -50,13 +50,6 @@ intel_uncore_forcewake_domain_to_str(const enum forcewake_domain_id id)
 	return "unknown";
 }
 
-static void
-assert_device_not_suspended(struct drm_i915_private *dev_priv)
-{
-	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
-		  "Device suspended\n");
-}
-
 static inline void
 fw_domain_reset(const struct intel_uncore_forcewake_domain *d)
 {
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/4] drm/i915: use assert_device_not_suspended instead of opencoding it
  2015-11-09 18:20 [PATCH 0/4] drm/i915: improve the RPM device suspended assert Imre Deak
  2015-11-09 18:20 ` [PATCH 1/4] drm/i915: export assert_device_not_suspended Imre Deak
@ 2015-11-09 18:20 ` Imre Deak
  2015-11-09 21:04   ` Chris Wilson
  2015-11-09 18:20 ` [PATCH 3/4] drm/i915: make assert_device_not_suspended more precise Imre Deak
  2015-11-09 18:20 ` [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters Imre Deak
  3 siblings, 1 reply; 25+ messages in thread
From: Imre Deak @ 2015-11-09 18:20 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/intel_runtime_pm.c | 10 ++++------
 drivers/gpu/drm/i915/intel_uncore.c     |  2 +-
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 0d4a03b..4d39b3c 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -464,8 +464,7 @@ static void assert_can_enable_dc5(struct drm_i915_private *dev_priv)
 
 	WARN_ONCE((I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5),
 		  "DC5 already programmed to be enabled.\n");
-	WARN_ONCE(dev_priv->pm.suspended,
-		  "DC5 cannot be enabled, if platform is runtime-suspended.\n");
+	assert_device_not_suspended(dev_priv);
 
 	assert_csr_loaded(dev_priv);
 }
@@ -479,8 +478,7 @@ static void assert_can_disable_dc5(struct drm_i915_private *dev_priv)
 	if (dev_priv->power_domains.initializing)
 		return;
 
-	WARN_ONCE(dev_priv->pm.suspended,
-		"Disabling of DC5 while platform is runtime-suspended should never happen.\n");
+	assert_device_not_suspended(dev_priv);
 }
 
 static void gen9_enable_dc5(struct drm_i915_private *dev_priv)
@@ -2158,7 +2156,7 @@ void intel_runtime_pm_get(struct drm_i915_private *dev_priv)
 		return;
 
 	pm_runtime_get_sync(device);
-	WARN(dev_priv->pm.suspended, "Device still suspended.\n");
+	assert_device_not_suspended(dev_priv);
 }
 
 /**
@@ -2186,7 +2184,7 @@ void intel_runtime_pm_get_noresume(struct drm_i915_private *dev_priv)
 	if (!HAS_RUNTIME_PM(dev))
 		return;
 
-	WARN(dev_priv->pm.suspended, "Getting nosync-ref while suspended.\n");
+	assert_device_not_suspended(dev_priv);
 	pm_runtime_get_noresume(device);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 2b93a68..b4aea91 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -404,7 +404,7 @@ void intel_uncore_forcewake_get(struct drm_i915_private *dev_priv,
 	if (!dev_priv->uncore.funcs.force_wake_get)
 		return;
 
-	WARN_ON(dev_priv->pm.suspended);
+	assert_device_not_suspended(dev_priv);
 
 	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 	__intel_uncore_forcewake_get(dev_priv, fw_domains);
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-09 18:20 [PATCH 0/4] drm/i915: improve the RPM device suspended assert Imre Deak
  2015-11-09 18:20 ` [PATCH 1/4] drm/i915: export assert_device_not_suspended Imre Deak
  2015-11-09 18:20 ` [PATCH 2/4] drm/i915: use assert_device_not_suspended instead of opencoding it Imre Deak
@ 2015-11-09 18:20 ` Imre Deak
  2015-11-09 19:13   ` [PATCH v2 " Imre Deak
  2015-11-09 18:20 ` [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters Imre Deak
  3 siblings, 1 reply; 25+ messages in thread
From: Imre Deak @ 2015-11-09 18:20 UTC (permalink / raw)
  To: intel-gfx

Atm, we assert that the device is not suspended after the point when the
HW is truly put to a suspended state. This is fine, but we can catch
more problems if we check the RPM refcount. After that one drops to zero
we shouldn't access the HW any more, although the actual suspend may be
delayed. The only complication is that we want to avoid asserts while
the suspend handler itself is running, so add a flag to handle this
case.

This caught additional WARNs from the atomic path, those will be fixed
as a follow-up.

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         |  5 +++++
 drivers/gpu/drm/i915/i915_drv.h         |  5 +++++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 77d183d..caeb218 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1494,6 +1494,9 @@ static int intel_runtime_suspend(struct device *device)
 
 		return -EAGAIN;
 	}
+
+	dev_priv->pm.disable_suspended_assert = true;
+
 	/*
 	 * We are safe here against re-faults, since the fault handler takes
 	 * an RPM reference.
@@ -1518,6 +1521,8 @@ static int intel_runtime_suspend(struct device *device)
 	intel_uncore_forcewake_reset(dev, false);
 	dev_priv->pm.suspended = true;
 
+	dev_priv->pm.disable_suspended_assert = false;
+
 	/*
 	 * FIXME: We really should find a document that references the arguments
 	 * used below!
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5628c5a..43fd341 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1599,6 +1599,11 @@ struct skl_wm_level {
  * For more, read the Documentation/power/runtime_pm.txt.
  */
 struct i915_runtime_pm {
+	/*
+	 * Used for the duration of runtime suspend to avoid false device
+	 * suspended asserts.
+	 */
+	bool disable_suspended_assert;
 	bool suspended;
 	bool irqs_enabled;
 };
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 4d39b3c..2e2083c 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
 
 void assert_device_not_suspended(struct drm_i915_private *dev_priv)
 {
-	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
-		  "Device suspended\n");
+	int rpm_usage;
+
+	if (!HAS_RUNTIME_PM(dev_priv) || dev_priv->pm.disable_suspended_assert)
+		return;
+
+#ifdef CONFIG_PM
+	rpm_usage = atomic_read(&dev_priv->dev->dev->power.usage_count);
+#else
+	rpm_usage = 1;
+#endif
+
+	WARN_ONCE(dev_priv->pm.suspended || !rpm_usage, "Device suspended\n");
 }
 
 /**
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters
  2015-11-09 18:20 [PATCH 0/4] drm/i915: improve the RPM device suspended assert Imre Deak
                   ` (2 preceding siblings ...)
  2015-11-09 18:20 ` [PATCH 3/4] drm/i915: make assert_device_not_suspended more precise Imre Deak
@ 2015-11-09 18:20 ` Imre Deak
  2015-11-09 18:37   ` Ville Syrjälä
  2015-11-09 19:14   ` [PATCH v2 " Imre Deak
  3 siblings, 2 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-09 18:20 UTC (permalink / raw)
  To: intel-gfx

The device should be on when updating the GGTT PTEs, so add an assert to
all relevant places.

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 016739e..d4a84b8 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2358,6 +2358,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	struct sg_page_iter sg_iter;
 	dma_addr_t addr = 0; /* shut up gcc */
 
+	assert_device_not_suspended(vm->dev->dev_private);
+
 	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
 		addr = sg_dma_address(sg_iter.sg) +
 			(sg_iter.sg_pgoffset << PAGE_SHIFT);
@@ -2404,6 +2406,8 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 	struct sg_page_iter sg_iter;
 	dma_addr_t addr = 0;
 
+	assert_device_not_suspended(vm->dev->dev_private);
+
 	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
 		addr = sg_page_iter_dma_address(&sg_iter);
 		iowrite32(vm->pte_encode(addr, level, true, flags), &gtt_entries[i]);
@@ -2442,6 +2446,8 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
 	int i;
 
+	assert_device_not_suspended(vm->dev->dev_private);
+
 	if (WARN(num_entries > max_entries,
 		 "First entry = %d; Num entries = %d (max=%d)\n",
 		 first_entry, num_entries, max_entries))
@@ -2468,6 +2474,8 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
 	int i;
 
+	assert_device_not_suspended(vm->dev->dev_private);
+
 	if (WARN(num_entries > max_entries,
 		 "First entry = %d; Num entries = %d (max=%d)\n",
 		 first_entry, num_entries, max_entries))
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] drm/i915: export assert_device_not_suspended
  2015-11-09 18:20 ` [PATCH 1/4] drm/i915: export assert_device_not_suspended Imre Deak
@ 2015-11-09 18:30   ` Ville Syrjälä
  2015-11-09 18:43     ` Imre Deak
  2015-11-09 20:52   ` Chris Wilson
  1 sibling, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2015-11-09 18:30 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Nov 09, 2015 at 08:20:08PM +0200, Imre Deak wrote:
> We should use the same assert in more places, so export it. Move it to
> intel_runtime_pm.c at the same time, since that's a more logical place
> for it.
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_drv.h        | 2 ++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 6 ++++++
>  drivers/gpu/drm/i915/intel_uncore.c     | 7 -------
>  3 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index f9089df..373cc07 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1415,6 +1415,8 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
>  			     enum intel_display_power_domain domain);
>  void intel_display_power_put(struct drm_i915_private *dev_priv,
>  			     enum intel_display_power_domain domain);
> +
> +void assert_device_not_suspended(struct drm_i915_private *dev_priv);
>  void intel_runtime_pm_get(struct drm_i915_private *dev_priv);
>  void intel_runtime_pm_get_noresume(struct drm_i915_private *dev_priv);
>  void intel_runtime_pm_put(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index b42506b..0d4a03b 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -2120,6 +2120,12 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
>  	power_domains->initializing = false;
>  }
>  
> +void assert_device_not_suspended(struct drm_i915_private *dev_priv)
> +{
> +	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,

Is there any point in the HAS_RUNTIME_PM() check?

> +		  "Device suspended\n");
> +}
> +
>  /**
>   * intel_power_domains_suspend - suspend power domain state
>   * @dev_priv: i915 device instance
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 5bb269c..2b93a68 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -50,13 +50,6 @@ intel_uncore_forcewake_domain_to_str(const enum forcewake_domain_id id)
>  	return "unknown";
>  }
>  
> -static void
> -assert_device_not_suspended(struct drm_i915_private *dev_priv)
> -{
> -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
> -		  "Device suspended\n");
> -}
> -
>  static inline void
>  fw_domain_reset(const struct intel_uncore_forcewake_domain *d)
>  {
> -- 
> 2.5.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters
  2015-11-09 18:20 ` [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters Imre Deak
@ 2015-11-09 18:37   ` Ville Syrjälä
  2015-11-09 18:48     ` Imre Deak
  2015-11-09 19:14   ` [PATCH v2 " Imre Deak
  1 sibling, 1 reply; 25+ messages in thread
From: Ville Syrjälä @ 2015-11-09 18:37 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Nov 09, 2015 at 08:20:11PM +0200, Imre Deak wrote:
> The device should be on when updating the GGTT PTEs, so add an assert to
> all relevant places.
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 016739e..d4a84b8 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2358,6 +2358,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
>  	struct sg_page_iter sg_iter;
>  	dma_addr_t addr = 0; /* shut up gcc */
>  
> +	assert_device_not_suspended(vm->dev->dev_private);
> +

We do have the GFX_FLSH_CNTL_GEN6 write in the insert_entries hooks,
which should already do the check. But I see no harm in doing it
explicitly as well. Serves as documentation if nothing more. The
clear_range hooks don't do any register access, so there we should
definitely have this.

>  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
>  		addr = sg_dma_address(sg_iter.sg) +
>  			(sg_iter.sg_pgoffset << PAGE_SHIFT);
> @@ -2404,6 +2406,8 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>  	struct sg_page_iter sg_iter;
>  	dma_addr_t addr = 0;
>  
> +	assert_device_not_suspended(vm->dev->dev_private);

There's already a dev_priv around in all of these AFAICS.

> +
>  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
>  		addr = sg_page_iter_dma_address(&sg_iter);
>  		iowrite32(vm->pte_encode(addr, level, true, flags), &gtt_entries[i]);
> @@ -2442,6 +2446,8 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
>  	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
>  	int i;
>  
> +	assert_device_not_suspended(vm->dev->dev_private);
> +
>  	if (WARN(num_entries > max_entries,
>  		 "First entry = %d; Num entries = %d (max=%d)\n",
>  		 first_entry, num_entries, max_entries))
> @@ -2468,6 +2474,8 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>  	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
>  	int i;
>  
> +	assert_device_not_suspended(vm->dev->dev_private);
> +
>  	if (WARN(num_entries > max_entries,
>  		 "First entry = %d; Num entries = %d (max=%d)\n",
>  		 first_entry, num_entries, max_entries))
> -- 
> 2.5.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] drm/i915: export assert_device_not_suspended
  2015-11-09 18:30   ` Ville Syrjälä
@ 2015-11-09 18:43     ` Imre Deak
  0 siblings, 0 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-09 18:43 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On ma, 2015-11-09 at 20:30 +0200, Ville Syrjälä wrote:
> On Mon, Nov 09, 2015 at 08:20:08PM +0200, Imre Deak wrote:
> > We should use the same assert in more places, so export it. Move it
> > to
> > intel_runtime_pm.c at the same time, since that's a more logical
> > place
> > for it.
> > 
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_drv.h        | 2 ++
> >  drivers/gpu/drm/i915/intel_runtime_pm.c | 6 ++++++
> >  drivers/gpu/drm/i915/intel_uncore.c     | 7 -------
> >  3 files changed, 8 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_drv.h
> > b/drivers/gpu/drm/i915/intel_drv.h
> > index f9089df..373cc07 100644
> > --- a/drivers/gpu/drm/i915/intel_drv.h
> > +++ b/drivers/gpu/drm/i915/intel_drv.h
> > @@ -1415,6 +1415,8 @@ void intel_display_power_get(struct
> > drm_i915_private *dev_priv,
> >  			     enum intel_display_power_domain
> > domain);
> >  void intel_display_power_put(struct drm_i915_private *dev_priv,
> >  			     enum intel_display_power_domain
> > domain);
> > +
> > +void assert_device_not_suspended(struct drm_i915_private
> > *dev_priv);
> >  void intel_runtime_pm_get(struct drm_i915_private *dev_priv);
> >  void intel_runtime_pm_get_noresume(struct drm_i915_private
> > *dev_priv);
> >  void intel_runtime_pm_put(struct drm_i915_private *dev_priv);
> > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > index b42506b..0d4a03b 100644
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -2120,6 +2120,12 @@ void intel_power_domains_init_hw(struct
> > drm_i915_private *dev_priv, bool resume)
> >  	power_domains->initializing = false;
> >  }
> >  
> > +void assert_device_not_suspended(struct drm_i915_private
> > *dev_priv)
> > +{
> > +	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv
> > ->pm.suspended,
> 
> Is there any point in the HAS_RUNTIME_PM() check?

All other platforms should have pm.suspended = false and a non-zero
refcount all the time, so I guess not. I can remove it in patch 3/4.

> 
> > +		  "Device suspended\n");
> > +}
> > +
> >  /**
> >   * intel_power_domains_suspend - suspend power domain state
> >   * @dev_priv: i915 device instance
> > diff --git a/drivers/gpu/drm/i915/intel_uncore.c
> > b/drivers/gpu/drm/i915/intel_uncore.c
> > index 5bb269c..2b93a68 100644
> > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > @@ -50,13 +50,6 @@ intel_uncore_forcewake_domain_to_str(const enum
> > forcewake_domain_id id)
> >  	return "unknown";
> >  }
> >  
> > -static void
> > -assert_device_not_suspended(struct drm_i915_private *dev_priv)
> > -{
> > -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv
> > ->pm.suspended,
> > -		  "Device suspended\n");
> > -}
> > -
> >  static inline void
> >  fw_domain_reset(const struct intel_uncore_forcewake_domain *d)
> >  {
> > -- 
> > 2.5.0
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters
  2015-11-09 18:37   ` Ville Syrjälä
@ 2015-11-09 18:48     ` Imre Deak
  0 siblings, 0 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-09 18:48 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

On ma, 2015-11-09 at 20:37 +0200, Ville Syrjälä wrote:
> On Mon, Nov 09, 2015 at 08:20:11PM +0200, Imre Deak wrote:
> > The device should be on when updating the GGTT PTEs, so add an
> > assert to
> > all relevant places.
> > 
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 016739e..d4a84b8 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -2358,6 +2358,8 @@ static void gen8_ggtt_insert_entries(struct
> > i915_address_space *vm,
> >  	struct sg_page_iter sg_iter;
> >  	dma_addr_t addr = 0; /* shut up gcc */
> >  
> > +	assert_device_not_suspended(vm->dev->dev_private);
> > +
> 
> We do have the GFX_FLSH_CNTL_GEN6 write in the insert_entries hooks,
> which should already do the check. But I see no harm in doing it
> explicitly as well. Serves as documentation if nothing more. The
> clear_range hooks don't do any register access, so there we should
> definitely have this.

Yes, I think it's better for symmetry with the other updaters. And it's
a WARN_ONCE, so shouldn't add noise.

> 
> >  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
> >  		addr = sg_dma_address(sg_iter.sg) +
> >  			(sg_iter.sg_pgoffset << PAGE_SHIFT);
> > @@ -2404,6 +2406,8 @@ static void gen6_ggtt_insert_entries(struct
> > i915_address_space *vm,
> >  	struct sg_page_iter sg_iter;
> >  	dma_addr_t addr = 0;
> >  
> > +	assert_device_not_suspended(vm->dev->dev_private);
> 
> There's already a dev_priv around in all of these AFAICS.

Yep, just copy pasted these.. Will fix them.

> 
> > +
> >  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
> >  		addr = sg_page_iter_dma_address(&sg_iter);
> >  		iowrite32(vm->pte_encode(addr, level, true,
> > flags), &gtt_entries[i]);
> > @@ -2442,6 +2446,8 @@ static void gen8_ggtt_clear_range(struct
> > i915_address_space *vm,
> >  	const int max_entries = gtt_total_entries(dev_priv->gtt) -
> > first_entry;
> >  	int i;
> >  
> > +	assert_device_not_suspended(vm->dev->dev_private);
> > +
> >  	if (WARN(num_entries > max_entries,
> >  		 "First entry = %d; Num entries = %d (max=%d)\n",
> >  		 first_entry, num_entries, max_entries))
> > @@ -2468,6 +2474,8 @@ static void gen6_ggtt_clear_range(struct
> > i915_address_space *vm,
> >  	const int max_entries = gtt_total_entries(dev_priv->gtt) -
> > first_entry;
> >  	int i;
> >  
> > +	assert_device_not_suspended(vm->dev->dev_private);
> > +
> >  	if (WARN(num_entries > max_entries,
> >  		 "First entry = %d; Num entries = %d (max=%d)\n",
> >  		 first_entry, num_entries, max_entries))
> > -- 
> > 2.5.0
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-09 18:20 ` [PATCH 3/4] drm/i915: make assert_device_not_suspended more precise Imre Deak
@ 2015-11-09 19:13   ` Imre Deak
  2015-11-09 21:44     ` Chris Wilson
  2015-11-18 14:37     ` Daniel Vetter
  0 siblings, 2 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-09 19:13 UTC (permalink / raw)
  To: intel-gfx

Atm, we assert that the device is not suspended after the point when the
HW is truly put to a suspended state. This is fine, but we can catch
more problems if we check the RPM refcount. After that one drops to zero
we shouldn't access the HW any more, although the actual suspend may be
delayed. The only complication is that we want to avoid asserts while
the suspend handler itself is running, so add a flag to handle this
case.

While at it remove the HAS_RUNTIME_PM check, the pm.suspended flag is
false and the RPM refcount is non-zero on all platforms that don't
support RPM.

This caught additional WARNs from the atomic path, those will be fixed
as a follow-up.

v2:
- remove the redundant HAS_RUNTIME_PM check (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         |  5 +++++
 drivers/gpu/drm/i915/i915_drv.h         |  5 +++++
 drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++--
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 77d183d..caeb218 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1494,6 +1494,9 @@ static int intel_runtime_suspend(struct device *device)
 
 		return -EAGAIN;
 	}
+
+	dev_priv->pm.disable_suspended_assert = true;
+
 	/*
 	 * We are safe here against re-faults, since the fault handler takes
 	 * an RPM reference.
@@ -1518,6 +1521,8 @@ static int intel_runtime_suspend(struct device *device)
 	intel_uncore_forcewake_reset(dev, false);
 	dev_priv->pm.suspended = true;
 
+	dev_priv->pm.disable_suspended_assert = false;
+
 	/*
 	 * FIXME: We really should find a document that references the arguments
 	 * used below!
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5628c5a..43fd341 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1599,6 +1599,11 @@ struct skl_wm_level {
  * For more, read the Documentation/power/runtime_pm.txt.
  */
 struct i915_runtime_pm {
+	/*
+	 * Used for the duration of runtime suspend to avoid false device
+	 * suspended asserts.
+	 */
+	bool disable_suspended_assert;
 	bool suspended;
 	bool irqs_enabled;
 };
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 4d39b3c..2bdbcd4 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
 
 void assert_device_not_suspended(struct drm_i915_private *dev_priv)
 {
-	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
-		  "Device suspended\n");
+	int rpm_usage;
+
+	if (dev_priv->pm.disable_suspended_assert)
+		return;
+
+#ifdef CONFIG_PM
+	rpm_usage = atomic_read(&dev_priv->dev->dev->power.usage_count);
+#else
+	rpm_usage = 1;
+#endif
+
+	WARN_ONCE(dev_priv->pm.suspended || !rpm_usage, "Device suspended\n");
 }
 
 /**
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters
  2015-11-09 18:20 ` [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters Imre Deak
  2015-11-09 18:37   ` Ville Syrjälä
@ 2015-11-09 19:14   ` Imre Deak
  2015-11-09 21:11     ` Chris Wilson
  1 sibling, 1 reply; 25+ messages in thread
From: Imre Deak @ 2015-11-09 19:14 UTC (permalink / raw)
  To: intel-gfx

The device should be on when updating the GGTT PTEs, so add an assert to
all relevant places.

v2:
- use the existing dev_priv directly everywhere (Ville)

Signed-off-by: Imre Deak <imre.deak@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 016739e..4d42ca5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2358,6 +2358,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	struct sg_page_iter sg_iter;
 	dma_addr_t addr = 0; /* shut up gcc */
 
+	assert_device_not_suspended(dev_priv);
+
 	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
 		addr = sg_dma_address(sg_iter.sg) +
 			(sg_iter.sg_pgoffset << PAGE_SHIFT);
@@ -2404,6 +2406,8 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 	struct sg_page_iter sg_iter;
 	dma_addr_t addr = 0;
 
+	assert_device_not_suspended(dev_priv);
+
 	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
 		addr = sg_page_iter_dma_address(&sg_iter);
 		iowrite32(vm->pte_encode(addr, level, true, flags), &gtt_entries[i]);
@@ -2442,6 +2446,8 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
 	int i;
 
+	assert_device_not_suspended(dev_priv);
+
 	if (WARN(num_entries > max_entries,
 		 "First entry = %d; Num entries = %d (max=%d)\n",
 		 first_entry, num_entries, max_entries))
@@ -2468,6 +2474,8 @@ static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
 	int i;
 
+	assert_device_not_suspended(dev_priv);
+
 	if (WARN(num_entries > max_entries,
 		 "First entry = %d; Num entries = %d (max=%d)\n",
 		 first_entry, num_entries, max_entries))
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] drm/i915: export assert_device_not_suspended
  2015-11-09 18:20 ` [PATCH 1/4] drm/i915: export assert_device_not_suspended Imre Deak
  2015-11-09 18:30   ` Ville Syrjälä
@ 2015-11-09 20:52   ` Chris Wilson
  1 sibling, 0 replies; 25+ messages in thread
From: Chris Wilson @ 2015-11-09 20:52 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Nov 09, 2015 at 08:20:08PM +0200, Imre Deak wrote:
> We should use the same assert in more places, so export it. Move it to
> intel_runtime_pm.c at the same time, since that's a more logical place
> for it.
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_drv.h        | 2 ++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 6 ++++++
>  drivers/gpu/drm/i915/intel_uncore.c     | 7 -------
>  3 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index f9089df..373cc07 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1415,6 +1415,8 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
>  			     enum intel_display_power_domain domain);
>  void intel_display_power_put(struct drm_i915_private *dev_priv,
>  			     enum intel_display_power_domain domain);
> +
> +void assert_device_not_suspended(struct drm_i915_private *dev_priv);
>  void intel_runtime_pm_get(struct drm_i915_private *dev_priv);
>  void intel_runtime_pm_get_noresume(struct drm_i915_private *dev_priv);
>  void intel_runtime_pm_put(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index b42506b..0d4a03b 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -2120,6 +2120,12 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
>  	power_domains->initializing = false;
>  }
>  
> +void assert_device_not_suspended(struct drm_i915_private *dev_priv)
> +{
> +	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,

Can just be HAS_RUNTIME_PM(dev_priv).

Can we fix HAS_RUNTIME_PM not to be an ever-growing list but a nice
little feature flag?

{
  if (!HAS_RUNTIME_PM(dev_priv))
	return;

  WARN_ONCE(dev_priv->pm.suspended, "Device suspended during HW access");
}

The little extra information would be nice in the error message, but not
convinced the extra whitespace is that helpful.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/4] drm/i915: use assert_device_not_suspended instead of opencoding it
  2015-11-09 18:20 ` [PATCH 2/4] drm/i915: use assert_device_not_suspended instead of opencoding it Imre Deak
@ 2015-11-09 21:04   ` Chris Wilson
  0 siblings, 0 replies; 25+ messages in thread
From: Chris Wilson @ 2015-11-09 21:04 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Nov 09, 2015 at 08:20:09PM +0200, Imre Deak wrote:
> Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters
  2015-11-09 19:14   ` [PATCH v2 " Imre Deak
@ 2015-11-09 21:11     ` Chris Wilson
  2015-11-09 21:24       ` Imre Deak
  0 siblings, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2015-11-09 21:11 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Nov 09, 2015 at 09:14:19PM +0200, Imre Deak wrote:
> The device should be on when updating the GGTT PTEs, so add an assert to
> all relevant places.
> 
> v2:
> - use the existing dev_priv directly everywhere (Ville)
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>

For completeness, add one to i915_ggtt_insert_entries(). Yes, I know
that we never will support rpm on any of those devices, but we may as
well be consistent in documenting that these functions will write
through the magic PCI region.

An issue I see here is dev_priv->pm.suspended doesn't actually say that
rpm can't suspend during the GGTT PTEs writes. Would we not want
something more like assert_rpm_wakelock() and !dev_priv->pm.can_suspend?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters
  2015-11-09 21:11     ` Chris Wilson
@ 2015-11-09 21:24       ` Imre Deak
  2015-11-09 21:29         ` Imre Deak
  0 siblings, 1 reply; 25+ messages in thread
From: Imre Deak @ 2015-11-09 21:24 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Mon, 2015-11-09 at 21:11 +0000, Chris Wilson wrote:
> On Mon, Nov 09, 2015 at 09:14:19PM +0200, Imre Deak wrote:
> > The device should be on when updating the GGTT PTEs, so add an assert to
> > all relevant places.
> > 
> > v2:
> > - use the existing dev_priv directly everywhere (Ville)
> > 
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> 
> For completeness, add one to i915_ggtt_insert_entries(). Yes, I know
> that we never will support rpm on any of those devices, but we may as
> well be consistent in documenting that these functions will write
> through the magic PCI region.

Ok, forgot about that one.

> An issue I see here is dev_priv->pm.suspended doesn't actually say that
> rpm can't suspend during the GGTT PTEs writes. Would we not want
> something more like assert_rpm_wakelock() and !dev_priv->pm.can_suspend?

Well, after this patchset we assert that the RPM refcount!=0 too, which
should guarantee that provided that pm.suspended=false as well. It's
possible to have a non-zero refcount with a suspended state (in case of
rpm_get_no_resume).

> -Chris
> 


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters
  2015-11-09 21:24       ` Imre Deak
@ 2015-11-09 21:29         ` Imre Deak
  0 siblings, 0 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-09 21:29 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Mon, 2015-11-09 at 23:24 +0200, Imre Deak wrote:
> On Mon, 2015-11-09 at 21:11 +0000, Chris Wilson wrote:
> > On Mon, Nov 09, 2015 at 09:14:19PM +0200, Imre Deak wrote:
> > > The device should be on when updating the GGTT PTEs, so add an assert to
> > > all relevant places.
> > > 
> > > v2:
> > > - use the existing dev_priv directly everywhere (Ville)
> > > 
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > 
> > For completeness, add one to i915_ggtt_insert_entries(). Yes, I know
> > that we never will support rpm on any of those devices, but we may as
> > well be consistent in documenting that these functions will write
> > through the magic PCI region.
> 
> Ok, forgot about that one.
> 
> > An issue I see here is dev_priv->pm.suspended doesn't actually say that
> > rpm can't suspend during the GGTT PTEs writes. Would we not want
> > something more like assert_rpm_wakelock() and !dev_priv->pm.can_suspend?
> 
> Well, after this patchset we assert that the RPM refcount!=0 too, which
> should guarantee that provided that pm.suspended=false as well. It's
> possible to have a non-zero refcount with a suspended state (in case of
> rpm_get_no_resume).

But you are right that assert_device_not_suspended doesn't reflect that
completely, so I could rename it.

> 
> > -Chris
> > 
> 


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-09 19:13   ` [PATCH v2 " Imre Deak
@ 2015-11-09 21:44     ` Chris Wilson
  2015-11-10  9:47       ` Imre Deak
  2015-11-18 14:37     ` Daniel Vetter
  1 sibling, 1 reply; 25+ messages in thread
From: Chris Wilson @ 2015-11-09 21:44 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> Atm, we assert that the device is not suspended after the point when the
> HW is truly put to a suspended state. This is fine, but we can catch
> more problems if we check the RPM refcount. After that one drops to zero
> we shouldn't access the HW any more, although the actual suspend may be
> delayed. The only complication is that we want to avoid asserts while
> the suspend handler itself is running, so add a flag to handle this
> case.
> 
> While at it remove the HAS_RUNTIME_PM check, the pm.suspended flag is
> false and the RPM refcount is non-zero on all platforms that don't
> support RPM.
> 
> This caught additional WARNs from the atomic path, those will be fixed
> as a follow-up.
> 
> v2:
> - remove the redundant HAS_RUNTIME_PM check (Ville)
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
>  
>  void assert_device_not_suspended(struct drm_i915_private *dev_priv)
>  {
> -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
> -		  "Device suspended\n");
> +	int rpm_usage;
> +
> +	if (dev_priv->pm.disable_suspended_assert)
> +		return;
> +
> +#ifdef CONFIG_PM
> +	rpm_usage = atomic_read(&dev_priv->dev->dev->power.usage_count);
> +#else
> +	rpm_usage = 1;
> +#endif

Whilst this should fix the issue I was worried about, I think for e.g.
the GGTT PTE access, we should be checking that we have a rpm ref (i.e.
we have called intel_runtime_pm_get()). Bonus points if we can narrow
that down to being inside an rpm critical section (made tricky because
the wakelocks can nest :(. The simplest way does impose an extra atomic
inc/dec simply for debug purposes, on the other hand it shouldn't then
need the pm.disable_suspended_assert and you can have an extra assert
that we cannot set pm.suspended whilst intel_runtime_pm_get() is held.

Otherwise, yeah just rename this to imply we aren't just checking that
the device isn't suspended right now, but cannot be.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-09 21:44     ` Chris Wilson
@ 2015-11-10  9:47       ` Imre Deak
  0 siblings, 0 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-10  9:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On ma, 2015-11-09 at 21:44 +0000, Chris Wilson wrote:
> On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> > Atm, we assert that the device is not suspended after the point
> > when the
> > HW is truly put to a suspended state. This is fine, but we can
> > catch
> > more problems if we check the RPM refcount. After that one drops to
> > zero
> > we shouldn't access the HW any more, although the actual suspend
> > may be
> > delayed. The only complication is that we want to avoid asserts
> > while
> > the suspend handler itself is running, so add a flag to handle this
> > case.
> > 
> > While at it remove the HAS_RUNTIME_PM check, the pm.suspended flag
> > is
> > false and the RPM refcount is non-zero on all platforms that don't
> > support RPM.
> > 
> > This caught additional WARNs from the atomic path, those will be
> > fixed
> > as a follow-up.
> > 
> > v2:
> > - remove the redundant HAS_RUNTIME_PM check (Ville)
> > 
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct
> > drm_i915_private *dev_priv, bool resume)
> >  
> >  void assert_device_not_suspended(struct drm_i915_private
> > *dev_priv)
> >  {
> > -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv
> > ->pm.suspended,
> > -		  "Device suspended\n");
> > +	int rpm_usage;
> > +
> > +	if (dev_priv->pm.disable_suspended_assert)
> > +		return;
> > +
> > +#ifdef CONFIG_PM
> > +	rpm_usage = atomic_read(&dev_priv->dev->dev
> > ->power.usage_count);
> > +#else
> > +	rpm_usage = 1;
> > +#endif
> 
> Whilst this should fix the issue I was worried about, I think for
> e.g.
> the GGTT PTE access, we should be checking that we have a rpm ref
> (i.e.
> we have called intel_runtime_pm_get()).

Right, didn't think of that. Also we don't have to then to access RPM
internals.

> Bonus points if we can narrow
> that down to being inside an rpm critical section (made tricky
> because
> the wakelocks can nest :(. The simplest way does impose an extra
> atomic
> inc/dec simply for debug purposes, on the other hand it shouldn't
> then
> need the pm.disable_suspended_assert and you can have an extra assert
> that we cannot set pm.suspended whilst intel_runtime_pm_get() is
> held.

Ok, this was interesting and made me look at the current users of RPM.
We have two distinct use cases with different semantics: one in which
we take an RPM ref for a prolonged time and not necessarily release it
in the same context (thread). For example while the GT or any display
outputs are active. The other scenario is for short tasks like
programming the GGTT PTE, or anything else where we wrap something
between a get and a put in the same context. In this case the ref will
be released from the same context as where it was taken. These two use
cases are worth making explicit in the API imo, both for documentation
and for additional sanity checking: in the prolonged case we can only
check for the refcount not to get unbalanced globally (since the get
and put can happen in different contexts). In the case we take the ref
for a short time we can additionally check for the critical section
guarantee, plus that there is no inbalance from each context's POV. All
of these additional checks can be provided by lockdep.


> Otherwise, yeah just rename this to imply we aren't just checking
that
> the device isn't suspended right now, but cannot be.

Ok, I put all the fixes for the review comments so far and the lockdep
thing on top to:
https://github.com/ideak/linux/commits/rpm-assert-improvements

I can post these if there are no objections.

--Imre


> -Chris
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-09 19:13   ` [PATCH v2 " Imre Deak
  2015-11-09 21:44     ` Chris Wilson
@ 2015-11-18 14:37     ` Daniel Vetter
  2015-11-18 14:44       ` Imre Deak
  1 sibling, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2015-11-18 14:37 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> Atm, we assert that the device is not suspended after the point when the
> HW is truly put to a suspended state. This is fine, but we can catch
> more problems if we check the RPM refcount. After that one drops to zero
> we shouldn't access the HW any more, although the actual suspend may be
> delayed. The only complication is that we want to avoid asserts while
> the suspend handler itself is running, so add a flag to handle this
> case.

Why do we want to avoid asserts firing while we go through the suspend
handler? Calling assert_device_not_suspended from within rpm
suspend/resume code sounds like a bug. Where/why does this happen?
-Daniel

> 
> While at it remove the HAS_RUNTIME_PM check, the pm.suspended flag is
> false and the RPM refcount is non-zero on all platforms that don't
> support RPM.
> 
> This caught additional WARNs from the atomic path, those will be fixed
> as a follow-up.
> 
> v2:
> - remove the redundant HAS_RUNTIME_PM check (Ville)
> 
> Signed-off-by: Imre Deak <imre.deak@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c         |  5 +++++
>  drivers/gpu/drm/i915/i915_drv.h         |  5 +++++
>  drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++--
>  3 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 77d183d..caeb218 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1494,6 +1494,9 @@ static int intel_runtime_suspend(struct device *device)
>  
>  		return -EAGAIN;
>  	}
> +
> +	dev_priv->pm.disable_suspended_assert = true;
> +
>  	/*
>  	 * We are safe here against re-faults, since the fault handler takes
>  	 * an RPM reference.
> @@ -1518,6 +1521,8 @@ static int intel_runtime_suspend(struct device *device)
>  	intel_uncore_forcewake_reset(dev, false);
>  	dev_priv->pm.suspended = true;
>  
> +	dev_priv->pm.disable_suspended_assert = false;
> +
>  	/*
>  	 * FIXME: We really should find a document that references the arguments
>  	 * used below!
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5628c5a..43fd341 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1599,6 +1599,11 @@ struct skl_wm_level {
>   * For more, read the Documentation/power/runtime_pm.txt.
>   */
>  struct i915_runtime_pm {
> +	/*
> +	 * Used for the duration of runtime suspend to avoid false device
> +	 * suspended asserts.
> +	 */
> +	bool disable_suspended_assert;
>  	bool suspended;
>  	bool irqs_enabled;
>  };
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 4d39b3c..2bdbcd4 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct drm_i915_private *dev_priv, bool resume)
>  
>  void assert_device_not_suspended(struct drm_i915_private *dev_priv)
>  {
> -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv->pm.suspended,
> -		  "Device suspended\n");
> +	int rpm_usage;
> +
> +	if (dev_priv->pm.disable_suspended_assert)
> +		return;
> +
> +#ifdef CONFIG_PM
> +	rpm_usage = atomic_read(&dev_priv->dev->dev->power.usage_count);
> +#else
> +	rpm_usage = 1;
> +#endif
> +
> +	WARN_ONCE(dev_priv->pm.suspended || !rpm_usage, "Device suspended\n");
>  }
>  
>  /**
> -- 
> 2.5.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-18 14:37     ` Daniel Vetter
@ 2015-11-18 14:44       ` Imre Deak
  2015-11-18 14:58         ` Imre Deak
  0 siblings, 1 reply; 25+ messages in thread
From: Imre Deak @ 2015-11-18 14:44 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On ke, 2015-11-18 at 15:37 +0100, Daniel Vetter wrote:
> On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> > Atm, we assert that the device is not suspended after the point
> > when the
> > HW is truly put to a suspended state. This is fine, but we can
> > catch
> > more problems if we check the RPM refcount. After that one drops to
> > zero
> > we shouldn't access the HW any more, although the actual suspend
> > may be
> > delayed. The only complication is that we want to avoid asserts
> > while
> > the suspend handler itself is running, so add a flag to handle this
> > case.
> 
> Why do we want to avoid asserts firing while we go through the
> suspend
> handler? Calling assert_device_not_suspended from within rpm
> suspend/resume code sounds like a bug. Where/why does this happen?

Yea, disable_rpm_asserts() is misnamed. Should be
disable_rpm_wakelock_asserts(). Will change that in the next iteration.

> -Daniel
> 
> > 
> > While at it remove the HAS_RUNTIME_PM check, the pm.suspended flag
> > is
> > false and the RPM refcount is non-zero on all platforms that don't
> > support RPM.
> > 
> > This caught additional WARNs from the atomic path, those will be
> > fixed
> > as a follow-up.
> > 
> > v2:
> > - remove the redundant HAS_RUNTIME_PM check (Ville)
> > 
> > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c         |  5 +++++
> >  drivers/gpu/drm/i915/i915_drv.h         |  5 +++++
> >  drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++--
> >  3 files changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 77d183d..caeb218 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1494,6 +1494,9 @@ static int intel_runtime_suspend(struct
> > device *device)
> >  
> >  		return -EAGAIN;
> >  	}
> > +
> > +	dev_priv->pm.disable_suspended_assert = true;
> > +
> >  	/*
> >  	 * We are safe here against re-faults, since the fault
> > handler takes
> >  	 * an RPM reference.
> > @@ -1518,6 +1521,8 @@ static int intel_runtime_suspend(struct
> > device *device)
> >  	intel_uncore_forcewake_reset(dev, false);
> >  	dev_priv->pm.suspended = true;
> >  
> > +	dev_priv->pm.disable_suspended_assert = false;
> > +
> >  	/*
> >  	 * FIXME: We really should find a document that references
> > the arguments
> >  	 * used below!
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 5628c5a..43fd341 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1599,6 +1599,11 @@ struct skl_wm_level {
> >   * For more, read the Documentation/power/runtime_pm.txt.
> >   */
> >  struct i915_runtime_pm {
> > +	/*
> > +	 * Used for the duration of runtime suspend to avoid false
> > device
> > +	 * suspended asserts.
> > +	 */
> > +	bool disable_suspended_assert;
> >  	bool suspended;
> >  	bool irqs_enabled;
> >  };
> > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > index 4d39b3c..2bdbcd4 100644
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct
> > drm_i915_private *dev_priv, bool resume)
> >  
> >  void assert_device_not_suspended(struct drm_i915_private
> > *dev_priv)
> >  {
> > -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv-
> > >pm.suspended,
> > -		  "Device suspended\n");
> > +	int rpm_usage;
> > +
> > +	if (dev_priv->pm.disable_suspended_assert)
> > +		return;
> > +
> > +#ifdef CONFIG_PM
> > +	rpm_usage = atomic_read(&dev_priv->dev->dev-
> > >power.usage_count);
> > +#else
> > +	rpm_usage = 1;
> > +#endif
> > +
> > +	WARN_ONCE(dev_priv->pm.suspended || !rpm_usage, "Device
> > suspended\n");
> >  }
> >  
> >  /**
> > -- 
> > 2.5.0
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-18 14:44       ` Imre Deak
@ 2015-11-18 14:58         ` Imre Deak
  2015-11-18 15:01           ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Imre Deak @ 2015-11-18 14:58 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On ke, 2015-11-18 at 16:44 +0200, Imre Deak wrote:
> On ke, 2015-11-18 at 15:37 +0100, Daniel Vetter wrote:
> > On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> > > Atm, we assert that the device is not suspended after the point
> > > when the
> > > HW is truly put to a suspended state. This is fine, but we can
> > > catch
> > > more problems if we check the RPM refcount. After that one drops
> > > to
> > > zero
> > > we shouldn't access the HW any more, although the actual suspend
> > > may be
> > > delayed. The only complication is that we want to avoid asserts
> > > while
> > > the suspend handler itself is running, so add a flag to handle
> > > this
> > > case.
> > 
> > Why do we want to avoid asserts firing while we go through the
> > suspend
> > handler? Calling assert_device_not_suspended from within rpm
> > suspend/resume code sounds like a bug. Where/why does this happen?
> 
> Yea, disable_rpm_asserts() is misnamed. Should be
> disable_rpm_wakelock_asserts(). Will change that in the next
> iteration.

Ok, misunderstood your question. assert_device_not_suspended() is
called during runtime suspend since we're accessing the HW until the
point we set dev_priv->pm.suspended = true. Atm this wouldn't trigger a
WARN, since assert_device_not_suspended() only checks pm.suspended and
that will check out fine, but once we start to check HW accesses
against the actual RPM refcount we want to disable the asserts on those
in the handlers, since there the refcount is zero. Hence disabling it
explicitly around the handlers, but we would still keep checking
pm.suspended.

Hope this explains,
Imre


> 
> > -Daniel
> > 
> > > 
> > > While at it remove the HAS_RUNTIME_PM check, the pm.suspended
> > > flag
> > > is
> > > false and the RPM refcount is non-zero on all platforms that
> > > don't
> > > support RPM.
> > > 
> > > This caught additional WARNs from the atomic path, those will be
> > > fixed
> > > as a follow-up.
> > > 
> > > v2:
> > > - remove the redundant HAS_RUNTIME_PM check (Ville)
> > > 
> > > Signed-off-by: Imre Deak <imre.deak@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c         |  5 +++++
> > >  drivers/gpu/drm/i915/i915_drv.h         |  5 +++++
> > >  drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++--
> > >  3 files changed, 22 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > b/drivers/gpu/drm/i915/i915_drv.c
> > > index 77d183d..caeb218 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -1494,6 +1494,9 @@ static int intel_runtime_suspend(struct
> > > device *device)
> > >  
> > >  		return -EAGAIN;
> > >  	}
> > > +
> > > +	dev_priv->pm.disable_suspended_assert = true;
> > > +
> > >  	/*
> > >  	 * We are safe here against re-faults, since the fault
> > > handler takes
> > >  	 * an RPM reference.
> > > @@ -1518,6 +1521,8 @@ static int intel_runtime_suspend(struct
> > > device *device)
> > >  	intel_uncore_forcewake_reset(dev, false);
> > >  	dev_priv->pm.suspended = true;
> > >  
> > > +	dev_priv->pm.disable_suspended_assert = false;
> > > +
> > >  	/*
> > >  	 * FIXME: We really should find a document that
> > > references
> > > the arguments
> > >  	 * used below!
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 5628c5a..43fd341 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -1599,6 +1599,11 @@ struct skl_wm_level {
> > >   * For more, read the Documentation/power/runtime_pm.txt.
> > >   */
> > >  struct i915_runtime_pm {
> > > +	/*
> > > +	 * Used for the duration of runtime suspend to avoid
> > > false
> > > device
> > > +	 * suspended asserts.
> > > +	 */
> > > +	bool disable_suspended_assert;
> > >  	bool suspended;
> > >  	bool irqs_enabled;
> > >  };
> > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > index 4d39b3c..2bdbcd4 100644
> > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > @@ -2120,8 +2120,18 @@ void intel_power_domains_init_hw(struct
> > > drm_i915_private *dev_priv, bool resume)
> > >  
> > >  void assert_device_not_suspended(struct drm_i915_private
> > > *dev_priv)
> > >  {
> > > -	WARN_ONCE(HAS_RUNTIME_PM(dev_priv->dev) && dev_priv-
> > > > pm.suspended,
> > > -		  "Device suspended\n");
> > > +	int rpm_usage;
> > > +
> > > +	if (dev_priv->pm.disable_suspended_assert)
> > > +		return;
> > > +
> > > +#ifdef CONFIG_PM
> > > +	rpm_usage = atomic_read(&dev_priv->dev->dev-
> > > > power.usage_count);
> > > +#else
> > > +	rpm_usage = 1;
> > > +#endif
> > > +
> > > +	WARN_ONCE(dev_priv->pm.suspended || !rpm_usage, "Device
> > > suspended\n");
> > >  }
> > >  
> > >  /**
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-18 14:58         ` Imre Deak
@ 2015-11-18 15:01           ` Daniel Vetter
  2015-11-18 15:11             ` Imre Deak
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2015-11-18 15:01 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Wed, Nov 18, 2015 at 04:58:46PM +0200, Imre Deak wrote:
> On ke, 2015-11-18 at 16:44 +0200, Imre Deak wrote:
> > On ke, 2015-11-18 at 15:37 +0100, Daniel Vetter wrote:
> > > On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> > > > Atm, we assert that the device is not suspended after the point
> > > > when the
> > > > HW is truly put to a suspended state. This is fine, but we can
> > > > catch
> > > > more problems if we check the RPM refcount. After that one drops
> > > > to
> > > > zero
> > > > we shouldn't access the HW any more, although the actual suspend
> > > > may be
> > > > delayed. The only complication is that we want to avoid asserts
> > > > while
> > > > the suspend handler itself is running, so add a flag to handle
> > > > this
> > > > case.
> > > 
> > > Why do we want to avoid asserts firing while we go through the
> > > suspend
> > > handler? Calling assert_device_not_suspended from within rpm
> > > suspend/resume code sounds like a bug. Where/why does this happen?
> > 
> > Yea, disable_rpm_asserts() is misnamed. Should be
> > disable_rpm_wakelock_asserts(). Will change that in the next
> > iteration.
> 
> Ok, misunderstood your question. assert_device_not_suspended() is
> called during runtime suspend since we're accessing the HW until the
> point we set dev_priv->pm.suspended = true. Atm this wouldn't trigger a
> WARN, since assert_device_not_suspended() only checks pm.suspended and
> that will check out fine, but once we start to check HW accesses
> against the actual RPM refcount we want to disable the asserts on those
> in the handlers, since there the refcount is zero. Hence disabling it
> explicitly around the handlers, but we would still keep checking
> pm.suspended.

That seems like we're mixing up 2 asserts:
- assert_device_not_suspended: To be used in runtime_suspend code.
- assert_holding_rpm_wakelock (or whatever, I'm bad at names): check the
  count.

What am I missing?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-18 15:01           ` Daniel Vetter
@ 2015-11-18 15:11             ` Imre Deak
  2015-11-18 15:47               ` Daniel Vetter
  0 siblings, 1 reply; 25+ messages in thread
From: Imre Deak @ 2015-11-18 15:11 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On ke, 2015-11-18 at 16:01 +0100, Daniel Vetter wrote:
> On Wed, Nov 18, 2015 at 04:58:46PM +0200, Imre Deak wrote:
> > On ke, 2015-11-18 at 16:44 +0200, Imre Deak wrote:
> > > On ke, 2015-11-18 at 15:37 +0100, Daniel Vetter wrote:
> > > > On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> > > > > Atm, we assert that the device is not suspended after the point
> > > > > when the
> > > > > HW is truly put to a suspended state. This is fine, but we can
> > > > > catch
> > > > > more problems if we check the RPM refcount. After that one drops
> > > > > to
> > > > > zero
> > > > > we shouldn't access the HW any more, although the actual suspend
> > > > > may be
> > > > > delayed. The only complication is that we want to avoid asserts
> > > > > while
> > > > > the suspend handler itself is running, so add a flag to handle
> > > > > this
> > > > > case.
> > > > 
> > > > Why do we want to avoid asserts firing while we go through the
> > > > suspend
> > > > handler? Calling assert_device_not_suspended from within rpm
> > > > suspend/resume code sounds like a bug. Where/why does this happen?
> > > 
> > > Yea, disable_rpm_asserts() is misnamed. Should be
> > > disable_rpm_wakelock_asserts(). Will change that in the next
> > > iteration.
> > 
> > Ok, misunderstood your question. assert_device_not_suspended() is
> > called during runtime suspend since we're accessing the HW until the
> > point we set dev_priv->pm.suspended = true. Atm this wouldn't trigger a
> > WARN, since assert_device_not_suspended() only checks pm.suspended and
> > that will check out fine, but once we start to check HW accesses
> > against the actual RPM refcount we want to disable the asserts on those
> > in the handlers, since there the refcount is zero. Hence disabling it
> > explicitly around the handlers, but we would still keep checking
> > pm.suspended.
> 
> That seems like we're mixing up 2 asserts:
> - assert_device_not_suspended: To be used in runtime_suspend code.
> - assert_holding_rpm_wakelock (or whatever, I'm bad at names): check the
>   count.

We call this assert (atm assert_device_not_suspended()) from low level
register access helpers, so we can't distinguish between calling one or
the other assert depending on whether we are on the rpm suspend path or
not. What this patch does is to switch all the places where call
assert_device_not_suspended() to assert_rpm_wakelock_held(), since that
one provides a bigger coverage. Since this change will also affect the
low level reg access functions which are called during rpm suspend, we
need to disable part of the assert that checks for the refcount which
is known to be zero there.

Otherwise assert_rpm_wakelock_held() also includes
assert_device_not_suspended(), since that should be true in all other
cases.

> 
> What am I missing?
> -Daniel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-18 15:11             ` Imre Deak
@ 2015-11-18 15:47               ` Daniel Vetter
  2015-11-18 16:09                 ` Imre Deak
  0 siblings, 1 reply; 25+ messages in thread
From: Daniel Vetter @ 2015-11-18 15:47 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Wed, Nov 18, 2015 at 05:11:03PM +0200, Imre Deak wrote:
> On ke, 2015-11-18 at 16:01 +0100, Daniel Vetter wrote:
> > On Wed, Nov 18, 2015 at 04:58:46PM +0200, Imre Deak wrote:
> > > On ke, 2015-11-18 at 16:44 +0200, Imre Deak wrote:
> > > > On ke, 2015-11-18 at 15:37 +0100, Daniel Vetter wrote:
> > > > > On Mon, Nov 09, 2015 at 09:13:45PM +0200, Imre Deak wrote:
> > > > > > Atm, we assert that the device is not suspended after the point
> > > > > > when the
> > > > > > HW is truly put to a suspended state. This is fine, but we can
> > > > > > catch
> > > > > > more problems if we check the RPM refcount. After that one drops
> > > > > > to
> > > > > > zero
> > > > > > we shouldn't access the HW any more, although the actual suspend
> > > > > > may be
> > > > > > delayed. The only complication is that we want to avoid asserts
> > > > > > while
> > > > > > the suspend handler itself is running, so add a flag to handle
> > > > > > this
> > > > > > case.
> > > > > 
> > > > > Why do we want to avoid asserts firing while we go through the
> > > > > suspend
> > > > > handler? Calling assert_device_not_suspended from within rpm
> > > > > suspend/resume code sounds like a bug. Where/why does this happen?
> > > > 
> > > > Yea, disable_rpm_asserts() is misnamed. Should be
> > > > disable_rpm_wakelock_asserts(). Will change that in the next
> > > > iteration.
> > > 
> > > Ok, misunderstood your question. assert_device_not_suspended() is
> > > called during runtime suspend since we're accessing the HW until the
> > > point we set dev_priv->pm.suspended = true. Atm this wouldn't trigger a
> > > WARN, since assert_device_not_suspended() only checks pm.suspended and
> > > that will check out fine, but once we start to check HW accesses
> > > against the actual RPM refcount we want to disable the asserts on those
> > > in the handlers, since there the refcount is zero. Hence disabling it
> > > explicitly around the handlers, but we would still keep checking
> > > pm.suspended.
> > 
> > That seems like we're mixing up 2 asserts:
> > - assert_device_not_suspended: To be used in runtime_suspend code.
> > - assert_holding_rpm_wakelock (or whatever, I'm bad at names): check the
> >   count.
> 
> We call this assert (atm assert_device_not_suspended()) from low level
> register access helpers, so we can't distinguish between calling one or
> the other assert depending on whether we are on the rpm suspend path or
> not. What this patch does is to switch all the places where call
> assert_device_not_suspended() to assert_rpm_wakelock_held(), since that
> one provides a bigger coverage. Since this change will also affect the
> low level reg access functions which are called during rpm suspend, we
> need to disable part of the assert that checks for the refcount which
> is known to be zero there.
> 
> Otherwise assert_rpm_wakelock_held() also includes
> assert_device_not_suspended(), since that should be true in all other
> cases.

Ok, that makes sense. Should be in the commit message ;-)

Instead of cooking our own, what about checking
pci_dev->base.power.runtim_status == PM_SUSPENDING plus a comment?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 3/4] drm/i915: make assert_device_not_suspended more precise
  2015-11-18 15:47               ` Daniel Vetter
@ 2015-11-18 16:09                 ` Imre Deak
  0 siblings, 0 replies; 25+ messages in thread
From: Imre Deak @ 2015-11-18 16:09 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On ke, 2015-11-18 at 16:47 +0100, Daniel Vetter wrote:
> On Wed, Nov 18, 2015 at 05:11:03PM +0200, Imre Deak wrote:
> > On ke, 2015-11-18 at 16:01 +0100, Daniel Vetter wrote:
> > > On Wed, Nov 18, 2015 at 04:58:46PM +0200, Imre Deak wrote:
> > 
> > Otherwise assert_rpm_wakelock_held() also includes
> > assert_device_not_suspended(), since that should be true in all other
> > cases.
> 
> Ok, that makes sense. Should be in the commit message ;-)

Yea, I pieced together the changes on the way based on the discussion
and ideas from Chris, so it's not all in the log:) Will update it.

> Instead of cooking our own, what about checking
> pci_dev->base.power.runtim_status == PM_SUSPENDING plus a comment?

Yea, I was thinking also about that. If you mean to use it instead of
the new wakelock ref: the problem with that is that PM_SUSPENDING gets
set only right before the handler is called (with an optional delay +
work item). But we want to check things already before, right after the
last ref is dropped from the driver's POV. This has btw the benefit
that we have coverage even if RPM is completely disabled.

If you meant to use RPM_SUSPENDED instead of pm.suspended, perhaps we
could do that. The only drawback that I can see now is accessing RPM
internals, not sure how big of an issue that is.

--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-11-18 16:10 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-09 18:20 [PATCH 0/4] drm/i915: improve the RPM device suspended assert Imre Deak
2015-11-09 18:20 ` [PATCH 1/4] drm/i915: export assert_device_not_suspended Imre Deak
2015-11-09 18:30   ` Ville Syrjälä
2015-11-09 18:43     ` Imre Deak
2015-11-09 20:52   ` Chris Wilson
2015-11-09 18:20 ` [PATCH 2/4] drm/i915: use assert_device_not_suspended instead of opencoding it Imre Deak
2015-11-09 21:04   ` Chris Wilson
2015-11-09 18:20 ` [PATCH 3/4] drm/i915: make assert_device_not_suspended more precise Imre Deak
2015-11-09 19:13   ` [PATCH v2 " Imre Deak
2015-11-09 21:44     ` Chris Wilson
2015-11-10  9:47       ` Imre Deak
2015-11-18 14:37     ` Daniel Vetter
2015-11-18 14:44       ` Imre Deak
2015-11-18 14:58         ` Imre Deak
2015-11-18 15:01           ` Daniel Vetter
2015-11-18 15:11             ` Imre Deak
2015-11-18 15:47               ` Daniel Vetter
2015-11-18 16:09                 ` Imre Deak
2015-11-09 18:20 ` [PATCH 4/4] drm/i915: add assert_device_not_suspended to GGTT PTE updaters Imre Deak
2015-11-09 18:37   ` Ville Syrjälä
2015-11-09 18:48     ` Imre Deak
2015-11-09 19:14   ` [PATCH v2 " Imre Deak
2015-11-09 21:11     ` Chris Wilson
2015-11-09 21:24       ` Imre Deak
2015-11-09 21:29         ` Imre Deak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.