* [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
@ 2017-11-29 10:59 Tvrtko Ursulin
2017-11-29 11:10 ` Chris Wilson
` (7 more replies)
0 siblings, 8 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-29 10:59 UTC (permalink / raw)
To: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
It seems that the DMC likes to transition between the DC states a lot when
there are no connected displays (no active power domains) during command
submission.
This activity on DC states has a negative impact on the performance of the
chip with huge latencies observed in the interrupt handlers and elsewhere.
Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
eight.
Work around it by introducing a new power domain named,
POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
held for the duration of command submission activity.
v2:
* Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
* Protect macro body with braces. (Jani Nikula)
v3:
* Add dedicated power domain for clarity. (Chris, Imre)
* Commit message and comment text updates.
* Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
firmware release.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
Testcase: igt/gem_exec_nop/headless
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 5 +++++
drivers/gpu/drm/i915/i915_gem.c | 4 ++++
drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
drivers/gpu/drm/i915/intel_runtime_pm.c | 4 ++++
4 files changed, 27 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bddd65839f60..17340aadc566 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -398,6 +398,7 @@ enum intel_display_power_domain {
POWER_DOMAIN_AUX_D,
POWER_DOMAIN_GMBUS,
POWER_DOMAIN_MODESET,
+ POWER_DOMAIN_GT_IRQ,
POWER_DOMAIN_INIT,
POWER_DOMAIN_NUM,
@@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
#define GT_FREQUENCY_MULTIPLIER 50
#define GEN9_FREQ_SCALER 3
+#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
+ (HAS_CSR(dev_priv) && (dev_priv)->csr.dmc_payload && \
+ IS_GEN9_BC(dev_priv) && !IS_SKYLAKE(dev_priv))
+
#include "i915_trace.h"
static inline bool intel_vtd_active(void)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 354b0546a191..feec2a621120 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3381,7 +3381,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
if (INTEL_GEN(dev_priv) >= 6)
gen6_rps_idle(dev_priv);
+
intel_runtime_pm_put(dev_priv);
+
+ if (NEEDS_CSR_GT_PERF_WA(dev_priv))
+ intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
out_unlock:
mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index a90bdd26571f..619377a05810 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -251,6 +251,20 @@ static void mark_busy(struct drm_i915_private *i915)
GEM_BUG_ON(!i915->gt.active_requests);
+ /*
+ * It seems that the DMC likes to transition between the DC states a lot
+ * when there are no connected displays (no active power domains) during
+ * command submission.
+ *
+ * This activity has negative impact on the performance of the chip with
+ * huge latencies observed in the interrupt handler and elsewhere.
+ *
+ * Work around it by grabbing a GT IRQ power domain whilst there is any
+ * GT activity, preventing any DC state transitions.
+ */
+ if (NEEDS_CSR_GT_PERF_WA(i915))
+ intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+
intel_runtime_pm_get_noresume(i915);
i915->gt.awake = true;
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 8315499452dc..f491c7aaa096 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
return "INIT";
case POWER_DOMAIN_MODESET:
return "MODESET";
+ case POWER_DOMAIN_GT_IRQ:
+ return "GT_IRQ";
default:
MISSING_CASE(domain);
return "?";
@@ -1705,6 +1707,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
@@ -1785,6 +1788,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
@ 2017-11-29 11:10 ` Chris Wilson
2017-11-29 12:29 ` Imre Deak
2017-11-29 11:12 ` Daniel Vetter
` (6 subsequent siblings)
7 siblings, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2017-11-29 11:10 UTC (permalink / raw)
To: Tvrtko Ursulin, Intel-gfx
Quoting Tvrtko Ursulin (2017-11-29 10:59:27)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states a lot when
> there are no connected displays (no active power domains) during command
> submission.
>
> This activity on DC states has a negative impact on the performance of the
> chip with huge latencies observed in the interrupt handlers and elsewhere.
> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> eight.
>
> Work around it by introducing a new power domain named,
> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> held for the duration of command submission activity.
>
> v2:
> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> * Protect macro body with braces. (Jani Nikula)
>
> v3:
> * Add dedicated power domain for clarity. (Chris, Imre)
> * Commit message and comment text updates.
> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> firmware release.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> drivers/gpu/drm/i915/intel_runtime_pm.c | 4 ++++
> 4 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bddd65839f60..17340aadc566 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> POWER_DOMAIN_AUX_D,
> POWER_DOMAIN_GMBUS,
> POWER_DOMAIN_MODESET,
> + POWER_DOMAIN_GT_IRQ,
> POWER_DOMAIN_INIT,
>
> POWER_DOMAIN_NUM,
> @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
> #define GT_FREQUENCY_MULTIPLIER 50
> #define GEN9_FREQ_SCALER 3
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + (HAS_CSR(dev_priv) && (dev_priv)->csr.dmc_payload && \
We can't have a dmc_payload unless CSR, right?
> + IS_GEN9_BC(dev_priv) && !IS_SKYLAKE(dev_priv))
> +
> #include "i915_trace.h"
>
> static inline bool intel_vtd_active(void)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 354b0546a191..feec2a621120 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3381,7 +3381,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
>
> if (INTEL_GEN(dev_priv) >= 6)
> gen6_rps_idle(dev_priv);
> +
> intel_runtime_pm_put(dev_priv);
> +
> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
Which is the outer wakeref? I expect runtime, being device-global, to be
the outer wakeref, with display_power_put being the inner subset.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
2017-11-29 11:10 ` Chris Wilson
@ 2017-11-29 11:12 ` Daniel Vetter
2017-11-29 11:34 ` Tvrtko Ursulin
2017-11-29 11:22 ` ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev3) Patchwork
` (5 subsequent siblings)
7 siblings, 1 reply; 32+ messages in thread
From: Daniel Vetter @ 2017-11-29 11:12 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Wed, Nov 29, 2017 at 10:59:27AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states a lot when
> there are no connected displays (no active power domains) during command
> submission.
>
> This activity on DC states has a negative impact on the performance of the
> chip with huge latencies observed in the interrupt handlers and elsewhere.
> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> eight.
>
> Work around it by introducing a new power domain named,
> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> held for the duration of command submission activity.
>
> v2:
> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> * Protect macro body with braces. (Jani Nikula)
>
> v3:
> * Add dedicated power domain for clarity. (Chris, Imre)
> * Commit message and comment text updates.
> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> firmware release.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> drivers/gpu/drm/i915/intel_runtime_pm.c | 4 ++++
> 4 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bddd65839f60..17340aadc566 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> POWER_DOMAIN_AUX_D,
> POWER_DOMAIN_GMBUS,
> POWER_DOMAIN_MODESET,
> + POWER_DOMAIN_GT_IRQ,
> POWER_DOMAIN_INIT,
>
> POWER_DOMAIN_NUM,
> @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
> #define GT_FREQUENCY_MULTIPLIER 50
> #define GEN9_FREQ_SCALER 3
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + (HAS_CSR(dev_priv) && (dev_priv)->csr.dmc_payload && \
Doesn't this check race with the async dmc loading? I.e. when you submit
something right at boot-up, before DMC is fully loaded, we might end up
with an unbalanced pm refcount.
I think given that DMC is strongly recommended there shouldn't be a real
cost with making this unconditional.
With that changed:
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> + IS_GEN9_BC(dev_priv) && !IS_SKYLAKE(dev_priv))
> +
> #include "i915_trace.h"
>
> static inline bool intel_vtd_active(void)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 354b0546a191..feec2a621120 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3381,7 +3381,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
>
> if (INTEL_GEN(dev_priv) >= 6)
> gen6_rps_idle(dev_priv);
> +
> intel_runtime_pm_put(dev_priv);
> +
> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> out_unlock:
> mutex_unlock(&dev_priv->drm.struct_mutex);
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index a90bdd26571f..619377a05810 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -251,6 +251,20 @@ static void mark_busy(struct drm_i915_private *i915)
>
> GEM_BUG_ON(!i915->gt.active_requests);
>
> + /*
> + * It seems that the DMC likes to transition between the DC states a lot
> + * when there are no connected displays (no active power domains) during
> + * command submission.
> + *
> + * This activity has negative impact on the performance of the chip with
> + * huge latencies observed in the interrupt handler and elsewhere.
> + *
> + * Work around it by grabbing a GT IRQ power domain whilst there is any
> + * GT activity, preventing any DC state transitions.
> + */
> + if (NEEDS_CSR_GT_PERF_WA(i915))
> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +
> intel_runtime_pm_get_noresume(i915);
> i915->gt.awake = true;
>
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 8315499452dc..f491c7aaa096 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> return "INIT";
> case POWER_DOMAIN_MODESET:
> return "MODESET";
> + case POWER_DOMAIN_GT_IRQ:
> + return "GT_IRQ";
> default:
> MISSING_CASE(domain);
> return "?";
> @@ -1705,6 +1707,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> BIT_ULL(POWER_DOMAIN_INIT))
> #define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> BIT_ULL(POWER_DOMAIN_MODESET) | \
> BIT_ULL(POWER_DOMAIN_AUX_A) | \
> BIT_ULL(POWER_DOMAIN_INIT))
> @@ -1785,6 +1788,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> BIT_ULL(POWER_DOMAIN_INIT))
> #define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> BIT_ULL(POWER_DOMAIN_MODESET) | \
> BIT_ULL(POWER_DOMAIN_AUX_A) | \
> BIT_ULL(POWER_DOMAIN_INIT))
> --
> 2.14.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev3)
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
2017-11-29 11:10 ` Chris Wilson
2017-11-29 11:12 ` Daniel Vetter
@ 2017-11-29 11:22 ` Patchwork
2017-11-29 14:18 ` [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Imre Deak
` (4 subsequent siblings)
7 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2017-11-29 11:22 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Restore GT performance in headless mode with DMC loaded (rev3)
URL : https://patchwork.freedesktop.org/series/24017/
State : warning
== Summary ==
Series 24017v3 drm/i915: Restore GT performance in headless mode with DMC loaded
https://patchwork.freedesktop.org/api/1.0/series/24017/revisions/3/mbox/
Test kms_pipe_crc_basic:
Subgroup read-crc-pipe-b:
fail -> PASS (fi-skl-6700k) fdo#103191
Subgroup suspend-read-crc-pipe-a:
pass -> DMESG-WARN (fi-kbl-r)
Subgroup suspend-read-crc-pipe-b:
incomplete -> PASS (fi-snb-2520m) fdo#103713
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:443s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:382s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:511s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:282s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:507s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:507s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:480s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:469s
fi-elk-e7500 total:288 pass:229 dwarn:0 dfail:0 fail:0 skip:59 time:428s
fi-gdg-551 total:288 pass:178 dwarn:1 dfail:0 fail:1 skip:108 time:269s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:541s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:358s
fi-hsw-4770r total:288 pass:224 dwarn:0 dfail:0 fail:0 skip:64 time:255s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:433s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:479s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:486s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:528s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:480s
fi-kbl-r total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:535s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:589s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:451s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:544s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:568s
fi-skl-6700k total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:525s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:510s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:444s
fi-snb-2520m total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:549s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:414s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:602s
fi-cnl-y total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:549s
c544c30501259c15ced047e547e787d546c24a43 drm-tip: 2017y-11m-29d-08h-19m-47s UTC integration manifest
93f678ee3268 drm/i915: Restore GT performance in headless mode with DMC loaded
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7337/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 11:12 ` Daniel Vetter
@ 2017-11-29 11:34 ` Tvrtko Ursulin
2017-11-29 11:40 ` Chris Wilson
2017-11-29 12:47 ` Daniel Vetter
0 siblings, 2 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-29 11:34 UTC (permalink / raw)
To: Daniel Vetter, Tvrtko Ursulin; +Cc: Intel-gfx
On 29/11/2017 11:12, Daniel Vetter wrote:
> On Wed, Nov 29, 2017 at 10:59:27AM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> It seems that the DMC likes to transition between the DC states a lot when
>> there are no connected displays (no active power domains) during command
>> submission.
>>
>> This activity on DC states has a negative impact on the performance of the
>> chip with huge latencies observed in the interrupt handlers and elsewhere.
>> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
>> eight.
>>
>> Work around it by introducing a new power domain named,
>> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
>> held for the duration of command submission activity.
>>
>> v2:
>> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
>> * Protect macro body with braces. (Jani Nikula)
>>
>> v3:
>> * Add dedicated power domain for clarity. (Chris, Imre)
>> * Commit message and comment text updates.
>> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
>> firmware release.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
>> Testcase: igt/gem_exec_nop/headless
>> Cc: Imre Deak <imre.deak@intel.com>
>> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
>> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
>> drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
>> drivers/gpu/drm/i915/intel_runtime_pm.c | 4 ++++
>> 4 files changed, 27 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index bddd65839f60..17340aadc566 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
>> POWER_DOMAIN_AUX_D,
>> POWER_DOMAIN_GMBUS,
>> POWER_DOMAIN_MODESET,
>> + POWER_DOMAIN_GT_IRQ,
>> POWER_DOMAIN_INIT,
>>
>> POWER_DOMAIN_NUM,
>> @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
>> #define GT_FREQUENCY_MULTIPLIER 50
>> #define GEN9_FREQ_SCALER 3
>>
>> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
>> + (HAS_CSR(dev_priv) && (dev_priv)->csr.dmc_payload && \
>
> Doesn't this check race with the async dmc loading? I.e. when you submit
> something right at boot-up, before DMC is fully loaded, we might end up
> with an unbalanced pm refcount.
Oh yeah, well spotted.
> I think given that DMC is strongly recommended there shouldn't be a real
> cost with making this unconditional.
I don't know, not liking it on the first go. But then again I have no
idea how much power would that waste for use cases where DMC fw is
deliberately not present.
Perhaps it would be acceptable to mark GT busy during the async CSR
load. Chris, any thoughts?
Regards,
Tvrtko
> With that changed:
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
>> + IS_GEN9_BC(dev_priv) && !IS_SKYLAKE(dev_priv))
>> +
>> #include "i915_trace.h"
>>
>> static inline bool intel_vtd_active(void)
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 354b0546a191..feec2a621120 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3381,7 +3381,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>
>> if (INTEL_GEN(dev_priv) >= 6)
>> gen6_rps_idle(dev_priv);
>> +
>> intel_runtime_pm_put(dev_priv);
>> +
>> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
>> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
>> out_unlock:
>> mutex_unlock(&dev_priv->drm.struct_mutex);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>> index a90bdd26571f..619377a05810 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>> @@ -251,6 +251,20 @@ static void mark_busy(struct drm_i915_private *i915)
>>
>> GEM_BUG_ON(!i915->gt.active_requests);
>>
>> + /*
>> + * It seems that the DMC likes to transition between the DC states a lot
>> + * when there are no connected displays (no active power domains) during
>> + * command submission.
>> + *
>> + * This activity has negative impact on the performance of the chip with
>> + * huge latencies observed in the interrupt handler and elsewhere.
>> + *
>> + * Work around it by grabbing a GT IRQ power domain whilst there is any
>> + * GT activity, preventing any DC state transitions.
>> + */
>> + if (NEEDS_CSR_GT_PERF_WA(i915))
>> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
>> +
>> intel_runtime_pm_get_noresume(i915);
>> i915->gt.awake = true;
>>
>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> index 8315499452dc..f491c7aaa096 100644
>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
>> return "INIT";
>> case POWER_DOMAIN_MODESET:
>> return "MODESET";
>> + case POWER_DOMAIN_GT_IRQ:
>> + return "GT_IRQ";
>> default:
>> MISSING_CASE(domain);
>> return "?";
>> @@ -1705,6 +1707,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
>> BIT_ULL(POWER_DOMAIN_INIT))
>> #define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
>> SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
>> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
>> BIT_ULL(POWER_DOMAIN_MODESET) | \
>> BIT_ULL(POWER_DOMAIN_AUX_A) | \
>> BIT_ULL(POWER_DOMAIN_INIT))
>> @@ -1785,6 +1788,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
>> BIT_ULL(POWER_DOMAIN_INIT))
>> #define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
>> GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
>> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
>> BIT_ULL(POWER_DOMAIN_MODESET) | \
>> BIT_ULL(POWER_DOMAIN_AUX_A) | \
>> BIT_ULL(POWER_DOMAIN_INIT))
>> --
>> 2.14.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 11:34 ` Tvrtko Ursulin
@ 2017-11-29 11:40 ` Chris Wilson
2017-11-29 11:53 ` Tvrtko Ursulin
2017-11-29 12:47 ` Daniel Vetter
1 sibling, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2017-11-29 11:40 UTC (permalink / raw)
To: Tvrtko Ursulin, Daniel Vetter, Tvrtko Ursulin; +Cc: Intel-gfx
Quoting Tvrtko Ursulin (2017-11-29 11:34:27)
>
> On 29/11/2017 11:12, Daniel Vetter wrote:
> > I think given that DMC is strongly recommended there shouldn't be a real
> > cost with making this unconditional.
>
> I don't know, not liking it on the first go. But then again I have no
> idea how much power would that waste for use cases where DMC fw is
> deliberately not present.
>
> Perhaps it would be acceptable to mark GT busy during the async CSR
> load. Chris, any thoughts?
It's tightly coupled to requests, adding in an external call seems
troublesome.
What's the reason for depending on the CSR being loaded? The old fw is
broke no matter what, it doesn't get any more broken by us holding a
powerwell wakeref. I think we should go for simplicity and always take
the powerwell along with the rpm?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 11:40 ` Chris Wilson
@ 2017-11-29 11:53 ` Tvrtko Ursulin
2017-11-29 11:59 ` Chris Wilson
0 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-29 11:53 UTC (permalink / raw)
To: Chris Wilson, Daniel Vetter, Tvrtko Ursulin; +Cc: Intel-gfx
On 29/11/2017 11:40, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-11-29 11:34:27)
>>
>> On 29/11/2017 11:12, Daniel Vetter wrote:
>>> I think given that DMC is strongly recommended there shouldn't be a real
>>> cost with making this unconditional.
>>
>> I don't know, not liking it on the first go. But then again I have no
>> idea how much power would that waste for use cases where DMC fw is
>> deliberately not present.
>>
>> Perhaps it would be acceptable to mark GT busy during the async CSR
>> load. Chris, any thoughts?
>
> It's tightly coupled to requests, adding in an external call seems
> troublesome.
>
> What's the reason for depending on the CSR being loaded? The old fw is
> broke no matter what, it doesn't get any more broken by us holding a
> powerwell wakeref. I think we should go for simplicity and always take
> the powerwell along with the rpm?
It's the unknown, maybe only for me, on how much power always holding
the power well would waste for use cases where DMC firmware has been
deliberately removed.
If I understand correctly that the Daniel's and your proposal is to just
got with HAS_CSR as the wa/ criteria, instead of fw loaded check.
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 11:53 ` Tvrtko Ursulin
@ 2017-11-29 11:59 ` Chris Wilson
2017-11-29 14:15 ` Imre Deak
0 siblings, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2017-11-29 11:59 UTC (permalink / raw)
To: Tvrtko Ursulin, Daniel Vetter, Tvrtko Ursulin; +Cc: Intel-gfx
Quoting Tvrtko Ursulin (2017-11-29 11:53:41)
>
> On 29/11/2017 11:40, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-11-29 11:34:27)
> >>
> >> On 29/11/2017 11:12, Daniel Vetter wrote:
> >>> I think given that DMC is strongly recommended there shouldn't be a real
> >>> cost with making this unconditional.
> >>
> >> I don't know, not liking it on the first go. But then again I have no
> >> idea how much power would that waste for use cases where DMC fw is
> >> deliberately not present.
> >>
> >> Perhaps it would be acceptable to mark GT busy during the async CSR
> >> load. Chris, any thoughts?
> >
> > It's tightly coupled to requests, adding in an external call seems
> > troublesome.
> >
> > What's the reason for depending on the CSR being loaded? The old fw is
> > broke no matter what, it doesn't get any more broken by us holding a
> > powerwell wakeref. I think we should go for simplicity and always take
> > the powerwell along with the rpm?
>
> It's the unknown, maybe only for me, on how much power always holding
> the power well would waste for use cases where DMC firmware has been
> deliberately removed.
>
> If I understand correctly that the Daniel's and your proposal is to just
> got with HAS_CSR as the wa/ criteria, instead of fw loaded check.
If I am reading the powerwell code correctly, it already takes the dmc
fw into account. I would transfer the problem to there :) i.e. we have
an unconditional call from gt:mark_busy, gt:mark_idle and the powerwell
code knows what needs to be done under the different circumstances.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 11:10 ` Chris Wilson
@ 2017-11-29 12:29 ` Imre Deak
0 siblings, 0 replies; 32+ messages in thread
From: Imre Deak @ 2017-11-29 12:29 UTC (permalink / raw)
To: Chris Wilson; +Cc: Intel-gfx
On Wed, Nov 29, 2017 at 11:10:58AM +0000, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-11-29 10:59:27)
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> > It seems that the DMC likes to transition between the DC states a lot when
> > there are no connected displays (no active power domains) during command
> > submission.
> >
> > This activity on DC states has a negative impact on the performance of the
> > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > eight.
> >
> > Work around it by introducing a new power domain named,
> > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > held for the duration of command submission activity.
> >
> > v2:
> > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > * Protect macro body with braces. (Jani Nikula)
> >
> > v3:
> > * Add dedicated power domain for clarity. (Chris, Imre)
> > * Commit message and comment text updates.
> > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > firmware release.
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > Testcase: igt/gem_exec_nop/headless
> > Cc: Imre Deak <imre.deak@intel.com>
> > Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> > drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> > drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> > drivers/gpu/drm/i915/intel_runtime_pm.c | 4 ++++
> > 4 files changed, 27 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index bddd65839f60..17340aadc566 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > POWER_DOMAIN_AUX_D,
> > POWER_DOMAIN_GMBUS,
> > POWER_DOMAIN_MODESET,
> > + POWER_DOMAIN_GT_IRQ,
> > POWER_DOMAIN_INIT,
> >
> > POWER_DOMAIN_NUM,
> > @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
> > #define GT_FREQUENCY_MULTIPLIER 50
> > #define GEN9_FREQ_SCALER 3
> >
> > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > + (HAS_CSR(dev_priv) && (dev_priv)->csr.dmc_payload && \
>
> We can't have a dmc_payload unless CSR, right?
>
> > + IS_GEN9_BC(dev_priv) && !IS_SKYLAKE(dev_priv))
> > +
> > #include "i915_trace.h"
> >
> > static inline bool intel_vtd_active(void)
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 354b0546a191..feec2a621120 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3381,7 +3381,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >
> > if (INTEL_GEN(dev_priv) >= 6)
> > gen6_rps_idle(dev_priv);
> > +
> > intel_runtime_pm_put(dev_priv);
> > +
> > + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> > + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
>
> Which is the outer wakeref? I expect runtime, being device-global, to be
> the outer wakeref, with display_power_put being the inner subset.
Yes, logically the runtime ref is the outer one. But
intel_display_power_get() takes a runtime ref too and both runtime and
display power references can be held for "a long time" and so can be
taken in both order.
> -Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 11:34 ` Tvrtko Ursulin
2017-11-29 11:40 ` Chris Wilson
@ 2017-11-29 12:47 ` Daniel Vetter
1 sibling, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2017-11-29 12:47 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Wed, Nov 29, 2017 at 11:34:27AM +0000, Tvrtko Ursulin wrote:
>
> On 29/11/2017 11:12, Daniel Vetter wrote:
> > On Wed, Nov 29, 2017 at 10:59:27AM +0000, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >
> > > It seems that the DMC likes to transition between the DC states a lot when
> > > there are no connected displays (no active power domains) during command
> > > submission.
> > >
> > > This activity on DC states has a negative impact on the performance of the
> > > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > > eight.
> > >
> > > Work around it by introducing a new power domain named,
> > > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > > held for the duration of command submission activity.
> > >
> > > v2:
> > > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > > * Protect macro body with braces. (Jani Nikula)
> > >
> > > v3:
> > > * Add dedicated power domain for clarity. (Chris, Imre)
> > > * Commit message and comment text updates.
> > > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > > firmware release.
> > >
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > > Testcase: igt/gem_exec_nop/headless
> > > Cc: Imre Deak <imre.deak@intel.com>
> > > Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> > > drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> > > drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> > > drivers/gpu/drm/i915/intel_runtime_pm.c | 4 ++++
> > > 4 files changed, 27 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index bddd65839f60..17340aadc566 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > > POWER_DOMAIN_AUX_D,
> > > POWER_DOMAIN_GMBUS,
> > > POWER_DOMAIN_MODESET,
> > > + POWER_DOMAIN_GT_IRQ,
> > > POWER_DOMAIN_INIT,
> > > POWER_DOMAIN_NUM,
> > > @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
> > > #define GT_FREQUENCY_MULTIPLIER 50
> > > #define GEN9_FREQ_SCALER 3
> > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > + (HAS_CSR(dev_priv) && (dev_priv)->csr.dmc_payload && \
> >
> > Doesn't this check race with the async dmc loading? I.e. when you submit
> > something right at boot-up, before DMC is fully loaded, we might end up
> > with an unbalanced pm refcount.
>
> Oh yeah, well spotted.
>
> > I think given that DMC is strongly recommended there shouldn't be a real
> > cost with making this unconditional.
>
> I don't know, not liking it on the first go. But then again I have no idea
> how much power would that waste for use cases where DMC fw is deliberately
> not present.
>
> Perhaps it would be acceptable to mark GT busy during the async CSR load.
> Chris, any thoughts?
I only meant that we pm_put without pm_get (because when we would have
called pm_get dmc_payload == NULL and then != NULL when we reach the check
for pm_put).
The actual get/put vs. dmc loading should be protected already by the
async dmc load code holding pm references to prevent any havoc.
-Daniel
>
> Regards,
>
> Tvrtko
>
> > With that changed:
> >
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >
> > > + IS_GEN9_BC(dev_priv) && !IS_SKYLAKE(dev_priv))
> > > +
> > > #include "i915_trace.h"
> > > static inline bool intel_vtd_active(void)
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 354b0546a191..feec2a621120 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3381,7 +3381,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
> > > if (INTEL_GEN(dev_priv) >= 6)
> > > gen6_rps_idle(dev_priv);
> > > +
> > > intel_runtime_pm_put(dev_priv);
> > > +
> > > + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> > > + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> > > out_unlock:
> > > mutex_unlock(&dev_priv->drm.struct_mutex);
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > > index a90bdd26571f..619377a05810 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > > @@ -251,6 +251,20 @@ static void mark_busy(struct drm_i915_private *i915)
> > > GEM_BUG_ON(!i915->gt.active_requests);
> > > + /*
> > > + * It seems that the DMC likes to transition between the DC states a lot
> > > + * when there are no connected displays (no active power domains) during
> > > + * command submission.
> > > + *
> > > + * This activity has negative impact on the performance of the chip with
> > > + * huge latencies observed in the interrupt handler and elsewhere.
> > > + *
> > > + * Work around it by grabbing a GT IRQ power domain whilst there is any
> > > + * GT activity, preventing any DC state transitions.
> > > + */
> > > + if (NEEDS_CSR_GT_PERF_WA(i915))
> > > + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> > > +
> > > intel_runtime_pm_get_noresume(i915);
> > > i915->gt.awake = true;
> > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > index 8315499452dc..f491c7aaa096 100644
> > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> > > return "INIT";
> > > case POWER_DOMAIN_MODESET:
> > > return "MODESET";
> > > + case POWER_DOMAIN_GT_IRQ:
> > > + return "GT_IRQ";
> > > default:
> > > MISSING_CASE(domain);
> > > return "?";
> > > @@ -1705,6 +1707,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> > > BIT_ULL(POWER_DOMAIN_INIT))
> > > #define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> > > SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> > > + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> > > BIT_ULL(POWER_DOMAIN_MODESET) | \
> > > BIT_ULL(POWER_DOMAIN_AUX_A) | \
> > > BIT_ULL(POWER_DOMAIN_INIT))
> > > @@ -1785,6 +1788,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> > > BIT_ULL(POWER_DOMAIN_INIT))
> > > #define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> > > GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> > > + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> > > BIT_ULL(POWER_DOMAIN_MODESET) | \
> > > BIT_ULL(POWER_DOMAIN_AUX_A) | \
> > > BIT_ULL(POWER_DOMAIN_INIT))
> > > --
> > > 2.14.1
> > >
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 11:59 ` Chris Wilson
@ 2017-11-29 14:15 ` Imre Deak
0 siblings, 0 replies; 32+ messages in thread
From: Imre Deak @ 2017-11-29 14:15 UTC (permalink / raw)
To: Chris Wilson; +Cc: Intel-gfx
On Wed, Nov 29, 2017 at 11:59:04AM +0000, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-11-29 11:53:41)
> >
> > On 29/11/2017 11:40, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2017-11-29 11:34:27)
> > >>
> > >> On 29/11/2017 11:12, Daniel Vetter wrote:
> > >>> I think given that DMC is strongly recommended there shouldn't be a real
> > >>> cost with making this unconditional.
> > >>
> > >> I don't know, not liking it on the first go. But then again I have no
> > >> idea how much power would that waste for use cases where DMC fw is
> > >> deliberately not present.
> > >>
> > >> Perhaps it would be acceptable to mark GT busy during the async CSR
> > >> load. Chris, any thoughts?
> > >
> > > It's tightly coupled to requests, adding in an external call seems
> > > troublesome.
> > >
> > > What's the reason for depending on the CSR being loaded? The old fw is
> > > broke no matter what, it doesn't get any more broken by us holding a
> > > powerwell wakeref. I think we should go for simplicity and always take
> > > the powerwell along with the rpm?
> >
> > It's the unknown, maybe only for me, on how much power always holding
> > the power well would waste for use cases where DMC firmware has been
> > deliberately removed.
> >
> > If I understand correctly that the Daniel's and your proposal is to just
> > got with HAS_CSR as the wa/ criteria, instead of fw loaded check.
>
> If I am reading the powerwell code correctly, it already takes the dmc
> fw into account. I would transfer the problem to there :) i.e. we have
> an unconditional call from gt:mark_busy, gt:mark_idle and the powerwell
> code knows what needs to be done under the different circumstances.
Yes, a power well ref for all domains is held while the firmware is
being loaded, or the firmware fails to load. So taking it
unconditionally is ok (not counting the platform check due to the
corruption issue).
> -Chris
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (2 preceding siblings ...)
2017-11-29 11:22 ` ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev3) Patchwork
@ 2017-11-29 14:18 ` Imre Deak
2017-11-29 14:30 ` [PATCH v4] " Tvrtko Ursulin
` (3 subsequent siblings)
7 siblings, 0 replies; 32+ messages in thread
From: Imre Deak @ 2017-11-29 14:18 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Wed, Nov 29, 2017 at 10:59:27AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states a lot when
> there are no connected displays (no active power domains) during command
> submission.
>
> This activity on DC states has a negative impact on the performance of the
> chip with huge latencies observed in the interrupt handlers and elsewhere.
> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> eight.
>
> Work around it by introducing a new power domain named,
> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> held for the duration of command submission activity.
>
> v2:
> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> * Protect macro body with braces. (Jani Nikula)
>
> v3:
> * Add dedicated power domain for clarity. (Chris, Imre)
> * Commit message and comment text updates.
> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> firmware release.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> drivers/gpu/drm/i915/intel_runtime_pm.c | 4 ++++
> 4 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bddd65839f60..17340aadc566 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> POWER_DOMAIN_AUX_D,
> POWER_DOMAIN_GMBUS,
> POWER_DOMAIN_MODESET,
> + POWER_DOMAIN_GT_IRQ,
> POWER_DOMAIN_INIT,
>
> POWER_DOMAIN_NUM,
> @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
> #define GT_FREQUENCY_MULTIPLIER 50
> #define GEN9_FREQ_SCALER 3
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + (HAS_CSR(dev_priv) && (dev_priv)->csr.dmc_payload && \
> + IS_GEN9_BC(dev_priv) && !IS_SKYLAKE(dev_priv))
BXT and GLK seems to be also affected so IS_GEN9().
> +
> #include "i915_trace.h"
>
> static inline bool intel_vtd_active(void)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 354b0546a191..feec2a621120 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3381,7 +3381,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
>
> if (INTEL_GEN(dev_priv) >= 6)
> gen6_rps_idle(dev_priv);
> +
> intel_runtime_pm_put(dev_priv);
> +
> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> out_unlock:
> mutex_unlock(&dev_priv->drm.struct_mutex);
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index a90bdd26571f..619377a05810 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -251,6 +251,20 @@ static void mark_busy(struct drm_i915_private *i915)
>
> GEM_BUG_ON(!i915->gt.active_requests);
>
> + /*
> + * It seems that the DMC likes to transition between the DC states a lot
> + * when there are no connected displays (no active power domains) during
> + * command submission.
> + *
> + * This activity has negative impact on the performance of the chip with
> + * huge latencies observed in the interrupt handler and elsewhere.
> + *
> + * Work around it by grabbing a GT IRQ power domain whilst there is any
> + * GT activity, preventing any DC state transitions.
> + */
> + if (NEEDS_CSR_GT_PERF_WA(i915))
> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +
> intel_runtime_pm_get_noresume(i915);
> i915->gt.awake = true;
>
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 8315499452dc..f491c7aaa096 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> return "INIT";
> case POWER_DOMAIN_MODESET:
> return "MODESET";
> + case POWER_DOMAIN_GT_IRQ:
> + return "GT_IRQ";
> default:
> MISSING_CASE(domain);
> return "?";
> @@ -1705,6 +1707,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> BIT_ULL(POWER_DOMAIN_INIT))
> #define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> BIT_ULL(POWER_DOMAIN_MODESET) | \
> BIT_ULL(POWER_DOMAIN_AUX_A) | \
> BIT_ULL(POWER_DOMAIN_INIT))
> @@ -1785,6 +1788,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> BIT_ULL(POWER_DOMAIN_INIT))
> #define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
It needs to be added to BXT too.
> BIT_ULL(POWER_DOMAIN_MODESET) | \
> BIT_ULL(POWER_DOMAIN_AUX_A) | \
> BIT_ULL(POWER_DOMAIN_INIT))
> --
> 2.14.1
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (3 preceding siblings ...)
2017-11-29 14:18 ` [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Imre Deak
@ 2017-11-29 14:30 ` Tvrtko Ursulin
2017-11-29 15:06 ` Imre Deak
2017-11-30 9:18 ` [PATCH v5] " Tvrtko Ursulin
2017-11-30 15:56 ` ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev5) Patchwork
` (2 subsequent siblings)
7 siblings, 2 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-29 14:30 UTC (permalink / raw)
To: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
It seems that the DMC likes to transition between the DC states a lot when
there are no connected displays (no active power domains) during command
submission.
This activity on DC states has a negative impact on the performance of the
chip with huge latencies observed in the interrupt handlers and elsewhere.
Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
eight.
Work around it by introducing a new power domain named,
POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
held for the duration of command submission activity.
v2:
* Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
* Protect macro body with braces. (Jani Nikula)
v3:
* Add dedicated power domain for clarity. (Chris, Imre)
* Commit message and comment text updates.
* Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
firmware release.
v4:
* Power domain should be inner to device runtime pm. (Chris)
* Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
* Handle async DMC loading by moving the GT_IRQ power domain logic into
intel_runtime_pm. (Daniel, Chris)
* Include small core GEN9 as well. (Imre)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
Testcase: igt/gem_exec_nop/headless
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 5 ++++
drivers/gpu/drm/i915/i915_gem.c | 2 ++
drivers/gpu/drm/i915/i915_gem_request.c | 14 +++++++++++
drivers/gpu/drm/i915/intel_csr.c | 29 +++++++++++++---------
drivers/gpu/drm/i915/intel_runtime_pm.c | 44 +++++++++++++++++++++++++++------
5 files changed, 76 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bddd65839f60..37b3da4fd0d4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -398,6 +398,7 @@ enum intel_display_power_domain {
POWER_DOMAIN_AUX_D,
POWER_DOMAIN_GMBUS,
POWER_DOMAIN_MODESET,
+ POWER_DOMAIN_GT_IRQ,
POWER_DOMAIN_INIT,
POWER_DOMAIN_NUM,
@@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
#define GT_FREQUENCY_MULTIPLIER 50
#define GEN9_FREQ_SCALER 3
+#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
+ ((dev_priv)->csr.dmc_payload && \
+ IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
+
#include "i915_trace.h"
static inline bool intel_vtd_active(void)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 354b0546a191..a6f522e2d1a1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3381,6 +3381,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
if (INTEL_GEN(dev_priv) >= 6)
gen6_rps_idle(dev_priv);
+
+ intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
intel_runtime_pm_put(dev_priv);
out_unlock:
mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index a90bdd26571f..c28a4ceb016d 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
GEM_BUG_ON(!i915->gt.active_requests);
intel_runtime_pm_get_noresume(i915);
+
+ /*
+ * It seems that the DMC likes to transition between the DC states a lot
+ * when there are no connected displays (no active power domains) during
+ * command submission.
+ *
+ * This activity has negative impact on the performance of the chip with
+ * huge latencies observed in the interrupt handler and elsewhere.
+ *
+ * Work around it by grabbing a GT IRQ power domain whilst there is any
+ * GT activity, preventing any DC state transitions.
+ */
+ intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+
i915->gt.awake = true;
intel_enable_gt_powersave(i915);
diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
index 07e4f7bc4412..8b188e9f283b 100644
--- a/drivers/gpu/drm/i915/intel_csr.c
+++ b/drivers/gpu/drm/i915/intel_csr.c
@@ -403,24 +403,33 @@ static uint32_t *parse_csr_fw(struct drm_i915_private *dev_priv,
static void csr_load_work_fn(struct work_struct *work)
{
- struct drm_i915_private *dev_priv;
- struct intel_csr *csr;
+ struct drm_i915_private *dev_priv =
+ container_of(work, typeof(*dev_priv), csr.work);
+ struct intel_csr *csr = &dev_priv->csr;
const struct firmware *fw = NULL;
+ u32 *dmc_payload;
- dev_priv = container_of(work, typeof(*dev_priv), csr.work);
- csr = &dev_priv->csr;
+ request_firmware(&fw, csr->fw_path, &dev_priv->drm.pdev->dev);
- request_firmware(&fw, dev_priv->csr.fw_path, &dev_priv->drm.pdev->dev);
- if (fw)
- dev_priv->csr.dmc_payload = parse_csr_fw(dev_priv, fw);
+ if (fw) {
+ dmc_payload = parse_csr_fw(dev_priv, fw);
+ release_firmware(fw);
+ } else {
+ dmc_payload = NULL;
+ }
- if (dev_priv->csr.dmc_payload) {
+ /* Lock to document memory barrier towards NEEDS_CSR_GT_PERF_WA. */
+ mutex_lock(&dev_priv->power_domains.lock);
+ csr->dmc_payload = dmc_payload;
+ mutex_unlock(&dev_priv->power_domains.lock);
+
+ if (csr->dmc_payload) {
intel_csr_load_program(dev_priv);
intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
DRM_INFO("Finished loading DMC firmware %s (v%u.%u)\n",
- dev_priv->csr.fw_path,
+ csr->fw_path,
CSR_VERSION_MAJOR(csr->version),
CSR_VERSION_MINOR(csr->version));
} else {
@@ -431,8 +440,6 @@ static void csr_load_work_fn(struct work_struct *work)
dev_notice(dev_priv->drm.dev, "DMC firmware homepage: %s",
INTEL_UC_FIRMWARE_URL);
}
-
- release_firmware(fw);
}
/**
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 8315499452dc..915124a2e945 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
return "INIT";
case POWER_DOMAIN_MODESET:
return "MODESET";
+ case POWER_DOMAIN_GT_IRQ:
+ return "GT_IRQ";
default:
MISSING_CASE(domain);
return "?";
@@ -1448,6 +1450,9 @@ __intel_display_power_get_domain(struct drm_i915_private *dev_priv,
struct i915_power_domains *power_domains = &dev_priv->power_domains;
struct i915_power_well *power_well;
+ if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
+ return;
+
for_each_power_domain_well(dev_priv, power_well, BIT_ULL(domain))
intel_power_well_get(dev_priv, power_well);
@@ -1518,6 +1523,19 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
return is_enabled;
}
+static void
+__intel_display_power_put_domain(struct drm_i915_private *dev_priv,
+ enum intel_display_power_domain domain)
+{
+ struct i915_power_domains *power_domains = &dev_priv->power_domains;
+ struct i915_power_well *power_well;
+
+ power_domains->domain_use_count[domain]--;
+
+ for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
+ intel_power_well_put(dev_priv, power_well);
+}
+
/**
* intel_display_power_put - release a power domain reference
* @dev_priv: i915 device instance
@@ -1530,21 +1548,30 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
void intel_display_power_put(struct drm_i915_private *dev_priv,
enum intel_display_power_domain domain)
{
- struct i915_power_domains *power_domains;
- struct i915_power_well *power_well;
-
- power_domains = &dev_priv->power_domains;
+ struct i915_power_domains *power_domains = &dev_priv->power_domains;
mutex_lock(&power_domains->lock);
+ if (domain == POWER_DOMAIN_GT_IRQ) {
+ /*
+ * To handle asynchronous DMC loading we have to be careful to
+ * keep the use count on POWER_DOMAIN_GT_IRQ balanced.
+ *
+ * If there was GT activity before the DMC was loaded, we will
+ * have skipped obtaining the power domain so must not decrement
+ * it now.
+ */
+ if (!power_domains->domain_use_count[domain])
+ goto out;
+ }
+
WARN(!power_domains->domain_use_count[domain],
"Use count on domain %s is already zero\n",
intel_display_power_domain_str(domain));
- power_domains->domain_use_count[domain]--;
- for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
- intel_power_well_put(dev_priv, power_well);
+ __intel_display_power_put_domain(dev_priv, domain);
+out:
mutex_unlock(&power_domains->lock);
intel_runtime_pm_put(dev_priv);
@@ -1705,6 +1732,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
@@ -1727,6 +1755,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define BXT_DISPLAY_DC_OFF_POWER_DOMAINS ( \
BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
@@ -1785,6 +1814,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v4] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 14:30 ` [PATCH v4] " Tvrtko Ursulin
@ 2017-11-29 15:06 ` Imre Deak
2017-11-29 15:21 ` Tvrtko Ursulin
2017-11-30 9:18 ` [PATCH v5] " Tvrtko Ursulin
1 sibling, 1 reply; 32+ messages in thread
From: Imre Deak @ 2017-11-29 15:06 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Wed, Nov 29, 2017 at 02:30:30PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states a lot when
> there are no connected displays (no active power domains) during command
> submission.
>
> This activity on DC states has a negative impact on the performance of the
> chip with huge latencies observed in the interrupt handlers and elsewhere.
> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> eight.
>
> Work around it by introducing a new power domain named,
> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> held for the duration of command submission activity.
>
> v2:
> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> * Protect macro body with braces. (Jani Nikula)
>
> v3:
> * Add dedicated power domain for clarity. (Chris, Imre)
> * Commit message and comment text updates.
> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> firmware release.
>
> v4:
> * Power domain should be inner to device runtime pm. (Chris)
> * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> * Handle async DMC loading by moving the GT_IRQ power domain logic into
> intel_runtime_pm. (Daniel, Chris)
> * Include small core GEN9 as well. (Imre)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 ++++
> drivers/gpu/drm/i915/i915_gem.c | 2 ++
> drivers/gpu/drm/i915/i915_gem_request.c | 14 +++++++++++
> drivers/gpu/drm/i915/intel_csr.c | 29 +++++++++++++---------
> drivers/gpu/drm/i915/intel_runtime_pm.c | 44 +++++++++++++++++++++++++++------
> 5 files changed, 76 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bddd65839f60..37b3da4fd0d4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> POWER_DOMAIN_AUX_D,
> POWER_DOMAIN_GMBUS,
> POWER_DOMAIN_MODESET,
> + POWER_DOMAIN_GT_IRQ,
> POWER_DOMAIN_INIT,
>
> POWER_DOMAIN_NUM,
> @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
> #define GT_FREQUENCY_MULTIPLIER 50
> #define GEN9_FREQ_SCALER 3
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + ((dev_priv)->csr.dmc_payload && \
> + IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
> +
> #include "i915_trace.h"
>
> static inline bool intel_vtd_active(void)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 354b0546a191..a6f522e2d1a1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3381,6 +3381,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
>
> if (INTEL_GEN(dev_priv) >= 6)
> gen6_rps_idle(dev_priv);
> +
> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> intel_runtime_pm_put(dev_priv);
> out_unlock:
> mutex_unlock(&dev_priv->drm.struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index a90bdd26571f..c28a4ceb016d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> GEM_BUG_ON(!i915->gt.active_requests);
>
> intel_runtime_pm_get_noresume(i915);
> +
> + /*
> + * It seems that the DMC likes to transition between the DC states a lot
> + * when there are no connected displays (no active power domains) during
> + * command submission.
> + *
> + * This activity has negative impact on the performance of the chip with
> + * huge latencies observed in the interrupt handler and elsewhere.
> + *
> + * Work around it by grabbing a GT IRQ power domain whilst there is any
> + * GT activity, preventing any DC state transitions.
> + */
> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +
> i915->gt.awake = true;
>
> intel_enable_gt_powersave(i915);
> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> index 07e4f7bc4412..8b188e9f283b 100644
> --- a/drivers/gpu/drm/i915/intel_csr.c
> +++ b/drivers/gpu/drm/i915/intel_csr.c
> @@ -403,24 +403,33 @@ static uint32_t *parse_csr_fw(struct drm_i915_private *dev_priv,
>
> static void csr_load_work_fn(struct work_struct *work)
> {
> - struct drm_i915_private *dev_priv;
> - struct intel_csr *csr;
> + struct drm_i915_private *dev_priv =
> + container_of(work, typeof(*dev_priv), csr.work);
> + struct intel_csr *csr = &dev_priv->csr;
> const struct firmware *fw = NULL;
> + u32 *dmc_payload;
>
> - dev_priv = container_of(work, typeof(*dev_priv), csr.work);
> - csr = &dev_priv->csr;
> + request_firmware(&fw, csr->fw_path, &dev_priv->drm.pdev->dev);
>
> - request_firmware(&fw, dev_priv->csr.fw_path, &dev_priv->drm.pdev->dev);
> - if (fw)
> - dev_priv->csr.dmc_payload = parse_csr_fw(dev_priv, fw);
> + if (fw) {
> + dmc_payload = parse_csr_fw(dev_priv, fw);
> + release_firmware(fw);
> + } else {
> + dmc_payload = NULL;
> + }
>
> - if (dev_priv->csr.dmc_payload) {
> + /* Lock to document memory barrier towards NEEDS_CSR_GT_PERF_WA. */
> + mutex_lock(&dev_priv->power_domains.lock);
> + csr->dmc_payload = dmc_payload;
> + mutex_unlock(&dev_priv->power_domains.lock);
> +
> + if (csr->dmc_payload) {
> intel_csr_load_program(dev_priv);
>
> intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
>
> DRM_INFO("Finished loading DMC firmware %s (v%u.%u)\n",
> - dev_priv->csr.fw_path,
> + csr->fw_path,
> CSR_VERSION_MAJOR(csr->version),
> CSR_VERSION_MINOR(csr->version));
> } else {
> @@ -431,8 +440,6 @@ static void csr_load_work_fn(struct work_struct *work)
> dev_notice(dev_priv->drm.dev, "DMC firmware homepage: %s",
> INTEL_UC_FIRMWARE_URL);
> }
> -
> - release_firmware(fw);
> }
>
> /**
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 8315499452dc..915124a2e945 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> return "INIT";
> case POWER_DOMAIN_MODESET:
> return "MODESET";
> + case POWER_DOMAIN_GT_IRQ:
> + return "GT_IRQ";
> default:
> MISSING_CASE(domain);
> return "?";
> @@ -1448,6 +1450,9 @@ __intel_display_power_get_domain(struct drm_i915_private *dev_priv,
> struct i915_power_domains *power_domains = &dev_priv->power_domains;
> struct i915_power_well *power_well;
>
> + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> + return;
> +
> for_each_power_domain_well(dev_priv, power_well, BIT_ULL(domain))
> intel_power_well_get(dev_priv, power_well);
>
> @@ -1518,6 +1523,19 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
> return is_enabled;
> }
>
> +static void
> +__intel_display_power_put_domain(struct drm_i915_private *dev_priv,
> + enum intel_display_power_domain domain)
> +{
> + struct i915_power_domains *power_domains = &dev_priv->power_domains;
> + struct i915_power_well *power_well;
> +
> + power_domains->domain_use_count[domain]--;
> +
> + for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
> + intel_power_well_put(dev_priv, power_well);
> +}
> +
> /**
> * intel_display_power_put - release a power domain reference
> * @dev_priv: i915 device instance
> @@ -1530,21 +1548,30 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
> void intel_display_power_put(struct drm_i915_private *dev_priv,
> enum intel_display_power_domain domain)
> {
> - struct i915_power_domains *power_domains;
> - struct i915_power_well *power_well;
> -
> - power_domains = &dev_priv->power_domains;
> + struct i915_power_domains *power_domains = &dev_priv->power_domains;
>
> mutex_lock(&power_domains->lock);
>
> + if (domain == POWER_DOMAIN_GT_IRQ) {
> + /*
> + * To handle asynchronous DMC loading we have to be careful to
> + * keep the use count on POWER_DOMAIN_GT_IRQ balanced.
> + *
> + * If there was GT activity before the DMC was loaded, we will
> + * have skipped obtaining the power domain so must not decrement
> + * it now.
> + */
> + if (!power_domains->domain_use_count[domain])
> + goto out;
> + }
Is it possible to have any GT activity before
intel_power_domains_init_hw() / intel_csr_ucode_init()? I couldn't spot
anything at least. Calling into the power well code before those isn't
correct in any case. If so, the INIT power ref taken in
intel_csr_ucode_init() makes sure the above scenario can't happen and
then we don't need dmc_payload check in NEEDS_CSR_GT_PERF_WA() either.
> +
> WARN(!power_domains->domain_use_count[domain],
> "Use count on domain %s is already zero\n",
> intel_display_power_domain_str(domain));
> - power_domains->domain_use_count[domain]--;
>
> - for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
> - intel_power_well_put(dev_priv, power_well);
> + __intel_display_power_put_domain(dev_priv, domain);
>
> +out:
> mutex_unlock(&power_domains->lock);
>
> intel_runtime_pm_put(dev_priv);
> @@ -1705,6 +1732,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> BIT_ULL(POWER_DOMAIN_INIT))
> #define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> BIT_ULL(POWER_DOMAIN_MODESET) | \
> BIT_ULL(POWER_DOMAIN_AUX_A) | \
> BIT_ULL(POWER_DOMAIN_INIT))
> @@ -1727,6 +1755,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> BIT_ULL(POWER_DOMAIN_INIT))
> #define BXT_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> BIT_ULL(POWER_DOMAIN_MODESET) | \
> BIT_ULL(POWER_DOMAIN_AUX_A) | \
> BIT_ULL(POWER_DOMAIN_INIT))
> @@ -1785,6 +1814,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
> BIT_ULL(POWER_DOMAIN_INIT))
> #define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
> GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
> BIT_ULL(POWER_DOMAIN_MODESET) | \
> BIT_ULL(POWER_DOMAIN_AUX_A) | \
> BIT_ULL(POWER_DOMAIN_INIT))
> --
> 2.14.1
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 15:06 ` Imre Deak
@ 2017-11-29 15:21 ` Tvrtko Ursulin
2017-11-29 17:28 ` Imre Deak
0 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-29 15:21 UTC (permalink / raw)
To: imre.deak, Tvrtko Ursulin; +Cc: Intel-gfx
On 29/11/2017 15:06, Imre Deak wrote:
> On Wed, Nov 29, 2017 at 02:30:30PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> It seems that the DMC likes to transition between the DC states a lot when
>> there are no connected displays (no active power domains) during command
>> submission.
>>
>> This activity on DC states has a negative impact on the performance of the
>> chip with huge latencies observed in the interrupt handlers and elsewhere.
>> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
>> eight.
>>
>> Work around it by introducing a new power domain named,
>> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
>> held for the duration of command submission activity.
>>
>> v2:
>> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
>> * Protect macro body with braces. (Jani Nikula)
>>
>> v3:
>> * Add dedicated power domain for clarity. (Chris, Imre)
>> * Commit message and comment text updates.
>> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
>> firmware release.
>>
>> v4:
>> * Power domain should be inner to device runtime pm. (Chris)
>> * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
>> * Handle async DMC loading by moving the GT_IRQ power domain logic into
>> intel_runtime_pm. (Daniel, Chris)
>> * Include small core GEN9 as well. (Imre)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
>> Testcase: igt/gem_exec_nop/headless
>> Cc: Imre Deak <imre.deak@intel.com>
>> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 5 ++++
>> drivers/gpu/drm/i915/i915_gem.c | 2 ++
>> drivers/gpu/drm/i915/i915_gem_request.c | 14 +++++++++++
>> drivers/gpu/drm/i915/intel_csr.c | 29 +++++++++++++---------
>> drivers/gpu/drm/i915/intel_runtime_pm.c | 44 +++++++++++++++++++++++++++------
>> 5 files changed, 76 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index bddd65839f60..37b3da4fd0d4 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
>> POWER_DOMAIN_AUX_D,
>> POWER_DOMAIN_GMBUS,
>> POWER_DOMAIN_MODESET,
>> + POWER_DOMAIN_GT_IRQ,
>> POWER_DOMAIN_INIT,
>>
>> POWER_DOMAIN_NUM,
>> @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
>> #define GT_FREQUENCY_MULTIPLIER 50
>> #define GEN9_FREQ_SCALER 3
>>
>> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
>> + ((dev_priv)->csr.dmc_payload && \
>> + IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
>> +
>> #include "i915_trace.h"
>>
>> static inline bool intel_vtd_active(void)
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 354b0546a191..a6f522e2d1a1 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3381,6 +3381,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>
>> if (INTEL_GEN(dev_priv) >= 6)
>> gen6_rps_idle(dev_priv);
>> +
>> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
>> intel_runtime_pm_put(dev_priv);
>> out_unlock:
>> mutex_unlock(&dev_priv->drm.struct_mutex);
>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>> index a90bdd26571f..c28a4ceb016d 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>> @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
>> GEM_BUG_ON(!i915->gt.active_requests);
>>
>> intel_runtime_pm_get_noresume(i915);
>> +
>> + /*
>> + * It seems that the DMC likes to transition between the DC states a lot
>> + * when there are no connected displays (no active power domains) during
>> + * command submission.
>> + *
>> + * This activity has negative impact on the performance of the chip with
>> + * huge latencies observed in the interrupt handler and elsewhere.
>> + *
>> + * Work around it by grabbing a GT IRQ power domain whilst there is any
>> + * GT activity, preventing any DC state transitions.
>> + */
>> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
>> +
>> i915->gt.awake = true;
>>
>> intel_enable_gt_powersave(i915);
>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>> index 07e4f7bc4412..8b188e9f283b 100644
>> --- a/drivers/gpu/drm/i915/intel_csr.c
>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>> @@ -403,24 +403,33 @@ static uint32_t *parse_csr_fw(struct drm_i915_private *dev_priv,
>>
>> static void csr_load_work_fn(struct work_struct *work)
>> {
>> - struct drm_i915_private *dev_priv;
>> - struct intel_csr *csr;
>> + struct drm_i915_private *dev_priv =
>> + container_of(work, typeof(*dev_priv), csr.work);
>> + struct intel_csr *csr = &dev_priv->csr;
>> const struct firmware *fw = NULL;
>> + u32 *dmc_payload;
>>
>> - dev_priv = container_of(work, typeof(*dev_priv), csr.work);
>> - csr = &dev_priv->csr;
>> + request_firmware(&fw, csr->fw_path, &dev_priv->drm.pdev->dev);
>>
>> - request_firmware(&fw, dev_priv->csr.fw_path, &dev_priv->drm.pdev->dev);
>> - if (fw)
>> - dev_priv->csr.dmc_payload = parse_csr_fw(dev_priv, fw);
>> + if (fw) {
>> + dmc_payload = parse_csr_fw(dev_priv, fw);
>> + release_firmware(fw);
>> + } else {
>> + dmc_payload = NULL;
>> + }
>>
>> - if (dev_priv->csr.dmc_payload) {
>> + /* Lock to document memory barrier towards NEEDS_CSR_GT_PERF_WA. */
>> + mutex_lock(&dev_priv->power_domains.lock);
>> + csr->dmc_payload = dmc_payload;
>> + mutex_unlock(&dev_priv->power_domains.lock);
>> +
>> + if (csr->dmc_payload) {
>> intel_csr_load_program(dev_priv);
>>
>> intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
>>
>> DRM_INFO("Finished loading DMC firmware %s (v%u.%u)\n",
>> - dev_priv->csr.fw_path,
>> + csr->fw_path,
>> CSR_VERSION_MAJOR(csr->version),
>> CSR_VERSION_MINOR(csr->version));
>> } else {
>> @@ -431,8 +440,6 @@ static void csr_load_work_fn(struct work_struct *work)
>> dev_notice(dev_priv->drm.dev, "DMC firmware homepage: %s",
>> INTEL_UC_FIRMWARE_URL);
>> }
>> -
>> - release_firmware(fw);
>> }
>>
>> /**
>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> index 8315499452dc..915124a2e945 100644
>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
>> return "INIT";
>> case POWER_DOMAIN_MODESET:
>> return "MODESET";
>> + case POWER_DOMAIN_GT_IRQ:
>> + return "GT_IRQ";
>> default:
>> MISSING_CASE(domain);
>> return "?";
>> @@ -1448,6 +1450,9 @@ __intel_display_power_get_domain(struct drm_i915_private *dev_priv,
>> struct i915_power_domains *power_domains = &dev_priv->power_domains;
>> struct i915_power_well *power_well;
>>
>> + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
>> + return;
>> +
>> for_each_power_domain_well(dev_priv, power_well, BIT_ULL(domain))
>> intel_power_well_get(dev_priv, power_well);
>>
>> @@ -1518,6 +1523,19 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
>> return is_enabled;
>> }
>>
>> +static void
>> +__intel_display_power_put_domain(struct drm_i915_private *dev_priv,
>> + enum intel_display_power_domain domain)
>> +{
>> + struct i915_power_domains *power_domains = &dev_priv->power_domains;
>> + struct i915_power_well *power_well;
>> +
>> + power_domains->domain_use_count[domain]--;
>> +
>> + for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
>> + intel_power_well_put(dev_priv, power_well);
>> +}
>> +
>> /**
>> * intel_display_power_put - release a power domain reference
>> * @dev_priv: i915 device instance
>> @@ -1530,21 +1548,30 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
>> void intel_display_power_put(struct drm_i915_private *dev_priv,
>> enum intel_display_power_domain domain)
>> {
>> - struct i915_power_domains *power_domains;
>> - struct i915_power_well *power_well;
>> -
>> - power_domains = &dev_priv->power_domains;
>> + struct i915_power_domains *power_domains = &dev_priv->power_domains;
>>
>> mutex_lock(&power_domains->lock);
>>
>> + if (domain == POWER_DOMAIN_GT_IRQ) {
>> + /*
>> + * To handle asynchronous DMC loading we have to be careful to
>> + * keep the use count on POWER_DOMAIN_GT_IRQ balanced.
>> + *
>> + * If there was GT activity before the DMC was loaded, we will
>> + * have skipped obtaining the power domain so must not decrement
>> + * it now.
>> + */
>> + if (!power_domains->domain_use_count[domain])
>> + goto out;
>> + }
>
> Is it possible to have any GT activity before
> intel_power_domains_init_hw() / intel_csr_ucode_init()? I couldn't spot
> anything at least. Calling into the power well code before those isn't
> correct in any case. If so, the INIT power ref taken in
> intel_csr_ucode_init() makes sure the above scenario can't happen and
> then we don't need dmc_payload check in NEEDS_CSR_GT_PERF_WA() either.
intel_csr_ucode_init schedules a worker to actually load the firmware so
unless I am missing some synchronisation point it can happen.
Simpler solution would be to only base the NEEDS_CSR_GT_PERF_WA on
HAS_CSR, but then if someone removes the firmware deliberately we lose
power keeping this power well up needlessly, no?
Regards,
Tvrtko
>
>> +
>> WARN(!power_domains->domain_use_count[domain],
>> "Use count on domain %s is already zero\n",
>> intel_display_power_domain_str(domain));
>> - power_domains->domain_use_count[domain]--;
>>
>> - for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
>> - intel_power_well_put(dev_priv, power_well);
>> + __intel_display_power_put_domain(dev_priv, domain);
>>
>> +out:
>> mutex_unlock(&power_domains->lock);
>>
>> intel_runtime_pm_put(dev_priv);
>> @@ -1705,6 +1732,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
>> BIT_ULL(POWER_DOMAIN_INIT))
>> #define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
>> SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
>> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
>> BIT_ULL(POWER_DOMAIN_MODESET) | \
>> BIT_ULL(POWER_DOMAIN_AUX_A) | \
>> BIT_ULL(POWER_DOMAIN_INIT))
>> @@ -1727,6 +1755,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
>> BIT_ULL(POWER_DOMAIN_INIT))
>> #define BXT_DISPLAY_DC_OFF_POWER_DOMAINS ( \
>> BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
>> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
>> BIT_ULL(POWER_DOMAIN_MODESET) | \
>> BIT_ULL(POWER_DOMAIN_AUX_A) | \
>> BIT_ULL(POWER_DOMAIN_INIT))
>> @@ -1785,6 +1814,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
>> BIT_ULL(POWER_DOMAIN_INIT))
>> #define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
>> GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
>> + BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
>> BIT_ULL(POWER_DOMAIN_MODESET) | \
>> BIT_ULL(POWER_DOMAIN_AUX_A) | \
>> BIT_ULL(POWER_DOMAIN_INIT))
>> --
>> 2.14.1
>>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 15:21 ` Tvrtko Ursulin
@ 2017-11-29 17:28 ` Imre Deak
2017-11-30 8:34 ` Tvrtko Ursulin
0 siblings, 1 reply; 32+ messages in thread
From: Imre Deak @ 2017-11-29 17:28 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Wed, Nov 29, 2017 at 03:21:23PM +0000, Tvrtko Ursulin wrote:
>
> On 29/11/2017 15:06, Imre Deak wrote:
> > On Wed, Nov 29, 2017 at 02:30:30PM +0000, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >
> > > It seems that the DMC likes to transition between the DC states a lot when
> > > there are no connected displays (no active power domains) during command
> > > submission.
> > >
> > > This activity on DC states has a negative impact on the performance of the
> > > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > > eight.
> > >
> > > Work around it by introducing a new power domain named,
> > > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > > held for the duration of command submission activity.
> > >
> > > v2:
> > > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > > * Protect macro body with braces. (Jani Nikula)
> > >
> > > v3:
> > > * Add dedicated power domain for clarity. (Chris, Imre)
> > > * Commit message and comment text updates.
> > > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > > firmware release.
> > >
> > > v4:
> > > * Power domain should be inner to device runtime pm. (Chris)
> > > * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> > > * Handle async DMC loading by moving the GT_IRQ power domain logic into
> > > intel_runtime_pm. (Daniel, Chris)
> > > * Include small core GEN9 as well. (Imre)
> > >
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > > Testcase: igt/gem_exec_nop/headless
> > > Cc: Imre Deak <imre.deak@intel.com>
> > > Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_drv.h | 5 ++++
> > > drivers/gpu/drm/i915/i915_gem.c | 2 ++
> > > drivers/gpu/drm/i915/i915_gem_request.c | 14 +++++++++++
> > > drivers/gpu/drm/i915/intel_csr.c | 29 +++++++++++++---------
> > > drivers/gpu/drm/i915/intel_runtime_pm.c | 44 +++++++++++++++++++++++++++------
> > > 5 files changed, 76 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index bddd65839f60..37b3da4fd0d4 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > > POWER_DOMAIN_AUX_D,
> > > POWER_DOMAIN_GMBUS,
> > > POWER_DOMAIN_MODESET,
> > > + POWER_DOMAIN_GT_IRQ,
> > > POWER_DOMAIN_INIT,
> > > POWER_DOMAIN_NUM,
> > > @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
> > > #define GT_FREQUENCY_MULTIPLIER 50
> > > #define GEN9_FREQ_SCALER 3
> > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > + ((dev_priv)->csr.dmc_payload && \
> > > + IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
> > > +
> > > #include "i915_trace.h"
> > > static inline bool intel_vtd_active(void)
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 354b0546a191..a6f522e2d1a1 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3381,6 +3381,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
> > > if (INTEL_GEN(dev_priv) >= 6)
> > > gen6_rps_idle(dev_priv);
> > > +
> > > + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> > > intel_runtime_pm_put(dev_priv);
> > > out_unlock:
> > > mutex_unlock(&dev_priv->drm.struct_mutex);
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > > index a90bdd26571f..c28a4ceb016d 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > > @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> > > GEM_BUG_ON(!i915->gt.active_requests);
> > > intel_runtime_pm_get_noresume(i915);
> > > +
> > > + /*
> > > + * It seems that the DMC likes to transition between the DC states a lot
> > > + * when there are no connected displays (no active power domains) during
> > > + * command submission.
> > > + *
> > > + * This activity has negative impact on the performance of the chip with
> > > + * huge latencies observed in the interrupt handler and elsewhere.
> > > + *
> > > + * Work around it by grabbing a GT IRQ power domain whilst there is any
> > > + * GT activity, preventing any DC state transitions.
> > > + */
> > > + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> > > +
> > > i915->gt.awake = true;
> > > intel_enable_gt_powersave(i915);
> > > diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
> > > index 07e4f7bc4412..8b188e9f283b 100644
> > > --- a/drivers/gpu/drm/i915/intel_csr.c
> > > +++ b/drivers/gpu/drm/i915/intel_csr.c
> > > @@ -403,24 +403,33 @@ static uint32_t *parse_csr_fw(struct drm_i915_private *dev_priv,
> > > static void csr_load_work_fn(struct work_struct *work)
> > > {
> > > - struct drm_i915_private *dev_priv;
> > > - struct intel_csr *csr;
> > > + struct drm_i915_private *dev_priv =
> > > + container_of(work, typeof(*dev_priv), csr.work);
> > > + struct intel_csr *csr = &dev_priv->csr;
> > > const struct firmware *fw = NULL;
> > > + u32 *dmc_payload;
> > > - dev_priv = container_of(work, typeof(*dev_priv), csr.work);
> > > - csr = &dev_priv->csr;
> > > + request_firmware(&fw, csr->fw_path, &dev_priv->drm.pdev->dev);
> > > - request_firmware(&fw, dev_priv->csr.fw_path, &dev_priv->drm.pdev->dev);
> > > - if (fw)
> > > - dev_priv->csr.dmc_payload = parse_csr_fw(dev_priv, fw);
> > > + if (fw) {
> > > + dmc_payload = parse_csr_fw(dev_priv, fw);
> > > + release_firmware(fw);
> > > + } else {
> > > + dmc_payload = NULL;
> > > + }
> > > - if (dev_priv->csr.dmc_payload) {
> > > + /* Lock to document memory barrier towards NEEDS_CSR_GT_PERF_WA. */
> > > + mutex_lock(&dev_priv->power_domains.lock);
> > > + csr->dmc_payload = dmc_payload;
> > > + mutex_unlock(&dev_priv->power_domains.lock);
> > > +
> > > + if (csr->dmc_payload) {
> > > intel_csr_load_program(dev_priv);
> > > intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
> > > DRM_INFO("Finished loading DMC firmware %s (v%u.%u)\n",
> > > - dev_priv->csr.fw_path,
> > > + csr->fw_path,
> > > CSR_VERSION_MAJOR(csr->version),
> > > CSR_VERSION_MINOR(csr->version));
> > > } else {
> > > @@ -431,8 +440,6 @@ static void csr_load_work_fn(struct work_struct *work)
> > > dev_notice(dev_priv->drm.dev, "DMC firmware homepage: %s",
> > > INTEL_UC_FIRMWARE_URL);
> > > }
> > > -
> > > - release_firmware(fw);
> > > }
> > > /**
> > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > index 8315499452dc..915124a2e945 100644
> > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> > > return "INIT";
> > > case POWER_DOMAIN_MODESET:
> > > return "MODESET";
> > > + case POWER_DOMAIN_GT_IRQ:
> > > + return "GT_IRQ";
> > > default:
> > > MISSING_CASE(domain);
> > > return "?";
> > > @@ -1448,6 +1450,9 @@ __intel_display_power_get_domain(struct drm_i915_private *dev_priv,
> > > struct i915_power_domains *power_domains = &dev_priv->power_domains;
> > > struct i915_power_well *power_well;
> > > + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> > > + return;
> > > +
> > > for_each_power_domain_well(dev_priv, power_well, BIT_ULL(domain))
> > > intel_power_well_get(dev_priv, power_well);
> > > @@ -1518,6 +1523,19 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
> > > return is_enabled;
> > > }
> > > +static void
> > > +__intel_display_power_put_domain(struct drm_i915_private *dev_priv,
> > > + enum intel_display_power_domain domain)
> > > +{
> > > + struct i915_power_domains *power_domains = &dev_priv->power_domains;
> > > + struct i915_power_well *power_well;
> > > +
> > > + power_domains->domain_use_count[domain]--;
> > > +
> > > + for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
> > > + intel_power_well_put(dev_priv, power_well);
> > > +}
> > > +
> > > /**
> > > * intel_display_power_put - release a power domain reference
> > > * @dev_priv: i915 device instance
> > > @@ -1530,21 +1548,30 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
> > > void intel_display_power_put(struct drm_i915_private *dev_priv,
> > > enum intel_display_power_domain domain)
> > > {
> > > - struct i915_power_domains *power_domains;
> > > - struct i915_power_well *power_well;
> > > -
> > > - power_domains = &dev_priv->power_domains;
> > > + struct i915_power_domains *power_domains = &dev_priv->power_domains;
> > > mutex_lock(&power_domains->lock);
> > > + if (domain == POWER_DOMAIN_GT_IRQ) {
> > > + /*
> > > + * To handle asynchronous DMC loading we have to be careful to
> > > + * keep the use count on POWER_DOMAIN_GT_IRQ balanced.
> > > + *
> > > + * If there was GT activity before the DMC was loaded, we will
> > > + * have skipped obtaining the power domain so must not decrement
> > > + * it now.
> > > + */
> > > + if (!power_domains->domain_use_count[domain])
> > > + goto out;
> > > + }
> >
> > Is it possible to have any GT activity before
> > intel_power_domains_init_hw() / intel_csr_ucode_init()? I couldn't spot
> > anything at least. Calling into the power well code before those isn't
> > correct in any case. If so, the INIT power ref taken in
> > intel_csr_ucode_init() makes sure the above scenario can't happen and
> > then we don't need dmc_payload check in NEEDS_CSR_GT_PERF_WA() either.
>
> intel_csr_ucode_init schedules a worker to actually load the firmware so
> unless I am missing some synchronisation point it can happen.
What I meant is that the INIT domain ref we take before scheduling that
work makes the dmc_payload check redundant both during getting and
putting the GT_IRQ domain. Without that check if the FW is not loaded
yet (or won't be at all) those get/puts will just inc/dec the GT_IRQ
domain refcount, but won't result in enabling/disabling the DC-off power
well, so will be kind of nops.
> Simpler solution would be to only base the NEEDS_CSR_GT_PERF_WA on HAS_CSR,
> but then if someone removes the firmware deliberately we lose power keeping
> this power well up needlessly, no?
If the firmware can't be loaded runtime PM as a whole is anyway disabled
(due to the above INIT power domain ref not being released).
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v4] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 17:28 ` Imre Deak
@ 2017-11-30 8:34 ` Tvrtko Ursulin
0 siblings, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-30 8:34 UTC (permalink / raw)
To: imre.deak; +Cc: Intel-gfx
On 29/11/2017 17:28, Imre Deak wrote:
> On Wed, Nov 29, 2017 at 03:21:23PM +0000, Tvrtko Ursulin wrote:
>>
>> On 29/11/2017 15:06, Imre Deak wrote:
>>> On Wed, Nov 29, 2017 at 02:30:30PM +0000, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> It seems that the DMC likes to transition between the DC states a lot when
>>>> there are no connected displays (no active power domains) during command
>>>> submission.
>>>>
>>>> This activity on DC states has a negative impact on the performance of the
>>>> chip with huge latencies observed in the interrupt handlers and elsewhere.
>>>> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
>>>> eight.
>>>>
>>>> Work around it by introducing a new power domain named,
>>>> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
>>>> held for the duration of command submission activity.
>>>>
>>>> v2:
>>>> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
>>>> * Protect macro body with braces. (Jani Nikula)
>>>>
>>>> v3:
>>>> * Add dedicated power domain for clarity. (Chris, Imre)
>>>> * Commit message and comment text updates.
>>>> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
>>>> firmware release.
>>>>
>>>> v4:
>>>> * Power domain should be inner to device runtime pm. (Chris)
>>>> * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
>>>> * Handle async DMC loading by moving the GT_IRQ power domain logic into
>>>> intel_runtime_pm. (Daniel, Chris)
>>>> * Include small core GEN9 as well. (Imre)
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
>>>> Testcase: igt/gem_exec_nop/headless
>>>> Cc: Imre Deak <imre.deak@intel.com>
>>>> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
>>>> ---
>>>> drivers/gpu/drm/i915/i915_drv.h | 5 ++++
>>>> drivers/gpu/drm/i915/i915_gem.c | 2 ++
>>>> drivers/gpu/drm/i915/i915_gem_request.c | 14 +++++++++++
>>>> drivers/gpu/drm/i915/intel_csr.c | 29 +++++++++++++---------
>>>> drivers/gpu/drm/i915/intel_runtime_pm.c | 44 +++++++++++++++++++++++++++------
>>>> 5 files changed, 76 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>>> index bddd65839f60..37b3da4fd0d4 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
>>>> POWER_DOMAIN_AUX_D,
>>>> POWER_DOMAIN_GMBUS,
>>>> POWER_DOMAIN_MODESET,
>>>> + POWER_DOMAIN_GT_IRQ,
>>>> POWER_DOMAIN_INIT,
>>>> POWER_DOMAIN_NUM,
>>>> @@ -3285,6 +3286,10 @@ intel_info(const struct drm_i915_private *dev_priv)
>>>> #define GT_FREQUENCY_MULTIPLIER 50
>>>> #define GEN9_FREQ_SCALER 3
>>>> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
>>>> + ((dev_priv)->csr.dmc_payload && \
>>>> + IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
>>>> +
>>>> #include "i915_trace.h"
>>>> static inline bool intel_vtd_active(void)
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>> index 354b0546a191..a6f522e2d1a1 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>> @@ -3381,6 +3381,8 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>>> if (INTEL_GEN(dev_priv) >= 6)
>>>> gen6_rps_idle(dev_priv);
>>>> +
>>>> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
>>>> intel_runtime_pm_put(dev_priv);
>>>> out_unlock:
>>>> mutex_unlock(&dev_priv->drm.struct_mutex);
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>>>> index a90bdd26571f..c28a4ceb016d 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>>>> @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
>>>> GEM_BUG_ON(!i915->gt.active_requests);
>>>> intel_runtime_pm_get_noresume(i915);
>>>> +
>>>> + /*
>>>> + * It seems that the DMC likes to transition between the DC states a lot
>>>> + * when there are no connected displays (no active power domains) during
>>>> + * command submission.
>>>> + *
>>>> + * This activity has negative impact on the performance of the chip with
>>>> + * huge latencies observed in the interrupt handler and elsewhere.
>>>> + *
>>>> + * Work around it by grabbing a GT IRQ power domain whilst there is any
>>>> + * GT activity, preventing any DC state transitions.
>>>> + */
>>>> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
>>>> +
>>>> i915->gt.awake = true;
>>>> intel_enable_gt_powersave(i915);
>>>> diff --git a/drivers/gpu/drm/i915/intel_csr.c b/drivers/gpu/drm/i915/intel_csr.c
>>>> index 07e4f7bc4412..8b188e9f283b 100644
>>>> --- a/drivers/gpu/drm/i915/intel_csr.c
>>>> +++ b/drivers/gpu/drm/i915/intel_csr.c
>>>> @@ -403,24 +403,33 @@ static uint32_t *parse_csr_fw(struct drm_i915_private *dev_priv,
>>>> static void csr_load_work_fn(struct work_struct *work)
>>>> {
>>>> - struct drm_i915_private *dev_priv;
>>>> - struct intel_csr *csr;
>>>> + struct drm_i915_private *dev_priv =
>>>> + container_of(work, typeof(*dev_priv), csr.work);
>>>> + struct intel_csr *csr = &dev_priv->csr;
>>>> const struct firmware *fw = NULL;
>>>> + u32 *dmc_payload;
>>>> - dev_priv = container_of(work, typeof(*dev_priv), csr.work);
>>>> - csr = &dev_priv->csr;
>>>> + request_firmware(&fw, csr->fw_path, &dev_priv->drm.pdev->dev);
>>>> - request_firmware(&fw, dev_priv->csr.fw_path, &dev_priv->drm.pdev->dev);
>>>> - if (fw)
>>>> - dev_priv->csr.dmc_payload = parse_csr_fw(dev_priv, fw);
>>>> + if (fw) {
>>>> + dmc_payload = parse_csr_fw(dev_priv, fw);
>>>> + release_firmware(fw);
>>>> + } else {
>>>> + dmc_payload = NULL;
>>>> + }
>>>> - if (dev_priv->csr.dmc_payload) {
>>>> + /* Lock to document memory barrier towards NEEDS_CSR_GT_PERF_WA. */
>>>> + mutex_lock(&dev_priv->power_domains.lock);
>>>> + csr->dmc_payload = dmc_payload;
>>>> + mutex_unlock(&dev_priv->power_domains.lock);
>>>> +
>>>> + if (csr->dmc_payload) {
>>>> intel_csr_load_program(dev_priv);
>>>> intel_display_power_put(dev_priv, POWER_DOMAIN_INIT);
>>>> DRM_INFO("Finished loading DMC firmware %s (v%u.%u)\n",
>>>> - dev_priv->csr.fw_path,
>>>> + csr->fw_path,
>>>> CSR_VERSION_MAJOR(csr->version),
>>>> CSR_VERSION_MINOR(csr->version));
>>>> } else {
>>>> @@ -431,8 +440,6 @@ static void csr_load_work_fn(struct work_struct *work)
>>>> dev_notice(dev_priv->drm.dev, "DMC firmware homepage: %s",
>>>> INTEL_UC_FIRMWARE_URL);
>>>> }
>>>> -
>>>> - release_firmware(fw);
>>>> }
>>>> /**
>>>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
>>>> index 8315499452dc..915124a2e945 100644
>>>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>>>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>>>> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
>>>> return "INIT";
>>>> case POWER_DOMAIN_MODESET:
>>>> return "MODESET";
>>>> + case POWER_DOMAIN_GT_IRQ:
>>>> + return "GT_IRQ";
>>>> default:
>>>> MISSING_CASE(domain);
>>>> return "?";
>>>> @@ -1448,6 +1450,9 @@ __intel_display_power_get_domain(struct drm_i915_private *dev_priv,
>>>> struct i915_power_domains *power_domains = &dev_priv->power_domains;
>>>> struct i915_power_well *power_well;
>>>> + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
>>>> + return;
>>>> +
>>>> for_each_power_domain_well(dev_priv, power_well, BIT_ULL(domain))
>>>> intel_power_well_get(dev_priv, power_well);
>>>> @@ -1518,6 +1523,19 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
>>>> return is_enabled;
>>>> }
>>>> +static void
>>>> +__intel_display_power_put_domain(struct drm_i915_private *dev_priv,
>>>> + enum intel_display_power_domain domain)
>>>> +{
>>>> + struct i915_power_domains *power_domains = &dev_priv->power_domains;
>>>> + struct i915_power_well *power_well;
>>>> +
>>>> + power_domains->domain_use_count[domain]--;
>>>> +
>>>> + for_each_power_domain_well_rev(dev_priv, power_well, BIT_ULL(domain))
>>>> + intel_power_well_put(dev_priv, power_well);
>>>> +}
>>>> +
>>>> /**
>>>> * intel_display_power_put - release a power domain reference
>>>> * @dev_priv: i915 device instance
>>>> @@ -1530,21 +1548,30 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
>>>> void intel_display_power_put(struct drm_i915_private *dev_priv,
>>>> enum intel_display_power_domain domain)
>>>> {
>>>> - struct i915_power_domains *power_domains;
>>>> - struct i915_power_well *power_well;
>>>> -
>>>> - power_domains = &dev_priv->power_domains;
>>>> + struct i915_power_domains *power_domains = &dev_priv->power_domains;
>>>> mutex_lock(&power_domains->lock);
>>>> + if (domain == POWER_DOMAIN_GT_IRQ) {
>>>> + /*
>>>> + * To handle asynchronous DMC loading we have to be careful to
>>>> + * keep the use count on POWER_DOMAIN_GT_IRQ balanced.
>>>> + *
>>>> + * If there was GT activity before the DMC was loaded, we will
>>>> + * have skipped obtaining the power domain so must not decrement
>>>> + * it now.
>>>> + */
>>>> + if (!power_domains->domain_use_count[domain])
>>>> + goto out;
>>>> + }
>>>
>>> Is it possible to have any GT activity before
>>> intel_power_domains_init_hw() / intel_csr_ucode_init()? I couldn't spot
>>> anything at least. Calling into the power well code before those isn't
>>> correct in any case. If so, the INIT power ref taken in
>>> intel_csr_ucode_init() makes sure the above scenario can't happen and
>>> then we don't need dmc_payload check in NEEDS_CSR_GT_PERF_WA() either.
>>
>> intel_csr_ucode_init schedules a worker to actually load the firmware so
>> unless I am missing some synchronisation point it can happen.
>
> What I meant is that the INIT domain ref we take before scheduling that
> work makes the dmc_payload check redundant both during getting and
> putting the GT_IRQ domain. Without that check if the FW is not loaded
> yet (or won't be at all) those get/puts will just inc/dec the GT_IRQ
> domain refcount, but won't result in enabling/disabling the DC-off power
> well, so will be kind of nops.
>
>> Simpler solution would be to only base the NEEDS_CSR_GT_PERF_WA on HAS_CSR,
>> but then if someone removes the firmware deliberately we lose power keeping
>> this power well up needlessly, no?
>
> If the firmware can't be loaded runtime PM as a whole is anyway disabled
> (due to the above INIT power domain ref not being released).
You know, I did not even notice the DMC code leaves the power domain
grabbed if the fw load fails. Maybe the hints were to subtle, or I would
just not expect something like that. Guess thats what you get when you
have someone unfamiliar with the area fiddle around. :)
I'll respin..
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-29 14:30 ` [PATCH v4] " Tvrtko Ursulin
2017-11-29 15:06 ` Imre Deak
@ 2017-11-30 9:18 ` Tvrtko Ursulin
2017-11-30 9:45 ` Chris Wilson
1 sibling, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-30 9:18 UTC (permalink / raw)
To: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
It seems that the DMC likes to transition between the DC states a lot when
there are no connected displays (no active power domains) during command
submission.
This activity on DC states has a negative impact on the performance of the
chip with huge latencies observed in the interrupt handlers and elsewhere.
Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
eight.
Work around it by introducing a new power domain named,
POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
held for the duration of command submission activity.
v2:
* Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
* Protect macro body with braces. (Jani Nikula)
v3:
* Add dedicated power domain for clarity. (Chris, Imre)
* Commit message and comment text updates.
* Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
firmware release.
v4:
* Power domain should be inner to device runtime pm. (Chris)
* Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
* Handle async DMC loading by moving the GT_IRQ power domain logic into
intel_runtime_pm. (Daniel, Chris)
* Include small core GEN9 as well. (Imre)
v5
* Special handling for async DMC load is not needed since on failure the
power domain reference is kept permanently taken. (Imre)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
Testcase: igt/gem_exec_nop/headless
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 4 ++++
drivers/gpu/drm/i915/i915_gem.c | 3 +++
drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
4 files changed, 35 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bddd65839f60..59cf11dd5d3b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -398,6 +398,7 @@ enum intel_display_power_domain {
POWER_DOMAIN_AUX_D,
POWER_DOMAIN_GMBUS,
POWER_DOMAIN_MODESET,
+ POWER_DOMAIN_GT_IRQ,
POWER_DOMAIN_INIT,
POWER_DOMAIN_NUM,
@@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
#define GT_FREQUENCY_MULTIPLIER 50
#define GEN9_FREQ_SCALER 3
+#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
+ (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
+
#include "i915_trace.h"
static inline bool intel_vtd_active(void)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 354b0546a191..c6067cba1dca 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
if (INTEL_GEN(dev_priv) >= 6)
gen6_rps_idle(dev_priv);
+
+ intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
+
intel_runtime_pm_put(dev_priv);
out_unlock:
mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index a90bdd26571f..c28a4ceb016d 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
GEM_BUG_ON(!i915->gt.active_requests);
intel_runtime_pm_get_noresume(i915);
+
+ /*
+ * It seems that the DMC likes to transition between the DC states a lot
+ * when there are no connected displays (no active power domains) during
+ * command submission.
+ *
+ * This activity has negative impact on the performance of the chip with
+ * huge latencies observed in the interrupt handler and elsewhere.
+ *
+ * Work around it by grabbing a GT IRQ power domain whilst there is any
+ * GT activity, preventing any DC state transitions.
+ */
+ intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+
i915->gt.awake = true;
intel_enable_gt_powersave(i915);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 8315499452dc..48ad0828ace7 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
return "INIT";
case POWER_DOMAIN_MODESET:
return "MODESET";
+ case POWER_DOMAIN_GT_IRQ:
+ return "GT_IRQ";
default:
MISSING_CASE(domain);
return "?";
@@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
{
struct i915_power_domains *power_domains = &dev_priv->power_domains;
+ if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
+ return;
+
intel_runtime_pm_get(dev_priv);
mutex_lock(&power_domains->lock);
@@ -1498,6 +1503,9 @@ bool intel_display_power_get_if_enabled(struct drm_i915_private *dev_priv,
struct i915_power_domains *power_domains = &dev_priv->power_domains;
bool is_enabled;
+ if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
+ return false;
+
if (!intel_runtime_pm_get_if_in_use(dev_priv))
return false;
@@ -1533,6 +1541,9 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
struct i915_power_domains *power_domains;
struct i915_power_well *power_well;
+ if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
+ return;
+
power_domains = &dev_priv->power_domains;
mutex_lock(&power_domains->lock);
@@ -1705,6 +1716,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
@@ -1727,6 +1739,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define BXT_DISPLAY_DC_OFF_POWER_DOMAINS ( \
BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
@@ -1785,6 +1798,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-30 9:18 ` [PATCH v5] " Tvrtko Ursulin
@ 2017-11-30 9:45 ` Chris Wilson
2017-11-30 9:55 ` Tvrtko Ursulin
2017-11-30 11:19 ` Imre Deak
0 siblings, 2 replies; 32+ messages in thread
From: Chris Wilson @ 2017-11-30 9:45 UTC (permalink / raw)
To: Tvrtko Ursulin, Intel-gfx
Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states a lot when
> there are no connected displays (no active power domains) during command
> submission.
>
> This activity on DC states has a negative impact on the performance of the
> chip with huge latencies observed in the interrupt handlers and elsewhere.
> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> eight.
>
> Work around it by introducing a new power domain named,
> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> held for the duration of command submission activity.
>
> v2:
> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> * Protect macro body with braces. (Jani Nikula)
>
> v3:
> * Add dedicated power domain for clarity. (Chris, Imre)
> * Commit message and comment text updates.
> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> firmware release.
>
> v4:
> * Power domain should be inner to device runtime pm. (Chris)
> * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> * Handle async DMC loading by moving the GT_IRQ power domain logic into
> intel_runtime_pm. (Daniel, Chris)
> * Include small core GEN9 as well. (Imre)
>
> v5
> * Special handling for async DMC load is not needed since on failure the
> power domain reference is kept permanently taken. (Imre)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 4 ++++
> drivers/gpu/drm/i915/i915_gem.c | 3 +++
> drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
> 4 files changed, 35 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bddd65839f60..59cf11dd5d3b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> POWER_DOMAIN_AUX_D,
> POWER_DOMAIN_GMBUS,
> POWER_DOMAIN_MODESET,
> + POWER_DOMAIN_GT_IRQ,
> POWER_DOMAIN_INIT,
>
> POWER_DOMAIN_NUM,
> @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
> #define GT_FREQUENCY_MULTIPLIER 50
> #define GEN9_FREQ_SCALER 3
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
> +
> #include "i915_trace.h"
>
> static inline bool intel_vtd_active(void)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 354b0546a191..c6067cba1dca 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
>
> if (INTEL_GEN(dev_priv) >= 6)
> gen6_rps_idle(dev_priv);
> +
> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> +
> intel_runtime_pm_put(dev_priv);
> out_unlock:
> mutex_unlock(&dev_priv->drm.struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index a90bdd26571f..c28a4ceb016d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> GEM_BUG_ON(!i915->gt.active_requests);
>
> intel_runtime_pm_get_noresume(i915);
> +
> + /*
> + * It seems that the DMC likes to transition between the DC states a lot
> + * when there are no connected displays (no active power domains) during
> + * command submission.
> + *
> + * This activity has negative impact on the performance of the chip with
> + * huge latencies observed in the interrupt handler and elsewhere.
> + *
> + * Work around it by grabbing a GT IRQ power domain whilst there is any
> + * GT activity, preventing any DC state transitions.
> + */
> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> +
> i915->gt.awake = true;
>
> intel_enable_gt_powersave(i915);
> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> index 8315499452dc..48ad0828ace7 100644
> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> return "INIT";
> case POWER_DOMAIN_MODESET:
> return "MODESET";
> + case POWER_DOMAIN_GT_IRQ:
> + return "GT_IRQ";
> default:
> MISSING_CASE(domain);
> return "?";
> @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
> {
> struct i915_power_domains *power_domains = &dev_priv->power_domains;
>
> + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> + return;
I was hoping that in the magic of the powerdomains, this would just
evaporate (or at least find its own place within the powerwell enabling).
If Imre is happy with plonking this here,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Or at least good enough for now and can be improved upon?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-30 9:45 ` Chris Wilson
@ 2017-11-30 9:55 ` Tvrtko Ursulin
2017-11-30 11:19 ` Imre Deak
1 sibling, 0 replies; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-11-30 9:55 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, Intel-gfx
On 30/11/2017 09:45, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> It seems that the DMC likes to transition between the DC states a lot when
>> there are no connected displays (no active power domains) during command
>> submission.
>>
>> This activity on DC states has a negative impact on the performance of the
>> chip with huge latencies observed in the interrupt handlers and elsewhere.
>> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
>> eight.
>>
>> Work around it by introducing a new power domain named,
>> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
>> held for the duration of command submission activity.
>>
>> v2:
>> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
>> * Protect macro body with braces. (Jani Nikula)
>>
>> v3:
>> * Add dedicated power domain for clarity. (Chris, Imre)
>> * Commit message and comment text updates.
>> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
>> firmware release.
>>
>> v4:
>> * Power domain should be inner to device runtime pm. (Chris)
>> * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
>> * Handle async DMC loading by moving the GT_IRQ power domain logic into
>> intel_runtime_pm. (Daniel, Chris)
>> * Include small core GEN9 as well. (Imre)
>>
>> v5
>> * Special handling for async DMC load is not needed since on failure the
>> power domain reference is kept permanently taken. (Imre)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
>> Testcase: igt/gem_exec_nop/headless
>> Cc: Imre Deak <imre.deak@intel.com>
>> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 4 ++++
>> drivers/gpu/drm/i915/i915_gem.c | 3 +++
>> drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
>> drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
>> 4 files changed, 35 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index bddd65839f60..59cf11dd5d3b 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -398,6 +398,7 @@ enum intel_display_power_domain {
>> POWER_DOMAIN_AUX_D,
>> POWER_DOMAIN_GMBUS,
>> POWER_DOMAIN_MODESET,
>> + POWER_DOMAIN_GT_IRQ,
>> POWER_DOMAIN_INIT,
>>
>> POWER_DOMAIN_NUM,
>> @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
>> #define GT_FREQUENCY_MULTIPLIER 50
>> #define GEN9_FREQ_SCALER 3
>>
>> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
>> + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
>> +
>> #include "i915_trace.h"
>>
>> static inline bool intel_vtd_active(void)
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 354b0546a191..c6067cba1dca 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
>>
>> if (INTEL_GEN(dev_priv) >= 6)
>> gen6_rps_idle(dev_priv);
>> +
>> + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
>> +
>> intel_runtime_pm_put(dev_priv);
>> out_unlock:
>> mutex_unlock(&dev_priv->drm.struct_mutex);
>> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
>> index a90bdd26571f..c28a4ceb016d 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_request.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
>> @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
>> GEM_BUG_ON(!i915->gt.active_requests);
>>
>> intel_runtime_pm_get_noresume(i915);
>> +
>> + /*
>> + * It seems that the DMC likes to transition between the DC states a lot
>> + * when there are no connected displays (no active power domains) during
>> + * command submission.
>> + *
>> + * This activity has negative impact on the performance of the chip with
>> + * huge latencies observed in the interrupt handler and elsewhere.
>> + *
>> + * Work around it by grabbing a GT IRQ power domain whilst there is any
>> + * GT activity, preventing any DC state transitions.
>> + */
>> + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
>> +
>> i915->gt.awake = true;
>>
>> intel_enable_gt_powersave(i915);
>> diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> index 8315499452dc..48ad0828ace7 100644
>> --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
>> @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
>> return "INIT";
>> case POWER_DOMAIN_MODESET:
>> return "MODESET";
>> + case POWER_DOMAIN_GT_IRQ:
>> + return "GT_IRQ";
>> default:
>> MISSING_CASE(domain);
>> return "?";
>> @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
>> {
>> struct i915_power_domains *power_domains = &dev_priv->power_domains;
>>
>> + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
>> + return;
>
> I was hoping that in the magic of the powerdomains, this would just
> evaporate (or at least find its own place within the powerwell enabling).
>
> If Imre is happy with plonking this here,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Or at least good enough for now and can be improved upon?
One of my earlier approaches was to remove the domain from the power
well bitmask when the DMC load fails, but I wasn't sure if the power
well arrays should (could?) really be const, and how ugly it would be to
expose some API to allow external callers to do that. Not sure, Imre
indeed to the rescue please. :)
Regards,
Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-30 9:45 ` Chris Wilson
2017-11-30 9:55 ` Tvrtko Ursulin
@ 2017-11-30 11:19 ` Imre Deak
2017-12-01 23:58 ` Rogozhkin, Dmitry V
2017-12-02 0:05 ` [PATCH v5] " Rogozhkin, Dmitry V
1 sibling, 2 replies; 32+ messages in thread
From: Imre Deak @ 2017-11-30 11:19 UTC (permalink / raw)
To: Chris Wilson; +Cc: Intel-gfx
On Thu, Nov 30, 2017 at 09:45:25AM +0000, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> > It seems that the DMC likes to transition between the DC states a lot when
> > there are no connected displays (no active power domains) during command
> > submission.
> >
> > This activity on DC states has a negative impact on the performance of the
> > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > eight.
> >
> > Work around it by introducing a new power domain named,
> > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > held for the duration of command submission activity.
> >
> > v2:
> > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > * Protect macro body with braces. (Jani Nikula)
> >
> > v3:
> > * Add dedicated power domain for clarity. (Chris, Imre)
> > * Commit message and comment text updates.
> > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > firmware release.
> >
> > v4:
> > * Power domain should be inner to device runtime pm. (Chris)
> > * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> > * Handle async DMC loading by moving the GT_IRQ power domain logic into
> > intel_runtime_pm. (Daniel, Chris)
> > * Include small core GEN9 as well. (Imre)
> >
> > v5
> > * Special handling for async DMC load is not needed since on failure the
> > power domain reference is kept permanently taken. (Imre)
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > Testcase: igt/gem_exec_nop/headless
> > Cc: Imre Deak <imre.deak@intel.com>
> > Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_drv.h | 4 ++++
> > drivers/gpu/drm/i915/i915_gem.c | 3 +++
> > drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> > drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
> > 4 files changed, 35 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index bddd65839f60..59cf11dd5d3b 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > POWER_DOMAIN_AUX_D,
> > POWER_DOMAIN_GMBUS,
> > POWER_DOMAIN_MODESET,
> > + POWER_DOMAIN_GT_IRQ,
> > POWER_DOMAIN_INIT,
> >
> > POWER_DOMAIN_NUM,
> > @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
> > #define GT_FREQUENCY_MULTIPLIER 50
> > #define GEN9_FREQ_SCALER 3
> >
> > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
For all other platforms the GT_IRQ domain won't be mapped making
display_power_get/put on those just a domain ref inc/dec, otherwise a
nop.
> > +
> > #include "i915_trace.h"
> >
> > static inline bool intel_vtd_active(void)
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 354b0546a191..c6067cba1dca 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >
> > if (INTEL_GEN(dev_priv) >= 6)
> > gen6_rps_idle(dev_priv);
> > +
> > + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> > +
> > intel_runtime_pm_put(dev_priv);
> > out_unlock:
> > mutex_unlock(&dev_priv->drm.struct_mutex);
> > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > index a90bdd26571f..c28a4ceb016d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> > GEM_BUG_ON(!i915->gt.active_requests);
> >
> > intel_runtime_pm_get_noresume(i915);
> > +
> > + /*
> > + * It seems that the DMC likes to transition between the DC states a lot
> > + * when there are no connected displays (no active power domains) during
> > + * command submission.
> > + *
> > + * This activity has negative impact on the performance of the chip with
> > + * huge latencies observed in the interrupt handler and elsewhere.
> > + *
> > + * Work around it by grabbing a GT IRQ power domain whilst there is any
> > + * GT activity, preventing any DC state transitions.
> > + */
> > + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> > +
> > i915->gt.awake = true;
> >
> > intel_enable_gt_powersave(i915);
> > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > index 8315499452dc..48ad0828ace7 100644
> > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> > return "INIT";
> > case POWER_DOMAIN_MODESET:
> > return "MODESET";
> > + case POWER_DOMAIN_GT_IRQ:
> > + return "GT_IRQ";
> > default:
> > MISSING_CASE(domain);
> > return "?";
> > @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
> > {
> > struct i915_power_domains *power_domains = &dev_priv->power_domains;
> >
> > + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> > + return;
>
> I was hoping that in the magic of the powerdomains, this would just
> evaporate (or at least find its own place within the powerwell enabling).
Yep, that would be "normally" the case. But since we have only a single
set of domains->powerwells mapping for all GEN9_BC we can't isolate SKL
that way w/o duplicating the mapping for SKL. Although all GEN9_BC are
the same from power well handling POV for some reason there are still
separate DMC firmwares for KBL/CFL and SKL. The corruption problem is
only fixed in the KBL/CFL firmware the SKL one still to be added to
linux-firmware.git .. At that point we can also remove these checks.
>
> If Imre is happy with plonking this here,
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Or at least good enough for now and can be improved upon?
Yes, looks ok and can adjust for SKL later.
> -Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev5)
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (4 preceding siblings ...)
2017-11-29 14:30 ` [PATCH v4] " Tvrtko Ursulin
@ 2017-11-30 15:56 ` Patchwork
2017-12-05 15:06 ` ✗ Fi.CI.BAT: failure for drm/i915: Restore GT performance in headless mode with DMC loaded (rev6) Patchwork
2017-12-05 17:42 ` ✗ Fi.CI.BAT: warning " Patchwork
7 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2017-11-30 15:56 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Restore GT performance in headless mode with DMC loaded (rev5)
URL : https://patchwork.freedesktop.org/series/24017/
State : warning
== Summary ==
Series 24017v5 drm/i915: Restore GT performance in headless mode with DMC loaded
https://patchwork.freedesktop.org/api/1.0/series/24017/revisions/5/mbox/
Test gem_exec_reloc:
Subgroup basic-cpu-active:
fail -> PASS (fi-gdg-551) fdo#102582 +1
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-b:
pass -> DMESG-WARN (fi-kbl-r)
fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:439s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:446s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:389s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:512s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:282s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:504s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:508s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:486s
fi-elk-e7500 total:224 pass:162 dwarn:16 dfail:0 fail:0 skip:45
fi-gdg-551 total:288 pass:178 dwarn:1 dfail:0 fail:1 skip:108 time:267s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:541s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:357s
fi-hsw-4770r total:288 pass:224 dwarn:0 dfail:0 fail:0 skip:64 time:259s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:392s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:484s
fi-ivb-3770 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:450s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:484s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:530s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:474s
fi-kbl-r total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:528s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:596s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:459s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:541s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:567s
fi-skl-6700k total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:514s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:495s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:450s
fi-snb-2520m total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:555s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:416s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:609s
fi-glk-dsi total:288 pass:257 dwarn:0 dfail:0 fail:1 skip:30 time:496s
fi-byt-n2820 failed to collect. IGT log at Patchwork_7375/fi-byt-n2820/igt.log
6d6c48b9b35806aba461d2c8285db2689de9095f drm-tip: 2017y-11m-30d-12h-22m-59s UTC integration manifest
9e3d6f43e0a8 drm/i915: Restore GT performance in headless mode with DMC loaded
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7375/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-30 11:19 ` Imre Deak
@ 2017-12-01 23:58 ` Rogozhkin, Dmitry V
2017-12-05 13:08 ` Imre Deak
2017-12-02 0:05 ` [PATCH v5] " Rogozhkin, Dmitry V
1 sibling, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-12-01 23:58 UTC (permalink / raw)
To: Deak, Imre; +Cc: Intel-gfx
On Thu, 2017-11-30 at 13:19 +0200, Imre Deak wrote:
> On Thu, Nov 30, 2017 at 09:45:25AM +0000, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
> > > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > >
> > > It seems that the DMC likes to transition between the DC states a lot when
> > > there are no connected displays (no active power domains) during command
> > > submission.
> > >
> > > This activity on DC states has a negative impact on the performance of the
> > > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > > eight.
> > >
> > > Work around it by introducing a new power domain named,
> > > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > > held for the duration of command submission activity.
> > >
> > > v2:
> > > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > > * Protect macro body with braces. (Jani Nikula)
> > >
> > > v3:
> > > * Add dedicated power domain for clarity. (Chris, Imre)
> > > * Commit message and comment text updates.
> > > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > > firmware release.
> > >
> > > v4:
> > > * Power domain should be inner to device runtime pm. (Chris)
> > > * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> > > * Handle async DMC loading by moving the GT_IRQ power domain logic into
> > > intel_runtime_pm. (Daniel, Chris)
> > > * Include small core GEN9 as well. (Imre)
> > >
> > > v5
> > > * Special handling for async DMC load is not needed since on failure the
> > > power domain reference is kept permanently taken. (Imre)
> > >
> > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > > Testcase: igt/gem_exec_nop/headless
> > > Cc: Imre Deak <imre.deak@intel.com>
> > > Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_drv.h | 4 ++++
> > > drivers/gpu/drm/i915/i915_gem.c | 3 +++
> > > drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> > > drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
> > > 4 files changed, 35 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index bddd65839f60..59cf11dd5d3b 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > > POWER_DOMAIN_AUX_D,
> > > POWER_DOMAIN_GMBUS,
> > > POWER_DOMAIN_MODESET,
> > > + POWER_DOMAIN_GT_IRQ,
> > > POWER_DOMAIN_INIT,
> > >
> > > POWER_DOMAIN_NUM,
> > > @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
> > > #define GT_FREQUENCY_MULTIPLIER 50
> > > #define GEN9_FREQ_SCALER 3
> > >
> > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
>
> Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
> For all other platforms the GT_IRQ domain won't be mapped making
> display_power_get/put on those just a domain ref inc/dec, otherwise a
> nop.
Folks, is my understanding correct that once SKL DMC will be merged we
just remove !IS_SKYLAKE from the above check to get the performance fix
for SKL and nothing more changes in the patch? I am asking because there
is a need to have SKL patch on the plate already to verify the fix
sooner rather than later. From this perspective, can we have one more
patch in the series dedicated to fix SKL right now? After all, SKL DMC
is available in drm-firmware already, those who need can try it.
>
> > > +
> > > #include "i915_trace.h"
> > >
> > > static inline bool intel_vtd_active(void)
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 354b0546a191..c6067cba1dca 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
> > >
> > > if (INTEL_GEN(dev_priv) >= 6)
> > > gen6_rps_idle(dev_priv);
> > > +
> > > + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> > > +
> > > intel_runtime_pm_put(dev_priv);
> > > out_unlock:
> > > mutex_unlock(&dev_priv->drm.struct_mutex);
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > > index a90bdd26571f..c28a4ceb016d 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > > @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> > > GEM_BUG_ON(!i915->gt.active_requests);
> > >
> > > intel_runtime_pm_get_noresume(i915);
> > > +
> > > + /*
> > > + * It seems that the DMC likes to transition between the DC states a lot
> > > + * when there are no connected displays (no active power domains) during
> > > + * command submission.
> > > + *
> > > + * This activity has negative impact on the performance of the chip with
> > > + * huge latencies observed in the interrupt handler and elsewhere.
> > > + *
> > > + * Work around it by grabbing a GT IRQ power domain whilst there is any
> > > + * GT activity, preventing any DC state transitions.
> > > + */
> > > + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> > > +
> > > i915->gt.awake = true;
> > >
> > > intel_enable_gt_powersave(i915);
> > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > index 8315499452dc..48ad0828ace7 100644
> > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> > > return "INIT";
> > > case POWER_DOMAIN_MODESET:
> > > return "MODESET";
> > > + case POWER_DOMAIN_GT_IRQ:
> > > + return "GT_IRQ";
> > > default:
> > > MISSING_CASE(domain);
> > > return "?";
> > > @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
> > > {
> > > struct i915_power_domains *power_domains = &dev_priv->power_domains;
> > >
> > > + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> > > + return;
> >
> > I was hoping that in the magic of the powerdomains, this would just
> > evaporate (or at least find its own place within the powerwell enabling).
>
> Yep, that would be "normally" the case. But since we have only a single
> set of domains->powerwells mapping for all GEN9_BC we can't isolate SKL
> that way w/o duplicating the mapping for SKL. Although all GEN9_BC are
> the same from power well handling POV for some reason there are still
> separate DMC firmwares for KBL/CFL and SKL. The corruption problem is
> only fixed in the KBL/CFL firmware the SKL one still to be added to
> linux-firmware.git .. At that point we can also remove these checks.
>
> >
> > If Imre is happy with plonking this here,
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Or at least good enough for now and can be improved upon?
>
> Yes, looks ok and can adjust for SKL later.
>
> > -Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-11-30 11:19 ` Imre Deak
2017-12-01 23:58 ` Rogozhkin, Dmitry V
@ 2017-12-02 0:05 ` Rogozhkin, Dmitry V
2017-12-05 13:09 ` Imre Deak
1 sibling, 1 reply; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-12-02 0:05 UTC (permalink / raw)
To: Deak, Imre; +Cc: Intel-gfx
On Thu, 2017-11-30 at 13:19 +0200, Imre Deak wrote:
> > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !
> IS_SKYLAKE(dev_priv))
>
> Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
> For all other platforms the GT_IRQ domain won't be mapped making
> display_power_get/put on those just a domain ref inc/dec, otherwise a
> nop.
Why not +&& !IS_BROXTON(dev_priv) by the way?
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-12-01 23:58 ` Rogozhkin, Dmitry V
@ 2017-12-05 13:08 ` Imre Deak
2017-12-05 13:28 ` [PATCH v6] " Tvrtko Ursulin
0 siblings, 1 reply; 32+ messages in thread
From: Imre Deak @ 2017-12-05 13:08 UTC (permalink / raw)
To: Rogozhkin, Dmitry V; +Cc: Intel-gfx
On Sat, Dec 02, 2017 at 01:58:12AM +0200, Rogozhkin, Dmitry V wrote:
> On Thu, 2017-11-30 at 13:19 +0200, Imre Deak wrote:
> > On Thu, Nov 30, 2017 at 09:45:25AM +0000, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2017-11-30 09:18:20)
> > > > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > >
> > > > It seems that the DMC likes to transition between the DC states a lot when
> > > > there are no connected displays (no active power domains) during command
> > > > submission.
> > > >
> > > > This activity on DC states has a negative impact on the performance of the
> > > > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > > > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > > > eight.
> > > >
> > > > Work around it by introducing a new power domain named,
> > > > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > > > held for the duration of command submission activity.
> > > >
> > > > v2:
> > > > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > > > * Protect macro body with braces. (Jani Nikula)
> > > >
> > > > v3:
> > > > * Add dedicated power domain for clarity. (Chris, Imre)
> > > > * Commit message and comment text updates.
> > > > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > > > firmware release.
> > > >
> > > > v4:
> > > > * Power domain should be inner to device runtime pm. (Chris)
> > > > * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> > > > * Handle async DMC loading by moving the GT_IRQ power domain logic into
> > > > intel_runtime_pm. (Daniel, Chris)
> > > > * Include small core GEN9 as well. (Imre)
> > > >
> > > > v5
> > > > * Special handling for async DMC load is not needed since on failure the
> > > > power domain reference is kept permanently taken. (Imre)
> > > >
> > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > > > Testcase: igt/gem_exec_nop/headless
> > > > Cc: Imre Deak <imre.deak@intel.com>
> > > > Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> > > > ---
> > > > drivers/gpu/drm/i915/i915_drv.h | 4 ++++
> > > > drivers/gpu/drm/i915/i915_gem.c | 3 +++
> > > > drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
> > > > drivers/gpu/drm/i915/intel_runtime_pm.c | 14 ++++++++++++++
> > > > 4 files changed, 35 insertions(+)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index bddd65839f60..59cf11dd5d3b 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -398,6 +398,7 @@ enum intel_display_power_domain {
> > > > POWER_DOMAIN_AUX_D,
> > > > POWER_DOMAIN_GMBUS,
> > > > POWER_DOMAIN_MODESET,
> > > > + POWER_DOMAIN_GT_IRQ,
> > > > POWER_DOMAIN_INIT,
> > > >
> > > > POWER_DOMAIN_NUM,
> > > > @@ -3285,6 +3286,9 @@ intel_info(const struct drm_i915_private *dev_priv)
> > > > #define GT_FREQUENCY_MULTIPLIER 50
> > > > #define GEN9_FREQ_SCALER 3
> > > >
> > > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > > + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !IS_SKYLAKE(dev_priv))
> >
> > Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
> > For all other platforms the GT_IRQ domain won't be mapped making
> > display_power_get/put on those just a domain ref inc/dec, otherwise a
> > nop.
>
> Folks, is my understanding correct that once SKL DMC will be merged we
> just remove !IS_SKYLAKE from the above check to get the performance fix
> for SKL and nothing more changes in the patch? I am asking because there
> is a need to have SKL patch on the plate already to verify the fix
> sooner rather than later. From this perspective, can we have one more
> patch in the series dedicated to fix SKL right now? After all, SKL DMC
> is available in drm-firmware already, those who need can try it.
The fixed SKL firmware is now in linux-firmware.git.
Tvrtko, could you resend the patch without the NEEDS_CSR_GT_PERF_WA()
check?
Thanks,
Imre
>
> >
> > > > +
> > > > #include "i915_trace.h"
> > > >
> > > > static inline bool intel_vtd_active(void)
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 354b0546a191..c6067cba1dca 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
> > > >
> > > > if (INTEL_GEN(dev_priv) >= 6)
> > > > gen6_rps_idle(dev_priv);
> > > > +
> > > > + intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> > > > +
> > > > intel_runtime_pm_put(dev_priv);
> > > > out_unlock:
> > > > mutex_unlock(&dev_priv->drm.struct_mutex);
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > > > index a90bdd26571f..c28a4ceb016d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > > > @@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
> > > > GEM_BUG_ON(!i915->gt.active_requests);
> > > >
> > > > intel_runtime_pm_get_noresume(i915);
> > > > +
> > > > + /*
> > > > + * It seems that the DMC likes to transition between the DC states a lot
> > > > + * when there are no connected displays (no active power domains) during
> > > > + * command submission.
> > > > + *
> > > > + * This activity has negative impact on the performance of the chip with
> > > > + * huge latencies observed in the interrupt handler and elsewhere.
> > > > + *
> > > > + * Work around it by grabbing a GT IRQ power domain whilst there is any
> > > > + * GT activity, preventing any DC state transitions.
> > > > + */
> > > > + intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
> > > > +
> > > > i915->gt.awake = true;
> > > >
> > > > intel_enable_gt_powersave(i915);
> > > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > > index 8315499452dc..48ad0828ace7 100644
> > > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> > > > @@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
> > > > return "INIT";
> > > > case POWER_DOMAIN_MODESET:
> > > > return "MODESET";
> > > > + case POWER_DOMAIN_GT_IRQ:
> > > > + return "GT_IRQ";
> > > > default:
> > > > MISSING_CASE(domain);
> > > > return "?";
> > > > @@ -1471,6 +1473,9 @@ void intel_display_power_get(struct drm_i915_private *dev_priv,
> > > > {
> > > > struct i915_power_domains *power_domains = &dev_priv->power_domains;
> > > >
> > > > + if (domain == POWER_DOMAIN_GT_IRQ && !NEEDS_CSR_GT_PERF_WA(dev_priv))
> > > > + return;
> > >
> > > I was hoping that in the magic of the powerdomains, this would just
> > > evaporate (or at least find its own place within the powerwell enabling).
> >
> > Yep, that would be "normally" the case. But since we have only a single
> > set of domains->powerwells mapping for all GEN9_BC we can't isolate SKL
> > that way w/o duplicating the mapping for SKL. Although all GEN9_BC are
> > the same from power well handling POV for some reason there are still
> > separate DMC firmwares for KBL/CFL and SKL. The corruption problem is
> > only fixed in the KBL/CFL firmware the SKL one still to be added to
> > linux-firmware.git .. At that point we can also remove these checks.
> >
> > >
> > > If Imre is happy with plonking this here,
> > > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> > >
> > > Or at least good enough for now and can be improved upon?
> >
> > Yes, looks ok and can adjust for SKL later.
> >
> > > -Chris
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-12-02 0:05 ` [PATCH v5] " Rogozhkin, Dmitry V
@ 2017-12-05 13:09 ` Imre Deak
2017-12-05 19:01 ` Rogozhkin, Dmitry V
0 siblings, 1 reply; 32+ messages in thread
From: Imre Deak @ 2017-12-05 13:09 UTC (permalink / raw)
To: Rogozhkin, Dmitry V; +Cc: Intel-gfx
On Sat, Dec 02, 2017 at 02:05:42AM +0200, Rogozhkin, Dmitry V wrote:
> On Thu, 2017-11-30 at 13:19 +0200, Imre Deak wrote:
> > > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > > + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !
> > IS_SKYLAKE(dev_priv))
> >
> > Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
> > For all other platforms the GT_IRQ domain won't be mapped making
> > display_power_get/put on those just a domain ref inc/dec, otherwise a
> > nop.
>
> Why not +&& !IS_BROXTON(dev_priv) by the way?
We have the same slow-down problem on APL/BXT (and we don't have the DC6
corruption problem there).
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v6] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-12-05 13:08 ` Imre Deak
@ 2017-12-05 13:28 ` Tvrtko Ursulin
2017-12-05 13:35 ` Chris Wilson
0 siblings, 1 reply; 32+ messages in thread
From: Tvrtko Ursulin @ 2017-12-05 13:28 UTC (permalink / raw)
To: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
It seems that the DMC likes to transition between the DC states a lot when
there are no connected displays (no active power domains) during command
submission.
This activity on DC states has a negative impact on the performance of the
chip with huge latencies observed in the interrupt handlers and elsewhere.
Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
eight.
Work around it by introducing a new power domain named,
POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
held for the duration of command submission activity.
v2:
* Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
* Protect macro body with braces. (Jani Nikula)
v3:
* Add dedicated power domain for clarity. (Chris, Imre)
* Commit message and comment text updates.
* Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
firmware release.
v4:
* Power domain should be inner to device runtime pm. (Chris)
* Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
* Handle async DMC loading by moving the GT_IRQ power domain logic into
intel_runtime_pm. (Daniel, Chris)
* Include small core GEN9 as well. (Imre)
v5
* Special handling for async DMC load is not needed since on failure the
power domain reference is kept permanently taken. (Imre)
v6:
* Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been
deployed. (Imre, Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
Testcase: igt/gem_exec_nop/headless
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gem.c | 3 +++
drivers/gpu/drm/i915/i915_gem_request.c | 14 ++++++++++++++
drivers/gpu/drm/i915/intel_runtime_pm.c | 5 +++++
4 files changed, 23 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 594fd14e66c5..c56dccdf52e9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -398,6 +398,7 @@ enum intel_display_power_domain {
POWER_DOMAIN_AUX_D,
POWER_DOMAIN_GMBUS,
POWER_DOMAIN_MODESET,
+ POWER_DOMAIN_GT_IRQ,
POWER_DOMAIN_INIT,
POWER_DOMAIN_NUM,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e083f242b8dc..a8879854000b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3381,6 +3381,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
if (INTEL_GEN(dev_priv) >= 6)
gen6_rps_idle(dev_priv);
+
+ intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
+
intel_runtime_pm_put(dev_priv);
out_unlock:
mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index a90bdd26571f..c28a4ceb016d 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -252,6 +252,20 @@ static void mark_busy(struct drm_i915_private *i915)
GEM_BUG_ON(!i915->gt.active_requests);
intel_runtime_pm_get_noresume(i915);
+
+ /*
+ * It seems that the DMC likes to transition between the DC states a lot
+ * when there are no connected displays (no active power domains) during
+ * command submission.
+ *
+ * This activity has negative impact on the performance of the chip with
+ * huge latencies observed in the interrupt handler and elsewhere.
+ *
+ * Work around it by grabbing a GT IRQ power domain whilst there is any
+ * GT activity, preventing any DC state transitions.
+ */
+ intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
+
i915->gt.awake = true;
intel_enable_gt_powersave(i915);
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 8315499452dc..96ab74f3d101 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -130,6 +130,8 @@ intel_display_power_domain_str(enum intel_display_power_domain domain)
return "INIT";
case POWER_DOMAIN_MODESET:
return "MODESET";
+ case POWER_DOMAIN_GT_IRQ:
+ return "GT_IRQ";
default:
MISSING_CASE(domain);
return "?";
@@ -1705,6 +1707,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define SKL_DISPLAY_DC_OFF_POWER_DOMAINS ( \
SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
@@ -1727,6 +1730,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define BXT_DISPLAY_DC_OFF_POWER_DOMAINS ( \
BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
@@ -1785,6 +1789,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
BIT_ULL(POWER_DOMAIN_INIT))
#define GLK_DISPLAY_DC_OFF_POWER_DOMAINS ( \
GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS | \
+ BIT_ULL(POWER_DOMAIN_GT_IRQ) | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_INIT))
--
2.14.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v6] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-12-05 13:28 ` [PATCH v6] " Tvrtko Ursulin
@ 2017-12-05 13:35 ` Chris Wilson
2017-12-08 10:46 ` Imre Deak
0 siblings, 1 reply; 32+ messages in thread
From: Chris Wilson @ 2017-12-05 13:35 UTC (permalink / raw)
To: Tvrtko Ursulin, Intel-gfx
Quoting Tvrtko Ursulin (2017-12-05 13:28:54)
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states a lot when
> there are no connected displays (no active power domains) during command
> submission.
>
> This activity on DC states has a negative impact on the performance of the
> chip with huge latencies observed in the interrupt handlers and elsewhere.
> Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> eight.
>
> Work around it by introducing a new power domain named,
> POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> held for the duration of command submission activity.
>
> v2:
> * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> * Protect macro body with braces. (Jani Nikula)
>
> v3:
> * Add dedicated power domain for clarity. (Chris, Imre)
> * Commit message and comment text updates.
> * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> firmware release.
>
> v4:
> * Power domain should be inner to device runtime pm. (Chris)
> * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> * Handle async DMC loading by moving the GT_IRQ power domain logic into
> intel_runtime_pm. (Daniel, Chris)
> * Include small core GEN9 as well. (Imre)
>
> v5
> * Special handling for async DMC load is not needed since on failure the
> power domain reference is kept permanently taken. (Imre)
>
> v6:
> * Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been
> deployed. (Imre, Chris)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5)
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Could we add a note that Cannonlake is excluded and needs to be checked
as to whether it suffers the same?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* ✗ Fi.CI.BAT: failure for drm/i915: Restore GT performance in headless mode with DMC loaded (rev6)
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (5 preceding siblings ...)
2017-11-30 15:56 ` ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev5) Patchwork
@ 2017-12-05 15:06 ` Patchwork
2017-12-05 17:42 ` ✗ Fi.CI.BAT: warning " Patchwork
7 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2017-12-05 15:06 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Restore GT performance in headless mode with DMC loaded (rev6)
URL : https://patchwork.freedesktop.org/series/24017/
State : failure
== Summary ==
Series 24017v6 drm/i915: Restore GT performance in headless mode with DMC loaded
https://patchwork.freedesktop.org/api/1.0/series/24017/revisions/6/mbox/
Test debugfs_test:
Subgroup read_all_entries:
dmesg-warn -> DMESG-FAIL (fi-elk-e7500) fdo#103989 +4
Test gem_exec_reloc:
Subgroup basic-gtt-read:
incomplete -> PASS (fi-byt-j1900)
Test gem_ringfill:
Subgroup basic-default-hang:
dmesg-warn -> PASS (fi-elk-e7500)
Test gem_sync:
Subgroup basic-all:
skip -> PASS (fi-elk-e7500)
Subgroup basic-each:
skip -> PASS (fi-elk-e7500)
Subgroup basic-many-each:
skip -> PASS (fi-elk-e7500)
Subgroup basic-store-all:
skip -> PASS (fi-elk-e7500)
Subgroup basic-store-each:
skip -> PASS (fi-elk-e7500)
Test gem_tiled_blits:
Subgroup basic:
skip -> PASS (fi-elk-e7500)
Test gem_tiled_fence_blits:
Subgroup basic:
skip -> PASS (fi-elk-e7500)
Test gem_wait:
Subgroup basic-busy-all:
skip -> PASS (fi-elk-e7500)
Subgroup basic-wait-all:
skip -> PASS (fi-elk-e7500)
Subgroup basic-await-all:
skip -> PASS (fi-elk-e7500)
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
pass -> DMESG-WARN (fi-kbl-r)
Subgroup suspend-read-crc-pipe-b:
incomplete -> PASS (fi-snb-2520m) fdo#103713
Test drv_module_reload:
Subgroup basic-reload:
pass -> INCOMPLETE (fi-bwr-2160)
fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:436s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:447s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:388s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:517s
fi-bwr-2160 total:285 pass:180 dwarn:0 dfail:0 fail:0 skip:104
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:505s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:510s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:486s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:480s
fi-elk-e7500 total:224 pass:163 dwarn:14 dfail:1 fail:0 skip:45
fi-gdg-551 total:288 pass:178 dwarn:1 dfail:0 fail:1 skip:108 time:264s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:541s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:375s
fi-hsw-4770r total:288 pass:224 dwarn:0 dfail:0 fail:0 skip:64 time:257s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:393s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:482s
fi-ivb-3770 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:452s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:485s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:530s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:473s
fi-kbl-r total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:535s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:592s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:459s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:536s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:566s
fi-skl-6700k total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:515s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:504s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:445s
fi-snb-2520m total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:544s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:415s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:610s
fi-cnl-y total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:627s
fi-glk-dsi total:288 pass:257 dwarn:0 dfail:0 fail:1 skip:30 time:499s
73c6ae44485ad50f39ca0ed585a2ff3104ead5fb drm-tip: 2017y-12m-05d-14h-00m-03s UTC integration manifest
79f077ade7d3 drm/i915: Restore GT performance in headless mode with DMC loaded
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7410/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev6)
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (6 preceding siblings ...)
2017-12-05 15:06 ` ✗ Fi.CI.BAT: failure for drm/i915: Restore GT performance in headless mode with DMC loaded (rev6) Patchwork
@ 2017-12-05 17:42 ` Patchwork
7 siblings, 0 replies; 32+ messages in thread
From: Patchwork @ 2017-12-05 17:42 UTC (permalink / raw)
To: Tvrtko Ursulin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Restore GT performance in headless mode with DMC loaded (rev6)
URL : https://patchwork.freedesktop.org/series/24017/
State : warning
== Summary ==
Series 24017v6 drm/i915: Restore GT performance in headless mode with DMC loaded
https://patchwork.freedesktop.org/api/1.0/series/24017/revisions/6/mbox/
Test debugfs_test:
Subgroup read_all_entries:
dmesg-warn -> PASS (fi-bdw-gvtdvm) fdo#103938 +1
Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
pass -> DMESG-WARN (fi-kbl-r)
fdo#103938 https://bugs.freedesktop.org/show_bug.cgi?id=103938
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:439s
fi-bdw-gvtdvm total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:445s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:386s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:528s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:282s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:505s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:504s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:482s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:479s
fi-elk-e7500 total:224 pass:163 dwarn:15 dfail:0 fail:0 skip:45
fi-gdg-551 total:288 pass:178 dwarn:1 dfail:0 fail:1 skip:108 time:271s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:540s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:363s
fi-hsw-4770r total:288 pass:224 dwarn:0 dfail:0 fail:0 skip:64 time:261s
fi-ilk-650 total:288 pass:228 dwarn:0 dfail:0 fail:0 skip:60 time:399s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:468s
fi-ivb-3770 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:450s
fi-kbl-7500u total:288 pass:263 dwarn:1 dfail:0 fail:0 skip:24 time:486s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:529s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:477s
fi-kbl-r total:288 pass:260 dwarn:1 dfail:0 fail:0 skip:27 time:534s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:589s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:455s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:543s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:568s
fi-skl-6700k total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:513s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:507s
fi-skl-gvtdvm total:288 pass:265 dwarn:0 dfail:0 fail:0 skip:23 time:451s
fi-snb-2520m total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:549s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:413s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:616s
fi-cnl-y total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:629s
fi-glk-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:489s
0d0fe916f52ad8f05dddab384ae7c90bb62ebac4 drm-tip: 2017y-12m-05d-14h-52m-17s UTC integration manifest
02f30d0e595d drm/i915: Restore GT performance in headless mode with DMC loaded
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7414/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-12-05 13:09 ` Imre Deak
@ 2017-12-05 19:01 ` Rogozhkin, Dmitry V
0 siblings, 0 replies; 32+ messages in thread
From: Rogozhkin, Dmitry V @ 2017-12-05 19:01 UTC (permalink / raw)
To: Deak, Imre; +Cc: Intel-gfx
On Tue, 2017-12-05 at 15:09 +0200, Imre Deak wrote:
> On Sat, Dec 02, 2017 at 02:05:42AM +0200, Rogozhkin, Dmitry V wrote:
> > On Thu, 2017-11-30 at 13:19 +0200, Imre Deak wrote:
> > > > > +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> > > > > + (HAS_CSR(dev_priv) && IS_GEN9(dev_priv) && !
> > > IS_SKYLAKE(dev_priv))
> > >
> > > Nitpick: could be just !IS_SKYLAKE(), but works in the above way too.
> > > For all other platforms the GT_IRQ domain won't be mapped making
> > > display_power_get/put on those just a domain ref inc/dec, otherwise a
> > > nop.
> >
> > Why not +&& !IS_BROXTON(dev_priv) by the way?
>
> We have the same slow-down problem on APL/BXT (and we don't have the DC6
> corruption problem there).
Ok, this makes sense now. Thank you for clarification.
>
> --Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v6] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-12-05 13:35 ` Chris Wilson
@ 2017-12-08 10:46 ` Imre Deak
0 siblings, 0 replies; 32+ messages in thread
From: Imre Deak @ 2017-12-08 10:46 UTC (permalink / raw)
To: Chris Wilson, Tvrtko Ursulin, Daniel Vetter; +Cc: Intel-gfx
On Tue, Dec 05, 2017 at 01:35:11PM +0000, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-12-05 13:28:54)
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >
> > It seems that the DMC likes to transition between the DC states a lot when
> > there are no connected displays (no active power domains) during command
> > submission.
> >
> > This activity on DC states has a negative impact on the performance of the
> > chip with huge latencies observed in the interrupt handlers and elsewhere.
> > Simple tests like igt/gem_latency -n 0 are slowed down by a factor of
> > eight.
> >
> > Work around it by introducing a new power domain named,
> > POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is
> > held for the duration of command submission activity.
> >
> > v2:
> > * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
> > * Protect macro body with braces. (Jani Nikula)
> >
> > v3:
> > * Add dedicated power domain for clarity. (Chris, Imre)
> > * Commit message and comment text updates.
> > * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC
> > firmware release.
> >
> > v4:
> > * Power domain should be inner to device runtime pm. (Chris)
> > * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris)
> > * Handle async DMC loading by moving the GT_IRQ power domain logic into
> > intel_runtime_pm. (Daniel, Chris)
> > * Include small core GEN9 as well. (Imre)
> >
> > v5
> > * Special handling for async DMC load is not needed since on failure the
> > power domain reference is kept permanently taken. (Imre)
> >
> > v6:
> > * Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been
> > deployed. (Imre, Chris)
> >
> > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> > Testcase: igt/gem_exec_nop/headless
> > Cc: Imre Deak <imre.deak@intel.com>
> > Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
> > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5)
> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
>
> Could we add a note that Cannonlake is excluded and needs to be checked
> as to whether it suffers the same?
Looks like CNL has the same problem, I added a note about this and a
follow-up patch to the commit log.
Thanks for the patch and reviews, I pushed it to drm-tip.
--Imre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2017-12-08 10:46 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-29 10:59 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
2017-11-29 11:10 ` Chris Wilson
2017-11-29 12:29 ` Imre Deak
2017-11-29 11:12 ` Daniel Vetter
2017-11-29 11:34 ` Tvrtko Ursulin
2017-11-29 11:40 ` Chris Wilson
2017-11-29 11:53 ` Tvrtko Ursulin
2017-11-29 11:59 ` Chris Wilson
2017-11-29 14:15 ` Imre Deak
2017-11-29 12:47 ` Daniel Vetter
2017-11-29 11:22 ` ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev3) Patchwork
2017-11-29 14:18 ` [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Imre Deak
2017-11-29 14:30 ` [PATCH v4] " Tvrtko Ursulin
2017-11-29 15:06 ` Imre Deak
2017-11-29 15:21 ` Tvrtko Ursulin
2017-11-29 17:28 ` Imre Deak
2017-11-30 8:34 ` Tvrtko Ursulin
2017-11-30 9:18 ` [PATCH v5] " Tvrtko Ursulin
2017-11-30 9:45 ` Chris Wilson
2017-11-30 9:55 ` Tvrtko Ursulin
2017-11-30 11:19 ` Imre Deak
2017-12-01 23:58 ` Rogozhkin, Dmitry V
2017-12-05 13:08 ` Imre Deak
2017-12-05 13:28 ` [PATCH v6] " Tvrtko Ursulin
2017-12-05 13:35 ` Chris Wilson
2017-12-08 10:46 ` Imre Deak
2017-12-02 0:05 ` [PATCH v5] " Rogozhkin, Dmitry V
2017-12-05 13:09 ` Imre Deak
2017-12-05 19:01 ` Rogozhkin, Dmitry V
2017-11-30 15:56 ` ✗ Fi.CI.BAT: warning for drm/i915: Restore GT performance in headless mode with DMC loaded (rev5) Patchwork
2017-12-05 15:06 ` ✗ Fi.CI.BAT: failure for drm/i915: Restore GT performance in headless mode with DMC loaded (rev6) Patchwork
2017-12-05 17:42 ` ✗ Fi.CI.BAT: warning " Patchwork
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.