* [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
@ 2017-05-05 11:43 Tvrtko Ursulin
2017-05-05 11:54 ` Chris Wilson
` (5 more replies)
0 siblings, 6 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2017-05-05 11:43 UTC
To: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
It seems that the DMC likes to transition between the DC states
a lot when there are no connected displays (no active power
domains) during simple command submission.
This frantic activity on DC states has a terrible impact on the
performance of the overall chip with huge latencies observed in
the interrupt handlers and elsewhere. Simple tests like
igt/gem_latency -n 0 are slowed down by a factor of eight.
Work around it by grabbing a modeset display power domain whilst
there is any GT activity. This seems to be effective in making
the DMC keep its paws off the chip.
On the other hand this may have a negative impact on the overall
power budget of the chip and so could still affect performance.
This version limits the workaround to SKL GT3 and GT4 parts but
this is just due to the absence of testing on other platforms. It
is possible we will have to apply it more widely.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
Testcase: igt/gem_exec_nop/headless
Cc: Imre Deak <imre.deak@intel.com>
---
drivers/gpu/drm/i915/i915_drv.h | 5 +++++
drivers/gpu/drm/i915/i915_gem.c | 4 ++++
drivers/gpu/drm/i915/i915_gem_request.c | 3 +++
3 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 320c16df1c9c..4d58e2e28c2f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2990,6 +2990,11 @@ intel_info(const struct drm_i915_private *dev_priv)
#define HAS_DECOUPLED_MMIO(dev_priv) (INTEL_INFO(dev_priv)->has_decoupled_mmio)
+#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
+ HAS_CSR(dev_priv) && \
+ (IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv)) && \
+ (dev_priv)->csr.dmc_payload
+
#include "i915_trace.h"
static inline bool intel_scanout_needs_vtd_wa(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b2727905ef2b..c52d863f409c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3200,7 +3200,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
if (INTEL_GEN(dev_priv) >= 6)
gen6_rps_idle(dev_priv);
+
intel_runtime_pm_put(dev_priv);
+
+ if (NEEDS_CSR_GT_PERF_WA(dev_priv))
+ intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
out_unlock:
mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 10361c7e3b37..10a3b51f6362 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -873,6 +873,9 @@ static void i915_gem_mark_busy(const struct intel_engine_cs *engine)
GEM_BUG_ON(!dev_priv->gt.active_requests);
+ if (NEEDS_CSR_GT_PERF_WA(dev_priv))
+ intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET);
+
intel_runtime_pm_get_noresume(dev_priv);
dev_priv->gt.awake = true;
--
2.9.3
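The get/put pairing added by the two GEM hunks above can be modelled in isolation. What follows is a minimal user-space sketch, not driver code: `toy_gt`, `toy_mark_busy()` and `toy_idle_work()` are hypothetical names, and the early return when the GT is already awake is an assumption about the surrounding functions rather than something shown in the diff.

```c
#include <stdbool.h>

/* Hypothetical model of the state touched by the workaround. */
struct toy_gt {
	bool awake;        /* stands in for dev_priv->gt.awake */
	bool needs_wa;     /* stands in for NEEDS_CSR_GT_PERF_WA() */
	int modeset_refs;  /* stands in for the POWER_DOMAIN_MODESET refcount */
};

/* Take one reference on the idle -> busy transition only (assumed). */
static void toy_mark_busy(struct toy_gt *gt)
{
	if (gt->awake)
		return;
	if (gt->needs_wa)
		gt->modeset_refs++;
	gt->awake = true;
}

/* Drop the reference again once the GT has gone idle. */
static void toy_idle_work(struct toy_gt *gt)
{
	if (!gt->awake)
		return;
	gt->awake = false;
	if (gt->needs_wa)
		gt->modeset_refs--;
}
```

Under these assumptions, repeated submissions while the GT is busy hold exactly one reference, and the idle handler returns the count to zero, so the display power domain is pinned only for the duration of GT activity.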
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-05-05 11:43 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
@ 2017-05-05 11:54 ` Chris Wilson
2017-05-05 12:03 ` ✓ Fi.CI.BAT: success for " Patchwork
` (4 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2017-05-05 11:54 UTC
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Fri, May 05, 2017 at 12:43:21PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states
> a lot when there are no connected displays (no active power
> domains) during simple command submission.
>
> This frantic activity on DC states has a terrible impact on the
> performance of the overall chip with huge latencies observed in
> the interrupt handlers and elsewhere. Simple tests like
> igt/gem_latency -n 0 are slowed down by a factor of eight.
>
> Work around it by grabbing a modeset display power domain whilst
> there is any GT activity. This seems to be effective in making
> the DMC keep its paws off the chip.
>
> On the other hand this may have a negative impact on the overall
> power budget of the chip and so could still affect performance.
Please add this as a comment to the code, I think in mark_busy(). I
don't think this w/a will remain applicable forever and so merits a
continual reminder and being discussed again in future.
> This version limits the workaround to SKL GT3 and GT4 parts but
> this is just due to the absence of testing on other platforms. It
> is possible we will have to apply it more widely.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> drivers/gpu/drm/i915/i915_gem_request.c | 3 +++
> 3 files changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 320c16df1c9c..4d58e2e28c2f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2990,6 +2990,11 @@ intel_info(const struct drm_i915_private *dev_priv)
>
> #define HAS_DECOUPLED_MMIO(dev_priv) (INTEL_INFO(dev_priv)->has_decoupled_mmio)
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + HAS_CSR(dev_priv) && \
> + (IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv)) && \
> + (dev_priv)->csr.dmc_payload
csr.dmc_payload is a bit of a surprise, but looks correct.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* ✓ Fi.CI.BAT: success for drm/i915: Restore GT performance in headless mode with DMC loaded
2017-05-05 11:43 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
2017-05-05 11:54 ` Chris Wilson
@ 2017-05-05 12:03 ` Patchwork
2017-05-05 14:49 ` [PATCH] " Ville Syrjälä
` (3 subsequent siblings)
5 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2017-05-05 12:03 UTC
To: Tvrtko Ursulin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Restore GT performance in headless mode with DMC loaded
URL : https://patchwork.freedesktop.org/series/24017/
State : success
== Summary ==
Series 24017v1 drm/i915: Restore GT performance in headless mode with DMC loaded
https://patchwork.freedesktop.org/api/1.0/series/24017/revisions/1/mbox/
fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time:429s
fi-bdw-gvtdvm total:278 pass:256 dwarn:8 dfail:0 fail:0 skip:14 time:433s
fi-bsw-n3050 total:278 pass:242 dwarn:0 dfail:0 fail:0 skip:36 time:577s
fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time:509s
fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time:548s
fi-byt-j1900 total:278 pass:254 dwarn:0 dfail:0 fail:0 skip:24 time:490s
fi-byt-n2820 total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:485s
fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:409s
fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:408s
fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50 time:416s
fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:495s
fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:471s
fi-kbl-7500u total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:458s
fi-kbl-7560u total:278 pass:267 dwarn:1 dfail:0 fail:0 skip:10 time:566s
fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:458s
fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17 time:569s
fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time:460s
fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:489s
fi-skl-gvtdvm total:278 pass:265 dwarn:0 dfail:0 fail:0 skip:13 time:431s
fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:531s
fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29 time:408s
369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
e0b135e drm/i915: Restore GT performance in headless mode with DMC loaded
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4630/
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-05-05 11:43 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
2017-05-05 11:54 ` Chris Wilson
2017-05-05 12:03 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-05-05 14:49 ` Ville Syrjälä
2017-05-05 16:13 ` Tvrtko Ursulin
2017-05-08 9:23 ` Jani Nikula
` (2 subsequent siblings)
5 siblings, 1 reply; 9+ messages in thread
From: Ville Syrjälä @ 2017-05-05 14:49 UTC
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Fri, May 05, 2017 at 12:43:21PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states
> a lot when there are no connected displays (no active power
> domains) during simple command submission.
Is it trapping on some interrupt register accesses or what? And
if so which registers are affected?
>
> This frantic activity on DC states has a terrible impact on the
> performance of the overall chip with huge latencies observed in
> the interrupt handlers and elsewhere. Simple tests like
> igt/gem_latency -n 0 are slowed down by a factor of eight.
>
> Work around it by grabbing a modeset display power domain whilst
> there is any GT activity. This seems to be effective in making
> the DMC keep its paws off the chip.
>
> On the other hand this may have a negative impact on the overall
> power budget of the chip and so could still affect performance.
>
> This version limits the workaround to SKL GT3 and GT4 parts but
> this is just due to the absence of testing on other platforms. It
> is possible we will have to apply it more widely.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> drivers/gpu/drm/i915/i915_gem_request.c | 3 +++
> 3 files changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 320c16df1c9c..4d58e2e28c2f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2990,6 +2990,11 @@ intel_info(const struct drm_i915_private *dev_priv)
>
> #define HAS_DECOUPLED_MMIO(dev_priv) (INTEL_INFO(dev_priv)->has_decoupled_mmio)
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + HAS_CSR(dev_priv) && \
> + (IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv)) && \
> + (dev_priv)->csr.dmc_payload
> +
> #include "i915_trace.h"
>
> static inline bool intel_scanout_needs_vtd_wa(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b2727905ef2b..c52d863f409c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3200,7 +3200,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
>
> if (INTEL_GEN(dev_priv) >= 6)
> gen6_rps_idle(dev_priv);
> +
> intel_runtime_pm_put(dev_priv);
> +
> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> + intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
> out_unlock:
> mutex_unlock(&dev->struct_mutex);
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 10361c7e3b37..10a3b51f6362 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -873,6 +873,9 @@ static void i915_gem_mark_busy(const struct intel_engine_cs *engine)
>
> GEM_BUG_ON(!dev_priv->gt.active_requests);
>
> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> + intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET);
> +
> intel_runtime_pm_get_noresume(dev_priv);
> dev_priv->gt.awake = true;
>
> --
> 2.9.3
>
--
Ville Syrjälä
Intel OTC
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-05-05 14:49 ` [PATCH] " Ville Syrjälä
@ 2017-05-05 16:13 ` Tvrtko Ursulin
2017-05-05 16:28 ` Ville Syrjälä
0 siblings, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2017-05-05 16:13 UTC
To: Ville Syrjälä, Tvrtko Ursulin; +Cc: Intel-gfx
On 05/05/2017 15:49, Ville Syrjälä wrote:
> On Fri, May 05, 2017 at 12:43:21PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> It seems that the DMC likes to transition between the DC states
>> a lot when there are no connected displays (no active power
>> domains) during simple command submission.
>
> Is it trapping on some interrupt register accesses or what? And
> if so which registers are affected?
It looks like it is GT IIR or something along those lines, but I couldn't
say with total confidence. It is just a guess. The firmware binary
definitely "mentions" those registers, as can be seen by inspecting it
with a hex editor.
The data I collected at least seems to present a correlation between the
batch frequency and DC state transition frequency:
tgt submit  DC           irqs   irqs/s   irq/batch  batch/s  DC/s     DC/batch
freq        transitions
==============================================================================
10000       20000        78300  7830.00  1.96       4000.00  2000.00  0.50
9901        14000        52855  7550.71  1.32       5714.29  2000.00  0.35
9524        13500        49100  7328.36  1.23       5970.15  2014.93  0.34
9091        13500        49200  7235.29  1.23       5882.35  1985.29  0.34
5000        16900        33290  3916.47  0.83       4705.88  1988.24  0.42
3333        27800        69550  4932.62  1.74       2836.88  1971.63  0.70
1667        57200        80200  2655.63  2.01       1324.50  1894.04  1.43
909         80000        80034  1482.11  2.00       740.74   1481.48  2.00
476         87000        80039  820.91   2.00       410.26   892.31   2.18
196         160000       80055  334.40   2.00       167.08   668.34   4.00
Submitted batches were ~100us long in all cases. So with low batch
frequency it looks pretty believable. For example when we have 167.08
batches/s, we have 334.40 irq/s - which is double - as expected for
execlists. And we get again double that in terms of DC transitions per
second. Each irq is one GT IIR write from the GPU side, and another from
the CPU side.
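The ratios quoted in the paragraph above can be checked directly from the last row of the table. This is only an illustrative sketch of that arithmetic; `per_batch()` and `close_to()` are hypothetical helper names and are not part of any tool used to collect the data.

```c
/* Ratio of two per-second rates, e.g. irqs/s divided by batches/s. */
static double per_batch(double events_per_sec, double batches_per_sec)
{
	return events_per_sec / batches_per_sec;
}

/* True when value is within tol of expected; avoids needing <math.h>. */
static int close_to(double value, double expected, double tol)
{
	double diff = value - expected;

	return diff < tol && diff > -tol;
}
```

For the 196 Hz row, `per_batch(334.40, 167.08)` is roughly 2 interrupts per batch and `per_batch(668.34, 167.08)` is roughly 4 DC transitions per batch, which matches the irq/batch and DC/batch columns.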
This was actually Imre's suggestion btw. Before that I was only looking at
the irq/s to DC/s correlation, which confused me a lot since there
are more of the latter, so I thought interrupts couldn't be the trigger.
But once Imre mentioned the possibility that things are triggered by IIR
register writes, the numbers started making more sense.
Then with higher batch frequencies the ratio starts falling. Whether that
is because DC transitions are too slow to keep up, or something else, I am
not sure.
Regards,
Tvrtko
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-05-05 16:13 ` Tvrtko Ursulin
@ 2017-05-05 16:28 ` Ville Syrjälä
0 siblings, 0 replies; 9+ messages in thread
From: Ville Syrjälä @ 2017-05-05 16:28 UTC
To: Tvrtko Ursulin; +Cc: Intel-gfx
On Fri, May 05, 2017 at 05:13:58PM +0100, Tvrtko Ursulin wrote:
>
> On 05/05/2017 15:49, Ville Syrjälä wrote:
> > On Fri, May 05, 2017 at 12:43:21PM +0100, Tvrtko Ursulin wrote:
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >> It seems that the DMC likes to transition between the DC states
> >> a lot when there are no connected displays (no active power
> >> domains) during simple command submission.
> >
> > Is it trapping on some interrupt register accesses or what? And
> > if so which registers are affected?
>
> It looks like it is GT IIR or something along those lines, but I couldn't
> say with total confidence.
<read DC counters>
for i in `seq 1 100` ; do IGT_NO_FORCEWAKE=1 intel_reg read <whatever> ; done
<read DC counters>
Should be pretty trivial to run that against the suspect
registers.
> It is just a guess. Firmware binary
> definitely "mentions" those registers as can be seen by inspecting it
> with a hex editor.
>
> The data I collected at least seems to present a correlation between the
> batch frequency and DC state transition frequency:
>
> tgt submit  DC           irqs   irqs/s   irq/batch  batch/s  DC/s     DC/batch
> freq        transitions
> ==============================================================================
> 10000       20000        78300  7830.00  1.96       4000.00  2000.00  0.50
> 9901        14000        52855  7550.71  1.32       5714.29  2000.00  0.35
> 9524        13500        49100  7328.36  1.23       5970.15  2014.93  0.34
> 9091        13500        49200  7235.29  1.23       5882.35  1985.29  0.34
> 5000        16900        33290  3916.47  0.83       4705.88  1988.24  0.42
> 3333        27800        69550  4932.62  1.74       2836.88  1971.63  0.70
> 1667        57200        80200  2655.63  2.01       1324.50  1894.04  1.43
> 909         80000        80034  1482.11  2.00       740.74   1481.48  2.00
> 476         87000        80039  820.91   2.00       410.26   892.31   2.18
> 196         160000       80055  334.40   2.00       167.08   668.34   4.00
>
> Submitted batches were ~100us long in all cases. So with low batch
> frequency it looks pretty believable. For example when we have 167.08
> batches/s, we have 334.40 irq/s - which is double - as expected for
> execlists. And we get again double that in terms of DC transitions per
> second. Each irq is one GT IIR write from the GPU side, and another from
> the CPU side.
GPU doesn't actually write the IIRs. It's just latching stuff from the
ISR. Whether the ISR edge or some higher level interrupt event actually
causes the DMC to kick into action isn't clear at all. My original
impression was that it just traps the register accesses.
--
Ville Syrjälä
Intel OTC
* Re: [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-05-05 11:43 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (2 preceding siblings ...)
2017-05-05 14:49 ` [PATCH] " Ville Syrjälä
@ 2017-05-08 9:23 ` Jani Nikula
2017-05-08 11:30 ` [PATCH v2] " Tvrtko Ursulin
2017-05-08 12:21 ` ✓ Fi.CI.BAT: success for drm/i915: Restore GT performance in headless mode with DMC loaded (rev2) Patchwork
5 siblings, 0 replies; 9+ messages in thread
From: Jani Nikula @ 2017-05-08 9:23 UTC
To: Tvrtko Ursulin, Intel-gfx
On Fri, 05 May 2017, Tvrtko Ursulin <tursulin@ursulin.net> wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> It seems that the DMC likes to transition between the DC states
> a lot when there are no connected displays (no active power
> domains) during simple command submission.
>
> This frantic activity on DC states has a terrible impact on the
> performance of the overall chip with huge latencies observed in
> the interrupt handlers and elsewhere. Simple tests like
> igt/gem_latency -n 0 are slowed down by a factor of eight.
>
> Work around it by grabbing a modeset display power domain whilst
> there is any GT activity. This seems to be effective in making
> the DMC keep its paws off the chip.
>
> On the other hand this may have a negative impact on the overall
> power budget of the chip and so could still affect performance.
>
> This version limits the workaround to SKL GT3 and GT4 parts but
> this is just due to the absence of testing on other platforms. It
> is possible we will have to apply it more widely.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
> Testcase: igt/gem_exec_nop/headless
> Cc: Imre Deak <imre.deak@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> drivers/gpu/drm/i915/i915_gem_request.c | 3 +++
> 3 files changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 320c16df1c9c..4d58e2e28c2f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2990,6 +2990,11 @@ intel_info(const struct drm_i915_private *dev_priv)
>
> #define HAS_DECOUPLED_MMIO(dev_priv) (INTEL_INFO(dev_priv)->has_decoupled_mmio)
>
> +#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
> + HAS_CSR(dev_priv) && \
> + (IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv)) && \
> + (dev_priv)->csr.dmc_payload
Nitpick, the whole thing could use braces around it for consistency,
although I don't see any sane use of the macro causing precedence
surprises.
BR,
Jani.
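The kind of precedence surprise being guarded against can be shown with a small stand-alone sketch. `WA_BARE` and `WA_WRAPPED` are hypothetical two-term predicates used only for illustration; they are not the real i915 macros.

```c
#include <stdbool.h>

/* Hypothetical two-term predicates, not the real i915 macros. */
#define WA_BARE(a, b)    (a) && (b)
#define WA_WRAPPED(a, b) ((a) && (b))

/* Expands to !(a) && (b): only the first term is negated. */
static bool not_wa_bare(bool a, bool b)
{
	return !WA_BARE(a, b);
}

/* Expands to !((a) && (b)): the whole condition is negated. */
static bool not_wa_wrapped(bool a, bool b)
{
	return !WA_WRAPPED(a, b);
}
```

With `a` true and `b` false the two helpers disagree: the bare form evaluates `!true && false`, while the wrapped form evaluates `!(true && false)`. A plain `if (MACRO(x))` is unaffected, which is why the remark is only a nitpick.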
> +
> #include "i915_trace.h"
>
> static inline bool intel_scanout_needs_vtd_wa(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b2727905ef2b..c52d863f409c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3200,7 +3200,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
>
> if (INTEL_GEN(dev_priv) >= 6)
> gen6_rps_idle(dev_priv);
> +
> intel_runtime_pm_put(dev_priv);
> +
> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> + intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
> out_unlock:
> mutex_unlock(&dev->struct_mutex);
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index 10361c7e3b37..10a3b51f6362 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -873,6 +873,9 @@ static void i915_gem_mark_busy(const struct intel_engine_cs *engine)
>
> GEM_BUG_ON(!dev_priv->gt.active_requests);
>
> + if (NEEDS_CSR_GT_PERF_WA(dev_priv))
> + intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET);
> +
> intel_runtime_pm_get_noresume(dev_priv);
> dev_priv->gt.awake = true;
--
Jani Nikula, Intel Open Source Technology Center
* [PATCH v2] drm/i915: Restore GT performance in headless mode with DMC loaded
2017-05-05 11:43 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (3 preceding siblings ...)
2017-05-08 9:23 ` Jani Nikula
@ 2017-05-08 11:30 ` Tvrtko Ursulin
2017-05-08 12:21 ` ✓ Fi.CI.BAT: success for drm/i915: Restore GT performance in headless mode with DMC loaded (rev2) Patchwork
5 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2017-05-08 11:30 UTC
To: Intel-gfx
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
It seems that the DMC likes to transition between the DC states
a lot when there are no connected displays (no active power
domains) during simple command submission.
This frantic activity on DC states has a terrible impact on the
performance of the overall chip with huge latencies observed in
the interrupt handlers and elsewhere. Simple tests like
igt/gem_latency -n 0 are slowed down by a factor of eight.
Work around it by grabbing a modeset display power domain whilst
there is any GT activity. This seems to be effective in making
the DMC keep its paws off the chip.
On the other hand this may have a negative impact on the overall
power budget of the chip and so could still affect performance.
This version limits the workaround to SKL GT3 and GT4 parts but
this is just due to the absence of testing on other platforms. It
is possible we will have to apply it more widely.
v2:
* Add commit text as comment in i915_gem_mark_busy. (Chris Wilson)
* Protect macro body with braces. (Jani Nikula)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572
Testcase: igt/gem_exec_nop/headless
Cc: Imre Deak <imre.deak@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.h | 5 +++++
drivers/gpu/drm/i915/i915_gem.c | 4 ++++
drivers/gpu/drm/i915/i915_gem_request.c | 17 +++++++++++++++++
3 files changed, 26 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 320c16df1c9c..509b398d054d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2990,6 +2990,11 @@ intel_info(const struct drm_i915_private *dev_priv)
#define HAS_DECOUPLED_MMIO(dev_priv) (INTEL_INFO(dev_priv)->has_decoupled_mmio)
+#define NEEDS_CSR_GT_PERF_WA(dev_priv) \
+ (HAS_CSR(dev_priv) && \
+ (IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv)) && \
+ (dev_priv)->csr.dmc_payload)
+
#include "i915_trace.h"
static inline bool intel_scanout_needs_vtd_wa(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b2727905ef2b..c52d863f409c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3200,7 +3200,11 @@ i915_gem_idle_work_handler(struct work_struct *work)
if (INTEL_GEN(dev_priv) >= 6)
gen6_rps_idle(dev_priv);
+
intel_runtime_pm_put(dev_priv);
+
+ if (NEEDS_CSR_GT_PERF_WA(dev_priv))
+ intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
out_unlock:
mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 10361c7e3b37..c47c983b874e 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -873,6 +873,23 @@ static void i915_gem_mark_busy(const struct intel_engine_cs *engine)
GEM_BUG_ON(!dev_priv->gt.active_requests);
+ /*
+ * It seems that the DMC likes to transition between the DC states
+ * a lot when there are no connected displays (no active power
+ * domains) during simple command submission.
+ *
+ * This frantic activity on DC states has a terrible impact on the
+ * performance of the overall chip with huge latencies observed in
+ * the interrupt handlers and elsewhere. Simple tests like
+ * igt/gem_latency -n 0 are slowed down by a factor of eight.
+ *
+ * Work around it by grabbing a modeset display power domain whilst
+ * there is any GT activity. This seems to be effective in making
+ * the DMC keep its paws off the chip.
+ */
+ if (NEEDS_CSR_GT_PERF_WA(dev_priv))
+ intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET);
+
intel_runtime_pm_get_noresume(dev_priv);
dev_priv->gt.awake = true;
--
2.9.3
* ✓ Fi.CI.BAT: success for drm/i915: Restore GT performance in headless mode with DMC loaded (rev2)
2017-05-05 11:43 [PATCH] drm/i915: Restore GT performance in headless mode with DMC loaded Tvrtko Ursulin
` (4 preceding siblings ...)
2017-05-08 11:30 ` [PATCH v2] " Tvrtko Ursulin
@ 2017-05-08 12:21 ` Patchwork
5 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2017-05-08 12:21 UTC
To: Tvrtko Ursulin; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Restore GT performance in headless mode with DMC loaded (rev2)
URL : https://patchwork.freedesktop.org/series/24017/
State : success
== Summary ==
Series 24017v2 drm/i915: Restore GT performance in headless mode with DMC loaded
https://patchwork.freedesktop.org/api/1.0/series/24017/revisions/2/mbox/
fi-bdw-5557u total:278 pass:267 dwarn:0 dfail:0 fail:0 skip:11 time:432s
fi-bdw-gvtdvm total:278 pass:256 dwarn:8 dfail:0 fail:0 skip:14 time:424s
fi-bsw-n3050 total:278 pass:242 dwarn:0 dfail:0 fail:0 skip:36 time:586s
fi-bxt-j4205 total:278 pass:259 dwarn:0 dfail:0 fail:0 skip:19 time:506s
fi-bxt-t5700 total:278 pass:258 dwarn:0 dfail:0 fail:0 skip:20 time:564s
fi-byt-j1900 total:278 pass:254 dwarn:0 dfail:0 fail:0 skip:24 time:487s
fi-byt-n2820 total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:479s
fi-hsw-4770 total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:416s
fi-hsw-4770r total:278 pass:262 dwarn:0 dfail:0 fail:0 skip:16 time:406s
fi-ilk-650 total:278 pass:228 dwarn:0 dfail:0 fail:0 skip:50 time:416s
fi-ivb-3520m total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:498s
fi-ivb-3770 total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:466s
fi-kbl-7500u total:278 pass:260 dwarn:0 dfail:0 fail:0 skip:18 time:456s
fi-kbl-7560u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:571s
fi-skl-6260u total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:451s
fi-skl-6700hq total:278 pass:261 dwarn:0 dfail:0 fail:0 skip:17 time:565s
fi-skl-6700k total:278 pass:256 dwarn:4 dfail:0 fail:0 skip:18 time:455s
fi-skl-6770hq total:278 pass:268 dwarn:0 dfail:0 fail:0 skip:10 time:492s
fi-snb-2520m total:278 pass:250 dwarn:0 dfail:0 fail:0 skip:28 time:528s
fi-snb-2600 total:278 pass:249 dwarn:0 dfail:0 fail:0 skip:29 time:409s
fi-skl-gvtdvm failed to collect. IGT log at Patchwork_4639/fi-skl-gvtdvm/igt.log
f3c6147e2a7879ee7face0b7b634dd26704fc3d4 drm-tip: 2017y-05m-08d-11h-23m-22s UTC integration manifest
6a61edd drm/i915: Restore GT performance in headless mode with DMC loaded
== Logs ==
For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4639/