* [PATCH 2/4] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context
2019-04-19 11:15 [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest Chris Wilson
@ 2019-04-19 11:15 ` Chris Wilson
2019-04-19 11:16 ` [PATCH 3/4] drm/i915: Enable render context support for Ironlake (gen5) Chris Wilson
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-04-19 11:15 UTC (permalink / raw)
To: intel-gfx
Despite what I think the PRM recommends, commit f2253bd9859b
("drm/i915/ringbuffer: EMIT_INVALIDATE after switch context") turned out
to be a huge mistake when enabling Ironlake contexts, as the GPU would
hang on either an MI_FLUSH or a PIPE_CONTROL immediately following the
MI_SET_CONTEXT of an active Mesa context (more vanilla contexts, e.g.
simple rendercopies with igt, do not suffer). So emit the invalidate
before the context switch instead.
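
As a rough illustration of the ordering only (a standalone sketch; fake_ring,
emit() and request_alloc_sketch() are hypothetical stand-ins, not the i915
API): the invalidate is emitted first, the context switch second, and an
error from either step aborts the request setup.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ring commands; not the i915 API. */
enum cmd { CMD_INVALIDATE, CMD_SET_CONTEXT };

struct fake_ring {
	enum cmd log[4];	/* commands in the order they were emitted */
	size_t count;
};

/* Append one command to the fake ring, failing if it is full. */
static int emit(struct fake_ring *ring, enum cmd c)
{
	if (ring->count == sizeof(ring->log) / sizeof(ring->log[0]))
		return -1;
	ring->log[ring->count++] = c;
	return 0;
}

/* Mirrors the new order in ring_request_alloc(): flush, then switch. */
static int request_alloc_sketch(struct fake_ring *ring)
{
	int ret;

	ret = emit(ring, CMD_INVALIDATE);
	if (ret)
		return ret;

	return emit(ring, CMD_SET_CONTEXT);
}
```

After a successful call, log[0] holds the invalidate and log[1] the context
switch, which is the order the diff below establishes.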
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3844581f622c..8feb2d9b7b60 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1882,12 +1882,12 @@ static int ring_request_alloc(struct i915_request *request)
*/
request->reserved_space += LEGACY_REQUEST_SIZE;
- ret = switch_context(request);
+ /* Unconditionally invalidate GPU caches and TLBs. */
+ ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
if (ret)
return ret;
- /* Unconditionally invalidate GPU caches and TLBs. */
- ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
+ ret = switch_context(request);
if (ret)
return ret;
--
2.20.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 3/4] drm/i915: Enable render context support for Ironlake (gen5)
2019-04-19 11:15 [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest Chris Wilson
2019-04-19 11:15 ` [PATCH 2/4] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
@ 2019-04-19 11:16 ` Chris Wilson
2019-04-19 11:16 ` [PATCH 4/4] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-04-19 11:16 UTC (permalink / raw)
To: intel-gfx; +Cc: Kenneth Graunke
Ironlake supports saving and reloading context-specific registers
between contexts, providing isolation of the basic GPU state (as
programmable by userspace). This allows userspace to assume that the
GPU retains its state from one batch to the next, minimising the
amount of state it needs to reload, or manually save and restore.
v2: Fix off-by-one in reading CXT_SIZE, and add a comment that the
CXT_SIZE and context-layout do not match in bspec, but the difference is
irrelevant as we overallocate the full page anyway (Ville).
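
To make the v2 arithmetic concrete, here is a small standalone sketch. It
assumes 4 KiB pages and, as the patch reads it, that CXT_SIZE holds the
number of 64-byte cachelines minus one; the SKETCH_* names and helper
functions are made up for illustration and are not driver code.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4 KiB pages */
#define SKETCH_CACHELINE 64u	/* size is counted in 64-byte units */

/* Round x up to the next multiple of align (align must be a power of two). */
static uint32_t round_up_pow2(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * reg is the raw register value; the v2 fix treats it as "count minus
 * one", so the number of cachelines is reg + 1.
 */
static uint32_t context_bytes(uint32_t reg)
{
	uint32_t cachelines = reg + 1;

	return round_up_pow2(cachelines * SKETCH_CACHELINE, SKETCH_PAGE_SIZE);
}
```

With these assumptions, any raw value up to 63 rounds to a single 4096-byte
page, so a discrepancy of a few cachelines does not change the allocation
unless it crosses a page boundary.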
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/intel_engine_cs.c | 16 ++++++++++++++++
drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++++++++++
2 files changed, 29 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index eea9bec04f1b..fc8be2fcb4e6 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -211,6 +211,22 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
PAGE_SIZE);
case 5:
+ /*
+ * There is a discrepancy here between the size reported
+ * by the register and the size of the context layout
+ * in the docs. Both are described as authoritative!
+ *
+ * The discrepancy is on the order of a few cachelines,
+ * but the total is under one page (4k), which is our
+ * minimum allocation anyway so it should all come
+ * out in the wash.
+ */
+ cxt_size = I915_READ(CXT_SIZE) + 1;
+ DRM_DEBUG_DRIVER("gen%d CXT_SIZE = %d bytes [0x%08x]\n",
+ INTEL_GEN(dev_priv),
+ cxt_size * 64,
+ cxt_size - 1);
+ return round_up(cxt_size * 64, PAGE_SIZE);
case 4:
case 3:
case 2:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8feb2d9b7b60..2d2e33cd3fae 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1640,11 +1640,14 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
/* These flags are for resource streamer on HSW+ */
flags |= HSW_MI_RS_SAVE_STATE_EN | HSW_MI_RS_RESTORE_STATE_EN;
else
+ /* We need to save the extended state for powersaving modes */
flags |= MI_SAVE_EXT_STATE_EN | MI_RESTORE_EXT_STATE_EN;
len = 4;
if (IS_GEN(i915, 7))
len += 2 + (num_engines ? 4 * num_engines + 6 : 0);
+ else if (IS_GEN(i915, 5))
+ len += 2;
if (flags & MI_FORCE_RESTORE) {
GEM_BUG_ON(flags & MI_RESTORE_INHIBIT);
flags &= ~MI_FORCE_RESTORE;
@@ -1673,6 +1676,14 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
GEN6_PSMI_SLEEP_MSG_DISABLE);
}
}
+ } else if (IS_GEN(i915, 5)) {
+ /*
+ * This w/a is only listed for pre-production ilk a/b steppings,
+ * but is also mentioned for programming the powerctx. To be
+ * safe, just apply the workaround; we do not use SyncFlush so
+ * this should never take effect and so be a no-op!
+ */
+ *cs++ = MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN;
}
if (force_restore) {
@@ -1726,6 +1737,8 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
*cs++ = MI_NOOP;
}
*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+ } else if (IS_GEN(i915, 5)) {
+ *cs++ = MI_SUSPEND_FLUSH;
}
intel_ring_advance(rq, cs);
--
2.20.1
* [PATCH 4/4] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
2019-04-19 11:15 [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest Chris Wilson
2019-04-19 11:15 ` [PATCH 2/4] drm/i915/ringbuffer: EMIT_INVALIDATE *before* switch context Chris Wilson
2019-04-19 11:16 ` [PATCH 3/4] drm/i915: Enable render context support for Ironlake (gen5) Chris Wilson
@ 2019-04-19 11:16 ` Chris Wilson
2019-04-19 16:54 ` Kenneth Graunke
2019-04-19 11:17 ` [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest Chris Wilson
2019-04-19 11:25 ` ✗ Fi.CI.BAT: failure for series starting with [1/4] " Patchwork
4 siblings, 1 reply; 7+ messages in thread
From: Chris Wilson @ 2019-04-19 11:16 UTC (permalink / raw)
To: intel-gfx; +Cc: Kenneth Graunke
Broadwater and the rest of gen4 support saving and reloading
context-specific registers between contexts, providing isolation of the
basic GPU state (as programmable by userspace). This allows userspace to
assume that the GPU retains its state from one batch to the next,
minimising the amount of state it needs to reload and manually save
across batches.
v2: CONSTANT_BUFFER woes
Running through piglit turned up an interesting issue: a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into an on-GPU buffer, and the context load was
executing that immediately. However, since it was reading from the GTT,
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.
Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.
This does require gen4 userspace to reissue CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.
v3: ECOSPKD to the rescue
Ville found the magic bit in the ECOSPKD to disable saving and restoring
the CONSTANT_BUFFER from the context image, thereby completely avoiding
the GPU hangs from chasing invalid pointers. This appears to be the
default behaviour for gen5, and so we just need to tweak gen4 to match.
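
For readers unfamiliar with masked registers, a standalone sketch of the
write this patch relies on. It assumes the usual masked-register
convention (bits 31:16 select which of bits 15:0 the write may change);
the SKETCH_* names and the masked_reg_write() model are illustrative
assumptions, not driver code.

```c
#include <stdint.h>

#define SKETCH_BIT(n) (1u << (n))
#define SKETCH_CONSTANT_BUFFER_SR_DISABLE SKETCH_BIT(4)

/* Value to write: mask in bits 31:16, new bit values in bits 15:0. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}

/* Same mask, but the selected bit is written as zero. */
static uint32_t masked_bit_disable(uint32_t bit)
{
	return bit << 16;
}

/* Model of how a masked register applies a write to its current value. */
static uint32_t masked_reg_write(uint32_t current, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	return (current & ~mask & 0xffffu) | (value & mask);
}
```

The point of the convention is that setting one bit needs no
read-modify-write: the other bits in the register are left untouched
because their mask bits are zero.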
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
---
drivers/gpu/drm/i915/i915_reg.h | 3 +++
drivers/gpu/drm/i915/intel_engine_cs.c | 2 +-
drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++++++++++++++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b74824f0b5b1..5815703ac35f 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2665,6 +2665,9 @@ enum i915_power_well_id {
# define MODE_IDLE (1 << 9)
# define STOP_RING (1 << 8)
+#define ECOSPKD _MMIO(0x21d0)
+# define CONSTANT_BUFFER_SR_DISABLE BIT(4)
+
#define GEN6_GT_MODE _MMIO(0x20d0)
#define GEN7_GT_MODE _MMIO(0x7008)
#define GEN6_WIZ_HASHING(hi, lo) (((hi) << 9) | ((lo) << 7))
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index fc8be2fcb4e6..f9db2e0bca12 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -211,6 +211,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
PAGE_SIZE);
case 5:
+ case 4:
/*
* There is a discrepancy here between the size reported
* by the register and the size of the context layout
@@ -227,7 +228,6 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
cxt_size * 64,
cxt_size - 1);
return round_up(cxt_size * 64, PAGE_SIZE);
- case 4:
case 3:
case 2:
/* For the special day when i810 gets merged. */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 2d2e33cd3fae..26b276ed00b3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -832,6 +832,20 @@ static int init_render_ring(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
+ /*
+ * Disable CONSTANT_BUFFER before it is loaded from the context
+ * image: the command is executed as it is loaded, and the stored
+ * address may no longer be valid, leading to a GPU hang.
+ *
+ * This imposes the requirement that userspace reload their
+ * CONSTANT_BUFFER on every batch; fortunately, that is a
+ * requirement they are already accustomed to from before
+ * contexts were enabled.
+ */
+ if (IS_GEN(dev_priv, 4))
+ I915_WRITE(ECOSPKD,
+ _MASKED_BIT_ENABLE(CONSTANT_BUFFER_SR_DISABLE));
+
/* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
if (IS_GEN_RANGE(dev_priv, 4, 6))
I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
--
2.20.1
* Re: [PATCH 4/4] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)
2019-04-19 11:16 ` [PATCH 4/4] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
@ 2019-04-19 16:54 ` Kenneth Graunke
0 siblings, 0 replies; 7+ messages in thread
From: Kenneth Graunke @ 2019-04-19 16:54 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Friday, April 19, 2019 4:16:01 AM PDT Chris Wilson wrote:
> Broadwater and the rest of gen4 support saving and reloading
> context-specific registers between contexts, providing isolation of the
> basic GPU state (as programmable by userspace). This allows userspace to
> assume that the GPU retains its state from one batch to the next,
> minimising the amount of state it needs to reload and manually save
> across batches.
>
> v2: CONSTANT_BUFFER woes
>
> Running through piglit turned up an interesting issue: a GPU hang inside
> the context load. The context image includes the CONSTANT_BUFFER command
> that loads an address into an on-GPU buffer, and the context load was
> executing that immediately. However, since it was reading from the GTT,
> there is no guarantee that the GTT retains the same configuration as
> when the context was saved, resulting in stray reads and a GPU hang.
>
> Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
> ring before saving the context to no avail, we resort to patching out
> the instruction inside the context image before loading.
>
> This does require gen4 userspace to reissue CONSTANT_BUFFER commands on
> each batch, but due to the use of a shared GTT that was and will remain
> a requirement.
>
> v3: ECOSPKD to the rescue
>
> Ville found the magic bit in the ECOSPKD to disable saving and restoring
> the CONSTANT_BUFFER from the context image, thereby completely avoiding
> the GPU hangs from chasing invalid pointers. This appears to be the
> default behaviour for gen5, and so we just need to tweak gen4 to match.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Kenneth Graunke <kenneth@whitecape.org>
> ---
> drivers/gpu/drm/i915/i915_reg.h | 3 +++
> drivers/gpu/drm/i915/intel_engine_cs.c | 2 +-
> drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++++++++++++++
> 3 files changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index b74824f0b5b1..5815703ac35f 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2665,6 +2665,9 @@ enum i915_power_well_id {
> # define MODE_IDLE (1 << 9)
> # define STOP_RING (1 << 8)
>
> +#define ECOSPKD _MMIO(0x21d0)
> +# define CONSTANT_BUFFER_SR_DISABLE BIT(4)
> +
The name of this register is ECOSKPD (Scratch Pad, or SK PD).
The G45 PRM says it's DevBW-C1+. I can't recall if earlier ones
shipped or not. If so, we might be in trouble. But I've seen a
lot of DevBW-A/B warnings that I'm pretty sure I've safely ignored...
With the typo fixed, this patch is:
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
> #define GEN6_GT_MODE _MMIO(0x20d0)
> #define GEN7_GT_MODE _MMIO(0x7008)
> #define GEN6_WIZ_HASHING(hi, lo) (((hi) << 9) | ((lo) << 7))
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index fc8be2fcb4e6..f9db2e0bca12 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -211,6 +211,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
> return round_up(GEN6_CXT_TOTAL_SIZE(cxt_size) * 64,
> PAGE_SIZE);
> case 5:
> + case 4:
> /*
> * There is a discrepancy here between the size reported
> * by the register and the size of the context layout
> @@ -227,7 +228,6 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
> cxt_size * 64,
> cxt_size - 1);
> return round_up(cxt_size * 64, PAGE_SIZE);
> - case 4:
> case 3:
> case 2:
> /* For the special day when i810 gets merged. */
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 2d2e33cd3fae..26b276ed00b3 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -832,6 +832,20 @@ static int init_render_ring(struct intel_engine_cs *engine)
> {
> struct drm_i915_private *dev_priv = engine->i915;
>
> + /*
> + * Disable CONSTANT_BUFFER before it is loaded from the context
> + * image. For as it is loaded, it is executed and the stored
> + * address may no longer be valid, leading to a GPU hang.
> + *
> + * This imposes the requirement that userspace reload their
> + * CONSTANT_BUFFER on every batch, fortunately a requirement
> + * they are already accustomed to from before contexts were
> + * enabled.
> + */
> + if (IS_GEN(dev_priv, 4))
> + I915_WRITE(ECOSPKD,
> + _MASKED_BIT_ENABLE(CONSTANT_BUFFER_SR_DISABLE));
> +
> /* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
> if (IS_GEN_RANGE(dev_priv, 4, 6))
> I915_WRITE(MI_MODE, _MASKED_BIT_ENABLE(VS_TIMER_DISPATCH));
>
* Re: [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest
2019-04-19 11:15 [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest Chris Wilson
` (2 preceding siblings ...)
2019-04-19 11:16 ` [PATCH 4/4] drm/i915: Enable render context support for gen4 (Broadwater to Cantiga) Chris Wilson
@ 2019-04-19 11:17 ` Chris Wilson
2019-04-19 11:25 ` ✗ Fi.CI.BAT: failure for series starting with [1/4] " Patchwork
4 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-04-19 11:17 UTC (permalink / raw)
To: intel-gfx
Quoting Chris Wilson (2019-04-19 12:15:58)
> From: Eric Anholt <eric@anholt.net>
Hmm, wrong base. Apologies for the noise.
-Chris
* ✗ Fi.CI.BAT: failure for series starting with [1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest
2019-04-19 11:15 [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest Chris Wilson
` (3 preceding siblings ...)
2019-04-19 11:17 ` [PATCH 1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest Chris Wilson
@ 2019-04-19 11:25 ` Patchwork
4 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2019-04-19 11:25 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/4] drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest
URL : https://patchwork.freedesktop.org/series/59751/
State : failure
== Summary ==
Applying: drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest
Using index info to reconstruct a base tree...
Falling back to patching base and 3-way merge...
CONFLICT (add/add): Merge conflict in integration-manifest
Auto-merging integration-manifest
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 drm-tip: 2019y-04m-18d-17h-03m-51s UTC integration manifest
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".