[PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume
@ 2018-09-13 17:32 Chris Wilson
  2018-09-13 17:32 ` [PATCH 2/2] drm/i915/execlists: Use coherent writes into the context image Chris Wilson
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Chris Wilson @ 2018-09-13 17:32 UTC (permalink / raw)
  To: intel-gfx

Now that we reload both RING_HEAD and RING_TAIL when rebinding the
context, we do not need to scrub those registers immediately on resume.

v2: Handle the perma-pinned contexts.
v3: Set RING_TAIL on context-pin so that we always have known state in
the context image for the ring registers and all parties have similar
code (ripe for refactoring).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 33 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9b1f0e5211a0..d7fcbba8e982 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1338,11 +1338,13 @@ __execlists_context_pin(struct intel_engine_cs *engine,
 
 	intel_lr_context_descriptor_update(ctx, engine, ce);
 
+	GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head));
+
 	ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
 	ce->lrc_reg_state[CTX_RING_BUFFER_START+1] =
 		i915_ggtt_offset(ce->ring->vma);
-	GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head));
-	ce->lrc_reg_state[CTX_RING_HEAD+1] = ce->ring->head;
+	ce->lrc_reg_state[CTX_RING_HEAD + 1] = ce->ring->head;
+	ce->lrc_reg_state[CTX_RING_TAIL + 1] = ce->ring->tail;
 
 	ce->state->obj->pin_global++;
 	i915_gem_context_get(ctx);
@@ -2841,13 +2843,14 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 	return ret;
 }
 
-void intel_lr_context_resume(struct drm_i915_private *dev_priv)
+void intel_lr_context_resume(struct drm_i915_private *i915)
 {
 	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
 
-	/* Because we emit WA_TAIL_DWORDS there may be a disparity
+	/*
+	 * Because we emit WA_TAIL_DWORDS there may be a disparity
 	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
 	 * that stored in context. As we only write new commands from
 	 * ce->ring->tail onwards, everything before that is junk. If the GPU
@@ -2857,28 +2860,22 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv)
 	 * So to avoid that we reset the context images upon resume. For
 	 * simplicity, we just zero everything out.
 	 */
-	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
-		for_each_engine(engine, dev_priv, id) {
+	list_for_each_entry(ctx, &i915->contexts.list, link) {
+		for_each_engine(engine, i915, id) {
 			struct intel_context *ce =
 				to_intel_context(ctx, engine);
-			u32 *reg;
 
 			if (!ce->state)
 				continue;
 
-			reg = i915_gem_object_pin_map(ce->state->obj,
-						      I915_MAP_WB);
-			if (WARN_ON(IS_ERR(reg)))
-				continue;
-
-			reg += LRC_STATE_PN * PAGE_SIZE / sizeof(*reg);
-			reg[CTX_RING_HEAD+1] = 0;
-			reg[CTX_RING_TAIL+1] = 0;
+			intel_ring_reset(ce->ring, 0);
 
-			ce->state->obj->mm.dirty = true;
-			i915_gem_object_unpin_map(ce->state->obj);
+			if (ce->pin_count) { /* otherwise done in context_pin */
+				u32 *regs = ce->lrc_reg_state;
 
-			intel_ring_reset(ce->ring, 0);
+				regs[CTX_RING_HEAD + 1] = ce->ring->head;
+				regs[CTX_RING_TAIL + 1] = ce->ring->tail;
+			}
 		}
 	}
 }
-- 
2.19.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/2] drm/i915/execlists: Use coherent writes into the context image
  2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
@ 2018-09-13 17:32 ` Chris Wilson
  2018-09-13 19:33   ` [PATCH v3] " Chris Wilson
  2018-09-13 17:50 ` [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Tvrtko Ursulin
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2018-09-13 17:32 UTC (permalink / raw)
  To: intel-gfx

That we use a WB mapping for updating the RING_TAIL register inside the
context image even on !llc machines has been a source of consternation
for every reader. It appears to work on bsw+, but it may just have been
that we have been incredibly bad at detecting the errors.

v2: With extra enthusiasm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h         | 6 ++++++
 drivers/gpu/drm/i915/i915_gem.c         | 2 ++
 drivers/gpu/drm/i915/i915_perf.c        | 4 +++-
 drivers/gpu/drm/i915/intel_engine_cs.c  | 2 +-
 drivers/gpu/drm/i915/intel_lrc.c        | 8 +++++---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
 6 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7ea442033a57..5c833a45682d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3074,6 +3074,12 @@ enum i915_map_type {
 	I915_MAP_FORCE_WC = I915_MAP_WC | I915_MAP_OVERRIDE,
 };
 
+static inline enum i915_map_type
+i915_coherent_map_type(struct drm_i915_private *i915)
+{
+	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
+}
+
 /**
  * i915_gem_object_pin_map - return a contiguous mapping of the entire object
  * @obj: the object to map into kernel address space
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 89834ce19acd..d6f2bbd6a0dc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5417,6 +5417,8 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	for_each_engine(engine, i915, id) {
 		struct i915_vma *state;
 
+		GEM_BUG_ON(to_intel_context(ctx, engine)->pin_count);
+
 		state = to_intel_context(ctx, engine)->state;
 		if (!state)
 			continue;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 3d7a052b4cca..90168ac845c2 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1735,13 +1735,15 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	/* Update all contexts now that we've stalled the submission. */
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
 		struct intel_context *ce = to_intel_context(ctx, engine);
+		unsigned int map_type;
 		u32 *regs;
 
 		/* OA settings will be set upon first use */
 		if (!ce->state)
 			continue;
 
-		regs = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
+		map_type = i915_coherent_map_type(dev_priv);
+		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
 		if (IS_ERR(regs))
 			return PTR_ERR(regs);
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 10cd051ba29e..c99f2cb9b0e1 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1150,7 +1150,7 @@ void intel_engines_unpark(struct drm_i915_private *i915)
 		map = NULL;
 		if (engine->default_state)
 			map = i915_gem_object_pin_map(engine->default_state,
-						      I915_MAP_WB);
+						      I915_MAP_FORCE_WB);
 		if (!IS_ERR_OR_NULL(map))
 			engine->pinned_default_state = map;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d7fcbba8e982..7b1f322f232b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1294,7 +1294,7 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
 	 * on an active context (which by nature is already on the GPU).
 	 */
 	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
-		err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
+		err = i915_gem_object_set_to_wc_domain(vma->obj, true);
 		if (err)
 			return err;
 	}
@@ -1322,7 +1322,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
 	if (ret)
 		goto err;
 
-	vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map(ce->state->obj,
+					i915_coherent_map_type(ctx->i915) |
+					I915_MAP_OVERRIDE);
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		goto unpin_vma;
@@ -2753,7 +2755,7 @@ populate_lr_context(struct i915_gem_context *ctx,
 		void *defaults;
 
 		defaults = i915_gem_object_pin_map(engine->default_state,
-						   I915_MAP_WB);
+						   I915_MAP_FORCE_WB);
 		if (IS_ERR(defaults)) {
 			ret = PTR_ERR(defaults);
 			goto err_unpin_ctx;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d0ef50bf930a..1eb68d77b66c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1288,7 +1288,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
 		}
 
 		defaults = i915_gem_object_pin_map(engine->default_state,
-						   I915_MAP_WB);
+						   I915_MAP_FORCE_WB);
 		if (IS_ERR(defaults)) {
 			err = PTR_ERR(defaults);
 			goto err_map;
-- 
2.19.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume
  2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
  2018-09-13 17:32 ` [PATCH 2/2] drm/i915/execlists: Use coherent writes into the context image Chris Wilson
@ 2018-09-13 17:50 ` Tvrtko Ursulin
  2018-09-13 17:53 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/2] " Patchwork
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-09-13 17:50 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 13/09/2018 18:32, Chris Wilson wrote:
> Now that we reload both RING_HEAD and RING_TAIL when rebinding the
> context, we do not need to scrub those registers immediately on resume.
> 
> v2: Handle the perma-pinned contexts.
> v3: Set RING_TAIL on context-pin so that we always have known state in
> the context image for the ring registers and all parties have similar
> code (ripe for refactoring).
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 33 +++++++++++++++-----------------
>   1 file changed, 15 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 9b1f0e5211a0..d7fcbba8e982 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1338,11 +1338,13 @@ __execlists_context_pin(struct intel_engine_cs *engine,
>   
>   	intel_lr_context_descriptor_update(ctx, engine, ce);
>   
> +	GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head));
> +
>   	ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
>   	ce->lrc_reg_state[CTX_RING_BUFFER_START+1] =
>   		i915_ggtt_offset(ce->ring->vma);
> -	GEM_BUG_ON(!intel_ring_offset_valid(ce->ring, ce->ring->head));
> -	ce->lrc_reg_state[CTX_RING_HEAD+1] = ce->ring->head;
> +	ce->lrc_reg_state[CTX_RING_HEAD + 1] = ce->ring->head;
> +	ce->lrc_reg_state[CTX_RING_TAIL + 1] = ce->ring->tail;
>   
>   	ce->state->obj->pin_global++;
>   	i915_gem_context_get(ctx);
> @@ -2841,13 +2843,14 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
>   	return ret;
>   }
>   
> -void intel_lr_context_resume(struct drm_i915_private *dev_priv)
> +void intel_lr_context_resume(struct drm_i915_private *i915)
>   {
>   	struct intel_engine_cs *engine;
>   	struct i915_gem_context *ctx;
>   	enum intel_engine_id id;
>   
> -	/* Because we emit WA_TAIL_DWORDS there may be a disparity
> +	/*
> +	 * Because we emit WA_TAIL_DWORDS there may be a disparity
>   	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
>   	 * that stored in context. As we only write new commands from
>   	 * ce->ring->tail onwards, everything before that is junk. If the GPU
> @@ -2857,28 +2860,22 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv)
>   	 * So to avoid that we reset the context images upon resume. For
>   	 * simplicity, we just zero everything out.
>   	 */
> -	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> -		for_each_engine(engine, dev_priv, id) {
> +	list_for_each_entry(ctx, &i915->contexts.list, link) {
> +		for_each_engine(engine, i915, id) {
>   			struct intel_context *ce =
>   				to_intel_context(ctx, engine);
> -			u32 *reg;
>   
>   			if (!ce->state)
>   				continue;
>   
> -			reg = i915_gem_object_pin_map(ce->state->obj,
> -						      I915_MAP_WB);
> -			if (WARN_ON(IS_ERR(reg)))
> -				continue;
> -
> -			reg += LRC_STATE_PN * PAGE_SIZE / sizeof(*reg);
> -			reg[CTX_RING_HEAD+1] = 0;
> -			reg[CTX_RING_TAIL+1] = 0;
> +			intel_ring_reset(ce->ring, 0);
>   
> -			ce->state->obj->mm.dirty = true;
> -			i915_gem_object_unpin_map(ce->state->obj);
> +			if (ce->pin_count) { /* otherwise done in context_pin */
> +				u32 *regs = ce->lrc_reg_state;
>   
> -			intel_ring_reset(ce->ring, 0);
> +				regs[CTX_RING_HEAD + 1] = ce->ring->head;
> +				regs[CTX_RING_TAIL + 1] = ce->ring->tail;
> +			}
>   		}
>   	}
>   }
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume
  2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
  2018-09-13 17:32 ` [PATCH 2/2] drm/i915/execlists: Use coherent writes into the context image Chris Wilson
  2018-09-13 17:50 ` [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Tvrtko Ursulin
@ 2018-09-13 17:53 ` Patchwork
  2018-09-13 18:12 ` ✗ Fi.CI.BAT: failure " Patchwork
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2018-09-13 17:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume
URL   : https://patchwork.freedesktop.org/series/49654/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Commit: drm/i915/execlists: Delay updating ring register state after resume
Okay!

Commit: drm/i915/execlists: Use coherent writes into the context image
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3689:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3695:16: warning: expression using sizeof(void)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume
  2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
                   ` (2 preceding siblings ...)
  2018-09-13 17:53 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/2] " Patchwork
@ 2018-09-13 18:12 ` Patchwork
  2018-09-13 20:17 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2) Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2018-09-13 18:12 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume
URL   : https://patchwork.freedesktop.org/series/49654/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4817 -> Patchwork_10174 =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_10174 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_10174, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/49654/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_10174:

  === IGT changes ===

    ==== Possible regressions ====

    igt@drv_selftest@live_contexts:
      fi-bsw-n3050:       PASS -> DMESG-FAIL

    
    ==== Warnings ====

    igt@pm_rpm@module-reload:
      fi-hsw-4770r:       SKIP -> PASS

    
== Known issues ==

  Here are the changes found in Patchwork_10174 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_module_reload@basic-reload:
      fi-blb-e6850:       NOTRUN -> INCOMPLETE (fdo#107718)

    igt@gem_exec_suspend@basic-s3:
      fi-kbl-soraka:      NOTRUN -> INCOMPLETE (fdo#107556, fdo#107774)

    
    ==== Possible fixes ====

    igt@drv_module_reload@basic-reload-inject:
      fi-hsw-4770r:       DMESG-WARN (fdo#107924) -> PASS

    igt@drv_selftest@live_coherency:
      fi-gdg-551:         DMESG-FAIL (fdo#107164) -> PASS

    igt@gem_exec_suspend@basic-s3:
      fi-blb-e6850:       INCOMPLETE (fdo#107718) -> PASS

    
  fdo#107164 https://bugs.freedesktop.org/show_bug.cgi?id=107164
  fdo#107556 https://bugs.freedesktop.org/show_bug.cgi?id=107556
  fdo#107718 https://bugs.freedesktop.org/show_bug.cgi?id=107718
  fdo#107774 https://bugs.freedesktop.org/show_bug.cgi?id=107774
  fdo#107924 https://bugs.freedesktop.org/show_bug.cgi?id=107924


== Participating hosts (48 -> 45) ==

  Additional (1): fi-kbl-soraka 
  Missing    (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u 


== Build changes ==

    * Linux: CI_DRM_4817 -> Patchwork_10174

  CI_DRM_4817: 0725fa5c5756ff86de0f33f10b9d92ac452d5118 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4640: 9a8da36e708f9ed15b20689dfe305e41f9a19008 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10174: c701befaed7dfa1b985984037d6bf5e4299ce391 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c701befaed7d drm/i915/execlists: Use coherent writes into the context image
0c773550b623 drm/i915/execlists: Delay updating ring register state after resume

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10174/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3] drm/i915/execlists: Use coherent writes into the context image
  2018-09-13 17:32 ` [PATCH 2/2] drm/i915/execlists: Use coherent writes into the context image Chris Wilson
@ 2018-09-13 19:33   ` Chris Wilson
  2018-09-14  8:14     ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2018-09-13 19:33 UTC (permalink / raw)
  To: intel-gfx

That we use a WB mapping for updating the RING_TAIL register inside the
context image even on !llc machines has been a source of consternation
for every reader. It appears to work on bsw+, but it may just have been
that we have been incredibly bad at detecting the errors.

v2: With extra enthusiasm.
v3: Drop force of map type for pinned default_state as by the time we
pin it, the map type is always WB and doesn't conflict with the earlier
use by ce->state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h         | 6 ++++++
 drivers/gpu/drm/i915/i915_gem.c         | 2 ++
 drivers/gpu/drm/i915/i915_perf.c        | 4 +++-
 drivers/gpu/drm/i915/intel_lrc.c        | 8 +++++---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
 5 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7ea442033a57..5c833a45682d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3074,6 +3074,12 @@ enum i915_map_type {
 	I915_MAP_FORCE_WC = I915_MAP_WC | I915_MAP_OVERRIDE,
 };
 
+static inline enum i915_map_type
+i915_coherent_map_type(struct drm_i915_private *i915)
+{
+	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
+}
+
 /**
  * i915_gem_object_pin_map - return a contiguous mapping of the entire object
  * @obj: the object to map into kernel address space
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 89834ce19acd..d6f2bbd6a0dc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5417,6 +5417,8 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	for_each_engine(engine, i915, id) {
 		struct i915_vma *state;
 
+		GEM_BUG_ON(to_intel_context(ctx, engine)->pin_count);
+
 		state = to_intel_context(ctx, engine)->state;
 		if (!state)
 			continue;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 3d7a052b4cca..90168ac845c2 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1735,13 +1735,15 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	/* Update all contexts now that we've stalled the submission. */
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
 		struct intel_context *ce = to_intel_context(ctx, engine);
+		unsigned int map_type;
 		u32 *regs;
 
 		/* OA settings will be set upon first use */
 		if (!ce->state)
 			continue;
 
-		regs = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
+		map_type = i915_coherent_map_type(dev_priv);
+		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
 		if (IS_ERR(regs))
 			return PTR_ERR(regs);
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d7fcbba8e982..7b1f322f232b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1294,7 +1294,7 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
 	 * on an active context (which by nature is already on the GPU).
 	 */
 	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
-		err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
+		err = i915_gem_object_set_to_wc_domain(vma->obj, true);
 		if (err)
 			return err;
 	}
@@ -1322,7 +1322,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
 	if (ret)
 		goto err;
 
-	vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
+	vaddr = i915_gem_object_pin_map(ce->state->obj,
+					i915_coherent_map_type(ctx->i915) |
+					I915_MAP_OVERRIDE);
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		goto unpin_vma;
@@ -2753,7 +2755,7 @@ populate_lr_context(struct i915_gem_context *ctx,
 		void *defaults;
 
 		defaults = i915_gem_object_pin_map(engine->default_state,
-						   I915_MAP_WB);
+						   I915_MAP_FORCE_WB);
 		if (IS_ERR(defaults)) {
 			ret = PTR_ERR(defaults);
 			goto err_unpin_ctx;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d0ef50bf930a..1eb68d77b66c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1288,7 +1288,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
 		}
 
 		defaults = i915_gem_object_pin_map(engine->default_state,
-						   I915_MAP_WB);
+						   I915_MAP_FORCE_WB);
 		if (IS_ERR(defaults)) {
 			err = PTR_ERR(defaults);
 			goto err_map;
-- 
2.19.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2)
  2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
                   ` (3 preceding siblings ...)
  2018-09-13 18:12 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2018-09-13 20:17 ` Patchwork
  2018-09-13 20:35 ` ✓ Fi.CI.BAT: success " Patchwork
  2018-09-13 23:02 ` ✓ Fi.CI.IGT: " Patchwork
  6 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2018-09-13 20:17 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2)
URL   : https://patchwork.freedesktop.org/series/49654/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Commit: drm/i915/execlists: Delay updating ring register state after resume
Okay!

Commit: drm/i915/execlists: Use coherent writes into the context image
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3689:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3695:16: warning: expression using sizeof(void)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2)
  2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
                   ` (4 preceding siblings ...)
  2018-09-13 20:17 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2) Patchwork
@ 2018-09-13 20:35 ` Patchwork
  2018-09-13 23:02 ` ✓ Fi.CI.IGT: " Patchwork
  6 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2018-09-13 20:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2)
URL   : https://patchwork.freedesktop.org/series/49654/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4819 -> Patchwork_10176 =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_10176 need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_10176, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/49654/revisions/2/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_10176:

  === IGT changes ===

    ==== Warnings ====

    igt@pm_rpm@module-reload:
      fi-hsw-4770r:       PASS -> SKIP

    
== Known issues ==

  Here are the changes found in Patchwork_10176 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_module_reload@basic-reload-inject:
      fi-hsw-4770r:       PASS -> DMESG-WARN (fdo#107425, fdo#107924)

    igt@gem_exec_suspend@basic-s3:
      fi-blb-e6850:       PASS -> INCOMPLETE (fdo#107718)

    igt@kms_psr@primary_page_flip:
      fi-cfl-s3:          PASS -> FAIL (fdo#107336)
      fi-kbl-r:           PASS -> FAIL (fdo#107336)

    
    ==== Possible fixes ====

    igt@kms_frontbuffer_tracking@basic:
      fi-byt-clapper:     FAIL (fdo#103167) -> PASS

    igt@kms_pipe_crc_basic@read-crc-pipe-a-frame-sequence:
      fi-skl-6700k2:      FAIL (fdo#103191) -> PASS

    igt@kms_psr@primary_page_flip:
      fi-whl-u:           FAIL (fdo#107336) -> PASS

    
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
  fdo#107336 https://bugs.freedesktop.org/show_bug.cgi?id=107336
  fdo#107425 https://bugs.freedesktop.org/show_bug.cgi?id=107425
  fdo#107718 https://bugs.freedesktop.org/show_bug.cgi?id=107718
  fdo#107924 https://bugs.freedesktop.org/show_bug.cgi?id=107924


== Participating hosts (48 -> 45) ==

  Additional (1): fi-icl-u 
  Missing    (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u 


== Build changes ==

    * Linux: CI_DRM_4819 -> Patchwork_10176

  CI_DRM_4819: 974889a04616f44de2f342dfdc9753b6dad8de88 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4640: 9a8da36e708f9ed15b20689dfe305e41f9a19008 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10176: 727e57a655b11345ea5681cf75a691d2440669c6 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

727e57a655b1 drm/i915/execlists: Use coherent writes into the context image
de52a519af9f drm/i915/execlists: Delay updating ring register state after resume

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10176/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2)
  2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
                   ` (5 preceding siblings ...)
  2018-09-13 20:35 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-09-13 23:02 ` Patchwork
  6 siblings, 0 replies; 15+ messages in thread
From: Patchwork @ 2018-09-13 23:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2)
URL   : https://patchwork.freedesktop.org/series/49654/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4819_full -> Patchwork_10176_full =

== Summary - SUCCESS ==

  No regressions found.

  

== Known issues ==

  Here are the changes found in Patchwork_10176_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_exec_schedule@preempt-other-chain-vebox:
      shard-snb:          SKIP -> INCOMPLETE (fdo#105411)

    igt@gem_userptr_blits@readonly-unsync:
      shard-glk:          PASS -> INCOMPLETE (k.org#198133, fdo#103359)

    igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-shrfb-draw-mmap-gtt:
      shard-glk:          PASS -> FAIL (fdo#103167)

    igt@kms_setmode@basic:
      shard-apl:          PASS -> FAIL (fdo#99912)
      shard-kbl:          PASS -> FAIL (fdo#99912)

    
    ==== Possible fixes ====

    igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-blt:
      shard-glk:          DMESG-WARN (fdo#105763) -> PASS +3

    igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-pri-indfb-draw-render:
      shard-glk:          DMESG-FAIL -> PASS

    
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#103359 https://bugs.freedesktop.org/show_bug.cgi?id=103359
  fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
  fdo#105763 https://bugs.freedesktop.org/show_bug.cgi?id=105763
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
  k.org#198133 https://bugzilla.kernel.org/show_bug.cgi?id=198133


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4819 -> Patchwork_10176

  CI_DRM_4819: 974889a04616f44de2f342dfdc9753b6dad8de88 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4640: 9a8da36e708f9ed15b20689dfe305e41f9a19008 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10176: 727e57a655b11345ea5681cf75a691d2440669c6 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10176/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] drm/i915/execlists: Use coherent writes into the context image
  2018-09-13 19:33   ` [PATCH v3] " Chris Wilson
@ 2018-09-14  8:14     ` Tvrtko Ursulin
  2018-09-14  8:21       ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-09-14  8:14 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 13/09/2018 20:33, Chris Wilson wrote:
> That we use a WB mapping for updating the RING_TAIL register inside the
> context image even on !llc machines has been a source of consternation
> for every reader. It appears to work on bsw+, but it may just have been
> that we have been incredibly bad at detecting the errors.
> 
> v2: With extra enthusiasm.
> v3: Drop force of map type for pinned default_state as by the time we
> pin it, the map type is always WB and doesn't conflict with the earlier
> use by ce->state.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         | 6 ++++++
>   drivers/gpu/drm/i915/i915_gem.c         | 2 ++
>   drivers/gpu/drm/i915/i915_perf.c        | 4 +++-
>   drivers/gpu/drm/i915/intel_lrc.c        | 8 +++++---
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +-
>   5 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7ea442033a57..5c833a45682d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3074,6 +3074,12 @@ enum i915_map_type {
>   	I915_MAP_FORCE_WC = I915_MAP_WC | I915_MAP_OVERRIDE,
>   };
>   
> +static inline enum i915_map_type
> +i915_coherent_map_type(struct drm_i915_private *i915)
> +{
> +	return HAS_LLC(i915) ? I915_MAP_WB : I915_MAP_WC;
> +}
> +
>   /**
>    * i915_gem_object_pin_map - return a contiguous mapping of the entire object
>    * @obj: the object to map into kernel address space
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 89834ce19acd..d6f2bbd6a0dc 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5417,6 +5417,8 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>   	for_each_engine(engine, i915, id) {
>   		struct i915_vma *state;
>   
> +		GEM_BUG_ON(to_intel_context(ctx, engine)->pin_count);
> +
>   		state = to_intel_context(ctx, engine)->state;
>   		if (!state)
>   			continue;
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 3d7a052b4cca..90168ac845c2 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -1735,13 +1735,15 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
>   	/* Update all contexts now that we've stalled the submission. */
>   	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
>   		struct intel_context *ce = to_intel_context(ctx, engine);
> +		unsigned int map_type;
>   		u32 *regs;
>   
>   		/* OA settings will be set upon first use */
>   		if (!ce->state)
>   			continue;
>   
> -		regs = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
> +		map_type = i915_coherent_map_type(dev_priv);

Local for line length only? Could move it out of the loop as well to 
cache it if you feel like it.

> +		regs = i915_gem_object_pin_map(ce->state->obj, map_type);
>   		if (IS_ERR(regs))
>   			return PTR_ERR(regs);
>   
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d7fcbba8e982..7b1f322f232b 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1294,7 +1294,7 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
>   	 * on an active context (which by nature is already on the GPU).
>   	 */
>   	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
> -		err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
> +		err = i915_gem_object_set_to_wc_domain(vma->obj, true);

I am still confused by this. Cache flushing effects of the old and new 
call seem the same due object being in CPU write domain at this point. 
What changes is that it will be marked differently from this point one. 
Does that come into play later in the objects lifetime and where?

>   		if (err)
>   			return err;
>   	}
> @@ -1322,7 +1322,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
>   	if (ret)
>   		goto err;
>   
> -	vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
> +	vaddr = i915_gem_object_pin_map(ce->state->obj,
> +					i915_coherent_map_type(ctx->i915) |
> +					I915_MAP_OVERRIDE);

Override MAP_WB from populate_lr_context - OK I think.

>   	if (IS_ERR(vaddr)) {
>   		ret = PTR_ERR(vaddr);
>   		goto unpin_vma;
> @@ -2753,7 +2755,7 @@ populate_lr_context(struct i915_gem_context *ctx,
>   		void *defaults;
>   
>   		defaults = i915_gem_object_pin_map(engine->default_state,
> -						   I915_MAP_WB);
> +						   I915_MAP_FORCE_WB);
>   		if (IS_ERR(defaults)) {
>   			ret = PTR_ERR(defaults);
>   			goto err_unpin_ctx;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d0ef50bf930a..1eb68d77b66c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1288,7 +1288,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
>   		}
>   
>   		defaults = i915_gem_object_pin_map(engine->default_state,
> -						   I915_MAP_WB);
> +						   I915_MAP_FORCE_WB);
>   		if (IS_ERR(defaults)) {
>   			err = PTR_ERR(defaults);
>   			goto err_map;
> 

These two do not need to be changed AFAICT.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] drm/i915/execlists: Use coherent writes into the context image
  2018-09-14  8:14     ` Tvrtko Ursulin
@ 2018-09-14  8:21       ` Chris Wilson
  2018-09-14  9:12         ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2018-09-14  8:21 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-09-14 09:14:54)
> 
> On 13/09/2018 20:33, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index d7fcbba8e982..7b1f322f232b 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -1294,7 +1294,7 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
> >        * on an active context (which by nature is already on the GPU).
> >        */
> >       if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
> > -             err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
> > +             err = i915_gem_object_set_to_wc_domain(vma->obj, true);
> 
> I am still confused by this. Cache flushing effects of the old and new 
> call seem the same due object being in CPU write domain at this point. 
> What changes is that it will be marked differently from this point one. 
> Does that come into play later in the objects lifetime and where?

No, just taking the opportunity to use a more correct domain now that it
exists and logically ties in with using WC.

> >               if (err)
> >                       return err;
> >       }
> > @@ -1322,7 +1322,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
> >       if (ret)
> >               goto err;
> >   
> > -     vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
> > +     vaddr = i915_gem_object_pin_map(ce->state->obj,
> > +                                     i915_coherent_map_type(ctx->i915) |
> > +                                     I915_MAP_OVERRIDE);
> 
> Override MAP_WB from populate_lr_context - OK I think.
> 
> >       if (IS_ERR(vaddr)) {
> >               ret = PTR_ERR(vaddr);
> >               goto unpin_vma;
> > @@ -2753,7 +2755,7 @@ populate_lr_context(struct i915_gem_context *ctx,
> >               void *defaults;
> >   
> >               defaults = i915_gem_object_pin_map(engine->default_state,
> > -                                                I915_MAP_WB);
> > +                                                I915_MAP_FORCE_WB);
> >               if (IS_ERR(defaults)) {
> >                       ret = PTR_ERR(defaults);
> >                       goto err_unpin_ctx;
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index d0ef50bf930a..1eb68d77b66c 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1288,7 +1288,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
> >               }
> >   
> >               defaults = i915_gem_object_pin_map(engine->default_state,
> > -                                                I915_MAP_WB);
> > +                                                I915_MAP_FORCE_WB);
> >               if (IS_ERR(defaults)) {
> >                       err = PTR_ERR(defaults);
> >                       goto err_map;
> > 
> 
> These two do not need to be changed AFAICT.

I think we cannot rely on engine->default_state always being MAP_WB
already at this point, due to not having an idle cycle between creation
of engine->default_state on module_load and first use.

Having thought of that last night, I did mean to add a call to
__i915_gem_park() during init so we forced ourselves to idle.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] drm/i915/execlists: Use coherent writes into the context image
  2018-09-14  8:21       ` Chris Wilson
@ 2018-09-14  9:12         ` Tvrtko Ursulin
  2018-09-14  9:17           ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-09-14  9:12 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 14/09/2018 09:21, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-09-14 09:14:54)
>>
>> On 13/09/2018 20:33, Chris Wilson wrote:
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>> index d7fcbba8e982..7b1f322f232b 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -1294,7 +1294,7 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
>>>         * on an active context (which by nature is already on the GPU).
>>>         */
>>>        if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
>>> -             err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
>>> +             err = i915_gem_object_set_to_wc_domain(vma->obj, true);
>>
>> I am still confused by this. Cache flushing effects of the old and new
>> call seem the same due object being in CPU write domain at this point.
>> What changes is that it will be marked differently from this point one.
>> Does that come into play later in the objects lifetime and where?
> 
> No, just taking the opportunity to use a more correct domain now that it
> exists and logically ties in with using WC.

Ok.

>>>                if (err)
>>>                        return err;
>>>        }
>>> @@ -1322,7 +1322,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
>>>        if (ret)
>>>                goto err;
>>>    
>>> -     vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
>>> +     vaddr = i915_gem_object_pin_map(ce->state->obj,
>>> +                                     i915_coherent_map_type(ctx->i915) |
>>> +                                     I915_MAP_OVERRIDE);
>>
>> Override MAP_WB from populate_lr_context - OK I think.
>>
>>>        if (IS_ERR(vaddr)) {
>>>                ret = PTR_ERR(vaddr);
>>>                goto unpin_vma;
>>> @@ -2753,7 +2755,7 @@ populate_lr_context(struct i915_gem_context *ctx,
>>>                void *defaults;
>>>    
>>>                defaults = i915_gem_object_pin_map(engine->default_state,
>>> -                                                I915_MAP_WB);
>>> +                                                I915_MAP_FORCE_WB);
>>>                if (IS_ERR(defaults)) {
>>>                        ret = PTR_ERR(defaults);
>>>                        goto err_unpin_ctx;
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> index d0ef50bf930a..1eb68d77b66c 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> @@ -1288,7 +1288,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
>>>                }
>>>    
>>>                defaults = i915_gem_object_pin_map(engine->default_state,
>>> -                                                I915_MAP_WB);
>>> +                                                I915_MAP_FORCE_WB);
>>>                if (IS_ERR(defaults)) {
>>>                        err = PTR_ERR(defaults);
>>>                        goto err_map;
>>>
>>
>> These two do not need to be changed AFAICT.
> 
> I think we cannot rely on engine->default_state always being MAP_WB
> already at this point, due to not having an idle cycle between creation
> of engine->default_state on module_load and first use.
 >
 > Having thought of that last night, I did mean to add a call to
 > __i915_gem_park() during init so we forced ourselves to idle.

I don't follow - all places where we map it use MAP_WB. Isn't force flag 
just to override the existing different mapping?

Regards,

Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] drm/i915/execlists: Use coherent writes into the context image
  2018-09-14  9:12         ` Tvrtko Ursulin
@ 2018-09-14  9:17           ` Chris Wilson
  2018-09-14  9:25             ` Tvrtko Ursulin
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2018-09-14  9:17 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-09-14 10:12:15)
> 
> On 14/09/2018 09:21, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-09-14 09:14:54)
> >>
> >> On 13/09/2018 20:33, Chris Wilson wrote:
> >>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >>> index d7fcbba8e982..7b1f322f232b 100644
> >>> --- a/drivers/gpu/drm/i915/intel_lrc.c
> >>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> >>> @@ -1294,7 +1294,7 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
> >>>         * on an active context (which by nature is already on the GPU).
> >>>         */
> >>>        if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
> >>> -             err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
> >>> +             err = i915_gem_object_set_to_wc_domain(vma->obj, true);
> >>
> >> I am still confused by this. Cache flushing effects of the old and new
> >> call seem the same due object being in CPU write domain at this point.
> >> What changes is that it will be marked differently from this point one.
> >> Does that come into play later in the objects lifetime and where?
> > 
> > No, just taking the opportunity to use a more correct domain now that it
> > exists and logically ties in with using WC.
> 
> Ok.
> 
> >>>                if (err)
> >>>                        return err;
> >>>        }
> >>> @@ -1322,7 +1322,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
> >>>        if (ret)
> >>>                goto err;
> >>>    
> >>> -     vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
> >>> +     vaddr = i915_gem_object_pin_map(ce->state->obj,
> >>> +                                     i915_coherent_map_type(ctx->i915) |
> >>> +                                     I915_MAP_OVERRIDE);
> >>
> >> Override MAP_WB from populate_lr_context - OK I think.
> >>
> >>>        if (IS_ERR(vaddr)) {
> >>>                ret = PTR_ERR(vaddr);
> >>>                goto unpin_vma;
> >>> @@ -2753,7 +2755,7 @@ populate_lr_context(struct i915_gem_context *ctx,
> >>>                void *defaults;
> >>>    
> >>>                defaults = i915_gem_object_pin_map(engine->default_state,
> >>> -                                                I915_MAP_WB);
> >>> +                                                I915_MAP_FORCE_WB);
> >>>                if (IS_ERR(defaults)) {
> >>>                        ret = PTR_ERR(defaults);
> >>>                        goto err_unpin_ctx;
> >>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>> index d0ef50bf930a..1eb68d77b66c 100644
> >>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>> @@ -1288,7 +1288,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
> >>>                }
> >>>    
> >>>                defaults = i915_gem_object_pin_map(engine->default_state,
> >>> -                                                I915_MAP_WB);
> >>> +                                                I915_MAP_FORCE_WB);
> >>>                if (IS_ERR(defaults)) {
> >>>                        err = PTR_ERR(defaults);
> >>>                        goto err_map;
> >>>
> >>
> >> These two do not need to be changed AFAICT.
> > 
> > I think we cannot rely on engine->default_state always being MAP_WB
> > already at this point, due to not having an idle cycle between creation
> > of engine->default_state on module_load and first use.
>  >
>  > Having thought of that last night, I did mean to add a call to
>  > __i915_gem_park() during init so we forced ourselves to idle.
> 
> I don't follow - all places where we map it use MAP_WB. Isn't force flag 
> just to override the existing different mapping?

Yes, but we may not have done the force from MAP_WC to MAP_WB in
i915_gem_unpark->intel_engines_unpark by this point, so
engine->default_state may still have a WC mapping on it. To be sure, we
need to validate that we can acquire that mapping on init.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3] drm/i915/execlists: Use coherent writes into the context image
  2018-09-14  9:17           ` Chris Wilson
@ 2018-09-14  9:25             ` Tvrtko Ursulin
  0 siblings, 0 replies; 15+ messages in thread
From: Tvrtko Ursulin @ 2018-09-14  9:25 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 14/09/2018 10:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-09-14 10:12:15)
>>
>> On 14/09/2018 09:21, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-09-14 09:14:54)
>>>>
>>>> On 13/09/2018 20:33, Chris Wilson wrote:
>>>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>>>>> index d7fcbba8e982..7b1f322f232b 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>>>> @@ -1294,7 +1294,7 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
>>>>>          * on an active context (which by nature is already on the GPU).
>>>>>          */
>>>>>         if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
>>>>> -             err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
>>>>> +             err = i915_gem_object_set_to_wc_domain(vma->obj, true);
>>>>
>>>> I am still confused by this. Cache flushing effects of the old and new
>>>> call seem the same due object being in CPU write domain at this point.
>>>> What changes is that it will be marked differently from this point one.
>>>> Does that come into play later in the objects lifetime and where?
>>>
>>> No, just taking the opportunity to use a more correct domain now that it
>>> exists and logically ties in with using WC.
>>
>> Ok.
>>
>>>>>                 if (err)
>>>>>                         return err;
>>>>>         }
>>>>> @@ -1322,7 +1322,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
>>>>>         if (ret)
>>>>>                 goto err;
>>>>>     
>>>>> -     vaddr = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
>>>>> +     vaddr = i915_gem_object_pin_map(ce->state->obj,
>>>>> +                                     i915_coherent_map_type(ctx->i915) |
>>>>> +                                     I915_MAP_OVERRIDE);
>>>>
>>>> Override MAP_WB from populate_lr_context - OK I think.
>>>>
>>>>>         if (IS_ERR(vaddr)) {
>>>>>                 ret = PTR_ERR(vaddr);
>>>>>                 goto unpin_vma;
>>>>> @@ -2753,7 +2755,7 @@ populate_lr_context(struct i915_gem_context *ctx,
>>>>>                 void *defaults;
>>>>>     
>>>>>                 defaults = i915_gem_object_pin_map(engine->default_state,
>>>>> -                                                I915_MAP_WB);
>>>>> +                                                I915_MAP_FORCE_WB);
>>>>>                 if (IS_ERR(defaults)) {
>>>>>                         ret = PTR_ERR(defaults);
>>>>>                         goto err_unpin_ctx;
>>>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>>>> index d0ef50bf930a..1eb68d77b66c 100644
>>>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>>>> @@ -1288,7 +1288,7 @@ alloc_context_vma(struct intel_engine_cs *engine)
>>>>>                 }
>>>>>     
>>>>>                 defaults = i915_gem_object_pin_map(engine->default_state,
>>>>> -                                                I915_MAP_WB);
>>>>> +                                                I915_MAP_FORCE_WB);
>>>>>                 if (IS_ERR(defaults)) {
>>>>>                         err = PTR_ERR(defaults);
>>>>>                         goto err_map;
>>>>>
>>>>
>>>> These two do not need to be changed AFAICT.
>>>
>>> I think we cannot rely on engine->default_state always being MAP_WB
>>> already at this point, due to not having an idle cycle between creation
>>> of engine->default_state on module_load and first use.
>>   >
>>   > Having thought of that last night, I did mean to add a call to
>>   > __i915_gem_park() during init so we forced ourselves to idle.
>>
>> I don't follow - all places where we map it use MAP_WB. Isn't force flag
>> just to override the existing different mapping?
> 
> Yes, but we may not have done the force from MAP_WC to MAP_WB in
> i915_gem_unpark->intel_engines_unpark by this point, so
> engine->default_state may still have a WC mapping on it. To be sure, we
> need to validate that we can acquire that mapping on init.

Okay I missed the fact we just transfer the engine->default_state object 
ownership from ce->state->obj. Somehow I assumed it is a dedicated buffer.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume
@ 2018-08-16 11:11 Chris Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2018-08-16 11:11 UTC (permalink / raw)
  To: intel-gfx

Now that we reload both RING_HEAD and RING_TAIL when rebinding the
context, we do not need to scrub those registers immediately on resume.

v2: Handle the perma-pinned contexts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 29 +++++++++++------------------
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 841895cfb05f..3632521a5fb2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2775,13 +2775,14 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 	return ret;
 }
 
-void intel_lr_context_resume(struct drm_i915_private *dev_priv)
+void intel_lr_context_resume(struct drm_i915_private *i915)
 {
 	struct intel_engine_cs *engine;
 	struct i915_gem_context *ctx;
 	enum intel_engine_id id;
 
-	/* Because we emit WA_TAIL_DWORDS there may be a disparity
+	/*
+	 * Because we emit WA_TAIL_DWORDS there may be a disparity
 	 * between our bookkeeping in ce->ring->head and ce->ring->tail and
 	 * that stored in context. As we only write new commands from
 	 * ce->ring->tail onwards, everything before that is junk. If the GPU
@@ -2791,28 +2792,20 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv)
 	 * So to avoid that we reset the context images upon resume. For
 	 * simplicity, we just zero everything out.
 	 */
-	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
-		for_each_engine(engine, dev_priv, id) {
+	list_for_each_entry(ctx, &i915->contexts.list, link) {
+		for_each_engine(engine, i915, id) {
 			struct intel_context *ce =
 				to_intel_context(ctx, engine);
-			u32 *reg;
-
-			if (!ce->state)
-				continue;
 
-			reg = i915_gem_object_pin_map(ce->state->obj,
-						      I915_MAP_WB);
-			if (WARN_ON(IS_ERR(reg)))
+			if (!ce->ring)
 				continue;
 
-			reg += LRC_STATE_PN * PAGE_SIZE / sizeof(*reg);
-			reg[CTX_RING_HEAD+1] = 0;
-			reg[CTX_RING_TAIL+1] = 0;
-
-			ce->state->obj->mm.dirty = true;
-			i915_gem_object_unpin_map(ce->state->obj);
-
 			intel_ring_reset(ce->ring, 0);
+
+			if (ce->pin_count) { /* otherwise done in context_pin */
+				ce->lrc_reg_state[CTX_RING_HEAD+1] = 0;
+				ce->lrc_reg_state[CTX_RING_TAIL+1] = 0;
+			}
 		}
 	}
 }
-- 
2.18.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-09-14  9:25 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-13 17:32 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson
2018-09-13 17:32 ` [PATCH 2/2] drm/i915/execlists: Use coherent writes into the context image Chris Wilson
2018-09-13 19:33   ` [PATCH v3] " Chris Wilson
2018-09-14  8:14     ` Tvrtko Ursulin
2018-09-14  8:21       ` Chris Wilson
2018-09-14  9:12         ` Tvrtko Ursulin
2018-09-14  9:17           ` Chris Wilson
2018-09-14  9:25             ` Tvrtko Ursulin
2018-09-13 17:50 ` [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Tvrtko Ursulin
2018-09-13 17:53 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/2] " Patchwork
2018-09-13 18:12 ` ✗ Fi.CI.BAT: failure " Patchwork
2018-09-13 20:17 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/2] drm/i915/execlists: Delay updating ring register state after resume (rev2) Patchwork
2018-09-13 20:35 ` ✓ Fi.CI.BAT: success " Patchwork
2018-09-13 23:02 ` ✓ Fi.CI.IGT: " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2018-08-16 11:11 [PATCH 1/2] drm/i915/execlists: Delay updating ring register state after resume Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.