All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/3] drm/i915: Skip the ERR_PTR error state
@ 2018-12-06  8:44 Chris Wilson
  2018-12-06  8:44 ` [PATCH 2/3] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Chris Wilson @ 2018-12-06  8:44 UTC (permalink / raw)
  To: intel-gfx

Although commit fb6f0b64e455 ("drm/i915: Prevent machine hang from
Broxton's vtd w/a and error capture") applied cleanly after a 24 month
hiatus, the code had moved on with new methods for peeking and fetching
the captured gpu info. Make sure we catch all uses of the stashed error
state and avoid dereferencing the error pointer.

Fixes: fb6f0b64e455 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c   | 11 ++++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c | 12 +++++++++---
 drivers/gpu/drm/i915/i915_sysfs.c     |  3 +++
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 38dcee1ca062..f8aa8586c9a6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -984,6 +984,9 @@ static int i915_gpu_info_open(struct inode *inode, struct file *file)
 	intel_runtime_pm_get(i915);
 	gpu = i915_capture_gpu_state(i915);
 	intel_runtime_pm_put(i915);
+
+	if (IS_ERR(gpu))
+		return PTR_ERR(gpu);
 	if (!gpu)
 		return -ENOMEM;
 
@@ -1018,7 +1021,13 @@ i915_error_state_write(struct file *filp,
 
 static int i915_error_state_open(struct inode *inode, struct file *file)
 {
-	file->private_data = i915_first_error_state(inode->i_private);
+	struct i915_gpu_state *error;
+
+	error = i915_first_error_state(inode->i_private);
+	if (IS_ERR(error))
+		return PTR_ERR(error);
+
+	file->private_data  = error;
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 07465123c166..7ca6f3f31d41 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1907,6 +1907,11 @@ i915_capture_gpu_state(struct drm_i915_private *i915)
 {
 	struct i915_gpu_state *error;
 
+	/* Check if GPU capture has been disabled */
+	error = READ_ONCE(i915->gpu_error.first_error);
+	if (IS_ERR(error))
+		return error;
+
 	error = kzalloc(sizeof(*error), GFP_ATOMIC);
 	if (!error)
 		return NULL;
@@ -1987,7 +1992,7 @@ i915_first_error_state(struct drm_i915_private *i915)
 
 	spin_lock_irq(&i915->gpu_error.lock);
 	error = i915->gpu_error.first_error;
-	if (error)
+	if (!IS_ERR_OR_NULL(error))
 		i915_gpu_state_get(error);
 	spin_unlock_irq(&i915->gpu_error.lock);
 
@@ -2000,10 +2005,11 @@ void i915_reset_error_state(struct drm_i915_private *i915)
 
 	spin_lock_irq(&i915->gpu_error.lock);
 	error = i915->gpu_error.first_error;
-	i915->gpu_error.first_error = NULL;
+	if (!IS_ERR_OR_NULL(error)) /* once disabled, always disabled */
+		i915->gpu_error.first_error = NULL;
 	spin_unlock_irq(&i915->gpu_error.lock);
 
-	if (!IS_ERR(error))
+	if (!IS_ERR_OR_NULL(error))
 		i915_gpu_state_put(error);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 535caebd9813..370b7497e9df 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -521,6 +521,9 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
 	ssize_t ret;
 
 	gpu = i915_first_error_state(i915);
+	if (IS_ERR(gpu))
+		return PTR_ERR(gpu);
+
 	if (gpu) {
 		ret = i915_gpu_state_copy_to_buffer(gpu, buf, off, count);
 		i915_gpu_state_put(gpu);
-- 
2.20.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/3] drm/i915: Pipeline PDP updates for Braswell
  2018-12-06  8:44 [PATCH 1/3] drm/i915: Skip the ERR_PTR error state Chris Wilson
@ 2018-12-06  8:44 ` Chris Wilson
  2018-12-06 13:10   ` Tvrtko Ursulin
  2018-12-06  8:44 ` [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution " Chris Wilson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-12-06  8:44 UTC (permalink / raw)
  To: intel-gfx

Currently we face a severe problem on Braswell that manifests as invalid
ppGTT accesses. The code tries to maintain the PDP (page directory
pointers) inside the context in two ways, direct write into the context
and a pipelined LRI update. The direct write into the context is
fundamentally racy as it is unserialised with any access (read or write)
the GPU is doing. By asserting that Braswell is not used with vGPU
(currently an unsupported platform) we can eliminate the dangerous
direct write into the context image and solely use the pipelined update.

However, the LRI of the PDP fouls up the GPU, causing it to freeze and
take out the machine with "forcewake ack timeouts". This seems possible
to workaround by preventing the GPU from sleeping (via means of
disabling the power-state management interface, i.e. forcing each ring
to remain awake) around the update.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
References: https://bugs.freedesktop.org/show_bug.cgi?id=108714
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c     |   2 -
 drivers/gpu/drm/i915/i915_request.c     |   5 -
 drivers/gpu/drm/i915/intel_lrc.c        | 143 ++++++++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.c |   5 +-
 4 files changed, 74 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index add1fe7aeb93..62bde517d383 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1423,8 +1423,6 @@ static int gen8_ppgtt_alloc_pdp(struct i915_address_space *vm,
 			gen8_initialize_pd(vm, pd);
 			gen8_ppgtt_set_pdpe(vm, pdp, pd, pdpe);
 			GEM_BUG_ON(pdp->used_pdpes > i915_pdpes_per_pdp(vm));
-
-			mark_tlbs_dirty(i915_vm_to_ppgtt(vm));
 		}
 
 		ret = gen8_ppgtt_alloc_pd(vm, pd, start, length);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ca95ab2f4cfa..8ab8e8e6a086 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -719,11 +719,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	rq->head = rq->ring->emit;
 
-	/* Unconditionally invalidate GPU caches and TLBs. */
-	ret = engine->emit_flush(rq, EMIT_INVALIDATE);
-	if (ret)
-		goto err_unwind;
-
 	ret = engine->request_alloc(rq);
 	if (ret)
 		goto err_unwind;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d7fa301b5ec7..de1e9dc6aec0 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -363,31 +363,12 @@ execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
 	trace_i915_request_out(rq);
 }
 
-static void
-execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
-{
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
-}
-
 static u64 execlists_update_context(struct i915_request *rq)
 {
-	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
 	struct intel_context *ce = rq->hw_context;
-	u32 *reg_state = ce->lrc_reg_state;
-
-	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
 
-	/*
-	 * True 32b PPGTT with dynamic page allocation: update PDP
-	 * registers and point the unallocated PDPs to scratch page.
-	 * PML4 is allocated during ppgtt init, so this is not needed
-	 * in 48-bit mode.
-	 */
-	if (!i915_vm_is_48bit(&ppgtt->vm))
-		execlists_update_context_pdps(ppgtt, reg_state);
+	ce->lrc_reg_state[CTX_RING_TAIL + 1] =
+		intel_ring_set_tail(rq->ring, rq->tail);
 
 	/*
 	 * Make sure the context image is complete before we submit it to HW.
@@ -1242,29 +1223,86 @@ execlists_context_pin(struct intel_engine_cs *engine,
 	return __execlists_context_pin(engine, ctx, ce);
 }
 
+static int emit_pdps(struct i915_request *rq)
+{
+	const struct intel_engine_cs * const engine = rq->engine;
+	struct i915_hw_ppgtt * const ppgtt = rq->gem_context->ppgtt;
+	int err, i;
+	u32 *cs;
+
+	/*
+	 * Beware ye of the dragons, this sequence is magic!
+	 *
+	 * Small changes to this sequence can cause anything from
+	 * GPU hangs to forcewake errors and machine lockups!
+	 */
+
+	err = engine->emit_flush(rq, EMIT_FLUSH);
+	if (err)
+		return err;
+
+	err = engine->emit_flush(rq, EMIT_INVALIDATE);
+	if (err)
+		return err;
+
+	cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/* Ensure the LRI have landed before we invalidate & continue */
+	*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;
+	for (i = GEN8_3LVL_PDPES; i--; ) {
+		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
+
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
+		*cs++ = upper_32_bits(pd_daddr);
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
+		*cs++ = lower_32_bits(pd_daddr);
+	}
+	*cs++ = MI_NOOP;
+
+	intel_ring_advance(rq, cs);
+
+	err = engine->emit_flush(rq, EMIT_FLUSH);
+	if (err)
+		return err;
+
+	return engine->emit_flush(rq, EMIT_INVALIDATE);
+}
+
 static int execlists_request_alloc(struct i915_request *request)
 {
 	int ret;
 
 	GEM_BUG_ON(!request->hw_context->pin_count);
 
-	/* Flush enough space to reduce the likelihood of waiting after
+	/*
+	 * Flush enough space to reduce the likelihood of waiting after
 	 * we start building the request - in which case we will just
 	 * have to repeat work.
 	 */
 	request->reserved_space += EXECLISTS_REQUEST_SIZE;
 
-	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
-	if (ret)
-		return ret;
-
-	/* Note that after this point, we have committed to using
+	/*
+	 * Note that after this point, we have committed to using
 	 * this request as it is being used to both track the
 	 * state of engine initialisation and liveness of the
 	 * golden renderstate above. Think twice before you try
 	 * to cancel/unwind this request now.
 	 */
 
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	if (i915_vm_is_48bit(&request->gem_context->ppgtt->vm)) {
+		ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
+		if (ret)
+			return ret;
+	} else {
+		GEM_BUG_ON(intel_vgpu_active(request->i915));
+		ret = emit_pdps(request);
+		if (ret)
+			return ret;
+	}
+
 	request->reserved_space -= EXECLISTS_REQUEST_SIZE;
 	return 0;
 }
@@ -1836,56 +1874,11 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
 		  atomic_read(&execlists->tasklet.count));
 }
 
-static int intel_logical_ring_emit_pdps(struct i915_request *rq)
-{
-	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
-	struct intel_engine_cs *engine = rq->engine;
-	const int num_lri_cmds = GEN8_3LVL_PDPES * 2;
-	u32 *cs;
-	int i;
-
-	cs = intel_ring_begin(rq, num_lri_cmds * 2 + 2);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_LOAD_REGISTER_IMM(num_lri_cmds);
-	for (i = GEN8_3LVL_PDPES - 1; i >= 0; i--) {
-		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
-
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
-		*cs++ = upper_32_bits(pd_daddr);
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
-		*cs++ = lower_32_bits(pd_daddr);
-	}
-
-	*cs++ = MI_NOOP;
-	intel_ring_advance(rq, cs);
-
-	return 0;
-}
-
 static int gen8_emit_bb_start(struct i915_request *rq,
 			      u64 offset, u32 len,
 			      const unsigned int flags)
 {
 	u32 *cs;
-	int ret;
-
-	/* Don't rely in hw updating PDPs, specially in lite-restore.
-	 * Ideally, we should set Force PD Restore in ctx descriptor,
-	 * but we can't. Force Restore would be a second option, but
-	 * it is unsafe in case of lite-restore (because the ctx is
-	 * not idle). PML4 is allocated during ppgtt init so this is
-	 * not needed in 48-bit.*/
-	if ((intel_engine_flag(rq->engine) & rq->gem_context->ppgtt->pd_dirty_rings) &&
-	    !i915_vm_is_48bit(&rq->gem_context->ppgtt->vm) &&
-	    !intel_vgpu_active(rq->i915)) {
-		ret = intel_logical_ring_emit_pdps(rq);
-		if (ret)
-			return ret;
-
-		rq->gem_context->ppgtt->pd_dirty_rings &= ~intel_engine_flag(rq->engine);
-	}
 
 	cs = intel_ring_begin(rq, 6);
 	if (IS_ERR(cs))
@@ -1918,6 +1911,7 @@ static int gen8_emit_bb_start(struct i915_request *rq,
 
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
 	*cs++ = MI_NOOP;
+
 	intel_ring_advance(rq, cs);
 
 	return 0;
@@ -2533,6 +2527,11 @@ static void execlists_init_reg_state(u32 *regs,
 		 * other PDP Descriptors are ignored.
 		 */
 		ASSIGN_CTX_PML4(ctx->ppgtt, regs);
+	} else {
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 3);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 2);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 1);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 0);
 	}
 
 	if (rcs) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c5eb26a7ee79..ad5b1fc79498 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1826,11 +1826,12 @@ static int ring_request_alloc(struct i915_request *request)
 	 */
 	request->reserved_space += LEGACY_REQUEST_SIZE;
 
-	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
+	ret = switch_context(request);
 	if (ret)
 		return ret;
 
-	ret = switch_context(request);
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
 	if (ret)
 		return ret;
 
-- 
2.20.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution for Braswell
  2018-12-06  8:44 [PATCH 1/3] drm/i915: Skip the ERR_PTR error state Chris Wilson
  2018-12-06  8:44 ` [PATCH 2/3] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
@ 2018-12-06  8:44 ` Chris Wilson
  2018-12-06 13:12   ` [Intel-gfx] " Tvrtko Ursulin
  2018-12-06  8:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915: Skip the ERR_PTR error state Patchwork
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2018-12-06  8:44 UTC (permalink / raw)
  To: intel-gfx
  Cc: tvrtko.ursulin, Chris Wilson, Mika Kuoppala, Joonas Lahtinen, stable

Braswell is really picky about having our writes posted to memory before
we execute or else the GPU may see stale values. A wmb() is insufficient
as it only ensures the writes are visible to other cores, we need a full
mb() to ensure the writes are in memory and visible to the GPU.

The most frequent failure in flushing before execution is that we see
stale PTE values and execute the wrong pages.

References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index de1e9dc6aec0..e6a86fa4502d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -379,8 +379,13 @@ static u64 execlists_update_context(struct i915_request *rq)
 	 * may not be visible to the HW prior to the completion of the UC
 	 * register write and that we may begin execution from the context
 	 * before its image is complete leading to invalid PD chasing.
+	 *
+	 * Furthermore, Braswell, at least, wants a full mb to be sure that
+	 * the writes are coherent in memory (visible to the GPU) prior to
+	 * execution, and not just visible to other CPUs (as is the result of
+	 * wmb).
 	 */
-	wmb();
+	mb();
 	return ce->lrc_desc;
 }
 
-- 
2.20.0.rc2

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915: Skip the ERR_PTR error state
  2018-12-06  8:44 [PATCH 1/3] drm/i915: Skip the ERR_PTR error state Chris Wilson
  2018-12-06  8:44 ` [PATCH 2/3] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
  2018-12-06  8:44 ` [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution " Chris Wilson
@ 2018-12-06  8:58 ` Patchwork
  2018-12-06  9:15 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-12-06  8:58 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Skip the ERR_PTR error state
URL   : https://patchwork.freedesktop.org/series/53625/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
47d5e8ffa422 drm/i915: Skip the ERR_PTR error state
26553eeb0969 drm/i915: Pipeline PDP updates for Braswell
e14c4601fdfc drm/i915/execlists: Apply a full mb before execution for Braswell
-:15: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#15: 
References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")

-:15: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")'
#15: 
References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")

-:37: WARNING:MEMORY_BARRIER: memory barrier without comment
#37: FILE: drivers/gpu/drm/i915/intel_lrc.c:388:
+	mb();

total: 1 errors, 2 warnings, 0 checks, 14 lines checked

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Skip the ERR_PTR error state
  2018-12-06  8:44 [PATCH 1/3] drm/i915: Skip the ERR_PTR error state Chris Wilson
                   ` (2 preceding siblings ...)
  2018-12-06  8:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915: Skip the ERR_PTR error state Patchwork
@ 2018-12-06  9:15 ` Patchwork
  2018-12-06 12:57 ` [PATCH 1/3] " Tvrtko Ursulin
  2018-12-07  0:44 ` ✓ Fi.CI.IGT: success for series starting with [1/3] " Patchwork
  5 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-12-06  9:15 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Skip the ERR_PTR error state
URL   : https://patchwork.freedesktop.org/series/53625/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5272 -> Patchwork_11031
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/53625/revisions/1/mbox/

Known issues
------------

  Here are the changes found in Patchwork_11031 that come from known issues:

### IGT changes ###

#### Possible fixes ####

  * igt@gem_ctx_create@basic-files:
    - fi-bsw-kefka:       FAIL [fdo#108656] -> PASS

  
  [fdo#108656]: https://bugs.freedesktop.org/show_bug.cgi?id=108656


Participating hosts (48 -> 44)
------------------------------

  Additional (1): fi-skl-guc 
  Missing    (5): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-apl-guc 


Build changes
-------------

    * Linux: CI_DRM_5272 -> Patchwork_11031

  CI_DRM_5272: 4bb8baa3a7b836ce18e1b27ba12bae2130ee38cc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4743: edb2db2cf2b6665d7ba3fa9117263302f6307a4f @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_11031: e14c4601fdfcd633dea62087b23987e029194d1f @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

e14c4601fdfc drm/i915/execlists: Apply a full mb before execution for Braswell
26553eeb0969 drm/i915: Pipeline PDP updates for Braswell
47d5e8ffa422 drm/i915: Skip the ERR_PTR error state

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_11031/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] drm/i915: Skip the ERR_PTR error state
  2018-12-06  8:44 [PATCH 1/3] drm/i915: Skip the ERR_PTR error state Chris Wilson
                   ` (3 preceding siblings ...)
  2018-12-06  9:15 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-12-06 12:57 ` Tvrtko Ursulin
  2018-12-06 21:09   ` Chris Wilson
  2018-12-07  0:44 ` ✓ Fi.CI.IGT: success for series starting with [1/3] " Patchwork
  5 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-12-06 12:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 06/12/2018 08:44, Chris Wilson wrote:
> Although commit fb6f0b64e455 ("drm/i915: Prevent machine hang from
> Broxton's vtd w/a and error capture") applied cleanly after a 24 month
> hiatus, the code had moved on with new methods for peeking and fetching
> the captured gpu info. Make sure we catch all uses of the stashed error
> state and avoid dereferencing the error pointer.
> 
> Fixes: fb6f0b64e455 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c   | 11 ++++++++++-
>   drivers/gpu/drm/i915/i915_gpu_error.c | 12 +++++++++---
>   drivers/gpu/drm/i915/i915_sysfs.c     |  3 +++
>   3 files changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 38dcee1ca062..f8aa8586c9a6 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -984,6 +984,9 @@ static int i915_gpu_info_open(struct inode *inode, struct file *file)
>   	intel_runtime_pm_get(i915);
>   	gpu = i915_capture_gpu_state(i915);
>   	intel_runtime_pm_put(i915);
> +
> +	if (IS_ERR(gpu))
> +		return PTR_ERR(gpu);
>   	if (!gpu)
>   		return -ENOMEM;
>   
> @@ -1018,7 +1021,13 @@ i915_error_state_write(struct file *filp,
>   
>   static int i915_error_state_open(struct inode *inode, struct file *file)
>   {
> -	file->private_data = i915_first_error_state(inode->i_private);
> +	struct i915_gpu_state *error;
> +
> +	error = i915_first_error_state(inode->i_private);
> +	if (IS_ERR(error))
> +		return PTR_ERR(error);
> +
> +	file->private_data  = error;
>   	return 0;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 07465123c166..7ca6f3f31d41 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1907,6 +1907,11 @@ i915_capture_gpu_state(struct drm_i915_private *i915)
>   {
>   	struct i915_gpu_state *error;
>   
> +	/* Check if GPU capture has been disabled */
> +	error = READ_ONCE(i915->gpu_error.first_error);
> +	if (IS_ERR(error))
> +		return error;
> +
>   	error = kzalloc(sizeof(*error), GFP_ATOMIC);
>   	if (!error)
>   		return NULL;

Looks like it would be cleaner to return ERR_PTR(-ENOMEM) here. At least 
the i915_gpu_info_open hunk would be.

Then in drm-tip I see i915_capture_error_state as a caller which needs 
to handle ERR_PTR now as well, no?

> @@ -1987,7 +1992,7 @@ i915_first_error_state(struct drm_i915_private *i915)
>   
>   	spin_lock_irq(&i915->gpu_error.lock);
>   	error = i915->gpu_error.first_error;
> -	if (error)
> +	if (!IS_ERR_OR_NULL(error))
>   		i915_gpu_state_get(error);
>   	spin_unlock_irq(&i915->gpu_error.lock);
>   
> @@ -2000,10 +2005,11 @@ void i915_reset_error_state(struct drm_i915_private *i915)
>   
>   	spin_lock_irq(&i915->gpu_error.lock);
>   	error = i915->gpu_error.first_error;
> -	i915->gpu_error.first_error = NULL;
> +	if (!IS_ERR_OR_NULL(error)) /* once disabled, always disabled */

Hm, this will apply on transient -ENOMEM as well.

> +		i915->gpu_error.first_error = NULL;
>   	spin_unlock_irq(&i915->gpu_error.lock);
>   
> -	if (!IS_ERR(error))
> +	if (!IS_ERR_OR_NULL(error))
>   		i915_gpu_state_put(error);
>   }
>   
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index 535caebd9813..370b7497e9df 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -521,6 +521,9 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
>   	ssize_t ret;
>   
>   	gpu = i915_first_error_state(i915);
> +	if (IS_ERR(gpu))
> +		return PTR_ERR(gpu);
> +
>   	if (gpu) {
>   		ret = i915_gpu_state_copy_to_buffer(gpu, buf, off, count);
>   		i915_gpu_state_put(gpu);#
> 

A single control block:

if (!IS_ERR_OR_NULL) {
	...
} else {
	...
}

seems like it would do the trick.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/3] drm/i915: Pipeline PDP updates for Braswell
  2018-12-06  8:44 ` [PATCH 2/3] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
@ 2018-12-06 13:10   ` Tvrtko Ursulin
  0 siblings, 0 replies; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-12-06 13:10 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 06/12/2018 08:44, Chris Wilson wrote:
> Currently we face a severe problem on Braswell that manifests as invalid
> ppGTT accesses. The code tries to maintain the PDP (page directory
> pointers) inside the context in two ways, direct write into the context
> and a pipelined LRI update. The direct write into the context is
> fundamentally racy as it is unserialised with any access (read or write)
> the GPU is doing. By asserting that Braswell is not used with vGPU
> (currently an unsupported platform) we can eliminate the dangerous
> direct write into the context image and solely use the pipelined update.
> 
> However, the LRI of the PDP fouls up the GPU, causing it to freeze and
> take out the machine with "forcewake ack timeouts". This seems possible
> to workaround by preventing the GPU from sleeping (via means of
> disabling the power-state management interface, i.e. forcing each ring
> to remain awake) around the update.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
> References: https://bugs.freedesktop.org/show_bug.cgi?id=108714
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_gtt.c     |   2 -
>   drivers/gpu/drm/i915/i915_request.c     |   5 -
>   drivers/gpu/drm/i915/intel_lrc.c        | 143 ++++++++++++------------
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   5 +-
>   4 files changed, 74 insertions(+), 81 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index add1fe7aeb93..62bde517d383 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1423,8 +1423,6 @@ static int gen8_ppgtt_alloc_pdp(struct i915_address_space *vm,
>   			gen8_initialize_pd(vm, pd);
>   			gen8_ppgtt_set_pdpe(vm, pdp, pd, pdpe);
>   			GEM_BUG_ON(pdp->used_pdpes > i915_pdpes_per_pdp(vm));
> -
> -			mark_tlbs_dirty(i915_vm_to_ppgtt(vm));
>   		}
>   
>   		ret = gen8_ppgtt_alloc_pd(vm, pd, start, length);
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index ca95ab2f4cfa..8ab8e8e6a086 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -719,11 +719,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 */
>   	rq->head = rq->ring->emit;
>   
> -	/* Unconditionally invalidate GPU caches and TLBs. */
> -	ret = engine->emit_flush(rq, EMIT_INVALIDATE);
> -	if (ret)
> -		goto err_unwind;
> -
>   	ret = engine->request_alloc(rq);
>   	if (ret)
>   		goto err_unwind;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d7fa301b5ec7..de1e9dc6aec0 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -363,31 +363,12 @@ execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
>   	trace_i915_request_out(rq);
>   }
>   
> -static void
> -execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
> -{
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
> -}
> -
>   static u64 execlists_update_context(struct i915_request *rq)
>   {
> -	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
>   	struct intel_context *ce = rq->hw_context;
> -	u32 *reg_state = ce->lrc_reg_state;
> -
> -	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
>   
> -	/*
> -	 * True 32b PPGTT with dynamic page allocation: update PDP
> -	 * registers and point the unallocated PDPs to scratch page.
> -	 * PML4 is allocated during ppgtt init, so this is not needed
> -	 * in 48-bit mode.
> -	 */
> -	if (!i915_vm_is_48bit(&ppgtt->vm))
> -		execlists_update_context_pdps(ppgtt, reg_state);
> +	ce->lrc_reg_state[CTX_RING_TAIL + 1] =
> +		intel_ring_set_tail(rq->ring, rq->tail);
>   
>   	/*
>   	 * Make sure the context image is complete before we submit it to HW.
> @@ -1242,29 +1223,86 @@ execlists_context_pin(struct intel_engine_cs *engine,
>   	return __execlists_context_pin(engine, ctx, ce);
>   }
>   
> +static int emit_pdps(struct i915_request *rq)
> +{
> +	const struct intel_engine_cs * const engine = rq->engine;
> +	struct i915_hw_ppgtt * const ppgtt = rq->gem_context->ppgtt;
> +	int err, i;
> +	u32 *cs;
> +
> +	/*
> +	 * Beware ye of the dragons, this sequence is magic!
> +	 *
> +	 * Small changes to this sequence can cause anything from
> +	 * GPU hangs to forcewake errors and machine lockups!
> +	 */
> +
> +	err = engine->emit_flush(rq, EMIT_FLUSH);
> +	if (err)
> +		return err;
> +
> +	err = engine->emit_flush(rq, EMIT_INVALIDATE);
> +	if (err)
> +		return err;
> +
> +	cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
> +	if (IS_ERR(cs))
> +		return PTR_ERR(cs);
> +
> +	/* Ensure the LRI have landed before we invalidate & continue */
> +	*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;
> +	for (i = GEN8_3LVL_PDPES; i--; ) {
> +		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
> +
> +		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
> +		*cs++ = upper_32_bits(pd_daddr);
> +		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
> +		*cs++ = lower_32_bits(pd_daddr);
> +	}
> +	*cs++ = MI_NOOP;
> +
> +	intel_ring_advance(rq, cs);
> +
> +	err = engine->emit_flush(rq, EMIT_FLUSH);
> +	if (err)
> +		return err;
> +
> +	return engine->emit_flush(rq, EMIT_INVALIDATE);
> +}
> +
>   static int execlists_request_alloc(struct i915_request *request)
>   {
>   	int ret;
>   
>   	GEM_BUG_ON(!request->hw_context->pin_count);
>   
> -	/* Flush enough space to reduce the likelihood of waiting after
> +	/*
> +	 * Flush enough space to reduce the likelihood of waiting after
>   	 * we start building the request - in which case we will just
>   	 * have to repeat work.
>   	 */
>   	request->reserved_space += EXECLISTS_REQUEST_SIZE;
>   
> -	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
> -	if (ret)
> -		return ret;
> -
> -	/* Note that after this point, we have committed to using
> +	/*
> +	 * Note that after this point, we have committed to using
>   	 * this request as it is being used to both track the
>   	 * state of engine initialisation and liveness of the
>   	 * golden renderstate above. Think twice before you try
>   	 * to cancel/unwind this request now.
>   	 */
>   
> +	/* Unconditionally invalidate GPU caches and TLBs. */
> +	if (i915_vm_is_48bit(&request->gem_context->ppgtt->vm)) {
> +		ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
> +		if (ret)
> +			return ret;
> +	} else {
> +		GEM_BUG_ON(intel_vgpu_active(request->i915));
> +		ret = emit_pdps(request);
> +		if (ret)
> +			return ret;
> +	}

Can pull "if (ret).." after the loop to save, hm, almost no lines after 
added whitespace.

> +
>   	request->reserved_space -= EXECLISTS_REQUEST_SIZE;
>   	return 0;
>   }
> @@ -1836,56 +1874,11 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
>   		  atomic_read(&execlists->tasklet.count));
>   }
>   
> -static int intel_logical_ring_emit_pdps(struct i915_request *rq)
> -{
> -	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
> -	struct intel_engine_cs *engine = rq->engine;
> -	const int num_lri_cmds = GEN8_3LVL_PDPES * 2;
> -	u32 *cs;
> -	int i;
> -
> -	cs = intel_ring_begin(rq, num_lri_cmds * 2 + 2);
> -	if (IS_ERR(cs))
> -		return PTR_ERR(cs);
> -
> -	*cs++ = MI_LOAD_REGISTER_IMM(num_lri_cmds);
> -	for (i = GEN8_3LVL_PDPES - 1; i >= 0; i--) {
> -		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
> -
> -		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
> -		*cs++ = upper_32_bits(pd_daddr);
> -		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
> -		*cs++ = lower_32_bits(pd_daddr);
> -	}
> -
> -	*cs++ = MI_NOOP;
> -	intel_ring_advance(rq, cs);
> -
> -	return 0;
> -}
> -
>   static int gen8_emit_bb_start(struct i915_request *rq,
>   			      u64 offset, u32 len,
>   			      const unsigned int flags)
>   {
>   	u32 *cs;
> -	int ret;
> -
> -	/* Don't rely in hw updating PDPs, specially in lite-restore.
> -	 * Ideally, we should set Force PD Restore in ctx descriptor,
> -	 * but we can't. Force Restore would be a second option, but
> -	 * it is unsafe in case of lite-restore (because the ctx is
> -	 * not idle). PML4 is allocated during ppgtt init so this is
> -	 * not needed in 48-bit.*/
> -	if ((intel_engine_flag(rq->engine) & rq->gem_context->ppgtt->pd_dirty_rings) &&
> -	    !i915_vm_is_48bit(&rq->gem_context->ppgtt->vm) &&
> -	    !intel_vgpu_active(rq->i915)) {
> -		ret = intel_logical_ring_emit_pdps(rq);
> -		if (ret)
> -			return ret;
> -
> -		rq->gem_context->ppgtt->pd_dirty_rings &= ~intel_engine_flag(rq->engine);
> -	}
>   
>   	cs = intel_ring_begin(rq, 6);
>   	if (IS_ERR(cs))
> @@ -1918,6 +1911,7 @@ static int gen8_emit_bb_start(struct i915_request *rq,
>   
>   	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
>   	*cs++ = MI_NOOP;
> +
>   	intel_ring_advance(rq, cs);
>   
>   	return 0;
> @@ -2533,6 +2527,11 @@ static void execlists_init_reg_state(u32 *regs,
>   		 * other PDP Descriptors are ignored.
>   		 */
>   		ASSIGN_CTX_PML4(ctx->ppgtt, regs);
> +	} else {
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 3);
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 2);
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 1);
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 0);
>   	}
>   
>   	if (rcs) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index c5eb26a7ee79..ad5b1fc79498 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1826,11 +1826,12 @@ static int ring_request_alloc(struct i915_request *request)
>   	 */
>   	request->reserved_space += LEGACY_REQUEST_SIZE;
>   
> -	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);

Can I convince you to extract this changelet to a separate patch? (For 
both backends.)

> +	ret = switch_context(request);
>   	if (ret)
>   		return ret;
>   
> -	ret = switch_context(request);
> +	/* Unconditionally invalidate GPU caches and TLBs. */
> +	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
>   	if (ret)
>   		return ret;
>   
> 

Looks good to me.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution for Braswell
  2018-12-06  8:44 ` [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution " Chris Wilson
@ 2018-12-06 13:12   ` Tvrtko Ursulin
  2018-12-06 21:11     ` Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Tvrtko Ursulin @ 2018-12-06 13:12 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: stable


On 06/12/2018 08:44, Chris Wilson wrote:
> Braswell is really picky about having our writes posted to memory before
> we execute or else the GPU may see stale values. A wmb() is insufficient
> as it only ensures the writes are visible to other cores, we need a full
> mb() to ensure the writes are in memory and visible to the GPU.
> 
> The most frequent failure in flushing before execution is that we see
> stale PTE values and execute the wrong pages.
> 
> References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>   drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index de1e9dc6aec0..e6a86fa4502d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -379,8 +379,13 @@ static u64 execlists_update_context(struct i915_request *rq)
>   	 * may not be visible to the HW prior to the completion of the UC
>   	 * register write and that we may begin execution from the context
>   	 * before its image is complete leading to invalid PD chasing.
> +	 *
> +	 * Furthermore, Braswell, at least, wants a full mb to be sure that
> +	 * the writes are coherent in memory (visible to the GPU) prior to
> +	 * execution, and not just visible to other CPUs (as is the result of
> +	 * wmb).
>   	 */
> -	wmb();
> +	mb();
>   	return ce->lrc_desc;
>   }
>   
> 

Too low level for me to really know what happens under the hood, but at 
least I know it can't break anything.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/3] drm/i915: Skip the ERR_PTR error state
  2018-12-06 12:57 ` [PATCH 1/3] " Tvrtko Ursulin
@ 2018-12-06 21:09   ` Chris Wilson
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2018-12-06 21:09 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-12-06 12:57:23)
> 
> On 06/12/2018 08:44, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index 07465123c166..7ca6f3f31d41 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -1907,6 +1907,11 @@ i915_capture_gpu_state(struct drm_i915_private *i915)
> >   {
> >       struct i915_gpu_state *error;
> >   
> > +     /* Check if GPU capture has been disabled */
> > +     error = READ_ONCE(i915->gpu_error.first_error);
> > +     if (IS_ERR(error))
> > +             return error;
> > +
> >       error = kzalloc(sizeof(*error), GFP_ATOMIC);
> >       if (!error)
> >               return NULL;
> 
> Looks like it would be cleaner to return ERR_PTR(-ENOMEM) here. At least 
> the i915_gpu_info_open hunk would be.
> 
> Then in drm-tip I see i915_capture_error_state as a caller which needs 
> to handle ERR_PTR now as well, no?

It shouldn't due to the earlier check, but it would work better if it
tool the ERR_PTR pointer from here.

> > @@ -1987,7 +1992,7 @@ i915_first_error_state(struct drm_i915_private *i915)
> >   
> >       spin_lock_irq(&i915->gpu_error.lock);
> >       error = i915->gpu_error.first_error;
> > -     if (error)
> > +     if (!IS_ERR_OR_NULL(error))
> >               i915_gpu_state_get(error);
> >       spin_unlock_irq(&i915->gpu_error.lock);
> >   
> > @@ -2000,10 +2005,11 @@ void i915_reset_error_state(struct drm_i915_private *i915)
> >   
> >       spin_lock_irq(&i915->gpu_error.lock);
> >       error = i915->gpu_error.first_error;
> > -     i915->gpu_error.first_error = NULL;
> > +     if (!IS_ERR_OR_NULL(error)) /* once disabled, always disabled */
> 
> Hm, this will apply on transient -ENOMEM as well.

Yeah. The best compromise is to say only ENODEV is special.

> > +             i915->gpu_error.first_error = NULL;
> >       spin_unlock_irq(&i915->gpu_error.lock);
> >   
> > -     if (!IS_ERR(error))
> > +     if (!IS_ERR_OR_NULL(error))
> >               i915_gpu_state_put(error);
> >   }
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> > index 535caebd9813..370b7497e9df 100644
> > --- a/drivers/gpu/drm/i915/i915_sysfs.c
> > +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> > @@ -521,6 +521,9 @@ static ssize_t error_state_read(struct file *filp, struct kobject *kobj,
> >       ssize_t ret;
> >   
> >       gpu = i915_first_error_state(i915);
> > +     if (IS_ERR(gpu))
> > +             return PTR_ERR(gpu);
> > +
> >       if (gpu) {
> >               ret = i915_gpu_state_copy_to_buffer(gpu, buf, off, count);
> >               i915_gpu_state_put(gpu);#
> > 
> 
> A single control block:
> 
> if (!IS_ERR_OR_NULL) {
>         ...
> } else {
>         ...
> }
> 
> seems like it would do the trick.

Not quite unless you were thinking of if { } else if { } else { };
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Intel-gfx] [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution for Braswell
  2018-12-06 13:12   ` [Intel-gfx] " Tvrtko Ursulin
@ 2018-12-06 21:11     ` Chris Wilson
  0 siblings, 0 replies; 11+ messages in thread
From: Chris Wilson @ 2018-12-06 21:11 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: stable

Quoting Tvrtko Ursulin (2018-12-06 13:12:35)
> 
> On 06/12/2018 08:44, Chris Wilson wrote:
> > Braswell is really picky about having our writes posted to memory before
> > we execute or else the GPU may see stale values. A wmb() is insufficient
> > as it only ensures the writes are visible to other cores, we need a full
> > mb() to ensure the writes are in memory and visible to the GPU.
> > 
> > The most frequent failure in flushing before execution is that we see
> > stale PTE values and execute the wrong pages.
> > 
> > References: 987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >   drivers/gpu/drm/i915/intel_lrc.c | 7 ++++++-
> >   1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index de1e9dc6aec0..e6a86fa4502d 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -379,8 +379,13 @@ static u64 execlists_update_context(struct i915_request *rq)
> >        * may not be visible to the HW prior to the completion of the UC
> >        * register write and that we may begin execution from the context
> >        * before its image is complete leading to invalid PD chasing.
> > +      *
> > +      * Furthermore, Braswell, at least, wants a full mb to be sure that
> > +      * the writes are coherent in memory (visible to the GPU) prior to
> > +      * execution, and not just visible to other CPUs (as is the result of
> > +      * wmb).
> >        */
> > -     wmb();
> > +     mb();
> >       return ce->lrc_desc;
> >   }
> >   
> > 
> 
> Too low level for me to really know what happens under the hood, but at 
> least I know it can't break anything.

The alternative I'm considering is using a mmio read instead. However,
the improvement in stability from switching to mb() here is already
enough to proceed without necessarily finding the ideal solution.
-Chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* ✓ Fi.CI.IGT: success for series starting with [1/3] drm/i915: Skip the ERR_PTR error state
  2018-12-06  8:44 [PATCH 1/3] drm/i915: Skip the ERR_PTR error state Chris Wilson
                   ` (4 preceding siblings ...)
  2018-12-06 12:57 ` [PATCH 1/3] " Tvrtko Ursulin
@ 2018-12-07  0:44 ` Patchwork
  5 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2018-12-07  0:44 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Skip the ERR_PTR error state
URL   : https://patchwork.freedesktop.org/series/53625/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5272_full -> Patchwork_11031_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_11031_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ppgtt@blt-vs-render-ctx0:
    - shard-skl:          NOTRUN -> TIMEOUT [fdo#108039]

  * igt@i915_suspend@debugfs-reader:
    - shard-skl:          PASS -> INCOMPLETE [fdo#104108]

  * igt@kms_busy@extended-modeset-hang-newfb-render-c:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#107956] +1

  * igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-a:
    - shard-apl:          NOTRUN -> DMESG-WARN [fdo#107956]

  * igt@kms_cursor_crc@cursor-128x128-sliding:
    - shard-apl:          PASS -> FAIL [fdo#103232] +1

  * igt@kms_cursor_crc@cursor-256x85-sliding:
    - shard-glk:          PASS -> FAIL [fdo#103232] +1

  * igt@kms_flip@modeset-vs-vblank-race-interruptible:
    - shard-apl:          PASS -> INCOMPLETE [fdo#103927]

  * igt@kms_flip@plain-flip-fb-recreate-interruptible:
    - shard-skl:          PASS -> FAIL [fdo#100368]

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-wc:
    - shard-apl:          PASS -> FAIL [fdo#103167]

  * igt@kms_panel_fitting@legacy:
    - shard-skl:          NOTRUN -> FAIL [fdo#105456]

  * {igt@kms_plane@pixel-format-pipe-c-planes-source-clamping}:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#106885] +1

  * igt@kms_plane_alpha_blend@pipe-a-constant-alpha-max:
    - shard-skl:          NOTRUN -> FAIL [fdo#108145] +2

  * igt@kms_rmfb@close-fd:
    - {shard-iclb}:       PASS -> DMESG-WARN [fdo#107724] +1

  * {igt@kms_rotation_crc@multiplane-rotation-cropping-top}:
    - shard-kbl:          PASS -> DMESG-FAIL [fdo#108950]

  * igt@pm_rpm@modeset-non-lpsp-stress-no-wait:
    - {shard-iclb}:       SKIP -> INCOMPLETE [fdo#108840]

  * igt@pm_rpm@system-suspend-execbuf:
    - shard-skl:          NOTRUN -> INCOMPLETE [fdo#104108] / [fdo#107807]

  
#### Possible fixes ####

  * igt@kms_atomic_transition@plane-all-modeset-transition-internal-panels:
    - {shard-iclb}:       DMESG-WARN [fdo#107724] -> PASS +4

  * igt@kms_color@pipe-c-legacy-gamma:
    - shard-apl:          FAIL [fdo#104782] -> PASS

  * igt@kms_cursor_legacy@all-pipes-single-move:
    - shard-hsw:          DMESG-WARN [fdo#102614] -> PASS

  * igt@kms_draw_crc@draw-method-xrgb8888-mmap-wc-ytiled:
    - shard-skl:          FAIL [fdo#103184] -> PASS

  * igt@kms_draw_crc@draw-method-xrgb8888-pwrite-untiled:
    - shard-skl:          FAIL [fdo#108472] -> PASS

  * igt@kms_flip@2x-plain-flip-fb-recreate-interruptible:
    - shard-hsw:          DMESG-FAIL [fdo#102614] -> PASS

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-apl:          FAIL [fdo#102887] / [fdo#105363] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-mmap-cpu:
    - shard-glk:          FAIL [fdo#103167] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-stridechange:
    - {shard-iclb}:       FAIL [fdo#105683] / [fdo#108040] -> PASS

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-indfb-draw-pwrite:
    - {shard-iclb}:       DMESG-FAIL [fdo#107724] -> PASS +3

  * igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-indfb-draw-render:
    - shard-skl:          FAIL [fdo#103167] -> PASS +3

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-draw-mmap-wc:
    - {shard-iclb}:       DMESG-WARN [fdo#107724] / [fdo#108336] -> PASS +5

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move:
    - {shard-iclb}:       FAIL [fdo#103167] -> PASS +4

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          FAIL [fdo#107815] / [fdo#108145] -> PASS

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          FAIL [fdo#107815] -> PASS

  * igt@kms_plane_multiple@atomic-pipe-c-tiling-yf:
    - shard-apl:          FAIL [fdo#103166] -> PASS +1

  * igt@kms_setmode@basic:
    - shard-apl:          FAIL [fdo#99912] -> PASS

  * igt@pm_rpm@gem-pread:
    - shard-skl:          INCOMPLETE [fdo#107807] -> PASS +1

  * igt@pm_rpm@legacy-planes:
    - {shard-iclb}:       DMESG-WARN -> PASS

  * igt@pm_rpm@modeset-non-lpsp-stress-no-wait:
    - shard-skl:          INCOMPLETE [fdo#107807] -> SKIP

  
#### Warnings ####

  * igt@kms_content_protection@legacy:
    - shard-apl:          INCOMPLETE [fdo#103927] -> FAIL [fdo#108597]

  * {igt@kms_plane@pixel-format-pipe-b-planes-source-clamping}:
    - {shard-iclb}:       DMESG-WARN [fdo#107724] / [fdo#108336] -> FAIL [fdo#108948]

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-c-planes:
    - {shard-iclb}:       FAIL [fdo#103375] -> INCOMPLETE [fdo#107713]

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#100368]: https://bugs.freedesktop.org/show_bug.cgi?id=100368
  [fdo#102614]: https://bugs.freedesktop.org/show_bug.cgi?id=102614
  [fdo#102887]: https://bugs.freedesktop.org/show_bug.cgi?id=102887
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103184]: https://bugs.freedesktop.org/show_bug.cgi?id=103184
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103375]: https://bugs.freedesktop.org/show_bug.cgi?id=103375
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#104782]: https://bugs.freedesktop.org/show_bug.cgi?id=104782
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#105456]: https://bugs.freedesktop.org/show_bug.cgi?id=105456
  [fdo#105683]: https://bugs.freedesktop.org/show_bug.cgi?id=105683
  [fdo#106885]: https://bugs.freedesktop.org/show_bug.cgi?id=106885
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#107815]: https://bugs.freedesktop.org/show_bug.cgi?id=107815
  [fdo#107956]: https://bugs.freedesktop.org/show_bug.cgi?id=107956
  [fdo#108039]: https://bugs.freedesktop.org/show_bug.cgi?id=108039
  [fdo#108040]: https://bugs.freedesktop.org/show_bug.cgi?id=108040
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108336]: https://bugs.freedesktop.org/show_bug.cgi?id=108336
  [fdo#108472]: https://bugs.freedesktop.org/show_bug.cgi?id=108472
  [fdo#108597]: https://bugs.freedesktop.org/show_bug.cgi?id=108597
  [fdo#108840]: https://bugs.freedesktop.org/show_bug.cgi?id=108840
  [fdo#108948]: https://bugs.freedesktop.org/show_bug.cgi?id=108948
  [fdo#108950]: https://bugs.freedesktop.org/show_bug.cgi?id=108950
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912


Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts


Build changes
-------------

    * Linux: CI_DRM_5272 -> Patchwork_11031

  CI_DRM_5272: 4bb8baa3a7b836ce18e1b27ba12bae2130ee38cc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4743: edb2db2cf2b6665d7ba3fa9117263302f6307a4f @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_11031: e14c4601fdfcd633dea62087b23987e029194d1f @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_11031/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-12-07  0:44 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-06  8:44 [PATCH 1/3] drm/i915: Skip the ERR_PTR error state Chris Wilson
2018-12-06  8:44 ` [PATCH 2/3] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
2018-12-06 13:10   ` Tvrtko Ursulin
2018-12-06  8:44 ` [PATCH 3/3] drm/i915/execlists: Apply a full mb before execution " Chris Wilson
2018-12-06 13:12   ` [Intel-gfx] " Tvrtko Ursulin
2018-12-06 21:11     ` Chris Wilson
2018-12-06  8:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] drm/i915: Skip the ERR_PTR error state Patchwork
2018-12-06  9:15 ` ✓ Fi.CI.BAT: success " Patchwork
2018-12-06 12:57 ` [PATCH 1/3] " Tvrtko Ursulin
2018-12-06 21:09   ` Chris Wilson
2018-12-07  0:44 ` ✓ Fi.CI.IGT: success for series starting with [1/3] " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.