All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] ring/context initialization patches
@ 2015-03-16 13:46 Mika Kuoppala
  2015-03-16 13:46 ` [PATCH 1/3] drm/i915: Wait for render state init Mika Kuoppala
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-16 13:46 UTC (permalink / raw)
  To: intel-gfx

Hi,

Testing dynamic page tables series, there has
been also problems with our codebase interacting badly
with the patchset. These are the essentials I have needed
to testrun/debug the series on my gen7.

Chris Wilson (2):
  drm/i915: Detect page faults during hangcheck
  drm/i915: Reorder hw init to avoid executing with invalid context/mm
    state

Mika Kuoppala (1):
  drm/i915: Wait for render state init

 drivers/gpu/drm/i915/i915_gem.c              | 19 +++++++++++--------
 drivers/gpu/drm/i915/i915_gem_render_state.c |  4 ++++
 drivers/gpu/drm/i915/i915_irq.c              |  5 +++++
 drivers/gpu/drm/i915/intel_ringbuffer.c      | 26 ++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h      |  1 +
 drivers/gpu/drm/i915/intel_uncore.c          |  2 ++
 6 files changed, 45 insertions(+), 12 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] drm/i915: Wait for render state init
  2015-03-16 13:46 [PATCH 0/3] ring/context initialization patches Mika Kuoppala
@ 2015-03-16 13:46 ` Mika Kuoppala
  2015-03-16 13:51   ` Chris Wilson
  2015-03-16 13:46 ` [PATCH 2/3] drm/i915: Detect page faults during hangcheck Mika Kuoppala
  2015-03-16 13:46 ` [PATCH 3/3] drm/i915: Reorder hw init to avoid executing with invalid context/mm state Mika Kuoppala
  2 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-16 13:46 UTC (permalink / raw)
  To: intel-gfx

We are freeing the batch that is just pushed to ring.
Further, other ring inits should assume that the render
context with the workarounds are pushed before their rings
execute anything.

This fixes (or papers over) a problem where with full ppgtt
the blitter ring init sometimes fails, where the blt ring
ACTHD runs wild through the address space until eventually
hangcheck kills it.

With dynamic page table series this problem became more
easily reproduced by just normal boot failing to init blt ring,
instead of torturing init with gem_reset_stats.

Testcase: igt/gem_reset_stats --r ban-blt
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_render_state.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 521548a..9484c64 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -175,6 +175,10 @@ int i915_gem_render_state_init(struct intel_engine_cs *ring)
 
 	ret = __i915_add_request(ring, NULL, so.obj);
 	/* __i915_add_request moves object to inactive if it fails */
+	if (ret)
+		goto out;
+
+	ret = intel_ring_idle(ring);
 out:
 	i915_gem_render_state_fini(&so);
 	return ret;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] drm/i915: Detect page faults during hangcheck
  2015-03-16 13:46 [PATCH 0/3] ring/context initialization patches Mika Kuoppala
  2015-03-16 13:46 ` [PATCH 1/3] drm/i915: Wait for render state init Mika Kuoppala
@ 2015-03-16 13:46 ` Mika Kuoppala
  2015-03-16 17:53   ` Daniel Vetter
  2015-03-16 13:46 ` [PATCH 3/3] drm/i915: Reorder hw init to avoid executing with invalid context/mm state Mika Kuoppala
  2 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-16 13:46 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

On Sandybridge+, the GPU provides the ERROR register for detecting page
faults. Hook this up to our hangcheck so that we can dump the error
state soon after such an event occurs. This would be better inside an
interrupt handler, but it serves a purpose here as it detects that our
initial context setup is invalid...

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_irq.c     | 5 +++++
 drivers/gpu/drm/i915/intel_uncore.c | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 49ad5fb..ea668fc 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2929,6 +2929,11 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	if (!i915.enable_hangcheck)
 		return;
 
+	if (INTEL_INFO(dev_priv)->gen >= 6 && I915_READ(ERROR_GEN6)) {
+		i915_handle_error(dev, false, "GPU reported a page fault");
+		I915_WRITE(ERROR_GEN6, 0);
+	}
+
 	for_each_ring(ring, dev_priv, i) {
 		u64 acthd;
 		u32 seqno;
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index ab5cc94..c58535e 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -359,6 +359,8 @@ static void __intel_uncore_early_sanitize(struct drm_device *dev,
 	if (IS_GEN6(dev) || IS_GEN7(dev))
 		__raw_i915_write32(dev_priv, GTFIFODBG,
 				   __raw_i915_read32(dev_priv, GTFIFODBG));
+	if (INTEL_INFO(dev)->gen >= 6)
+		__raw_i915_write32(dev_priv, ERROR_GEN6, 0);
 
 	intel_uncore_forcewake_reset(dev, restore_forcewake);
 }
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] drm/i915: Reorder hw init to avoid executing with invalid context/mm state
  2015-03-16 13:46 [PATCH 0/3] ring/context initialization patches Mika Kuoppala
  2015-03-16 13:46 ` [PATCH 1/3] drm/i915: Wait for render state init Mika Kuoppala
  2015-03-16 13:46 ` [PATCH 2/3] drm/i915: Detect page faults during hangcheck Mika Kuoppala
@ 2015-03-16 13:46 ` Mika Kuoppala
  2 siblings, 0 replies; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-16 13:46 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

Currently we initialise the rings, add the first context switch to the
ring and execute our golden state then enable (aliasing or full) ppgtt.
However, as we enable ppgtt using direct MMIO but load the PD using
MI_LRI, we end up executing the context switch and golden render state
with an invalid PD generating page faults. To solve this issue, first do
the ppgtt PD setup, then set the default context and write the commands
to run the render state into the ring, before we activate the ring. This
allows us to be sure that the register state is valid before we begin
execution.

This was spotted when writing the seqno/request conversion, but only with
the ERROR capture did I realise that it was a necessity now.

RFC: cleanup the error handling in i915_gem_init_hw.

v2: added intel_ring_reset

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 19 +++++++++++--------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 26 ++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0fe313d..e154770 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4817,6 +4817,9 @@ i915_gem_init_hw(struct drm_device *dev)
 		}
 	}
 
+	for_each_ring(ring, dev_priv, i)
+		intel_ring_reset(ring);
+
 	i915_gem_init_swizzling(dev);
 
 	/*
@@ -4827,19 +4830,19 @@ i915_gem_init_hw(struct drm_device *dev)
 	 */
 	init_unused_rings(dev);
 
-	for_each_ring(ring, dev_priv, i) {
-		ret = ring->init_hw(ring);
-		if (ret)
-			goto out;
+	ret = i915_ppgtt_init_hw(dev);
+	if (ret && ret != -EIO) {
+		DRM_ERROR("PPGTT enable failed %d\n", ret);
+		i915_gem_cleanup_ringbuffer(dev);
 	}
 
 	for (i = 0; i < NUM_L3_SLICES(dev); i++)
 		i915_gem_l3_remap(&dev_priv->ring[RCS], i);
 
-	ret = i915_ppgtt_init_hw(dev);
-	if (ret && ret != -EIO) {
-		DRM_ERROR("PPGTT enable failed %d\n", ret);
-		i915_gem_cleanup_ringbuffer(dev);
+	for_each_ring(ring, dev_priv, i) {
+		ret = ring->init_hw(ring);
+		if (ret)
+			goto out;
 	}
 
 	ret = i915_gem_context_enable(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 441e250..0f97588 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -560,6 +560,22 @@ static bool stop_ring(struct intel_engine_cs *ring)
 	return (I915_READ_HEAD(ring) & HEAD_ADDR) == 0;
 }
 
+void intel_ring_reset(struct intel_engine_cs *ring)
+{
+	struct intel_ringbuffer *ringbuf = ring->buffer;
+
+	WARN_ON(!stop_ring(ring));
+
+	if (WARN_ON(ringbuf == NULL))
+		return;
+
+	ringbuf->head = 0;
+	ringbuf->tail = 0;
+
+	ringbuf->last_retired_head = -1;
+}
+
+
 static int init_ring_common(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
@@ -614,6 +630,9 @@ static int init_ring_common(struct intel_engine_cs *ring)
 	I915_WRITE_HEAD(ring, 0);
 	(void)I915_READ_HEAD(ring);
 
+	I915_WRITE_HEAD(ring, ringbuf->head);
+	I915_WRITE_TAIL(ring, ringbuf->head);
+
 	I915_WRITE_CTL(ring,
 			((ringbuf->size - PAGE_SIZE) & RING_NR_PAGES)
 			| RING_VALID);
@@ -621,7 +640,7 @@ static int init_ring_common(struct intel_engine_cs *ring)
 	/* If the head is still not zero, the ring is dead */
 	if (wait_for((I915_READ_CTL(ring) & RING_VALID) != 0 &&
 		     I915_READ_START(ring) == i915_gem_obj_ggtt_offset(obj) &&
-		     (I915_READ_HEAD(ring) & HEAD_ADDR) == 0, 50)) {
+		     (I915_READ_HEAD(ring) & HEAD_ADDR) == ringbuf->head, 50)) {
 		DRM_ERROR("%s initialization failed "
 			  "ctl %08x (valid? %d) head %08x tail %08x start %08x [expected %08lx]\n",
 			  ring->name,
@@ -632,13 +651,12 @@ static int init_ring_common(struct intel_engine_cs *ring)
 		goto out;
 	}
 
+	I915_WRITE_TAIL(ring, ringbuf->tail);
+
 	ringbuf->last_retired_head = -1;
-	ringbuf->head = I915_READ_HEAD(ring);
-	ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
 	intel_ring_update_space(ringbuf);
 
 	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
-
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index c761fe0..168a670 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -408,6 +408,7 @@ int __intel_ring_space(int head, int tail, int size);
 void intel_ring_update_space(struct intel_ringbuffer *ringbuf);
 int intel_ring_space(struct intel_ringbuffer *ringbuf);
 bool intel_ring_stopped(struct intel_engine_cs *ring);
+void intel_ring_reset(struct intel_engine_cs *ring);
 void __intel_ring_advance(struct intel_engine_cs *ring);
 
 int __must_check intel_ring_idle(struct intel_engine_cs *ring);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] drm/i915: Wait for render state init
  2015-03-16 13:46 ` [PATCH 1/3] drm/i915: Wait for render state init Mika Kuoppala
@ 2015-03-16 13:51   ` Chris Wilson
  2015-03-16 14:20     ` Mika Kuoppala
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2015-03-16 13:51 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Mon, Mar 16, 2015 at 03:46:34PM +0200, Mika Kuoppala wrote:
> We are freeing the batch that is just pushed to ring.

Kill this. It is not correct.

> Further, other ring inits should assume that the render
> context with the workarounds are pushed before their rings
> execute anything.
> 
> This fixes (or papers over) a problem where with full ppgtt
> the blitter ring init sometimes fails, where the blt ring
> ACTHD runs wild through the address space until eventually
> hangcheck kills it.

Interesting. We don't do anything inside the golden render state to be
of relevance to the blitter engine. So I think this is pure paper and
will not ultimately fix the problem.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] drm/i915: Wait for render state init
  2015-03-16 13:51   ` Chris Wilson
@ 2015-03-16 14:20     ` Mika Kuoppala
  2015-03-16 14:38       ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-16 14:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Mon, Mar 16, 2015 at 03:46:34PM +0200, Mika Kuoppala wrote:
>> We are freeing the batch that is just pushed to ring.
>
> Kill this. It is not correct.
>
>> Further, other ring inits should assume that the render
>> context with the workarounds are pushed before their rings
>> execute anything.
>> 
>> This fixes (or papers over) a problem where with full ppgtt
>> the blitter ring init sometimes fails, where the blt ring
>> ACTHD runs wild through the address space until eventually
>> hangcheck kills it.
>
> Interesting. We don't do anything inside the golden render state to be
> of relevance to the blitter engine. So I think this is pure paper and
> will not ultimately fix the problem.
> -Chris
>

Perhaps not in the golden state, but do we push any workarounds through
render ring init, that are global in scope and required by the other rings?

-Mika
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] drm/i915: Wait for render state init
  2015-03-16 14:20     ` Mika Kuoppala
@ 2015-03-16 14:38       ` Chris Wilson
  2015-03-16 15:56         ` Mika Kuoppala
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2015-03-16 14:38 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Mon, Mar 16, 2015 at 04:20:46PM +0200, Mika Kuoppala wrote:
> Perhaps not in the golden state, but do we push any workarounds through
> render ring init, that are global in scope and required by the other rings?

No. Afaict the w/a only apply to render. Or else we have been horribly
misinformed. I still suspect it is something due to the ordering of pd
loads between rings. Serialising the render context init should not fix
the similar hangs we see at runtime.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] drm/i915: Wait for render state init
  2015-03-16 14:38       ` Chris Wilson
@ 2015-03-16 15:56         ` Mika Kuoppala
  2015-03-16 15:58           ` [PATCH] drm/i915: Push mm switch immediately to ring Mika Kuoppala
  0 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-16 15:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Mon, Mar 16, 2015 at 04:20:46PM +0200, Mika Kuoppala wrote:
>> Perhaps not in the golden state, but do we push any workarounds through
>> render ring init, that are global in scope and required by the other rings?
>
> No. Afaict the w/a only apply to render. Or else we have been horribly
> misinformed. I still suspect it is something due to the ordering of pd
> loads between rings. Serialising the render context init should not fix
> the similar hangs we see at runtime.

Ok please ignore this patch then. I will send another that
does the immediate tail updates on ring->mm_switch.

-Mika

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] drm/i915: Push mm switch immediately to ring
  2015-03-16 15:56         ` Mika Kuoppala
@ 2015-03-16 15:58           ` Mika Kuoppala
  2015-03-16 21:35             ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-16 15:58 UTC (permalink / raw)
  To: intel-gfx

Sometimes when first batch is run with blitter ring,
and even tho its PP_BASE has been properly set, the ring
seems to be very confused about its memory view. The ring
ACHTD keeps progressing past batch boundary, towards the end
off address space. Eventually after quite amount of time,
hangcheck will declare ring as hanged. But the proceeding
reset can run into the same problem with the ring init.

If we update the tail immediately after mm switch, the render
ring will have a proper pd loaded, before the blitter
ring is initialized. This fixes, or papers over, the blitter 'ACTHD
proceeding through address space without progress' problem that has
been quite a while in ppgtt=2 ring init.

Also with Chris Wilson's patch to report GPU faults, I got
page faults with unloaded PD's from address zero when running,
igt/gem_ppgtt. This patch fixes also those.

Testcase: igt/gem_ppgtt
Testcase: igt/gem_reset_stats --r ban-blt
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2034f7c..bbfcbb6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -390,7 +390,7 @@ static int gen8_write_pdp(struct intel_engine_cs *ring, unsigned entry,
 	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
 	intel_ring_emit(ring, GEN8_RING_PDP_LDW(ring, entry));
 	intel_ring_emit(ring, (u32)(val));
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 
 	return 0;
 }
@@ -896,7 +896,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
 	intel_ring_emit(ring, get_pd_offset(ppgtt));
 	intel_ring_emit(ring, MI_NOOP);
-	intel_ring_advance(ring);
+	__intel_ring_advance(ring);
 
 	return 0;
 }
@@ -931,7 +931,6 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
 	intel_ring_emit(ring, get_pd_offset(ppgtt));
 	intel_ring_emit(ring, MI_NOOP);
-	intel_ring_advance(ring);
 
 	/* XXX: RCS is the only one to auto invalidate the TLBs? */
 	if (ring->id != RCS) {
@@ -940,6 +939,8 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
 			return ret;
 	}
 
+	__intel_ring_advance(ring);
+
 	return 0;
 }
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] drm/i915: Detect page faults during hangcheck
  2015-03-16 13:46 ` [PATCH 2/3] drm/i915: Detect page faults during hangcheck Mika Kuoppala
@ 2015-03-16 17:53   ` Daniel Vetter
  2015-03-16 21:30     ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Vetter @ 2015-03-16 17:53 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Mon, Mar 16, 2015 at 03:46:35PM +0200, Mika Kuoppala wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> On Sandybridge+, the GPU provides the ERROR register for detecting page
> faults. Hook this up to our hangcheck so that we can dump the error
> state soon after such an event occurs. This would be better inside an
> interrupt handler, but it serves a purpose here as it detects that our
> initial context setup is invalid...
> 
> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_irq.c     | 5 +++++
>  drivers/gpu/drm/i915/intel_uncore.c | 2 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 49ad5fb..ea668fc 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2929,6 +2929,11 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
>  	if (!i915.enable_hangcheck)
>  		return;
>  
> +	if (INTEL_INFO(dev_priv)->gen >= 6 && I915_READ(ERROR_GEN6)) {
> +		i915_handle_error(dev, false, "GPU reported a page fault");
> +		I915_WRITE(ERROR_GEN6, 0);

Shouldn't we also at least report the bits from the ERROR register
somewhere? Or are they supremely useless?
-Daniel

> +	}
> +
>  	for_each_ring(ring, dev_priv, i) {
>  		u64 acthd;
>  		u32 seqno;
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index ab5cc94..c58535e 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -359,6 +359,8 @@ static void __intel_uncore_early_sanitize(struct drm_device *dev,
>  	if (IS_GEN6(dev) || IS_GEN7(dev))
>  		__raw_i915_write32(dev_priv, GTFIFODBG,
>  				   __raw_i915_read32(dev_priv, GTFIFODBG));
> +	if (INTEL_INFO(dev)->gen >= 6)
> +		__raw_i915_write32(dev_priv, ERROR_GEN6, 0);
>  
>  	intel_uncore_forcewake_reset(dev, restore_forcewake);
>  }
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] drm/i915: Detect page faults during hangcheck
  2015-03-16 17:53   ` Daniel Vetter
@ 2015-03-16 21:30     ` Chris Wilson
  2015-03-17 10:31       ` Daniel Vetter
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2015-03-16 21:30 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, Mar 16, 2015 at 06:53:40PM +0100, Daniel Vetter wrote:
> On Mon, Mar 16, 2015 at 03:46:35PM +0200, Mika Kuoppala wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > On Sandybridge+, the GPU provides the ERROR register for detecting page
> > faults. Hook this up to our hangcheck so that we can dump the error
> > state soon after such an event occurs. This would be better inside an
> > interrupt handler, but it serves a purpose here as it detects that our
> > initial context setup is invalid...
> > 
> > Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_irq.c     | 5 +++++
> >  drivers/gpu/drm/i915/intel_uncore.c | 2 ++
> >  2 files changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 49ad5fb..ea668fc 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -2929,6 +2929,11 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
> >  	if (!i915.enable_hangcheck)
> >  		return;
> >  
> > +	if (INTEL_INFO(dev_priv)->gen >= 6 && I915_READ(ERROR_GEN6)) {
> > +		i915_handle_error(dev, false, "GPU reported a page fault");
> > +		I915_WRITE(ERROR_GEN6, 0);
> 
> Shouldn't we also at least report the bits from the ERROR register
> somewhere? Or are they supremely useless?

It's recorded in the hangcheck already - that's how I knew we were
generating pagefaults during module init in the first place.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] drm/i915: Push mm switch immediately to ring
  2015-03-16 15:58           ` [PATCH] drm/i915: Push mm switch immediately to ring Mika Kuoppala
@ 2015-03-16 21:35             ` Chris Wilson
  2015-03-17 16:10               ` Mika Kuoppala
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2015-03-16 21:35 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Mon, Mar 16, 2015 at 05:58:12PM +0200, Mika Kuoppala wrote:
> Sometimes when first batch is run with blitter ring,
> @@ -931,7 +931,6 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
>  	intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
>  	intel_ring_emit(ring, get_pd_offset(ppgtt));
>  	intel_ring_emit(ring, MI_NOOP);
> -	intel_ring_advance(ring);
>  
>  	/* XXX: RCS is the only one to auto invalidate the TLBs? */
>  	if (ring->id != RCS) {
> @@ -940,6 +939,8 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
>  			return ret;
>  	}
>  
> +	__intel_ring_advance(ring);
> +

I really would like a comment before each of these, something like
/* XXX This papers over a bug loading PD prior to batch execution */

This last chunk is buggy. The earlier unadvanced return will break the
ringbuffer state tracking. Do you have equal success just with the plain
s/intel_ring_advance/__intel_ring_advance/ ?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] drm/i915: Detect page faults during hangcheck
  2015-03-16 21:30     ` Chris Wilson
@ 2015-03-17 10:31       ` Daniel Vetter
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2015-03-17 10:31 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Mika Kuoppala, intel-gfx

On Mon, Mar 16, 2015 at 09:30:44PM +0000, Chris Wilson wrote:
> On Mon, Mar 16, 2015 at 06:53:40PM +0100, Daniel Vetter wrote:
> > On Mon, Mar 16, 2015 at 03:46:35PM +0200, Mika Kuoppala wrote:
> > > From: Chris Wilson <chris@chris-wilson.co.uk>
> > > 
> > > On Sandybridge+, the GPU provides the ERROR register for detecting page
> > > faults. Hook this up to our hangcheck so that we can dump the error
> > > state soon after such an event occurs. This would be better inside an
> > > interrupt handler, but it serves a purpose here as it detects that our
> > > initial context setup is invalid...
> > > 
> > > Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/i915_irq.c     | 5 +++++
> > >  drivers/gpu/drm/i915/intel_uncore.c | 2 ++
> > >  2 files changed, 7 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index 49ad5fb..ea668fc 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -2929,6 +2929,11 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
> > >  	if (!i915.enable_hangcheck)
> > >  		return;
> > >  
> > > +	if (INTEL_INFO(dev_priv)->gen >= 6 && I915_READ(ERROR_GEN6)) {
> > > +		i915_handle_error(dev, false, "GPU reported a page fault");
> > > +		I915_WRITE(ERROR_GEN6, 0);
> > 
> > Shouldn't we also at least report the bits from the ERROR register
> > somewhere? Or are they supremely useless?
> 
> It's recorded in the hangcheck already - that's how I knew we were
> generating pagefaults during module init in the first place.

Ah right missed that, so I think we're covered. Maybe a comment that
handle_error captures ERROR_GEN6? Doesn't really need to be though, just
in case someone decides to reorder the clearing with the handling.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] drm/i915: Push mm switch immediately to ring
  2015-03-16 21:35             ` Chris Wilson
@ 2015-03-17 16:10               ` Mika Kuoppala
  0 siblings, 0 replies; 14+ messages in thread
From: Mika Kuoppala @ 2015-03-17 16:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Mon, Mar 16, 2015 at 05:58:12PM +0200, Mika Kuoppala wrote:
>> Sometimes when first batch is run with blitter ring,
>> @@ -931,7 +931,6 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
>>  	intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
>>  	intel_ring_emit(ring, get_pd_offset(ppgtt));
>>  	intel_ring_emit(ring, MI_NOOP);
>> -	intel_ring_advance(ring);
>>  
>>  	/* XXX: RCS is the only one to auto invalidate the TLBs? */
>>  	if (ring->id != RCS) {
>> @@ -940,6 +939,8 @@ static int gen7_mm_switch(struct i915_hw_ppgtt *ppgtt,
>>  			return ret;
>>  	}
>>  
>> +	__intel_ring_advance(ring);
>> +
>
> I really would like a comment before each of these, something like
> /* XXX This papers over a bug loading PD prior to batch execution */
>
> This last chunk is buggy. The earlier unadvanced return will break the
> ringbuffer state tracking. Do you have equal success just with the plain
> s/intel_ring_advance/__intel_ring_advance/ ?

I have managed to create a series that doesn't need this tail update
trickery at all. Please ignore this series.

-Mika
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-03-17 16:10 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-16 13:46 [PATCH 0/3] ring/context initialization patches Mika Kuoppala
2015-03-16 13:46 ` [PATCH 1/3] drm/i915: Wait for render state init Mika Kuoppala
2015-03-16 13:51   ` Chris Wilson
2015-03-16 14:20     ` Mika Kuoppala
2015-03-16 14:38       ` Chris Wilson
2015-03-16 15:56         ` Mika Kuoppala
2015-03-16 15:58           ` [PATCH] drm/i915: Push mm switch immediately to ring Mika Kuoppala
2015-03-16 21:35             ` Chris Wilson
2015-03-17 16:10               ` Mika Kuoppala
2015-03-16 13:46 ` [PATCH 2/3] drm/i915: Detect page faults during hangcheck Mika Kuoppala
2015-03-16 17:53   ` Daniel Vetter
2015-03-16 21:30     ` Chris Wilson
2015-03-17 10:31       ` Daniel Vetter
2015-03-16 13:46 ` [PATCH 3/3] drm/i915: Reorder hw init to avoid executing with invalid context/mm state Mika Kuoppala

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.