* [PATCH 1/6] drm/i915: Extract register state error capture
@ 2014-01-30 8:19 Ben Widawsky
2014-01-30 8:19 ` [PATCH 2/6] drm/i915: Logically reorder error register capture Ben Widawsky
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Ben Widawsky @ 2014-01-30 8:19 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
The code has become quite hairy. By relocating all the generic registers
it will become more obvious where future ones should go. There is still
admittedly a bit of confusion left for things like per ring registers.
A subsequent patch will clean this function up.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
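The shape of this refactor can be sketched outside the driver. What follows
is a minimal illustration under stated assumptions, not the driver code:
`fake_device`, `read_reg()` and the two-field `error_state` are hypothetical
stand-ins for `drm_i915_private`, `I915_READ()` and `drm_i915_error_state`.
The top-level function keeps the allocation, and the helper only reads
registers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical register indices; the driver reads EIR, PGTBL_ER, ... via MMIO. */
enum { REG_EIR, REG_PGTBL_ER, REG_COUNT };

/* Stand-in for the hardware: a plain array instead of MMIO space. */
struct fake_device {
	uint32_t regs[REG_COUNT];
};

struct error_state {
	uint32_t eir;
	uint32_t pgtbl_er;
};

static uint32_t read_reg(const struct fake_device *dev, int reg)
{
	return dev->regs[reg];
}

/* Helper: only reads registers, no allocation and no locking. */
static void capture_reg_state(const struct fake_device *dev,
			      struct error_state *error)
{
	assert(dev != NULL && error != NULL);
	error->eir = read_reg(dev, REG_EIR);
	error->pgtbl_er = read_reg(dev, REG_PGTBL_ER);
}

/* Top level: owns the allocation, then delegates one category at a time. */
static struct error_state *capture_error_state(const struct fake_device *dev)
{
	struct error_state *error = calloc(1, sizeof(*error));

	if (!error)
		return NULL;

	capture_reg_state(dev, error);
	/* buffer, fence and ring capture would follow here */
	return error;
}
```

A caller would invoke `capture_error_state()` once per error and free the
result when done. New generic registers would then only touch the helper.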
drivers/gpu/drm/i915/i915_gpu_error.c | 77 +++++++++++++++++++----------------
1 file changed, 43 insertions(+), 34 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 21cf0cf..67c82e5 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1012,43 +1012,13 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
}
}
-/**
- * i915_capture_error_state - capture an error record for later analysis
- * @dev: drm device
- *
- * Should be called when an error is detected (either a hang or an error
- * interrupt) to capture error state from the time of the error. Fills
- * out a structure which becomes available in debugfs for user level tools
- * to pick up.
- */
-void i915_capture_error_state(struct drm_device *dev)
+/* Capture all registers which don't fit into another category. */
+static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
+ struct drm_i915_error_state *error)
{
- struct drm_i915_private *dev_priv = dev->dev_private;
- struct drm_i915_error_state *error;
- unsigned long flags;
+ struct drm_device *dev = dev_priv->dev;
int pipe;
- spin_lock_irqsave(&dev_priv->gpu_error.lock, flags);
- error = dev_priv->gpu_error.first_error;
- spin_unlock_irqrestore(&dev_priv->gpu_error.lock, flags);
- if (error)
- return;
-
- /* Account for pipe specific data like PIPE*STAT */
- error = kzalloc(sizeof(*error), GFP_ATOMIC);
- if (!error) {
- DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
- return;
- }
-
- DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n",
- dev->primary->index);
- DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
- DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
- DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
- DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n");
-
- kref_init(&error->ref);
error->eir = I915_READ(EIR);
error->pgtbl_er = I915_READ(PGTBL_ER);
if (HAS_HW_CONTEXTS(dev))
@@ -1086,7 +1056,46 @@ void i915_capture_error_state(struct drm_device *dev)
error->err_int = I915_READ(GEN7_ERR_INT);
i915_get_extra_instdone(dev, error->extra_instdone);
+}
+
+/**
+ * i915_capture_error_state - capture an error record for later analysis
+ * @dev: drm device
+ *
+ * Should be called when an error is detected (either a hang or an error
+ * interrupt) to capture error state from the time of the error. Fills
+ * out a structure which becomes available in debugfs for user level tools
+ * to pick up.
+ */
+void i915_capture_error_state(struct drm_device *dev)
+{
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct drm_i915_error_state *error;
+ unsigned long flags;
+
+ spin_lock_irqsave(&dev_priv->gpu_error.lock, flags);
+ error = dev_priv->gpu_error.first_error;
+ spin_unlock_irqrestore(&dev_priv->gpu_error.lock, flags);
+ if (error)
+ return;
+
+ /* Account for pipe specific data like PIPE*STAT */
+ error = kzalloc(sizeof(*error), GFP_ATOMIC);
+ if (!error) {
+ DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
+ return;
+ }
+
+ DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n",
+ dev->primary->index);
+ DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
+ DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
+ DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
+ DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n");
+
+ kref_init(&error->ref);
+ i915_capture_reg_state(dev_priv, error);
i915_gem_capture_buffers(dev_priv, error);
i915_gem_record_fences(dev, error);
i915_gem_record_rings(dev, error);
--
1.8.5.3
* [PATCH 2/6] drm/i915: Logically reorder error register capture
2014-01-30 8:19 [PATCH 1/6] drm/i915: Extract register state error capture Ben Widawsky
@ 2014-01-30 8:19 ` Ben Widawsky
2014-01-30 8:19 ` [PATCH 3/6] drm/i915: Reorder struct members Ben Widawsky
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Ben Widawsky @ 2014-01-30 8:19 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
Create logical sections in an attempt to clean up, and continue to keep
future additions clean.
v2: Reworded the comments. Added section headers (Chris)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 59 +++++++++++++++++++++--------------
1 file changed, 36 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 67c82e5..c9d4a18 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1019,41 +1019,54 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
struct drm_device *dev = dev_priv->dev;
int pipe;
- error->eir = I915_READ(EIR);
- error->pgtbl_er = I915_READ(PGTBL_ER);
- if (HAS_HW_CONTEXTS(dev))
- error->ccid = I915_READ(CCID);
+ /* General organization
+ * 1. Registers specific to a single generation
+ * 2. Registers which belong to multiple generations
+ * 3. Feature specific registers.
+ * 4. Everything else
+ * Please try to follow the order.
+ */
- if (HAS_PCH_SPLIT(dev))
- error->ier = I915_READ(DEIER) | I915_READ(GTIER);
- else if (IS_VALLEYVIEW(dev))
+ /* 1: Registers specific to a single generation */
+ if (IS_VALLEYVIEW(dev)) {
error->ier = I915_READ(GTIER) | I915_READ(VLV_IER);
- else if (IS_GEN2(dev))
- error->ier = I915_READ16(IER);
- else
- error->ier = I915_READ(IER);
+ error->forcewake = I915_READ(FORCEWAKE_VLV);
+ }
- if (INTEL_INFO(dev)->gen >= 6)
- error->derrmr = I915_READ(DERRMR);
+ if (IS_GEN7(dev))
+ error->err_int = I915_READ(GEN7_ERR_INT);
- if (IS_VALLEYVIEW(dev))
- error->forcewake = I915_READ(FORCEWAKE_VLV);
- else if (INTEL_INFO(dev)->gen >= 7)
- error->forcewake = I915_READ(FORCEWAKE_MT);
- else if (INTEL_INFO(dev)->gen == 6)
+ if (IS_GEN6(dev))
error->forcewake = I915_READ(FORCEWAKE);
- if (!HAS_PCH_SPLIT(dev))
- for_each_pipe(pipe)
- error->pipestat[pipe] = I915_READ(PIPESTAT(pipe));
+ if (IS_GEN2(dev))
+ error->ier = I915_READ16(IER);
+
+ /* 2: Registers which belong to multiple generations */
+ if (INTEL_INFO(dev)->gen >= 7)
+ error->forcewake = I915_READ(FORCEWAKE_MT);
if (INTEL_INFO(dev)->gen >= 6) {
+ error->derrmr = I915_READ(DERRMR);
error->error = I915_READ(ERROR_GEN6);
error->done_reg = I915_READ(DONE_REG);
}
- if (INTEL_INFO(dev)->gen == 7)
- error->err_int = I915_READ(GEN7_ERR_INT);
+ /* 3: Feature specific registers */
+ if (HAS_HW_CONTEXTS(dev))
+ error->ccid = I915_READ(CCID);
+
+ if (HAS_PCH_SPLIT(dev))
+ error->ier = I915_READ(DEIER) | I915_READ(GTIER);
+ else {
+ error->ier = I915_READ(IER);
+ for_each_pipe(pipe)
+ error->pipestat[pipe] = I915_READ(PIPESTAT(pipe));
+ }
+
+ /* 4: Everything else */
+ error->eir = I915_READ(EIR);
+ error->pgtbl_er = I915_READ(PGTBL_ER);
i915_get_extra_instdone(dev, error->extra_instdone);
}
--
1.8.5.3
* [PATCH 3/6] drm/i915: Reorder struct members
2014-01-30 8:19 [PATCH 1/6] drm/i915: Extract register state error capture Ben Widawsky
2014-01-30 8:19 ` [PATCH 2/6] drm/i915: Logically reorder error register capture Ben Widawsky
@ 2014-01-30 8:19 ` Ben Widawsky
2014-01-30 8:19 ` [PATCH 4/6] drm/i915: Move per ring error state to ring_error Ben Widawsky
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Ben Widawsky @ 2014-01-30 8:19 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
This helps make an upcoming patch a bit more reviewable.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.h | 43 ++++++++++++++++++++++++-----------------
1 file changed, 25 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3782b36..cd97c86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -295,14 +295,26 @@ struct intel_display_error_state;
struct drm_i915_error_state {
struct kref ref;
+ struct timeval time;
+
+ /* Generic register state */
u32 eir;
u32 pgtbl_er;
u32 ier;
u32 ccid;
u32 derrmr;
u32 forcewake;
- bool waiting[I915_NUM_RINGS];
+ u32 error; /* gen6+ */
+ u32 err_int; /* gen7 */
+ u32 done_reg;
+ u32 extra_instdone[I915_NUM_INSTDONE_REG];
u32 pipestat[I915_MAX_PIPES];
+ u64 fence[I915_MAX_NUM_FENCES];
+ struct intel_overlay_error_state *overlay;
+ struct intel_display_error_state *display;
+
+ /* Per ring register state
+ * TODO: Move these to per ring */
u32 tail[I915_NUM_RINGS];
u32 head[I915_NUM_RINGS];
u32 ctl[I915_NUM_RINGS];
@@ -311,25 +323,25 @@ struct drm_i915_error_state {
u32 ipehr[I915_NUM_RINGS];
u32 instdone[I915_NUM_RINGS];
u32 acthd[I915_NUM_RINGS];
- u32 semaphore_mboxes[I915_NUM_RINGS][I915_NUM_RINGS - 1];
- u32 semaphore_seqno[I915_NUM_RINGS][I915_NUM_RINGS - 1];
- u32 rc_psmi[I915_NUM_RINGS]; /* sleep state */
- /* our own tracking of ring head and tail */
- u32 cpu_ring_head[I915_NUM_RINGS];
- u32 cpu_ring_tail[I915_NUM_RINGS];
- u32 error; /* gen6+ */
- u32 err_int; /* gen7 */
u32 bbstate[I915_NUM_RINGS];
u32 instpm[I915_NUM_RINGS];
u32 instps[I915_NUM_RINGS];
- u32 extra_instdone[I915_NUM_INSTDONE_REG];
u32 seqno[I915_NUM_RINGS];
u64 bbaddr[I915_NUM_RINGS];
u32 fault_reg[I915_NUM_RINGS];
- u32 done_reg;
u32 faddr[I915_NUM_RINGS];
- u64 fence[I915_MAX_NUM_FENCES];
- struct timeval time;
+ u32 rc_psmi[I915_NUM_RINGS]; /* sleep state */
+ u32 semaphore_mboxes[I915_NUM_RINGS][I915_NUM_RINGS - 1];
+
+ /* Software tracked state */
+ bool waiting[I915_NUM_RINGS];
+ int hangcheck_score[I915_NUM_RINGS];
+ enum intel_ring_hangcheck_action hangcheck_action[I915_NUM_RINGS];
+
+ /* our own tracking of ring head and tail */
+ u32 cpu_ring_head[I915_NUM_RINGS];
+ u32 cpu_ring_tail[I915_NUM_RINGS];
+ u32 semaphore_seqno[I915_NUM_RINGS][I915_NUM_RINGS - 1];
struct drm_i915_error_ring {
bool valid;
@@ -363,11 +375,6 @@ struct drm_i915_error_state {
} **active_bo, **pinned_bo;
u32 *active_bo_count, *pinned_bo_count;
u32 vm_count;
-
- struct intel_overlay_error_state *overlay;
- struct intel_display_error_state *display;
- int hangcheck_score[I915_NUM_RINGS];
- enum intel_ring_hangcheck_action hangcheck_action[I915_NUM_RINGS];
};
struct intel_connector;
--
1.8.5.3
* [PATCH 4/6] drm/i915: Move per ring error state to ring_error
2014-01-30 8:19 [PATCH 1/6] drm/i915: Extract register state error capture Ben Widawsky
2014-01-30 8:19 ` [PATCH 2/6] drm/i915: Logically reorder error register capture Ben Widawsky
2014-01-30 8:19 ` [PATCH 3/6] drm/i915: Reorder struct members Ben Widawsky
@ 2014-01-30 8:19 ` Ben Widawsky
2014-01-30 8:19 ` [PATCH 5/6] drm/i915: Add some more registers to error state Ben Widawsky
2014-01-30 8:19 ` [PATCH 6/6] drm/i915: Capture PPGTT info on error capture Ben Widawsky
4 siblings, 0 replies; 8+ messages in thread
From: Ben Widawsky @ 2014-01-30 8:19 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
v2: Moved num_requests up (Chris)
Rebased on the new hws page capture, which required a rename since it left
two members named 'hws' in the per ring error state. (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
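The data layout change amounts to moving from one array per register to one
record per ring. A minimal sketch under assumptions: `NUM_RINGS`,
`record_ring()` and `format_ring()` are hypothetical, only two registers are
shown, and the values are passed in rather than read from hardware.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_RINGS 4 /* hypothetical; the driver uses I915_NUM_RINGS */

/* Before: one array per register, every access needs the ring index. */
struct error_state_flat {
	uint32_t head[NUM_RINGS];
	uint32_t tail[NUM_RINGS];
};

/* After: each ring owns its captured registers. */
struct error_ring {
	bool valid;
	uint32_t head;
	uint32_t tail;
};

struct error_state {
	struct error_ring ring[NUM_RINGS];
};

/* Capture side: fills one ring record (values passed in, no MMIO here). */
static void record_ring(struct error_ring *ering, uint32_t head, uint32_t tail)
{
	assert(ering != NULL);
	ering->valid = true;
	ering->head = head;
	ering->tail = tail;
}

/* Print side: needs the ring record alone; returns snprintf()'s result. */
static int format_ring(char *buf, size_t len, const struct error_ring *ring)
{
	assert(buf != NULL && ring != NULL);
	if (!ring->valid)
		return 0;
	return snprintf(buf, len,
			"  HEAD: 0x%08" PRIx32 "\n  TAIL: 0x%08" PRIx32 "\n",
			ring->head, ring->tail);
}
```

With this layout the print helper no longer takes an index, so a bounds check
on the index becomes unnecessary inside it. The caller iterates over
`error->ring[]` and passes `&error->ring[i]`.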
drivers/gpu/drm/i915/i915_drv.h | 65 ++++++++--------
drivers/gpu/drm/i915/i915_gpu_error.c | 143 +++++++++++++++++-----------------
2 files changed, 104 insertions(+), 104 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cd97c86..d20fc80 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -313,49 +313,50 @@ struct drm_i915_error_state {
struct intel_overlay_error_state *overlay;
struct intel_display_error_state *display;
- /* Per ring register state
- * TODO: Move these to per ring */
- u32 tail[I915_NUM_RINGS];
- u32 head[I915_NUM_RINGS];
- u32 ctl[I915_NUM_RINGS];
- u32 hws[I915_NUM_RINGS];
- u32 ipeir[I915_NUM_RINGS];
- u32 ipehr[I915_NUM_RINGS];
- u32 instdone[I915_NUM_RINGS];
- u32 acthd[I915_NUM_RINGS];
- u32 bbstate[I915_NUM_RINGS];
- u32 instpm[I915_NUM_RINGS];
- u32 instps[I915_NUM_RINGS];
- u32 seqno[I915_NUM_RINGS];
- u64 bbaddr[I915_NUM_RINGS];
- u32 fault_reg[I915_NUM_RINGS];
- u32 faddr[I915_NUM_RINGS];
- u32 rc_psmi[I915_NUM_RINGS]; /* sleep state */
- u32 semaphore_mboxes[I915_NUM_RINGS][I915_NUM_RINGS - 1];
-
- /* Software tracked state */
- bool waiting[I915_NUM_RINGS];
- int hangcheck_score[I915_NUM_RINGS];
- enum intel_ring_hangcheck_action hangcheck_action[I915_NUM_RINGS];
-
- /* our own tracking of ring head and tail */
- u32 cpu_ring_head[I915_NUM_RINGS];
- u32 cpu_ring_tail[I915_NUM_RINGS];
- u32 semaphore_seqno[I915_NUM_RINGS][I915_NUM_RINGS - 1];
-
struct drm_i915_error_ring {
bool valid;
+ /* Software tracked state */
+ bool waiting;
+ int hangcheck_score;
+ enum intel_ring_hangcheck_action hangcheck_action;
+ int num_requests;
+
+ /* our own tracking of ring head and tail */
+ u32 cpu_ring_head;
+ u32 cpu_ring_tail;
+
+ u32 semaphore_seqno[I915_NUM_RINGS - 1];
+
+ /* Register state */
+ u32 tail;
+ u32 head;
+ u32 ctl;
+ u32 hws;
+ u32 ipeir;
+ u32 ipehr;
+ u32 instdone;
+ u32 acthd;
+ u32 bbstate;
+ u32 instpm;
+ u32 instps;
+ u32 seqno;
+ u64 bbaddr;
+ u32 fault_reg;
+ u32 faddr;
+ u32 rc_psmi; /* sleep state */
+ u32 semaphore_mboxes[I915_NUM_RINGS - 1];
+
struct drm_i915_error_object {
int page_count;
u32 gtt_offset;
u32 *pages[0];
- } *ringbuffer, *batchbuffer, *ctx, *hws;
+ } *ringbuffer, *batchbuffer, *ctx, *hws_page;
+
struct drm_i915_error_request {
long jiffies;
u32 seqno;
u32 tail;
} *requests;
- int num_requests;
} ring[I915_NUM_RINGS];
struct drm_i915_error_buffer {
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index c9d4a18..07433bc 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -235,51 +235,48 @@ static const char *hangcheck_action_to_str(enum intel_ring_hangcheck_action a)
static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
struct drm_device *dev,
- struct drm_i915_error_state *error,
- unsigned ring)
+ struct drm_i915_error_ring *ring)
{
- BUG_ON(ring >= I915_NUM_RINGS); /* shut up confused gcc */
- if (!error->ring[ring].valid)
+ if (!ring->valid)
return;
- err_printf(m, "%s command stream:\n", ring_str(ring));
- err_printf(m, " HEAD: 0x%08x\n", error->head[ring]);
- err_printf(m, " TAIL: 0x%08x\n", error->tail[ring]);
- err_printf(m, " CTL: 0x%08x\n", error->ctl[ring]);
- err_printf(m, " HWS: 0x%08x\n", error->hws[ring]);
- err_printf(m, " ACTHD: 0x%08x\n", error->acthd[ring]);
- err_printf(m, " IPEIR: 0x%08x\n", error->ipeir[ring]);
- err_printf(m, " IPEHR: 0x%08x\n", error->ipehr[ring]);
- err_printf(m, " INSTDONE: 0x%08x\n", error->instdone[ring]);
+ err_printf(m, " HEAD: 0x%08x\n", ring->head);
+ err_printf(m, " TAIL: 0x%08x\n", ring->tail);
+ err_printf(m, " CTL: 0x%08x\n", ring->ctl);
+ err_printf(m, " HWS: 0x%08x\n", ring->hws);
+ err_printf(m, " ACTHD: 0x%08x\n", ring->acthd);
+ err_printf(m, " IPEIR: 0x%08x\n", ring->ipeir);
+ err_printf(m, " IPEHR: 0x%08x\n", ring->ipehr);
+ err_printf(m, " INSTDONE: 0x%08x\n", ring->instdone);
if (INTEL_INFO(dev)->gen >= 4) {
- err_printf(m, " BBADDR: 0x%08llx\n", error->bbaddr[ring]);
- err_printf(m, " BB_STATE: 0x%08x\n", error->bbstate[ring]);
- err_printf(m, " INSTPS: 0x%08x\n", error->instps[ring]);
+ err_printf(m, " BBADDR: 0x%08llx\n", ring->bbaddr);
+ err_printf(m, " BB_STATE: 0x%08x\n", ring->bbstate);
+ err_printf(m, " INSTPS: 0x%08x\n", ring->instps);
}
- err_printf(m, " INSTPM: 0x%08x\n", error->instpm[ring]);
- err_printf(m, " FADDR: 0x%08x\n", error->faddr[ring]);
+ err_printf(m, " INSTPM: 0x%08x\n", ring->instpm);
+ err_printf(m, " FADDR: 0x%08x\n", ring->faddr);
if (INTEL_INFO(dev)->gen >= 6) {
- err_printf(m, " RC PSMI: 0x%08x\n", error->rc_psmi[ring]);
- err_printf(m, " FAULT_REG: 0x%08x\n", error->fault_reg[ring]);
+ err_printf(m, " RC PSMI: 0x%08x\n", ring->rc_psmi);
+ err_printf(m, " FAULT_REG: 0x%08x\n", ring->fault_reg);
err_printf(m, " SYNC_0: 0x%08x [last synced 0x%08x]\n",
- error->semaphore_mboxes[ring][0],
- error->semaphore_seqno[ring][0]);
+ ring->semaphore_mboxes[0],
+ ring->semaphore_seqno[0]);
err_printf(m, " SYNC_1: 0x%08x [last synced 0x%08x]\n",
- error->semaphore_mboxes[ring][1],
- error->semaphore_seqno[ring][1]);
+ ring->semaphore_mboxes[1],
+ ring->semaphore_seqno[1]);
if (HAS_VEBOX(dev)) {
err_printf(m, " SYNC_2: 0x%08x [last synced 0x%08x]\n",
- error->semaphore_mboxes[ring][2],
- error->semaphore_seqno[ring][2]);
+ ring->semaphore_mboxes[2],
+ ring->semaphore_seqno[2]);
}
}
- err_printf(m, " seqno: 0x%08x\n", error->seqno[ring]);
- err_printf(m, " waiting: %s\n", yesno(error->waiting[ring]));
- err_printf(m, " ring->head: 0x%08x\n", error->cpu_ring_head[ring]);
- err_printf(m, " ring->tail: 0x%08x\n", error->cpu_ring_tail[ring]);
+ err_printf(m, " seqno: 0x%08x\n", ring->seqno);
+ err_printf(m, " waiting: %s\n", yesno(ring->waiting));
+ err_printf(m, " ring->head: 0x%08x\n", ring->cpu_ring_head);
+ err_printf(m, " ring->tail: 0x%08x\n", ring->cpu_ring_tail);
err_printf(m, " hangcheck: %s [%d]\n",
- hangcheck_action_to_str(error->hangcheck_action[ring]),
- error->hangcheck_score[ring]);
+ hangcheck_action_to_str(ring->hangcheck_action),
+ ring->hangcheck_score);
}
void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...)
@@ -331,8 +328,10 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
if (INTEL_INFO(dev)->gen == 7)
err_printf(m, "ERR_INT: 0x%08x\n", error->err_int);
- for (i = 0; i < ARRAY_SIZE(error->ring); i++)
- i915_ring_error_state(m, dev, error, i);
+ for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
+ err_printf(m, "%s command stream:\n", ring_str(i));
+ i915_ring_error_state(m, dev, &error->ring[i]);
+ }
for (i = 0; i < error->vm_count; i++) {
err_printf(m, "vm[%d]\n", i);
@@ -390,7 +389,7 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
}
}
- if ((obj = error->ring[i].hws)) {
+ if ((obj = error->ring[i].hws_page)) {
err_printf(m, "%s --- HW Status = 0x%08x\n",
dev_priv->ring[i].name,
obj->gtt_offset);
@@ -488,7 +487,7 @@ static void i915_error_state_free(struct kref *error_ref)
for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
i915_error_object_free(error->ring[i].batchbuffer);
i915_error_object_free(error->ring[i].ringbuffer);
- i915_error_object_free(error->ring[i].hws);
+ i915_error_object_free(error->ring[i].hws_page);
i915_error_object_free(error->ring[i].ctx);
kfree(error->ring[i].requests);
}
@@ -767,52 +766,52 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
}
static void i915_record_ring_state(struct drm_device *dev,
- struct drm_i915_error_state *error,
- struct intel_ring_buffer *ring)
+ struct intel_ring_buffer *ring,
+ struct drm_i915_error_ring *ering)
{
struct drm_i915_private *dev_priv = dev->dev_private;
if (INTEL_INFO(dev)->gen >= 6) {
- error->rc_psmi[ring->id] = I915_READ(ring->mmio_base + 0x50);
- error->fault_reg[ring->id] = I915_READ(RING_FAULT_REG(ring));
- error->semaphore_mboxes[ring->id][0]
+ ering->rc_psmi = I915_READ(ring->mmio_base + 0x50);
+ ering->fault_reg = I915_READ(RING_FAULT_REG(ring));
+ ering->semaphore_mboxes[0]
= I915_READ(RING_SYNC_0(ring->mmio_base));
- error->semaphore_mboxes[ring->id][1]
+ ering->semaphore_mboxes[1]
= I915_READ(RING_SYNC_1(ring->mmio_base));
- error->semaphore_seqno[ring->id][0] = ring->sync_seqno[0];
- error->semaphore_seqno[ring->id][1] = ring->sync_seqno[1];
+ ering->semaphore_seqno[0] = ring->sync_seqno[0];
+ ering->semaphore_seqno[1] = ring->sync_seqno[1];
}
if (HAS_VEBOX(dev)) {
- error->semaphore_mboxes[ring->id][2] =
+ ering->semaphore_mboxes[2] =
I915_READ(RING_SYNC_2(ring->mmio_base));
- error->semaphore_seqno[ring->id][2] = ring->sync_seqno[2];
+ ering->semaphore_seqno[2] = ring->sync_seqno[2];
}
if (INTEL_INFO(dev)->gen >= 4) {
- error->faddr[ring->id] = I915_READ(RING_DMA_FADD(ring->mmio_base));
- error->ipeir[ring->id] = I915_READ(RING_IPEIR(ring->mmio_base));
- error->ipehr[ring->id] = I915_READ(RING_IPEHR(ring->mmio_base));
- error->instdone[ring->id] = I915_READ(RING_INSTDONE(ring->mmio_base));
- error->instps[ring->id] = I915_READ(RING_INSTPS(ring->mmio_base));
- error->bbaddr[ring->id] = I915_READ(RING_BBADDR(ring->mmio_base));
+ ering->faddr = I915_READ(RING_DMA_FADD(ring->mmio_base));
+ ering->ipeir = I915_READ(RING_IPEIR(ring->mmio_base));
+ ering->ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
+ ering->instdone = I915_READ(RING_INSTDONE(ring->mmio_base));
+ ering->instps = I915_READ(RING_INSTPS(ring->mmio_base));
+ ering->bbaddr = I915_READ(RING_BBADDR(ring->mmio_base));
if (INTEL_INFO(dev)->gen >= 8)
- error->bbaddr[ring->id] |= (u64) I915_READ(RING_BBADDR_UDW(ring->mmio_base)) << 32;
- error->bbstate[ring->id] = I915_READ(RING_BBSTATE(ring->mmio_base));
+ ering->bbaddr |= (u64) I915_READ(RING_BBADDR_UDW(ring->mmio_base)) << 32;
+ ering->bbstate = I915_READ(RING_BBSTATE(ring->mmio_base));
} else {
- error->faddr[ring->id] = I915_READ(DMA_FADD_I8XX);
- error->ipeir[ring->id] = I915_READ(IPEIR);
- error->ipehr[ring->id] = I915_READ(IPEHR);
- error->instdone[ring->id] = I915_READ(INSTDONE);
+ ering->faddr = I915_READ(DMA_FADD_I8XX);
+ ering->ipeir = I915_READ(IPEIR);
+ ering->ipehr = I915_READ(IPEHR);
+ ering->instdone = I915_READ(INSTDONE);
}
- error->waiting[ring->id] = waitqueue_active(&ring->irq_queue);
- error->instpm[ring->id] = I915_READ(RING_INSTPM(ring->mmio_base));
- error->seqno[ring->id] = ring->get_seqno(ring, false);
- error->acthd[ring->id] = intel_ring_get_active_head(ring);
- error->head[ring->id] = I915_READ_HEAD(ring);
- error->tail[ring->id] = I915_READ_TAIL(ring);
- error->ctl[ring->id] = I915_READ_CTL(ring);
+ ering->waiting = waitqueue_active(&ring->irq_queue);
+ ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
+ ering->seqno = ring->get_seqno(ring, false);
+ ering->acthd = intel_ring_get_active_head(ring);
+ ering->head = I915_READ_HEAD(ring);
+ ering->tail = I915_READ_TAIL(ring);
+ ering->ctl = I915_READ_CTL(ring);
if (I915_NEED_GFX_HWS(dev)) {
int mmio;
@@ -840,14 +839,14 @@ static void i915_record_ring_state(struct drm_device *dev,
mmio = RING_HWS_PGA(ring->mmio_base);
}
- error->hws[ring->id] = I915_READ(mmio);
+ ering->hws = I915_READ(mmio);
}
- error->cpu_ring_head[ring->id] = ring->head;
- error->cpu_ring_tail[ring->id] = ring->tail;
+ ering->cpu_ring_head = ring->head;
+ ering->cpu_ring_tail = ring->tail;
- error->hangcheck_score[ring->id] = ring->hangcheck.score;
- error->hangcheck_action[ring->id] = ring->hangcheck.action;
+ ering->hangcheck_score = ring->hangcheck.score;
+ ering->hangcheck_action = ring->hangcheck.action;
}
@@ -888,7 +887,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
error->ring[i].valid = true;
- i915_record_ring_state(dev, error, ring);
+ i915_record_ring_state(dev, ring, &error->ring[i]);
error->ring[i].batchbuffer =
i915_error_first_batchbuffer(dev_priv, ring);
@@ -897,7 +896,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
i915_error_ggtt_object_create(dev_priv, ring->obj);
if (ring->status_page.obj)
- error->ring[i].hws =
+ error->ring[i].hws_page =
i915_error_ggtt_object_create(dev_priv, ring->status_page.obj);
i915_gem_record_active_context(ring, error, &error->ring[i]);
--
1.8.5.3
* [PATCH 5/6] drm/i915: Add some more registers to error state
2014-01-30 8:19 [PATCH 1/6] drm/i915: Extract register state error capture Ben Widawsky
` (2 preceding siblings ...)
2014-01-30 8:19 ` [PATCH 4/6] drm/i915: Move per ring error state to ring_error Ben Widawsky
@ 2014-01-30 8:19 ` Ben Widawsky
2014-01-30 8:19 ` [PATCH 6/6] drm/i915: Capture PPGTT info on error capture Ben Widawsky
4 siblings, 0 replies; 8+ messages in thread
From: Ben Widawsky @ 2014-01-30 8:19 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
Chris:
Do we also want to capture?
GAC_ECO_BITS /* gen6,7 */
GAM_ECOCHK /* gen6,7 */
GAB_CTL /* gen6 */
GFX_MODE /* gen6 */
Requested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/i915_drv.h | 4 ++++
drivers/gpu/drm/i915/i915_gpu_error.c | 11 ++++++++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d20fc80..e41f30a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -307,6 +307,10 @@ struct drm_i915_error_state {
u32 error; /* gen6+ */
u32 err_int; /* gen7 */
u32 done_reg;
+ u32 gac_eco;
+ u32 gam_ecochk;
+ u32 gab_ctl;
+ u32 gfx_mode;
u32 extra_instdone[I915_NUM_INSTDONE_REG];
u32 pipestat[I915_MAX_PIPES];
u64 fence[I915_MAX_NUM_FENCES];
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 07433bc..4c3ca11 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1035,8 +1035,11 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
if (IS_GEN7(dev))
error->err_int = I915_READ(GEN7_ERR_INT);
- if (IS_GEN6(dev))
+ if (IS_GEN6(dev)) {
error->forcewake = I915_READ(FORCEWAKE);
+ error->gab_ctl = I915_READ(GAB_CTL);
+ error->gfx_mode = I915_READ(GFX_MODE);
+ }
if (IS_GEN2(dev))
error->ier = I915_READ16(IER);
@@ -1052,6 +1055,12 @@ static void i915_capture_reg_state(struct drm_i915_private *dev_priv,
}
/* 3: Feature specific registers */
+ if (IS_GEN6(dev) || IS_GEN7(dev)) {
+ error->gam_ecochk = I915_READ(GAM_ECOCHK);
+ error->gac_eco = I915_READ(GAC_ECO_BITS);
+ }
+
+ /* 4: Everything else */
if (HAS_HW_CONTEXTS(dev))
error->ccid = I915_READ(CCID);
--
1.8.5.3
* [PATCH 6/6] drm/i915: Capture PPGTT info on error capture
2014-01-30 8:19 [PATCH 1/6] drm/i915: Extract register state error capture Ben Widawsky
` (3 preceding siblings ...)
2014-01-30 8:19 ` [PATCH 5/6] drm/i915: Add some more registers to error state Ben Widawsky
@ 2014-01-30 8:19 ` Ben Widawsky
2014-01-30 11:26 ` Daniel Vetter
4 siblings, 1 reply; 8+ messages in thread
From: Ben Widawsky @ 2014-01-30 8:19 UTC (permalink / raw)
To: Intel GFX; +Cc: Ben Widawsky, Ben Widawsky
v2: Rebased upon cleaned up error state
v3: Make sure hangcheck info remains last (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
drivers/gpu/drm/i915/i915_gpu_error.c | 37 +++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e41f30a..3035bf3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -361,6 +361,14 @@ struct drm_i915_error_state {
u32 seqno;
u32 tail;
} *requests;
+
+ struct {
+ u32 gfx_mode;
+ union {
+ u64 pdp[4];
+ u32 pp_dir_base;
+ };
+ } vm_info;
} ring[I915_NUM_RINGS];
struct drm_i915_error_buffer {
@@ -378,6 +386,7 @@ struct drm_i915_error_state {
s32 ring:4;
u32 cache_level:3;
} **active_bo, **pinned_bo;
+
u32 *active_bo_count, *pinned_bo_count;
u32 vm_count;
};
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 4c3ca11..9d04e6a 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -270,6 +270,19 @@ static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
ring->semaphore_seqno[2]);
}
}
+ if (USES_PPGTT(dev)) {
+ err_printf(m, " GFX_MODE: 0x%08x\n", ring->vm_info.gfx_mode);
+
+ if (INTEL_INFO(dev)->gen >= 8) {
+ int i;
+ for (i = 0; i < 4; i++)
+ err_printf(m, " PDP%d: 0x%016llx\n",
+ i, ring->vm_info.pdp[i]);
+ } else {
+ err_printf(m, " PP_DIR_BASE: 0x%08x\n",
+ ring->vm_info.pp_dir_base);
+ }
+ }
err_printf(m, " seqno: 0x%08x\n", ring->seqno);
err_printf(m, " waiting: %s\n", yesno(ring->waiting));
err_printf(m, " ring->head: 0x%08x\n", ring->cpu_ring_head);
@@ -847,6 +860,30 @@ static void i915_record_ring_state(struct drm_device *dev,
ering->hangcheck_score = ring->hangcheck.score;
ering->hangcheck_action = ring->hangcheck.action;
+
+ if (USES_PPGTT(dev)) {
+ int i;
+
+ ering->vm_info.gfx_mode = I915_READ(RING_MODE_GEN7(ring));
+
+ switch (INTEL_INFO(dev)->gen) {
+ case 8:
+ for (i = 0; i < 4; i++) {
+ ering->vm_info.pdp[i] =
+ I915_READ(GEN8_RING_PDP_UDW(ring, i));
+ ering->vm_info.pdp[i] <<= 32;
+ ering->vm_info.pdp[i] |=
+ I915_READ(GEN8_RING_PDP_LDW(ring, i));
+ }
+ break;
+ case 7:
+ ering->vm_info.pp_dir_base = I915_READ(RING_PP_DIR_BASE(ring));
+ break;
+ case 6:
+ ering->vm_info.pp_dir_base = I915_READ(RING_PP_DIR_BASE_READ(ring));
+ break;
+ }
+ }
}
--
1.8.5.3
* Re: [PATCH 6/6] drm/i915: Capture PPGTT info on error capture
2014-01-30 8:19 ` [PATCH 6/6] drm/i915: Capture PPGTT info on error capture Ben Widawsky
@ 2014-01-30 11:26 ` Daniel Vetter
2014-01-30 11:34 ` Daniel Vetter
0 siblings, 1 reply; 8+ messages in thread
From: Daniel Vetter @ 2014-01-30 11:26 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Thu, Jan 30, 2014 at 12:19:40AM -0800, Ben Widawsky wrote:
> v2: Rebased upon cleaned up error state
> v3: Make sure hangcheck info remains last (Chris)
>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Pulled in entire series with Chris' irc ack on the remaining two patches.
Note that there have been a few conflicts around ring_error_state->valid
(dunno how that happened, I guess this series wasn't strictly based on
-nightly), please double-check things.
Thanks, Daniel
> ---
> drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
> drivers/gpu/drm/i915/i915_gpu_error.c | 37 +++++++++++++++++++++++++++++++++++
> 2 files changed, 46 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e41f30a..3035bf3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -361,6 +361,14 @@ struct drm_i915_error_state {
> u32 seqno;
> u32 tail;
> } *requests;
> +
> + struct {
> + u32 gfx_mode;
> + union {
> + u64 pdp[4];
> + u32 pp_dir_base;
> + };
> + } vm_info;
> } ring[I915_NUM_RINGS];
>
> struct drm_i915_error_buffer {
> @@ -378,6 +386,7 @@ struct drm_i915_error_state {
> s32 ring:4;
> u32 cache_level:3;
> } **active_bo, **pinned_bo;
> +
> u32 *active_bo_count, *pinned_bo_count;
> u32 vm_count;
> };
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 4c3ca11..9d04e6a 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -270,6 +270,19 @@ static void i915_ring_error_state(struct drm_i915_error_state_buf *m,
> ring->semaphore_seqno[2]);
> }
> }
> + if (USES_PPGTT(dev)) {
> + err_printf(m, " GFX_MODE: 0x%08x\n", ring->vm_info.gfx_mode);
> +
> + if (INTEL_INFO(dev)->gen >= 8) {
> + int i;
> + for (i = 0; i < 4; i++)
> + err_printf(m, " PDP%d: 0x%016llx\n",
> + i, ring->vm_info.pdp[i]);
> + } else {
> + err_printf(m, " PP_DIR_BASE: 0x%08x\n",
> + ring->vm_info.pp_dir_base);
> + }
> + }
> err_printf(m, " seqno: 0x%08x\n", ring->seqno);
> err_printf(m, " waiting: %s\n", yesno(ring->waiting));
> err_printf(m, " ring->head: 0x%08x\n", ring->cpu_ring_head);
> @@ -847,6 +860,30 @@ static void i915_record_ring_state(struct drm_device *dev,
>
> ering->hangcheck_score = ring->hangcheck.score;
> ering->hangcheck_action = ring->hangcheck.action;
> +
> + if (USES_PPGTT(dev)) {
> + int i;
> +
> + ering->vm_info.gfx_mode = I915_READ(RING_MODE_GEN7(ring));
> +
> + switch (INTEL_INFO(dev)->gen) {
> + case 8:
> + for (i = 0; i < 4; i++) {
> + ering->vm_info.pdp[i] =
> + I915_READ(GEN8_RING_PDP_UDW(ring, i));
> + ering->vm_info.pdp[i] <<= 32;
> + ering->vm_info.pdp[i] |=
> + I915_READ(GEN8_RING_PDP_LDW(ring, i));
> + }
> + break;
> + case 7:
> + ering->vm_info.pp_dir_base = I915_READ(RING_PP_DIR_BASE(ring));
> + break;
> + case 6:
> + ering->vm_info.pp_dir_base = I915_READ(RING_PP_DIR_BASE_READ(ring));
> + break;
> + }
> + }
> }
>
>
> --
> 1.8.5.3
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH 6/6] drm/i915: Capture PPGTT info on error capture
2014-01-30 11:26 ` Daniel Vetter
@ 2014-01-30 11:34 ` Daniel Vetter
0 siblings, 0 replies; 8+ messages in thread
From: Daniel Vetter @ 2014-01-30 11:34 UTC (permalink / raw)
To: Ben Widawsky; +Cc: Intel GFX, Ben Widawsky
On Thu, Jan 30, 2014 at 12:26:49PM +0100, Daniel Vetter wrote:
> On Thu, Jan 30, 2014 at 12:19:40AM -0800, Ben Widawsky wrote:
> > v2: Rebased upon cleaned up error state
> > v3: Make sure hangcheck info remains last (Chris)
> >
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> Pulled in entire series with Chris' irc ack on the remaining two patches.
> Note that there have been a few conflicts around ring_error_state->valid
> (dunno how that happened, I guess this series wasn't strictly based on
> -nightly), please double-check things.
Meh, I've forgotten to check -fixes - the ring->valid patch I've been
looking for was obviously there. Coffee doesn't seem to work today, I'll
do a backmerge and sort this out.
Sorry for the fuss.
-Daniel