[PATCH 0/3] Per Engine hang detection and recovery

* [PATCH 0/3] Per Engine hang detection and recovery
@ 2013-11-11 14:58 Siluvery, Arun
  2013-11-11 15:31 ` Daniel Vetter
  0 siblings, 1 reply; 4+ messages in thread
From: Siluvery, Arun @ 2013-11-11 14:58 UTC
  To: intel-gfx

From: "Siluvery, Arun" <arun.siluvery@intel.com>

This patchset adds Timeout Detection and Recovery (TDR), which provides
per-engine hang detection and recovery.
The current driver performs a full GPU reset when a hang occurs. TDR instead
attempts to reset only the hung engine and falls back to a full reset if the
per-engine reset fails.
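
As a rough illustration of the fallback policy described above, here is a
minimal user-space sketch. It is not the code in this series: every name in
it (struct gpu, struct engine, reset_engine(), reset_gpu(),
recover_from_hang()) is hypothetical, and the *_reset_works flags exist only
to simulate whether a reset succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these are i915 identifiers. */
enum recovery_result {
	RECOVERED_ENGINE_RESET,	/* only the hung engine was reset */
	RECOVERED_FULL_RESET,	/* per-engine reset failed, full reset worked */
	RECOVERY_FAILED,	/* neither reset succeeded */
};

struct engine {
	const char *name;
	bool hung;
	bool engine_reset_works;	/* simulates per-engine reset outcome */
};

struct gpu {
	struct engine *engines;
	size_t num_engines;
	bool full_reset_works;		/* simulates full reset outcome */
	unsigned int engine_resets;	/* successful per-engine resets */
	unsigned int full_resets;	/* successful full resets */
};

/* Try to reset a single engine; the other engines are left untouched. */
static bool reset_engine(struct gpu *gpu, struct engine *engine)
{
	if (!engine->engine_reset_works)
		return false;
	engine->hung = false;
	gpu->engine_resets++;
	return true;
}

/* Reset the whole GPU, which clears the hang state of every engine. */
static bool reset_gpu(struct gpu *gpu)
{
	if (!gpu->full_reset_works)
		return false;
	for (size_t i = 0; i < gpu->num_engines; i++)
		gpu->engines[i].hung = false;
	gpu->full_resets++;
	return true;
}

/* Per-engine reset first; full GPU reset only as a fallback. */
static enum recovery_result recover_from_hang(struct gpu *gpu,
					      struct engine *engine)
{
	assert(gpu != NULL && engine != NULL);

	if (reset_engine(gpu, engine))
		return RECOVERED_ENGINE_RESET;
	if (reset_gpu(gpu))
		return RECOVERED_FULL_RESET;
	return RECOVERY_FAILED;
}
```

The point of the sketch is only the ordering: the cheaper, narrower reset is
tried first, and the full reset runs only if that attempt fails.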

A full GPU reset can leave the system in a state where the display updates
intermittently and may lock up, depending on the workload at the time of the
hang. TDR can help recover the system in those cases, improving stability.

The changes are split into three patches:
1. Ring utility functions to save/restore context, reset a ring, etc.
2. TDR hang detection logic and error recovery function.
3. Debugfs changes to export TDR statistics.

I have tested these changes on drm-intel-nightly with a simple test which
inserts a bad batch buffer on a specific ring to trigger a hang. The TDR logic
then detects the hang and recovers from it by skipping the bad batch.

Please review and give your comments.

regards
Arun

Siluvery, Arun (3):
  drm/i915: Add ring functions to save/restore context for per-ring
    reset
  drm/i915: Per-engine Timeout detection and recovery on HSW
  drm/i915: Export TDR hang count to debugfs

 drivers/gpu/drm/i915/i915_debugfs.c     |  68 +++-
 drivers/gpu/drm/i915/i915_dma.c         |  16 +-
 drivers/gpu/drm/i915/i915_drv.c         | 195 +++++++++-
 drivers/gpu/drm/i915/i915_drv.h         |  92 ++++-
 drivers/gpu/drm/i915/i915_gem.c         |  77 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c   |  25 +-
 drivers/gpu/drm/i915/i915_irq.c         | 556 ++++++++++++++++-------------
 drivers/gpu/drm/i915/i915_reg.h         |   7 +
 drivers/gpu/drm/i915/intel_display.c    |  25 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 607 +++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  51 +++
 drivers/gpu/drm/i915/intel_uncore.c     |  31 +-
 include/drm/drmP.h                      |   7 +
 13 files changed, 1467 insertions(+), 290 deletions(-)

-- 
1.8.4
