* [RFC 00/11] TDR/watchdog timeout support for gen8
@ 2015-06-08 17:03 Tomas Elf
  2015-06-08 17:03 ` [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
                   ` (13 more replies)
  0 siblings, 14 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC
  To: Intel-GFX

This patch series introduces the following features:

* Feature 1: TDR (Timeout Detection and Recovery) for gen8 execlist mode.

TDR is an umbrella term for everything that goes into detecting and recovering
from GPU hangs; the term is more widely used outside of the upstream driver.
This feature introduces an extensible framework that currently supports gen8
but can easily be extended to support gen7 as well (gen7 support already
exists in GMIN, although unfortunately not in a quite upstreamable form). The
code contained in this submission represents the essentials of what is
currently in GMIN, merged with what was in upstream at the time this work
commenced a few months back.

This feature adds a new hang recovery path alongside the legacy GPU reset
path; the new path takes care of engine recovery only. Aside from adding
support for per-engine recovery, this feature also introduces rules for when
to promote a potential per-engine reset to a legacy, full GPU reset.

The hang checker now integrates with the error handler slightly differently:
it allows hang recovery on multiple engines at the same time by passing the
error handler an engine flag mask in which a flag is set for every hung
engine. This lets us schedule hang recovery once for all currently hung
engines instead of scheduling one hang recovery per detected engine hang.
Previously, when only full GPU reset was supported, this distinction did not
matter: whether one or four engines were hung, the outcome was the same - the
GPU got reset. Now the behaviour differs depending on which engines are hung,
since each engine is reset separately from all the others, so we have to
think about this in terms of scheduling cost and recovery latency. (see open
question 1 below)
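
In outline, the aggregation amounts to the following (a condensed sketch of
the i915_hangcheck_elapsed() changes in patch 04, not a verbatim excerpt):

	u32 engine_mask = 0;
	struct intel_engine_cs *ring;
	int i;

	for_each_ring(ring, dev_priv, i)
		if (ring->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG)
			engine_mask |= intel_ring_flag(ring);

	if (engine_mask)
		i915_handle_error(dev, engine_mask, true,
				  "Ring hung (0x%02x)", engine_mask);

The error handler then iterates over the mask and schedules one recovery pass
covering every flagged engine.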

OPEN QUESTIONS:

	1. Do we want to investigate the possibility of per-engine hang
	detection? In the current upstream driver there is only one work queue
	that handles the hang checker, and everything from initial hang
	detection to final hang recovery runs in that thread. This makes sense
	if you only support one form of hang recovery - full GPU reset, tied
	to no particular engine. However, this patch series changes that by
	introducing per-engine hang recovery. It could therefore make sense to
	introduce multiple work queues - one per engine - to run multiple hang
	checking threads in parallel.

	This could reduce recovery latency, since neither the hang checker nor
	the error handler would have to scan all engines every time it is
	scheduled. Instead, each engine would have its own work queue invoking
	a hang checker that checks _that_ engine only, followed by an error
	handler for _that_ engine only. If one engine hangs, the latency of
	reaching its hang recovery path would then be (time to hang-check one
	engine) + (time to error-handle one engine), rather than the time it
	takes to hang-check all engines plus the time it takes to error-handle
	every engine detected as hung (in the worst case, all of them). There
	would potentially be as many hang checker and error handler threads
	going on concurrently as there are engines in the hardware, but they
	would all run in parallel without any significant locking. The first
	point at which any thread needs exclusive access to the driver is the
	actual hang recovery, and the time it takes to get there would
	theoretically be lower - per-engine hang recovery itself is far
	quicker than reliably detecting a hang in the first place.

	How much such a change would save still needs to be analysed and
	compared against the current single-thread model, but it makes sense
	from a theoretical design point of view.
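
	A minimal sketch of the idea - purely hypothetical, as neither the
	per-engine work item nor the engine_hung() helper exists in this
	series:

		static void engine_hangcheck_elapsed(struct work_struct *work)
		{
			struct intel_engine_cs *engine =
				container_of(work, struct intel_engine_cs,
					     hangcheck.work.work);

			/* Scan and error-handle only this engine */
			if (engine_hung(engine))
				i915_handle_error(engine->dev,
						  intel_ring_flag(engine), true,
						  "%s hung", engine->name);

			/* Re-arm this engine's checker only */
			queue_delayed_work(engine->hangcheck.wq,
					   &engine->hangcheck.work,
					   DRM_I915_HANGCHECK_JIFFIES);
		}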

	2. How does per-engine reset integrate with the public reset stats
	IOCTL? These stats feed the GL robustness interface, and the
	corresponding tests currently fail when running per-engine hang
	recovery, because we treat per-engine recovery differently from full
	GPU recovery and userland knows nothing about that distinction. When
	userland deliberately hangs the hardware it expects the reset stats
	interface to reflect it, and that behaviour has changed as part of
	this code submission. There is more than one way to solve this. Here
	are two options:

		1. Expose per-engine reset statistics and set contexts as
		guilty the same way for per-engine reset as for full GPU
		resets.

		That would make this change to the hang recovery mechanism
		transparent to userland but it would change the semantics since
		an active context in the reset stats no longer implies that the
		GPU was fully reset.

		2. Add a new set of statistics for per-engine reset (one group
		of statistics for each engine) to reflect the extended
		capabilities that per-engine hang recovery offers.

		Would that be breaking the ABI?

		... Or is there some other way of doing this?
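
	For option 2, one hypothetical shape of such an extension - the field
	names and layout are illustrative only, with the existing fields
	mirroring today's struct drm_i915_reset_stats in i915_drm.h:

		struct drm_i915_reset_stats_v2 {
			__u32 ctx_id;
			__u32 flags;
			__u32 reset_count;	/* full GPU resets, as today */
			__u32 batch_active;
			__u32 batch_pending;
			/* New: one reset count per engine */
			__u32 engine_reset_count[I915_NUM_RINGS];
			__u32 pad;
		};

	Whether a fixed-size per-engine array like this is acceptable in the
	uapi is precisely the ABI question raised above.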

* Feature 2: Watchdog Timeout (a.k.a. "media engine reset") for gen8.

This feature allows userland applications to control whether or not
individual batch buffers should have a first-level, fine-grained,
hardware-based hang detection mechanism on top of the ordinary,
software-based periodic hang checker that is already in the driver. The
advantage over relying solely on the software-based hang checker is that the
watchdog timeout mechanism is roughly 1000x quicker and more precise. Since
it is not a full driver-level hang detection mechanism but only targets one
individual batch buffer at a time, it can afford to be that quick without
risking an increase in false positive hang detection.

This feature includes the following changes:

a) Watchdog timeout interrupt service routine for handling watchdog interrupts
and connecting these to per-engine hang recovery.

b) Injection of watchdog timer enablement/cancellation instructions
before/after the batch buffer start instruction in the ring buffer, so that
the watchdog timeout is tied to the submission of an individual batch buffer.
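
The rough shape of the ring emission around the batch - the register names,
values and helper are illustrative placeholders, not lifted from patch 08:

	/* Before the batch: program the threshold and start the counter */
	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
	intel_ring_emit(ring, RING_WATCHDOG_THRESH(ring->mmio_base));
	intel_ring_emit(ring, watchdog_threshold);
	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
	intel_ring_emit(ring, RING_WATCHDOG_CNTR(ring->mmio_base));
	intel_ring_emit(ring, WATCHDOG_ENABLE);

	/* ... MI_BATCH_BUFFER_START, the batch executes ... */

	/* After the batch: cancel the watchdog again */
	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
	intel_ring_emit(ring, RING_WATCHDOG_CNTR(ring->mmio_base));
	intel_ring_emit(ring, WATCHDOG_DISABLE);

If the counter expires before the cancellation instruction executes, the
hardware raises the watchdog interrupt handled in (a).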

c) Extension of the DRM batch buffer interface, exposing the watchdog timeout
feature to userland. Two open source groups in VPG are currently integrating
support for this feature, which should make it possible in principle to
upstream this extension.
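
From the application's point of view this could look roughly like the
following - the flag name and bit position are hypothetical placeholders for
whatever the final uapi ends up being:

	struct drm_i915_gem_execbuffer2 execbuf;

	memset(&execbuf, 0, sizeof(execbuf));
	/* ... buffers_ptr, buffer_count, batch_len, etc. ... */
	execbuf.flags |= I915_EXEC_ENABLE_WATCHDOG;	/* hypothetical flag */
	drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);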

There is currently full watchdog timeout support for gen7 in GMIN, and it is
quite similar to the gen8 implementation, so there is nothing obvious
preventing us from upstreaming that code along with the gen8 code. However,
watchdog timeout depends entirely on the per-engine hang recovery path, which
this submission does not include for gen7. Therefore watchdog timeout support
for gen7 has been excluded until per-engine hang recovery support for gen7
has landed upstream.

As part of this submission we've had to reinstate the work queue that
previously sat between the error handler and the hang recovery path. The
reason is that, in the watchdog timeout case, the per-engine recovery path is
called directly from the interrupt handler, where there is no way to grab the
struct_mutex that the hang recovery path requires. By reinstating the work
queue we give the hang recovery code a unified execution context in which it
can grab whatever locks it needs, without sacrificing interrupt latency too
much or sleeping indefinitely in hard interrupt context.
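
The division of labour in outline (the work item naming is illustrative):

	/*
	 * Hard IRQ context: the watchdog ISR must not sleep or take
	 * struct_mutex, so it only flags the engine and schedules work.
	 */
	atomic_set_mask(I915_ENGINE_RESET_IN_PROGRESS,
			&engine->hangcheck.flags);
	queue_work(dev_priv->wq, &dev_priv->gpu_error.work);

	/*
	 * Process context: the hang recovery work handler may sleep and
	 * grab whatever locks it needs.
	 */
	mutex_lock(&dev->struct_mutex);
	ret = i915_reset_engine(engine);
	mutex_unlock(&dev->struct_mutex);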

* Feature 3: Context Submission Status Consistency checking

Something that becomes apparent when running long-duration stability tests
with concurrent rendering processes and intermittently injected hangs is that
the GPU sometimes seems to forget to send context completion interrupts to
the driver. When that happens the driver gets stuck on a context that never
seems to finish, while the hardware has in fact completed it and is waiting
for more work.

The problem is that the per-engine hang recovery path relies on context
resubmission to kick the hardware off again following an engine reset. This
can only be done safely if the hardware and the driver share the same opinion
about the current state. Therefore we've extended the periodic hang checker
to check for context submission state inconsistencies in addition to the hang
checking it already does.

If such a state is detected it is assumed (based on experience) that a
context completion interrupt was somehow lost. If the state persists for some
time, an attempt is made to correct it by faking the presumably lost context
completion interrupt: we manually call the execlist interrupt handler, which
is normally called from the main interrupt handler in response to a received
context event interrupt. Just because an interrupt goes missing does not mean
that the hardware failed to update the context status buffer (CSB)
appropriately, so we can expect to find all recent changes to the context
states for each engine captured there. If there are outstanding context
status changes in store, the faked context event interrupt lets the interrupt
handler act on them. In the case of a lost context completion interrupt this
prompts the driver to remove the already completed context from the execlist
queue and move on to the next pending piece of work, thereby eliminating the
inconsistency.
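
A sketch of the correction step - the helper name and tolerance constant are
illustrative, but the forced call into the execlists interrupt handler is the
actual mechanism:

	if (submission_status_inconsistent(engine) &&
	    time_after(jiffies,
		       engine->hangcheck.inconsistent_since +
		       msecs_to_jiffies(CONSISTENCY_TOLERANCE_MS))) {
		/*
		 * Fake the presumably lost context event interrupt by
		 * forcing a CSB check: the hardware has still updated
		 * the context status buffer even though the interrupt
		 * never arrived.
		 */
		intel_lrc_irq_handler(engine);
	}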

* Feature 4: Debugfs extensions for per-engine hang recovery and TDR/watchdog
trace points.


Tomas Elf (11):
  drm/i915: Early exit from semaphore_waits_for for execlist mode.
  drm/i915: Introduce uevent for full GPU reset.
  drm/i915: Add reset stats entry point for per-engine reset.
  drm/i915: Adding TDR / per-engine reset support for gen8.
  drm/i915: Extending i915_gem_check_wedge to check engine reset in
    progress
  drm/i915: Disable warnings for TDR interruptions in the display
    driver.
  drm/i915: Reinstate hang recovery work queue.
  drm/i915: Watchdog timeout support for gen8.
  drm/i915: Fake lost context interrupts through forced CSB check.
  drm/i915: Debugfs interface for per-engine hang recovery.
  drm/i915: TDR/watchdog trace points.

 drivers/gpu/drm/i915/i915_debugfs.c     |  146 +++++-
 drivers/gpu/drm/i915/i915_dma.c         |   79 +++
 drivers/gpu/drm/i915/i915_drv.c         |  201 ++++++++
 drivers/gpu/drm/i915/i915_drv.h         |   91 +++-
 drivers/gpu/drm/i915/i915_gem.c         |   93 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c   |    2 +-
 drivers/gpu/drm/i915/i915_irq.c         |  378 ++++++++++++--
 drivers/gpu/drm/i915/i915_params.c      |   10 +
 drivers/gpu/drm/i915/i915_reg.h         |   13 +
 drivers/gpu/drm/i915/i915_trace.h       |  298 +++++++++++
 drivers/gpu/drm/i915/intel_display.c    |   16 +-
 drivers/gpu/drm/i915/intel_lrc.c        |  858 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h        |   16 +-
 drivers/gpu/drm/i915/intel_lrc_tdr.h    |   40 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   87 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  109 ++++
 drivers/gpu/drm/i915/intel_uncore.c     |  241 ++++++++-
 include/uapi/drm/i915_drm.h             |    5 +-
 18 files changed, 2589 insertions(+), 94 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_lrc_tdr.h

-- 
1.7.9.5


* [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:36   ` Chris Wilson
  2015-06-16 13:44   ` Daniel Vetter
  2015-06-08 17:03 ` [RFC 02/11] drm/i915: Introduce uevent for full GPU reset Tomas Elf
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC
  To: Intel-GFX

When submitting semaphores in execlist mode the hang checker crashes in this
function, because it is only runnable in ring submission mode. This is of
particular interest to the TDR patch series because we use semaphores as a
means to induce hangs during testing (the recommended way to induce hangs on
gen8+). It's not clear how this is supposed to work in execlist mode, since:

1. This function requires a ring buffer.

2. Retrieving a ring buffer in execlist mode requires us to retrieve the
corresponding context, which we get from a request.

3. Retrieving a request from the hang checker is not straightforward, since
that requires grabbing the struct_mutex in order to synchronize against the
request retirement thread.

4. Grabbing the struct_mutex from the hang checker is something we will not
do, since that puts us at risk of deadlock: a hung thread might already be
holding the struct_mutex.

Therefore it's not obvious how we're supposed to deal with this. For now, we
do an early exit from this function, which avoids any kernel panic when
running our own internal TDR ULT.

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 46bcbff..40c44fc 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
 	u64 offset = 0;
 	int i, backwards;
 
+	/*
+	 * This function does not support execlist mode - any attempt to
+	 * proceed further into this function will result in a kernel panic
+	 * when dereferencing ring->buffer, which is not set up in execlist
+	 * mode.
+	 *
+	 * The correct way of doing it would be to derive the currently
+	 * executing ring buffer from the current context, which is derived
+	 * from the currently running request. Unfortunately, to get the
+	 * current request we would have to grab the struct_mutex before doing
+	 * anything else, which would be ill-advised since some other thread
+	 * might have grabbed it already and managed to hang itself, causing
+	 * the hang checker to deadlock.
+	 *
+	 * Therefore, this function does not support execlist mode in its
+	 * current form. Just return NULL and move on.
+	 */
+	if (i915.enable_execlists)
+		return NULL;
+
 	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
 	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
 		return NULL;
-- 
1.7.9.5


* [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
  2015-06-08 17:03 ` [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-16 13:43   ` Daniel Vetter
  2015-06-08 17:03 ` [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset Tomas Elf
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC
  To: Intel-GFX

The TDR ULT used to validate this patch series requires a special uevent for
full GPU resets in order to distinguish between different kinds of resets.

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_uncore.c |   29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index d96d15f..770f526 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1449,18 +1449,33 @@ static int gen6_do_reset(struct drm_device *dev)
 
 int intel_gpu_reset(struct drm_device *dev)
 {
-	if (INTEL_INFO(dev)->gen >= 6)
-		return gen6_do_reset(dev);
+	int ret = -ENODEV;
+	int gen = INTEL_INFO(dev)->gen;
+
+	if (gen >= 6)
+		ret = gen6_do_reset(dev);
 	else if (IS_GEN5(dev))
-		return ironlake_do_reset(dev);
+		ret = ironlake_do_reset(dev);
 	else if (IS_G4X(dev))
-		return g4x_do_reset(dev);
+		ret = g4x_do_reset(dev);
 	else if (IS_G33(dev))
-		return g33_do_reset(dev);
+		ret = g33_do_reset(dev);
 	else if (INTEL_INFO(dev)->gen >= 3)
-		return i915_do_reset(dev);
+		ret = i915_do_reset(dev);
 	else
-		return -ENODEV;
+		WARN(1, "Full GPU reset not supported on gen %d\n", gen);
+
+	if (!ret) {
+		char *reset_event[2];
+
+		reset_event[1] = NULL;
+		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
+		kobject_uevent_env(&dev->primary->kdev->kobj,
+				KOBJ_CHANGE, reset_event);
+		kfree(reset_event[0]);
+	}
+
+	return ret;
 }
 
 void intel_uncore_check_errors(struct drm_device *dev)
-- 
1.7.9.5


* [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
  2015-06-08 17:03 ` [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
  2015-06-08 17:03 ` [RFC 02/11] drm/i915: Introduce uevent for full GPU reset Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:33   ` Chris Wilson
                     ` (2 more replies)
  2015-06-08 17:03 ` [RFC 04/11] drm/i915: Adding TDR / per-engine reset support for gen8 Tomas Elf
                   ` (10 subsequent siblings)
  13 siblings, 3 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC
  To: Intel-GFX

In preparation for per-engine reset, add a way to set context reset stats.

OPEN QUESTIONS:
1. How do we deal with get_reset_stats and the GL robustness interface when
introducing per-engine resets?

	a. Do we mark contexts that cause per-engine resets as guilty? If so,
	how does this affect context banning?

	b. Do we extend the publicly available reset stats to also contain
	per-engine reset statistics? If so, would this break the ABI?

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |    2 ++
 drivers/gpu/drm/i915/i915_gem.c |   14 ++++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 47be4a5..ab5dfdc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2781,6 +2781,8 @@ static inline bool i915_stop_ring_allow_warn(struct drm_i915_private *dev_priv)
 }
 
 void i915_gem_reset(struct drm_device *dev);
+void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
+			   struct intel_engine_cs *engine);
 bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
 int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_init(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8ce363a..4c88e5c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2676,6 +2676,20 @@ void i915_gem_reset(struct drm_device *dev)
 	i915_gem_restore_fences(dev);
 }
 
+void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
+			   struct intel_engine_cs *engine)
+{
+	u32 completed_seqno;
+	struct drm_i915_gem_request *req;
+
+	completed_seqno = engine->get_seqno(engine, false);
+
+	/* Find pending batch buffers and mark them as such  */
+	list_for_each_entry(req, &engine->request_list, list)
+	        if (req && (req->seqno > completed_seqno))
+	                i915_set_reset_status(dev_priv, req->ctx, false);
+}
+
 /**
  * This function clears the request list as sequence numbers are passed.
  */
-- 
1.7.9.5


* [RFC 04/11] drm/i915: Adding TDR / per-engine reset support for gen8.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (2 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:03 ` [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress Tomas Elf
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC
  To: Intel-GFX; +Cc: Ian Lister

This change introduces support for TDR-style per-engine reset as an initial,
less intrusive hang recovery option, to be attempted before falling back to
the legacy full GPU reset recovery mode if necessary. Initially we only
support gen8, but adding gen7 support is straightforward since we've already
established an extensible framework into which gen7 support can be plugged
(by adding corresponding versions of intel_ring_enable, intel_ring_disable,
intel_ring_save, intel_ring_restore, etc.).

1. Per-engine recovery vs. Full GPU recovery

To capture the state of a single engine being detected as hung there is now a
new flag for every engine that can be set once the decision has been made to
schedule hang recovery for that particular engine.

The following algorithm is used to determine when to use which recovery mode:

	a. Once the hang check score reaches level HUNG hang recovery is
	scheduled as usual. The hang checker aggregates all engines currently
	detected as hung into a single engine flag mask and passes that to the
	error handler, which allows us to schedule hang recovery for all
	currently hung engines in a single call.

	b. The error handler checks all engines that have been marked as hung
	by the hang checker and - more specifically - checks how long ago it
	last attempted per-engine hang recovery for each respective, currently
	hung engine. If that was within a certain time window, i.e. the last
	per-engine hang recovery happened too recently, per-engine hang
	recovery is deemed ineffective and the reset is promoted to a full GPU
	reset (see the sketch after this list).

	c. If the error handler determines that no currently hung engine has
	recently been through hang recovery, a per-engine hang recovery is
	scheduled.

	d. Additionally, if the hang checker detects that the hang check score
	has grown too high (currently defined as twice the HUNG level) it
	concludes that previous hang recovery attempts have failed for
	whatever reason and bypasses the error handler's full GPU reset
	promotion logic. One case where this matters is when the hang checker
	and error handler consider per-engine hang recovery a suitable option
	and several such attempts are made - infrequently enough - but no
	effective reset happens, perhaps due to inconsistent context
	submission status, which is described further down below.
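
	The promotion decision in (b) boils down to a simple time-window
	comparison; condensed from the i915_handle_error() hunk in this patch:

		u32 now = get_seconds();

		full_reset = (now - engine->hangcheck.last_engine_reset_time) <
			     i915.gpu_reset_promotion_time;
		engine->hangcheck.last_engine_reset_time = now;

		if (!full_reset)
			atomic_set_mask(I915_ENGINE_RESET_IN_PROGRESS,
					&engine->hangcheck.flags);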

NOTE: Gen 7 and earlier will always promote to full GPU reset since there is
currently no per-engine reset support for these gens.

2. Context Submission Status Consistency.

Per-engine hang recovery on gen8 relies on the basic concept of context
submission status consistency: we make sure that the hardware and the driver
agree on the submission status of the current context on any engine. For
example, when submitting a context to the corresponding ELSP port of an
engine, we expect the request owning that context to be at the head of the
corresponding execution list queue. Likewise, as long as the context is
executing on the GPU we expect the EXECLIST_STATUS register and the context
status buffer to reflect this. Thus, if the context submission status is
consistent, the ID of the currently executing context should be in
EXECLIST_STATUS and should match the context of the head request element in
the execution list queue corresponding to that engine.
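
Conceptually the consistency check compares the context ID that the hardware
reports via EXECLIST_STATUS with the head of the software execlist queue. A
rough sketch - the register macro here is an illustrative placeholder:

	struct drm_i915_gem_request *req;
	u32 hw_ctx_id;

	hw_ctx_id = I915_READ(RING_EXECLIST_STATUS_CTX_ID(engine));
	req = list_first_entry_or_null(&engine->execlist_queue,
				       struct drm_i915_gem_request,
				       execlist_link);

	if (!req && !hw_ctx_id)
		status = CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED;
	else if (req && hw_ctx_id ==
		 intel_execlists_ctx_id(req->ctx->engine[engine->id].state))
		status = CONTEXT_SUBMISSION_STATUS_OK;
	else
		status = CONTEXT_SUBMISSION_STATUS_INCONSISTENT;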

The reason this is important for per-engine hang recovery on gen8 is that
this recovery mode relies on context resubmission to resume execution
following the recovery. If a context has been determined to be hung and the
per-engine hang recovery mode is engaged, leading to the resubmission of that
context, it's important that the hardware is in fact not busy doing something
else or sitting idle, since a resubmission in that state would cause
unforeseen side-effects such as unexpected preemptions.

There are rare, although consistently reproducible, situations that have
shown up in practice where the driver and hardware are no longer consistent
with each other, e.g. due to lost context completion interrupts, after which
the hardware is idle while the driver still thinks a context is active.

3. There is a new reset path for engine reset alongside the legacy full GPU
reset path. This path does the following:

	1) Check for context submission consistency to make sure that the
	context that the hardware is currently stuck on is actually what the
	driver is working on. If not then clearly we're not in a consistently
	hung state and we bail out early.

	2) Disable/idle the engine. This is done through reset handshaking on
	gen8+, unlike earlier gens where it was done by clearing the ring
	valid bits in MI_MODE and the ring control registers, which are no
	longer supported on gen8+. Reset handshaking translates to setting the
	reset request bit in the reset control register (see the sketch after
	this list).

	3) Save the current engine state.

	What this translates to on gen8 is simply reading the current value of
	the head register and nudging it so that it points to the next valid
	instruction in the ring buffer. Since we assume that execution is
	currently stuck in a batch buffer, the effect is that the batch buffer
	start instruction of the hung batch buffer is skipped, so that when
	execution resumes after hang recovery completes, it resumes
	immediately following the batch buffer.

	This effectively means that we forcefully terminate the currently
	active, hung batch buffer. Obviously, the outcome of this intervention
	is potentially undefined, but there are not many good options in this
	scenario, and it beats resetting the entire GPU in the vast majority
	of cases.

	Save the nudged head value to be applied later.

	4) Reset the engine.

	5) Apply the nudged head value to the head register.

	6) Reenable the engine. For gen8 this means resubmitting the fixed-up
	context, allowing execution to resume. In order to resubmit a context
	without relying on the currently hung execution list queues we use a
	privileged API that is dedicated for TDR use only. This submission API
	bypasses any currently queued work and gets exclusive access to the
	ELSP ports.

	7) If the engine hang recovery procedure fails at any point in between
	disablement and reenablement of the engine there is a back-off
	procedure: For gen8 it's possible to back out of the reset handshake by
	clearing the reset request bit in the reset control register.
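
	The gen8 handshake in steps 2) and 7) in outline, using the
	RING_RESET_CTL definitions added by this patch (the wait loop and
	timeout are simplified):

		/* Step 2: request the reset handshake */
		I915_WRITE(RING_RESET_CTL(engine),
			   _MASKED_BIT_ENABLE(REQUEST_RESET));

		if (wait_for((I915_READ(RING_RESET_CTL(engine)) &
			      READY_FOR_RESET) == READY_FOR_RESET, 700)) {
			/* Step 7: back out of the handshake on failure */
			I915_WRITE(RING_RESET_CTL(engine),
				   _MASKED_BIT_DISABLE(REQUEST_RESET));
			return -EIO;
		}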

NOTE:
It's possible that some of Ben Widawsky's original per-engine reset patches
from 3 years ago are in this commit, but since this work has gone through the
hands of at least 3 people any kind of ownership tracking was lost a long
time ago. If you think you should be on the sob list, just let me know.

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@intel.com>
Signed-off-by: Ian Lister <ian.lister@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |    2 +-
 drivers/gpu/drm/i915/i915_dma.c         |   17 +
 drivers/gpu/drm/i915/i915_drv.c         |  198 +++++++++
 drivers/gpu/drm/i915/i915_drv.h         |   63 ++-
 drivers/gpu/drm/i915/i915_irq.c         |  197 ++++++++-
 drivers/gpu/drm/i915/i915_params.c      |   10 +
 drivers/gpu/drm/i915/i915_reg.h         |    6 +
 drivers/gpu/drm/i915/intel_lrc.c        |  661 ++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h        |   14 +
 drivers/gpu/drm/i915/intel_lrc_tdr.h    |   37 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   84 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.h |   64 +++
 drivers/gpu/drm/i915/intel_uncore.c     |  208 ++++++++++
 13 files changed, 1520 insertions(+), 41 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_lrc_tdr.h

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8446ef4..e33e105 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4183,7 +4183,7 @@ i915_wedged_set(void *data, u64 val)
 
 	intel_runtime_pm_get(dev_priv);
 
-	i915_handle_error(dev, val,
+	i915_handle_error(dev, 0x0, val,
 			  "Manually setting wedged to %llu", val);
 
 	intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index e44116f..8f49e7f 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -776,6 +776,21 @@ static void intel_device_info_runtime_init(struct drm_device *dev)
 			 info->has_eu_pg ? "y" : "n");
 }
 
+static void
+i915_hangcheck_init(struct drm_device *dev)
+{
+	int i;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		struct intel_engine_cs *engine = &dev_priv->ring[i];
+
+		i915_hangcheck_reinit(engine);
+		engine->hangcheck.reset_count = 0;
+		engine->hangcheck.tdr_count = 0;
+	}
+}
+
 /**
  * i915_driver_load - setup chip and create an initial config
  * @dev: DRM device
@@ -956,6 +971,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_gem_load(dev);
 
+	i915_hangcheck_init(dev);
+
 	/* On the 945G/GM, the chipset reports the MSI capability on the
 	 * integrated graphics even though the support isn't actually there
 	 * according to the published specs.  It doesn't appear to function
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c3fdbb0..e1629a6 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -34,6 +34,7 @@
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "intel_lrc_tdr.h"
 
 #include <linux/console.h>
 #include <linux/module.h>
@@ -581,6 +582,7 @@ static int i915_drm_suspend(struct drm_device *dev)
 	struct drm_crtc *crtc;
 	pci_power_t opregion_target_state;
 	int error;
+	int i;
 
 	/* ignore lid events during suspend */
 	mutex_lock(&dev_priv->modeset_restore_lock);
@@ -602,6 +604,16 @@ static int i915_drm_suspend(struct drm_device *dev)
 		return error;
 	}
 
+	/*
+	 * Clear any pending reset requests. They should be picked up
+	 * after resume when new work is submitted
+	 */
+	for (i = 0; i < I915_NUM_RINGS; i++)
+		atomic_set(&dev_priv->ring[i].hangcheck.flags, 0);
+
+	atomic_clear_mask(I915_RESET_IN_PROGRESS_FLAG,
+		&dev_priv->gpu_error.reset_counter);
+
 	intel_suspend_gt_powersave(dev);
 
 	/*
@@ -905,6 +917,192 @@ int i915_reset(struct drm_device *dev)
 	return 0;
 }
 
+/**
+ * i915_reset_engine - reset GPU engine after a hang
+ * @engine: engine to reset
+ *
+ * Reset a specific GPU engine. Useful if a hang is detected. Returns zero on successful
+ * reset or otherwise an error code.
+ *
+ * Procedure is fairly simple:
+ *
+ *	- Force engine to idle.
+ *
+ *	- Save current head register value and nudge it past the point of the hang in the
+ *	  ring buffer, which is typically the BB_START instruction of the hung batch buffer,
+ *	  on to the following instruction.
+ *
+ *	- Reset engine.
+ *
+ *	- Restore the previously saved, nudged head register value.
+ *
+ *	- Re-enable engine to resume running. On gen8 this requires the previously hung
+ *	  context to be resubmitted to ELSP via the dedicated TDR-execlists interface.
+ *
+ */
+int i915_reset_engine(struct intel_engine_cs *engine)
+{
+	struct drm_device *dev = engine->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_request *current_request = NULL;
+	uint32_t head;
+	bool force_advance = false;
+	int ret = 0;
+	int err_ret = 0;
+
+	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
+
+        /* Take wake lock to prevent power saving mode */
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
+	i915_gem_reset_engine(dev_priv, engine);
+
+	if (i915.enable_execlists) {
+		enum context_submission_status status =
+			intel_execlists_TDR_get_current_request(engine, NULL);
+
+		/*
+		 * If the hardware and driver states do not coincide
+		 * or if there for some reason is no current context
+		 * in the process of being submitted then bail out and
+		 * try again. Do not proceed unless we have reliable
+		 * current context state information.
+		 */
+		if (status != CONTEXT_SUBMISSION_STATUS_OK) {
+			ret = -EAGAIN;
+			goto reset_engine_error;
+		}
+	}
+
+	ret = intel_ring_disable(engine);
+	if (ret != 0) {
+		DRM_ERROR("Failed to disable %s\n", engine->name);
+		goto reset_engine_error;
+	}
+
+	if (i915.enable_execlists) {
+		enum context_submission_status status;
+		bool inconsistent;
+
+		status = intel_execlists_TDR_get_current_request(engine,
+				&current_request);
+
+		inconsistent = (status != CONTEXT_SUBMISSION_STATUS_OK);
+		if (inconsistent) {
+			/*
+			 * If we somehow have reached this point with
+			 * an inconsistent context submission status then
+			 * back out of the previously requested reset and
+			 * retry later.
+			 */
+			WARN(inconsistent,
+			     "Inconsistent context status on %s: %u\n",
+			     engine->name, status);
+
+			ret = -EAGAIN;
+			goto reenable_reset_engine_error;
+		}
+	}
+
+	/* Sample the current ring head position */
+	head = I915_READ_HEAD(engine) & HEAD_ADDR;
+
+	if (head == engine->hangcheck.last_head) {
+		/*
+		 * The engine has not advanced since the last
+		 * time it hung so force it to advance to the
+		 * next QWORD. In most cases the engine head
+		 * pointer will automatically advance to the
+		 * next instruction as soon as it has read the
+		 * current instruction, without waiting for it
+		 * to complete. This seems to be the default
+		 * behaviour, however an MBOX wait inserted
+		 * directly to the VCS/BCS engines does not behave
+		 * in the same way, instead the head pointer
+		 * will still be pointing at the MBOX instruction
+		 * until it completes.
+		 */
+		force_advance = true;
+	}
+
+	engine->hangcheck.last_head = head;
+
+	ret = intel_ring_save(engine, current_request, force_advance);
+	if (ret) {
+		DRM_ERROR("Failed to save %s engine state\n", engine->name);
+		goto reenable_reset_engine_error;
+	}
+
+	ret = intel_gpu_engine_reset(engine);
+	if (ret) {
+		DRM_ERROR("Failed to reset %s\n", engine->name);
+		goto reenable_reset_engine_error;
+	}
+
+	ret = intel_ring_restore(engine, current_request);
+	if (ret) {
+		DRM_ERROR("Failed to restore %s engine state\n", engine->name);
+		goto reenable_reset_engine_error;
+	}
+
+	/* Correct driver state */
+	intel_gpu_engine_reset_resample(engine, current_request);
+
+	/*
+	 * Reenable engine
+	 *
+	 * In execlist mode on gen8+ this is implicit by simply resubmitting
+	 * the previously hung context. In ring buffer submission mode on gen7
+	 * and earlier we need to actively turn on the engine first.
+	 */
+	if (i915.enable_execlists)
+		ret = intel_execlists_TDR_context_queue(engine, current_request);
+	else
+		ret = intel_ring_enable(engine);
+
+	if (ret) {
+		DRM_ERROR("Failed to enable %s again after reset\n",
+			engine->name);
+
+		goto reset_engine_error;
+	}
+
+	/* Clear reset flags to allow future hangchecks */
+	atomic_set(&engine->hangcheck.flags, 0);
+
+	/* Wake up anything waiting on this engine's queue */
+	wake_up_all(&engine->irq_queue);
+
+	if (i915.enable_execlists && current_request)
+		i915_gem_request_unreference(current_request);
+
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+	return ret;
+
+reenable_reset_engine_error:
+
+	err_ret = intel_ring_enable(engine);
+	if (err_ret)
+		DRM_ERROR("Failed to reenable %s following error during reset (%d)\n",
+			engine->name, err_ret);
+
+reset_engine_error:
+
+	/* Clear reset flags to allow future hangchecks */
+	atomic_set(&engine->hangcheck.flags, 0);
+
+	/* Wake up anything waiting on this engine's queue */
+	wake_up_all(&engine->irq_queue);
+
+	if (i915.enable_execlists && current_request)
+		i915_gem_request_unreference(current_request);
+
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+	return ret;
+}
+
 static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct intel_device_info *intel_info =
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ab5dfdc..9cc5e8d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2280,6 +2280,48 @@ struct drm_i915_cmd_table {
 	int count;
 };
 
+/*
+ * Context submission status
+ *
+ * CONTEXT_SUBMISSION_STATUS_OK:
+ *	Context submitted to ELSP and state of execlist queue is the same as
+ *	the state of EXECLIST_STATUS register. Software and hardware states
+ *	are consistent and can be trusted.
+ *
+ * CONTEXT_SUBMISSION_STATUS_INCONSISTENT:
+ *	Context has been submitted to the execlist queue but the state of the
+ *	EXECLIST_STATUS register is different from the execlist queue state.
+ *	This could mean any of the following:
+ *
+ *		1. The context is in the head position of the execlist queue
+ *		   but has not yet been submitted to ELSP.
+ *
+ *		2. The hardware just recently completed the context but the
+ *		   context is pending removal from the execlist queue.
+ *
+ *		3. The driver has lost a context state transition interrupt.
+ *		   Typically what this means is that hardware has completed and
+ *		   is now idle but the driver thinks the hardware is still
+ *		   busy.
+ *
+ *	Overall what this means is that the context submission status is
+ *	currently in transition and cannot be trusted until it settles down.
+ *
+ * CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED:
+ *	No context submitted to the execlist queue and the EXECLIST_STATUS
+ *	register shows no context being processed.
+ *
+ * CONTEXT_SUBMISSION_STATUS_NONE_UNDEFINED:
+ *	Initial state before submission status has been determined.
+ *
+ */
+enum context_submission_status {
+	CONTEXT_SUBMISSION_STATUS_OK = 0,
+	CONTEXT_SUBMISSION_STATUS_INCONSISTENT,
+	CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED,
+	CONTEXT_SUBMISSION_STATUS_UNDEFINED
+};
+
 /* Note that the (struct drm_i915_private *) cast is just to shut up gcc. */
 #define __I915__(p) ({ \
 	struct drm_i915_private *__p; \
@@ -2478,6 +2520,7 @@ struct i915_params {
 	int enable_ips;
 	int invert_brightness;
 	int enable_cmd_parser;
+	unsigned int gpu_reset_promotion_time;
 	/* leave bools at the end to not create holes */
 	bool enable_hangcheck;
 	bool fastboot;
@@ -2508,18 +2551,34 @@ extern long i915_compat_ioctl(struct file *filp, unsigned int cmd,
 			      unsigned long arg);
 #endif
 extern int intel_gpu_reset(struct drm_device *dev);
+extern int intel_gpu_engine_reset(struct intel_engine_cs *engine);
+extern int intel_request_gpu_engine_reset(struct intel_engine_cs *engine);
+extern int intel_unrequest_gpu_engine_reset(struct intel_engine_cs *engine);
 extern int i915_reset(struct drm_device *dev);
+extern int i915_reset_engine(struct intel_engine_cs *engine);
 extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv);
 extern unsigned long i915_mch_val(struct drm_i915_private *dev_priv);
 extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv);
 extern void i915_update_gfx_val(struct drm_i915_private *dev_priv);
 int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
 void intel_hpd_cancel_work(struct drm_i915_private *dev_priv);
+static inline void i915_hangcheck_reinit(struct intel_engine_cs *engine)
+{
+	struct intel_ring_hangcheck *hc = &engine->hangcheck;
+
+	hc->acthd = 0;
+	hc->max_acthd = 0;
+	hc->seqno = 0;
+	hc->score = 0;
+	hc->action = HANGCHECK_IDLE;
+	hc->deadlock = 0;
+}
+
 
 /* i915_irq.c */
 void i915_queue_hangcheck(struct drm_device *dev);
-__printf(3, 4)
-void i915_handle_error(struct drm_device *dev, bool wedged,
+__printf(4, 5)
+void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 		       const char *fmt, ...);
 
 extern void intel_irq_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 40c44fc..6a40e25 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2312,10 +2312,65 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
+	bool reset_complete = false;
+	struct intel_engine_cs *ring;
 	int ret;
+	int i;
+
+	mutex_lock(&dev->struct_mutex);
 
 	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, error_event);
 
+	for_each_ring(ring, dev_priv, i) {
+
+		/*
+		 * Skip further individual engine reset requests if full GPU
+		 * reset requested.
+		 */
+		if (i915_reset_in_progress(error))
+			break;
+
+		if (atomic_read(&ring->hangcheck.flags) &
+			I915_ENGINE_RESET_IN_PROGRESS) {
+
+			ret = i915_reset_engine(ring);
+
+			reset_complete = true;
+
+			/*
+			 * Execlist mode only:
+			 *
+			 * -EAGAIN means that between detecting a hang (and
+			 * also determining that the currently submitted
+			 * context is stable and valid) and trying to recover
+			 * from the hang the current context changed state.
+			 * This means that we are probably not completely hung
+			 * after all. Just fail and retry by exiting all the
+			 * way back and wait for the next hang detection. If we
+			 * have a true hang on our hands then we will detect it
+			 * again, otherwise we will continue like nothing
+			 * happened.
+			 */
+			if (ret == -EAGAIN) {
+				DRM_ERROR("Reset of %s aborted due to " \
+					  "change in context submission " \
+					  "state - retrying!", ring->name);
+				ret = 0;
+			}
+
+			if (ret) {
+				DRM_ERROR("Reset of %s failed! (%d)", ring->name, ret);
+
+				atomic_set_mask(I915_RESET_IN_PROGRESS_FLAG,
+						&dev_priv->gpu_error.reset_counter);
+				break;
+			}
+		}
+	}
+
+	/* The full GPU reset will grab the struct_mutex when it needs it */
+	mutex_unlock(&dev->struct_mutex);
+
 	/*
 	 * Note that there's only one work item which does gpu resets, so we
 	 * need not worry about concurrent gpu resets potentially incrementing
@@ -2362,24 +2417,37 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 			 *
 			 * Since unlock operations are a one-sided barrier only,
 			 * we need to insert a barrier here to order any seqno
-			 * updates before
-			 * the counter increment.
+			 * updates before the counter increment.
+			 *
+			 * The increment clears I915_RESET_IN_PROGRESS_FLAG.
 			 */
 			smp_mb__before_atomic();
 			atomic_inc(&dev_priv->gpu_error.reset_counter);
 
-			kobject_uevent_env(&dev->primary->kdev->kobj,
-					   KOBJ_CHANGE, reset_done_event);
+			/*
+			 * If any per-engine resets were promoted to full GPU
+			 * reset don't forget to clear those reset flags.
+			 */
+			for_each_ring(ring, dev_priv, i)
+				atomic_set(&ring->hangcheck.flags, 0);
 		} else {
+			/* Terminal wedge condition */
+			WARN(1, "i915_reset failed, declaring GPU as wedged!\n");
 			atomic_set_mask(I915_WEDGED, &error->reset_counter);
 		}
 
-		/*
-		 * Note: The wake_up also serves as a memory barrier so that
-		 * waiters see the update value of the reset counter atomic_t.
-		 */
-		i915_error_wake_up(dev_priv, true);
+		reset_complete = true;
 	}
+
+	/*
+	 * Note: The wake_up also serves as a memory barrier so that
+	 * waiters see the update value of the reset counter atomic_t.
+	 */
+	if (reset_complete)
+		i915_error_wake_up(dev_priv, true);
+
+	kobject_uevent_env(&dev->primary->kdev->kobj,
+			   KOBJ_CHANGE, reset_done_event);
 }
 
 static void i915_report_and_clear_eir(struct drm_device *dev)
@@ -2476,21 +2544,42 @@ static void i915_report_and_clear_eir(struct drm_device *dev)
 
 /**
  * i915_handle_error - handle a gpu error
- * @dev: drm device
  *
- * Do some basic checking of regsiter state at error time and
+ * @dev: 		drm device
+ *
+ * @engine_mask: 	Bit mask containing the engine flags of all engines
+ *			associated with one or more detected errors.
+ *			May be 0x0.
+ *
+ *			If wedged is set to true this implies that one or more
+ *			engine hangs were detected. In this case we will
+ *			attempt to reset all engines that have been detected
+ *			as hung.
+ *
+ *			If a previous engine reset was attempted too recently
+ *			or if one of the current engine resets fails we fall
+ *			back to legacy full GPU reset.
+ *
+ * @wedged: 		true = Hang detected, invoke hang recovery.
+ * @fmt, ...: 		Error message describing reason for error.
+ *
+ * Do some basic checking of register state at error time and
  * dump it to the syslog.  Also call i915_capture_error_state() to make
  * sure we get a record and make it available in debugfs.  Fire a uevent
  * so userspace knows something bad happened (should trigger collection
- * of a ring dump etc.).
+ * of a ring dump etc.). If a hang was detected (wedged = true) try to
+ * reset the associated engine. Failing that, try to fall back to legacy
+ * full GPU reset recovery mode.
  */
-void i915_handle_error(struct drm_device *dev, bool wedged,
+void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 		       const char *fmt, ...)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	va_list args;
 	char error_msg[80];
 
+	struct intel_engine_cs *engine;
+
 	va_start(args, fmt);
 	vscnprintf(error_msg, sizeof(error_msg), fmt, args);
 	va_end(args);
@@ -2499,8 +2588,59 @@ void i915_handle_error(struct drm_device *dev, bool wedged,
 	i915_report_and_clear_eir(dev);
 
 	if (wedged) {
-		atomic_set_mask(I915_RESET_IN_PROGRESS_FLAG,
-				&dev_priv->gpu_error.reset_counter);
+		/*
+		 * Defer to full GPU reset if any of the following is true:
+		 * 	1. The caller did not ask for per-engine reset.
+		 *	2. The hardware does not support it (pre-gen7).
+		 *	3. We already tried per-engine reset recently.
+		 */
+		bool full_reset = true;
+
+		/*
+		 * TBD: We currently only support per-engine reset for gen8+.
+		 * Implement support for gen7.
+		 */
+		if (engine_mask && (INTEL_INFO(dev)->gen >= 8)) {
+			u32 i;
+
+			for_each_ring(engine, dev_priv, i) {
+				u32 now, last_engine_reset_timediff;
+
+				if (!(intel_ring_flag(engine) & engine_mask))
+					continue;
+
+				/* Measure the time since this engine was last reset */
+				now = get_seconds();
+				last_engine_reset_timediff =
+					now - engine->hangcheck.last_engine_reset_time;
+
+				full_reset = last_engine_reset_timediff <
+					i915.gpu_reset_promotion_time;
+
+				engine->hangcheck.last_engine_reset_time = now;
+
+				/*
+				 * This engine was not reset too recently - go ahead
+				 * with engine reset instead of falling back to full
+				 * GPU reset.
+				 *
+				 * Flag that we want to try and reset this engine.
+				 * This can still be overridden by a global
+				 * reset e.g. if per-engine reset fails.
+				 */
+				if (!full_reset)
+					atomic_set_mask(I915_ENGINE_RESET_IN_PROGRESS,
+						&engine->hangcheck.flags);
+				else
+					break;
+
+			} /* for_each_ring */
+		}
+
+		if (full_reset) {
+			atomic_set_mask(I915_RESET_IN_PROGRESS_FLAG,
+					&dev_priv->gpu_error.reset_counter);
+		}
 
 		/*
 		 * Wakeup waiting processes so that the reset function
@@ -2823,7 +2963,7 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 	 */
 	tmp = I915_READ_CTL(ring);
 	if (tmp & RING_WAIT) {
-		i915_handle_error(dev, false,
+		i915_handle_error(dev, intel_ring_flag(ring), false,
 				  "Kicking stuck wait on %s",
 				  ring->name);
 		I915_WRITE_CTL(ring, tmp);
@@ -2835,7 +2975,7 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 		default:
 			return HANGCHECK_HUNG;
 		case 1:
-			i915_handle_error(dev, false,
+			i915_handle_error(dev, intel_ring_flag(ring), false,
 					  "Kicking stuck semaphore on %s",
 					  ring->name);
 			I915_WRITE_CTL(ring, tmp);
@@ -2864,8 +3004,10 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	struct drm_device *dev = dev_priv->dev;
 	struct intel_engine_cs *ring;
 	int i;
-	int busy_count = 0, rings_hung = 0;
+	u32 engine_mask = 0;
+	int busy_count = 0;
 	bool stuck[I915_NUM_RINGS] = { 0 };
+	bool force_full_gpu_reset = false;
 #define BUSY 1
 #define KICK 5
 #define HUNG 20
@@ -2960,12 +3102,25 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 			DRM_INFO("%s on %s\n",
 				 stuck[i] ? "stuck" : "no progress",
 				 ring->name);
-			rings_hung++;
+
+			engine_mask |= intel_ring_flag(ring);
+			ring->hangcheck.tdr_count++;
+		} else if (ring->hangcheck.score >= (HANGCHECK_SCORE_RING_HUNG * 2)) {
+			DRM_INFO("%s on %s, hang recovery ineffective! "
+				 "Falling back to full gpu reset.\n",
+				 stuck[i] ? "stuck" : "no progress",
+				 ring->name);
+
+			force_full_gpu_reset = true;
+			break;
 		}
 	}
 
-	if (rings_hung)
-		return i915_handle_error(dev, true, "Ring hung");
+	if (engine_mask)
+		i915_handle_error(dev, engine_mask, true, "Ring hung (0x%02x)", engine_mask);
+	else if (force_full_gpu_reset)
+		i915_handle_error(dev, 0x0, true,
+			"Hang recovery ineffective, falling back to full GPU reset");
 
 	if (busy_count)
 		/* Reset timer case chip hangs without another request
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index bb64415..9cea004 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -50,6 +50,7 @@ struct i915_params i915 __read_mostly = {
 	.enable_cmd_parser = 1,
 	.disable_vtd_wa = 0,
 	.use_mmio_flip = 0,
+	.gpu_reset_promotion_time = 0,
 	.mmio_debug = 0,
 	.verbose_state_checks = 1,
 	.nuclear_pageflip = 0,
@@ -172,6 +173,15 @@ module_param_named(use_mmio_flip, i915.use_mmio_flip, int, 0600);
 MODULE_PARM_DESC(use_mmio_flip,
 		 "use MMIO flips (-1=never, 0=driver discretion [default], 1=always)");
 
+module_param_named(gpu_reset_promotion_time,
+               i915.gpu_reset_promotion_time, int, 0644);
+MODULE_PARM_DESC(gpu_reset_promotion_time,
+               "Catch excessive engine resets. Each engine maintains a "
+	       "timestamp of the last time it was reset. If it hangs again "
+	       "within this period then fall back to full GPU reset to try and"
+	       " recover from the hang. "
+               "default=0 seconds (disabled)");
+
 module_param_named(mmio_debug, i915.mmio_debug, int, 0600);
 MODULE_PARM_DESC(mmio_debug,
 	"Enable the MMIO debug code for the first N failures (default: off). "
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 9c97842..af9f0ad 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -100,6 +100,10 @@
 #define  GRDOM_RESET_STATUS (1<<1)
 #define  GRDOM_RESET_ENABLE (1<<0)
 
+#define RING_RESET_CTL(ring)	((ring)->mmio_base+0xd0)
+#define  READY_FOR_RESET	0x2
+#define  REQUEST_RESET		0x1
+
 #define ILK_GDSR 0x2ca4 /* MCHBAR offset */
 #define  ILK_GRDOM_FULL		(0<<1)
 #define  ILK_GRDOM_RENDER	(1<<1)
@@ -130,6 +134,8 @@
 #define  GEN6_GRDOM_RENDER		(1 << 1)
 #define  GEN6_GRDOM_MEDIA		(1 << 2)
 #define  GEN6_GRDOM_BLT			(1 << 3)
+#define  GEN6_GRDOM_VECS		(1 << 4)
+#define  GEN8_GRDOM_MEDIA2		(1 << 7)
 
 #define RING_PP_DIR_BASE(ring)		((ring)->mmio_base+0x228)
 #define RING_PP_DIR_BASE_READ(ring)	((ring)->mmio_base+0x518)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0fc35dd..a4273ac 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -135,6 +135,7 @@
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
+#include "intel_lrc_tdr.h"
 
 #define GEN9_LR_CONTEXT_RENDER_SIZE (22 * PAGE_SIZE)
 #define GEN8_LR_CONTEXT_RENDER_SIZE (20 * PAGE_SIZE)
@@ -330,6 +331,164 @@ static void execlists_elsp_write(struct intel_engine_cs *ring,
 	spin_unlock(&dev_priv->uncore.lock);
 }
 
+/**
+ * execlist_get_context_reg_page() - Get memory page for context object
+ * @engine: engine
+ * @ctx: context running on engine
+ * @page: returned page
+ *
+ * Return: 0 if successful, otherwise propagates error codes.
+ */
+static inline int execlist_get_context_reg_page(struct intel_engine_cs *engine,
+		struct intel_context *ctx,
+		struct page **page)
+{
+	struct drm_i915_gem_object *ctx_obj;
+
+	if (!page)
+		return -EINVAL;
+
+	if (!ctx)
+		ctx = engine->default_context;
+
+	ctx_obj = ctx->engine[engine->id].state;
+
+	if (WARN(!ctx_obj, "Context object not set up!\n"))
+		return -EINVAL;
+
+	WARN(!i915_gem_obj_is_pinned(ctx_obj),
+	     "Context object is not pinned!\n");
+
+	*page = i915_gem_object_get_page(ctx_obj, 1);
+
+	if (WARN(!*page, "Context object page could not be resolved!\n"))
+		return -EINVAL;
+
+	return 0;
+}
+
+/**
+ * execlist_write_context_reg() - Write value to context register
+ * @engine: engine
+ * @ctx: context running on engine
+ * @ctx_reg: Index into context image pointing to register location
+ * @mmio_reg_addr: MMIO register address
+ * @val: Value to be written
+ *
+ * Return: 0 if successful, otherwise propagates error codes.
+ */
+static inline int execlists_write_context_reg(struct intel_engine_cs *engine,
+		struct intel_context *ctx, u32 ctx_reg, u32 mmio_reg_addr,
+		u32 val)
+{
+	struct page *page = NULL;
+	uint32_t *reg_state;
+
+	int ret = execlist_get_context_reg_page(engine, ctx, &page);
+	if (WARN(ret, "Failed to write %u to register %u for %s!\n",
+		(unsigned int) val, (unsigned int) ctx_reg, engine->name))
+			return ret;
+
+	reg_state = kmap_atomic(page);
+
+	WARN(reg_state[ctx_reg] != mmio_reg_addr,
+	     "Context register address (%x) != MMIO register address (%x)!\n",
+	     (unsigned int) reg_state[ctx_reg], (unsigned int) mmio_reg_addr);
+
+	reg_state[ctx_reg+1] = val;
+	kunmap_atomic(reg_state);
+
+	return ret;
+}
+
+/**
+ * execlist_read_context_reg() - Read value from context register
+ * @engine: engine
+ * @ctx: context running on engine
+ * @ctx_reg: Index into context image pointing to register location
+ * @mmio_reg_addr: MMIO register address
+ * @val: Output parameter returning register value
+ *
+ * Return: 0 if successful, otherwise propagates error codes.
+ */
+static inline int execlists_read_context_reg(struct intel_engine_cs *engine,
+		struct intel_context *ctx, u32 ctx_reg, u32 mmio_reg_addr,
+		u32 *val)
+{
+	struct page *page = NULL;
+	uint32_t *reg_state;
+	int ret = 0;
+
+	if (!val)
+		return -EINVAL;
+
+	ret = execlist_get_context_reg_page(engine, ctx, &page);
+	if (WARN(ret, "Failed to read from register %u for %s!\n",
+		(unsigned int) ctx_reg, engine->name))
+			return ret;
+
+	reg_state = kmap_atomic(page);
+
+	WARN(reg_state[ctx_reg] != mmio_reg_addr,
+	     "Context register address (%x) != MMIO register address (%x)!\n",
+	     (unsigned int) reg_state[ctx_reg], (unsigned int) mmio_reg_addr);
+
+	*val = reg_state[ctx_reg+1];
+	kunmap_atomic(reg_state);
+
+	return ret;
+ }
+
+/*
+ * Generic macros for generating function implementation for context register
+ * read/write functions.
+ *
+ * Macro parameters
+ * ----------------
+ * reg_name: Designated name of context register (e.g. tail, head, buffer_ctl)
+ *
+ * reg_def: Context register macro definition (e.g. CTX_RING_TAIL)
+ *
+ * mmio_reg_def: Name of macro function used to determine the address
+ *		 of the corresponding MMIO register (e.g. RING_TAIL, RING_HEAD).
+ *		 This macro function is assumed to be defined on the form of:
+ *
+ *			#define mmio_reg_def(base) (base+register_offset)
+ *
+ *		 Where "base" is the MMIO base address of the respective ring
+ *		 and "register_offset" is the offset relative to "base".
+ *
+ * Function parameters
+ * -------------------
+ * engine: The engine that the context is running on
+ * ctx: The context of the register that is to be accessed
+ * reg_name: Value to be written to or read from the register.
+ */
+#define INTEL_EXECLISTS_WRITE_REG(reg_name, reg_def, mmio_reg_def) \
+	int intel_execlists_write_##reg_name(struct intel_engine_cs *engine, \
+					     struct intel_context *ctx, \
+					     u32 reg_name) \
+{ \
+	return execlists_write_context_reg(engine, ctx, (reg_def), \
+			mmio_reg_def(engine->mmio_base), (reg_name)); \
+}
+
+#define INTEL_EXECLISTS_READ_REG(reg_name, reg_def, mmio_reg_def) \
+	int intel_execlists_read_##reg_name(struct intel_engine_cs *engine, \
+					    struct intel_context *ctx, \
+					    u32 *reg_name) \
+{ \
+	return execlists_read_context_reg(engine, ctx, (reg_def), \
+			mmio_reg_def(engine->mmio_base), (reg_name)); \
+}
+
+INTEL_EXECLISTS_READ_REG(tail, CTX_RING_TAIL, RING_TAIL)
+INTEL_EXECLISTS_WRITE_REG(head, CTX_RING_HEAD, RING_HEAD)
+INTEL_EXECLISTS_READ_REG(head, CTX_RING_HEAD, RING_HEAD)
+
+#undef INTEL_EXECLISTS_READ_REG
+#undef INTEL_EXECLISTS_WRITE_REG
+
 static int execlists_update_context(struct drm_i915_gem_object *ctx_obj,
 				    struct drm_i915_gem_object *ring_obj,
 				    struct i915_hw_ppgtt *ppgtt,
@@ -387,34 +546,52 @@ static void execlists_submit_contexts(struct intel_engine_cs *ring,
 	execlists_elsp_write(ring, ctx_obj0, ctx_obj1);
 }
 
-static void execlists_context_unqueue(struct intel_engine_cs *ring)
+static void execlists_fetch_requests(struct intel_engine_cs *ring,
+			struct drm_i915_gem_request **req0,
+			struct drm_i915_gem_request **req1)
 {
-	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
 	struct drm_i915_gem_request *cursor = NULL, *tmp = NULL;
 
-	assert_spin_locked(&ring->execlist_lock);
-
-	if (list_empty(&ring->execlist_queue))
+	if (!req0)
 		return;
 
+	*req0 = NULL;
+
+	if (req1)
+		*req1 = NULL;
+
 	/* Try to read in pairs */
 	list_for_each_entry_safe(cursor, tmp, &ring->execlist_queue,
 				 execlist_link) {
-		if (!req0) {
-			req0 = cursor;
-		} else if (req0->ctx == cursor->ctx) {
-			/* Same ctx: ignore first request, as second request
-			 * will update tail past first request's workload */
-			cursor->elsp_submitted = req0->elsp_submitted;
-			list_del(&req0->execlist_link);
-			list_add_tail(&req0->execlist_link,
+		if (!(*req0)) {
+			*req0 = cursor;
+		} else if ((*req0)->ctx == cursor->ctx) {
+			/*
+			 * Same ctx: ignore first request, as second request
+			 * will update tail past first request's workload
+			 */
+			cursor->elsp_submitted = (*req0)->elsp_submitted;
+			list_del(&(*req0)->execlist_link);
+			list_add_tail(&(*req0)->execlist_link,
 				&ring->execlist_retired_req_list);
-			req0 = cursor;
+			*req0 = cursor;
 		} else {
-			req1 = cursor;
+			if (req1)
+				*req1 = cursor;
 			break;
 		}
 	}
+}
+
+static void execlists_context_unqueue(struct intel_engine_cs *ring)
+{
+	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
+
+	assert_spin_locked(&ring->execlist_lock);
+	if (list_empty(&ring->execlist_queue))
+		return;
+
+	execlists_fetch_requests(ring, &req0, &req1);
 
 	WARN_ON(req1 && req1->elsp_submitted);
 
@@ -577,6 +754,154 @@ static int execlists_context_queue(struct intel_engine_cs *ring,
 	return 0;
 }
 
+/*
+ * execlists_TDR_context_unqueue is a TDR-specific variant of the
+ * ordinary unqueue function used exclusively by the TDR.
+ *
+ * When doing TDR context resubmission we only want to resubmit the hung
+ * context and nothing else, thus only fetch one request from the queue.
+ * The exception being if the second element in the queue already has been
+ * submitted, in which case we need to submit that one too. Also, don't
+ * increment the elsp_submitted counter following submission: the lite restore
+ * context event interrupt that would normally follow a context resubmission
+ * does not happen when the engine is hung. If we incremented the
+ * elsp_submitted counter in this special case the execlist state machine
+ * would expect a corresponding lite restore interrupt, which is never produced.
+ */
+static void execlists_TDR_context_unqueue(struct intel_engine_cs *ring)
+{
+	struct drm_i915_gem_request *req0 = NULL, *req1 = NULL;
+
+	assert_spin_locked(&ring->execlist_lock);
+	if (list_empty(&ring->execlist_queue))
+		return;
+
+	execlists_fetch_requests(ring, &req0, &req1);
+
+	/*
+	 * If the second head element was not already submitted we do not have
+	 * to resubmit it. Let the interrupt handler unqueue it at an
+	 * appropriate time. If it was already submitted it needs to go in
+	 * again to allow the hardware to switch over to it as expected.
+	 * Otherwise the interrupt handler will do another unqueue of the same
+	 * context and we will end up with a desync between number of
+	 * submissions and interrupts and thus wait endlessly for an interrupt
+	 * that will never come.
+	 */
+	if (req1 && !req1->elsp_submitted)
+		req1 = NULL;
+
+	execlists_submit_contexts(ring, req0->ctx, req0->tail,
+					req1 ? req1->ctx : NULL,
+					req1 ? req1->tail : 0);
+}
+
+/**
+ * intel_execlists_TDR_context_queue() - ELSP context submission bypassing
+ * queue
+ *
+ * Context submission mechanism exclusively used by TDR that bypasses the
+ * execlist queue. This is necessary since at the point of TDR hang recovery
+ * the hardware will be hung and resubmitting a fixed context (the context that
+ * the TDR has identified as hung and fixed up in order to move past the
+ * blocking batch buffer) to a hung execlist queue will lock up the TDR.
+ * Instead, opt for direct ELSP submission without depending on the rest of the
+ * driver.
+ *
+ * @ring: engine to submit context to
+ * @req: request containing context to be resubmitted
+ *
+ * Return:
+ *	0 if successful, otherwise propagate error code.
+ */
+int intel_execlists_TDR_context_queue(struct intel_engine_cs *ring,
+			struct drm_i915_gem_request *req)
+{
+	unsigned long flags;
+	int ret = 0;
+
+	if (WARN_ON(!req))
+		return -EINVAL;
+
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+
+	if (list_empty(&ring->execlist_queue)) {
+
+		WARN(1, "Execlist queue for the %s is empty! " \
+			"This should not be possible!\n", ring->name);
+
+		ret = -EINVAL;
+	} else {
+		/*
+		 * Submission path used for hang recovery bypassing execlist
+		 * queue. When a context needs to be resubmitted for lite
+		 * restore during hang recovery we cannot use the execlist
+		 * queue since it will be hung just like its corresponding ring
+		 * engine. Instead go for direct submission to ELSP.
+		 */
+		struct intel_context *tmpctx = NULL;
+		struct drm_i915_gem_request *tmpreq = NULL;
+		struct intel_context *reqctx = req->ctx;
+
+		u32 tmp_ctxid = 0;
+		u32 req_ctxid = 0;
+
+		tmpreq = list_first_entry(&ring->execlist_queue,
+			typeof(*tmpreq), execlist_link);
+
+		if (!tmpreq) {
+			WARN(1, "Request is null, " \
+				"context resubmission to %s failed!\n",
+					ring->name);
+
+			ret = -EINVAL;
+			goto unlock;
+		}
+
+		i915_gem_request_reference(tmpreq);
+		tmpctx = tmpreq->ctx;
+
+		if (!tmpctx) {
+			WARN(1, "Context null for request %p, " \
+				"context resubmission to %s failed\n",
+				tmpreq, ring->name);
+
+			i915_gem_request_unreference(tmpreq);
+			ret = -EINVAL;
+			goto unlock;
+		}
+
+		WARN(tmpreq->elsp_submitted == 0,
+			"Allegedly hung request has never been submitted " \
+			"to ELSP\n");
+
+		tmp_ctxid = intel_execlists_ctx_id(tmpctx->engine[ring->id].state);
+		req_ctxid = intel_execlists_ctx_id(reqctx->engine[ring->id].state);
+
+		/*
+		 * At the beginning of hang recovery the TDR asks for the
+		 * currently submitted context (which has been determined to be
+		 * hung at that point). This should be the context at the head
+		 * of the execlist queue. If we reach a point during the
+		 * recovery where we need to do a lite restore of the hung
+		 * context only to discover that the head context of the
+		 * execlist queue has changed, what do we do? The least we can
+		 * do is produce a warning.
+		 */
+		WARN(tmp_ctxid != req_ctxid,
+		    "Context (%x) at head of execlist queue for %s " \
+		    "is not the suspected hung context (%x)! Was execlist " \
+		    "queue reordered during hang recovery?\n",
+		    (unsigned int) tmp_ctxid, ring->name,
+		    (unsigned int) req_ctxid);
+
+		execlists_TDR_context_unqueue(ring);
+	}
+
+unlock:
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+
+	return ret;
+}
+
 static int logical_ring_invalidate_all_caches(struct intel_ringbuffer *ringbuf,
 					      struct intel_context *ctx)
 {
@@ -1066,7 +1391,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
 	ring->next_context_status_buffer = 0;
 	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
 
-	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
+	i915_hangcheck_reinit(ring);
 
 	return 0;
 }
@@ -1314,6 +1639,173 @@ out:
 	return ret;
 }
 
+static int
+gen8_ring_disable(struct intel_engine_cs *ring)
+{
+	intel_request_gpu_engine_reset(ring);
+	return 0;
+}
+
+static int
+gen8_ring_enable(struct intel_engine_cs *ring)
+{
+	intel_unrequest_gpu_engine_reset(ring);
+	return 0;
+}
+
+/*
+ * gen8_ring_save()
+ *
+ * Saves the head MMIO register value so that it can be restored after the
+ * engine has been reset and reinitialized. Before saving the head register
+ * we nudge the head position to be correctly aligned with a QWORD boundary,
+ * which brings it up to the next presumably valid instruction. Typically, at
+ * the point of hang recovery the head register will be pointing to the last
+ * DWORD of the BB_START instruction, which is followed by a padding MI_NOOP
+ * inserted by the driver.
+ *
+ * ring: engine to be reset
+ * req: request containing the context currently running on engine
+ * force_advance: indicates whether the head should be unconditionally
+ *		  advanced to the next QWORD boundary
+ */
+static int
+gen8_ring_save(struct intel_engine_cs *ring, struct drm_i915_gem_request *req,
+		bool force_advance)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = NULL;
+	struct intel_context *ctx;
+	int ret = 0;
+	int clamp_to_tail = 0;
+	uint32_t head;
+	uint32_t tail;
+	uint32_t head_addr;
+	uint32_t tail_addr;
+
+	if (WARN_ON(!req))
+		return -EINVAL;
+
+	ctx = req->ctx;
+	ringbuf = ctx->engine[ring->id].ringbuf;
+
+	/*
+	 * Read head from MMIO register since it contains the
+	 * most up to date value of head at this point.
+	 */
+	head = I915_READ_HEAD(ring);
+
+	/*
+	 * Read tail from the context because the execlist queue
+	 * updates the tail value there first during submission.
+	 * The MMIO tail register is not updated until the actual
+	 * ring submission completes.
+	 */
+	ret = I915_READ_TAIL_CTX(ring, ctx, tail);
+	if (ret)
+		return ret;
+
+	/*
+	 * head_addr and tail_addr are the head and tail values
+	 * excluding ring wrapping information and aligned to DWORD
+	 * boundary
+	 */
+	head_addr = head & HEAD_ADDR;
+	tail_addr = tail & TAIL_ADDR;
+
+	/*
+	 * The head must always chase the tail.
+	 * If the tail is beyond the head then do not allow
+	 * the head to overtake it. If the tail is less than
+	 * the head then the tail has already wrapped and
+	 * there is no problem in advancing the head or even
+	 * wrapping the head back to 0, as in the worst case it
+	 * will become equal to the tail
+	 */
+	if (head_addr <= tail_addr)
+		clamp_to_tail = 1;
+
+	if (force_advance) {
+
+		/* Force head pointer to next QWORD boundary */
+		head_addr &= ~0x7;
+		head_addr += 8;
+
+	} else if (head & 0x7) {
+
+		/* Ensure head pointer is pointing to a QWORD boundary */
+		head += 0x7;
+		head &= ~0x7;
+		head_addr = head;
+	}
+
+	if (clamp_to_tail && (head_addr > tail_addr)) {
+		head_addr = tail_addr;
+	} else if (head_addr >= ringbuf->size) {
+		/* Wrap head back to start if it exceeds ring size */
+		head_addr = 0;
+	}
+
+	head &= ~HEAD_ADDR;
+	head |= (head_addr & HEAD_ADDR);
+	ring->saved_head = head;
+
+	return 0;
+}
+
+static int
+gen8_ring_restore(struct intel_engine_cs *ring, struct drm_i915_gem_request *req)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_context *ctx;
+
+	if (WARN_ON(!req))
+		return -EINVAL;
+
+	ctx = req->ctx;
+
+	/* Re-initialize ring */
+	if (ring->init_hw) {
+		int ret = ring->init_hw(ring);
+		if (ret != 0) {
+			DRM_ERROR("Failed to re-initialize %s\n",
+					ring->name);
+			return ret;
+		}
+	} else {
+		DRM_ERROR("ring init function pointer not set up\n");
+		return -EINVAL;
+	}
+
+	if (ring->id == RCS) {
+		/*
+		 * These register reinitializations are only located here
+		 * temporarily until they are moved out of the
+		 * init_clock_gating function to some function we can
+		 * call from here.
+		 */
+
+		/* WaVSRefCountFullforceMissDisable:chv */
+		/* WaDSRefCountFullforceMissDisable:chv */
+		I915_WRITE(GEN7_FF_THREAD_MODE,
+			   I915_READ(GEN7_FF_THREAD_MODE) &
+			   ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
+
+		I915_WRITE(_3D_CHICKEN3,
+			   _3D_CHICKEN_SDE_LIMIT_FIFO_POLY_DEPTH(2));
+
+		/* WaSwitchSolVfFArbitrationPriority:bdw */
+		I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
+	}
+
+	/* Restore head */
+	I915_WRITE_HEAD(ring, ring->saved_head);
+	I915_WRITE_HEAD_CTX(ring, ctx, ring->saved_head);
+
+	return 0;
+}
+
 static int gen8_init_rcs_context(struct intel_engine_cs *ring,
 		       struct intel_context *ctx)
 {
@@ -1412,6 +1904,10 @@ static int logical_render_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	ring->dev = dev;
 	ret = logical_ring_init(dev, ring);
@@ -1442,6 +1938,10 @@ static int logical_bsd_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1467,6 +1967,10 @@ static int logical_bsd2_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1492,6 +1996,10 @@ static int logical_blt_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1517,6 +2025,10 @@ static int logical_vebox_ring_init(struct drm_device *dev)
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->enable = gen8_ring_enable;
+	ring->disable = gen8_ring_disable;
+	ring->save = gen8_ring_save;
+	ring->restore = gen8_ring_restore;
 
 	return logical_ring_init(dev, ring);
 }
@@ -1974,3 +2486,120 @@ void intel_lr_context_reset(struct drm_device *dev,
 		ringbuf->tail = 0;
 	}
 }
+
+/**
+ * intel_execlists_TDR_get_current_request() - return request currently
+ * processed by engine
+ *
+ * @ring: Engine currently running context to be returned.
+ *
+ * @req:  Output parameter containing the current request (the request at the
+ *	  head of execlist queue corresponding to the given ring). May be NULL
+ *	  if no request has been submitted to the execlist queue of this
+ *	  engine. If the req parameter passed in to the function is not NULL
+ *	  and a request is found and returned the request is referenced before
+ *	  it is returned. It is the responsibility of the caller to dereference
+ *	  it at the end of its life cycle.
+ *
+ * Return:
+ *	CONTEXT_SUBMISSION_STATUS_OK if request is found to be submitted and its
+ *	context is currently running on engine.
+ *
+ *	CONTEXT_SUBMISSION_STATUS_INCONSISTENT if request is found to be submitted
+ *	but its context is not in a state that is consistent with current
+ *	hardware state for the given engine. This has been observed in three cases:
+ *
+ *		1. Before the engine has switched to this context after it has
+ *		been submitted to the execlist queue.
+ *
+ *		2. After the engine has switched away from this context but
+ *		before the context has been removed from the execlist queue.
+ *
+ *		3. The driver has lost an interrupt. Typically the hardware has
+ *		gone idle but the driver still thinks that the context belonging
+ *		to the request at the head of the queue is executing.
+ *
+ *	CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED if no context has been found
+ *	to be submitted to the execlist queue and if the hardware is idle.
+ */
+enum context_submission_status
+intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request **req)
+{
+	struct drm_i915_private *dev_priv;
+	unsigned long flags;
+	struct drm_i915_gem_request *tmpreq = NULL;
+	struct intel_context *tmpctx = NULL;
+	unsigned hw_context = 0;
+	bool hw_active = false;
+	enum context_submission_status status =
+			CONTEXT_SUBMISSION_STATUS_UNDEFINED;
+
+	if (WARN_ON(!ring))
+		return status;
+
+	dev_priv = ring->dev->dev_private;
+
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+	spin_lock_irqsave(&ring->execlist_lock, flags);
+	hw_context = I915_READ(RING_EXECLIST_STATUS_CTX_ID(ring));
+
+	hw_active = (I915_READ(RING_EXECLIST_STATUS(ring)) &
+		EXECLIST_STATUS_CURRENT_ACTIVE_ELEMENT_STATUS) ? true : false;
+
+	tmpreq = list_first_entry_or_null(&ring->execlist_queue,
+		struct drm_i915_gem_request, execlist_link);
+
+	if (tmpreq) {
+		/*
+		 * If the caller has not passed a non-NULL req parameter then
+		 * it is not interested in getting a request reference back.
+		 * Don't temporarily grab a reference since holding the execlist
+		 * lock is enough to ensure that the execlist code will hold its
+		 * reference all throughout this function. As long as that reference
+		 * is kept there is no need for us to take yet another reference.
+		 * This matters because certain callers, such as the TDR hang
+		 * checker, cannot grab struct_mutex before calling and because
+		 * of that we cannot unreference any requests (DRM might
+		 * assert if we do). Just rely on the execlist code to provide
+		 * indirect protection.
+		 */
+		if (req)
+			i915_gem_request_reference(tmpreq);
+
+		if (tmpreq->ctx)
+			tmpctx = tmpreq->ctx;
+
+		WARN(!tmpctx, "No context in request %p\n", tmpreq);
+	}
+
+	if (tmpctx) {
+		unsigned sw_context =
+			intel_execlists_ctx_id((tmpctx)->engine[ring->id].state);
+
+		status = ((hw_context == sw_context) && hw_active) ?
+				CONTEXT_SUBMISSION_STATUS_OK :
+				CONTEXT_SUBMISSION_STATUS_INCONSISTENT;
+	} else {
+		/*
+		 * If we don't have any queue entries and the
+		 * EXECLIST_STATUS register points to zero we are
+		 * clearly not processing any context right now
+		 */
+		WARN((hw_context || hw_active), "hw_context=%x, hardware %s!\n",
+			hw_context, hw_active ? "not idle":"idle");
+
+		status = (hw_context || hw_active) ?
+			CONTEXT_SUBMISSION_STATUS_INCONSISTENT :
+			CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED;
+	}
+
+	if (req)
+		*req = tmpreq;
+
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
+
+	return status;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 04d3a6d..d2f497c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -29,6 +29,8 @@
 /* Execlists regs */
 #define RING_ELSP(ring)			((ring)->mmio_base+0x230)
 #define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
+#define	  EXECLIST_STATUS_CURRENT_ACTIVE_ELEMENT_STATUS	(0x3 << 14)
+#define RING_EXECLIST_STATUS_CTX_ID(ring)	(RING_EXECLIST_STATUS(ring)+4)
 #define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
 #define	  CTX_CTRL_INHIBIT_SYN_CTX_SWITCH	(1 << 3)
 #define	  CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT	(1 << 0)
@@ -89,4 +91,16 @@ u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
 void intel_lrc_irq_handler(struct intel_engine_cs *ring);
 void intel_execlists_retire_requests(struct intel_engine_cs *ring);
 
+int intel_execlists_read_tail(struct intel_engine_cs *ring,
+			 struct intel_context *ctx,
+			 u32 *tail);
+
+int intel_execlists_write_head(struct intel_engine_cs *ring,
+			  struct intel_context *ctx,
+			  u32 head);
+
+int intel_execlists_read_head(struct intel_engine_cs *ring,
+			 struct intel_context *ctx,
+			 u32 *head);
+
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/intel_lrc_tdr.h b/drivers/gpu/drm/i915/intel_lrc_tdr.h
new file mode 100644
index 0000000..684b009
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_lrc_tdr.h
@@ -0,0 +1,37 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef _INTEL_LRC_TDR_H_
+#define _INTEL_LRC_TDR_H_
+
+/* Privileged execlist API used exclusively by TDR */
+
+int intel_execlists_TDR_context_queue(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req);
+
+enum context_submission_status
+intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request **req);
+
+#endif /* _INTEL_LRC_TDR_H_ */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f949583..0fdf983 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -442,6 +442,88 @@ static void ring_write_tail(struct intel_engine_cs *ring,
 	I915_WRITE_TAIL(ring, value);
 }
 
+int intel_ring_disable(struct intel_engine_cs *ring)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->disable)
+		return ring->disable(ring);
+
+	DRM_ERROR("Ring disable not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+int intel_ring_enable(struct intel_engine_cs *ring)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->enable)
+		return ring->enable(ring);
+
+	DRM_ERROR("Ring enable not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+int intel_ring_save(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req,
+		bool force_advance)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->save)
+		return ring->save(ring, req, force_advance);
+
+	DRM_ERROR("Ring save not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+int intel_ring_restore(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req)
+{
+	if (WARN_ON(!ring))
+		return -EINVAL;
+
+	if (ring->restore)
+		return ring->restore(ring, req);
+
+	DRM_ERROR("Ring restore not supported on %s\n", ring->name);
+	return -EINVAL;
+}
+
+void intel_gpu_engine_reset_resample(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req)
+{
+	struct intel_ringbuffer *ringbuf;
+	struct drm_i915_private *dev_priv;
+
+	if (WARN_ON(!ring))
+		return;
+
+	dev_priv = ring->dev->dev_private;
+
+	if (i915.enable_execlists) {
+		struct intel_context *ctx;
+
+		if (WARN_ON(!req))
+			return;
+
+		ctx = req->ctx;
+		ringbuf = ctx->engine[ring->id].ringbuf;
+
+		/*
+		 * In gen8+ context head is restored during reset and
+		 * we can use it as a reference to set up the new
+		 * driver state.
+		 */
+		I915_READ_HEAD_CTX(ring, ctx, ringbuf->head);
+		ringbuf->last_retired_head = -1;
+		intel_ring_update_space(ringbuf);
+	}
+}
+
 u64 intel_ring_get_active_head(struct intel_engine_cs *ring)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
@@ -637,7 +719,7 @@ static int init_ring_common(struct intel_engine_cs *ring)
 	ringbuf->tail = I915_READ_TAIL(ring) & TAIL_ADDR;
 	intel_ring_update_space(ringbuf);
 
-	memset(&ring->hangcheck, 0, sizeof(ring->hangcheck));
+	i915_hangcheck_reinit(ring);
 
 out:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39f6dfc..35360a4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -48,6 +48,22 @@ struct  intel_hw_status_page {
 #define I915_READ_MODE(ring) I915_READ(RING_MI_MODE((ring)->mmio_base))
 #define I915_WRITE_MODE(ring, val) I915_WRITE(RING_MI_MODE((ring)->mmio_base), val)
 
+
+#define I915_READ_TAIL_CTX(engine, ctx, outval) \
+	intel_execlists_read_tail((engine), \
+				(ctx), \
+				&(outval))
+
+#define I915_READ_HEAD_CTX(engine, ctx, outval) \
+	intel_execlists_read_head((engine), \
+				(ctx), \
+				&(outval))
+
+#define I915_WRITE_HEAD_CTX(engine, ctx, val) \
+	intel_execlists_write_head((engine), \
+				(ctx), \
+				(val))
+
 /* seqno size is actually only a uint32, but since we plan to use MI_FLUSH_DW to
  * do the writes, and that must have qw aligned offsets, simply pretend it's 8b.
  */
@@ -92,6 +108,34 @@ struct intel_ring_hangcheck {
 	int score;
 	enum intel_ring_hangcheck_action action;
 	int deadlock;
+
+	/*
+	 * Last recorded ring head index.
+	 * This is only ever a ring index, whereas the active
+	 * head may be a graphics address in a ring buffer.
+	 */
+	u32 last_head;
+
+	/* Flag to indicate if engine reset required */
+	atomic_t flags;
+
+	/* Indicates request to reset this engine */
+#define I915_ENGINE_RESET_IN_PROGRESS (1<<0)
+
+	/*
+	 * Timestamp (seconds) from when the last time
+	 * this engine was reset.
+	 */
+	u32 last_engine_reset_time;
+
+	/*
+	 * Number of times this engine has been
+	 * reset since boot
+	 */
+	u32 reset_count;
+
+	/* Number of TDR hang detections */
+	u32 tdr_count;
 };
 
 struct intel_ringbuffer {
@@ -177,6 +221,14 @@ struct  intel_engine_cs {
 #define I915_DISPATCH_PINNED 0x2
 	void		(*cleanup)(struct intel_engine_cs *ring);
 
+	int (*enable)(struct intel_engine_cs *ring);
+	int (*disable)(struct intel_engine_cs *ring);
+	int (*save)(struct intel_engine_cs *ring,
+		    struct drm_i915_gem_request *req,
+		    bool force_advance);
+	int (*restore)(struct intel_engine_cs *ring,
+		       struct drm_i915_gem_request *req);
+
 	/* GEN8 signal/wait table - never trust comments!
 	 *	  signal to	signal to    signal to   signal to      signal to
 	 *	    RCS		   VCS          BCS        VECS		 VCS2
@@ -283,6 +335,9 @@ struct  intel_engine_cs {
 
 	struct intel_ring_hangcheck hangcheck;
 
+	/* Saved head value to be restored after reset */
+	u32 saved_head;
+
 	struct {
 		struct drm_i915_gem_object *obj;
 		u32 gtt_offset;
@@ -420,6 +475,15 @@ int intel_ring_space(struct intel_ringbuffer *ringbuf);
 bool intel_ring_stopped(struct intel_engine_cs *ring);
 void __intel_ring_advance(struct intel_engine_cs *ring);
 
+void intel_gpu_engine_reset_resample(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req);
+int intel_ring_disable(struct intel_engine_cs *ring);
+int intel_ring_enable(struct intel_engine_cs *ring);
+int intel_ring_save(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req, bool force_advance);
+int intel_ring_restore(struct intel_engine_cs *ring,
+		struct drm_i915_gem_request *req);
+
 int __must_check intel_ring_idle(struct intel_engine_cs *ring);
 void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
 int intel_ring_flush_all_caches(struct intel_engine_cs *ring);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 770f526..84774d2 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1478,6 +1478,214 @@ int intel_gpu_reset(struct drm_device *dev)
 	return ret;
 }
 
+static inline int wait_for_engine_reset(struct drm_i915_private *dev_priv,
+		unsigned int grdom)
+{
+#define _CND ((__raw_i915_read32(dev_priv, GEN6_GDRST) & grdom) == 0)
+
+	/*
+	 * Spin waiting for the device to ack the reset request.
+	 * Times out after 500 us.
+	 */
+	return wait_for_atomic_us(_CND, 500);
+
+#undef _CND
+}
+
+static int do_engine_reset_nolock(struct intel_engine_cs *engine)
+{
+	int ret = -ENODEV;
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+
+	assert_spin_locked(&dev_priv->uncore.lock);
+
+	switch (engine->id) {
+	case RCS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_RENDER);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_RENDER);
+		break;
+
+	case BCS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_BLT);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_BLT);
+		break;
+
+	case VCS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_MEDIA);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_MEDIA);
+		break;
+
+	case VECS:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_VECS);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN6_GRDOM_VECS);
+		break;
+
+	case VCS2:
+		__raw_i915_write32(dev_priv, GEN6_GDRST, GEN8_GRDOM_MEDIA2);
+		engine->hangcheck.reset_count++;
+		ret = wait_for_engine_reset(dev_priv, GEN8_GRDOM_MEDIA2);
+		break;
+
+	default:
+		DRM_ERROR("Unexpected engine: %d\n", engine->id);
+		break;
+	}
+
+	return ret;
+}
+
+static int gen8_do_engine_reset(struct intel_engine_cs *engine)
+{
+	struct drm_device *dev = engine->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int ret = -ENODEV;
+	unsigned long irqflags;
+	u32 reset_ctl = 0;
+
+	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	ret = do_engine_reset_nolock(engine);
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+
+	if (!ret) {
+		char *reset_event[2];
+
+		/* Do uevent outside of spinlock as uevent can sleep */
+		reset_event[1] = NULL;
+		reset_event[0] = kasprintf(GFP_KERNEL, "RESET RING=%d", engine->id);
+		kobject_uevent_env(&dev->primary->kdev->kobj,
+			KOBJ_CHANGE, reset_event);
+		kfree(reset_event[0]);
+
+		/*
+		 * Confirm that the reset control register is back to its
+		 * normal state following the reset.
+		 */
+		reset_ctl = I915_READ(RING_RESET_CTL(engine));
+		WARN(reset_ctl & 0x3, "Reset control still active after reset! (0x%08x)\n",
+			reset_ctl);
+
+	} else {
+		DRM_ERROR("Engine reset failed! (%d)\n", ret);
+	}
+
+	return ret;
+}
+
+int intel_gpu_engine_reset(struct intel_engine_cs *engine)
+{
+	/* Reset an individual engine */
+	int ret = -ENODEV;
+	struct drm_device *dev = engine->dev;
+
+	switch (INTEL_INFO(dev)->gen) {
+	case 8:
+		ret = gen8_do_engine_reset(engine);
+		break;
+	default:
+		DRM_ERROR("Per Engine Reset not supported on Gen%d\n",
+			  INTEL_INFO(dev)->gen);
+		ret = -ENODEV;
+		break;
+	}
+
+	return ret;
+}
+
+static int gen8_request_engine_reset(struct intel_engine_cs *engine)
+{
+	int ret = 0;
+	unsigned long irqflags;
+	u32 reset_ctl = 0;
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+
+	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+
+	/*
+	 * Initiate reset handshake by requesting reset from the
+	 * reset control register.
+	 */
+	__raw_i915_write32(dev_priv, RING_RESET_CTL(engine),
+		_MASKED_BIT_ENABLE(REQUEST_RESET));
+
+	/*
+	 * Wait for ready to reset ack.
+	 */
+	ret = wait_for_atomic_us((__raw_i915_read32(dev_priv,
+		RING_RESET_CTL(engine)) & READY_FOR_RESET) ==
+			READY_FOR_RESET, 500);
+
+	reset_ctl = __raw_i915_read32(dev_priv, RING_RESET_CTL(engine));
+
+	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+
+	WARN(ret, "Reset request failed! (err=%d, reset control=0x%08x)\n",
+		ret, reset_ctl);
+
+	return ret;
+}
+
+static int gen8_unrequest_engine_reset(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = engine->dev->dev_private;
+
+	I915_WRITE(RING_RESET_CTL(engine), _MASKED_BIT_DISABLE(REQUEST_RESET));
+	return 0;
+}
+
+/*
+ * On gen8+ a reset request has to be issued via the reset control register
+ * before a GPU engine can be reset in order to stop the command streamer
+ * and idle the engine. This replaces the legacy way of stopping an engine
+ * by writing to the stop ring bit in the MI_MODE register.
+ */
+int intel_request_gpu_engine_reset(struct intel_engine_cs *engine)
+{
+	/* Request reset for an individual engine */
+	int ret = -ENODEV;
+	struct drm_device *dev;
+
+	if (WARN_ON(!engine))
+		return -EINVAL;
+
+	dev = engine->dev;
+
+	if (INTEL_INFO(dev)->gen >= 8)
+		ret = gen8_request_engine_reset(engine);
+	else
+		DRM_ERROR("Reset request not supported on Gen%d\n",
+			  INTEL_INFO(dev)->gen);
+
+	return ret;
+}
+
+/*
+ * It is possible to back off from a previously issued reset request by simply
+ * clearing the reset request bit in the reset control register.
+ */
+int intel_unrequest_gpu_engine_reset(struct intel_engine_cs *engine)
+{
+	/* Back off from a reset request for an individual engine */
+	int ret = -ENODEV;
+	struct drm_device *dev;
+
+	if (WARN_ON(!engine))
+		return -EINVAL;
+
+	dev = engine->dev;
+
+	if (INTEL_INFO(dev)->gen >= 8)
+		ret = gen8_unrequest_engine_reset(engine);
+	else
+		DRM_ERROR("Reset unrequest not supported on Gen%d\n",
+			  INTEL_INFO(dev)->gen);
+
+	return ret;
+}
+
 void intel_uncore_check_errors(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (3 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 04/11] drm/i915: Adding TDR / per-engine reset support for gen8 Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:24   ` Chris Wilson
  2015-06-09 11:11   ` Chris Wilson
  2015-06-08 17:03 ` [RFC 06/11] drm/i915: Disable warnings for TDR interruptions in the display driver Tomas Elf
                   ` (8 subsequent siblings)
  13 siblings, 2 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Ian Lister

i915_gem_check_wedge now returns a non-zero result in three different cases:

1. Legacy: A hang has been detected and a full GPU reset has been promoted.

2. Per-engine recovery:

	a. A single engine reference can be passed to the function, in which
	case only that engine will be checked. If that particular engine is
	detected to be hung and is to be reset this will yield a positive
	result but not if reset is in progress for any other engine.

	b. No engine reference is passed to the function, in which case all
	engines are checked for ongoing per-engine hang recovery.

Also, i915_wait_request was updated to take advantage of this new
functionality. This is important since the TDR hang recovery mechanism needs a
way to force waiting threads that hold the struct_mutex to give up the
struct_mutex and try again after the hang recovery has completed. If
i915_wait_request does not take per-engine hang recovery into account there is
no way for a waiting thread to know that a per-engine recovery is about to
happen and that it needs to back off.
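
As an illustration of the new calling convention (a sketch only, not part of
the diff below; example_wait_begin is a made-up name), engine-specific
callers pass their engine while engine-agnostic callers pass NULL to have
all engines checked:

	static int example_wait_begin(struct drm_i915_private *dev_priv,
				      struct intel_engine_cs *ring)
	{
		int ret;

		/*
		 * Non-zero (-EAGAIN or -EIO) if a full GPU reset or a
		 * per-engine reset of "ring" is pending; passing NULL
		 * instead would check every engine.
		 */
		ret = i915_gem_check_wedge(dev_priv, ring, true);
		if (ret)
			return ret; /* back off, release struct_mutex, retry */

		return 0;
	}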

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@intel.com>
Signed-off-by: Ian Lister <ian.lister@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |    3 +-
 drivers/gpu/drm/i915/i915_gem.c         |   79 ++++++++++++++++++++++++-------
 drivers/gpu/drm/i915/intel_lrc.c        |    3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +-
 4 files changed, 68 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9cc5e8d..d092cb8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2807,7 +2807,8 @@ i915_gem_find_active_request(struct intel_engine_cs *ring);
 
 bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
-int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
+int __must_check i915_gem_check_wedge(struct drm_i915_private *dev_priv,
+				      struct intel_engine_cs *engine,
 				      bool interruptible);
 int __must_check i915_gem_check_olr(struct drm_i915_gem_request *req);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4c88e5c..2208b0f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -97,12 +97,38 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
 	spin_unlock(&dev_priv->mm.object_stat_lock);
 }
 
+static inline int
+i915_engine_reset_in_progress(struct drm_i915_private *dev_priv,
+	struct intel_engine_cs *engine)
+{
+	int ret = 0;
+
+	if (engine) {
+		ret = !!(atomic_read(&engine->hangcheck.flags)
+			& I915_ENGINE_RESET_IN_PROGRESS);
+	} else {
+		int i;
+
+		for (i = 0; i < I915_NUM_RINGS; i++) {
+			if (atomic_read(&dev_priv->ring[i].hangcheck.flags)
+				& I915_ENGINE_RESET_IN_PROGRESS) {
+				ret = 1;
+				break;
+			}
+		}
+	}
+
+	return ret;
+}
+
 static int
-i915_gem_wait_for_error(struct i915_gpu_error *error)
+i915_gem_wait_for_error(struct drm_i915_private *dev_priv)
 {
 	int ret;
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
 
-#define EXIT_COND (!i915_reset_in_progress(error) || \
+#define EXIT_COND ((!i915_reset_in_progress(error) && \
+		   !i915_engine_reset_in_progress(dev_priv, NULL)) || \
 		   i915_terminally_wedged(error))
 	if (EXIT_COND)
 		return 0;
@@ -131,7 +157,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
-	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
+	ret = i915_gem_wait_for_error(dev_priv);
 	if (ret)
 		return ret;
 
@@ -1128,10 +1154,15 @@ put_rpm:
 }
 
 int
-i915_gem_check_wedge(struct i915_gpu_error *error,
+i915_gem_check_wedge(struct drm_i915_private *dev_priv,
+		     struct intel_engine_cs *engine,
 		     bool interruptible)
 {
-	if (i915_reset_in_progress(error)) {
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
+
+	if (i915_reset_in_progress(error) ||
+	    i915_engine_reset_in_progress(dev_priv, engine)) {
+
 		/* Non-interruptible callers can't handle -EAGAIN, hence return
 		 * -EIO unconditionally for these. */
 		if (!interruptible)
@@ -1213,6 +1244,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	unsigned long timeout_expire;
 	s64 before, now;
 	int ret;
+	int reset_in_progress = 0;
 
 	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
 
@@ -1239,11 +1271,17 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
-		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
+		reset_in_progress =
+			i915_gem_check_wedge(ring->dev->dev_private, NULL, interruptible);
+
+		if ((reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) ||
+		     reset_in_progress) {
+
 			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
 			 * is truely gone. */
-			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-			if (ret == 0)
+			if (reset_in_progress)
+				ret = reset_in_progress;
+			else
 				ret = -EAGAIN;
 			break;
 		}
@@ -1327,7 +1365,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
+	ret = i915_gem_check_wedge(dev_priv, NULL, interruptible);
 	if (ret)
 		return ret;
 
@@ -1396,6 +1434,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned reset_counter;
 	int ret;
+	struct intel_engine_cs *ring = NULL;
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 	BUG_ON(!dev_priv->mm.interruptible);
@@ -1404,7 +1443,9 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	if (!req)
 		return 0;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
+	ring = i915_gem_request_get_ring(req);
+
+	ret = i915_gem_check_wedge(dev_priv, ring, true);
 	if (ret)
 		return ret;
 
@@ -4089,11 +4130,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	unsigned reset_counter;
 	int ret;
 
-	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
-	if (ret)
-		return ret;
-
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
+	ret = i915_gem_wait_for_error(dev_priv);
 	if (ret)
 		return ret;
 
@@ -4112,9 +4149,17 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (target == NULL)
 		return 0;
 
-	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
-	if (ret == 0)
-		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
+	if (target->ring) {
+		if (i915_gem_check_wedge(dev_priv, NULL, false))
+			return -EIO;
+
+		ret = __i915_wait_request(target, reset_counter, true, NULL,
+			NULL);
+
+		if (ret == 0)
+			queue_delayed_work(dev_priv->wq,
+				&dev_priv->mm.retire_work, 0);
+	}
 
 	i915_gem_request_unreference__unlocked(target);
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index a4273ac..e9940cc 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1103,7 +1103,8 @@ static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+	ret = i915_gem_check_wedge(dev_priv,
+				   ring,
 				   dev_priv->mm.interruptible);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0fdf983..fc82942 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2259,7 +2259,8 @@ int intel_ring_begin(struct intel_engine_cs *ring,
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	int ret;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+	ret = i915_gem_check_wedge(dev_priv,
+				   ring,
 				   dev_priv->mm.interruptible);
 	if (ret)
 		return ret;
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC 06/11] drm/i915: Disable warnings for TDR interruptions in the display driver.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (4 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:53   ` Chris Wilson
  2015-06-08 17:03 ` [RFC 07/11] drm/i915: Reinstate hang recovery work queue Tomas Elf
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC (permalink / raw)
  To: Intel-GFX

Now that i915_wait_request takes per-engine hang recovery into account it is
more likely to fail and return -EAGAIN or -EIO due to hung engines (unlike
before when it would only fail if a full GPU reset was imminent). What this
means is that the display driver might see more frequent failures that are only
a consequence of ongoing hang recoveries. Therefore, let's not spew a lot of
warnings in the kernel log every time a flip fails due to an ongoing hang
recovery, since a) This is to be expected during hang recovery and b) It
severely degrades performance and makes the hang recovery take even longer to
complete, which ultimately might cause the userland window compositor to fail
because the flip is taking too long to complete and it simply gives up, leaving
the screen in a frozen state.

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 97922fb..128c58c 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -10356,9 +10356,21 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 
 	mmio_flip = &crtc->mmio_flip;
 	if (mmio_flip->req)
-		WARN_ON(__i915_wait_request(mmio_flip->req,
+	{
+		int ret = __i915_wait_request(mmio_flip->req,
 					    crtc->reset_counter,
-					    false, NULL, NULL) != 0);
+					    false, NULL, NULL);
+
+		/*
+		 * If a hang has been detected then we expect
+		 * __i915_wait_request to fail since it's probably going to be
+		 * forced to give up the struct_mutex and try to grab it again
+		 * once the TDR is done. Don't produce a warning in that case!
+		 */
+		if (ret)
+			WARN_ON(!i915_gem_check_wedge(crtc->base.dev->dev_private,
+					NULL, true));
+	}
 
 	intel_do_mmio_flip(crtc);
 	if (mmio_flip->req) {
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC 07/11] drm/i915: Reinstate hang recovery work queue.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (5 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 06/11] drm/i915: Disable warnings for TDR interruptions in the display driver Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:03 ` [RFC 08/11] drm/i915: Watchdog timeout support for gen8 Tomas Elf
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Mika Kuoppala

There used to be a work queue separating the error handler from the hang
recovery path, which was removed a while back in this commit:

	commit b8d24a06568368076ebd5a858a011699a97bfa42
	Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
	Date:   Wed Jan 28 17:03:14 2015 +0200

	    drm/i915: Remove nested work in gpu error handling

Now we need to revert most of that commit since the work queue separating hang
detection from hang recovery is needed in preparation for the upcoming watchdog
timeout feature. The watchdog interrupt service routine will be a second
callsite of the error handler alongside the periodic hang checker (which runs
in a work queue context). Seeing as the error handler will then be serving a
caller in a hard interrupt execution context, the error handler must never
end up in a situation where it needs to grab the struct_mutex. Unfortunately,
grabbing the struct_mutex is exactly the first thing the hang recovery path
does, and doing so might sleep if the struct_mutex is already held by another
thread. Not good when you're in a hard interrupt context.
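
For reference, a minimal sketch of the execution-context split that this
reintroduces (illustrative only, not part of the diff below; the example_*
function names are made up):

	/* Hard interrupt context: must never sleep, only schedules work */
	static void example_watchdog_isr(struct drm_i915_private *dev_priv)
	{
		schedule_work(&dev_priv->gpu_error.work);
	}

	/* Process context: here it is safe to sleep on struct_mutex */
	static void example_error_work(struct work_struct *work)
	{
		struct i915_gpu_error *error = container_of(work,
			struct i915_gpu_error, work);
		struct drm_i915_private *dev_priv = container_of(error,
			struct drm_i915_private, gpu_error);

		mutex_lock(&dev_priv->dev->struct_mutex);
		/* ... per-engine or full GPU hang recovery ... */
		mutex_unlock(&dev_priv->dev->struct_mutex);
	}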

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c |    1 +
 drivers/gpu/drm/i915/i915_drv.h |    1 +
 drivers/gpu/drm/i915/i915_irq.c |   28 +++++++++++++++++++++++-----
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 8f49e7f..b98abf8 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1115,6 +1115,7 @@ int i915_driver_unload(struct drm_device *dev)
 	/* Free error state after interrupts are fully disabled. */
 	cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
 	i915_destroy_error_state(dev);
+	cancel_work_sync(&dev_priv->gpu_error.work);
 
 	if (dev->pdev->msi_enabled)
 		pci_disable_msi(dev->pdev);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d092cb8..efa43c3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1245,6 +1245,7 @@ struct i915_gpu_error {
 	spinlock_t lock;
 	/* Protected by the above dev->gpu_error.lock. */
 	struct drm_i915_error_state *first_error;
+	struct work_struct work;
 
 	unsigned long missed_irq_rings;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 6a40e25..9913c8f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2300,15 +2300,18 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 }
 
 /**
- * i915_reset_and_wakeup - do process context error handling work
+ * i915_error_work_func - do process context error handling work
  *
  * Fire an error uevent so userspace can see that a hang or error
  * was detected.
  */
-static void i915_reset_and_wakeup(struct drm_device *dev)
+static void i915_error_work_func(struct work_struct *work)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct i915_gpu_error *error = &dev_priv->gpu_error;
+	struct i915_gpu_error *error = container_of(work, struct i915_gpu_error,
+	                                            work);
+	struct drm_i915_private *dev_priv =
+	        container_of(error, struct drm_i915_private, gpu_error);
+	struct drm_device *dev = dev_priv->dev;
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
@@ -2658,7 +2661,21 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 		i915_error_wake_up(dev_priv, false);
 	}
 
-	i915_reset_and_wakeup(dev);
+	/*
+	 * Gen 7:
+	 *
+	 * Our reset work can grab modeset locks (since it needs to reset the
+	 * state of outstanding pageflips). Hence it must not be run on our own
+	 * dev_priv->wq work queue, for otherwise the flush_work in the pageflip
+	 * code will deadlock.
+	 * If error_work is already in the work queue then it will not be added
+	 * again. It hasn't yet executed so it will see the reset flags when
+	 * it is scheduled. If it isn't in the queue or it is currently
+	 * executing then this call will add it to the queue again so that
+	 * even if it misses the reset flags during the current call it is
+	 * guaranteed to see them on the next call.
+	 */
+	schedule_work(&dev_priv->gpu_error.work);
 }
 
 /* Called from drm generic code, passed 'crtc' which
@@ -4391,6 +4408,7 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
 	struct drm_device *dev = dev_priv->dev;
 
 	INIT_WORK(&dev_priv->hotplug_work, i915_hotplug_work_func);
+	INIT_WORK(&dev_priv->gpu_error.work, i915_error_work_func);
 	INIT_WORK(&dev_priv->dig_port_work, i915_digport_work_func);
 	INIT_WORK(&dev_priv->rps.work, gen6_pm_rps_work);
 	INIT_WORK(&dev_priv->l3_parity.error_work, ivybridge_parity_work);
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC 08/11] drm/i915: Watchdog timeout support for gen8.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (6 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 07/11] drm/i915: Reinstate hang recovery work queue Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:03 ` [RFC 09/11] drm/i915: Fake lost context interrupts through forced CSB check Tomas Elf
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Ian Lister

Watchdog timeout (or "media engine reset" as it is sometimes called, even
though the render engine is also supported) is a feature that allows userland
applications to enable hang detection on individual batch buffers. The
detection mechanism itself is mostly bound to the hardware and the only thing
that the driver needs to do to support this form of hang detection is to
implement the interrupt handling support as well as watchdog instruction
injection before and after the emitted batch buffer start instruction in the
ring buffer.

The principle of this hang detection mechanism is as follows:

1. Once the decision has been made to enable watchdog timeout for a particular
batch buffer and the driver is in the process of emitting the batch buffer
start instruction into the ring buffer it also emits a watchdog timer start
instruction before and a watchdog timer cancellation instruction after the
batch buffer instruction in the ring buffer.

2. Once the GPU execution reaches the watchdog timer start instruction the
hardware watchdog counter is started by the hardware.  The counter keeps
counting until it reaches a previously configured threshold value.

2a. If the counter reaches the threshold value the hardware fires a watchdog
interrupt that is picked up by the watchdog interrupt service routine in this
commit. This means that a hang has been detected and the driver needs to deal
with it the same way it would deal with an engine hang detected by the periodic
hang checker. The only difference between the two is that we never promote full
GPU reset following a watchdog timeout in case a per-engine reset was attempted
too recently. Thusly, the watchdog interrupt handler calls the error handler
directly passing the engine mask of the hung engine in question, which
immediately results in a per-engine hang recovery being scheduled.

2b. If the batch buffer finishes executing and the execution reaches the
watchdog cancellation instruction before the watchdog counter reaches its
threshold value the watchdog is cancelled and nothing more comes of it. No hang
was detected.

Currently watchdog timeout for the render engine and all available media
engines are supported. The specifications allude to the VECS engine also being
supported, but support for it is not included in this commit.

The current default watchdog threshold value is 60 ms, since this has been
empirically determined to be a good compromise between low-latency
requirements and a low rate of false positives.
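
For the record, at the gen7+ timestamp resolution of 80 ns the counter runs
at 12500000 counts per second, i.e. 12500 counts per millisecond, so the
60 ms default translates to a threshold of 60 * 12500 = 750000 timer counts.
A rough sketch of the emission described in point 1 above (illustrative only;
the example_emit_* helpers, standing in for the actual watchdog control
writes and the batch buffer start emission, are made-up names):

	static int example_do_exec_with_watchdog(struct intel_engine_cs *ring,
						 u64 bb_offset)
	{
		/* Arm the hardware counter with the per-engine threshold */
		example_emit_watchdog_start(ring, ring->watchdog_threshold);

		/* The batch buffer being guarded against hangs */
		example_emit_bb_start(ring, bb_offset);

		/*
		 * Executed by the hardware only once the batch completes:
		 * disarms the counter before it can reach the threshold
		 * and raise the watchdog interrupt.
		 */
		example_emit_watchdog_cancel(ring);

		return 0;
	}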

NOTE: I don't know if Ben Widawsky had any part in this code from 3 years
ago. There have been so many people involved in this already that I am in no
position to know. If I've missed anyone's sob line please let me know.

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@intel.com>
Signed-off-by: Ian Lister <ian.lister@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |    2 +-
 drivers/gpu/drm/i915/i915_dma.c         |   59 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_drv.h         |    7 ++-
 drivers/gpu/drm/i915/i915_irq.c         |   86 +++++++++++++++++++++------
 drivers/gpu/drm/i915/i915_reg.h         |    7 +++
 drivers/gpu/drm/i915/intel_lrc.c        |   99 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |   31 ++++++++++
 include/uapi/drm/i915_drm.h             |    5 +-
 8 files changed, 272 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e33e105..a89da48 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4183,7 +4183,7 @@ i915_wedged_set(void *data, u64 val)
 
 	intel_runtime_pm_get(dev_priv);
 
-	i915_handle_error(dev, 0x0, val,
+	i915_handle_error(dev, 0x0, false, val,
 			  "Manually setting wedged to %llu", val);
 
 	intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index b98abf8..2ec3163 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -791,6 +791,64 @@ i915_hangcheck_init(struct drm_device *dev)
 	}
 }
 
+void i915_watchdog_init(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	int freq;
+	int i;
+
+	/*
+	 * Based on pre-defined time out value (60ms or 30ms) calculate
+	 * timer count thresholds needed based on core frequency.
+	 *
+	 * For RCS.
+	 * The timestamp resolution changed in Gen7 and beyond to 80ns
+	 * for all pipes. Before that it was 640ns.
+	 */
+
+#define KM_RCS_ENGINE_TIMEOUT_VALUE_IN_MS 60
+#define KM_BSD_ENGINE_TIMEOUT_VALUE_IN_MS 60
+#define KM_TIMER_MILLISECOND 1000
+
+	/*
+	 * Timestamp timer resolution = 0.080 uSec,
+	 * or 12500000 counts per second
+	 */
+#define KM_TIMESTAMP_CNTS_PER_SEC_80NS 12500000
+
+	/*
+	 * Timestamp timer resolution = 0.640 uSec,
+	 * or 1562500 counts per second
+	 */
+#define KM_TIMESTAMP_CNTS_PER_SEC_640NS 1562500
+
+	if (INTEL_INFO(dev)->gen >= 7)
+		freq = KM_TIMESTAMP_CNTS_PER_SEC_80NS;
+	else
+		freq = KM_TIMESTAMP_CNTS_PER_SEC_640NS;
+
+	dev_priv->ring[RCS].watchdog_threshold =
+		((KM_RCS_ENGINE_TIMEOUT_VALUE_IN_MS) *
+		(freq / KM_TIMER_MILLISECOND));
+
+	dev_priv->ring[VCS].watchdog_threshold =
+		((KM_BSD_ENGINE_TIMEOUT_VALUE_IN_MS) *
+		(freq / KM_TIMER_MILLISECOND));
+
+	dev_priv->ring[VCS2].watchdog_threshold =
+		((KM_BSD_ENGINE_TIMEOUT_VALUE_IN_MS) *
+		(freq / KM_TIMER_MILLISECOND));
+
+	for (i = 0; i < I915_NUM_RINGS; i++)
+		dev_priv->ring[i].hangcheck.watchdog_count = 0;
+
+	DRM_INFO("Watchdog Timeout [ms], " \
+			"RCS: 0x%08X, VCS: 0x%08X, VCS2: 0x%08X\n", \
+			KM_RCS_ENGINE_TIMEOUT_VALUE_IN_MS,
+			KM_BSD_ENGINE_TIMEOUT_VALUE_IN_MS,
+			KM_BSD_ENGINE_TIMEOUT_VALUE_IN_MS);
+}
+
 /**
  * i915_driver_load - setup chip and create an initial config
  * @dev: DRM device
@@ -972,6 +1030,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	i915_gem_load(dev);
 
 	i915_hangcheck_init(dev);
+	i915_watchdog_init(dev);
 
 	/* On the 945G/GM, the chipset reports the MSI capability on the
 	 * integrated graphics even though the support isn't actually there
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index efa43c3..5139daa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2563,6 +2563,7 @@ extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv);
 extern void i915_update_gfx_val(struct drm_i915_private *dev_priv);
 int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
 void intel_hpd_cancel_work(struct drm_i915_private *dev_priv);
+void i915_watchdog_init(struct drm_device *dev);
 static inline void i915_hangcheck_reinit(struct intel_engine_cs *engine)
 {
 	struct intel_ring_hangcheck *hc = &engine->hangcheck;
@@ -2578,9 +2579,9 @@ static inline void i915_hangcheck_reinit(struct intel_engine_cs *engine)
 
 /* i915_irq.c */
 void i915_queue_hangcheck(struct drm_device *dev);
-__printf(4, 5)
-void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
-		       const char *fmt, ...);
+__printf(5, 6)
+void i915_handle_error(struct drm_device *dev, u32 engine_mask,
+		       bool watchdog, bool wedged, const char *fmt, ...);
 
 extern void intel_irq_init(struct drm_i915_private *dev_priv);
 extern void intel_hpd_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9913c8f..57c8568 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1289,6 +1289,18 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 				intel_lrc_irq_handler(&dev_priv->ring[RCS]);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[RCS]);
+			if (tmp & (GT_GEN8_RCS_WATCHDOG_INTERRUPT << GEN8_RCS_IRQ_SHIFT)) {
+				struct intel_engine_cs *ring;
+
+				/* Stop the counter to prevent further interrupts */
+				ring = &dev_priv->ring[RCS];
+				I915_WRITE(RING_CNTR(ring->mmio_base),
+					GEN6_RCS_WATCHDOG_DISABLE);
+
+				ring->hangcheck.watchdog_count++;
+				i915_handle_error(ring->dev, intel_ring_flag(ring), true, true,
+					"Render engine watchdog timed out");
+			}
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT))
 				intel_lrc_irq_handler(&dev_priv->ring[BCS]);
@@ -1308,11 +1320,35 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 				intel_lrc_irq_handler(&dev_priv->ring[VCS]);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VCS]);
+			if (tmp & (GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS1_IRQ_SHIFT)) {
+				struct intel_engine_cs *ring;
+
+				/* Stop the counter to prevent further interrupts */
+				ring = &dev_priv->ring[VCS];
+				I915_WRITE(RING_CNTR(ring->mmio_base),
+					GEN8_VCS_WATCHDOG_DISABLE);
+
+				ring->hangcheck.watchdog_count++;
+				i915_handle_error(ring->dev, intel_ring_flag(ring), true, true,
+						  "Media engine watchdog timed out");
+			}
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT))
 				intel_lrc_irq_handler(&dev_priv->ring[VCS2]);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VCS2]);
+			if (tmp & (GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS2_IRQ_SHIFT)) {
+				struct intel_engine_cs *ring;
+
+				/* Stop the counter to prevent further interrupts */
+				ring = &dev_priv->ring[VCS2];
+				I915_WRITE(RING_CNTR(ring->mmio_base),
+					GEN8_VCS_WATCHDOG_DISABLE);
+
+				ring->hangcheck.watchdog_count++;
+				i915_handle_error(ring->dev, intel_ring_flag(ring), true, true,
+						  "Media engine 2 watchdog timed out");
+			}
 		} else
 			DRM_ERROR("The master control interrupt lied (GT1)!\n");
 	}
@@ -2563,6 +2599,7 @@ static void i915_report_and_clear_eir(struct drm_device *dev)
  *			or if one of the current engine resets fails we fall
  *			back to legacy full GPU reset.
  *
+ * @watchdog: 		true = Engine hang detected by hardware watchdog.
  * @wedged: 		true = Hang detected, invoke hang recovery.
  * @fmt, ...: 		Error message describing reason for error.
  *
@@ -2574,8 +2611,8 @@ static void i915_report_and_clear_eir(struct drm_device *dev)
  * reset the associated engine. Failing that, try to fall back to legacy
  * full GPU reset recovery mode.
  */
-void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
-		       const char *fmt, ...)
+void i915_handle_error(struct drm_device *dev, u32 engine_mask,
+                       bool watchdog, bool wedged, const char *fmt, ...)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	va_list args;
@@ -2607,20 +2644,27 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 			u32 i;
 
 			for_each_ring(engine, dev_priv, i) {
-				u32 now, last_engine_reset_timediff;
 
 				if (!(intel_ring_flag(engine) & engine_mask))
 					continue;
 
-				/* Measure the time since this engine was last reset */
-				now = get_seconds();
-				last_engine_reset_timediff =
-					now - engine->hangcheck.last_engine_reset_time;
-
-				full_reset = last_engine_reset_timediff <
-					i915.gpu_reset_promotion_time;
-
-				engine->hangcheck.last_engine_reset_time = now;
+				if (!watchdog) {
+					/* Measure the time since this engine was last reset */
+					u32 now = get_seconds();
+					u32 last_engine_reset_timediff =
+						now - engine->hangcheck.last_engine_reset_time;
+
+					full_reset = last_engine_reset_timediff <
+						i915.gpu_reset_promotion_time;
+
+					engine->hangcheck.last_engine_reset_time = now;
+				} else {
+					/*
+					 * Watchdog timeout always results
+					 * in engine reset.
+					 */
+					full_reset = false;
+				}
 
 				/*
 				 * This engine was not reset too recently - go ahead
@@ -2631,10 +2675,11 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 				 * This can still be overridden by a global
 				 * reset e.g. if per-engine reset fails.
 				 */
-				if (!full_reset)
+				if (watchdog || !full_reset)
 					atomic_set_mask(I915_ENGINE_RESET_IN_PROGRESS,
 						&engine->hangcheck.flags);
-				else
+
+				if (full_reset)
 					break;
 
 			} /* for_each_ring */
@@ -2642,7 +2687,7 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask, bool wedged,
 
 		if (full_reset) {
 			atomic_set_mask(I915_RESET_IN_PROGRESS_FLAG,
-					&dev_priv->gpu_error.reset_counter);
+				&dev_priv->gpu_error.reset_counter);
 		}
 
 		/*
@@ -2980,7 +3025,7 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 	 */
 	tmp = I915_READ_CTL(ring);
 	if (tmp & RING_WAIT) {
-		i915_handle_error(dev, intel_ring_flag(ring), false,
+		i915_handle_error(dev, intel_ring_flag(ring), false, false,
 				  "Kicking stuck wait on %s",
 				  ring->name);
 		I915_WRITE_CTL(ring, tmp);
@@ -2992,7 +3037,7 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 		default:
 			return HANGCHECK_HUNG;
 		case 1:
-			i915_handle_error(dev, intel_ring_flag(ring), false,
+			i915_handle_error(dev, intel_ring_flag(ring), false, false,
 					  "Kicking stuck semaphore on %s",
 					  ring->name);
 			I915_WRITE_CTL(ring, tmp);
@@ -3134,9 +3179,9 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	}
 
 	if (engine_mask)
-		i915_handle_error(dev, engine_mask, true, "Ring hung (0x%02x)", engine_mask);
+		i915_handle_error(dev, engine_mask, false, true, "Ring hung (0x%02x)", engine_mask);
 	else if (force_full_gpu_reset)
-		i915_handle_error(dev, 0x0, true,
+		i915_handle_error(dev, 0x0, false, true,
 			"Hang recovery ineffective, falling back to full GPU reset");
 
 	if (busy_count)
@@ -3591,11 +3636,14 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
 {
 	/* These are interrupts we'll toggle with the ring mask register */
 	uint32_t gt_interrupts[] = {
+		GT_GEN8_RCS_WATCHDOG_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
 		GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
 			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
 			GT_RENDER_L3_PARITY_ERROR_INTERRUPT |
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT |
 			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT,
+		GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
+		GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS2_IRQ_SHIFT |
 		GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
 			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
 			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT |
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index af9f0ad..d2adb9b 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -1181,6 +1181,8 @@ enum skl_disp_power_wells {
 #define RING_HEAD(base)		((base)+0x34)
 #define RING_START(base)	((base)+0x38)
 #define RING_CTL(base)		((base)+0x3c)
+#define RING_CNTR(base)        ((base)+0x178)
+#define RING_THRESH(base) ((base)+0x17C)
 #define RING_SYNC_0(base)	((base)+0x40)
 #define RING_SYNC_1(base)	((base)+0x44)
 #define RING_SYNC_2(base)	((base)+0x48)
@@ -1584,6 +1586,11 @@ enum skl_disp_power_wells {
 #define GT_BSD_USER_INTERRUPT			(1 << 12)
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
 #define GT_CONTEXT_SWITCH_INTERRUPT		(1 <<  8)
+#define GT_GEN6_RENDER_WATCHDOG_INTERRUPT	(1 <<  6)
+#define GT_GEN8_RCS_WATCHDOG_INTERRUPT		(1 <<  6)
+#define   GEN6_RCS_WATCHDOG_DISABLE		1
+#define GT_GEN8_VCS_WATCHDOG_INTERRUPT		(1 <<  6)
+#define   GEN8_VCS_WATCHDOG_DISABLE		0xFFFFFFFF
 #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
 #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
 #define GT_RENDER_CS_MASTER_ERROR_INTERRUPT	(1 <<  3)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e9940cc..051da09 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1122,6 +1122,78 @@ static int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf,
 	return 0;
 }
 
+static int
+gen8_ring_start_watchdog(struct intel_ringbuffer *ringbuf, struct intel_context *ctx)
+{
+	int ret;
+	struct intel_engine_cs *ring = ringbuf->ring;
+
+	ret = intel_logical_ring_begin(ringbuf, ctx, 10);
+	if (ret)
+		return ret;
+
+	/*
+	 * i915_reg.h includes a warning to place a MI_NOOP
+	 * before a MI_LOAD_REGISTER_IMM
+	 */
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+
+	/* Set counter period */
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+	intel_logical_ring_emit(ringbuf, RING_THRESH(ring->mmio_base));
+	intel_logical_ring_emit(ringbuf, ring->watchdog_threshold);
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+
+	/* Start counter */
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+	intel_logical_ring_emit(ringbuf, RING_CNTR(ring->mmio_base));
+	intel_logical_ring_emit(ringbuf, I915_WATCHDOG_ENABLE);
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
+static int
+gen8_ring_stop_watchdog(struct intel_ringbuffer *ringbuf, struct intel_context *ctx)
+{
+	int ret;
+	struct intel_engine_cs *ring = ringbuf->ring;
+
+	ret = intel_logical_ring_begin(ringbuf, ctx, 6);
+	if (ret)
+		return ret;
+
+	/*
+	 * i915_reg.h includes a warning to place a MI_NOOP
+	 * before a MI_LOAD_REGISTER_IMM
+	 */
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+
+	intel_logical_ring_emit(ringbuf, MI_LOAD_REGISTER_IMM(1));
+	intel_logical_ring_emit(ringbuf, RING_CNTR(ring->mmio_base));
+
+	switch (ring->id) {
+	default:
+		WARN(1, "%s does not support watchdog timeout! " \
+			"Defaulting to render engine.\n", ring->name);
+	case RCS:
+		intel_logical_ring_emit(ringbuf, GEN6_RCS_WATCHDOG_DISABLE);
+		break;
+	case VCS:
+	case VCS2:
+		intel_logical_ring_emit(ringbuf, GEN8_VCS_WATCHDOG_DISABLE);
+		break;
+	}
+
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 /**
  * execlists_submission() - submit a batchbuffer for execution, Execlists style
  * @dev: DRM device.
@@ -1152,6 +1224,7 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 	int instp_mode;
 	u32 instp_mask;
 	int ret;
+	bool watchdog_running = false;
 
 	instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK;
 	instp_mask = I915_EXEC_CONSTANTS_MASK;
@@ -1203,6 +1276,18 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 	if (ret)
 		return ret;
 
+	/* Start watchdog timer */
+	if (args->flags & I915_EXEC_ENABLE_WATCHDOG) {
+		if (!intel_ring_supports_watchdog(ring))
+			return -EINVAL;
+
+		ret = gen8_ring_start_watchdog(ringbuf, ctx);
+		if (ret)
+			return ret;
+
+		watchdog_running = true;
+	}
+
 	if (ring == &dev_priv->ring[RCS] &&
 	    instp_mode != dev_priv->relative_constants_mode) {
 		ret = intel_logical_ring_begin(ringbuf, ctx, 4);
@@ -1224,6 +1309,13 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 
 	trace_i915_gem_ring_dispatch(intel_ring_get_request(ring), dispatch_flags);
 
+	/* Cancel watchdog timer */
+	if (watchdog_running) {
+		ret = gen8_ring_stop_watchdog(ringbuf, ctx);
+		if (ret)
+			return ret;
+	}
+
 	i915_gem_execbuffer_move_to_active(vmas, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
 
@@ -1892,6 +1984,9 @@ static int logical_render_ring_init(struct drm_device *dev)
 	if (HAS_L3_DPF(dev))
 		ring->irq_keep_mask |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
 
+	ring->irq_keep_mask |=
+		(GT_GEN8_RCS_WATCHDOG_INTERRUPT << GEN8_RCS_IRQ_SHIFT);
+
 	if (INTEL_INFO(dev)->gen >= 9)
 		ring->init_hw = gen9_init_render_ring;
 	else
@@ -1930,6 +2025,8 @@ static int logical_bsd_ring_init(struct drm_device *dev)
 		GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 	ring->irq_keep_mask =
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
+	ring->irq_keep_mask |=
+		(GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS1_IRQ_SHIFT);
 
 	ring->init_hw = gen8_init_common_ring;
 	ring->get_seqno = gen8_get_seqno;
@@ -1959,6 +2056,8 @@ static int logical_bsd2_ring_init(struct drm_device *dev)
 		GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
 	ring->irq_keep_mask =
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
+	ring->irq_keep_mask |=
+		(GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS2_IRQ_SHIFT);
 
 	ring->init_hw = gen8_init_common_ring;
 	ring->get_seqno = gen8_get_seqno;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 35360a4..9058789 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -30,6 +30,8 @@ struct  intel_hw_status_page {
 	struct		drm_i915_gem_object *obj;
 };
 
+#define I915_WATCHDOG_ENABLE 0
+
 #define I915_READ_TAIL(ring) I915_READ(RING_TAIL((ring)->mmio_base))
 #define I915_WRITE_TAIL(ring, val) I915_WRITE(RING_TAIL((ring)->mmio_base), val)
 
@@ -136,6 +138,9 @@ struct intel_ring_hangcheck {
 
 	/* Number of TDR hang detections */
 	u32 tdr_count;
+
+	/* Number of watchdog hang detections for this ring */
+	u32 watchdog_count;
 };
 
 struct intel_ringbuffer {
@@ -338,6 +343,12 @@ struct  intel_engine_cs {
 	/* Saved head value to be restored after reset */
 	u32 saved_head;
 
+	/*
+	 * Watchdog timer threshold values
+	 * only RCS, VCS, VCS2 rings have watchdog timeout support
+	 */
+	uint32_t watchdog_threshold;
+
 	struct {
 		struct drm_i915_gem_object *obj;
 		u32 gtt_offset;
@@ -484,6 +495,26 @@ int intel_ring_save(struct intel_engine_cs *ring,
 int intel_ring_restore(struct intel_engine_cs *ring,
 		struct drm_i915_gem_request *req);
 
+static inline bool intel_ring_supports_watchdog(struct intel_engine_cs *ring)
+{
+	bool ret = false;
+
+	if (WARN_ON(!ring))
+		goto exit;
+
+	ret = (	ring->id == RCS ||
+		ring->id == VCS ||
+		ring->id == VCS2);
+
+	if (!ret)
+		DRM_ERROR("%s does not support watchdog timeout!\n", ring->name);
+
+exit:
+	return ret;
+}
+int intel_ring_start_watchdog(struct intel_engine_cs *ring);
+int intel_ring_stop_watchdog(struct intel_engine_cs *ring);
+
 int __must_check intel_ring_idle(struct intel_engine_cs *ring);
 void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno);
 int intel_ring_flush_all_caches(struct intel_engine_cs *ring);
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 4851d66..f8af7d2 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -760,7 +760,10 @@ struct drm_i915_gem_execbuffer2 {
 #define I915_EXEC_BSD_RING1		(1<<13)
 #define I915_EXEC_BSD_RING2		(2<<13)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(1<<15)
+/* Enable watchdog timer for this batch buffer */
+#define I915_EXEC_ENABLE_WATCHDOG       (1<<15)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(1<<16)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.7.9.5


* [RFC 09/11] drm/i915: Fake lost context interrupts through forced CSB check.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (7 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 08/11] drm/i915: Watchdog timeout support for gen8 Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:03 ` [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery Tomas Elf
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC (permalink / raw)
  To: Intel-GFX

A recurring issue during long-duration operations testing of concurrent
rendering tasks with intermittent hangs is that context completion interrupts
following engine resets are sometimes lost. This becomes a real problem since
the hardware might have completed a previously hung context following a
per-engine hang recovery and then gone idle somehow without sending an
interrupt telling the driver about this. At this point the driver would be
stuck waiting for context completion, thinking that the context is still active,
even though the hardware would be idle and waiting for more work.

The way this is solved is by periodically checking for context submission
status inconsistencies. What this means is that the ID of the context that the
driver considers to be currently running on a given engine is compared against
the context ID in the EXECLIST_STATUS register of the respective engine. If
the two do not match and the state does not change over time it is assumed
that an interrupt was missed and that the driver is now stuck in an
inconsistent state.
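
In rough pseudo-C the comparison looks like this (the actual logic lives in
intel_execlists_TDR_get_current_request(), which is introduced elsewhere in
this series; current_ctx_obj is a hypothetical stand-in for the driver's view
of the currently running context):

	u32 hw_ctx_id = I915_READ(RING_EXECLIST_STATUS_CTX_ID(engine));
	u32 sw_ctx_id = intel_execlists_ctx_id(current_ctx_obj);

	/* Only deemed inconsistent if the mismatch persists across checks */
	if (hw_ctx_id != sw_ctx_id)
		status = CONTEXT_SUBMISSION_STATUS_INCONSISTENT;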

Following the decision that the driver and the hardware are irreversibly stuck
in an inconsistent state on a certain engine, the presumably lost interrupt is
faked by simply calling the execlist interrupt handler from a non-interrupt
context. Even though an interrupt might be lost, the hardware still updates
the context status buffer (CSB) when appropriate, which means that any context
state transitions are captured there regardless of whether the interrupt is
actually delivered. By faking the lost interrupt the interrupt handler can act
on the outstanding context status transition events in the CSB, e.g. a context
completion event. In the case where the hardware is idle but the driver is
waiting for completion, faking an interrupt and finding a context completion
status event causes the driver to remove the currently active request from the
execlist queue and go idle - thereby reestablishing a consistent context
submission status between the hardware and the driver.

The way this is implemented is that the hang checker always stays alive as
long as there is outstanding work. Even if the enable_hangcheck flag is
disabled, one part of the hang checker keeps rescheduling itself, only to scan
for inconsistent context submission states on all engines. As long as the
context submission status of the currently running context on a given engine
is consistent the hang checker works as normal and schedules hang recoveries
as expected. If the status is not consistent no hang recoveries are scheduled,
since no context resubmission would be possible anyway, so there is no point
in trying until the status becomes consistent again. Of course, if enough
hangs are detected on the same engine without any change in consistency the
hang checker goes straight for the full GPU reset, so there is no chance of
getting stuck in this state.

It's worth keeping in mind that the watchdog timeout hang detection mechanism
relies entirely on the per-engine hang recovery path. So if the context
submission status is inconsistent on an engine where watchdog timeout has
detected a hang, there is no way to recover from that hang if the periodic
hang checker is turned off, since per-engine hang recovery cannot do its final
context resubmission while the context submission status is inconsistent.
That's why we need to make sure that there is always a thread alive that keeps
an eye out for inconsistent context submission states, not only for the
periodic hang checker but also for watchdog timeout.

Finally, since a non-interrupt context thread could end up in the interrupt
handler as part of the forced CSB checking there's the chance of a race
condition between the interrupt handler and the ring init code since both
update ring->next_context_status_buffer. Therefore we've had to update the
interrupt handler so that it grabs the execlist spinlock before updating
the variable. We've also had to make sure that the ring init code
grabs the execlist spinlock before initing this variable.
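
A condensed view of the locking change (simplified from the diff below; the
forced CSB check path actually takes the lock with spin_lock_irqsave() and
calls the handler with do_lock set to false):

	/* Interrupt handler / forced CSB check */
	spin_lock(&ring->execlist_lock);
	/* ... process pending CSB events ... */
	ring->next_context_status_buffer = write_pointer % 6;
	spin_unlock(&ring->execlist_lock);

	/* Ring init */
	spin_lock_irqsave(&ring->execlist_lock, flags);
	ring->next_context_status_buffer = 0;
	spin_unlock_irqrestore(&ring->execlist_lock, flags);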

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |    6 +-
 drivers/gpu/drm/i915/i915_irq.c         |   77 ++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc.c        |   91 +++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc.h        |    2 +-
 drivers/gpu/drm/i915/intel_lrc_tdr.h    |    3 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   14 +++++
 6 files changed, 179 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 2ec3163..ad4c9efa 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -784,10 +784,12 @@ i915_hangcheck_init(struct drm_device *dev)
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		struct intel_engine_cs *engine = &dev_priv->ring[i];
+		struct intel_ring_hangcheck *hc = &engine->hangcheck;
 
 		i915_hangcheck_reinit(engine);
-		engine->hangcheck.reset_count = 0;
-		engine->hangcheck.tdr_count = 0;
+		hc->reset_count = 0;
+		hc->tdr_count = 0;
+		hc->inconsistent_ctx_status_cnt = 0;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 57c8568..56bd967 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -36,6 +36,7 @@
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include "intel_lrc_tdr.h"
 
 /**
  * DOC: interrupt handling
@@ -1286,7 +1287,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			ret = IRQ_HANDLED;
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[RCS]);
+				intel_lrc_irq_handler(&dev_priv->ring[RCS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[RCS]);
 			if (tmp & (GT_GEN8_RCS_WATCHDOG_INTERRUPT << GEN8_RCS_IRQ_SHIFT)) {
@@ -1303,7 +1304,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			}
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[BCS]);
+				intel_lrc_irq_handler(&dev_priv->ring[BCS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[BCS]);
 		} else
@@ -1317,7 +1318,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			ret = IRQ_HANDLED;
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[VCS]);
+				intel_lrc_irq_handler(&dev_priv->ring[VCS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VCS]);
 			if (tmp & (GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS1_IRQ_SHIFT)) {
@@ -1334,7 +1335,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			}
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[VCS2]);
+				intel_lrc_irq_handler(&dev_priv->ring[VCS2], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VCS2]);
 			if (tmp & (GT_GEN8_VCS_WATCHDOG_INTERRUPT << GEN8_VCS2_IRQ_SHIFT)) {
@@ -1360,7 +1361,7 @@ static irqreturn_t gen8_gt_irq_handler(struct drm_i915_private *dev_priv,
 			ret = IRQ_HANDLED;
 
 			if (tmp & (GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT))
-				intel_lrc_irq_handler(&dev_priv->ring[VECS]);
+				intel_lrc_irq_handler(&dev_priv->ring[VECS], true);
 			if (tmp & (GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT))
 				notify_ring(&dev_priv->ring[VECS]);
 		} else
@@ -3050,6 +3051,27 @@ ring_stuck(struct intel_engine_cs *ring, u64 acthd)
 	return HANGCHECK_HUNG;
 }
 
+static void check_ctx_submission_consistency(struct drm_i915_private *dev_priv,
+				   struct intel_engine_cs *engine,
+				   enum context_submission_status status)
+{
+	struct intel_ring_hangcheck *hc = &engine->hangcheck;
+
+	if (status == CONTEXT_SUBMISSION_STATUS_INCONSISTENT) {
+		if (hc->inconsistent_ctx_status_cnt++ >
+			I915_FAKED_CONTEXT_IRQ_THRESHOLD) {
+
+			DRM_ERROR("Inconsistent context submission state. " \
+				  "Faking interrupt on %s!\n", engine->name);
+
+			intel_execlists_TDR_force_CSB_check(dev_priv, engine);
+			hc->inconsistent_ctx_status_cnt = 0;
+		}
+	}
+	else
+		hc->inconsistent_ctx_status_cnt = 0;
+}
+
 /*
  * This is called when the chip hasn't reported back with completed
  * batchbuffers in a long time. We keep track per ring seqno progress and
@@ -3070,10 +3092,43 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	int busy_count = 0;
 	bool stuck[I915_NUM_RINGS] = { 0 };
 	bool force_full_gpu_reset = false;
+	enum context_submission_status status[I915_NUM_RINGS] =
+		{ CONTEXT_SUBMISSION_STATUS_OK };
 #define BUSY 1
 #define KICK 5
 #define HUNG 20
 
+	/*
+	 * In execlist mode we need to check for inconsistent context
+	 * submission states regardless if we want to actually check for hangs
+	 * or not since watchdog timeout is dependent on per-engine recovery
+	 * working properly, which will not be the case if there is an
+	 * inconsistent submission state between hardware and driver.
+	 */
+	if (i915.enable_execlists)
+		for_each_ring(ring, dev_priv, i) {
+			status[i] = intel_execlists_TDR_get_current_request(ring, NULL);
+			check_ctx_submission_consistency(dev_priv,
+							 ring,
+							 status[i]);
+
+			/*
+			 * Work is still pending! If hang checking is turned on
+			 * then go through the normal hang check procedure.
+			 * Otherwise we obviously don't do the normal busyness
+			 * check but instead go for a simple check of the
+			 * execlist queues to see if there's work pending. If
+			 * so, there's the potential for an inconsistent
+			 * context submission state so we must keep hang
+			 * checking.
+			 */
+			if (!i915.enable_hangcheck &&
+			   (status[i] != CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED)) {
+				 i915_queue_hangcheck(dev);
+				 return;
+			}
+		}
+
 	if (!i915.enable_hangcheck)
 		return;
 
@@ -3160,7 +3215,17 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	}
 
 	for_each_ring(ring, dev_priv, i) {
-		if (ring->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) {
+		/*
+		 * If the engine is hung but the context submission state is
+		 * inconsistent we cannot attempt recovery since we have no way
+		 * of resubmitting the context. Trying to do so would just
+		 * cause unforeseen preemptions. At the top of this function we
+		 * check for - and attempt to rectify - any inconsistencies so
+		 * that future hang checks can safely proceed to recover from
+		 * the hang.
+		 */
+		if ((ring->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) &&
+		    (status[i] == CONTEXT_SUBMISSION_STATUS_OK)) {
 			DRM_INFO("%s on %s\n",
 				 stuck[i] ? "stuck" : "no progress",
 				 ring->name);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 051da09..0d197fe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -641,7 +641,7 @@ static bool execlists_check_remove_request(struct intel_engine_cs *ring,
  * Check the unread Context Status Buffers and manage the submission of new
  * contexts to the ELSP accordingly.
  */
-void intel_lrc_irq_handler(struct intel_engine_cs *ring)
+int intel_lrc_irq_handler(struct intel_engine_cs *ring, bool do_lock)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	u32 status_pointer;
@@ -653,13 +653,14 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
 
 	status_pointer = I915_READ(RING_CONTEXT_STATUS_PTR(ring));
 
+	if (do_lock)
+		spin_lock(&ring->execlist_lock);
+
 	read_pointer = ring->next_context_status_buffer;
 	write_pointer = status_pointer & 0x07;
 	if (read_pointer > write_pointer)
 		write_pointer += 6;
 
-	spin_lock(&ring->execlist_lock);
-
 	while (read_pointer < write_pointer) {
 		read_pointer++;
 		status = I915_READ(RING_CONTEXT_STATUS_BUF(ring) +
@@ -685,13 +686,16 @@ void intel_lrc_irq_handler(struct intel_engine_cs *ring)
 	if (submit_contexts != 0)
 		execlists_context_unqueue(ring);
 
-	spin_unlock(&ring->execlist_lock);
-
 	WARN(submit_contexts > 2, "More than two context complete events?\n");
 	ring->next_context_status_buffer = write_pointer % 6;
 
+	if (do_lock)
+		spin_unlock(&ring->execlist_lock);
+
 	I915_WRITE(RING_CONTEXT_STATUS_PTR(ring),
 		   ((u32)ring->next_context_status_buffer & 0x07) << 8);
+
+	return submit_contexts;
 }
 
 static int execlists_context_queue(struct intel_engine_cs *ring,
@@ -1473,6 +1477,7 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned long flags;
 
 	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
 	I915_WRITE(RING_HWSTAM(ring->mmio_base), 0xffffffff);
@@ -1481,7 +1486,11 @@ static int gen8_init_common_ring(struct intel_engine_cs *ring)
 		   _MASKED_BIT_DISABLE(GFX_REPLAY_MODE) |
 		   _MASKED_BIT_ENABLE(GFX_RUN_LIST_ENABLE));
 	POSTING_READ(RING_MODE_GEN7(ring));
+
+	spin_lock_irqsave(&ring->execlist_lock, flags);
 	ring->next_context_status_buffer = 0;
+	spin_unlock_irqrestore(&ring->execlist_lock, flags);
+
 	DRM_DEBUG_DRIVER("Execlists enabled for %s\n", ring->name);
 
 	i915_hangcheck_reinit(ring);
@@ -2703,3 +2712,75 @@ intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
 
 	return status;
 }
+
+/**
+ * execlists_TDR_force_CSB_check() - check CSB manually to act on pending
+ * context status events.
+ *
+ * @dev_priv: ...
+ * @engine: engine whose CSB is to be checked.
+ *
+ * In case we missed a context event interrupt we can fake this interrupt by
+ * acting on pending CSB events manually by calling this function. This is
+ * normally what would happen in interrupt context but that does not prevent us
+ * from calling it from a user thread.
+ */
+void intel_execlists_TDR_force_CSB_check(struct drm_i915_private *dev_priv,
+					 struct intel_engine_cs *engine)
+{
+	unsigned long flags;
+	bool hw_active;
+	int was_effective;
+
+	if (atomic_read(&engine->hangcheck.flags)
+		& I915_ENGINE_RESET_IN_PROGRESS) {
+
+		/*
+		 * Normally it's not a problem to fake context event interrupts
+		 * at any point even though the real interrupt might come in as
+		 * well. However, following a per-engine reset the read pointer
+		 * is set to 0 and the write pointer is set to 7.
+		 * Seeing as 7 % 6 = 1 (% 6 meaning there are 6 event slots),
+		 * which is 1 above the post-reset read pointer position, that
+		 * means that we've got a CSB window of non-zero size that
+		 * might be populated with context events by the hardware
+		 * following the TDR context resubmission. If we do a faked
+		 * interrupt too early (before finishing hang recovery) we
+		 * clear out this window by setting read pointer = write
+		 * pointer = 1 expecting that all contained events have been
+		 * processed (following a reset there will be nothing but
+		 * zeroes in there, though). This does not prevent the hardware
+		 * from filling in CSB slots 0 and 1 with events after this
+		 * point in time, though. By checking the CSB before allowing
+		 * the hardware fill in the events we hide these events from
+		 * being processed, potentially causing irrecoverable hangs.
+		 *
+		 * Solution: Do not fake interrupts while hang recovery is ongoing.
+		 */
+		DRM_ERROR("Hang recovery in progress. Abort %s CSB check!\n",
+			engine->name);
+
+		return;
+	}
+
+	hw_active =
+		(I915_READ(RING_EXECLIST_STATUS(engine)) &
+			EXECLIST_STATUS_CURRENT_ACTIVE_ELEMENT_STATUS) ?
+				true : false;
+	if (hw_active) {
+		u32 hw_context;
+
+		hw_context = I915_READ(RING_EXECLIST_STATUS_CTX_ID(engine));
+		WARN(hw_active, "Context (%x) executing on %s - " \
+				"No need for faked IRQ!\n",
+				hw_context, engine->name);
+	}
+
+	spin_lock_irqsave(&engine->execlist_lock, flags);
+	if (!(was_effective = intel_lrc_irq_handler(engine, false)))
+		DRM_ERROR("Forced CSB check of %s ineffective!\n", engine->name);
+	spin_unlock_irqrestore(&engine->execlist_lock, flags);
+
+	wake_up_all(&engine->irq_queue);
+}
+
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index d2f497c..6fae3c8 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -88,7 +88,7 @@ int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
 			       u64 exec_start, u32 dispatch_flags);
 u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
 
-void intel_lrc_irq_handler(struct intel_engine_cs *ring);
+int intel_lrc_irq_handler(struct intel_engine_cs *ring, bool do_lock);
 void intel_execlists_retire_requests(struct intel_engine_cs *ring);
 
 int intel_execlists_read_tail(struct intel_engine_cs *ring,
diff --git a/drivers/gpu/drm/i915/intel_lrc_tdr.h b/drivers/gpu/drm/i915/intel_lrc_tdr.h
index 684b009..79cae7d 100644
--- a/drivers/gpu/drm/i915/intel_lrc_tdr.h
+++ b/drivers/gpu/drm/i915/intel_lrc_tdr.h
@@ -33,5 +33,8 @@ enum context_submission_status
 intel_execlists_TDR_get_current_request(struct intel_engine_cs *ring,
 		struct drm_i915_gem_request **req);
 
+void intel_execlists_TDR_force_CSB_check(struct drm_i915_private *dev_priv,
+					 struct intel_engine_cs *engine);
+
 #endif /* _INTEL_LRC_TDR_H_ */
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9058789..f779d4d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -141,6 +141,20 @@ struct intel_ring_hangcheck {
 
 	/* Number of watchdog hang detections for this ring */
 	u32 watchdog_count;
+
+	/*
+	 * Number of detected context submission status
+	 * inconsistencies
+	 */
+	u32 inconsistent_ctx_status_cnt;
+
+	/*
+	 * Number of detected context submission status
+	 * inconsistencies before faking the context event IRQ
+	 * that is presumed missing.
+	 */
+#define I915_FAKED_CONTEXT_IRQ_THRESHOLD 1
+
 };
 
 struct intel_ringbuffer {
-- 
1.7.9.5


* [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (8 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 09/11] drm/i915: Fake lost context interrupts through forced CSB check Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-08 17:45   ` Chris Wilson
  2015-06-08 17:03 ` [RFC 11/11] drm/i915: TDR/watchdog trace points Tomas Elf
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC (permalink / raw)
  To: Intel-GFX; +Cc: Ian Lister

1. The i915_wedged_set function allows us to schedule three forms of hang recovery:

	a) Legacy hang recovery: By passing e.g. -1 we trigger the legacy full
	GPU reset recovery path.

	b) Single engine hang recovery: By passing an engine ID in the interval
	of [0, I915_NUM_RINGS) we can schedule hang recovery of any single
	engine assuming that the context submission consistency requirements
	are met (otherwise the hang recovery path will simply exit early and
	wait for another hang detection). The values are assumed to use up bits
	3:0 only since we certainly do not support as many as 16 engines.

	This mode is supported since there are several legacy test applications
	that rely on this interface.

	c) Multiple engine hang recovery: By passing in an engine flag mask in
	bits 31:8 (bit 8 corresponds to engine 0 = RCS, bit 9 corresponds to
	engine 1 = VCS etc) we can schedule any combination of engine hang
	recoveries as we please. For example, by passing in the value 0x3 << 8
	we would schedule hang recovery for engines 0 and 1 (RCS and VCS) at
	the same time.

	If bits in fields 3:0 and 31:8 are both used then single engine hang
	recovery mode takes precedence and bits 31:8 are ignored (example
	values are sketched after this list).

2. The i915_wedged_get function produces a set of statistics related to:

	a) Number of engine hangs detected by periodic hang checker.
	b) Number of watchdog timeout hangs detected.
	c) Number of full GPU resets carried out.
	d) Number of engine resets carried out.

	These statistics are presented in a very parser-friendly way and are
	used by the TDR ULT to poll system behaviour to validate test outcomes
	(a sample output line is sketched after this list).
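
To illustrate the i915_wedged_set encoding above (my reading of the code
below; engine indices follow the RCS=0, VCS=1, BCS=2, VECS=3, VCS2=4
ordering):

	val = 0x1;	/* single engine mode: reset engine 1 (VCS)        */
	val = 0x3 << 8;	/* mask mode: reset engines 0 and 1 (RCS and VCS)  */
	val = -1;	/* bits 3:0 == 0xF: no valid index, full GPU reset */

And a hedged sketch of what a read of the new i915_ring_hangcheck debugfs
entry returns, constructed from the format strings in the diff (the values
are made up and each group repeats for all five engines):

	GPU=0x00000001,RCS=0x00000002,VCS=0x00000000,...,RCS_T=0x00000003,...,RCS_W=0x00000001,...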

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@intel.com>
Signed-off-by: Ian Lister <ian.lister@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  146 +++++++++++++++++++++++++++++++++--
 1 file changed, 141 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a89da48..f3305ed 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2030,7 +2030,7 @@ static int i915_execlists(struct seq_file *m, void *data)
 		seq_printf(m, "%s\n", ring->name);
 
 		status = I915_READ(RING_EXECLIST_STATUS(ring));
-		ctx_id = I915_READ(RING_EXECLIST_STATUS(ring) + 4);
+		ctx_id = I915_READ(RING_EXECLIST_STATUS_CTX_ID(ring));
 		seq_printf(m, "\tExeclist status: 0x%08X, context: %u\n",
 			   status, ctx_id);
 
@@ -4164,11 +4164,50 @@ i915_wedged_get(void *data, u64 *val)
 	return 0;
 }
 
+static const char *ringid_to_str(enum intel_ring_id ring_id)
+{
+	switch (ring_id) {
+	case RCS:
+		return "RCS";
+	case VCS:
+		return "VCS";
+	case BCS:
+		return "BCS";
+	case VECS:
+		return "VECS";
+	case VCS2:
+		return "VCS2";
+	}
+
+	return "unknown";
+}
+
 static int
 i915_wedged_set(void *data, u64 val)
 {
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_engine_cs *engine;
+	const u32 engine_mask = ((1 << I915_NUM_RINGS) - 1);
+	const u32 single_engine_reset_mask = 0xF;
+	const u32 bitfield_boundary = 8;
+	u32 val_mask = 0;
+	u32 i;
+#define ENGINE_MSGLEN 64
+	char msg[ENGINE_MSGLEN] = "";
+
+	/*
+	 * Val can contain values in one of the following mutually exclusive
+	 * formats:
+	 *
+	 * 1. Bits [3:0] != 0x0 :
+	 *	Index (0 .. I915_NUM_RINGS-1) of engine to be manually reset.
+	 *	Invalid indices translate to full gpu reset.
+	 *
+	 * 2. Bits [(I915_NUM_RINGS-1)+8 : 8] != 0x0 :
+	 *	Bit mask containing the engine flags of all the engines that
+	 *	are to be manually reset.
+	 */
 
 	/*
 	 * There is no safeguard against this debugfs entry colliding
@@ -4177,14 +4216,61 @@ i915_wedged_set(void *data, u64 val)
 	 * test harness is responsible enough not to inject gpu hangs
 	 * while it is writing to 'i915_wedged'
 	 */
-
-	if (i915_reset_in_progress(&dev_priv->gpu_error))
+	if (i915_gem_check_wedge(dev_priv, NULL, true))
 		return -EAGAIN;
 
 	intel_runtime_pm_get(dev_priv);
 
-	i915_handle_error(dev, 0x0, false, val,
-			  "Manually setting wedged to %llu", val);
+	if (!val || (single_engine_reset_mask & val)) {
+		/*
+		 * Single engine hang mode
+		 *
+		 * Bits [3:0] of val contains index of engine
+		 * to be manually reset.
+		 */
+		val &= single_engine_reset_mask;
+		if (val == single_engine_reset_mask)
+			val_mask = 0x0;
+		else
+			val_mask = (1 << (val & 0xF));
+
+	} else {
+		/*
+		 * Mask mode
+		 *
+		 * Bits [31:8] of val contains bit mask of engines to be
+		 * manually reset, engine index 0 at bit 4, engine index 1 at
+		 * bit 5 and so forth.
+		 */
+		val_mask = (val >> bitfield_boundary) & engine_mask;
+	}
+
+
+	if (val_mask) {
+		u32 len;
+
+		len = scnprintf(msg, sizeof(msg), "Manual reset:");
+
+		/* Assemble message string */
+		for_each_ring(engine, dev_priv, i)
+			if (intel_ring_flag(engine) & val_mask) {
+				DRM_INFO("Manual reset: %s\n", engine->name);
+
+				len += scnprintf(msg + len, sizeof(msg) - len,
+						 " [%s]",
+						 ringid_to_str(i));
+			}
+
+	} else {
+		scnprintf(msg, sizeof(msg), "Manual global reset");
+	}
+
+	i915_handle_error(dev,
+			  val_mask,
+			  false,
+			  true,
+			  msg);
 
 	intel_runtime_pm_put(dev_priv);
 
@@ -4195,6 +4281,55 @@ DEFINE_SIMPLE_ATTRIBUTE(i915_wedged_fops,
 			i915_wedged_get, i915_wedged_set,
 			"%llu\n");
 
+static ssize_t
+i915_ring_hangcheck_read(struct file *filp, char __user *ubuf,
+			 size_t max, loff_t *ppos)
+{
+	int i;
+	int len;
+	char buf[300];
+	struct drm_device *dev = filp->private_data;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	/*
+	 * Returns the total number of times the rings
+	 * have hung and been reset since boot
+	 */
+	len = scnprintf(buf, sizeof(buf), "GPU=0x%08X,",
+			i915_reset_count(&dev_priv->gpu_error));
+	for (i = 0; i < I915_NUM_RINGS; ++i)
+		len += scnprintf(buf + len, sizeof(buf) - len,
+				 "%s=0x%08lX,",
+				 ringid_to_str(i),
+				 (long unsigned)
+				 dev_priv->ring[i].hangcheck.reset_count);
+
+	for (i = 0; i < I915_NUM_RINGS; ++i)
+		len += scnprintf(buf + len, sizeof(buf) - len,
+				 "%s_T=0x%08lX,",
+				 ringid_to_str(i),
+				 (long unsigned)
+				 dev_priv->ring[i].hangcheck.tdr_count);
+
+	for (i = 0; i < I915_NUM_RINGS; ++i)
+		len += scnprintf(buf + len, sizeof(buf) - len,
+				 "%s_W=0x%08lX,",
+				 ringid_to_str(i),
+				 (long unsigned)
+				 dev_priv->ring[i].hangcheck.watchdog_count);
+
+	len += scnprintf(buf + len - 1, sizeof(buf) - len, "\n");
+
+	return simple_read_from_buffer(ubuf, max, ppos, buf, len);
+}
+
+static const struct file_operations i915_ring_hangcheck_fops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+	.read = i915_ring_hangcheck_read,
+	.llseek = default_llseek,
+};
+
 static int
 i915_ring_stop_get(void *data, u64 *val)
 {
@@ -4825,6 +4960,7 @@ static const struct i915_debugfs_files {
 	{"i915_ring_missed_irq", &i915_ring_missed_irq_fops},
 	{"i915_ring_test_irq", &i915_ring_test_irq_fops},
 	{"i915_gem_drop_caches", &i915_drop_caches_fops},
+	{"i915_ring_hangcheck", &i915_ring_hangcheck_fops},
 	{"i915_error_state", &i915_error_state_fops},
 	{"i915_next_seqno", &i915_next_seqno_fops},
 	{"i915_display_crc_ctl", &i915_display_crc_ctl_fops},
-- 
1.7.9.5


* [RFC 11/11] drm/i915: TDR/watchdog trace points.
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (9 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery Tomas Elf
@ 2015-06-08 17:03 ` Tomas Elf
  2015-06-23 10:05 ` [RFC 00/11] TDR/watchdog timeout support for gen8 Daniel Vetter
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-08 17:03 UTC (permalink / raw)
  To: Intel-GFX

Defined trace points and sprinkled the usage of these throughout the
TDR/watchdog implementation.

The following trace points are supported:

	1. trace_i915_tdr_gpu_recovery:
	Called at the onset of the full GPU reset recovery path.

	2. trace_i915_tdr_engine_recovery:
	Called at the onset of the per-engine recovery path.

	3. i915_tdr_recovery_start:
	Called at the onset of hang recovery before recovery mode has been
	decided.

	4. i915_tdr_recovery_complete:
	Called at the point of hang recovery completion.

	5. i915_tdr_recovery_queued:
	Called once the error handler decides to schedule the actual hang
	recovery, which marks the end of the hang detection path.

	6. i915_tdr_engine_save:
	Called at the point of saving the engine state during per-engine hang
	recovery.

	7. i915_tdr_gpu_reset_complete:
	Called at the point of full GPU reset recovery completion.

	8. i915_tdr_engine_reset_complete:
	Called at the point of per-engine recovery completion.

	9. i915_tdr_forced_csb_check:
	Called at the completion of a forced CSB check.

	10. i915_tdr_hang_check:
	Called for every engine in the periodic hang checker loop before moving
	on to the next engine. Provides an overview of all hang check stats in
	real-time. The collected stats are:

		a. Engine name.

		b. Current engine seqno.

		c. Seqno of previous hang check iteration for that engine.

		d. ACTHD register value of given engine.

		e. Current hang check score of given engine (and whether or not
		the engine has been detected as hung).

		f. Current action for given engine.

		g. Busyness of given engine.

		h. Submission status of currently running context on given engine.

Signed-off-by: Tomas Elf <tomas.elf@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c       |    3 +
 drivers/gpu/drm/i915/i915_drv.h       |   19 +++
 drivers/gpu/drm/i915/i915_gpu_error.c |    2 +-
 drivers/gpu/drm/i915/i915_irq.c       |   10 +-
 drivers/gpu/drm/i915/i915_trace.h     |  298 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c      |    8 +-
 drivers/gpu/drm/i915/intel_uncore.c   |    4 +
 7 files changed, 340 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e1629a6..8030b92 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -845,6 +845,7 @@ int i915_reset(struct drm_device *dev)
 	if (!i915.reset)
 		return 0;
 
+	trace_i915_tdr_gpu_recovery(dev);
 	intel_reset_gt_powersave(dev);
 
 	mutex_lock(&dev->struct_mutex);
@@ -952,6 +953,8 @@ int i915_reset_engine(struct intel_engine_cs *engine)
 
 	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
 
+	trace_i915_tdr_engine_recovery(engine);
+
         /* Take wake lock to prevent power saving mode */
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5139daa..c8d62d2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2323,6 +2323,24 @@ enum context_submission_status {
 	CONTEXT_SUBMISSION_STATUS_UNDEFINED
 };
 
+static inline const char*
+i915_context_submission_status_to_str(enum context_submission_status status)
+{
+	switch(status)
+	{
+		case CONTEXT_SUBMISSION_STATUS_OK:
+			return "ok";
+		case CONTEXT_SUBMISSION_STATUS_INCONSISTENT:
+			return "inconsistent";
+		case CONTEXT_SUBMISSION_STATUS_NONE_SUBMITTED:
+			return "none";
+		case CONTEXT_SUBMISSION_STATUS_UNDEFINED:
+			return "undefined";
+		default:
+			return "invalid";
+	}
+}
+
 /* Note that the (struct drm_i915_private *) cast is just to shut up gcc. */
 #define __I915__(p) ({ \
 	struct drm_i915_private *__p; \
@@ -3129,6 +3147,7 @@ void i915_destroy_error_state(struct drm_device *dev);
 
 void i915_get_extra_instdone(struct drm_device *dev, uint32_t *instdone);
 const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
+const char *hangcheck_action_to_str(enum intel_ring_hangcheck_action a);
 
 /* i915_cmd_parser.c */
 int i915_cmd_parser_get_version(void);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index ac22614..cee1825 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -220,7 +220,7 @@ static void print_error_buffers(struct drm_i915_error_state_buf *m,
 	}
 }
 
-static const char *hangcheck_action_to_str(enum intel_ring_hangcheck_action a)
+const char *hangcheck_action_to_str(enum intel_ring_hangcheck_action a)
 {
 	switch (a) {
 	case HANGCHECK_IDLE:
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 56bd967..66a8456 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2359,6 +2359,7 @@ static void i915_error_work_func(struct work_struct *work)
 
 	mutex_lock(&dev->struct_mutex);
 
+	trace_i915_tdr_recovery_start(dev);
 	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, error_event);
 
 	for_each_ring(ring, dev_priv, i) {
@@ -2488,6 +2489,8 @@ static void i915_error_work_func(struct work_struct *work)
 
 	kobject_uevent_env(&dev->primary->kdev->kobj,
 			   KOBJ_CHANGE, reset_done_event);
+
+	trace_i915_tdr_recovery_complete(dev);
 }
 
 static void i915_report_and_clear_eir(struct drm_device *dev)
@@ -2618,6 +2621,7 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	va_list args;
 	char error_msg[80];
+	bool full_reset = true;
 
 	struct intel_engine_cs *engine;
 
@@ -2635,7 +2639,6 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask,
 		 *	2. The hardware does not support it (pre-gen7).
 		 *	3. We already tried per-engine reset recently.
 		 */
-		bool full_reset = true;
 
 		/*
 		 * TBD: We currently only support per-engine reset for gen8+.
@@ -2659,6 +2662,7 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask,
 						i915.gpu_reset_promotion_time;
 
 					engine->hangcheck.last_engine_reset_time = now;
+
 				} else {
 					/*
 					 * Watchdog timeout always results
@@ -2707,6 +2711,8 @@ void i915_handle_error(struct drm_device *dev, u32 engine_mask,
 		i915_error_wake_up(dev_priv, false);
 	}
 
+	trace_i915_tdr_recovery_queued(dev, engine_mask, watchdog, full_reset);
+
 	/*
 	 * Gen 7:
 	 *
@@ -3212,6 +3218,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 		ring->hangcheck.seqno = seqno;
 		ring->hangcheck.acthd = acthd;
 		busy_count += busy;
+
+		trace_i915_tdr_hang_check(ring, seqno, acthd, busy, status[i]);
 	}
 
 	for_each_ring(ring, dev_priv, i) {
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 2aa140e..740033e 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -775,6 +775,304 @@ TRACE_EVENT(switch_mm,
 		  __entry->dev, __entry->ring, __entry->to, __entry->vm)
 );
 
+/**
+ * DOC: i915_tdr_gpu_recovery
+ *
+ * This tracepoint tracks the onset of the full GPU recovery path
+ */
+TRACE_EVENT(i915_tdr_gpu_recovery,
+	TP_PROTO(struct drm_device *dev),
+
+	TP_ARGS(dev),
+
+	TP_STRUCT__entry(
+			__field(u32, dev)
+	),
+
+	TP_fast_assign(
+			__entry->dev = dev->primary->index;
+	),
+
+	TP_printk("dev=%u, full GPU recovery started",
+		  __entry->dev)
+);
+
+/**
+ * DOC: i915_tdr_engine_recovery
+ *
+ * This tracepoint tracks the onset of the engine recovery path
+ */
+TRACE_EVENT(i915_tdr_engine_recovery,
+	TP_PROTO(struct intel_engine_cs *engine),
+
+	TP_ARGS(engine),
+
+	TP_STRUCT__entry(
+			__field(struct intel_engine_cs *, engine)
+	),
+
+	TP_fast_assign(
+			__entry->engine = engine;
+	),
+
+	TP_printk("dev=%u, engine=%u, recovery of %s started",
+		  __entry->engine->dev->primary->index,
+		  __entry->engine->id,
+		  __entry->engine->name)
+);
+
+/**
+ * DOC: i915_tdr_recovery_start
+ *
+ * This tracepoint tracks hang recovery start
+ */
+TRACE_EVENT(i915_tdr_recovery_start,
+	TP_PROTO(struct drm_device *dev),
+
+	TP_ARGS(dev),
+
+	TP_STRUCT__entry(
+			__field(u32, dev)
+	),
+
+	TP_fast_assign(
+			__entry->dev = dev->primary->index;
+	),
+
+	TP_printk("dev=%u, hang recovery started",
+		  __entry->dev)
+);
+
+/**
+ * DOC: i915_tdr_recovery_complete
+ *
+ * This tracepoint tracks hang recovery completion
+ */
+TRACE_EVENT(i915_tdr_recovery_complete,
+	TP_PROTO(struct drm_device *dev),
+
+	TP_ARGS(dev),
+
+	TP_STRUCT__entry(
+			__field(u32, dev)
+	),
+
+	TP_fast_assign(
+			__entry->dev = dev->primary->index;
+	),
+
+	TP_printk("dev=%u, hang recovery completed",
+		  __entry->dev)
+);
+
+/**
+ * DOC: i915_tdr_recovery_queued
+ *
+ * This tracepoint tracks the point of queuing recovery from hang check.
+ * If engine recovery is requested engine name will be displayed, otherwise
+ * it will be set to "none". If too many engine resets were attempted in
+ * recent history we promote to full GPU reset, which is marked by appending
+ * the "[PROMOTED]" flag.
+ */
+TRACE_EVENT(i915_tdr_recovery_queued,
+	TP_PROTO(struct drm_device *dev,
+		 u32 hung_engines,
+		 bool watchdog,
+		 bool full_reset),
+
+	TP_ARGS(dev, hung_engines, watchdog, full_reset),
+
+	TP_STRUCT__entry(
+			__field(u32, dev)
+			__field(u32, hung_engines)
+			__field(bool, watchdog)
+			__field(bool, full_reset)
+	),
+
+	TP_fast_assign(
+			__entry->dev = dev->primary->index;
+			__entry->hung_engines = hung_engines;
+			__entry->watchdog = watchdog;
+			__entry->full_reset = full_reset;
+	),
+
+	TP_printk("dev=%u, hung_engines=0x%02x%s%s%s%s%s%s%s, watchdog=%s, full_reset=%s",
+		  __entry->dev,
+		  __entry->hung_engines,
+		  __entry->hung_engines ? " (":"",
+		  __entry->hung_engines & RENDER_RING ? " [RCS] " : "",
+		  __entry->hung_engines & BSD_RING ? 	" [VCS] " : "",
+		  __entry->hung_engines & BLT_RING ? 	" [BCS] " : "",
+		  __entry->hung_engines & VEBOX_RING ? 	" [VECS] " : "",
+		  __entry->hung_engines & BSD2_RING ? 	" [VCS2] " : "",
+		  __entry->hung_engines ? ")":"",
+		  __entry->watchdog ? "true" : "false",
+		  __entry->full_reset ?
+			(__entry->hung_engines ? "true [PROMOTED]" : "true") :
+				"false")
+);
+
+/**
+ * DOC: i915_tdr_engine_save
+ *
+ * This tracepoint tracks the point of engine state save during the engine
+ * recovery path. Logs the head pointer position at point of hang, the position
+ * after recovering and whether or not we forced a head pointer advancement or
+ * rounded up to an aligned QWORD position.
+ */
+TRACE_EVENT(i915_tdr_engine_save,
+	TP_PROTO(struct intel_engine_cs *engine,
+		 u32 old_head,
+		 u32 new_head,
+		 bool forced_advance),
+
+	TP_ARGS(engine, old_head, new_head, forced_advance),
+
+	TP_STRUCT__entry(
+			__field(struct intel_engine_cs *, engine)
+			__field(u32, old_head)
+			__field(u32, new_head)
+			__field(bool, forced_advance)
+	),
+
+	TP_fast_assign(
+			__entry->engine = engine;
+			__entry->old_head = old_head;
+			__entry->new_head = new_head;
+			__entry->forced_advance = forced_advance;
+	),
+
+	TP_printk("dev=%u, engine=%s, old_head=%u, new_head=%u, forced_advance=%s",
+		  __entry->engine->dev->primary->index,
+		  __entry->engine->name,
+		  __entry->old_head,
+		  __entry->new_head,
+		  __entry->forced_advance ? "true" : "false")
+);
+
+/**
+ * DOC: i915_tdr_gpu_reset_complete
+ *
+ * This tracepoint tracks the point of full GPU reset completion
+ */
+TRACE_EVENT(i915_tdr_gpu_reset_complete,
+	TP_PROTO(struct drm_device *dev),
+
+	TP_ARGS(dev),
+
+	TP_STRUCT__entry(
+			__field(struct drm_device *, dev)
+	),
+
+	TP_fast_assign(
+			__entry->dev = dev;
+	),
+
+	TP_printk("dev=%u, resets=%u",
+		__entry->dev->primary->index,
+		i915_reset_count(&((struct drm_i915_private *)
+			(__entry->dev)->dev_private)->gpu_error) )
+);
+
+/**
+ * DOC: i915_tdr_engine_reset_complete
+ *
+ * This tracepoint tracks the point of engine reset completion
+ */
+TRACE_EVENT(i915_tdr_engine_reset_complete,
+	TP_PROTO(struct intel_engine_cs *engine),
+
+	TP_ARGS(engine),
+
+	TP_STRUCT__entry(
+			__field(struct intel_engine_cs *, engine)
+	),
+
+	TP_fast_assign(
+			__entry->engine = engine;
+	),
+
+	TP_printk("dev=%u, engine=%s, resets=%u",
+		  __entry->engine->dev->primary->index,
+		  __entry->engine->name,
+		  __entry->engine->hangcheck.reset_count)
+);
+
+/**
+ * DOC: i915_tdr_forced_csb_check
+ *
+ * This tracepoint tracks the occurrences of forced CSB checks
+ * that the driver does when detecting inconsistent context
+ * submission states between the driver state and the current
+ * GPU engine state.
+ */
+TRACE_EVENT(i915_tdr_forced_csb_check,
+	TP_PROTO(struct intel_engine_cs *engine,
+		 bool was_effective),
+
+	TP_ARGS(engine, was_effective),
+
+	TP_STRUCT__entry(
+			__field(struct intel_engine_cs *, engine)
+			__field(bool, was_effective)
+	),
+
+	TP_fast_assign(
+			__entry->engine = engine;
+			__entry->was_effective = was_effective;
+	),
+
+	TP_printk("dev=%u, engine=%s, was_effective=%s",
+		  __entry->engine->dev->primary->index,
+		  __entry->engine->name,
+		  __entry->was_effective ? "yes" : "no")
+);
+
+/**
+ * DOC: i915_tdr_hang_check
+ *
+ * This tracepoint tracks hang checks on each engine.
+ */
+TRACE_EVENT(i915_tdr_hang_check,
+	TP_PROTO(struct intel_engine_cs *engine,
+		 u32 seqno,
+		 u64 acthd,
+		 bool busy,
+		 enum context_submission_status status),
+
+	TP_ARGS(engine, seqno, acthd, busy, status),
+
+	TP_STRUCT__entry(
+			__field(struct intel_engine_cs *, engine)
+			__field(u32, seqno)
+			__field(u64, acthd)
+			__field(bool, busy)
+			__field(enum context_submission_status, status)
+	),
+
+	TP_fast_assign(
+			__entry->engine = engine;
+			__entry->seqno = seqno;
+			__entry->acthd = acthd;
+			__entry->busy = busy;
+			__entry->status = status;
+	),
+
+	TP_printk("dev=%u, engine=%s, seqno=%u, last seqno=%u, acthd=%llu, score=%d%s, action=%u [%s], busy=%s, status=%u [%s]",
+		  __entry->engine->dev->primary->index,
+		  __entry->engine->name,
+		  __entry->seqno,
+		  __entry->engine->hangcheck.seqno,
+		  (unsigned long long) __entry->acthd,
+		  __entry->engine->hangcheck.score,
+		  (__entry->engine->hangcheck.score >= HANGCHECK_SCORE_RING_HUNG) ? " [HUNG]" : "",
+		  (unsigned int) __entry->engine->hangcheck.action,
+		  hangcheck_action_to_str(__entry->engine->hangcheck.action),
+		  __entry->busy ? "yes" : "no",
+		  __entry->status,
+		  i915_context_submission_status_to_str(__entry->status))
+);
+
 #endif /* _I915_TRACE_H_ */
 
 /* This part must be outside protection */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0d197fe..748c0fb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1780,7 +1780,7 @@ gen8_ring_save(struct intel_engine_cs *ring, struct drm_i915_gem_request *req,
 	struct intel_context *ctx;
 	int ret = 0;
 	int clamp_to_tail = 0;
-	uint32_t head;
+	uint32_t head, old_head;
 	uint32_t tail;
 	uint32_t head_addr;
 	uint32_t tail_addr;
@@ -1795,7 +1795,7 @@ gen8_ring_save(struct intel_engine_cs *ring, struct drm_i915_gem_request *req,
 	 * Read head from MMIO register since it contains the
 	 * most up to date value of head at this point.
 	 */
-	head = I915_READ_HEAD(ring);
+	old_head = head = I915_READ_HEAD(ring);
 
 	/*
 	 * Read tail from the context because the execlist queue
@@ -1852,6 +1852,9 @@ gen8_ring_save(struct intel_engine_cs *ring, struct drm_i915_gem_request *req,
 	head |= (head_addr & HEAD_ADDR);
 	ring->saved_head = head;
 
+	trace_i915_tdr_engine_save(ring, old_head,
+		head, force_advance);
+
 	return 0;
 }
 
@@ -2781,6 +2784,7 @@ void intel_execlists_TDR_force_CSB_check(struct drm_i915_private *dev_priv,
 		DRM_ERROR("Forced CSB check of %s ineffective!\n", engine->name);
 	spin_unlock_irqrestore(&engine->execlist_lock, flags);
 
+	trace_i915_tdr_forced_csb_check(engine, !!was_effective);
 	wake_up_all(&engine->irq_queue);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 84774d2..f72cfe1 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1442,6 +1442,8 @@ static int gen6_do_reset(struct drm_device *dev)
 	/* Spin waiting for the device to ack the reset request */
 	ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);
 
+	trace_i915_tdr_gpu_reset_complete(dev);
+
 	intel_uncore_forcewake_reset(dev, true);
 
 	return ret;
@@ -1535,6 +1537,8 @@ static int do_engine_reset_nolock(struct intel_engine_cs *engine)
 		break;
 	}
 
+	trace_i915_tdr_engine_reset_complete(engine);
+
 	return ret;
 }
 
-- 
1.7.9.5
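
For completeness, the new tracepoints register under the existing i915
trace system, so once the patch is applied they can be enabled through
the standard tracefs event interface. A minimal userspace sketch,
assuming the usual debugfs mount point (illustrative, not part of the
patch):

/* Enable one of the new i915_tdr_* trace events through tracefs.
 * Records then appear in .../tracing/trace_pipe in the TP_printk
 * format defined above.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int enable_i915_event(const char *event)
{
	char path[256];
	int fd, ret = 0;

	snprintf(path, sizeof(path),
		 "/sys/kernel/debug/tracing/events/i915/%s/enable", event);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	if (write(fd, "1", 1) != 1)
		ret = -errno;
	close(fd);
	return ret;
}

/* e.g. enable_i915_event("i915_tdr_hang_check"); */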


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress
  2015-06-08 17:03 ` [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress Tomas Elf
@ 2015-06-08 17:24   ` Chris Wilson
  2015-06-09 11:08     ` Tomas Elf
  2015-06-09 11:11   ` Chris Wilson
  1 sibling, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2015-06-08 17:24 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX, Ian Lister

On Mon, Jun 08, 2015 at 06:03:23PM +0100, Tomas Elf wrote:
> @@ -4089,11 +4130,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	unsigned reset_counter;
>  	int ret;
>  
> -	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
> -	if (ret)
> -		return ret;
> -
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
> +	ret = i915_gem_wait_for_error(dev_priv);
>  	if (ret)
>  		return ret;
>  
> @@ -4112,9 +4149,17 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	if (target == NULL)
>  		return 0;
>  
> -	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
> -	if (ret == 0)
> -		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
> +	if (target->ring) {
> +		if (i915_gem_check_wedge(dev_priv, NULL, false))
> +			return -EIO;
> +
> +		ret = __i915_wait_request(target, reset_counter, true, NULL,
> +			NULL);
> +
> +		if (ret == 0)
> +			queue_delayed_work(dev_priv->wq,
> +				&dev_priv->mm.retire_work, 0);
> +	}

This breaks an important bit of ABI. throttle() is used to detect when
the GPU is hung, even when the client is idle (i.e. when the client
starts up or is switched to).
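
For context, a minimal sketch of that userspace pattern, assuming
libdrm's drmIoctl() and the long-standing DRM_IOCTL_I915_GEM_THROTTLE
ioctl (illustrative, not code from this series):

/* Probe for a wedged GPU from an otherwise idle client: the throttle
 * ioctl fails with EIO once the GPU is terminally wedged, even when
 * this client has no outstanding requests.
 */
#include <errno.h>
#include <xf86drm.h>
#include <i915_drm.h>

static int gpu_is_wedged(int drm_fd)
{
	if (drmIoctl(drm_fd, DRM_IOCTL_I915_GEM_THROTTLE, NULL) == -1)
		return errno == EIO;
	return 0;
}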
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-08 17:03 ` [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset Tomas Elf
@ 2015-06-08 17:33   ` Chris Wilson
  2015-06-09 11:06     ` Tomas Elf
  2015-06-16 13:48     ` Daniel Vetter
  2015-06-11  9:14   ` Dave Gordon
  2015-06-16 13:49   ` Daniel Vetter
  2 siblings, 2 replies; 59+ messages in thread
From: Chris Wilson @ 2015-06-08 17:33 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
> In preparation for per-engine reset, add a way of setting context reset stats.
> 
> OPEN QUESTIONS:
> 1. How do we deal with get_reset_stats and the GL robustness interface when
> introducing per-engine resets?
> 
> 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
> 	does this affect context banning?

Yes. If the reset works quicker, then we can set a higher threshold for
DoS detection, but we still do need DoS detection?
 
> 	b. Do we extend the publicly available reset stats to also contain
> 	per-engine reset statistics? If so, would this break the ABI?

No. The get_reset_stats is targeted at the GL API and describing it in
terms of whether my context is guilty or has been affected. That is
orthogonal to whether the reset was on a single ring or the entire GPU -
the question is how broad we want the "affected" to be. Ideally a
per-context reset wouldn't necessarily impact others, except for the
surfaces shared between them...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-08 17:03 ` [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
@ 2015-06-08 17:36   ` Chris Wilson
  2015-06-09 11:02     ` Tomas Elf
  2015-06-16 13:44   ` Daniel Vetter
  1 sibling, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2015-06-08 17:36 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
> When submitting semaphores in execlist mode the hang checker crashes in this
> function because it is only runnable in ring submission mode. The reason this
> is of particular interest to the TDR patch series is because we use semaphores
> as a means to induce hangs during testing (which is the recommended way to
> induce hangs for gen8+). It's not clear how this is supposed to work in
> execlist mode since:
> 
> 1. This function requires a ring buffer.
> 
> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
> corresponding context, which we get from a request.
> 
> 3. Retrieving a request from the hang checker is not straightforward since that
> requires us to grab the struct_mutex in order to synchronize against the
> request retirement thread.
> 
> 4. Grabbing the struct_mutex from the hang checker is nothing that we will do
> since that puts us at risk of deadlock since a hung thread might be holding the
> struct_mutex already.
> 
> Therefore it's not obvious how we're supposed to deal with this. For now, we're
> doing an early exit from this function, which avoids any kernel panic situation
> when running our own internal TDR ULT.
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 46bcbff..40c44fc 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>  	u64 offset = 0;
>  	int i, backwards;
>  
> +	/*
> +	 * This function does not support execlist mode - any attempt to
> +	 * proceed further into this function will result in a kernel panic
> +	 * when dereferencing ring->buffer, which is not set up in execlist
> +	 * mode.
> +	 *
> +	 * The correct way of doing it would be to derive the currently
> +	 * executing ring buffer from the current context, which is derived
> +	 * from the currently running request. Unfortunately, to get the
> +	 * current request we would have to grab the struct_mutex before doing
> +	 * anything else, which would be ill-advised since some other thread
> +	 * might have grabbed it already and managed to hang itself, causing
> +	 * the hang checker to deadlock.
> +	 *
> +	 * Therefore, this function does not support execlist mode in its
> +	 * current form. Just return NULL and move on.
> +	 */
> +	if (i915.enable_execlists)
> +		return NULL;

if (ring->buffer == NULL)
	return NULL;

if (i915.enable_execlists) is a layering violation that needs to be
eradicated.
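
The suggested shape in context, as a sketch (only the guard changes;
the remainder of semaphore_waits_for() is as in the patch above):

static struct intel_engine_cs *
semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
{
	struct drm_i915_private *dev_priv = ring->dev->dev_private;
	u32 cmd, ipehr, head;
	u64 offset = 0;
	int i, backwards;

	/* Execlists does not set up a legacy ringbuffer, so a NULL
	 * ring->buffer covers that case without consulting module
	 * parameters. */
	if (ring->buffer == NULL)
		return NULL;

	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
		return NULL;

	/* ... rest of the function unchanged ... */
}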
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery.
  2015-06-08 17:03 ` [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery Tomas Elf
@ 2015-06-08 17:45   ` Chris Wilson
  2015-06-09 11:18     ` Tomas Elf
  2015-06-11  9:32     ` Dave Gordon
  0 siblings, 2 replies; 59+ messages in thread
From: Chris Wilson @ 2015-06-08 17:45 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX, Ian Lister

On Mon, Jun 08, 2015 at 06:03:28PM +0100, Tomas Elf wrote:
> 1. The i915_wedged_set function allows us to schedule three forms of hang recovery:
> 
> 	a) Legacy hang recovery: By passing e.g. -1 we trigger the legacy full
> 	GPU reset recovery path.
> 
> 	b) Single engine hang recovery: By passing an engine ID in the interval
> 	of [0, I915_NUM_RINGS) we can schedule hang recovery of any single
> 	engine assuming that the context submission consistency requirements
> 	are met (otherwise the hang recovery path will simply exit early and
> 	wait for another hang detection). The values are assumed to use up bits
> 	3:0 only since we certainly do not support as many as 16 engines.
> 
> 	This mode is supported since there are several legacy test applications
> 	that rely on this interface.

Are there? I don't see them in igt - and let's not start making debugfs
ABI.
 
> 	c) Multiple engine hang recovery: By passing in an engine flag mask in
> 	bits 31:8 (bit 8 corresponds to engine 0 = RCS, bit 9 corresponds to
> 	engine 1 = VCS etc) we can schedule any combination of engine hang
> 	recoveries as we please. For example, by passing in the value 0x3 << 8
> 	we would schedule hang recovery for engines 0 and 1 (RCS and VCS) at
> 	the same time.

Seems fine. But I don't see the reason for the extra complication.
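
To make the proposed encodings concrete, a hypothetical userspace
sketch; the debugfs path and value semantics follow the commit message
above, not merged ABI:

/* Write a recovery request to the proposed i915_wedged interface.
 * DRM minor 0 and the usual debugfs mount point are assumed.
 */
#include <stdio.h>

static int i915_wedged_write(long long val)
{
	FILE *f = fopen("/sys/kernel/debug/dri/0/i915_wedged", "w");

	if (!f)
		return -1;
	fprintf(f, "%lld\n", val);
	return fclose(f);
}

/* i915_wedged_write(-1);        legacy full GPU reset             */
/* i915_wedged_write(1);         single-engine recovery of VCS     */
/* i915_wedged_write(0x3 << 8);  engine mask: RCS and VCS together */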

> 	If bits in fields 3:0 and 31:8 are both used then single engine hang
> 	recovery mode takes precedence and bits 31:8 are ignored.
> 
> 2. The i915_wedged_get function produces a set of statistics related to:

Add it to hangcheck_info instead.

i915_wedged_get could be updated to give the ring mask of wedged rings?
If that concept exists.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 06/11] drm/i915: Disable warnings for TDR interruptions in the display driver.
  2015-06-08 17:03 ` [RFC 06/11] drm/i915: Disable warnings for TDR interruptions in the display driver Tomas Elf
@ 2015-06-08 17:53   ` Chris Wilson
  0 siblings, 0 replies; 59+ messages in thread
From: Chris Wilson @ 2015-06-08 17:53 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Mon, Jun 08, 2015 at 06:03:24PM +0100, Tomas Elf wrote:
> Now that i915_wait_request takes per-engine hang recovery into account it is
> more likely to fail and return -EAGAIN or -EIO due to hung engines (unlike
> before when it would only fail if a full GPU reset was imminent). What this
> means is that the display driver might see more frequent failures that are only
> a consequence of ongoing hang recoveries. Therefore, let's not spew a lot of
> warnings in the kernel log every time a flip fails due to an ongoing hang
> recovery, since a) This is to be expected during hang recovery and b) It
> severely degrades performance and makes the hang recovery take even longer to
> complete, which ultimately might cause the userland window compositor to fail
> because the flip is taking too long to complete and it simply gives up, leaving
> the screen in a frozen state.
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c |   16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 97922fb..128c58c 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -10356,9 +10356,21 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
>  
>  	mmio_flip = &crtc->mmio_flip;
>  	if (mmio_flip->req)
> -		WARN_ON(__i915_wait_request(mmio_flip->req,
> +	{
> +		int ret = __i915_wait_request(mmio_flip->req,
>  					    crtc->reset_counter,
> -					    false, NULL, NULL) != 0);
> +					    false, NULL, NULL);
> +
> +		/*
> +		 * If a hang has been detected then we expect
> +		 * __i915_wait_request to fail since it's probably going to be
> +		 * forced to give up the struct_mutex and try to grab it again
> +		 * once the TDR is done. Don't produce a warning in that case!
> +		 */
> +		if (ret)
> +			WARN_ON(!i915_gem_check_wedge(crtc->base.dev->dev_private,
> +					NULL, true));

Now this is plain wrong and should have an alert that your proposed
changes to __i915_wait_request was wrong.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-08 17:36   ` Chris Wilson
@ 2015-06-09 11:02     ` Tomas Elf
  0 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-09 11:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel-GFX

On 08/06/2015 18:36, Chris Wilson wrote:
> On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
>> When submitting semaphores in execlist mode the hang checker crashes in this
>> function because it is only runnable in ring submission mode. The reason this
>> is of particular interest to the TDR patch series is because we use semaphores
>> as a means to induce hangs during testing (which is the recommended way to
>> induce hangs for gen8+). It's not clear how this is supposed to work in
>> execlist mode since:
>>
>> 1. This function requires a ring buffer.
>>
>> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
>> corresponding context, which we get from a request.
>>
>> 3. Retrieving a request from the hang checker is not straightforward since that
>> requires us to grab the struct_mutex in order to synchronize against the
>> request retirement thread.
>>
>> 4. Grabbing the struct_mutex from the hang checker is nothing that we will do
>> since that puts us at risk of deadlock since a hung thread might be holding the
>> struct_mutex already.
>>
>> Therefore it's not obvious how we're supposed to deal with this. For now, we're
>> doing an early exit from this function, which avoids any kernel panic situation
>> when running our own internal TDR ULT.
>>
>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> index 46bcbff..40c44fc 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>>   	u64 offset = 0;
>>   	int i, backwards;
>>
>> +	/*
>> +	 * This function does not support execlist mode - any attempt to
>> +	 * proceed further into this function will result in a kernel panic
>> +	 * when dereferencing ring->buffer, which is not set up in execlist
>> +	 * mode.
>> +	 *
>> +	 * The correct way of doing it would be to derive the currently
>> +	 * executing ring buffer from the current context, which is derived
>> +	 * from the currently running request. Unfortunately, to get the
>> +	 * current request we would have to grab the struct_mutex before doing
>> +	 * anything else, which would be ill-advised since some other thread
>> +	 * might have grabbed it already and managed to hang itself, causing
>> +	 * the hang checker to deadlock.
>> +	 *
>> +	 * Therefore, this function does not support execlist mode in its
>> +	 * current form. Just return NULL and move on.
>> +	 */
>> +	if (i915.enable_execlists)
>> +		return NULL;
>
> if (ring->buffer == NULL)
> 	return NULL;
>
> if (i915.enable_execlists) is a layering violation that needs to be
> eradicated.
> -Chris
>

Fair enough.

Thanks,
Tomas


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-08 17:33   ` Chris Wilson
@ 2015-06-09 11:06     ` Tomas Elf
  2015-06-16 13:48     ` Daniel Vetter
  1 sibling, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-09 11:06 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel-GFX

On 08/06/2015 18:33, Chris Wilson wrote:
> On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
>> In preparation for per-engine reset, add a way of setting context reset stats.
>>
>> OPEN QUESTIONS:
>> 1. How do we deal with get_reset_stats and the GL robustness interface when
>> introducing per-engine resets?
>>
>> 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
>> 	does this affect context banning?
>
> Yes. If the reset works quicker, then we can set a higher threshold for
> DoS detection, but we still do need DoS detection?

Cool, as long as we make sure to set the context banning period such 
that we allow at least one per-engine recovery attempt and one full GPU 
reset attempt to be made. Or set it in any way that would not effectively 
disable the initial hang recovery attempt.

I'll replicate the behavior from the legacy full GPU reset path in the 
engine reset path then.

Thanks,
Tomas

>
>> 	b. Do we extend the publicly available reset stats to also contain
>> 	per-engine reset statistics? If so, would this break the ABI?
>
> No. The get_reset_stats is targeted at the GL API and describing it in
> terms of whether my context is guilty or has been affected. That is
> orthogonal to whether the reset was on a single ring or the entire GPU -
> the question is how broad we want the "affected" to be. Ideally a
> per-context reset wouldn't necessarily impact others, except for the
> surfaces shared between them...
> -Chris
>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress
  2015-06-08 17:24   ` Chris Wilson
@ 2015-06-09 11:08     ` Tomas Elf
  0 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-09 11:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel-GFX, Ian Lister

On 08/06/2015 18:24, Chris Wilson wrote:
> On Mon, Jun 08, 2015 at 06:03:23PM +0100, Tomas Elf wrote:
>> @@ -4089,11 +4130,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>>   	unsigned reset_counter;
>>   	int ret;
>>
>> -	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
>> -	if (ret)
>> -		return ret;
>> -
>> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
>> +	ret = i915_gem_wait_for_error(dev_priv);
>>   	if (ret)
>>   		return ret;
>>
>> @@ -4112,9 +4149,17 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>>   	if (target == NULL)
>>   		return 0;
>>
>> -	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
>> -	if (ret == 0)
>> -		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
>> +	if (target->ring) {
>> +		if (i915_gem_check_wedge(dev_priv, NULL, false))
>> +			return -EIO;
>> +
>> +		ret = __i915_wait_request(target, reset_counter, true, NULL,
>> +			NULL);
>> +
>> +		if (ret == 0)
>> +			queue_delayed_work(dev_priv->wq,
>> +				&dev_priv->mm.retire_work, 0);
>> +	}
>
> This breaks an important bit of ABI. throttle() is used to detect when
> the GPU is hung, even when the client is idle (i.e. when the client
> starts up or is switched to).
> -Chris
>

Yeah, this is pretty silly; we've actually phased out this change from 
GMIN but somehow it survived in my local tree. There should be no 
changes to i915_gem_ring_throttle().

Good catch!

Thanks,
Tomas


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress
  2015-06-08 17:03 ` [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress Tomas Elf
  2015-06-08 17:24   ` Chris Wilson
@ 2015-06-09 11:11   ` Chris Wilson
  1 sibling, 0 replies; 59+ messages in thread
From: Chris Wilson @ 2015-06-09 11:11 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX, Ian Lister

On Mon, Jun 08, 2015 at 06:03:23PM +0100, Tomas Elf wrote:
> @@ -1239,11 +1271,17 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  
>  		/* We need to check whether any gpu reset happened in between
>  		 * the caller grabbing the seqno and now ... */
> -		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
> +		reset_in_progress =
> +			i915_gem_check_wedge(ring->dev->dev_private, NULL, interruptible);
> +
> +		if ((reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) ||
> +		     reset_in_progress) {
> +
>  			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
>  			 * is truely gone. */
> -			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> -			if (ret == 0)
> +			if (reset_in_progress)
> +				ret = reset_in_progress;
> +			else
>  				ret = -EAGAIN;

I've had a bit more time to digest this; anything that touches this piece
of code makes me recoil in horror because, as it currently stands, it is
already buggy: it returns an error in cases where the caller cannot
possibly tolerate an error (and in circumstances where it is not actually
an error at all). This leads to the unfortunate WARNs and SIGBUS which I
have long been arguing against and trying to get fixed for about 5 years.

Obviously we need the EAGAIN in order to do the struct_mutex backoff in
the caller, but there is a class of non-blocking __i915_wait_request
users that can just wait until the reset is resolved. Having that
support would simplify a number of cases. What scares me most though is
the possibility of an EIO being reported here; those are almost entirely
wrong (if the GPU is hung, the request should be aborted and any waits
automatically completed). I'd like to be sure that TDR doesn't make EIO
handling any worse...
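
For reference, a minimal sketch of the backoff contract in question,
using the unpatched function signatures; the loop shape is illustrative
rather than actual driver code:

/* Caller enters (and, on success, returns) holding struct_mutex. On
 * -EAGAIN the lock is dropped so the reset handler can take it, then
 * the wait is retried once recovery completes.
 */
static int wait_request_with_backoff(struct drm_device *dev,
				     struct drm_i915_gem_request *req,
				     bool interruptible)
{
	struct drm_i915_private *dev_priv = dev->dev_private;
	int ret;

	for (;;) {
		unsigned reset_counter =
			atomic_read(&dev_priv->gpu_error.reset_counter);

		ret = __i915_wait_request(req, reset_counter,
					  interruptible, NULL, NULL);
		if (ret != -EAGAIN)
			return ret;

		/* reset pending: back off so recovery can proceed */
		mutex_unlock(&dev->struct_mutex);
		ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
		if (ret == 0)
			ret = i915_mutex_lock_interruptible(dev);
		if (ret)
			return ret;	/* error paths return unlocked */
	}
}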
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery.
  2015-06-08 17:45   ` Chris Wilson
@ 2015-06-09 11:18     ` Tomas Elf
  2015-06-09 12:27       ` Chris Wilson
  2015-06-11  9:32     ` Dave Gordon
  1 sibling, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-09 11:18 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel-GFX, Ian Lister

On 08/06/2015 18:45, Chris Wilson wrote:
> On Mon, Jun 08, 2015 at 06:03:28PM +0100, Tomas Elf wrote:
>> 1. The i915_wedged_set function allows us to schedule three forms of hang recovery:
>>
>> 	a) Legacy hang recovery: By passing e.g. -1 we trigger the legacy full
>> 	GPU reset recovery path.
>>
>> 	b) Single engine hang recovery: By passing an engine ID in the interval
>> 	of [0, I915_NUM_RINGS) we can schedule hang recovery of any single
>> 	engine assuming that the context submission consistency requirements
>> 	are met (otherwise the hang recovery path will simply exit early and
>> 	wait for another hang detection). The values are assumed to use up bits
>> 	3:0 only since we certainly do not support as many as 16 engines.
>>
>> 	This mode is supported since there are several legacy test applications
>> 	that rely on this interface.
>
> Are there? I don't see them in igt - and let's not start making debugfs
> ABI.

They're not in IGT, only internal to VPG. I guess we could limit these 
changes and adapt the internal test suite in VPG instead of upstreaming 
changes that only VPG validation cares about.

>
>> 	c) Multiple engine hang recovery: By passing in an engine flag mask in
>> 	bits 31:8 (bit 8 corresponds to engine 0 = RCS, bit 9 corresponds to
>> 	engine 1 = VCS etc) we can schedule any combination of engine hang
>> 	recoveries as we please. For example, by passing in the value 0x3 << 8
>> 	we would schedule hang recovery for engines 0 and 1 (RCS and VCS) at
>> 	the same time.
>
> Seems fine. But I don't see the reason for the extra complication.

I wanted to make sure that we could test multiple concurrent hang 
recoveries, but to be fair nobody is actually using this at the moment 
so unless someone actually _needs_ this we probably don't need to 
upstream it.

I guess we could leave it in its currently upstreamed form where it only 
allows full GPU reset. Or would it be of use to anyone to support 
per-engine recovery?

>
>> 	If bits in fields 3:0 and 31:8 are both used then single engine hang
>> 	recovery mode takes precedence and bits 31:8 are ignored.
>>
>> 2. The i915_wedged_get function produces a set of statistics related to:
>
> Add it to hangcheck_info instead.

Yeah, I considered that but I felt that hangcheck_info had too much text 
and it would be too much of a hassle to parse out the data. But having 
spoken to the validation guys it seems like they're fine with updating 
the parser. So I could update hangcheck_info with this new information.

>
> i915_wedged_get could be updated to give the ring mask of wedged rings?
> If that concept exists.
> -Chris
>

Nah, no need, I'll just add the information to hangcheck_info. Besides, 
wedged_get needs to provide more information than just the current 
wedged state. It also provides information about the number of resets, 
the number of watchdog timeouts etc. So it's not that easy to summarise 
it as a ring mask.

Thanks,
Tomas


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery.
  2015-06-09 11:18     ` Tomas Elf
@ 2015-06-09 12:27       ` Chris Wilson
  2015-06-09 17:28         ` Tomas Elf
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2015-06-09 12:27 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX, Ian Lister

On Tue, Jun 09, 2015 at 12:18:28PM +0100, Tomas Elf wrote:
> On 08/06/2015 18:45, Chris Wilson wrote:
> >On Mon, Jun 08, 2015 at 06:03:28PM +0100, Tomas Elf wrote:
> >>1. The i915_wedged_set function allows us to schedule three forms of hang recovery:
> >>
> >>	a) Legacy hang recovery: By passing e.g. -1 we trigger the legacy full
> >>	GPU reset recovery path.
> >>
> >>	b) Single engine hang recovery: By passing an engine ID in the interval
> >>	of [0, I915_NUM_RINGS) we can schedule hang recovery of any single
> >>	engine assuming that the context submission consistency requirements
> >>	are met (otherwise the hang recovery path will simply exit early and
> >>	wait for another hang detection). The values are assumed to use up bits
> >>	3:0 only since we certainly do not support as many as 16 engines.
> >>
> >>	This mode is supported since there are several legacy test applications
> >>	that rely on this interface.
> >
> >Are there? I don't see them in igt - and let's not start making debugfs
> >ABI.
> 
> They're not in IGT, only internal to VPG. I guess we could limit
> these changes and adapt the internal test suite in VPG instead of
> upstreaming changes that only VPG validation cares about.

Also note that there are quite a few concurrent hang tests in igt that
this series should aim to fix.

You will be expected to provide basic validation tests for igt as well,
which will be using the debugfs interface I guess.

> >>	c) Multiple engine hang recovery: By passing in an engine flag mask in
> >>	bits 31:8 (bit 8 corresponds to engine 0 = RCS, bit 9 corresponds to
> >>	engine 1 = VCS etc) we can schedule any combination of engine hang
> >>	recoveries as we please. For example, by passing in the value 0x3 << 8
> >>	we would schedule hang recovery for engines 0 and 1 (RCS and VCS) at
> >>	the same time.
> >
> >Seems fine. But I don't see the reason for the extra complication.
> 
> I wanted to make sure that we could test multiple concurrent hang
> recoveries, but to be fair nobody is actually using this at the
> moment so unless someone actually _needs_ this we probably don't
> need to upstream it.
> 
> I guess we could leave it in its currently upstreamed form where it
> only allows full GPU reset. Or would it be of use to anyone to
> support per-engine recovery?

I like the per-engine flags, I was just arguing that having the
interface do both seems overly complicated (when the existing behaviour
can be retrained to use -1).

> >>	If bits in fields 3:0 and 31:8 are both used then single engine hang
> >>	recovery mode takes precedence and bits 31:8 are ignored.
> >>
> >>2. The i915_wedged_get function produces a set of statistics related to:
> >
> >Add it to hangcheck_info instead.
> 
> Yeah, I considered that but I felt that hangcheck_info had too much
> text and it would be too much of a hassle to parse out the data. But
> having spoken to the validation guys it seems like they're fine with
> updating the parser. So I could update hangcheck_info with this new
> information.

It can more or less just be searching for the start of your info block.
A quick string search on the new debugfs name, then the old one, could even
provide backwards compatibility in the test.
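
Something along these lines, as a hypothetical parser shim (file names
and the block marker are assumptions):

/* Read hangcheck statistics, falling back to the older separate
 * debugfs entry when the consolidated one is missing, then scan for
 * the start of the TDR block.
 */
#include <stdio.h>
#include <string.h>

static FILE *open_hangcheck(void)
{
	FILE *f = fopen("/sys/kernel/debug/dri/0/i915_hangcheck_info", "r");

	if (!f)
		f = fopen("/sys/kernel/debug/dri/0/i915_hangcheck_read", "r");
	return f;
}

static int seek_tdr_block(FILE *f, char *line, int len)
{
	while (fgets(line, len, f))
		if (strstr(line, "TDR"))	/* assumed block header */
			return 1;
	return 0;
}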

> >i915_wedged_get could be updated to give the ring mask of wedged rings?
> >If that concept exists.
> >-Chris
> >
> 
> Nah, no need, I'll just add the information to hangcheck_info.
> Besides, wedged_get needs to provide more information than just the
> current wedged state. It also provides information about the number
> of resets, the number of watchdog timeouts etc. So it's not that
> easy to summarise it as a ring mask.

We are still talking about today's single valued debugfs/i915_wedged
rather than the extended info?

Oh, whilst I am thinking of it, you could also add the reset stats to
the error state.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery.
  2015-06-09 12:27       ` Chris Wilson
@ 2015-06-09 17:28         ` Tomas Elf
  0 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-09 17:28 UTC (permalink / raw)
  To: Chris Wilson, Intel-GFX, Ian Lister

On 09/06/2015 13:27, Chris Wilson wrote:
> On Tue, Jun 09, 2015 at 12:18:28PM +0100, Tomas Elf wrote:
>> On 08/06/2015 18:45, Chris Wilson wrote:
>>> On Mon, Jun 08, 2015 at 06:03:28PM +0100, Tomas Elf wrote:
>>>> 1. The i915_wedged_set function allows us to schedule three forms of hang recovery:
>>>>
>>>> 	a) Legacy hang recovery: By passing e.g. -1 we trigger the legacy full
>>>> 	GPU reset recovery path.
>>>>
>>>> 	b) Single engine hang recovery: By passing an engine ID in the interval
>>>> 	of [0, I915_NUM_RINGS) we can schedule hang recovery of any single
>>>> 	engine assuming that the context submission consistency requirements
>>>> 	are met (otherwise the hang recovery path will simply exit early and
>>>> 	wait for another hang detection). The values are assumed to use up bits
>>>> 	3:0 only since we certainly do not support as many as 16 engines.
>>>>
>>>> 	This mode is supported since there are several legacy test applications
>>>> 	that rely on this interface.
>>>
>>> Are there? I don't see them in igt - and let's not start making debugfs
>>> ABI.
>>
>> They're not in IGT, only internal to VPG. I guess we could limit
>> these changes and adapt the internal test suite in VPG instead of
>> upstreaming changes that only VPG validation cares about.
>
> Also note that there are quite a few concurrent hang tests in igt that
> this series should aim to fix.
>
> You will be expected to provide basic validation tests for igt as well,
> which will be using the debugfs interface I guess.
>

Yeah, once we get past the RFC stage I can start looking into the IGTs. 
Obviously, the existing tests must not break and I'll add tests for 
per-engine recovery, full GPU reset promotion and watchdog timeout.

Daniel Vetter has already said that he wants me to add more hang 
concurrency tests that run a wider variety of tests with intermittent 
hangs triggering different hang recovery modes. Having already dealt with 
long-duration operations testing with concurrent rendering for more than 
a year now during TDR development I know what kind of havoc the TDR 
mechanism can raise when interacting with stuff like display driver and 
shrinker, so I'm hoping that we can distill some of those system-level 
tests into a smaller IGT form that can be run more easily and perhaps 
more deterministically.

I haven't started looking into that yet, though, but it will have to be 
done once I start submitting the patch series proper.

>>>> 	c) Multiple engine hang recovery: By passing in an engine flag mask in
>>>> 	bits 31:8 (bit 8 corresponds to engine 0 = RCS, bit 9 corresponds to
>>>> 	engine 1 = VCS etc) we can schedule any combination of engine hang
>>>> 	recoveries as we please. For example, by passing in the value 0x3 << 8
>>>> 	we would schedule hang recovery for engines 0 and 1 (RCS and VCS) at
>>>> 	the same time.
>>>
>>> Seems fine. But I don't see the reason for the extra complication.
>>
>> I wanted to make sure that we could test multiple concurrent hang
>> recoveries, but to be fair nobody is actually using this at the
>> moment so unless someone actually _needs_ this we probably don't
>> need to upstream it.
>>
>> I guess we could leave it in its currently upstreamed form where it
>> only allows full GPU reset. Or would it be of use to anyone to
>> support per-engine recovery?
>
> I like the per-engine flags, I was just arguing that having the
> interface do both seems overly complicated (when the existing behaviour
> can be retrained to use -1).
>

Sure, we'll just go with the per-engine flags then.

>>>> 	If bits in fields 3:0 and 31:8 are both used then single engine hang
>>>> 	recovery mode takes precedence and bits 31:8 are ignored.
>>>>
>>>> 2. The i915_wedged_get function produces a set of statistics related to:
>>>
>>> Add it to hangcheck_info instead.
>>
>> Yeah, I considered that but I felt that hangcheck_info had too much
>> text and it would be too much of a hassle to parse out the data. But
>> having spoken to the validation guys it seems like they're fine with
>> updating the parser. So I could update hangcheck_info with this new
>> information.
>
> It can more or less just be searching for the start of your info block.
> A quick string search on the new debugfs name, then the old one, could even
> provide backwards compatibility in the test.
>
>>> i915_wedged_get could be updated to give the ring mask of wedged rings?
>>> If that concept exists.
>>> -Chris
>>>
>>
>> Nah, no need, I'll just add the information to hangcheck_info.
>> Besides, wedged_get needs to provide more information than just the
>> current wedged state. It also provides information about the number
>> of resets, the number of watchdog timeouts etc. So it's not that
>> easy to summarise it as a ring mask.
>
> We are still talking about today's single valued debugfs/i915_wedged
> rather than the extended info?

Gah, you're right, I completely screwed up there (both in the patch and 
in the subsequent discussion): I'm not talking about i915_wedged_get 
(which produces a single value), I'm talking about i915_hangcheck_read 
(which produces extended info). For some reason my brain keeps 
convincing me that I've changed i915_wedged_set and i915_wedged_get, 
though (probably because it's one setter function and one getter 
function of sorts, but they're named differently). So, to make sure that 
we're all on the same page here:

* I've updated i915_wedged_set. This one will be changed to only accept 
engine flags.
* I've added i915_hangcheck_read. This function will be removed and 
replaced by i915_hangcheck_info instead.
* I won't be touching i915_wedged_get. Unless that is something that is 
requested. The VPG tests don't need it at least.

Anything else?

>
> Oh, whilst I am thinking of it, you could also add the reset stats to
> the error state.

Sure.

Thanks,
Tomas

> -Chris
>


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-08 17:03 ` [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset Tomas Elf
  2015-06-08 17:33   ` Chris Wilson
@ 2015-06-11  9:14   ` Dave Gordon
  2015-06-16 13:49   ` Daniel Vetter
  2 siblings, 0 replies; 59+ messages in thread
From: Dave Gordon @ 2015-06-11  9:14 UTC (permalink / raw)
  To: Tomas Elf, Intel-GFX

On 08/06/15 18:03, Tomas Elf wrote:
> In preparation for per-engine reset, add a way of setting context reset stats.
> 
> OPEN QUESTIONS:
> 1. How do we deal with get_reset_stats and the GL robustness interface when
> introducing per-engine resets?
> 
> 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
> 	does this affect context banning?
> 
> 	b. Do we extend the publicly available reset stats to also contain
> 	per-engine reset statistics? If so, would this break the ABI?
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
>  drivers/gpu/drm/i915/i915_gem.c |   14 ++++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 47be4a5..ab5dfdc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2781,6 +2781,8 @@ static inline bool i915_stop_ring_allow_warn(struct drm_i915_private *dev_priv)
>  }
>  
>  void i915_gem_reset(struct drm_device *dev);
> +void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> +			   struct intel_engine_cs *engine);
>  bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
>  int __must_check i915_gem_init(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8ce363a..4c88e5c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2676,6 +2676,20 @@ void i915_gem_reset(struct drm_device *dev)
>  	i915_gem_restore_fences(dev);
>  }
>  
> +void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> +			   struct intel_engine_cs *engine)
> +{
> +	u32 completed_seqno;
> +	struct drm_i915_gem_request *req;
> +
> +	completed_seqno = engine->get_seqno(engine, false);
> +
> +	/* Find pending batch buffers and mark them as such  */
> +	list_for_each_entry(req, &engine->request_list, list)
> +	        if (req && (req->seqno > completed_seqno))
> +	                i915_set_reset_status(dev_priv, req->ctx, false);

While we're here, can you change this function name; it's one of the
worst I've ever seen. How about update_ctx_hang_stats() which is what it
actually does? (Or ctx_hang_stats_update() if you prefer RPN!).

And if the function i915_gem_reset_ring_status() still exists please
change its name too; it doesn't reset the status of a ring!

.Dave.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery.
  2015-06-08 17:45   ` Chris Wilson
  2015-06-09 11:18     ` Tomas Elf
@ 2015-06-11  9:32     ` Dave Gordon
  1 sibling, 0 replies; 59+ messages in thread
From: Dave Gordon @ 2015-06-11  9:32 UTC (permalink / raw)
  To: Chris Wilson, Tomas Elf, Intel-GFX, Ian Lister

On 08/06/15 18:45, Chris Wilson wrote:
> On Mon, Jun 08, 2015 at 06:03:28PM +0100, Tomas Elf wrote:
>> 1. The i915_wedged_set function allows us to schedule three forms of hang recovery:
>>
>> 	a) Legacy hang recovery: By passing e.g. -1 we trigger the legacy full
>> 	GPU reset recovery path.
>>
>> 	b) Single engine hang recovery: By passing an engine ID in the interval
>> 	of [0, I915_NUM_RINGS) we can schedule hang recovery of any single
>> 	engine assuming that the context submission consistency requirements
>> 	are met (otherwise the hang recovery path will simply exit early and
>> 	wait for another hang detection). The values are assumed to use up bits
>> 	3:0 only since we certainly do not support as many as 16 engines.
>>
>> 	This mode is supported since there are several legacy test applications
>> 	that rely on this interface.
> 
> Are there? I don't see them in igt - and let's not start making debugfs
> ABI.

AFAIK there are currently only two meaningful values to write to
i915_wedged: zero, which triggers error capture without resetting, and
any non-zero value, which triggers error capture followed by reset.

.Dave.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-08 17:03 ` [RFC 02/11] drm/i915: Introduce uevent for full GPU reset Tomas Elf
@ 2015-06-16 13:43   ` Daniel Vetter
  2015-06-16 15:43     ` Tomas Elf
  0 siblings, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-16 13:43 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Mon, Jun 08, 2015 at 06:03:20PM +0100, Tomas Elf wrote:
> The TDR ULT used to validate this patch series requires a special uevent for
> full GPU resets in order to distinguish between different kinds of resets.
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>

Why duplicate the uevent we send out from i915_reset_and_wakeup? At least
I can't spot what this gets us in addition to the existing one.
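
As an aside, the consumer side could look like this hypothetical
libudev sketch; note that kobject_uevent_env() splits each string at
the first '=', so "GPU RESET=0" yields a property key containing a
space, which udev consumers may not expect:

/* Watch drm change uevents and report the "GPU RESET" property added
 * by this patch. Build with -ludev.
 */
#include <libudev.h>
#include <poll.h>
#include <stdio.h>

int main(void)
{
	struct udev *udev = udev_new();
	struct udev_monitor *mon =
		udev_monitor_new_from_netlink(udev, "udev");
	struct pollfd pfd;

	udev_monitor_filter_add_match_subsystem_devtype(mon, "drm", NULL);
	udev_monitor_enable_receiving(mon);
	pfd.fd = udev_monitor_get_fd(mon);
	pfd.events = POLLIN;

	for (;;) {
		struct udev_device *dev;
		const char *val;

		if (poll(&pfd, 1, -1) <= 0)
			continue;
		dev = udev_monitor_receive_device(mon);
		if (!dev)
			continue;
		val = udev_device_get_property_value(dev, "GPU RESET");
		if (val)
			printf("GPU reset uevent, value=%s\n", val);
		udev_device_unref(dev);
	}
}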
-Daniel

> ---
>  drivers/gpu/drm/i915/intel_uncore.c |   29 ++++++++++++++++++++++-------
>  1 file changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index d96d15f..770f526 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1449,18 +1449,33 @@ static int gen6_do_reset(struct drm_device *dev)
>  
>  int intel_gpu_reset(struct drm_device *dev)
>  {
> -	if (INTEL_INFO(dev)->gen >= 6)
> -		return gen6_do_reset(dev);
> +	int ret = -ENODEV;
> +	int gen = INTEL_INFO(dev)->gen;
> +
> +	if (gen >= 6)
> +		ret = gen6_do_reset(dev);
>  	else if (IS_GEN5(dev))
> -		return ironlake_do_reset(dev);
> +		ret = ironlake_do_reset(dev);
>  	else if (IS_G4X(dev))
> -		return g4x_do_reset(dev);
> +		ret = g4x_do_reset(dev);
>  	else if (IS_G33(dev))
> -		return g33_do_reset(dev);
> +		ret = g33_do_reset(dev);
>  	else if (INTEL_INFO(dev)->gen >= 3)
> -		return i915_do_reset(dev);
> +		ret = i915_do_reset(dev);
>  	else
> -		return -ENODEV;
> +		WARN(1, "Full GPU reset not supported on gen %d\n", gen);
> +
> +	if (!ret) {
> +		char *reset_event[2];
> +
> +		reset_event[1] = NULL;
> +		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
> +		kobject_uevent_env(&dev->primary->kdev->kobj,
> +				KOBJ_CHANGE, reset_event);
> +		kfree(reset_event[0]);
> +	}
> +
> +	return ret;
>  }
>  
>  void intel_uncore_check_errors(struct drm_device *dev)
> -- 
> 1.7.9.5
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-08 17:03 ` [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
  2015-06-08 17:36   ` Chris Wilson
@ 2015-06-16 13:44   ` Daniel Vetter
  2015-06-16 15:46     ` Tomas Elf
  1 sibling, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-16 13:44 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
> When submitting semaphores in execlist mode the hang checker crashes in this
> function because it is only runnable in ring submission mode. The reason this
> is of particular interest to the TDR patch series is because we use semaphores
> as a means to induce hangs during testing (which is the recommended way to
> induce hangs for gen8+). It's not clear how this is supposed to work in
> execlist mode since:
> 
> 1. This function requires a ring buffer.
> 
> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
> corresponding context, which we get from a request.
> 
> 3. Retrieving a request from the hang checker is not straightforward since that
> requires us to grab the struct_mutex in order to synchronize against the
> request retirement thread.
> 
> 4. Grabbing the struct_mutex from the hang checker is nothing that we will do
> since that puts us at risk of deadlock since a hung thread might be holding the
> struct_mutex already.
> 
> Therefore it's not obvious how we're supposed to deal with this. For now, we're
> doing an early exit from this function, which avoids any kernel panic situation
> when running our own internal TDR ULT.
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>

We should have a Testcase: line here which mentions the igt testcase that
provokes this bug. Or we need to fill this gap asap.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 46bcbff..40c44fc 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>  	u64 offset = 0;
>  	int i, backwards;
>  
> +	/*
> +	 * This function does not support execlist mode - any attempt to
> +	 * proceed further into this function will result in a kernel panic
> +	 * when dereferencing ring->buffer, which is not set up in execlist
> +	 * mode.
> +	 *
> +	 * The correct way of doing it would be to derive the currently
> +	 * executing ring buffer from the current context, which is derived
> +	 * from the currently running request. Unfortunately, to get the
> +	 * current request we would have to grab the struct_mutex before doing
> +	 * anything else, which would be ill-advised since some other thread
> +	 * might have grabbed it already and managed to hang itself, causing
> +	 * the hang checker to deadlock.
> +	 *
> +	 * Therefore, this function does not support execlist mode in its
> +	 * current form. Just return NULL and move on.
> +	 */
> +	if (i915.enable_execlists)
> +		return NULL;
> +
>  	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
>  	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
>  		return NULL;
> -- 
> 1.7.9.5
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-08 17:33   ` Chris Wilson
  2015-06-09 11:06     ` Tomas Elf
@ 2015-06-16 13:48     ` Daniel Vetter
  2015-06-16 13:54       ` Chris Wilson
  1 sibling, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-16 13:48 UTC (permalink / raw)
  To: Chris Wilson, Tomas Elf, Intel-GFX

On Mon, Jun 08, 2015 at 06:33:59PM +0100, Chris Wilson wrote:
> On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
> > In preparation for per-engine reset, add a way of setting context reset stats.
> > 
> > OPEN QUESTIONS:
> > 1. How do we deal with get_reset_stats and the GL robustness interface when
> > introducing per-engine resets?
> > 
> > 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
> > 	does this affect context banning?
> 
> Yes. If the reset works quicker, then we can set a higher threshold for
> DoS detection, but we still do need DoS detection?
>  
> > 	b. Do we extend the publicly available reset stats to also contain
> > 	per-engine reset statistics? If so, would this break the ABI?
> 
> No. The get_reset_stats is targeted at the GL API and describing it in
> terms of whether my context is guilty or has been affected. That is
> orthogonal to whether the reset was on a single ring or the entire GPU -
> the question is how broad we want the "affected" to be. Ideally a
> per-context reset wouldn't necessarily impact others, except for the
> surfaces shared between them...

GL computes sharing sets itself; the kernel only tells it whether a given
context has been victimized, i.e. one of its batches was not properly
executed due to a reset after a hang.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-08 17:03 ` [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset Tomas Elf
  2015-06-08 17:33   ` Chris Wilson
  2015-06-11  9:14   ` Dave Gordon
@ 2015-06-16 13:49   ` Daniel Vetter
  2015-06-16 15:54     ` Tomas Elf
  2 siblings, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-16 13:49 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
> In preparation for per-engine reset add a way for setting context reset stats.
> 
> OPEN QUESTIONS:
> 1. How do we deal with get_reset_stats and the GL robustness interface when
> introducing per-engine resets?
> 
> 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
> 	does this affect context banning?
> 
> 	b. Do we extend the publicly available reset stats to also contain
> 	per-engine reset statistics? If so, would this break the ABI?
> 
> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
>  drivers/gpu/drm/i915/i915_gem.c |   14 ++++++++++++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 47be4a5..ab5dfdc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2781,6 +2781,8 @@ static inline bool i915_stop_ring_allow_warn(struct drm_i915_private *dev_priv)
>  }
>  
>  void i915_gem_reset(struct drm_device *dev);
> +void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> +			   struct intel_engine_cs *engine);
>  bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
>  int __must_check i915_gem_init(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8ce363a..4c88e5c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2676,6 +2676,20 @@ void i915_gem_reset(struct drm_device *dev)
>  	i915_gem_restore_fences(dev);
>  }
>  
> +void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> +			   struct intel_engine_cs *engine)
> +{
> +	u32 completed_seqno;
> +	struct drm_i915_gem_request *req;
> +
> +	completed_seqno = engine->get_seqno(engine, false);
> +
> +	/* Find pending batch buffers and mark them as such  */
> +	list_for_each_entry(req, &engine->request_list, list)
> +	        if (req && (req->seqno > completed_seqno))
> +	                i915_set_reset_status(dev_priv, req->ctx, false);
> +}

Please don't add dead code since it makes it impossible to review the
patch without peeking ahead. And that makes the split-up useless - the
point of splitting patches is to make review easier by presenting
logically self-contained small changes, not to evenly spread out changes
across a lot of mails.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-16 13:48     ` Daniel Vetter
@ 2015-06-16 13:54       ` Chris Wilson
  2015-06-16 15:55         ` Daniel Vetter
  2015-06-18 11:12         ` Dave Gordon
  0 siblings, 2 replies; 59+ messages in thread
From: Chris Wilson @ 2015-06-16 13:54 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On Tue, Jun 16, 2015 at 03:48:09PM +0200, Daniel Vetter wrote:
> On Mon, Jun 08, 2015 at 06:33:59PM +0100, Chris Wilson wrote:
> > On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
> > > In preparation for per-engine reset add a way for setting context reset stats.
> > > 
> > > OPEN QUESTIONS:
> > > 1. How do we deal with get_reset_stats and the GL robustness interface when
> > > introducing per-engine resets?
> > > 
> > > 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
> > > 	does this affect context banning?
> > 
> > Yes. If the reset works quicker, then we can set a higher threshold for
> > DoS detection, but we still do need DoS detection?
> >  
> > > 	b. Do we extend the publicly available reset stats to also contain
> > > 	per-engine reset statistics? If so, would this break the ABI?
> > 
> > No. The get_reset_stats is targeted at the GL API and describing it in
> > terms of whether my context is guilty or has been affected. That is
> > orthogonal to whether the reset was on a single ring or the entire GPU -
> > the question is how broad we want the "affected" to be. Ideally a
> > per-context reset wouldn't necessarily impact others, except for the
> > surfaces shared between them...
> 
> GL computes sharing sets itself; the kernel only tells it whether a given
> context has been victimized, i.e. one of its batches was not properly
> executed due to a reset after a hang.

So you don't think we should delete all pending requests that depend
upon state from the hung request?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-16 13:43   ` Daniel Vetter
@ 2015-06-16 15:43     ` Tomas Elf
  2015-06-16 16:55       ` Chris Wilson
  0 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-16 15:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 16/06/2015 14:43, Daniel Vetter wrote:
> On Mon, Jun 08, 2015 at 06:03:20PM +0100, Tomas Elf wrote:
>> The TDR ULT used to validate this patch series requires a special uevent for
>> full GPU resets in order to distinguish between different kinds of resets.
>>
>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>
> Why duplicate the uevent we send out from i915_reset_and_wakeup? At least
> I can't spot what this gets us in addition to the existing one.
> -Daniel

Look at this line:
 >> +		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");

It doesn't exist in reset_and_wakeup (specifically, the "GPU RESET=0" 
part). It's a uevent that happens at the time of the actual GPU reset 
(GDRST register write). In the subsequent TDR commit we add another one 
at the point of the actual engine reset, which also includes information 
about what exact engine was reset.

The uevents in reset_and_wakeup only tell the user that an error has 
been detected and that some kind of reset has happened; these new 
uevents specify exactly what kind of reset has happened. This particular 
one on its own is not very meaningful since there is only one 
supported form of reset at this point, but once we add engine reset 
support it's useful to be able to discern the types of resets from each 
other (GPU reset, RCS engine reset, VCS engine reset, VCS2 engine reset, 
BCS engine reset, VECS engine reset).
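
(As an illustration of that follow-up engine uevent, mirroring the
kasprintf/kobject_uevent_env pattern in the hunk quoted below - the
exact event string here is a guess, not the final one:)

	char *reset_event[2] = { NULL, NULL };

	/* Hypothetical per-engine variant: name the engine that was
	 * just reset, e.g. "RESET RING=vcs". */
	reset_event[0] = kasprintf(GFP_KERNEL, "RESET RING=%s", ring->name);
	if (reset_event[0]) {
		kobject_uevent_env(&dev->primary->kdev->kobj,
				   KOBJ_CHANGE, reset_event);
		kfree(reset_event[0]);
	}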

Does that make sense?

Thanks,
Tomas

>
>> ---
>>   drivers/gpu/drm/i915/intel_uncore.c |   29 ++++++++++++++++++++++-------
>>   1 file changed, 22 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index d96d15f..770f526 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -1449,18 +1449,33 @@ static int gen6_do_reset(struct drm_device *dev)
>>
>>   int intel_gpu_reset(struct drm_device *dev)
>>   {
>> -	if (INTEL_INFO(dev)->gen >= 6)
>> -		return gen6_do_reset(dev);
>> +	int ret = -ENODEV;
>> +	int gen = INTEL_INFO(dev)->gen;
>> +
>> +	if (gen >= 6)
>> +		ret = gen6_do_reset(dev);
>>   	else if (IS_GEN5(dev))
>> -		return ironlake_do_reset(dev);
>> +		ret = ironlake_do_reset(dev);
>>   	else if (IS_G4X(dev))
>> -		return g4x_do_reset(dev);
>> +		ret = g4x_do_reset(dev);
>>   	else if (IS_G33(dev))
>> -		return g33_do_reset(dev);
>> +		ret = g33_do_reset(dev);
>>   	else if (INTEL_INFO(dev)->gen >= 3)
>> -		return i915_do_reset(dev);
>> +		ret = i915_do_reset(dev);
>>   	else
>> -		return -ENODEV;
>> +		WARN(1, "Full GPU reset not supported on gen %d\n", gen);
>> +
>> +	if (!ret) {
>> +		char *reset_event[2];
>> +
>> +		reset_event[1] = NULL;
>> +		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
>> +		kobject_uevent_env(&dev->primary->kdev->kobj,
>> +				KOBJ_CHANGE, reset_event);
>> +		kfree(reset_event[0]);
>> +	}
>> +
>> +	return ret;
>>   }
>>
>>   void intel_uncore_check_errors(struct drm_device *dev)
>> --
>> 1.7.9.5
>>
>


* Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-16 13:44   ` Daniel Vetter
@ 2015-06-16 15:46     ` Tomas Elf
  2015-06-16 16:50       ` Chris Wilson
  2015-06-17 11:43       ` Daniel Vetter
  0 siblings, 2 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-16 15:46 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 16/06/2015 14:44, Daniel Vetter wrote:
> On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
>> When submitting semaphores in execlist mode the hang checker crashes in this
>> function because it is only runnable in ring submission mode. The reason this
>> is of particular interest to the TDR patch series is because we use semaphores
>> as a means to induce hangs during testing (which is the recommended way to
>> induce hangs for gen8+). It's not clear how this is supposed to work in
>> execlist mode since:
>>
>> 1. This function requires a ring buffer.
>>
>> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
>> corresponding context, which we get from a request.
>>
>> 3. Retrieving a request from the hang checker is not straightforward since that
>> requires us to grab the struct_mutex in order to synchronize against the
>> request retirement thread.
>>
>> 4. Grabbing the struct_mutex from the hang checker is not something we will do
>> since that puts us at risk of deadlock: a hung thread might be holding the
>> struct_mutex already.
>>
>> Therefore it's not obvious how we're supposed to deal with this. For now, we're
>> doing an early exit from this function, which avoids any kernel panic situation
>> when running our own internal TDR ULT.
>>
>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>
> We should have a Testcase: line here which mentions the igt testcase which
> provoke this bug. Or we need to fill this gap asap.
> -Daniel

You know this better than I do: Is there an IGT test that submits a 
semaphore in execlist mode? Because that's all you need to do to 
reproduce this. We could certainly add one if there is none like that 
already.

Thanks,
Tomas

>
>> ---
>>   drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> index 46bcbff..40c44fc 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
>>   	u64 offset = 0;
>>   	int i, backwards;
>>
>> +	/*
>> +	 * This function does not support execlist mode - any attempt to
>> +	 * proceed further into this function will result in a kernel panic
>> +	 * when dereferencing ring->buffer, which is not set up in execlist
>> +	 * mode.
>> +	 *
>> +	 * The correct way of doing it would be to derive the currently
>> +	 * executing ring buffer from the current context, which is derived
>> +	 * from the currently running request. Unfortunately, to get the
>> +	 * current request we would have to grab the struct_mutex before doing
>> +	 * anything else, which would be ill-advised since some other thread
>> +	 * might have grabbed it already and managed to hang itself, causing
>> +	 * the hang checker to deadlock.
>> +	 *
>> +	 * Therefore, this function does not support execlist mode in its
>> +	 * current form. Just return NULL and move on.
>> +	 */
>> +	if (i915.enable_execlists)
>> +		return NULL;
>> +
>>   	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
>>   	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
>>   		return NULL;
>> --
>> 1.7.9.5
>>
>


* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-16 13:49   ` Daniel Vetter
@ 2015-06-16 15:54     ` Tomas Elf
  2015-06-17 11:51       ` Daniel Vetter
  0 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-16 15:54 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 16/06/2015 14:49, Daniel Vetter wrote:
> On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
>> In preparation for per-engine reset add a way for setting context reset stats.
>>
>> OPEN QUESTIONS:
>> 1. How do we deal with get_reset_stats and the GL robustness interface when
>> introducing per-engine resets?
>>
>> 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
>> 	does this affect context banning?
>>
>> 	b. Do we extend the publicly available reset stats to also contain
>> 	per-engine reset statistics? If so, would this break the ABI?
>>
>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h |    2 ++
>>   drivers/gpu/drm/i915/i915_gem.c |   14 ++++++++++++++
>>   2 files changed, 16 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 47be4a5..ab5dfdc 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2781,6 +2781,8 @@ static inline bool i915_stop_ring_allow_warn(struct drm_i915_private *dev_priv)
>>   }
>>
>>   void i915_gem_reset(struct drm_device *dev);
>> +void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
>> +			   struct intel_engine_cs *engine);
>>   bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
>>   int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
>>   int __must_check i915_gem_init(struct drm_device *dev);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index 8ce363a..4c88e5c 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2676,6 +2676,20 @@ void i915_gem_reset(struct drm_device *dev)
>>   	i915_gem_restore_fences(dev);
>>   }
>>
>> +void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
>> +			   struct intel_engine_cs *engine)
>> +{
>> +	u32 completed_seqno;
>> +	struct drm_i915_gem_request *req;
>> +
>> +	completed_seqno = engine->get_seqno(engine, false);
>> +
>> +	/* Find pending batch buffers and mark them as such  */
>> +	list_for_each_entry(req, &engine->request_list, list)
>> +	        if (req && (req->seqno > completed_seqno))
>> +	                i915_set_reset_status(dev_priv, req->ctx, false);
>> +}
>
> Please don't add dead code since it makes it impossible to review the
> patch without peeking ahead. And that makes the split-up useless - the
> point of splitting patches is to make review easier by presenting
> logically self-contained small changes, not to evenly spread out changes
> across a lot of mails.
> -Daniel

I did actually split this out from the main TDR patch (drm/i915: Adding 
TDR / per-engine reset support for gen8) by mistake. But since this is an 
RFC series, which I thought had the purpose of acting as material for a 
design discussion rather than serving as an actual code submission, I 
didn't spend too much time thinking about splitting the patch series 
into sensible chunks. If that is a problem I would expect people to take 
issue with the fact that e.g. the main TDR patch is a huge, monolithic 
chunk of code spanning more than 2000 lines. Obviously, that will be 
subdivided sensibly at a later stage and the code in this patch mail 
will be properly associated with the calling code.

Is it ok if we leave the patch subdivision discussion to after the 
initial RFC stage or how do these things typically work at this point in 
the process?

Thanks,
Tomas

>


* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-16 13:54       ` Chris Wilson
@ 2015-06-16 15:55         ` Daniel Vetter
  2015-06-18 11:12         ` Dave Gordon
  1 sibling, 0 replies; 59+ messages in thread
From: Daniel Vetter @ 2015-06-16 15:55 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Tomas Elf, Intel-GFX

On Tue, Jun 16, 2015 at 02:54:49PM +0100, Chris Wilson wrote:
> On Tue, Jun 16, 2015 at 03:48:09PM +0200, Daniel Vetter wrote:
> > On Mon, Jun 08, 2015 at 06:33:59PM +0100, Chris Wilson wrote:
> > > On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
> > > > In preparation for per-engine reset add a way for setting context reset stats.
> > > > 
> > > > OPEN QUESTIONS:
> > > > 1. How do we deal with get_reset_stats and the GL robustness interface when
> > > > introducing per-engine resets?
> > > > 
> > > > 	a. Do we set contexts that cause per-engine resets as guilty? If so, how
> > > > 	does this affect context banning?
> > > 
> > > Yes. If the reset works quicker, then we can set a higher threshold for
> > > DoS detection, but we still do need DoS detection?
> > >  
> > > > 	b. Do we extend the publicly available reset stats to also contain
> > > > 	per-engine reset statistics? If so, would this break the ABI?
> > > 
> > > No. The get_reset_stats is targeted at the GL API and describing it in
> > > terms of whether my context is guilty or has been affected. That is
> > > orthogonal to whether the reset was on a single ring or the entire GPU -
> > > the question is how broad we want the "affected" to be. Ideally a
> > > per-context reset wouldn't necessarily impact others, except for the
> > > surfaces shared between them...
> > 
> > GL computes sharing sets itself; the kernel only tells it whether a given
> > context has been victimized, i.e. one of its batches was not properly
> > executed due to a reset after a hang.
> 
> So you don't think we should delete all pending requests that depend
> upon state from the hung request?

Tbh I haven't fully thought through what happens with partial resets.
Looking into the future with hardware faulting/svm it's clear that soonish
the kernel won't even be in a position to know dependencies. And userspace
already needs to take any kind of texture sharing into account when
computing certain arb_robustness values.

Given that, I'm leaning towards a lean implementation in the kernel of only
marking the actual victim batches/contexts and simply continuing to
execute everything else. That carries some risk of ending up in continual
resets if a bit of corruption causes all follow-up batches to fail, but
that's something we need to be able to handle (using a full-blown reset
where we throw away all the batches) anyway. And eventually even
escalating to refusing gpu accesses to repeat offenders.
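
(For reference, a sketch of that lean marking pass in the style of the
existing i915_gem_reset code - the helper below is illustrative, not the
actual patch; it assumes the first incomplete request is the guilty one:)

	/* Mark the first incomplete request's context as guilty and every
	 * request queued behind it as an innocent victim. */
	static void mark_guilty_and_victims(struct drm_i915_private *dev_priv,
					    struct intel_engine_cs *ring)
	{
		struct drm_i915_gem_request *request;
		bool first = true;

		list_for_each_entry(request, &ring->request_list, list) {
			if (i915_seqno_passed(ring->get_seqno(ring, false),
					      request->seqno))
				continue; /* already completed, leave it be */

			i915_set_reset_status(dev_priv, request->ctx, first);
			first = false;
		}
	}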

But definitely something we need to decide upon, and something which needs
to be carefully tested with nasty igts for all corner cases. And
preferably also at least some basic multi-context testcases on top of
mesa/libva robustness.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-16 15:46     ` Tomas Elf
@ 2015-06-16 16:50       ` Chris Wilson
  2015-06-16 17:07         ` Tomas Elf
  2015-06-17 11:43       ` Daniel Vetter
  1 sibling, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2015-06-16 16:50 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 16, 2015 at 04:46:05PM +0100, Tomas Elf wrote:
> On 16/06/2015 14:44, Daniel Vetter wrote:
> >On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
> >>When submitting semaphores in execlist mode the hang checker crashes in this
> >>function because it is only runnable in ring submission mode. The reason this
> >>is of particular interest to the TDR patch series is because we use semaphores
> >>as a means to induce hangs during testing (which is the recommended way to
> >>induce hangs for gen8+). It's not clear how this is supposed to work in
> >>execlist mode since:
> >>
> >>1. This function requires a ring buffer.
> >>
> >>2. Retrieving a ring buffer in execlist mode requires us to retrieve the
> >>corresponding context, which we get from a request.
> >>
> >>3. Retrieving a request from the hang checker is not straightforward since that
> >>requires us to grab the struct_mutex in order to synchronize against the
> >>request retirement thread.
> >>
> >>4. Grabbing the struct_mutex from the hang checker is not something we will do
> >>since that puts us at risk of deadlock: a hung thread might be holding the
> >>struct_mutex already.
> >>
> >>Therefore it's not obvious how we're supposed to deal with this. For now, we're
> >>doing an early exit from this function, which avoids any kernel panic situation
> >>when running our own internal TDR ULT.
> >>
> >>Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> >
> >We should have a Testcase: line here which mentions the igt testcase which
> >provoke this bug. Or we need to fill this gap asap.
> >-Daniel
> 
> You know this better than I do: Is there an IGT test that submits a
> semaphore in execlist mode? Because that's all you need to do to
> reproduce this. We could certainly add one if there is none like
> that already.

No, we don't have anything submitting a hanging semaphore from
userspace or igt specifically.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-16 15:43     ` Tomas Elf
@ 2015-06-16 16:55       ` Chris Wilson
  2015-06-16 17:32         ` Tomas Elf
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2015-06-16 16:55 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 16, 2015 at 04:43:55PM +0100, Tomas Elf wrote:
> On 16/06/2015 14:43, Daniel Vetter wrote:
> >On Mon, Jun 08, 2015 at 06:03:20PM +0100, Tomas Elf wrote:
> >>The TDR ULT used to validate this patch series requires a special uevent for
> >>full GPU resets in order to distinguish between different kinds of resets.
> >>
> >>Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> >
> >Why duplicate the uevent we send out from i915_reset_and_wakeup? At least
> >I can't spot what this gets us in addition to the existing one.
> >-Daniel
> 
> Look at this line:
> >> +		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
> 
> It doesn't exist in reset_and_wakeup (specifically, the "GPU
> RESET=0" part). It's a uevent that happens at the time of the actual
> GPU reset (GDRST register write). In the subsequent TDR commit we
> add another one at the point of the actual engine reset, which also
> includes information about what exact engine was reset.
> 
> The uevents in reset_and_wakeup only tell the user that an error has
> been detected and that some kind of reset has happened; these new
> uevents specify exactly what kind of reset has happened. This
> particular one on its own is not very meaningful since there is
> only one supported form of reset at this point, but once we add
> engine reset support it's useful to be able to discern the types of
> resets from each other (GPU reset, RCS engine reset, VCS engine
> reset, VCS2 engine reset, BCS engine reset, VECS engine reset).
> 
> Does that make sense?

The ultimate question is how do you envisage these uevents being used?

At present, we have abrtd listening out for when to grab the
/sys/drm/cardX/error and maybe for GL robustness (though I would imagine
if they thought such events useful we would have had demands for a DRM
event on the guilty/victim fd).

Does it really make sense to send uevents for both hang, partial-reset,
and full-reset?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-16 16:50       ` Chris Wilson
@ 2015-06-16 17:07         ` Tomas Elf
  0 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-16 17:07 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Intel-GFX

On 16/06/2015 17:50, Chris Wilson wrote:
> On Tue, Jun 16, 2015 at 04:46:05PM +0100, Tomas Elf wrote:
>> On 16/06/2015 14:44, Daniel Vetter wrote:
>>> On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
>>>> When submitting semaphores in execlist mode the hang checker crashes in this
>>>> function because it is only runnable in ring submission mode. The reason this
>>>> is of particular interest to the TDR patch series is because we use semaphores
>>>> as a means to induce hangs during testing (which is the recommended way to
>>>> induce hangs for gen8+). It's not clear how this is supposed to work in
>>>> execlist mode since:
>>>>
>>>> 1. This function requires a ring buffer.
>>>>
>>>> 2. Retrieving a ring buffer in execlist mode requires us to retrieve the
>>>> corresponding context, which we get from a request.
>>>>
>>>> 3. Retrieving a request from the hang checker is not straightforward since that
>>>> requires us to grab the struct_mutex in order to synchronize against the
>>>> request retirement thread.
>>>>
>>>> 4. Grabbing the struct_mutex from the hang checker is not something we will do
>>>> since that puts us at risk of deadlock: a hung thread might be holding the
>>>> struct_mutex already.
>>>>
>>>> Therefore it's not obvious how we're supposed to deal with this. For now, we're
>>>> doing an early exit from this function, which avoids any kernel panic situation
>>>> when running our own internal TDR ULT.
>>>>
>>>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>>>
>>> We should have a Testcase: line here which mentions the igt testcase which
>>> provoke this bug. Or we need to fill this gap asap.
>>> -Daniel
>>
>> You know this better than I do: Is there an IGT test that submits a
>> semaphore in execlist mode? Because that's all you need to do to
>> reproduce this. We could certainly add one if there is none like
>> that already.
>
> No, we don't have anything submitting a hanging semaphore from
> userspace or igt specifically.
> -Chris
>

At first I thought that it would be ok to just submit any semaphore, but 
I guess it would have to be a hanging semaphore specifically. Or at 
least a semaphore that does not progress ACTHD from one hang check 
period to the next (seeing as we check for ACTHD progression in 
ring_stuck() and then call semaphore_passed(), which calls 
semaphore_waits_for() if ACTHD hasn't moved).
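
On gen8 that boils down to a batch which polls a semaphore dword that is
never signalled; an untested sketch, with the dword encodings spelled out
and the buffer setup (batch, sema_offset) assumed to exist elsewhere:

	uint32_t *cs = batch;

	/* MI_SEMAPHORE_WAIT | GGTT | POLL | SAD_GTE_SDD, length 2 (gen8+) */
	*cs++ = (0x1c << 23) | (1 << 22) | (1 << 15) | (1 << 12) | 2;
	*cs++ = 1;		/* wait until *addr >= 1... */
	*cs++ = sema_offset;	/* ...but nothing ever writes there */
	*cs++ = 0;		/* address high dword */
	*cs++ = 0x0a << 23;	/* MI_BATCH_BUFFER_END */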

Fine, we'll have to add that then.

Thanks,
Tomas

* Re: [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-16 16:55       ` Chris Wilson
@ 2015-06-16 17:32         ` Tomas Elf
  2015-06-16 19:33           ` Chris Wilson
  0 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-16 17:32 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Intel-GFX

On 16/06/2015 17:55, Chris Wilson wrote:
> On Tue, Jun 16, 2015 at 04:43:55PM +0100, Tomas Elf wrote:
>> On 16/06/2015 14:43, Daniel Vetter wrote:
>>> On Mon, Jun 08, 2015 at 06:03:20PM +0100, Tomas Elf wrote:
>>>> The TDR ULT used to validate this patch series requires a special uevent for
>>>> full GPU resets in order to distinguish between different kinds of resets.
>>>>
>>>> Signed-off-by: Tomas Elf <tomas.elf@intel.com>
>>>
>>> Why duplicate the uevent we send out from i915_reset_and_wakeup? At least
>>> I can't spot what this gets us in addition to the existing one.
>>> -Daniel
>>
>> Look at this line:
>>>> +		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
>>
>> It doesn't exist in reset_and_wakeup (specifically, the "GPU
>> RESET=0" part). It's a uevent that happens at the time of the actual
>> GPU reset (GDRST register write). In the subsequent TDR commit we
>> add another one at the point of the actual engine reset, which also
>> includes information about what exact engine was reset.
>>
>> The uevents in reset_and_wakeup only tell the user that an error has
>> been detected and that some kind of reset has happened; these new
>> uevents specify exactly what kind of reset has happened. This
>> particular one on its own is not very meaningful since there is
>> only one supported form of reset at this point, but once we add
>> engine reset support it's useful to be able to discern the types of
>> resets from each other (GPU reset, RCS engine reset, VCS engine
>> reset, VCS2 engine reset, BCS engine reset, VECS engine reset).
>>
>> Does that make sense?
>
> The ultimate question is how do you envisage these uevents being used?
>
> At present, we have abrtd listening out for when to grab the
> /sys/drm/cardX/error and maybe for GL robustness (though I would imagine
> if they thought such events useful we would have had demands for a DRM
> event on the guilty/victim fd).
>
> Does it really make sense to send uevents for both hang, partial-reset,
> and full-reset?
> -Chris
>

The reason we have such a detailed set of uevents is primarily for 
testing purposes. Our internal VPG tests check for these uevents to make 
sure that the expected recovery mode is actually being used. Which makes 
sense, because the TDR driver code contains reset promotion logic to 
decide what recovery mode to use and if that logic somehow gets broken 
the driver might go with the wrong recovery mode. Thus, it's worth 
testing and therefore those uevents need to be there. Of course, I guess 
the argument "our internal VPG tests do this" might not hold water since 
the tests haven't been upstreamed? If that's the case then I guess I 
don't have any opinion about what uevent goes where and we could go with 
whatever set of uevents you prefer.
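
(For what it's worth, those tests sit on a netlink uevent monitor; a
rough libudev sketch - the "GPU RESET" property key mirrors the string
in the patch, everything else is illustrative:)

	#include <libudev.h>
	#include <poll.h>
	#include <stdio.h>

	static void watch_reset_uevents(void)
	{
		struct udev *udev = udev_new();
		struct udev_monitor *mon =
			udev_monitor_new_from_netlink(udev, "kernel");
		struct pollfd pfd;

		udev_monitor_filter_add_match_subsystem_devtype(mon, "drm", NULL);
		udev_monitor_enable_receiving(mon);
		pfd.fd = udev_monitor_get_fd(mon);
		pfd.events = POLLIN;

		for (;;) {
			struct udev_device *dev;
			const char *v;

			if (poll(&pfd, 1, -1) <= 0)
				continue;
			dev = udev_monitor_receive_device(mon);
			if (!dev)
				continue;
			v = udev_device_get_property_value(dev, "GPU RESET");
			if (v)
				printf("reset uevent: GPU RESET=%s\n", v);
			udev_device_unref(dev);
		}
	}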

Also, it might not be worth the hassle to have the reset_done_event at 
recovery completion at the end of i915_reset_and_wakeup() / 
i915_error_work_func() _as_well_as_ the respective uevent after each 
actual GPU/engine reset completion, since reset_done_event doesn't really 
offer much information that you didn't already know from the 
post-reset uevent. So I would be ok with removing reset_done_event.

Thanks,
Tomas


* Re: [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-16 17:32         ` Tomas Elf
@ 2015-06-16 19:33           ` Chris Wilson
  2015-06-17 11:49             ` Daniel Vetter
  0 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2015-06-16 19:33 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 16, 2015 at 06:32:22PM +0100, Tomas Elf wrote:
> On 16/06/2015 17:55, Chris Wilson wrote:
> >On Tue, Jun 16, 2015 at 04:43:55PM +0100, Tomas Elf wrote:
> >>On 16/06/2015 14:43, Daniel Vetter wrote:
> >>>On Mon, Jun 08, 2015 at 06:03:20PM +0100, Tomas Elf wrote:
> >>>>The TDR ULT used to validate this patch series requires a special uevent for
> >>>>full GPU resets in order to distinguish between different kinds of resets.
> >>>>
> >>>>Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> >>>
> >>>Why duplicate the uevent we send out from i915_reset_and_wakeup? At least
> >>>I can't spot what this gets us in addition to the existing one.
> >>>-Daniel
> >>
> >>Look at this line:
> >>>>+		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
> >>
> >>It doesn't exist in reset_and_wakeup (specifically, the "GPU
> >>RESET=0" part). It's a uevent that happens at the time of the actual
> >>GPU reset (GDRST register write). In the subsequent TDR commit we
> >>add another one at the point of the actual engine reset, which also
> >>includes information about what exact engine was reset.
> >>
> >>The uevents in reset_and_wakeup only tell the user that an error has
> >>been detected and that some kind of reset has happened; these new
> >>uevents specify exactly what kind of reset has happened. This
> >>particular one on its own is not very meaningful since there is
> >>only one supported form of reset at this point, but once we add
> >>engine reset support it's useful to be able to discern the types of
> >>resets from each other (GPU reset, RCS engine reset, VCS engine
> >>reset, VCS2 engine reset, BCS engine reset, VECS engine reset).
> >>
> >>Does that make sense?
> >
> >The ultimate question is how do you envisage these uevents being used?
> >
> >At present, we have abrtd listening out for when to grab the
> >/sys/drm/cardX/error and maybe for GL robustness (though I would imagine
> >if they thought such events useful we would have had demands for a DRM
> >event on the guilty/victim fd).
> >
> >Does it really make sense to send uevents for both hang, partial-reset,
> >and full-reset?
> >-Chris
> >
> 
> The reason we have such a detailed set of uevents is primarily for
> testing purposes. Our internal VPG tests check for these uevents to
> make sure that the expected recovery mode is actually being used.
> Which makes sense, because the TDR driver code contains reset
> promotion logic to decide what recovery mode to use and if that
> logic somehow gets broken the driver might go with the wrong
> recovery mode. Thus, it's worth testing and therefore those uevents
> need to be there. Of course, I guess the argument "our internal VPG
> tests do this" might not hold water since the tests haven't been
> upstreamed? If that's the case then I guess I don't have any opinion
> about what uevent goes where and we could go with whatever set of
> uevents you prefer.

uevents aren't free, so if we can find a way to get coverage without
waking up the buses, I'd be happy (though we shouldn't need that many
reset uevents, it is mainly the principle of trying to use the right
tool for the job and not just the first one that comes to hand). Also
uevents are by their very nature userspace ABI so we should plan
carefully how we expect them to be used.

I would have thought tracepoints would have provided a better level of
detail. Or just a debugfs file recording each stage passed - polling that
throughout the test would give the same edge events.
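
(A sketch of the tracepoint idea, in the style of i915_trace.h - the
event name and fields are invented for illustration:)

	TRACE_EVENT(i915_engine_reset,
		    TP_PROTO(struct intel_engine_cs *ring),
		    TP_ARGS(ring),

		    TP_STRUCT__entry(
				     __field(u32, id)
				     ),

		    TP_fast_assign(
				   __entry->id = ring->id;
				   ),

		    TP_printk("engine=%u", __entry->id)
	);
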
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.
  2015-06-16 15:46     ` Tomas Elf
  2015-06-16 16:50       ` Chris Wilson
@ 2015-06-17 11:43       ` Daniel Vetter
  1 sibling, 0 replies; 59+ messages in thread
From: Daniel Vetter @ 2015-06-17 11:43 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 16, 2015 at 04:46:05PM +0100, Tomas Elf wrote:
> On 16/06/2015 14:44, Daniel Vetter wrote:
> >On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
> >>When submitting semaphores in execlist mode the hang checker crashes in this
> >>function because it is only runnable in ring submission mode. The reason this
> >>is of particular interest to the TDR patch series is because we use semaphores
> >>as a means to induce hangs during testing (which is the recommended way to
> >>induce hangs for gen8+). It's not clear how this is supposed to work in
> >>execlist mode since:
> >>
> >>1. This function requires a ring buffer.
> >>
> >>2. Retrieving a ring buffer in execlist mode requires us to retrieve the
> >>corresponding context, which we get from a request.
> >>
> >>3. Retrieving a request from the hang checker is not straightforward since that
> >>requires us to grab the struct_mutex in order to synchronize against the
> >>request retirement thread.
> >>
> >>4. Grabbing the struct_mutex from the hang checker is not something we will do
> >>since that puts us at risk of deadlock: a hung thread might be holding the
> >>struct_mutex already.
> >>
> >>Therefore it's not obvious how we're supposed to deal with this. For now, we're
> >>doing an early exit from this function, which avoids any kernel panic situation
> >>when running our own internal TDR ULT.
> >>
> >>Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> >
> >We should have a Testcase: line here which mentions the igt testcase which
> >provoke this bug. Or we need to fill this gap asap.
> >-Daniel
> 
> You know this better than I do: Is there an IGT test that submits a
> semaphore in execlist mode? Because that's all you need to do to reproduce
> this. We could certainly add one if there is none like that already.

Not that I know of, so needs to be added. But it's a security fix and
should go to -fixes+cc:stable since any kind of userspace could Oops the
kernel with this.
-Daniel

> 
> Thanks,
> Tomas
> 
> >
> >>---
> >>  drivers/gpu/drm/i915/i915_irq.c |   20 ++++++++++++++++++++
> >>  1 file changed, 20 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >>index 46bcbff..40c44fc 100644
> >>--- a/drivers/gpu/drm/i915/i915_irq.c
> >>+++ b/drivers/gpu/drm/i915/i915_irq.c
> >>@@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
> >>  	u64 offset = 0;
> >>  	int i, backwards;
> >>
> >>+	/*
> >>+	 * This function does not support execlist mode - any attempt to
> >>+	 * proceed further into this function will result in a kernel panic
> >>+	 * when dereferencing ring->buffer, which is not set up in execlist
> >>+	 * mode.
> >>+	 *
> >>+	 * The correct way of doing it would be to derive the currently
> >>+	 * executing ring buffer from the current context, which is derived
> >>+	 * from the currently running request. Unfortunately, to get the
> >>+	 * current request we would have to grab the struct_mutex before doing
> >>+	 * anything else, which would be ill-advised since some other thread
> >>+	 * might have grabbed it already and managed to hang itself, causing
> >>+	 * the hang checker to deadlock.
> >>+	 *
> >>+	 * Therefore, this function does not support execlist mode in its
> >>+	 * current form. Just return NULL and move on.
> >>+	 */
> >>+	if (i915.enable_execlists)
> >>+		return NULL;
> >>+
> >>  	ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
> >>  	if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
> >>  		return NULL;
> >>--
> >>1.7.9.5
> >>
> >
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-16 19:33           ` Chris Wilson
@ 2015-06-17 11:49             ` Daniel Vetter
  2015-06-17 12:51               ` Chris Wilson
  0 siblings, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-17 11:49 UTC (permalink / raw)
  To: Chris Wilson, Tomas Elf, Daniel Vetter, Intel-GFX

On Tue, Jun 16, 2015 at 08:33:13PM +0100, Chris Wilson wrote:
> On Tue, Jun 16, 2015 at 06:32:22PM +0100, Tomas Elf wrote:
> > On 16/06/2015 17:55, Chris Wilson wrote:
> > >On Tue, Jun 16, 2015 at 04:43:55PM +0100, Tomas Elf wrote:
> > >>On 16/06/2015 14:43, Daniel Vetter wrote:
> > >>>On Mon, Jun 08, 2015 at 06:03:20PM +0100, Tomas Elf wrote:
> > >>>>The TDR ULT used to validate this patch series requires a special uevent for
> > >>>>full GPU resets in order to distinguish between different kinds of resets.
> > >>>>
> > >>>>Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> > >>>
> > >>>Why duplicate the uevent we send out from i915_reset_and_wakeup? At least
> > >>>I can't spot what this gets us in addition to the existing one.
> > >>>-Daniel
> > >>
> > >>Look at this line:
> > >>>>+		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
> > >>
> > >>It doesn't exist in reset_and_wakeup (specifically, the "GPU
> > >>RESET=0" part). It's a uevent that happens at the time of the actual
> > >>GPU reset (GDRST register write). In the subsequent TDR commit we
> > >>add another one at the point of the actual engine reset, which also
> > >>includes information about what exact engine was reset.
> > >>
> > >>The uevents in reset_and_wakeup only tell the user that an error has
> > >>been detected and that some kind of reset has happened; these new
> > >>uevents specify exactly what kind of reset has happened. This
> > >>particular one on its own is not very meaningful since there is
> > >>only one supported form of reset at this point, but once we add
> > >>engine reset support it's useful to be able to discern the types of
> > >>resets from each other (GPU reset, RCS engine reset, VCS engine
> > >>reset, VCS2 engine reset, BCS engine reset, VECS engine reset).
> > >>
> > >>Does that make sense?
> > >
> > >The ultimate question is how do you envisage these uevents being used?
> > >
> > >At present, we have abrtd listening out for when to grab the
> > >/sys/drm/cardX/error and maybe for GL robustness (though I would imagine
> > >if they thought such events useful we would have had demands for a DRM
> > >event on the guilty/victim fd).
> > >
> > >Does it really make sense to send uevents for both hang, partial-reset,
> > >and full-reset?
> > >-Chris
> > >
> > 
> > The reason we have such a detailed set of uevents is primarily for
> > testing purposes. Our internal VPG tests check for these uevents to
> > make sure that the expected recovery mode is actually being used.
> > Which makes sense, because the TDR driver code contains reset
> > promotion logic to decide what recovery mode to use and if that
> > logic somehow gets broken the driver might go with the wrong
> > recovery mode. Thus, it's worth testing and therefore those uevents
> > need to be there. Of course, I guess the argument "our internal VPG
> > tests do this" might not hold water since the tests haven't been
> > upstreamed? If that's the case then I guess I don't have any opinion
> > about what uevent goes where and we could go with whatever set of
> > uevents you prefer.
> 
> uevents aren't free, so if we can find a way to get coverage without
> waking up the buses, I'd be happy (though we shouldn't need that many
> reset uevents, it is mainly the principle of trying to use the right
> tool for the job and not just the first one that comes to hand). Also
> uevents are by their very nature userspace ABI so we should plan
> carefully how we expect them to be used.

Yeah uevents are ABI with full backwards compat guarantees, I don't want
to add ABI just for testing. What we've done for the arb robustness
testcases is batches with canary writes to observe whether all the right
ones have been cancelled/killed depending upon where the hanging batch is.

Imo that's also more robust testing - we have behavioural expectations,
tight coupling of the tests with the current kernel implementations only
causes long-term maintenance headaches.

E.g. testcase that expects an engine reset:
- Submit dummy workload on ring A followed by a canary write.
- Submit hanging batch on ring B.
- Wait for reset.
- Check that canary write in ring A happened.

Testcase which expects full-blown gpu reset:
- Check that canary write didn't happen.

Vary as needed and correlate with the reset stats the kernel reports; a
rough sketch of the engine-reset case follows below.
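
(An IGT-flavoured sketch of that first case - every helper below is
hypothetical, named only to make the shape of the test clear:)

	static void test_engine_reset_spares_other_ring(int fd)
	{
		/* submit_dummy_with_canary() etc. are made-up helpers. */
		uint32_t canary = submit_dummy_with_canary(fd, I915_EXEC_RENDER);

		submit_hanging_batch(fd, I915_EXEC_BLT);
		wait_for_reset(fd);

		/* Engine reset only: work on the other ring must survive. */
		igt_assert(canary_value(fd, canary) == CANARY_MAGIC);
	}
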
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-16 15:54     ` Tomas Elf
@ 2015-06-17 11:51       ` Daniel Vetter
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Vetter @ 2015-06-17 11:51 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 16, 2015 at 04:54:14PM +0100, Tomas Elf wrote:
> On 16/06/2015 14:49, Daniel Vetter wrote:
> >On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
> >>In preparation for per-engine reset add a way for setting context reset stats.
> >>
> >>OPEN QUESTIONS:
> >>1. How do we deal with get_reset_stats and the GL robustness interface when
> >>introducing per-engine resets?
> >>
> >>	a. Do we set contexts that cause per-engine resets as guilty? If so, how
> >>	does this affect context banning?
> >>
> >>	b. Do we extend the publicly available reset stats to also contain
> >>	per-engine reset statistics? If so, would this break the ABI?
> >>
> >>Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_drv.h |    2 ++
> >>  drivers/gpu/drm/i915/i915_gem.c |   14 ++++++++++++++
> >>  2 files changed, 16 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>index 47be4a5..ab5dfdc 100644
> >>--- a/drivers/gpu/drm/i915/i915_drv.h
> >>+++ b/drivers/gpu/drm/i915/i915_drv.h
> >>@@ -2781,6 +2781,8 @@ static inline bool i915_stop_ring_allow_warn(struct drm_i915_private *dev_priv)
> >>  }
> >>
> >>  void i915_gem_reset(struct drm_device *dev);
> >>+void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> >>+			   struct intel_engine_cs *engine);
> >>  bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, bool force);
> >>  int __must_check i915_gem_object_finish_gpu(struct drm_i915_gem_object *obj);
> >>  int __must_check i915_gem_init(struct drm_device *dev);
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index 8ce363a..4c88e5c 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -2676,6 +2676,20 @@ void i915_gem_reset(struct drm_device *dev)
> >>  	i915_gem_restore_fences(dev);
> >>  }
> >>
> >>+void i915_gem_reset_engine(struct drm_i915_private *dev_priv,
> >>+			   struct intel_engine_cs *engine)
> >>+{
> >>+	u32 completed_seqno;
> >>+	struct drm_i915_gem_request *req;
> >>+
> >>+	completed_seqno = engine->get_seqno(engine, false);
> >>+
> >>+	/* Find pending batch buffers and mark them as such  */
> >>+	list_for_each_entry(req, &engine->request_list, list)
> >>+	        if (req && (req->seqno > completed_seqno))
> >>+	                i915_set_reset_status(dev_priv, req->ctx, false);
> >>+}
> >
> >Please don't add dead code since it makes it impossible to review the
> >patch without peeking ahead. And that makes the split-up useless - the
> >point of splitting patches is to make review easier by presenting
> >logically self-contained small changes, not to evenly spread out changes
> >across a lot of mails.
> >-Daniel
> 
> I did actually split this out from the main TDR patch (drm/i915: Adding TDR
> / per-engine reset support for gen8) by mistake. But since this is an RFC
> series, which I thought had the purpose of acting as material for a design
> discussion rather than serving as an actual code submission, I didn't spend
> too much time thinking about splitting the patch series into sensible
> chunks. If that is a problem I would expect people to take issue with the
> fact that e.g. the main TDR patch is a huge, monolithic chunk of code
> spanning more than 2000 lines. Obviously, that will be subdivided sensibly
> at a later stage and the code in this patch mail will be properly associated
> with the calling code.
> 
> Is it ok if we leave the patch subdivision discussion to after the initial
> RFC stage or how do these things typically work at this point in the
> process?

No need to resend, was just boilerplate comment from your maintainer.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [RFC 02/11] drm/i915: Introduce uevent for full GPU reset.
  2015-06-17 11:49             ` Daniel Vetter
@ 2015-06-17 12:51               ` Chris Wilson
  0 siblings, 0 replies; 59+ messages in thread
From: Chris Wilson @ 2015-06-17 12:51 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On Wed, Jun 17, 2015 at 01:49:15PM +0200, Daniel Vetter wrote:
> On Tue, Jun 16, 2015 at 08:33:13PM +0100, Chris Wilson wrote:
> > On Tue, Jun 16, 2015 at 06:32:22PM +0100, Tomas Elf wrote:
> > > On 16/06/2015 17:55, Chris Wilson wrote:
> > > >On Tue, Jun 16, 2015 at 04:43:55PM +0100, Tomas Elf wrote:
> > > >>On 16/06/2015 14:43, Daniel Vetter wrote:
> > > >>>On Mon, Jun 08, 2015 at 06:03:20PM +0100, Tomas Elf wrote:
> > > >>>>The TDR ULT used to validate this patch series requires a special uevent for
> > > >>>>full GPU resets in order to distinguish between different kinds of resets.
> > > >>>>
> > > >>>>Signed-off-by: Tomas Elf <tomas.elf@intel.com>
> > > >>>
> > > >>>Why duplicate the uevent we send out from i915_reset_and_wakeup? At least
> > > >>>I can't spot what this gets us in addition to the existing one.
> > > >>>-Daniel
> > > >>
> > > >>Look at this line:
> > > >>>>+		reset_event[0] = kasprintf(GFP_KERNEL, "%s", "GPU RESET=0");
> > > >>
> > > >>It doesn't exist in reset_and_wakeup (specifically, the "GPU
> > > >>RESET=0" part). It's a uevent that happens at the time of the actual
> > > >>GPU reset (GDRST register write). In the subsequent TDR commit we
> > > >>add another one at the point of the actual engine reset, which also
> > > >>includes information about what exact engine was reset.
> > > >>
> > > >>The uevents in reset_and_wakeup only tell the user that an error has
> > > >>been detected and that some kind of reset has happened; these new
> > > >>uevents specify exactly what kind of reset has happened. This
> > > >>particular one on its own is not very meaningful since there is
> > > >>only one supported form of reset at this point, but once we add
> > > >>engine reset support it's useful to be able to discern the types of
> > > >>resets from each other (GPU reset, RCS engine reset, VCS engine
> > > >>reset, VCS2 engine reset, BCS engine reset, VECS engine reset).
> > > >>
> > > >>Does that make sense?
> > > >
> > > >The ultimate question is how do you envisage these uevents being used?
> > > >
> > > >At present, we have abrtd listening out for when to grab the
> > > >/sys/drm/cardX/error and maybe for GL robustness (though I would imagine
> > > >if they thought such events useful we would have had demands for a DRM
> > > >event on the guilty/victim fd).
> > > >
> > > >Does it really make sense to send uevents for both hang, partial-reset,
> > > >and full-reset?
> > > >-Chris
> > > >
> > > 
> > > The reason we have such a detailed set of uevents is primarily for
> > > testing purposes. Our internal VPG tests check for these uevents to
> > > make sure that the expected recovery mode is actually being used.
> > > Which makes sense, because the TDR driver code contains reset
> > > promotion logic to decide what recovery mode to use and if that
> > > logic somehow gets broken the driver might go with the wrong
> > > recovery mode. Thus, it's worth testing and therefore those uevents
> > > need to be there. Of course, I guess the argument "our internal VPG
> > > tests do this" might not hold water since the tests haven't been
> > > upstreamed? If that's the case then I guess I don't have any opinion
> > > about what uevent goes where and we could go with whatever set of
> > > uevents you prefer.
> > 
> > uevents aren't free, so if we can find a way to get coverage without
> > waking up the buses, I'd be happy (though we shouldn't need that many
> > reset uevents, it is mainly the principle of trying to use the right
> > tool for the job and not just the first one that comes to hand). Also
> > uevents are by their very nature userspace ABI so we should plan
> > carefully how we expect them to be used.
> 
> Yeah uevents are ABI with full backwards compat guarantees, I don't want
> to add ABI just for testing. What we've done for the arb robustness
> testcases is batches with canary writes to observe whether all the right
> ones have been cancelled/killed depending upon where the hanging batch is.
> 
> Imo that's also more robust testing - we have behavioural expectations,
> tight coupling of the tests with the current kernel implementations only
> causes long-term maintenance headaches.
> 
> E.g. testcase that expects an engine reset:
> - Submit dummy workload on ring A followed by a canary write.
> - Submit hanging batch on ring B.
> - Wait for reset.
> - Check that canary write in ring A happened.
> 
> Testcase which expects full-blown gpu reset:
> - Check that canary write didn't happen.

Note that this presumes that we don't attempt to restart the rings with
pending requests (which we could, and once upon a time were working on).

For more examples of the former, see gem_concurrent_all,
gem_reloc_vs_gpu, gem_evict_everything. These are coarse correctness
tests of different parts of the API and not the targeted TDR coverage
tests.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset.
  2015-06-16 13:54       ` Chris Wilson
  2015-06-16 15:55         ` Daniel Vetter
@ 2015-06-18 11:12         ` Dave Gordon
  1 sibling, 0 replies; 59+ messages in thread
From: Dave Gordon @ 2015-06-18 11:12 UTC (permalink / raw)
  To: intel-gfx

On 16/06/15 14:54, Chris Wilson wrote:
> On Tue, Jun 16, 2015 at 03:48:09PM +0200, Daniel Vetter wrote:
>> On Mon, Jun 08, 2015 at 06:33:59PM +0100, Chris Wilson wrote:
>>> On Mon, Jun 08, 2015 at 06:03:21PM +0100, Tomas Elf wrote:
>>>> In preparation for per-engine reset add way for setting context reset stats.
>>>>
>>>> OPEN QUESTIONS:
>>>> 1. How do we deal with get_reset_stats and the GL robustness interface when
>>>> introducing per-engine resets?
>>>>
>>>> 	a. Do we set context that cause per-engine resets as guilty? If so, how
>>>> 	does this affect context banning?
>>>
>>> Yes. If the reset works quicker, then we can set a higher threshold for
>>> DoS detection, but we still do need Dos detection?
>>>  
>>>> 	b. Do we extend the publically available reset stats to also contain
>>>> 	per-engine reset statistics? If so, would this break the ABI?
>>>
>>> No. The get_reset_stats is targeted at the GL API and describing it in
>>> terms of whether my context is guilty or has been affected. That is
>>> orthogonal to whether the reset was on a single ring or the entire GPU -
>>> the question is how broad we want the "affected" to be. Ideally a
>>> per-context reset wouldn't necessarily impact others, except for the
>>> surfaces shared between them...
>>
>> GL computes sharing sets itself; the kernel only tells it whether a given
>> context has been victimized, i.e. one of its batches was not properly
>> executed due to reset after a hang.
> 
> So you don't think we should delete all pending requests that depend
> upon state from the hung request?
> -Chris

John Harrison & I discussed this yesterday; he's against doing so (even
though the scheduler is ideally placed to do it, if that were actually
the preferred policy). The primary argument (as I see it) is that you
actually don't and can't know the nature of an apparent dependency
between batches that share a buffer object. There are at least three cases:

1. "tightly-coupled": the dependent batch is going to rely on data
produced by the earlier batch. In this case GIGO (garbage in, garbage out)
applies and the results will be undefined, possibly including a further
hang. Subsequent
batches presumably belong to the same or a closely-related
(co-operating) task, and killing them might be a reasonable strategy here.

2. "loosely-coupled": the dependent batch is going to access the data,
but not in any way that depends on the content (for example, blitting a
rectangle into a composition buffer). The result will be wrong, but only
in a limited way (e.g. the window belonging to the faulty application will
appear corrupted). The dependent batches may well belong to unrelated
system tasks (e.g. X or surfaceflinger) and killing them is probably not
justified.

3. "uncoupled": the dependent batch wants the /buffer/, not the data in
it (most likely a framebuffer or similar object). Any incorrect data in
the buffer is irrelevant. Killing off subsequent batches would be wrong.

Buffer access mode (readonly, read/write, writeonly) might allow us to
distinguish these somewhat, but probably not enough to help make the
right decision. So the default must be *not* to kill off dependants
automatically, but if the failure does propagate in such a way as to
cause further consequent hangs, then the context-banning mechanism
should eventually catch and block all the downstream effects.
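
To make the classification problem concrete, here is a hypothetical
sketch (all names invented for illustration; this is not driver code):

#include <stdint.h>

enum dep_coupling {
	DEP_TIGHT,	/* case 1: consumes data from the hung batch */
	DEP_LOOSE,	/* case 2: reads the data, damage is cosmetic */
	DEP_NONE,	/* case 3: wants the buffer, not its contents */
};

static enum dep_coupling guess_coupling(uint32_t read_domains,
					uint32_t write_domain)
{
	/* Write-only access: the contents are irrelevant (case 3). */
	if (write_domain && !read_domains)
		return DEP_NONE;

	/*
	 * Any reader could be case 1 or case 2 and the kernel cannot
	 * tell them apart, so it would have to assume the worst.
	 */
	return DEP_TIGHT;
}

Note that this can never return DEP_LOOSE: the information needed to
identify case 2 simply isn't available to the kernel, which is why
killing dependants by default would be the wrong policy.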

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (10 preceding siblings ...)
  2015-06-08 17:03 ` [RFC 11/11] drm/i915: TDR/watchdog trace points Tomas Elf
@ 2015-06-23 10:05 ` Daniel Vetter
  2015-06-23 10:47   ` Tomas Elf
  2015-07-03 11:15 ` Mika Kuoppala
  2015-07-09 18:47 ` Chris Wilson
  13 siblings, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-23 10:05 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

Your patches don't apply cleanly any more and I can't find a suitable
baseline where they would. But I'd like to see it all in context to check
a few things.

Can you pls push a git branch with these somewhere?

Thanks, Daniel

On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
> This patch series introduces the following features:
> 
> * Feature 1: TDR (Timeout Detection and Recovery) for gen8 execlist mode.
> 
> TDR is an umbrella term for anything that goes into detecting and recovering
> from GPU hangs and is a term more widely used outside of the upstream driver.
> This feature introduces an extensible framework that currently supports gen8
> but that can be easily extended to support gen7 as well (which is already the
> case in GMIN but unfortunately in a not quite upstreamable form). The code
> contained in this submission represents the essentials of what is currently in
> GMIN merged with what is currently in upstream (as of the time when this work
> commenced a few months back).
> 
> This feature adds a new hang recovery path alongside the legacy GPU reset path,
> which takes care of engine recovery only. Aside from adding support for
> per-engine recovery this feature also introduces rules for how to promote a
> potential per-engine reset to a legacy, full GPU reset.
> 
> The hang checker now integrates with the error handler in a slightly different
> way in that it allows hang recovery on multiple engines at the same time by
> passing an engine flag mask to the error handler where flags representing all
> of the hung engines are set. This allows us to schedule hang recovery once for
> all currently hung engines instead of one hang recovery per detected engine
> hang. Previously, when only full GPU reset was supported this was all the same
> since it wouldn't matter if one or four engines were hung at any given point
> since it would all amount to the same thing - the GPU getting reset. As it
> stands now the behaviour is different depending on which engine is hung since
> each engine is reset separately from all the other engines, therefore we have
> to think about this in terms of scheduling cost and recovery latency. (see
> open question below)
> 
> OPEN QUESTIONS:
> 
> 	1. Do we want to investigate the possibility of per-engine hang
> 	detection? In the current upstream driver there is only one work queue
> 	that handles the hang checker and everything from initial hang
> 	detection to final hang recovery runs in this thread. This makes sense
> 	if you're only supporting one form of hang recovery - using full GPU
> 	reset and nothing tied to any particular engine. However, as part
> 	of this patch series we're changing that by introducing per-engine
> 	hang recovery. It could make sense to introduce multiple work
> 	queues - one per engine - to run multiple hang checking threads in
> 	parallel.
> 
> 	This would potentially allow savings in terms of recovery latency since
> 	we don't have to scan all engines every time the hang checker is
> 	scheduled and the error handler does not have to scan all engines every
> 	time it is scheduled. Instead, we could implement one work queue per
> 	engine that would invoke the hang checker that only checks _that_
> 	particular engine and then the error handler is invoked for _that_
> 	particular engine. If one engine has hung the latency for getting to
> 	the hang recovery path for that particular engine would be (Time For
> 	Hang Checking One Engine) + (Time For Error Handling One Engine) rather
> 	than the time it takes to do hang checking for all engines + the time
> 	it takes to do error handling for all engines that have been detected
> 	as hung (which in the worst case would be all engines). There would
> 	potentially be as many hang checker and error handling threads going on
> 	concurrently as there are engines in the hardware but they would all be
> 	running in parallel without any significant locking. The first time
> 	where any thread needs exclusive access to the driver is at the point
> 	of the actual hang recovery but the time it takes to get there would
> 	theoretically be lower and the time it actually takes to do per-engine
> 	hang recovery is quite a lot lower than the time it takes to actually
> 	detect a hang reliably.
> 
> 	How much we would save by such a change still needs to be analysed and
> 	compared against the current single-thread model but it makes sense
> 	from a theoretical design point of view.
> 
> 	2. How does per-engine reset integrate with the public reset stats
> 	IOCTL? These stats are used for the GL robustness interface and
> 	currently these tests are failing when running per-engine hang recovery
> 	since we treat per-engine recovery differently from full GPU recovery,
> which userland knows nothing about. When userland
> 	expects to hang the hardware it expects the reset stat interface to
> 	reflect this, which is something that has changed as part of this code
> 	submission. There's more than one way to solve this. Here are two options:
> 
> 		1. Expose per-engine reset statistics and set contexts as
> 		guilty the same way for per-engine reset as for full GPU
> 		resets.
> 
> 		That would make this change to the hang recovery mechanism
> 		transparent to userland but it would change the semantics since
> 		an active context in the reset stats no longer implies that the
> 		GPU was fully reset.
> 
> 		2. Add a new set of statistics for per-engine reset (one group
> 		of statistics for each engine) to reflect the extended
> 		capabilities that per-engine hang recovery offers.
> 
> 		Would that be breaking the ABI?
> 
> 		... Or are there any other ways of doing this?
> 
> * Feature 2: Watchdog Timeout (a.k.a "media engine reset") for gen8.
> 
> This feature allows userland applications to control whether or not individual
> batch buffers should have a first-level, fine-grained, hardware-based hang
> detection mechanism on top of the ordinary, software-based periodic hang
> checker that is already in the driver. The advantage over relying solely on the
> current software-based hang checker is that the watchdog timeout mechanism is
> about 1000x quicker and more precise. Since it's not a full driver-level hang
> detection mechanism but only targeting one individual batch buffer at a time
> it can afford to be that quick without risking an increase in false positive
> hang detection.
> 
> This feature includes the following changes:
> 
> a) Watchdog timeout interrupt service routine for handling watchdog interrupts
> and connecting these to per-engine hang recovery.
> 
> b) Injection of watchdog timer enablement/cancellation instructions
> before/after the batch buffer start instruction in the ring buffer so that
> watchdog timeout is connected to the submission of an individual batch buffer.
> 
> c) Extension of the DRM batch buffer interface, exposing the watchdog timeout
> feature to userland. We've got two open source groups in VPG currently in the
> process of integrating support for this feature, which should make it
> possible in principle to upstream this extension.
> 
> There is currently full watchdog timeout support for gen7 in GMIN and it is
> quite similar to the gen8 implementation so there is nothing obvious that
> prevents us from upstreaming that code along with the gen8 code. However,
> watchdog timeout is fully dependent on the per-engine hang recovery path and
> that is not part of this code submission for gen7. Therefore watchdog timeout
> support for gen7 has been excluded until per-engine hang recovery support for
> gen7 has landed upstream.
> 
> As part of this submission we've had to reinstate the work queue that was
> previously in place between the error handler and the hang recovery path. The
> reason for this is that the per-engine recovery path is called directly from
> the interrupt handler in the case of watchdog timeout. In that situation
> there's no way of grabbing the struct_mutex, which is a requirement for the
> hang recovery path. Therefore, by reinstating the work queue we provide a
> unified execution context for the hang recovery code that allows the hang
> recovery code to grab whatever locks it needs without sacrificing interrupt
> latency too much or sleeping indefinitely in hard interrupt context.
> 
> * Feature 3. Context Submission Status Consistency checking
> 
> Something that becomes apparent when you run long-duration operations tests
> with concurrent rendering processes and intermittently injected hangs is that
> it seems like the GPU forgets to send context completion interrupts to the
> driver under some circumstances. What this means is that the driver sometimes
> gets stuck on a context that never seems to finish, all the while the hardware
> has completed and is waiting for more work.
> 
> The problem with this is that the per-engine hang recovery path relies on
> context resubmission to kick off the hardware again following an engine reset.
> This can only be done safely if the hardware and driver share the same opinion
> about the current state. Therefore we've extended the periodic hang checker to
> check for context submission state inconsistencies aside from the hang checking
> it already does.
> 
> If such a state is detected it is assumed (based on experience) that a context
> completion interrupt has been lost somehow. If this state persists for some
> time an attempt to correct it is made by faking the presumably lost context
> completion interrupt by manually calling the execlist interrupt handler, which
> is normally called from the main interrupt handler cued by a received context
> event interrupt. Just because an interrupt goes missing does not mean that the
> context status buffer (CSB) does not get appropriately updated by the hardware,
> which means that we can expect to find all the recent changes to the context
> states for each engine captured there. If there are outstanding context status
> changes in store there, then the faked context event interrupt will allow the
> interrupt handler to act on them. In the case of lost context completion
> interrupts this will prompt the driver to remove the already completed context
> from the execlist queue and move on to the next pending piece of work,
> thereby eliminating the inconsistency.
> 
> * Feature 4. Debugfs extensions for per-engine hang recovery and TDR/watchdog trace
> points.
> 
> 
> Tomas Elf (11):
>   drm/i915: Early exit from semaphore_waits_for for execlist mode.
>   drm/i915: Introduce uevent for full GPU reset.
>   drm/i915: Add reset stats entry point for per-engine reset.
>   drm/i915: Adding TDR / per-engine reset support for gen8.
>   drm/i915: Extending i915_gem_check_wedge to check engine reset in
>     progress
>   drm/i915: Disable warnings for TDR interruptions in the display
>     driver.
>   drm/i915: Reinstate hang recovery work queue.
>   drm/i915: Watchdog timeout support for gen8.
>   drm/i915: Fake lost context interrupts through forced CSB check.
>   drm/i915: Debugfs interface for per-engine hang recovery.
>   drm/i915: TDR/watchdog trace points.
> 
>  drivers/gpu/drm/i915/i915_debugfs.c     |  146 +++++-
>  drivers/gpu/drm/i915/i915_dma.c         |   79 +++
>  drivers/gpu/drm/i915/i915_drv.c         |  201 ++++++++
>  drivers/gpu/drm/i915/i915_drv.h         |   91 +++-
>  drivers/gpu/drm/i915/i915_gem.c         |   93 +++-
>  drivers/gpu/drm/i915/i915_gpu_error.c   |    2 +-
>  drivers/gpu/drm/i915/i915_irq.c         |  378 ++++++++++++--
>  drivers/gpu/drm/i915/i915_params.c      |   10 +
>  drivers/gpu/drm/i915/i915_reg.h         |   13 +
>  drivers/gpu/drm/i915/i915_trace.h       |  298 +++++++++++
>  drivers/gpu/drm/i915/intel_display.c    |   16 +-
>  drivers/gpu/drm/i915/intel_lrc.c        |  858 ++++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/intel_lrc.h        |   16 +-
>  drivers/gpu/drm/i915/intel_lrc_tdr.h    |   40 ++
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   87 +++-
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  109 ++++
>  drivers/gpu/drm/i915/intel_uncore.c     |  241 ++++++++-
>  include/uapi/drm/i915_drm.h             |    5 +-
>  18 files changed, 2589 insertions(+), 94 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/intel_lrc_tdr.h
> 
> -- 
> 1.7.9.5
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-23 10:05 ` [RFC 00/11] TDR/watchdog timeout support for gen8 Daniel Vetter
@ 2015-06-23 10:47   ` Tomas Elf
  2015-06-23 11:38     ` Daniel Vetter
  0 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-23 10:47 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 23/06/2015 11:05, Daniel Vetter wrote:
> Your patches don't apply cleanly any more and I can't find a suitable
> baseline where they would. But I'd like to see it all in context to check
> a few things.
>
> Can you pls push a git branch with these somewhere?
>

Here's the baseline for my local tree:
cd07637 -  drm-intel-nightly: 2015y-04m-13d-09h-46m-59s UTC integration 
manifest <daniel.vetter@ffwll.ch>

I haven't updated it in a while obviously since I thought that could 
wait until we'd worked our way through the RFC series and I could get to 
work on the first real patch series.

Is it possible for you to set up a local tree of your own with my 
baseline and my RFC patches on top or would you prefer it if I push my 
branch to drm-private?

Thanks,
Tomas

> Thanks, Daniel
>
> On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
>> [snip: full cover letter quoted earlier in the thread]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-23 10:47   ` Tomas Elf
@ 2015-06-23 11:38     ` Daniel Vetter
  2015-06-23 14:06       ` Tomas Elf
  0 siblings, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-23 11:38 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 23, 2015 at 11:47:16AM +0100, Tomas Elf wrote:
> On 23/06/2015 11:05, Daniel Vetter wrote:
> >Your patches don't apply cleanly any more and I can't find a suitable
> >baseline where they would. But I'd like to see it all in context to check
> >a few things.
> >
> >Can you pls push a git branch with these somewhere?
> >
> 
> Here's the baseline for my local tree:
> cd07637 -  drm-intel-nightly: 2015y-04m-13d-09h-46m-59s UTC integration
> manifest <daniel.vetter@ffwll.ch>

I don't have that baseline around here (any more at least). Happens
regularly with rebasing trees.

> I haven't updated it in a while obviously since I thought that could wait
> until we'd worked our way through the RFC series and I could get to work on
> the first real patch series.
> 
> Is it possible for you to set up a local tree of your own with my baseline
> and my RFC patches on top or would you prefer it if I push my branch to
> drm-private?

So yeah I need your branch ;-)
-Daniel
> 
> Thanks,
> Tomas
> 
> >Thanks, Daniel
> >
> >On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
> >> [snip: full cover letter quoted earlier in the thread]

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-23 11:38     ` Daniel Vetter
@ 2015-06-23 14:06       ` Tomas Elf
  2015-06-23 15:20         ` Daniel Vetter
  0 siblings, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-06-23 14:06 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX

On 23/06/2015 12:38, Daniel Vetter wrote:
> On Tue, Jun 23, 2015 at 11:47:16AM +0100, Tomas Elf wrote:
>> On 23/06/2015 11:05, Daniel Vetter wrote:
>>> Your patches don't apply cleanly any more and I can't find a suitable
>>> baseline where they would. But I'd like to see it all in context to check
>>> a few things.
>>>
>>> Can you pls push a git branch with these somewhere?
>>>
>>
>> Here's the baseline for my local tree:
>> cd07637 -  drm-intel-nightly: 2015y-04m-13d-09h-46m-59s UTC integration
>> manifest <daniel.vetter@ffwll.ch>
>
> I don't have that baseline around here (any more at least). Happens
> regularly with rebasing trees.
>
>> I haven't updated it in a while obviously since I thought that could wait
>> until we'd worked our way through the RFC series and I could get to work on
>> the first real patch series.
>>
>> Is it possible for you to set up a local tree of your own with my baseline
>> and my RFC patches on top or would you prefer it if I push my branch to
>> drm-private?
>
> So yeah I need your branch ;-)

I pushed my branch, 
20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1, 
to drm-private:

https://git-amr-2.devtools.intel.com/gerrit/gitweb?p=otc_gen_graphics-drm-private.git;a=shortlog;h=refs/heads/20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1

Will that work or do you need something else?

Thanks,
Tomas

> -Daniel
>>
>> Thanks,
>> Tomas
>>
>>> Thanks, Daniel
>>>
>>> On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
>>>> [snip: full cover letter quoted earlier in the thread]


* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-23 14:06       ` Tomas Elf
@ 2015-06-23 15:20         ` Daniel Vetter
  2015-06-23 15:35           ` Daniel Vetter
  0 siblings, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-23 15:20 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Tue, Jun 23, 2015 at 03:06:49PM +0100, Tomas Elf wrote:
> On 23/06/2015 12:38, Daniel Vetter wrote:
> >On Tue, Jun 23, 2015 at 11:47:16AM +0100, Tomas Elf wrote:
> >>On 23/06/2015 11:05, Daniel Vetter wrote:
> >>>Your patches don't apply cleanly any more and I can't find a suitable
> >>>baseline where they would. But I'd like to see it all in context to check
> >>>a few things.
> >>>
> >>>Can you pls push a git branch with these somewhere?
> >>>
> >>
> >>Here's the baseline for my local tree:
> >>cd07637 -  drm-intel-nightly: 2015y-04m-13d-09h-46m-59s UTC integration
> >>manifest <daniel.vetter@ffwll.ch>
> >
> >I don't have that baseline around here (any more at least). Happens
> >regularly with rebasing trees.
> >
> >>I haven't updated it in a while obviously since I thought that could wait
> >>until we'd worked our way through the RFC series and I could get to work on
> >>the first real patch series.
> >>
> >>Is it possible for you to set up a local tree of your own with my baseline
> >>and my RFC patches on top or would you prefer it if I push my branch to
> >>drm-private?
> >
> >So yeah I need your branch ;-)
> 
> I pushed my branch, 20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1,
> to drm-private :
> 
> https://git-amr-2.devtools.intel.com/gerrit/gitweb?p=otc_gen_graphics-drm-private.git;a=shortlog;h=refs/heads/20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1

For public stuff a public host is preferred, since then I don't have to
needle it through the firewall, which tends to be a pain and all that.
Personally I just have a priv remote pointing to something public, and
whenever I need to upload something for someone I do a

$ git push priv +HEAD:for-$person
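
(Setting up that remote is a one-time

$ git remote add priv <some-public-url>

with whatever public hosting you have; the URL is just a placeholder.)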

The point of all this is that I can do a

$ git fetch $url $branch
$ git checkout FETCH_HEAD

and look at the tree with the full power of local editors. I'll try and
see whether I can coax gerrit into cooperation ...
-Daniel

> 
> Will that work or do you need something else?
> 
> Thanks,
> Tomas
> 
> >-Daniel
> >>
> >>Thanks,
> >>Tomas
> >>
> >>>Thanks, Daniel
> >>>
> >>>On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
> >>>> [...]

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-23 15:20         ` Daniel Vetter
@ 2015-06-23 15:35           ` Daniel Vetter
  2015-06-25 10:38             ` Tomas Elf
  0 siblings, 1 reply; 59+ messages in thread
From: Daniel Vetter @ 2015-06-23 15:35 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX@Lists.FreeDesktop.Org

On Tue, Jun 23, 2015 at 5:20 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> I pushed my branch, 20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1,
>> to drm-private :
>>
>> https://git-amr-2.devtools.intel.com/gerrit/gitweb?p=otc_gen_graphics-drm-private.git;a=shortlog;h=refs/heads/20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1

I get a 404 not found on this. This is Windows on vpn ofc.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-23 15:35           ` Daniel Vetter
@ 2015-06-25 10:38             ` Tomas Elf
  0 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-06-25 10:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-GFX@Lists.FreeDesktop.Org

On 23/06/2015 16:35, Daniel Vetter wrote:
> On Tue, Jun 23, 2015 at 5:20 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>>> I pushed my branch, 20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1,
>>> to drm-private :
>>>
>>> https://git-amr-2.devtools.intel.com/gerrit/gitweb?p=otc_gen_graphics-drm-private.git;a=shortlog;h=refs/heads/20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1
>
> I get a 404 not found on this. This is Windows on vpn ofc.
> -Daniel
>

Can you see this one? :
https://github.com/telf/TDR_watchdog_RFC_1/commits/20150608_TDR_upstream_adaptation_single-thread_hangchecking_RFC_delivery_sendmail_1

It's public with default permissions on github. Check the commits from 
the top down to "drm/i915: Early exit from semaphore_waits_for for 
execlist mode" .

Thanks,
Tomas

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (11 preceding siblings ...)
  2015-06-23 10:05 ` [RFC 00/11] TDR/watchdog timeout support for gen8 Daniel Vetter
@ 2015-07-03 11:15 ` Mika Kuoppala
  2015-07-03 17:41   ` Tomas Elf
  2015-07-09 18:47 ` Chris Wilson
  13 siblings, 1 reply; 59+ messages in thread
From: Mika Kuoppala @ 2015-07-03 11:15 UTC (permalink / raw)
  To: Tomas Elf, Intel-GFX

Tomas Elf <tomas.elf@intel.com> writes:

> [...]
>
> This feature adds a new hang recovery path alongside the legacy GPU reset path,
> which takes care of engine recovery only. Aside from adding support for
> per-engine recovery this feature also introduces rules for how to promote a
> potential per-engine reset to a legacy, full GPU reset.
>

Have you considered promoting from a failed full GPU reset to a PCI-level reset?
It would require some amount of rethinking on init as it would kill the
display also. Over lunch Ville threw out the idea of pushing the device to
the D3(?) state to kill the power. That would get rid of the most stubborn
state (wrt the skl GPU reset problems).
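
A minimal sketch of that escalation, assuming the existing reset entry
points (error handling plus display teardown and re-init are omitted):

	/* Illustrative only: if the full GPU reset fails, escalate to a
	 * PCI function-level reset, which also takes out the display. */
	if (i915_reset(dev)) {
		DRM_ERROR("GPU reset failed, escalating to FLR\n");
		pci_reset_function(dev->pdev);
		/* a full driver re-init would have to follow here */
	}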

-Mika

> [...]

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-07-03 11:15 ` Mika Kuoppala
@ 2015-07-03 17:41   ` Tomas Elf
  0 siblings, 0 replies; 59+ messages in thread
From: Tomas Elf @ 2015-07-03 17:41 UTC (permalink / raw)
  To: Mika Kuoppala, Intel-GFX

On 03/07/2015 12:15, Mika Kuoppala wrote:
> Tomas Elf <tomas.elf@intel.com> writes:
>
>> [...]
>>
>
> Have you considered promoting from a failed full GPU reset to a PCI-level reset?
> It would require some amount of rethinking on init as it would kill the
> display also. Over lunch Ville threw out the idea of pushing the device to
> the D3(?) state to kill the power. That would get rid of the most stubborn
> state (wrt the skl GPU reset problems).

I haven't considered this but it definitely sounds interesting if we can 
get something like that to work. I know that Dave G. has been talking 
about something like this. Having four levels of hang recovery with the 
final one being actual power cycling would be truly awesome :). You 
might want to talk to the hardware guys about this first, though, unless 
you are fully confident that going beyond full GPU reset and actually 
cutting power and then bringing it up again wouldn't cause any weird 
effects. I reckon this might not exactly be in line with the intended 
usage model that the hardware engineers had in mind when they came up 
with these things. But you know more about that than I do.

I'd like to see an RFC of this if anyone can flesh out these ideas.

Thanks,
Tomas

>
> -Mika
>
>> [...]


* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
                   ` (12 preceding siblings ...)
  2015-07-03 11:15 ` Mika Kuoppala
@ 2015-07-09 18:47 ` Chris Wilson
  2015-07-10 15:24   ` Tomas Elf
  13 siblings, 1 reply; 59+ messages in thread
From: Chris Wilson @ 2015-07-09 18:47 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
> This patch series introduces the following features:
> 
> * Feature 1: TDR (Timeout Detection and Recovery) for gen8 execlist mode.
> * Feature 2: Watchdog Timeout (a.k.a "media engine reset") for gen8.
> * Feature 3. Context Submission Status Consistency checking

The high level design is reasonable and conceptually extends the current
system in fairly obvious ways.

In terms of discussing the implementation, I think this can be phased
as:

0. Move to a per-engine hangcheck

1. Add fake-interrupt recovery for CSSC

  I think this can be done without changing the hangcheck heuristics at
  all - we just need to think carefully about recovery (which is a nice
  precursor to per-engine reset). I may be wrong, and so would like to
  be told so early on! If the fake interrupt recovery is done as part of
  the reset handler, we should have (one) fewer concurrency issues to
  worry about. (In a similar vein, I think we should move the missed
  interrupt handler for legacy out of hangcheck, partly to simplify some
  very confusing code and partly so that we have fewer deviations
  between legacy/execlists paths.) It also gets us thinking about the
  consistency detection and when it is viable to do a fake-interrupt and
  when we must do a full-reset (for example, we only want to
  fake-interrupt if the consistency check says the GPU is idle, and we
  definitely want to reset everything if the GPU is executing an alien
  context.)

  A test here would be to suspend the execlists irq and wait for the
  recovery. Cheekily we could punch the irq eir by root mmio and check
  hangcheck automagically recovers.
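
  A rough sketch of the fake-interrupt itself (the idle/pending check is
  a hypothetical helper; intel_lrc_irq_handler() is the existing
  execlists CSB handler):

	/* The engine looks idle but the driver still believes a context
	 * is in flight: act as if the lost context event arrived and let
	 * the CSB handler consume whatever the hardware wrote there. */
	if (engine_idle_but_context_pending(ring))	/* hypothetical */
		intel_lrc_irq_handler(ring);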

Whilst it would be nice to add the watchdog next (it is conceptually
quite simple and basically just a new hangcheck source with fancy
setup), fast hangcheck without soft reset makes for an easy DoS.

2. TDR

  Given that we have a consistency check and have begun to extend the reset
  path, we can implement a soft reset that only skips the hung request.
  (The devil is in the details: whilst the design here looked solid, I
  think the LRC recovery code could be simplified - I didn't feel
  another requeue path was required given that we need only pretend a
  request completed (fixing up the context image as required) and then
  use the normal unqueue.) There is also quite a bit of LRC cleanup on
  the lists which would be useful here.
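
  For illustration, "pretend a request completed" could boil down to
  something like the below; CTX_RING_HEAD is the existing gen8 context
  image offset, the rest is assumption:

	/* Advance the ring head in the context image past the hung
	 * request so the normal unqueue path resubmits from the next
	 * request in the queue. */
	reg_state[CTX_RING_HEAD + 1] = request->tail;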

  Lots of tests for concurrent engine utilisation, multiple contexts,
  etc and ensuring that a hang in one does not affect independent work
  (engines, batches, contexts).

3. Watchdog

  A new fast hangcheck source. So concurrency, promotion (relative
  scoring between watchdog / hangcheck) and the issue of not programming
  the ring correctly (beware the interrupt after performing a dispatch
  and programming the tail, needs either reserved space, inlining into
  the dispatch etc).
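
  As a sketch of the "inlining into the dispatch" option (the watchdog
  helpers and the dword count are assumptions; the real MI sequence is
  gen-specific):

	/* Reserve everything in one intel_ring_begin() so the watchdog
	 * cannot fire against a half-programmed ring. */
	ret = intel_ring_begin(ring, 8);
	if (ret)
		return ret;
	emit_watchdog_enable(ring);		/* hypothetical helper */
	intel_ring_emit(ring, MI_BATCH_BUFFER_START); /* flags elided */
	intel_ring_emit(ring, offset);		/* batch offset */
	emit_watchdog_cancel(ring);		/* hypothetical helper */
	intel_ring_advance(ring);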

  The only thing of remark here is the uapi. It is a server feature
  (interactive clients are less likely to tolerate data being thrown
  away). Do we want an execbuf bit or context param? An execbuf bit
  allows switching between two watchdog timers (or on/off), but we
  have many more context params available than bits. Do we want to
  expose context params to set the fast/slow timeouts?

  We haven't found any GL spec that describes controllable watchdogs, so
  the ultimate uapi requirements are unknown. My feeling is that we want
  to do the initial uapi through context params, and start with a single
  fast watchdog timeout value. This is sufficient for testing and
  probably for most use cases. This can easily be extended by adding an
  execbuf bit to switch between two values and a context param to set
  the second value. (If the second value isn't set, the execbuf bit
  doesn't do anything; the watchdog is always programmed to the user
  value. If the second value is set (maybe infinite), then the execbuf
  bit is used to select which timeout to use for that batch.) But given
  that this is a server-esque feature, it is likely to be a setting the
  userspace driver imposes upon its clients, and there is unlikely to be
  the need to switch timeouts within any one client.
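
  Purely as a sketch of the context-param route (the param name and the
  unit of the value are made up, not existing ABI):

	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_WATCHDOG,	/* hypothetical */
		.value = 60 * 1000,	/* fast timeout, assumed to be us */
	};
	drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);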

  The tests here would focus on the uapi and ensuring that if the client
  asks for a 60ms hang detection, then all work packets take at most
  60ms to complete. Hmm, add wait_ioctl to the list of uapi that should
  report -EIO.

>  drivers/gpu/drm/i915/i915_debugfs.c     |  146 +++++-
>  drivers/gpu/drm/i915/i915_drv.c         |  201 ++++++++
>  drivers/gpu/drm/i915/intel_lrc.c        |  858 ++++++++++++++++++++++++++++++-

The balance here feels wrong ;-)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-07-09 18:47 ` Chris Wilson
@ 2015-07-10 15:24   ` Tomas Elf
  2015-07-10 15:48     ` Tomas Elf
  2015-07-11 18:22     ` Chris Wilson
  0 siblings, 2 replies; 59+ messages in thread
From: Tomas Elf @ 2015-07-10 15:24 UTC (permalink / raw)
  To: Chris Wilson, Intel-GFX

On 09/07/2015 19:47, Chris Wilson wrote:
> On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
>> This patch series introduces the following features:
>>
>> * Feature 1: TDR (Timeout Detection and Recovery) for gen8 execlist mode.
>> * Feature 2: Watchdog Timeout (a.k.a "media engine reset") for gen8.
>> * Feature 3. Context Submission Status Consistency checking
>
> The high level design is reasonable and conceptually extends the current
> system in fairly obvious ways.
>
> In terms of discussing the implementation, I think this can be phased
> as:
>
> 0. Move to a per-engine hangcheck
>
> 1. Add fake-interrupt recovery for CSSC
>
>    I think this can be done without changing the hangcheck heuristics at
>    all - we just need to think carefully about recovery (which is a nice
>    precursor to per-engine reset). I may be wrong, and so would like to
>    be told so early on! If the fake interrupt recovery is done as part of
>    the reset handler, we should have (one) fewer concurrency issues to
>    worry about.

Some points about moving the CSSC out of the hang checker and into the 
reset handler:

1. If we deal with consistency rectification in the reset handler, the
turnaround time becomes REALLY long:
		
	a. First you have the time to detect the hang and call
i915_handle_error(), which then raises the reset in progress flag,
preventing further submissions to the driver.

	b. Then go all the way to the per-engine recovery path, only to 
discover that we've got an inconsistency that has not been handled, fall 
back immediately with -EAGAIN and lower the reset in progress flag, let 
the system continue running and defer to the next hang check (another 
hang detection period).

	c. Once the hang has been detected AGAIN, raise the reset in progress 
flag AGAIN and go back to the engine reset path a second time.

	d. At the start of the engine reset path we do the second CSSC 
detection and realise that we've got a stable inconsistency that we can 
attempt to rectify. We can then try to rectify the inconsistency and go 
through with the engine reset... AFTER we've checked that the 
inconsistency rectification was indeed effective! If it's not and the 
problem remains then we have to fail the engine recovery mode and fall 
back to full GPU reset immediately... Which we could have done from the 
hang checker if we had just refused to schedule hang recovery and just 
let the context submission state inconsistency persist and let the hang 
score keep rising until the hang score reached 2*HUNG, which would then 
have triggered the full GPU reset fallback from the hang checker (see 
patch 9/11 for all of this).
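
For reference, the fallback referred to in (d) is roughly the following
(HUNG and the score field belong to the existing hang checker; treating
2*HUNG as the promotion threshold follows patch 9/11):

	/* Leave the inconsistency alone, let the engine's hang score
	 * keep climbing and, at twice the hung threshold, give up on
	 * per-engine recovery and ask for a full GPU reset instead. */
	if (ring->hangcheck.score >= 2 * HUNG)
		i915_handle_error(dev, true, "Unrecoverable engine state");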

As you can see, dealing with context submission state inconsistency in
the reset path is a very long-winded way of doing it and does not make
it any more reliable. Also, it's more complicated to analyse from a
concurrency point of view, since we need to fall back several times and
repeatedly raise and lower the reset in progress flag, alternately
allowing and blocking driver submissions. It basically becomes very
difficult to know what is going on.
	
2. Secondly, and more importantly: if a watchdog timeout is detected, we
end up in the per-engine hang recovery path, have to fall back due to an
inconsistent context submission state at that point, and the hang
checker is turned off, then we're irrecoverably hung. Watchdog timeout
is supposed to work without the periodic hang checker, but it won't if
CSSC is not ensured at all times. That is why I chose to override the
i915.enable_hangcheck flag so that the hang checker always runs
consistency pre-checking and reschedules itself while there is more work
pending. That way we do consistency checking asynchronously regardless
of everything else, so that if a watchdog timeout hits we have a
consistent state by the time it ends up in per-engine recovery.

Granted, if a watchdog timeout hits after we've first detected the 
inconsistency but not yet had time to rectify it, this doesn't work if 
the hang checker is turned off, and we cannot rely on periodic hang 
checking to schedule hang recovery in this case - so in that case we're 
still irrecoverably stuck. We could make a change here and do a one-time 
i915.enable_hangcheck override and schedule hang recovery following this 
point, if you think it's worth it.
	
Bottom line: The consistency checking must happen at all times and 
cannot be done as a consequence of a scheduled reset if hang checking is 
turned off at any point.
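
As a rough sketch of the kind of checker I mean (plain C model, every 
name below is made up for illustration - this is not the patch code):

    #include <stdbool.h>

    struct engine_state {
            bool work_pending;      /* requests still outstanding   */
            bool csb_inconsistent;  /* result of the CSSC pre-check */
    };

    static bool enable_hangcheck;   /* models i915.enable_hangcheck */

    /* Assumed helpers, stubbed out for illustration. */
    static void rectify_inconsistency(struct engine_state *e)
    {
            e->csb_inconsistent = false;   /* e.g. fake the lost irq */
    }
    static void score_engine_for_hangs(struct engine_state *e) { (void)e; }
    static void reschedule_worker(struct engine_state *e) { (void)e; }

    static void hangcheck_worker(struct engine_state *engine)
    {
            /* The consistency pre-check runs unconditionally so that
             * a watchdog timeout always finds a consistent state. */
            if (engine->csb_inconsistent)
                    rectify_inconsistency(engine);

            /* Only the periodic hang scoring honours the module flag. */
            if (enable_hangcheck)
                    score_engine_for_hangs(engine);

            /* Keep rescheduling while work is pending, even when
             * periodic hang checking itself is disabled. */
            if (engine->work_pending)
                    reschedule_worker(engine);
    }

The real thing runs as a delayed work item, of course; the point is 
just which parts are gated on the flag and which are not.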
	
As far as concurrency issues in the face of CSSC are concerned - 
disregarding the complication of handling CSSC in the recovery path and 
relying on deferring to the next hang detection, with all of the 
concurrency issues that entails - the question really is what kind of 
concurrency issues we're worried about. If the hang checker determines 
that we've got a hang then that's a stable state. If the hang checker 
consistency pre-check determines that we've got a sustained CSSC 
inconsistency then that's stable too. The states are not changing, so 
whatever we do will not be because we detected the state in the middle 
of a state transition, and the detection won't be subject to concurrency 
effects. If the hang checker decides that the inconsistency needs to be 
rectified and fakes the presumably lost interrupt, and the real, 
presumed lost, interrupt happens to come in at the same time, then 
that's fine: the CSB buffer check in the execlist interrupt handler is 
made to cope with that. We can have any number of calls to the interrupt 
handler or just one and the outcome is supposed to be the same - the 
only thing that matters is the captured context state changes in the CSB 
buffer that we act upon.
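
The CSB buffer check I'm relying on looks roughly like this 
(simplified model, names made up for illustration):

    #define CSB_ENTRIES 6   /* six status entries per engine on gen8 */

    struct csb_model {
            unsigned int events[CSB_ENTRIES];
            unsigned int write_ptr;        /* advanced by the hardware */
            unsigned int read_ptr;         /* advanced by the driver   */
    };

    static void handle_context_event(unsigned int event) { (void)event; }

    /* Safe to call any number of times per event: only entries between
     * read_ptr and write_ptr are acted upon, so a faked interrupt that
     * races with the real one does no harm. */
    static void process_csb(struct csb_model *csb)
    {
            while (csb->read_ptr != csb->write_ptr) {
                    handle_context_event(
                            csb->events[csb->read_ptr % CSB_ENTRIES]);
                    csb->read_ptr++;
            }
    }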

So I'm not entirely sure what concurrency issues might be reason enough 
to move the CSSC out to the hang recovery path. In fact, I'd be more 
inclined to create a second async task for it to make sure it's being 
run at all times. But in that case we might as well let it stay in the 
hang checker.
	
> (In a similar vein, I think we should move the missed
>    interrupt handler for legacy out of hangcheck, partly to simplify some
>    very confusing code and partly so that we have fewer deviations
>    between legacy/execlists paths.) It also gets us thinking about the
>    consistency detection and when it is viable to do a fake-interrupt and
>    when we must do a full-reset (for example, we only want to
>    fake-interrupt if the consistency check says the GPU is idle, and we
>    definitely want to reset everything if the GPU is executing an alien
>    context.)
>
>    A test here would be to suspend the execlists irq and wait for the
>    recovery. Cheekily we could punch the irq eir by root mmio and check
>    hangcheck automagically recovers.
>
> Whilst it would be nice to add the watchdog next, since it is
> conceptually quite simple and basically just a new hangcheck source with
> fancy setup - fast hangcheck without soft reset makes for an easy DoS.
>
> 2. TDR
>
>    Given that we have a consistency check and begun to extend the reset
>    path, we can implement a soft reset that only skips the hung request.
>    (The devil is in the details, whilst the design here looked solid, I
>    think the LRC recovery code could be simplified - I didn't feel
>    another requeue path was required given that we need only pretend a
>    request completed (fixing up the context image as required) and then
>    use the normal unqueue.) There is also quite a bit of LRC cleanup on
>    the lists which would be useful here.

As far as the new requeue (or resubmission) path is concerned, you might 
have a point here. The reason it's as involved as it is probably comes 
down to all the validation that takes place in the resubmission path. 
Meaning that once the resubmission happens at the end of the per-engine 
hang recovery path, we want to make extra sure that the context that 
gets resubmitted in the end (the head element of the queue at that point 
in time) is in fact the one that was passed down from the per-engine 
hang recovery path (the context at the head of the queue at the start of 
the hang recovery path), i.e. that the state of the queue didn't change 
during hang recovery. Maybe we're too paranoid here.
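
As a sketch of the kind of validation I mean (illustrative only, all 
names assumed):

    #include <errno.h>

    struct ctx_model { int id; };

    static void do_resubmit(struct ctx_model *ctx) { (void)ctx; }

    /* Returns 0 on success, -EAGAIN if the queue head changed during
     * recovery and the caller should back off rather than resubmit
     * the wrong context. */
    static int tdr_resubmit(struct ctx_model *queue_head,
                            struct ctx_model *expected_head)
    {
            if (queue_head != expected_head)
                    return -EAGAIN;

            do_resubmit(queue_head);
            return 0;
    }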

>
>    Lots of tests for concurrent engine utilisation, multiple contexts,
>    etc and ensuring that a hang in one does not affect independent work
>    (engines, batches, contexts).
>
> 3. Watchdog
>
>    A new fast hangcheck source. So concurrency, promotion (relative
>    scoring between watchdog / hangcheck) and issue with not programming
>    the ring correctly (beware the interrupt after performing a dispatch
>    and programming the tail, needs either reserved space, inlining into
>    the dispatch etc).
>
>    The only thing of remark here is the uapi. It is a server feature
>    (interactive clients are less likely to tolerate data being thrown
>    away). Do we want an execbuf bit or context param? An execbuf bit
>    allows switching between two watchdog timers (or on/off), but we
>    have many more context params available than bits. Do we want to
>    expose context params to set the fast/slow timeouts?
>
>    We haven't found any GL spec that describes controllable watchdogs, so
>    the ultimate uapi requirements are unknown. My feeling is that we want
>    to do the initial uapi through context params, and start with a single
>    fast watchdog timeout value. This is sufficient for testing and
>    probably for most use cases. This can easily be extended by adding an
>    execbuf bit to switch between two values and a context param to set
>    the second value. (If the second value isn't set, the execbuf bit
>    doesn't do anything; the watchdog is always programmed to the user
>    value. If the second value is set (maybe infinite), then the execbuf
>    bit is used to select which timeout to use for that batch.) But given
>    that this is a server-esque feature, it is likely to be a setting the
>    userspace driver imposes upon its clients, and there is unlikely to be
>    the need to switch timeouts within any one client.


This may or may not be true. We need to thrash out these details. As you 
said, the requirements are quite fuzzy at this point. In the end a good 
method might be to just get something in there in cooperation with ONE 
open source user, let all other users scream after it's gone in, and 
then extend the interface (without breaking ABI) to accommodate the 
other users. I've tried to drum up enthusiasm for this new feature but 
so far the response has not been overwhelming, so it's difficult to 
solve this chicken-and-egg problem without proper input from userland 
users.
	
If someone writes something in stone and tells me to implement exactly 
that then I'll do it, but so far there has been no convincing argument 
pointing to any particular design aside from the choice of default 
timeout, which was actually decided in collaboration with various 
userland groups and was not something we just made up by ourselves.
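
For the sake of argument, the context-param flavour could look 
something like this from userspace (entirely hypothetical: the param 
id below does not exist, only the context setparam ioctl itself is 
real):

    #include <stdint.h>
    #include <xf86drm.h>
    #include <i915_drm.h>

    /* Hypothetical param id - NOT part of the real uapi. */
    #define I915_CONTEXT_PARAM_WATCHDOG_US 0x7f

    static int set_watchdog(int fd, uint32_t ctx_id, uint64_t timeout_us)
    {
            struct drm_i915_gem_context_param p = {
                    .ctx_id = ctx_id,
                    .param  = I915_CONTEXT_PARAM_WATCHDOG_US,
                    .value  = timeout_us,  /* e.g. 60000 for a 60ms budget */
            };

            return drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
    }

An execbuf bit could later select between two such values without 
breaking this.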

>
>    The tests here would focus on the uapi and ensuring that if the client
>    asks for a 60ms hang detection, then all work packets take at most
>    60ms to complete. Hmm, add wait_ioctl to the list of uapi that should
>    report -EIO.
>
>>   drivers/gpu/drm/i915/i915_debugfs.c     |  146 +++++-
>>   drivers/gpu/drm/i915/i915_drv.c         |  201 ++++++++
>>   drivers/gpu/drm/i915/intel_lrc.c        |  858 ++++++++++++++++++++++++++++++-
>
> The balance here feels wrong ;-)

Once we get gen7 support in there, after this RFC, I can assure you that 
it will even out with regard to intel_ringbuffer.c and other files. The 
gen-agnostic TDR framework does focus a lot on i915_drv.c and 
i915_irq.c; intel_lrc.c is heavy because our principal implementation 
focuses on gen8 in execlist mode, which is localized in intel_lrc.c.

But, yeah, I get what you're saying ;). Just stating for the record.

Thanks,
Tomas

> -Chris
>



* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-07-10 15:24   ` Tomas Elf
@ 2015-07-10 15:48     ` Tomas Elf
  2015-07-11 18:15       ` Chris Wilson
  2015-07-11 18:22     ` Chris Wilson
  1 sibling, 1 reply; 59+ messages in thread
From: Tomas Elf @ 2015-07-10 15:48 UTC (permalink / raw)
  To: Chris Wilson, Intel-GFX

On 10/07/2015 16:24, Tomas Elf wrote:
> On 09/07/2015 19:47, Chris Wilson wrote:
>> On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
>>> This patch series introduces the following features:
>>>
>>> * Feature 1: TDR (Timeout Detection and Recovery) for gen8 execlist
>>> mode.
>>> * Feature 2: Watchdog Timeout (a.k.a "media engine reset") for gen8.
>>> * Feature 3. Context Submission Status Consistency checking
>>
>> The high level design is reasonable and conceptually extends the current
>> system in fairly obvious ways.
>>
>> In terms of discussing the implementation, I think this can be phased
>> as:
>>
>> 0. Move to a per-engine hangcheck
>>
>> 1. Add fake-interrupt recovery for CSSC
>>
>>    I think this can be done without changing the hangcheck heuristics at
>>    all - we just need to think carefully about recovery (which is a nice
>>    precursor to per-engine reset). I may be wrong, and so would like to
>>    be told so early on! If the fake interrupt recovery is done as part of
>>    the reset handler, we should have (one) fewer concurrency issues to
>>    worry about.
>
> Some points about moving the CSSC out of the hang checker and into the
> reset handler:
>
> 1. If we deal with consistency rectification in the reset handler the
> turnaround time becomes REALLY long:
>
>      a. First you have the time to detect the hang, call
> i915_handle_error() that then raises the reset in progress flag,
> preventing further submissions to the driver.
>
>      b. Then go all the way to the per-engine recovery path, only to
> discover that we've got an inconsistency that has not been handled, fall
> back immediately with -EAGAIN and lower the reset in progress flag, let
> the system continue running and defer to the next hang check (another
> hang detection period)
>
>      c. Once the hang has been detected AGAIN, raise the reset in
> progress flag AGAIN and go back to the engine reset path a second time.
>
>      d. At the start of the engine reset path we do the second CSSC
> detection and realise that we've got a stable inconsistency that we can
> attempt to rectify. We can then try to rectify the inconsistency and go
> through with the engine reset... AFTER we've checked that the
> inconsistency rectification was indeed effective! If it's not and the
> problem remains then we have to fail the engine recovery mode and fall
> back to full GPU reset immediately... Which we could have done from the
> hang checker if we had just refused to schedule hang recovery and just
> let the context submission state inconsistency persist and let the hang
> score keep rising until the hang score reached 2*HUNG, which would then
> have triggered the full GPU reset fallback from the hang checker (see
> patch 9/11 for all of this).
>
> As you can see, dealing with context submission state inconsistency in
> the reset path is a very long-winded way of doing it and does not make it
> more reliable. Also, it's more complicated to analyse from a concurrency
> point of view since we need to fall back several times and raise and
> lower the reset in progress flag, which allows driver submissions to
> happen vs. blocks submissions. It basically becomes very difficult to
> know what is going on.
>
> 2. Secondly, and more importantly, if a watchdog timeout is detected and
> we end up in the per-engine hang recovery path and have to fall back due
> to an inconsistent context submission state at that point and the hang
> checker is turned off then we're irrecoverably hung. Watchdog timeout is
> supposed to work without the periodic hang checker but it won't if CSSC
> is not ensured at all times. Which is why I chose to override the
> i915.enable_hangcheck flag to make sure that the hang checker always
> runs consistency pre-checking and reschedules itself if there is more
> work pending to make sure that as long as work is pending we do
> consistency checking asynchronously regardless of everything else so
> that if a watchdog timeout hits we have a consistent state once the
> watchdog timeout ends up in per-engine recovery.
>
> Granted, if a watchdog timeout hits after we've first detected the
> inconsistency but not yet had time to rectify it, it doesn't work if the
> hang checker is turned off and we cannot rely on periodic hang checking
> to schedule hang recovery in this case - so in that case we're still
> irrecoverably stuck. We could make a change here and do a one-time
> i915.enable_hangcheck override and schedule hang recovery following this
> point. If you think it's worth it.
>
> Bottom line: The consistency checking must happen at all times and
> cannot be done as a consequence of a scheduled reset if hang checking is
> turned off at any point.
>
> As far as concurrency issues in the face of CSSC are concerned,
> disregarding the complication of handling CSSC in the recovery path and
> relying on deferring to the next hang detection, with all of the
> concurrency issues that entails: The question really is what kind of
> concurrency issues we're worried about. If the hang checker determines
> that we've got a hang then that's a stable state. If the hang checker
> consistency pre-check determines that we've got a sustained CSSC
> inconsistency then that's stable too. The states are not changing so
> whatever we do will not be because we detect the state in the middle of
> a state transition and the detection won't be subject to concurrency
> effects. If the hang checker decides that the inconsistency needs to be
> rectified and fakes the presumably lost interrupt and the real, presumed
> lost, interrupt happens to come in at the same time then that's fine,
> the CSB buffer check in the execlist interrupt handler is made to cope
> with that. We can have X number of calls to the interrupt handler or
> just one, the outcome is supposed to be the same - the only thing that
> matters is captured context state changes in the CSB buffer that we act
> upon.
>
> So I'm not entirely sure what concurrency issues might be reason enough
> to move out the CSSC to the hang recovery path. In fact, I'd be more
> inclined to create a second async task for it to make sure it's being
> run at all times. But in that case we might as well let it stay in the
> hang checker.
>
>> (In a similar vein, I think we should move the missed
>>    interrupt handler for legacy out of hangcheck, partly to simplify some
>>    very confusing code and partly so that we have fewer deviations
>>    between legacy/execlists paths.) It also gets us thinking about the
>>    consistency detection and when it is viable to do a fake-interrupt and
>>    when we must do a full-reset (for example, we only want to
>>    fake-interrupt if the consistency check says the GPU is idle, and we
>>    definitely want to reset everything if the GPU is executing an alien
>>    context.)
>>
>>    A test here would be to suspend the execlists irq and wait for the
>>    recovery. Cheekily we could punch the irq eir by root mmio and check
>>    hangcheck automagically recovers.
>>
>> Whilst it would be nice to add the watchdog next, since it is
>> conceptually quite simple and basically just a new hangcheck source with
>> fancy setup - fast hangcheck without soft reset makes for an easy DoS.
>>
>> 2. TDR
>>
>>    Given that we have a consistency check and begun to extend the reset
>>    path, we can implement a soft reset that only skips the hung request.
>>    (The devil is in the details, whilst the design here looked solid, I
>>    think the LRC recovery code could be simplified - I didn't feel
>>    another requeue path was required given that we need only pretend a
>>    request completed (fixing up the context image as required) and then
>>    use the normal unqueue.) There is also quite a bit of LRC cleanup on
>>    the lists which would be useful here.
>
> As far as the new requeue (or resubmission) path is concerned, you might
> have a point here. The reason it's as involved as it is is probably
> mostly because of all the validation that takes place in the
> resubmission path. Meaning that once the resubmission happens at the end
> of the per-engine hang recovery path we want to make extra sure that the
> context that gets resubmitted in the end (the head element of the queue
> at that point in time) is in fact the one that was passed down from the
> per-engine hang recovery path (the context at the head of the queue at
> the start of the hang recovery path), so that the state of the queue
> didn't change during hang recovery. Maybe we're too paranoid here.
>

Ah, yes, there is one crucial difference between the normal 
execlists_context_unqueue() function and execlists_TDR_context_unqueue() 
which means that we cannot fully reuse the former for TDR-specific 
purposes.

When we resubmit the context via TDR it's important that we do not 
increment the elsp_submitted counter or otherwise treat the 
resubmission as a normal submission. The reason for this is that the 
hardware consistently refuses to send out any kind of interrupt 
acknowledging the context resubmission. If you just submit the context 
and increment elsp_submitted for that context, like you normally do, the 
interrupt handler will sit around forever waiting for an interrupt that 
will never come for the resubmission. It will wait for - and receive - 
the interrupt for the original submission that caused the original hang, 
and also the interrupt for the context completion following hang 
recovery. But it won't receive an interrupt for the resubmission.

Also, there are two cases here:

1. Only one context was in flight when the hang happened.

The head element has elsp_submitted > 0 and the second in line has 
elsp_submitted == 0. Normally, the _unqueue function would pick up any 
contexts that are pending and just submit them, but the TDR context 
resubmission function should resubmit exactly the context that was hung, 
and only that, to get it out of the way. In this case, if we submit both 
pending contexts to the hardware, this resubmission accidentally causes 
one resubmission (the hung context) and one first-time submission (the 
next context in line). Not incrementing elsp_submitted for the hung 
context is ok, but the hardware _will_ in fact send out an interrupt for 
the second context that was never submitted before. In that case the 
interrupt handler picks up one interrupt _too many_ compared to the 
respective elsp_submitted count. Therefore we need to avoid this case by 
not picking up the second context in line.

2. Both the head context and the second context in the execlist queue 
were in flight when the hang happened (both have elsp_submitted > 0).

In this case we need to resubmit both contexts and increment neither of 
their elsp_submitted counts. The reason is that if we only resubmit the 
head element and not the second one, the resubmission will clear the 
hung context and the interrupt for the original context submission (not 
the resubmission) will be received. The interrupt handler will then 
clear the hung context from the queue, see that there are contexts 
pending, and unqueue the next context in line - which was previously 
submitted at the time of the hang - thus submitting that context _a 
second time_. Doing so drives the elsp_submitted count up to 2 for that 
context, but we will never get more than one interrupt back for it - 
thus causing a hang in the interrupt handler.

Basically, when doing context resubmission following per-engine hang 
recovery, we need to restore the exact submission state that was in 
progress at the time of the hang and resubmit EXACTLY the contexts that 
were in flight at the time - and not touch the elsp_submitted values for 
any of them. That is absolutely not what the normal _unqueue function 
does, and therefore we cannot reuse it as is.

Granted, we could've passed a parameter to the normal _unqueue function 
to separate normal context unqueueing from TDR-specific unqueueing, but 
there really is not much overlap between the two cases. Also, seeing as 
there is value in the context validation we do in 
intel_execlists_TDR_context_queue(), we might as well have a separate 
path for TDR-specific context resubmission.
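
To put the two cases in code form, a rough model (names and shape made 
up for illustration, not the actual patch code):

    struct exec_ctx {
            int elsp_submitted;     /* outstanding ELSP submissions */
    };

    static void write_elsp(struct exec_ctx *ctx) { (void)ctx; }

    /*
     * TDR resubmission rule: resubmit exactly the contexts that were
     * in flight when the hang hit (elsp_submitted > 0) and do not
     * touch their counters, because the hardware sends no interrupt
     * for a resubmission.  A context that was never submitted (case 1,
     * second in line with elsp_submitted == 0) must be left alone, or
     * we would get one interrupt more than its count accounts for.
     */
    static void tdr_context_unqueue(struct exec_ctx *head,
                                    struct exec_ctx *next)
    {
            write_elsp(head);                       /* cases 1 and 2 */

            if (next && next->elsp_submitted > 0)
                    write_elsp(next);               /* case 2 only   */

            /* Note: no elsp_submitted++ anywhere, unlike the normal
             * execlists_context_unqueue() path. */
    }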

Does that make any sense?

Thanks,
Tomas

>>
>>    Lots of tests for concurrent engine utilisation, multiple contexts,
>>    etc and ensuring that a hang in one does not affect independent work
>>    (engines, batches, contexts).
>>
>> 3. Watchdog
>>
>>    A new fast hangcheck source. So concurrency, promotion (relative
>>    scoring between watchdog / hangcheck) and issue with not programming
>>    the ring correctly (beware the interrupt after performing a dispatch
>>    and programming the tail, needs either reserved space, inlining into
>>    the dispatch etc).
>>
>>    The only thing of remark here is the uapi. It is a server feature
>>    (interactive clients are less likely to tolerate data being thrown
>>    away). Do we want an execbuf bit or context param? An execbuf bit
>>    allows switching between two watchdog timers (or on/off), but we
>>    have many more context params available than bits. Do we want to
>>    expose context params to set the fast/slow timeouts?
>>
>>    We haven't found any GL spec that describes controllable watchdogs, so
>>    the ultimate uapi requirements are unknown. My feeling is that we want
>>    to do the initial uapi through context params, and start with a single
>>    fast watchdog timeout value. This is sufficient for testing and
>>    probably for most use cases. This can easily be extended by adding an
>>    execbuf bit to switch between two values and a context param to set
>>    the second value. (If the second value isn't set, the execbuf bit
>>    doesn't do anything; the watchdog is always programmed to the user
>>    value. If the second value is set (maybe infinite), then the execbuf
>>    bit is used to select which timeout to use for that batch.) But given
>>    that this is a server-esque feature, it is likely to be a setting the
>>    userspace driver imposes upon its clients, and there is unlikely to be
>>    the need to switch timeouts within any one client.
>
>
> This may or may not be true. We need to thrash out these details. As you
> said, the requirements are quite fuzzy at this point. In the end a good
> method might be to just get something in there in cooperation with ONE
> open source user and let all other users scream after it's gone in there
> and then extend the interface (without breaking ABI) to accommodate the
> other users. I've tried to drum up enthusiasm for this new feature but
> so far it's not been overwhelming so it's difficult to solve this
> chicken and egg problem without proper input from userland users.
>
> If someone writes something in stone and tells me to implement exactly
> that then I'll do it but so far there has been no convincing argument
> pointing to any particular design aside from the choice of default
> timeout, which was actually decided in collaboration with various
> userland groups and was nothing we just made up by ourselves.
>
>>
>>    The tests here would focus on the uapi and ensuring that if the client
>>    asks for a 60ms hang detection, then all work packets take at most
>>    60ms to complete. Hmm, add wait_ioctl to the list of uapi that should
>>    report -EIO.
>>
>>>   drivers/gpu/drm/i915/i915_debugfs.c     |  146 +++++-
>>>   drivers/gpu/drm/i915/i915_drv.c         |  201 ++++++++
>>>   drivers/gpu/drm/i915/intel_lrc.c        |  858
>>> ++++++++++++++++++++++++++++++-
>>
>> The balance here feels wrong ;-)
>
> Once we get gen7 support in there, after this RFC, I can assure you that
> it will even out in regards to intel_ringbuffer.c and other files. The
> gen agnostic TDR framework does focus a lot on i915_drv.c and
> i915_irq.c, intel_lrc.c is heavy because our principal implementation
> focuses on gen8 in execlist mode, which is localized in intel_lrc.c.
>
> But, yeah, I get what you're saying ;). Just stating for the record.
>
> Thanks,
> Tomas
>
>> -Chris
>>


* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-07-10 15:48     ` Tomas Elf
@ 2015-07-11 18:15       ` Chris Wilson
  0 siblings, 0 replies; 59+ messages in thread
From: Chris Wilson @ 2015-07-11 18:15 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Fri, Jul 10, 2015 at 04:48:53PM +0100, Tomas Elf wrote:
> On 10/07/2015 16:24, Tomas Elf wrote:
> >On 09/07/2015 19:47, Chris Wilson wrote:
> >>On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
> >>>This patch series introduces the following features:
> >>>
> >>>* Feature 1: TDR (Timeout Detection and Recovery) for gen8 execlist
> >>>mode.
> >>>* Feature 2: Watchdog Timeout (a.k.a "media engine reset") for gen8.
> >>>* Feature 3. Context Submission Status Consistency checking
> >>
> >>The high level design is reasonable and conceptually extends the current
> >>system in fairly obvious ways.
> >>
> >>In terms of discussing the implementation, I think this can be phased
> >>as:
> >>
> >>0. Move to a per-engine hangcheck
> >>
> >>1. Add fake-interrupt recovery for CSSC
> >>
> >>   I think this can be done without changing the hangcheck heuristics at
> >>   all - we just need to think carefully about recovery (which is a nice
> >>   precursor to per-engine reset). I may be wrong, and so would like to
> >>   be told so early on! If the fake interrupt recovery is done as part of
> >>   the reset handler, we should have (one) fewer concurrency issues to
> >>   worry about.
> >
> >Some points about moving the CSSC out of the hang checker and into the
> >reset handler:
> >
> >1. If we deal with consistency rectification in the reset handler the
> >turnaround time becomes REALLY long:
> >
> >     a. First you have the time to detect the hang, call
> >i915_handle_error() that then raises the reset in progress flag,
> >preventing further submissions to the driver.
> >
> >     b. Then go all the way to the per-engine recovery path, only to
> >discover that we've got an inconsistency that has not been handled, fall
> >back immediately with -EAGAIN and lower the reset in progress flag, let
> >the system continue running and defer to the next hang check (another
> >hang detection period)
> >
> >     c. Once the hang has been detected AGAIN, raise the reset in
> >progress flag AGAIN and go back to the engine reset path a second time.
> >
> >     d. At the start of the engine reset path we do the second CSSC
> >detection and realise that we've got a stable inconsistency that we can
> >attempt to rectify. We can then try to rectify the inconsistency and go
> >through with the engine reset... AFTER we've checked that the
> >inconsistency rectification was indeed effective! If it's not and the
> >problem remains then we have to fail the engine recovery mode and fall
> >back to full GPU reset immediately... Which we could have done from the
> >hang checker if we had just refused to schedule hang recovery and just
> >let the context submission state inconsistency persist and let the hang
> >score keep rising until the hang score reached 2*HUNG, which would then
> >have triggered the full GPU reset fallback from the hang checker (see
> >patch 9/11 for all of this).
> >
> >As you can see, dealing with context submission state inconsistency in
> >the reset path is a very long-winded way of doing it and does not make it
> >more reliable. Also, it's more complicated to analyse from a concurrency
> >point of view since we need to fall back several times and raise and
> >lower the reset in progress flag, which allows driver submissions to
> >happen vs. blocks submissions. It basically becomes very difficult to
> >know what is going on.
> >
> >2. Secondly, and more importantly, if a watchdog timeout is detected and
> >we end up in the per-engine hang recovery path and have to fall back due
> >to an inconsistent context submission state at that point and the hang
> >checker is turned off then we're irrecoverably hung. Watchdog timeout is
> >supposed to work without the periodic hang checker but it won't if CSSC
> >is not ensured at all times. Which is why I chose to override the
> >i915.enable_hangcheck flag to make sure that the hang checker always
> >runs consistency pre-checking and reschedules itself if there is more
> >work pending to make sure that as long as work is pending we do
> >consistency checking asynchronously regardless of everything else so
> >that if a watchdog timeout hits we have a consistent state once the
> >watchdog timeout ends up in per-engine recovery.
> >
> >Granted, if a watchdog timeout hits after we've first detected the
> >inconsistency but not yet had time to rectify it, it doesn't work if the
> >hang checker is turned off and we cannot rely on periodic hang checking
> >to schedule hang recovery in this case - so in that case we're still
> >irrecoverably stuck. We could make a change here and do a one-time
> >i915.enable_hangcheck override and schedule hang recovery following this
> >point. If you think it's worth it.
> >
> >Bottom line: The consistency checking must happen at all times and
> >cannot be done as a consequence of a scheduled reset if hang checking is
> >turned off at any point.
> >
> >As far as concurrency issues in the face of CSSC are concerned,
> >disregarding the complication of handling CSSC in the recovery path and
> >relying on deferring to the next hang detection, with all of the
> >concurrency issues that entails: The question really is what kind of
> >concurrency issues we're worried about. If the hang checker determines
> >that we've got a hang then that's a stable state. If the hang checker
> >consistency pre-check determines that we've got a sustained CSSC
> >inconsistency then that's stable too. The states are not changing so
> >whatever we do will not be because we detect the state in the middle of
> >a state transition and the detection won't be subject to concurrency
> >effects. If the hang checker decides that the inconsistency needs to be
> >rectified and fakes the presumably lost interrupt and the real, presumed
> >lost, interrupt happens to come in at the same time then that's fine,
> >the CSB buffer check in the execlist interrupt handler is made to cope
> >with that. We can have X number of calls to the interrupt handler or
> >just one, the outcome is supposed to be the same - the only thing that
> >matters is captured context state changes in the CSB buffer that we act
> >upon.
> >
> >So I'm not entirely sure what concurrency issues might be reason enough
> >to move out the CSSC to the hang recovery path. In fact, I'd be more
> >inclined to create a second async task for it to make sure it's being
> >run at all times. But in that case we might as well let it stay in the
> >hang checker.
> >
> >>(In a similar vein, I think we should move the missed
> >>   interrupt handler for legacy out of hangcheck, partly to simplify some
> >>   very confusing code and partly so that we have fewer deviations
> >>   between legacy/execlists paths.) It also gets us thinking about the
> >>   consistency detection and when it is viable to do a fake-interrupt and
> >>   when we must do a full-reset (for example, we only want to
> >>   fake-interrupt if the consistency check says the GPU is idle, and we
> >>   definitely want to reset everything if the GPU is executing an alien
> >>   context.)
> >>
> >>   A test here would be to suspend the execlists irq and wait for the
> >>   recovery. Cheekily we could punch the irq eir by root mmio and check
> >>   hangcheck automagically recovers.
> >>
> >>Whilst it would be nice to add the watchdog next, since it is
> >>conceptually quite simple and basically just a new hangcheck source with
> >>fancy setup - fast hangcheck without soft reset makes for an easy DoS.
> >>
> >>2. TDR
> >>
> >>   Given that we have a consistency check and begun to extend the reset
> >>   path, we can implement a soft reset that only skips the hung request.
> >>   (The devil is in the details, whilst the design here looked solid, I
> >>   think the LRC recovery code could be simplified - I didn't feel
> >>   another requeue path was required given that we need only pretend a
> >>   request completed (fixing up the context image as required) and then
> >>   use the normal unqueue.) There is also quite a bit of LRC cleanup on
> >>   the lists which would be useful here.
> >
> >As far as the new requeue (or resubmission) path is concerned, you might
> >have a point here. The reason it's as involved as it is is probably
> >mostly because of all the validation that takes place in the
> >resubmission path. Meaning that once the resubmission happens at the end
> >of the per-engine hang recovery path we want to make extra sure that the
> >context that gets resubmitted in the end (the head element of the queue
> >at that point in time) is in fact the one that was passed down from the
> >per-engine hang recovery path (the context at the head of the queue at
> >the start of the hang recovery path), so that the state of the queue
> >didn't change during hang recovery. Maybe we're too paranoid here.
> >
> 
> Ah, yes, there is one crucial difference between the normal
> execlists_context_unqueue() function and
> execlists_TDR_context_unqueue() that means that we cannot fully
> reuse the former one for TDR-specific purposes.
> 
> When we resubmit the context via TDR it's important that we do not
> increment the elsp_submitted counter or otherwise treat the
> resubmission as a normal submission. The reason for this is that the
> hardware consistently refuses to send out any kind of interrupt
> acknowledging the context resubmission. If you just submit the
> context and increment elsp_submitted for that context, like you
> normally do, the interrupt handler will sit around forever waiting
> for an interrupt that will never come for the resubmission. It will
> wait for - and receive - the interrupt for the original submission that
> caused the original hang and also the interrupt for the context
> completion following hang recovery. But it won't receive an
> interrupt for the resubmission.

But that's just an issue with the current poor design of execlists...

If we tracked exactly which request we submitted to each port, with a
unique ctx id tag for every submission (including subsumed ones, for
easier completion handling), you completely eliminate the need for
elsp_submitted. A TDR resubmit is then just that: a resubmit. (And the
code is much smaller and much simpler.)
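
For illustration, that shape is roughly (names made up):

    #include <stddef.h>

    struct request_model { unsigned int tag; };

    struct engine_model {
            /* Exactly what was last written to each ELSP port. */
            struct request_model *port[2];
    };

    /* Completion becomes a lookup by unique submission tag rather
     * than an elsp_submitted count, so a TDR resubmit is just writing
     * port[] to the hardware again, with no extra bookkeeping. */
    static void csb_complete(struct engine_model *engine, unsigned int tag)
    {
            if (engine->port[0] && engine->port[0]->tag == tag) {
                    engine->port[0] = engine->port[1];
                    engine->port[1] = NULL;
            }
    }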

So I don't think you have sufficiently reworked execlists to avoid
unnecessary duplication.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [RFC 00/11] TDR/watchdog timeout support for gen8
  2015-07-10 15:24   ` Tomas Elf
  2015-07-10 15:48     ` Tomas Elf
@ 2015-07-11 18:22     ` Chris Wilson
  1 sibling, 0 replies; 59+ messages in thread
From: Chris Wilson @ 2015-07-11 18:22 UTC (permalink / raw)
  To: Tomas Elf; +Cc: Intel-GFX

On Fri, Jul 10, 2015 at 04:24:33PM +0100, Tomas Elf wrote:
> On 09/07/2015 19:47, Chris Wilson wrote:
> >On Mon, Jun 08, 2015 at 06:03:18PM +0100, Tomas Elf wrote:
> >>This patch series introduces the following features:
> >>
> >>* Feature 1: TDR (Timeout Detection and Recovery) for gen8 execlist mode.
> >>* Feature 2: Watchdog Timeout (a.k.a "media engine reset") for gen8.
> >>* Feature 3. Context Submission Status Consistency checking
> >
> >The high level design is reasonable and conceptually extends the current
> >system in fairly obvious ways.
> >
> >In terms of discussing the implementation, I think this can be phased
> >as:
> >
> >0. Move to a per-engine hangcheck
> >
> >1. Add fake-interrupt recovery for CSSC
> >
> >   I think this can be done without changing the hangcheck heuristics at
> >   all - we just need to think carefully about recovery (which is a nice
> >   precursor to per-engine reset). I may be wrong, and so would like to
> >   be told so early on! If the fake interrupt recovery is done as part of
> >   the reset handler, we should have (one) fewer concurrency issues to
> >   worry about.
> 
> Some points about moving the CSSC out of the hang checker and into
> the reset handler:
> 
> 1. If we deal with consistency rectification in the reset handler
> the turnaround time becomes REALLY long:
> 		
> 	a. First you have the time to detect the hang, call
> i915_handle_error() that then raises the reset in progress flag,
> preventing further submissions to the driver.
> 
> 	b. Then go all the way to the per-engine recovery path, only to
> discover that we've got an inconsistency that has not been handled,
> fall back immediately with -EAGAIN and lower the reset in progress
> flag, let the system continue running and defer to the next hang
> check (another hang detection period)

Urm, there is no per-engine recovery at this phase. And even if there were,
why would you do that if you could detect the inconsistency and just
issue the fake interrupt? This is why I think phasing it in this way
makes the choices we can make during reset handling more obvious.

[snip]

> 2. Secondly, and more importantly, if a watchdog timeout is detected
> and we end up in the per-engine hang recovery path and have to fall
> back due to an inconsistent context submission state at that point
> and the hang checker is turned off then we're irrecoverably hung.
> Watchdog timeout is supposed to work without the periodic hang
> checker but it won't if CSSC is not ensured at all times. Which is
> why I chose to override the i915.enable_hangcheck flag to make sure
> that the hang checker always runs consistency pre-checking and
> reschedules itself if there is more work pending to make sure that
> as long as work is pending we do consistency checking asynchronously
> regardless of everything else so that if a watchdog timeout hits we
> have a consistent state once the watchdog timeout ends up in
> per-engine recovery.

Again, what? If the watchdog mechanism is just a faster way to detect
hangs and queue the reset task, what the reset does is then independent
of how we detect the hang.
 
> Granted, if a watchdog timeout hits after we've first detected the
> inconsistency but not yet had time to rectify it, it doesn't work if
> the hang checker is turned off and we cannot rely on periodic hang
> checking to schedule hang recovery in this case - so in that case
> we're still irrecoverably stuck. We could make change here and do a
> one-time i915.enable_hangcheck override and schedule hang recovery
> following this point. If you think it's worth it.
> 	
> Bottom line: The consistency checking must happen at all times and
> cannot be done as a consequence of a scheduled reset if hang
> checking is turned off at any point.

I strongly disagree. I think of this as two unique tasks: (a) detect a 
hang, and (b) do something about it. At the moment (b) isn't smart 
enough to work out the minimum it has to do in order to recover.
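
Roughly (illustrative only, all names made up):

    #include <stdbool.h>

    struct engine_m;        /* opaque for the sketch */

    /* Assumed helpers, stubbed out for illustration. */
    static bool csb_says_irq_was_lost(struct engine_m *e) { (void)e; return false; }
    static void fake_context_event_irq(struct engine_m *e) { (void)e; }
    static bool engine_unblocked(struct engine_m *e) { (void)e; return true; }
    static int  per_engine_reset(struct engine_m *e) { (void)e; return 0; }
    static void full_gpu_reset(struct engine_m *e) { (void)e; }

    /* (b) works out the minimum it has to do: cheapest recovery
     * first, escalating only if the engine is still blocked. */
    static void reset_task(struct engine_m *engine)
    {
            if (csb_says_irq_was_lost(engine)) {
                    fake_context_event_irq(engine);
                    if (engine_unblocked(engine))
                            return;
            }
            if (per_engine_reset(engine) == 0)
                    return;
            full_gpu_reset(engine);
    }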

Which is why I suggest you think about adding the simpler fake interrupt
into the reset path first, before you even think about doing TDR.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


end of thread

Thread overview: 59+ messages
2015-06-08 17:03 [RFC 00/11] TDR/watchdog timeout support for gen8 Tomas Elf
2015-06-08 17:03 ` [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode Tomas Elf
2015-06-08 17:36   ` Chris Wilson
2015-06-09 11:02     ` Tomas Elf
2015-06-16 13:44   ` Daniel Vetter
2015-06-16 15:46     ` Tomas Elf
2015-06-16 16:50       ` Chris Wilson
2015-06-16 17:07         ` Tomas Elf
2015-06-17 11:43       ` Daniel Vetter
2015-06-08 17:03 ` [RFC 02/11] drm/i915: Introduce uevent for full GPU reset Tomas Elf
2015-06-16 13:43   ` Daniel Vetter
2015-06-16 15:43     ` Tomas Elf
2015-06-16 16:55       ` Chris Wilson
2015-06-16 17:32         ` Tomas Elf
2015-06-16 19:33           ` Chris Wilson
2015-06-17 11:49             ` Daniel Vetter
2015-06-17 12:51               ` Chris Wilson
2015-06-08 17:03 ` [RFC 03/11] drm/i915: Add reset stats entry point for per-engine reset Tomas Elf
2015-06-08 17:33   ` Chris Wilson
2015-06-09 11:06     ` Tomas Elf
2015-06-16 13:48     ` Daniel Vetter
2015-06-16 13:54       ` Chris Wilson
2015-06-16 15:55         ` Daniel Vetter
2015-06-18 11:12         ` Dave Gordon
2015-06-11  9:14   ` Dave Gordon
2015-06-16 13:49   ` Daniel Vetter
2015-06-16 15:54     ` Tomas Elf
2015-06-17 11:51       ` Daniel Vetter
2015-06-08 17:03 ` [RFC 04/11] drm/i915: Adding TDR / per-engine reset support for gen8 Tomas Elf
2015-06-08 17:03 ` [RFC 05/11] drm/i915: Extending i915_gem_check_wedge to check engine reset in progress Tomas Elf
2015-06-08 17:24   ` Chris Wilson
2015-06-09 11:08     ` Tomas Elf
2015-06-09 11:11   ` Chris Wilson
2015-06-08 17:03 ` [RFC 06/11] drm/i915: Disable warnings for TDR interruptions in the display driver Tomas Elf
2015-06-08 17:53   ` Chris Wilson
2015-06-08 17:03 ` [RFC 07/11] drm/i915: Reinstate hang recovery work queue Tomas Elf
2015-06-08 17:03 ` [RFC 08/11] drm/i915: Watchdog timeout support for gen8 Tomas Elf
2015-06-08 17:03 ` [RFC 09/11] drm/i915: Fake lost context interrupts through forced CSB check Tomas Elf
2015-06-08 17:03 ` [RFC 10/11] drm/i915: Debugfs interface for per-engine hang recovery Tomas Elf
2015-06-08 17:45   ` Chris Wilson
2015-06-09 11:18     ` Tomas Elf
2015-06-09 12:27       ` Chris Wilson
2015-06-09 17:28         ` Tomas Elf
2015-06-11  9:32     ` Dave Gordon
2015-06-08 17:03 ` [RFC 11/11] drm/i915: TDR/watchdog trace points Tomas Elf
2015-06-23 10:05 ` [RFC 00/11] TDR/watchdog timeout support for gen8 Daniel Vetter
2015-06-23 10:47   ` Tomas Elf
2015-06-23 11:38     ` Daniel Vetter
2015-06-23 14:06       ` Tomas Elf
2015-06-23 15:20         ` Daniel Vetter
2015-06-23 15:35           ` Daniel Vetter
2015-06-25 10:38             ` Tomas Elf
2015-07-03 11:15 ` Mika Kuoppala
2015-07-03 17:41   ` Tomas Elf
2015-07-09 18:47 ` Chris Wilson
2015-07-10 15:24   ` Tomas Elf
2015-07-10 15:48     ` Tomas Elf
2015-07-11 18:15       ` Chris Wilson
2015-07-11 18:22     ` Chris Wilson
