All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
@ 2018-05-29 13:29 Chris Wilson
  2018-05-29 13:29 ` [PATCH 2/5] drm/i915: Switch to kernel context before idling at runtime Chris Wilson
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Chris Wilson @ 2018-05-29 13:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

Since we use i915_gem_find_active_request() from inside
intel_engine_dump() and may call that at any time, we do not guarantee
that the engine is paused nor that the signal kthreads and irq handler
are suspended, so we cannot assert that the breadcrumb doesn't advance
and that the irq hasn't happened on another CPU signaling the request we
believe to be idle.

Fixes: f636edb214a5 ("drm/i915: Make i915_engine_info pretty printer to standalone")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 05f44ca35a06..530d6d0109b4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2972,23 +2972,22 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
 	struct i915_request *request, *active = NULL;
 	unsigned long flags;
 
-	/* We are called by the error capture and reset at a random
-	 * point in time. In particular, note that neither is crucially
-	 * ordered with an interrupt. After a hang, the GPU is dead and we
-	 * assume that no more writes can happen (we waited long enough for
-	 * all writes that were in transaction to be flushed) - adding an
+	/*
+	 * We are called by the error capture, reset and to dump engine
+	 * state at random points in time. In particular, note that neither is
+	 * crucially ordered with an interrupt. After a hang, the GPU is dead
+	 * and we assume that no more writes can happen (we waited long enough
+	 * for all writes that were in transaction to be flushed) - adding an
 	 * extra delay for a recent interrupt is pointless. Hence, we do
 	 * not need an engine->irq_seqno_barrier() before the seqno reads.
+	 * At all other times, we must assume the GPU is still running, but
+	 * we only care about the snapshot of this moment.
 	 */
 	spin_lock_irqsave(&engine->timeline.lock, flags);
 	list_for_each_entry(request, &engine->timeline.requests, link) {
 		if (__i915_request_completed(request, request->global_seqno))
 			continue;
 
-		GEM_BUG_ON(request->engine != engine);
-		GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-				    &request->fence.flags));
-
 		active = request;
 		break;
 	}
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/5] drm/i915: Switch to kernel context before idling at runtime
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
@ 2018-05-29 13:29 ` Chris Wilson
  2018-05-29 13:29 ` [PATCH 3/5] drm/i915: "Race-to-idle" after switching to the kernel context Chris Wilson
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2018-05-29 13:29 UTC (permalink / raw)
  To: intel-gfx

We can reduce our exposure to random neutrinos by resting on the kernel
context having flushed out the user contexts to system memory and
beyond. The corollary is that we then we require two passes through the
idle handler to go to sleep, which on a truly idle system involves an
extra pass through the slow and irregular retire work handler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  8 ++++++--
 drivers/gpu/drm/i915/i915_gem.c     | 29 ++++++++++++++++++++++++-----
 2 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a8e7761cdc7d..594ee65a6c06 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4226,8 +4226,12 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_shrink_all(dev_priv);
 	fs_reclaim_release(GFP_KERNEL);
 
-	if (val & DROP_IDLE)
-		drain_delayed_work(&dev_priv->gt.idle_work);
+	if (val & DROP_IDLE) {
+		do {
+			flush_delayed_work(&dev_priv->gt.retire_work);
+			drain_delayed_work(&dev_priv->gt.idle_work);
+		} while (READ_ONCE(dev_priv->gt.awake));
+	}
 
 	if (val & DROP_FREED)
 		i915_gem_drain_freed_objects(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 530d6d0109b4..9eb93ca06309 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -139,6 +139,8 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 
 static u32 __i915_gem_park(struct drm_i915_private *i915)
 {
+	GEM_TRACE("\n");
+
 	lockdep_assert_held(&i915->drm.struct_mutex);
 	GEM_BUG_ON(i915->gt.active_requests);
 	GEM_BUG_ON(!list_empty(&i915->gt.active_rings));
@@ -193,6 +195,8 @@ void i915_gem_park(struct drm_i915_private *i915)
 
 void i915_gem_unpark(struct drm_i915_private *i915)
 {
+	GEM_TRACE("\n");
+
 	lockdep_assert_held(&i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915->gt.active_requests);
 
@@ -3503,6 +3507,21 @@ i915_gem_idle_work_handler(struct work_struct *work)
 	if (!READ_ONCE(dev_priv->gt.awake))
 		return;
 
+	/*
+	 * Flush out the last user context, leaving only the pinned
+	 * kernel context resident. When we are idling on the kernel_context,
+	 * no more new requests (with a context switch) are emitted and we
+	 * can finally rest. A consequence is that the idle work handler is
+	 * always called at least twice before idling (and if the system is
+	 * idle that implies a round trip through the retire worker).
+	 */
+	mutex_lock(&dev_priv->drm.struct_mutex);
+	i915_gem_switch_to_kernel_context(dev_priv);
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+
+	GEM_TRACE("active_requests=%d (after switch-to-kernel-context)\n",
+		  READ_ONCE(dev_priv->gt.active_requests));
+
 	/*
 	 * Wait for last execlists context complete, but bail out in case a
 	 * new request is submitted. As we don't trust the hardware, we
@@ -4913,11 +4932,9 @@ static void assert_kernel_context_is_current(struct drm_i915_private *i915)
 
 void i915_gem_sanitize(struct drm_i915_private *i915)
 {
-	if (i915_terminally_wedged(&i915->gpu_error)) {
-		mutex_lock(&i915->drm.struct_mutex);
+	mutex_lock(&i915->drm.struct_mutex);
+	if (i915_terminally_wedged(&i915->gpu_error))
 		i915_gem_unset_wedged(i915);
-		mutex_unlock(&i915->drm.struct_mutex);
-	}
 
 	/*
 	 * If we inherit context state from the BIOS or earlier occupants
@@ -4929,6 +4946,9 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
 	 */
 	if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
 		WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
+
+	i915_gem_contexts_lost(i915);
+	mutex_unlock(&i915->drm.struct_mutex);
 }
 
 int i915_gem_suspend(struct drm_i915_private *dev_priv)
@@ -4964,7 +4984,6 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 
 		assert_kernel_context_is_current(dev_priv);
 	}
-	i915_gem_contexts_lost(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
 
 	intel_uc_suspend(dev_priv);
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/5] drm/i915: "Race-to-idle" after switching to the kernel context
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
  2018-05-29 13:29 ` [PATCH 2/5] drm/i915: Switch to kernel context before idling at runtime Chris Wilson
@ 2018-05-29 13:29 ` Chris Wilson
  2018-05-29 13:29 ` [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends Chris Wilson
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2018-05-29 13:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

During suspend we want to flush out all active contexts and their
rendering. To do so we queue a request from the kernel's context, once
we know that request is done, we know the GPU is completely idle. To
speed up that switch bump the GPU clocks.

Switching to the kernel context prior to idling is also used to enforce
a barrier before changing OA properties, and when evicting active
rendering from the global GTT. All cases where we do want to
race-to-idle.

v2: Limit the boosting to only the switch before suspend.
v3: Limit it to the wait-for-idle on suspend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Tested-by: David Weinehall <david.weinehall@linux.intel.com> #v1
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c     | 27 +++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_request.h |  1 +
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9eb93ca06309..773a5910cc29 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3703,7 +3703,29 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 
 static int wait_for_timeline(struct i915_timeline *tl, unsigned int flags)
 {
-	return i915_gem_active_wait(&tl->last_request, flags);
+	struct i915_request *rq;
+	long ret;
+
+	rq = i915_gem_active_get_unlocked(&tl->last_request);
+	if (!rq)
+		return 0;
+
+	/*
+	 * "Race-to-idle".
+	 *
+	 * Switching to the kernel context is often used a synchronous
+	 * step prior to idling, e.g. in suspend for flushing all
+	 * current operations to memory before sleeping. These we
+	 * want to complete as quickly as possible to avoid prolonged
+	 * stalls, so allow the gpu to boost to maximum clocks.
+	 */
+	if (flags & I915_WAIT_FOR_IDLE_BOOST)
+		gen6_rps_boost(rq, NULL);
+
+	ret = i915_request_wait(rq, flags, MAX_SCHEDULE_TIMEOUT);
+	i915_request_put(rq);
+
+	return ret < 0 ? ret : 0;
 }
 
 static int wait_for_engines(struct drm_i915_private *i915)
@@ -4978,7 +5000,8 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 
 		ret = i915_gem_wait_for_idle(dev_priv,
 					     I915_WAIT_INTERRUPTIBLE |
-					     I915_WAIT_LOCKED);
+					     I915_WAIT_LOCKED |
+					     I915_WAIT_FOR_IDLE_BOOST);
 		if (ret && ret != -EIO)
 			goto err_unlock;
 
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 1bbbb7a9fa03..491ff81d0fea 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -267,6 +267,7 @@ long i915_request_wait(struct i915_request *rq,
 #define I915_WAIT_INTERRUPTIBLE	BIT(0)
 #define I915_WAIT_LOCKED	BIT(1) /* struct_mutex held, handle GPU reset */
 #define I915_WAIT_ALL		BIT(2) /* used by i915_gem_object_wait() */
+#define I915_WAIT_FOR_IDLE_BOOST BIT(3)
 
 static inline u32 intel_engine_get_seqno(struct intel_engine_cs *engine);
 
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
  2018-05-29 13:29 ` [PATCH 2/5] drm/i915: Switch to kernel context before idling at runtime Chris Wilson
  2018-05-29 13:29 ` [PATCH 3/5] drm/i915: "Race-to-idle" after switching to the kernel context Chris Wilson
@ 2018-05-29 13:29 ` Chris Wilson
  2018-05-31 13:11   ` kbuild test robot
  2018-05-31 14:34   ` kbuild test robot
  2018-05-29 13:29 ` [PATCH 5/5] drm/i915: Only sanitize GEM from late suspend Chris Wilson
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 18+ messages in thread
From: Chris Wilson @ 2018-05-29 13:29 UTC (permalink / raw)
  To: intel-gfx

As we reset the GPU on suspend/resume, we also do need to reset the
engine state tracking so call into the engine backends. This is
especially important so that we can also sanitize the state tracking
across resume.

References: https://bugs.freedesktop.org/show_bug.cgi?id=106702
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 773a5910cc29..75bdfafc97a2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4954,7 +4954,22 @@ static void assert_kernel_context_is_current(struct drm_i915_private *i915)
 
 void i915_gem_sanitize(struct drm_i915_private *i915)
 {
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	GEM_TRACE("\n");
+
 	mutex_lock(&i915->drm.struct_mutex);
+
+	intel_runtime_pm_get(i915);
+	intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
+
+	/*
+	 * As we have just resumed the machine and woken the device up from
+	 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
+	 * back to defaults, recovering from whatever wedged state we left it
+	 * in and so worth trying to use the device once more.
+	 */
 	if (i915_terminally_wedged(&i915->gpu_error))
 		i915_gem_unset_wedged(i915);
 
@@ -4969,6 +4984,15 @@ void i915_gem_sanitize(struct drm_i915_private *i915)
 	if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
 		WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
 
+	/* Reset the submission backend after resume as well as the GPU reset */
+	for_each_engine(engine, i915, id) {
+		if (engine->reset.reset)
+			engine->reset.reset(engine, NULL);
+	}
+
+	intel_uncore_forcewake_put(i915, FORCEWAKE_ALL);
+	intel_runtime_pm_put(i915);
+
 	i915_gem_contexts_lost(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 }
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/5] drm/i915: Only sanitize GEM from late suspend
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
                   ` (2 preceding siblings ...)
  2018-05-29 13:29 ` [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends Chris Wilson
@ 2018-05-29 13:29 ` Chris Wilson
  2018-05-30 16:56   ` Tvrtko Ursulin
  2018-05-29 13:47 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Patchwork
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2018-05-29 13:29 UTC (permalink / raw)
  To: intel-gfx

During testing we encounter a conflict between SUSPEND_TEST_DEVICES and
disabling reset (gem_eio/suspend). This results in the device continuing
on without being reset, but since it has gone through HW sanitization to
account for the suspend/resume cycle, we have to assume the device has
been reset to its defaults. A simple way around this is to skip the
sanitize phase for SUSPEND_TEST_DEVICES by moving it to suspend-late.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c |  6 +++++-
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c | 22 +++++++++++++---------
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 8f002ae22e62..0d9b8cc0436d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -636,6 +636,8 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
 
 static void i915_gem_fini(struct drm_i915_private *dev_priv)
 {
+	i915_gem_suspend_late(dev_priv);
+
 	/* Flush any outstanding unpin_work. */
 	i915_gem_drain_workqueue(dev_priv);
 
@@ -1611,7 +1613,6 @@ static int i915_drm_suspend(struct drm_device *dev)
 	opregion_target_state = suspend_to_idle(dev_priv) ? PCI_D1 : PCI_D3cold;
 	intel_opregion_notify_adapter(dev_priv, opregion_target_state);
 
-	intel_uncore_suspend(dev_priv);
 	intel_opregion_unregister(dev_priv);
 
 	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
@@ -1633,7 +1634,10 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation)
 
 	disable_rpm_wakeref_asserts(dev_priv);
 
+	i915_gem_suspend_late(dev_priv);
+
 	intel_display_set_init_power(dev_priv, false);
+	intel_uncore_suspend(dev_priv);
 
 	/*
 	 * In case of firmware assisted context save/restore don't manually
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 487922f88b76..4257ffc1bae1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3169,6 +3169,7 @@ void i915_gem_cleanup_engines(struct drm_i915_private *dev_priv);
 int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv,
 			   unsigned int flags);
 int __must_check i915_gem_suspend(struct drm_i915_private *dev_priv);
+void i915_gem_suspend_late(struct drm_i915_private *dev_priv);
 void i915_gem_resume(struct drm_i915_private *dev_priv);
 int i915_gem_fault(struct vm_fault *vmf);
 int i915_gem_object_wait(struct drm_i915_gem_object *obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 75bdfafc97a2..874ac1a8ec47 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5050,6 +5050,17 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	if (WARN_ON(!intel_engines_are_idle(dev_priv)))
 		i915_gem_set_wedged(dev_priv); /* no hope, discard everything */
 
+	intel_runtime_pm_put(dev_priv);
+	return 0;
+
+err_unlock:
+	mutex_unlock(&dev->struct_mutex);
+	intel_runtime_pm_put(dev_priv);
+	return ret;
+}
+
+void i915_gem_suspend_late(struct drm_i915_private *i915)
+{
 	/*
 	 * Neither the BIOS, ourselves or any other kernel
 	 * expects the system to be in execlists mode on startup,
@@ -5069,16 +5080,9 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	 * machines is a good idea, we don't - just in case it leaves the
 	 * machine in an unusable condition.
 	 */
-	intel_uc_sanitize(dev_priv);
-	i915_gem_sanitize(dev_priv);
 
-	intel_runtime_pm_put(dev_priv);
-	return 0;
-
-err_unlock:
-	mutex_unlock(&dev->struct_mutex);
-	intel_runtime_pm_put(dev_priv);
-	return ret;
+	intel_uc_sanitize(i915);
+	i915_gem_sanitize(i915);
 }
 
 void i915_gem_resume(struct drm_i915_private *i915)
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
                   ` (3 preceding siblings ...)
  2018-05-29 13:29 ` [PATCH 5/5] drm/i915: Only sanitize GEM from late suspend Chris Wilson
@ 2018-05-29 13:47 ` Patchwork
  2018-05-29 14:02 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2018-05-29 13:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
URL   : https://patchwork.freedesktop.org/series/43892/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Commit: drm/i915: Remove stale asserts from i915_gem_find_active_request()
Okay!

Commit: drm/i915: Switch to kernel context before idling at runtime
Okay!

Commit: drm/i915: "Race-to-idle" after switching to the kernel context
Okay!

Commit: drm/i915: After reset on sanitization, reset the engine backends
Okay!

Commit: drm/i915: Only sanitize GEM from late suspend
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3664:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3665:16: warning: expression using sizeof(void)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
                   ` (4 preceding siblings ...)
  2018-05-29 13:47 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Patchwork
@ 2018-05-29 14:02 ` Patchwork
  2018-05-29 17:53 ` ✗ Fi.CI.IGT: failure " Patchwork
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2018-05-29 14:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
URL   : https://patchwork.freedesktop.org/series/43892/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4253 -> Patchwork_9139 =

== Summary - WARNING ==

  Minor unknown changes coming with Patchwork_9139 need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9139, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/43892/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9139:

  === IGT changes ===

    ==== Warnings ====

    igt@gem_exec_gttfill@basic:
      fi-pnv-d510:        PASS -> SKIP

    
== Known issues ==

  Here are the changes found in Patchwork_9139 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@kms_flip@basic-plain-flip:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#106097)

    igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
      fi-cnl-y3:          NOTRUN -> DMESG-WARN (fdo#104951)

    
    ==== Possible fixes ====

    igt@debugfs_test@read_all_entries:
      fi-snb-2520m:       INCOMPLETE (fdo#103713) -> PASS

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-cnl-psr:         FAIL (fdo#100368) -> PASS

    igt@kms_frontbuffer_tracking@basic:
      fi-hsw-4200u:       DMESG-FAIL (fdo#106103, fdo#102614) -> PASS

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
  fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
  fdo#104951 https://bugs.freedesktop.org/show_bug.cgi?id=104951
  fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
  fdo#106103 https://bugs.freedesktop.org/show_bug.cgi?id=106103


== Participating hosts (44 -> 38) ==

  Additional (1): fi-cnl-y3 
  Missing    (7): fi-ilk-m540 fi-byt-j1900 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-cfl-u2 fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4253 -> Patchwork_9139

  CI_DRM_4253: d0a3423d398934e94a1924daea934b1578588336 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4499: f560ae5a464331f03f0a669ed46b8c9e56526187 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9139: 242530373b1e6f8869530924967851b8b9e50f71 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

242530373b1e drm/i915: Only sanitize GEM from late suspend
e9ad0c1a4120 drm/i915: After reset on sanitization, reset the engine backends
6d020d8df475 drm/i915: "Race-to-idle" after switching to the kernel context
b3f18b12420e drm/i915: Switch to kernel context before idling at runtime
74726d9e911a drm/i915: Remove stale asserts from i915_gem_find_active_request()

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9139/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
                   ` (5 preceding siblings ...)
  2018-05-29 14:02 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-05-29 17:53 ` Patchwork
  2018-05-29 17:57   ` Chris Wilson
  2018-05-30 10:43 ` [PATCH 1/5] " Tvrtko Ursulin
  2018-05-30 16:06 ` ✗ Fi.CI.BAT: failure for series starting with [1/5] " Patchwork
  8 siblings, 1 reply; 18+ messages in thread
From: Patchwork @ 2018-05-29 17:53 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
URL   : https://patchwork.freedesktop.org/series/43892/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4253_full -> Patchwork_9139_full =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_9139_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9139_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/43892/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9139_full:

  === IGT changes ===

    ==== Possible regressions ====

    igt@perf_pmu@busy-hang-rcs0:
      shard-kbl:          PASS -> FAIL +1
      shard-glk:          PASS -> FAIL +1

    igt@perf_pmu@busy-hang-vecs0:
      shard-apl:          PASS -> FAIL +2

    
    ==== Warnings ====

    igt@gem_exec_schedule@deep-bsd2:
      shard-kbl:          SKIP -> PASS +1

    igt@gem_mocs_settings@mocs-rc6-bsd1:
      shard-kbl:          PASS -> SKIP

    igt@kms_chv_cursor_fail@pipe-a-256x256-bottom-edge:
      shard-snb:          SKIP -> PASS

    
== Known issues ==

  Here are the changes found in Patchwork_9139_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_selftest@live_hangcheck:
      shard-apl:          PASS -> DMESG-FAIL (fdo#106560)

    igt@gem_eio@suspend:
      shard-snb:          PASS -> INCOMPLETE (fdo#105411)

    igt@kms_atomic_transition@1x-modeset-transitions-nonblocking:
      shard-glk:          PASS -> FAIL (fdo#105703)

    
    ==== Possible fixes ====

    igt@drv_selftest@live_gtt:
      shard-glk:          INCOMPLETE (fdo#103359, k.org#198133) -> PASS

    igt@kms_cursor_crc@cursor-256x256-suspend:
      shard-kbl:          INCOMPLETE (fdo#103665) -> PASS

    igt@kms_flip@modeset-vs-vblank-race-interruptible:
      shard-hsw:          FAIL (fdo#103060) -> PASS

    igt@kms_flip_tiling@flip-to-y-tiled:
      shard-glk:          FAIL (fdo#104724) -> PASS

    igt@kms_setmode@basic:
      shard-apl:          FAIL (fdo#99912) -> PASS

    igt@kms_vblank@pipe-b-accuracy-idle:
      shard-hsw:          FAIL (fdo#102583) -> PASS

    
  fdo#102583 https://bugs.freedesktop.org/show_bug.cgi?id=102583
  fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
  fdo#103359 https://bugs.freedesktop.org/show_bug.cgi?id=103359
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
  fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
  fdo#105703 https://bugs.freedesktop.org/show_bug.cgi?id=105703
  fdo#106560 https://bugs.freedesktop.org/show_bug.cgi?id=106560
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
  k.org#198133 https://bugzilla.kernel.org/show_bug.cgi?id=198133


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4253 -> Patchwork_9139

  CI_DRM_4253: d0a3423d398934e94a1924daea934b1578588336 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4499: f560ae5a464331f03f0a669ed46b8c9e56526187 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9139: 242530373b1e6f8869530924967851b8b9e50f71 @ git://anongit.freedesktop.org/gfx-ci/linux

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9139/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: ✗ Fi.CI.IGT: failure for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-29 17:53 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2018-05-29 17:57   ` Chris Wilson
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2018-05-29 17:57 UTC (permalink / raw)
  To: Patchwork; +Cc: intel-gfx

Quoting Patchwork (2018-05-29 18:53:25)
> == Series Details ==
> 
> Series: series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
> URL   : https://patchwork.freedesktop.org/series/43892/
> State : failure
> 
> == Summary ==
> 
> = CI Bug Log - changes from CI_DRM_4253_full -> Patchwork_9139_full =
> 
> == Summary - FAILURE ==
> 
>   Serious unknown changes coming with Patchwork_9139_full absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_9139_full, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: https://patchwork.freedesktop.org/api/1.0/series/43892/revisions/1/mbox/
> 
> == Possible new issues ==
> 
>   Here are the unknown changes that may have been introduced in Patchwork_9139_full:
> 
>   === IGT changes ===
> 
>     ==== Possible regressions ====
> 
>     igt@perf_pmu@busy-hang-rcs0:
>       shard-kbl:          PASS -> FAIL +1
>       shard-glk:          PASS -> FAIL +1
> 
>     igt@perf_pmu@busy-hang-vecs0:
>       shard-apl:          PASS -> FAIL +2

These are an issue in the igt not coordinating it's wait-until-idle with
the kernel. Simple patch on list.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
                   ` (6 preceding siblings ...)
  2018-05-29 17:53 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2018-05-30 10:43 ` Tvrtko Ursulin
  2018-05-30 10:55   ` Chris Wilson
  2018-05-30 16:06 ` ✗ Fi.CI.BAT: failure for series starting with [1/5] " Patchwork
  8 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2018-05-30 10:43 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Mika Kuoppala


On 29/05/2018 14:29, Chris Wilson wrote:
> Since we use i915_gem_find_active_request() from inside
> intel_engine_dump() and may call that at any time, we do not guarantee
> that the engine is paused nor that the signal kthreads and irq handler
> are suspended, so we cannot assert that the breadcrumb doesn't advance
> and that the irq hasn't happened on another CPU signaling the request we
> believe to be idle.
> 
> Fixes: f636edb214a5 ("drm/i915: Make i915_engine_info pretty printer to standalone")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 17 ++++++++---------
>   1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 05f44ca35a06..530d6d0109b4 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2972,23 +2972,22 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
>   	struct i915_request *request, *active = NULL;
>   	unsigned long flags;
>   
> -	/* We are called by the error capture and reset at a random
> -	 * point in time. In particular, note that neither is crucially
> -	 * ordered with an interrupt. After a hang, the GPU is dead and we
> -	 * assume that no more writes can happen (we waited long enough for
> -	 * all writes that were in transaction to be flushed) - adding an
> +	/*
> +	 * We are called by the error capture, reset and to dump engine
> +	 * state at random points in time. In particular, note that neither is
> +	 * crucially ordered with an interrupt. After a hang, the GPU is dead
> +	 * and we assume that no more writes can happen (we waited long enough
> +	 * for all writes that were in transaction to be flushed) - adding an
>   	 * extra delay for a recent interrupt is pointless. Hence, we do
>   	 * not need an engine->irq_seqno_barrier() before the seqno reads.
> +	 * At all other times, we must assume the GPU is still running, but
> +	 * we only care about the snapshot of this moment.
>   	 */
>   	spin_lock_irqsave(&engine->timeline.lock, flags);
>   	list_for_each_entry(request, &engine->timeline.requests, link) {
>   		if (__i915_request_completed(request, request->global_seqno))
>   			continue;
>   
> -		GEM_BUG_ON(request->engine != engine);

Removal of this one belongs to virtual engine perhaps?

Regards,

Tvrtko

> -		GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> -				    &request->fence.flags));
> -
>   		active = request;
>   		break;
>   	}
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-30 10:43 ` [PATCH 1/5] " Tvrtko Ursulin
@ 2018-05-30 10:55   ` Chris Wilson
  2018-05-30 11:01     ` Tvrtko Ursulin
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2018-05-30 10:55 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Mika Kuoppala

Quoting Tvrtko Ursulin (2018-05-30 11:43:16)
> 
> On 29/05/2018 14:29, Chris Wilson wrote:
> > Since we use i915_gem_find_active_request() from inside
> > intel_engine_dump() and may call that at any time, we do not guarantee
> > that the engine is paused nor that the signal kthreads and irq handler
> > are suspended, so we cannot assert that the breadcrumb doesn't advance
> > and that the irq hasn't happened on another CPU signaling the request we
> > believe to be idle.
> > 
> > Fixes: f636edb214a5 ("drm/i915: Make i915_engine_info pretty printer to standalone")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem.c | 17 ++++++++---------
> >   1 file changed, 8 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 05f44ca35a06..530d6d0109b4 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2972,23 +2972,22 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
> >       struct i915_request *request, *active = NULL;
> >       unsigned long flags;
> >   
> > -     /* We are called by the error capture and reset at a random
> > -      * point in time. In particular, note that neither is crucially
> > -      * ordered with an interrupt. After a hang, the GPU is dead and we
> > -      * assume that no more writes can happen (we waited long enough for
> > -      * all writes that were in transaction to be flushed) - adding an
> > +     /*
> > +      * We are called by the error capture, reset and to dump engine
> > +      * state at random points in time. In particular, note that neither is
> > +      * crucially ordered with an interrupt. After a hang, the GPU is dead
> > +      * and we assume that no more writes can happen (we waited long enough
> > +      * for all writes that were in transaction to be flushed) - adding an
> >        * extra delay for a recent interrupt is pointless. Hence, we do
> >        * not need an engine->irq_seqno_barrier() before the seqno reads.
> > +      * At all other times, we must assume the GPU is still running, but
> > +      * we only care about the snapshot of this moment.
> >        */
> >       spin_lock_irqsave(&engine->timeline.lock, flags);
> >       list_for_each_entry(request, &engine->timeline.requests, link) {
> >               if (__i915_request_completed(request, request->global_seqno))
> >                       continue;
> >   
> > -             GEM_BUG_ON(request->engine != engine);
> 
> Removal of this one belongs to virtual engine perhaps?

Nah, even with virtual engine; the engine timeline list is still true. I
was just thinking it was too random to check here (e.g. in the middle of
error capture) and that our more recent addition of checking during
retirement was a little more rigorous (and timely).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-30 10:55   ` Chris Wilson
@ 2018-05-30 11:01     ` Tvrtko Ursulin
  2018-05-30 11:14       ` Chris Wilson
  0 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2018-05-30 11:01 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Mika Kuoppala


On 30/05/2018 11:55, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-05-30 11:43:16)
>>
>> On 29/05/2018 14:29, Chris Wilson wrote:
>>> Since we use i915_gem_find_active_request() from inside
>>> intel_engine_dump() and may call that at any time, we do not guarantee
>>> that the engine is paused nor that the signal kthreads and irq handler
>>> are suspended, so we cannot assert that the breadcrumb doesn't advance
>>> and that the irq hasn't happened on another CPU signaling the request we
>>> believe to be idle.
>>>
>>> Fixes: f636edb214a5 ("drm/i915: Make i915_engine_info pretty printer to standalone")
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
>>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_gem.c | 17 ++++++++---------
>>>    1 file changed, 8 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>> index 05f44ca35a06..530d6d0109b4 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>> @@ -2972,23 +2972,22 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
>>>        struct i915_request *request, *active = NULL;
>>>        unsigned long flags;
>>>    
>>> -     /* We are called by the error capture and reset at a random
>>> -      * point in time. In particular, note that neither is crucially
>>> -      * ordered with an interrupt. After a hang, the GPU is dead and we
>>> -      * assume that no more writes can happen (we waited long enough for
>>> -      * all writes that were in transaction to be flushed) - adding an
>>> +     /*
>>> +      * We are called by the error capture, reset and to dump engine
>>> +      * state at random points in time. In particular, note that neither is
>>> +      * crucially ordered with an interrupt. After a hang, the GPU is dead
>>> +      * and we assume that no more writes can happen (we waited long enough
>>> +      * for all writes that were in transaction to be flushed) - adding an
>>>         * extra delay for a recent interrupt is pointless. Hence, we do
>>>         * not need an engine->irq_seqno_barrier() before the seqno reads.
>>> +      * At all other times, we must assume the GPU is still running, but
>>> +      * we only care about the snapshot of this moment.
>>>         */
>>>        spin_lock_irqsave(&engine->timeline.lock, flags);
>>>        list_for_each_entry(request, &engine->timeline.requests, link) {
>>>                if (__i915_request_completed(request, request->global_seqno))
>>>                        continue;
>>>    
>>> -             GEM_BUG_ON(request->engine != engine);
>>
>> Removal of this one belongs to virtual engine perhaps?
> 
> Nah, even with virtual engine; the engine timeline list is still true. I
> was just thinking it was too random to check here (e.g. in the middle of
> error capture) and that our more recent addition of checking during
> retirement was a little more rigorous (and timely).

It is a random check indeed. Commit message append: "While at it remove 
a random assert on..."

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-30 11:01     ` Tvrtko Ursulin
@ 2018-05-30 11:14       ` Chris Wilson
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2018-05-30 11:14 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Mika Kuoppala

Quoting Tvrtko Ursulin (2018-05-30 12:01:47)
> 
> On 30/05/2018 11:55, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-05-30 11:43:16)
> >>
> >> On 29/05/2018 14:29, Chris Wilson wrote:
> >>> Since we use i915_gem_find_active_request() from inside
> >>> intel_engine_dump() and may call that at any time, we do not guarantee
> >>> that the engine is paused nor that the signal kthreads and irq handler
> >>> are suspended, so we cannot assert that the breadcrumb doesn't advance
> >>> and that the irq hasn't happened on another CPU signaling the request we
> >>> believe to be idle.
> >>>
> >>> Fixes: f636edb214a5 ("drm/i915: Make i915_engine_info pretty printer to standalone")
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> >>> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_gem.c | 17 ++++++++---------
> >>>    1 file changed, 8 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>> index 05f44ca35a06..530d6d0109b4 100644
> >>> --- a/drivers/gpu/drm/i915/i915_gem.c
> >>> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >>> @@ -2972,23 +2972,22 @@ i915_gem_find_active_request(struct intel_engine_cs *engine)
> >>>        struct i915_request *request, *active = NULL;
> >>>        unsigned long flags;
> >>>    
> >>> -     /* We are called by the error capture and reset at a random
> >>> -      * point in time. In particular, note that neither is crucially
> >>> -      * ordered with an interrupt. After a hang, the GPU is dead and we
> >>> -      * assume that no more writes can happen (we waited long enough for
> >>> -      * all writes that were in transaction to be flushed) - adding an
> >>> +     /*
> >>> +      * We are called by the error capture, reset and to dump engine
> >>> +      * state at random points in time. In particular, note that neither is
> >>> +      * crucially ordered with an interrupt. After a hang, the GPU is dead
> >>> +      * and we assume that no more writes can happen (we waited long enough
> >>> +      * for all writes that were in transaction to be flushed) - adding an
> >>>         * extra delay for a recent interrupt is pointless. Hence, we do
> >>>         * not need an engine->irq_seqno_barrier() before the seqno reads.
> >>> +      * At all other times, we must assume the GPU is still running, but
> >>> +      * we only care about the snapshot of this moment.
> >>>         */
> >>>        spin_lock_irqsave(&engine->timeline.lock, flags);
> >>>        list_for_each_entry(request, &engine->timeline.requests, link) {
> >>>                if (__i915_request_completed(request, request->global_seqno))
> >>>                        continue;
> >>>    
> >>> -             GEM_BUG_ON(request->engine != engine);
> >>
> >> Removal of this one belongs to virtual engine perhaps?
> > 
> > Nah, even with virtual engine; the engine timeline list is still true. I
> > was just thinking it was too random to check here (e.g. in the middle of
> > error capture) and that our more recent addition of checking during
> > retirement was a little more rigorous (and timely).
> 
> It is a random check indeed. Commit message append: "While at it remove 
> a random assert on..."
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Pushed this patch by itself, just so I can have a new CI baseline :)
Thanks for the review,
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
  2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
                   ` (7 preceding siblings ...)
  2018-05-30 10:43 ` [PATCH 1/5] " Tvrtko Ursulin
@ 2018-05-30 16:06 ` Patchwork
  8 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2018-05-30 16:06 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request()
URL   : https://patchwork.freedesktop.org/series/43892/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4257 -> Patchwork_9150 =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_9150 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9150, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/43892/revisions/1/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9150:

  === IGT changes ===

    ==== Possible regressions ====

    igt@gem_exec_suspend@basic-s4-devices:
      fi-bdw-5557u:       PASS -> INCOMPLETE

    
== Known issues ==

  Here are the changes found in Patchwork_9150 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@gem_ctx_create@basic-files:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#105719) +1

    igt@gem_exec_suspend@basic-s4-devices:
      fi-bdw-gvtdvm:      PASS -> INCOMPLETE (fdo#105600)

    igt@kms_flip@basic-flip-vs-dpms:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#106000, fdo#105719)

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-glk-j4005:       PASS -> FAIL (fdo#100368)

    igt@kms_pipe_crc_basic@read-crc-pipe-c-frame-sequence:
      fi-skl-6700k2:      PASS -> FAIL (fdo#103191, fdo#104724)

    
    ==== Possible fixes ====

    igt@gem_mmap_gtt@basic-small-bo-tiledx:
      fi-gdg-551:         FAIL (fdo#102575) -> PASS

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-cnl-psr:         FAIL (fdo#100368) -> PASS

    igt@kms_frontbuffer_tracking@basic:
      fi-bsw-n3050:       INCOMPLETE (fdo#106729) -> PASS

    igt@kms_pipe_crc_basic@nonblocking-crc-pipe-c:
      fi-skl-guc:         FAIL (fdo#103191, fdo#104724) -> PASS

    igt@kms_pipe_crc_basic@read-crc-pipe-c:
      fi-glk-j4005:       DMESG-WARN (fdo#106000, fdo#106097) -> SKIP

    igt@prime_vgem@basic-fence-flip:
      fi-ilk-650:         FAIL (fdo#104008) -> PASS

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
  fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
  fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008
  fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
  fdo#105600 https://bugs.freedesktop.org/show_bug.cgi?id=105600
  fdo#105719 https://bugs.freedesktop.org/show_bug.cgi?id=105719
  fdo#106000 https://bugs.freedesktop.org/show_bug.cgi?id=106000
  fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
  fdo#106729 https://bugs.freedesktop.org/show_bug.cgi?id=106729


== Participating hosts (45 -> 38) ==

  Missing    (7): fi-ilk-m540 fi-cnl-y3 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-cfl-u2 fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4257 -> Patchwork_9150

  CI_DRM_4257: 8aac35d26057479982a346c0e9cd57c2e930b7e1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4501: 6796a604bab6df9c84af149e799902360afdd157 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9150: e40d1490ef1623e7d985bf6f68eb6d0c055566e8 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

e40d1490ef16 drm/i915: Only sanitize GEM from late suspend
d161f21ca6dd drm/i915: After reset on sanitization, reset the engine backends
87510967cc30 drm/i915: "Race-to-idle" after switching to the kernel context
047628f19723 drm/i915: Switch to kernel context before idling at runtime

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9150/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] drm/i915: Only sanitize GEM from late suspend
  2018-05-29 13:29 ` [PATCH 5/5] drm/i915: Only sanitize GEM from late suspend Chris Wilson
@ 2018-05-30 16:56   ` Tvrtko Ursulin
  2018-05-30 17:07     ` Chris Wilson
  0 siblings, 1 reply; 18+ messages in thread
From: Tvrtko Ursulin @ 2018-05-30 16:56 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 29/05/2018 14:29, Chris Wilson wrote:
> During testing we encounter a conflict between SUSPEND_TEST_DEVICES and
> disabling reset (gem_eio/suspend). This results in the device continuing
> on without being reset, but since it has gone through HW sanitization to

Has some test been failing because of this and since when? gem_eio/suspend?

> account for the suspend/resume cycle, we have to assume the device has
> been reset to its defaults. A simple way around this is to skip the
> sanitize phase for SUSPEND_TEST_DEVICES by moving it to suspend-late.

So suspend_late is not called when suspending via SUSPEND_TEST_DEVICES? 
Sounds weird to me that core allows a "half-way" suspend mode. But I am 
not familiar with that code so don't know.

Either way, if we skip it, that only skips the reset which is skipped 
anyway in gem_eio/suspend so I did not figure out the difference.

Regards,

Tvrtko

> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_drv.c |  6 +++++-
>   drivers/gpu/drm/i915/i915_drv.h |  1 +
>   drivers/gpu/drm/i915/i915_gem.c | 22 +++++++++++++---------
>   3 files changed, 19 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 8f002ae22e62..0d9b8cc0436d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -636,6 +636,8 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
>   
>   static void i915_gem_fini(struct drm_i915_private *dev_priv)
>   {
> +	i915_gem_suspend_late(dev_priv);
> +
>   	/* Flush any outstanding unpin_work. */
>   	i915_gem_drain_workqueue(dev_priv);
>   
> @@ -1611,7 +1613,6 @@ static int i915_drm_suspend(struct drm_device *dev)
>   	opregion_target_state = suspend_to_idle(dev_priv) ? PCI_D1 : PCI_D3cold;
>   	intel_opregion_notify_adapter(dev_priv, opregion_target_state);
>   
> -	intel_uncore_suspend(dev_priv);
>   	intel_opregion_unregister(dev_priv);
>   
>   	intel_fbdev_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
> @@ -1633,7 +1634,10 @@ static int i915_drm_suspend_late(struct drm_device *dev, bool hibernation)
>   
>   	disable_rpm_wakeref_asserts(dev_priv);
>   
> +	i915_gem_suspend_late(dev_priv);
> +
>   	intel_display_set_init_power(dev_priv, false);
> +	intel_uncore_suspend(dev_priv);
>   
>   	/*
>   	 * In case of firmware assisted context save/restore don't manually
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 487922f88b76..4257ffc1bae1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3169,6 +3169,7 @@ void i915_gem_cleanup_engines(struct drm_i915_private *dev_priv);
>   int i915_gem_wait_for_idle(struct drm_i915_private *dev_priv,
>   			   unsigned int flags);
>   int __must_check i915_gem_suspend(struct drm_i915_private *dev_priv);
> +void i915_gem_suspend_late(struct drm_i915_private *dev_priv);
>   void i915_gem_resume(struct drm_i915_private *dev_priv);
>   int i915_gem_fault(struct vm_fault *vmf);
>   int i915_gem_object_wait(struct drm_i915_gem_object *obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 75bdfafc97a2..874ac1a8ec47 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5050,6 +5050,17 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
>   	if (WARN_ON(!intel_engines_are_idle(dev_priv)))
>   		i915_gem_set_wedged(dev_priv); /* no hope, discard everything */
>   
> +	intel_runtime_pm_put(dev_priv);
> +	return 0;
> +
> +err_unlock:
> +	mutex_unlock(&dev->struct_mutex);
> +	intel_runtime_pm_put(dev_priv);
> +	return ret;
> +}
> +
> +void i915_gem_suspend_late(struct drm_i915_private *i915)
> +{
>   	/*
>   	 * Neither the BIOS, ourselves or any other kernel
>   	 * expects the system to be in execlists mode on startup,
> @@ -5069,16 +5080,9 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
>   	 * machines is a good idea, we don't - just in case it leaves the
>   	 * machine in an unusable condition.
>   	 */
> -	intel_uc_sanitize(dev_priv);
> -	i915_gem_sanitize(dev_priv);
>   
> -	intel_runtime_pm_put(dev_priv);
> -	return 0;
> -
> -err_unlock:
> -	mutex_unlock(&dev->struct_mutex);
> -	intel_runtime_pm_put(dev_priv);
> -	return ret;
> +	intel_uc_sanitize(i915);
> +	i915_gem_sanitize(i915);
>   }
>   
>   void i915_gem_resume(struct drm_i915_private *i915)
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 5/5] drm/i915: Only sanitize GEM from late suspend
  2018-05-30 16:56   ` Tvrtko Ursulin
@ 2018-05-30 17:07     ` Chris Wilson
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2018-05-30 17:07 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-05-30 17:56:20)
> 
> On 29/05/2018 14:29, Chris Wilson wrote:
> > During testing we encounter a conflict between SUSPEND_TEST_DEVICES and
> > disabling reset (gem_eio/suspend). This results in the device continuing
> > on without being reset, but since it has gone through HW sanitization to
> 
> Has some test been failing because of this and since when? gem_eio/suspend?

Yes. It will fail in the future because we'll remove all mmio from the
process_csb(); it fails right now because we lose track of what
interrupts we process across the suspend and may double handle CSB events.
E.g. https://bugs.freedesktop.org/show_bug.cgi?id=106702
 
> > account for the suspend/resume cycle, we have to assume the device has
> > been reset to its defaults. A simple way around this is to skip the
> > sanitize phase for SUSPEND_TEST_DEVICES by moving it to suspend-late.
> 
> So suspend_late is not called when suspending via SUSPEND_TEST_DEVICES? 
> Sounds weird to me that core allows a "half-way" suspend mode. But I am 
> not familiar with that code so don't know.

Yes, weird is one way to describe it.
 
> Either way, if we skip it, that only skips the reset which is skipped 
> anyway in gem_eio/suspend so I did not figure out the difference.

We then also skip the call to reset->reset which we need after a real
resume (when the device is powered up) and/or after the reset in
sanitization. It's fiddly.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends
  2018-05-29 13:29 ` [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends Chris Wilson
@ 2018-05-31 13:11   ` kbuild test robot
  2018-05-31 14:34   ` kbuild test robot
  1 sibling, 0 replies; 18+ messages in thread
From: kbuild test robot @ 2018-05-31 13:11 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3200 bytes --]

Hi Chris,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20180530]
[cannot apply to v4.17-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Remove-stale-asserts-from-i915_gem_find_active_request/20180531-202540
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x019-201821 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_sanitize':
>> drivers/gpu/drm/i915/i915_gem.c:5035:15: error: 'struct intel_engine_cs' has no member named 'reset'; did you mean 'reset_hw'?
      if (engine->reset.reset)
                  ^~~~~
                  reset_hw
   drivers/gpu/drm/i915/i915_gem.c:5036:12: error: 'struct intel_engine_cs' has no member named 'reset'; did you mean 'reset_hw'?
       engine->reset.reset(engine, NULL);
               ^~~~~
               reset_hw

vim +5035 drivers/gpu/drm/i915/i915_gem.c

  5000	
  5001	void i915_gem_sanitize(struct drm_i915_private *i915)
  5002	{
  5003		struct intel_engine_cs *engine;
  5004		enum intel_engine_id id;
  5005	
  5006		GEM_TRACE("\n");
  5007	
  5008		mutex_lock(&i915->drm.struct_mutex);
  5009	
  5010		intel_runtime_pm_get(i915);
  5011		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
  5012	
  5013		/*
  5014		 * As we have just resumed the machine and woken the device up from
  5015		 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
  5016		 * back to defaults, recovering from whatever wedged state we left it
  5017		 * in and so worth trying to use the device once more.
  5018		 */
  5019		if (i915_terminally_wedged(&i915->gpu_error))
  5020			i915_gem_unset_wedged(i915);
  5021	
  5022		/*
  5023		 * If we inherit context state from the BIOS or earlier occupants
  5024		 * of the GPU, the GPU may be in an inconsistent state when we
  5025		 * try to take over. The only way to remove the earlier state
  5026		 * is by resetting. However, resetting on earlier gen is tricky as
  5027		 * it may impact the display and we are uncertain about the stability
  5028		 * of the reset, so this could be applied to even earlier gen.
  5029		 */
  5030		if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
  5031			WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
  5032	
  5033		/* Reset the submission backend after resume as well as the GPU reset */
  5034		for_each_engine(engine, i915, id) {
> 5035			if (engine->reset.reset)
  5036				engine->reset.reset(engine, NULL);
  5037		}
  5038	
  5039		intel_uncore_forcewake_put(i915, FORCEWAKE_ALL);
  5040		intel_runtime_pm_put(i915);
  5041	
  5042		i915_gem_contexts_lost(i915);
  5043		mutex_unlock(&i915->drm.struct_mutex);
  5044	}
  5045	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30643 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends
  2018-05-29 13:29 ` [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends Chris Wilson
  2018-05-31 13:11   ` kbuild test robot
@ 2018-05-31 14:34   ` kbuild test robot
  1 sibling, 0 replies; 18+ messages in thread
From: kbuild test robot @ 2018-05-31 14:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4658 bytes --]

Hi Chris,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20180530]
[cannot apply to v4.17-rc7]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Remove-stale-asserts-from-i915_gem_find_active_request/20180531-202540
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x006-201821 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/linux/kernel.h:10:0,
                    from include/linux/list.h:9,
                    from include/linux/agp_backend.h:33,
                    from include/drm/drmP.h:35,
                    from drivers/gpu//drm/i915/i915_gem.c:28:
   drivers/gpu//drm/i915/i915_gem.c: In function 'i915_gem_sanitize':
   drivers/gpu//drm/i915/i915_gem.c:5035:15: error: 'struct intel_engine_cs' has no member named 'reset'; did you mean 'reset_hw'?
      if (engine->reset.reset)
                  ^
   include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                 ^~~~
>> drivers/gpu//drm/i915/i915_gem.c:5035:3: note: in expansion of macro 'if'
      if (engine->reset.reset)
      ^~
   drivers/gpu//drm/i915/i915_gem.c:5035:15: error: 'struct intel_engine_cs' has no member named 'reset'; did you mean 'reset_hw'?
      if (engine->reset.reset)
                  ^
   include/linux/compiler.h:58:42: note: in definition of macro '__trace_if'
     if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
                                             ^~~~
>> drivers/gpu//drm/i915/i915_gem.c:5035:3: note: in expansion of macro 'if'
      if (engine->reset.reset)
      ^~
   drivers/gpu//drm/i915/i915_gem.c:5035:15: error: 'struct intel_engine_cs' has no member named 'reset'; did you mean 'reset_hw'?
      if (engine->reset.reset)
                  ^
   include/linux/compiler.h:69:16: note: in definition of macro '__trace_if'
      ______r = !!(cond);     \
                   ^~~~
>> drivers/gpu//drm/i915/i915_gem.c:5035:3: note: in expansion of macro 'if'
      if (engine->reset.reset)
      ^~
   drivers/gpu//drm/i915/i915_gem.c:5036:12: error: 'struct intel_engine_cs' has no member named 'reset'; did you mean 'reset_hw'?
       engine->reset.reset(engine, NULL);
               ^~~~~
               reset_hw

vim +/if +5035 drivers/gpu//drm/i915/i915_gem.c

  5000	
  5001	void i915_gem_sanitize(struct drm_i915_private *i915)
  5002	{
  5003		struct intel_engine_cs *engine;
  5004		enum intel_engine_id id;
  5005	
  5006		GEM_TRACE("\n");
  5007	
  5008		mutex_lock(&i915->drm.struct_mutex);
  5009	
  5010		intel_runtime_pm_get(i915);
  5011		intel_uncore_forcewake_get(i915, FORCEWAKE_ALL);
  5012	
  5013		/*
  5014		 * As we have just resumed the machine and woken the device up from
  5015		 * deep PCI sleep (presumably D3_cold), assume the HW has been reset
  5016		 * back to defaults, recovering from whatever wedged state we left it
  5017		 * in and so worth trying to use the device once more.
  5018		 */
  5019		if (i915_terminally_wedged(&i915->gpu_error))
  5020			i915_gem_unset_wedged(i915);
  5021	
  5022		/*
  5023		 * If we inherit context state from the BIOS or earlier occupants
  5024		 * of the GPU, the GPU may be in an inconsistent state when we
  5025		 * try to take over. The only way to remove the earlier state
  5026		 * is by resetting. However, resetting on earlier gen is tricky as
  5027		 * it may impact the display and we are uncertain about the stability
  5028		 * of the reset, so this could be applied to even earlier gen.
  5029		 */
  5030		if (INTEL_GEN(i915) >= 5 && intel_has_gpu_reset(i915))
  5031			WARN_ON(intel_gpu_reset(i915, ALL_ENGINES));
  5032	
  5033		/* Reset the submission backend after resume as well as the GPU reset */
  5034		for_each_engine(engine, i915, id) {
> 5035			if (engine->reset.reset)
  5036				engine->reset.reset(engine, NULL);
  5037		}
  5038	
  5039		intel_uncore_forcewake_put(i915, FORCEWAKE_ALL);
  5040		intel_runtime_pm_put(i915);
  5041	
  5042		i915_gem_contexts_lost(i915);
  5043		mutex_unlock(&i915->drm.struct_mutex);
  5044	}
  5045	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24670 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-05-31 14:34 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-29 13:29 [PATCH 1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Chris Wilson
2018-05-29 13:29 ` [PATCH 2/5] drm/i915: Switch to kernel context before idling at runtime Chris Wilson
2018-05-29 13:29 ` [PATCH 3/5] drm/i915: "Race-to-idle" after switching to the kernel context Chris Wilson
2018-05-29 13:29 ` [PATCH 4/5] drm/i915: After reset on sanitization, reset the engine backends Chris Wilson
2018-05-31 13:11   ` kbuild test robot
2018-05-31 14:34   ` kbuild test robot
2018-05-29 13:29 ` [PATCH 5/5] drm/i915: Only sanitize GEM from late suspend Chris Wilson
2018-05-30 16:56   ` Tvrtko Ursulin
2018-05-30 17:07     ` Chris Wilson
2018-05-29 13:47 ` ✗ Fi.CI.SPARSE: warning for series starting with [1/5] drm/i915: Remove stale asserts from i915_gem_find_active_request() Patchwork
2018-05-29 14:02 ` ✓ Fi.CI.BAT: success " Patchwork
2018-05-29 17:53 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-05-29 17:57   ` Chris Wilson
2018-05-30 10:43 ` [PATCH 1/5] " Tvrtko Ursulin
2018-05-30 10:55   ` Chris Wilson
2018-05-30 11:01     ` Tvrtko Ursulin
2018-05-30 11:14       ` Chris Wilson
2018-05-30 16:06 ` ✗ Fi.CI.BAT: failure for series starting with [1/5] " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.