All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset
@ 2018-04-05 11:02 Chris Wilson
  2018-04-05 11:02 ` [PATCH 2/4] drm/i915: Always assume the GPU reset occurs Chris Wilson
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Chris Wilson @ 2018-04-05 11:02 UTC (permalink / raw)
  To: intel-gfx

If we skip the intel_prepare_reset(), we should also skip the
intel_display_reset(). If we we use a flag set by intel_prepare_reset()
then we do not have to second guess based on external user controlled
state whether or not the prepare was called before deciding to finish
it after the reset. igt/gem_eio is one such example that may tweak
i915.reset faster than the code is expecting, leading to

[  190.233528] =====================================
[  190.233534] WARNING: bad unlock balance detected!
[  190.233540] 4.16.0-rc7-g335ef9849310-drmtip_10+ #1 Tainted: G     U
[  190.233547] -------------------------------------
[  190.233553] gem_eio/1348 is trying to release lock (crtc_ww_class_acquire) at:
[  190.233569] [<ffffffff895c7810>] drm_modeset_acquire_fini+0x0/0x60
[  190.233575] but there are no more locks to release!
[  190.233580]
               other info that might help us debug this:
[  190.233588] 3 locks held by gem_eio/1348:
[  190.233592]  #0:  (&f->f_pos_lock){+.+.}, at: [<00000000ab90c784>] __fdget_pos+0x3a/0x50
[  190.233607]  #1:  (sb_writers#11){.+.+}, at: [<00000000e1529265>] vfs_write+0x188/0x1a0
[  190.233622]  #2:  (&attr->mutex){+.+.}, at: [<0000000011f40afe>] simple_attr_write+0x36/0xd0
[  190.233635]
               stack backtrace:
[  190.233644] CPU: 0 PID: 1348 Comm: gem_eio Tainted: G     U           4.16.0-rc7-g335ef9849310-drmtip_10+ #1
[  190.233655] Hardware name: Dell Inc.                 OptiPlex GX280               /0G8310, BIOS A04 02/09/2005
[  190.233664] Call Trace:
[  190.233674]  dump_stack+0x67/0x95
[  190.233682]  ? drm_modeset_backoff+0x1b0/0x1b0
[  190.233690]  print_unlock_imbalance_bug+0xd2/0xe0
[  190.233698]  ? drm_modeset_backoff+0x1b0/0x1b0
[  190.233704]  lock_release+0x23e/0x300
[  190.233712]  drm_modeset_acquire_fini+0x16/0x60
[  190.233835]  intel_finish_reset+0x72/0x160 [i915]
[  190.233894]  i915_reset_device+0x1e9/0x240 [i915]
[  190.233953]  ? __intel_get_crtc_scanline+0x1c0/0x1c0 [i915]
[  190.233962]  ? work_on_cpu_safe+0x50/0x50
[  190.234020]  i915_handle_error+0x1f2/0x470 [i915]
[  190.234031]  ? __might_fault+0x39/0x90
[  190.234037]  ? __might_fault+0x39/0x90
[  190.234099]  i915_wedged_set+0x7f/0xc0 [i915]
[  190.234107]  simple_attr_write+0xb0/0xd0
[  190.234117]  full_proxy_write+0x51/0x80
[  190.234125]  __vfs_write+0x21/0x140
[  190.234133]  ? rcu_read_lock_sched_held+0x6f/0x80
[  190.234140]  ? rcu_sync_lockdep_assert+0x29/0x50
[  190.234147]  ? __sb_start_write+0x152/0x1f0
[  190.234152]  ? __sb_start_write+0x168/0x1f0
[  190.234159]  vfs_write+0xbd/0x1a0
[  190.234166]  SyS_write+0x40/0xa0
[  190.234173]  ? do_syscall_64+0x19/0x1b0
[  190.234180]  do_syscall_64+0x6b/0x1b0
[  190.234188]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
[  190.234196] RIP: 0033:0x7f84c1b392b7
[  190.234201] RSP: 002b:00007f84b6755b00 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[  190.234211] RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f84c1b392b7
[  190.234218] RDX: 0000000000000002 RSI: 000055ec20abc8d6 RDI: 0000000000000046
[  190.234225] RBP: 000055ec20abc8d6 R08: 0000000000000000 R09: 0000000000000000
[  190.234231] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
[  190.234238] R13: 0000000000000000 R14: 00007f84b0000b20 R15: 000055ec20ce4eb8

Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 415fb8cf2cf4..9c6156216e5e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3677,7 +3677,6 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
 	struct drm_atomic_state *state;
 	int ret;
 
-
 	/* reset doesn't touch the display */
 	if (!i915_modparams.force_reset_modeset_test &&
 	    !gpu_reset_clobbers_display(dev_priv))
@@ -3731,19 +3730,17 @@ void intel_finish_reset(struct drm_i915_private *dev_priv)
 {
 	struct drm_device *dev = &dev_priv->drm;
 	struct drm_modeset_acquire_ctx *ctx = &dev_priv->reset_ctx;
-	struct drm_atomic_state *state = dev_priv->modeset_restore_state;
+	struct drm_atomic_state *state;
 	int ret;
 
 	/* reset doesn't touch the display */
-	if (!i915_modparams.force_reset_modeset_test &&
-	    !gpu_reset_clobbers_display(dev_priv))
+	if (!test_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags))
 		return;
 
+	state = fetch_and_zero(&dev_priv->modeset_restore_state);
 	if (!state)
 		goto unlock;
 
-	dev_priv->modeset_restore_state = NULL;
-
 	/* reset doesn't touch the display */
 	if (!gpu_reset_clobbers_display(dev_priv)) {
 		/* for testing only restore the display */
-- 
2.16.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] drm/i915: Always assume the GPU reset occurs
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
@ 2018-04-05 11:02 ` Chris Wilson
  2018-04-05 11:02 ` [PATCH 3/4] drm/i915: Split out parking from the idle worker for reuse Chris Wilson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2018-04-05 11:02 UTC (permalink / raw)
  To: intel-gfx

Don't trust the user controlled i915.reset, but assume the reset occured
and always clear the state beforehand and do a full recovery afterwards.

Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 9c6156216e5e..2168611c593b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3666,8 +3666,7 @@ __intel_display_resume(struct drm_device *dev,
 
 static bool gpu_reset_clobbers_display(struct drm_i915_private *dev_priv)
 {
-	return intel_has_gpu_reset(dev_priv) &&
-		INTEL_GEN(dev_priv) < 5 && !IS_G4X(dev_priv);
+	return INTEL_GEN(dev_priv) < 5 && !IS_G4X(dev_priv);
 }
 
 void intel_prepare_reset(struct drm_i915_private *dev_priv)
-- 
2.16.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] drm/i915: Split out parking from the idle worker for reuse
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
  2018-04-05 11:02 ` [PATCH 2/4] drm/i915: Always assume the GPU reset occurs Chris Wilson
@ 2018-04-05 11:02 ` Chris Wilson
  2018-04-05 11:32   ` Mika Kuoppala
  2018-04-05 11:02 ` [PATCH 4/4] drm/i915: Park before resetting the submission backend Chris Wilson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2018-04-05 11:02 UTC (permalink / raw)
  To: intel-gfx

We will want to park GEM before disengaging the drive^W^W^W unwedging.
Since we already do the work for idling, expose the guts as a new
function that we can then reuse.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 65 +++++++++++++++++++++++------------------
 1 file changed, 37 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9650a7b10c5f..e148db310ea6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -136,6 +136,42 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	return 0;
 }
 
+static u32 i915_gem_park(struct drm_i915_private *i915)
+{
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	GEM_BUG_ON(!i915->gt.awake);
+	GEM_BUG_ON(i915->gt.epoch == I915_EPOCH_INVALID);
+
+	/*
+	 * Be paranoid and flush a concurrent interrupt to make sure
+	 * we don't reactivate any irq tasklets after parking.
+	 *
+	 * FIXME: Note that even though we have waited for execlists to be idle,
+	 * there may still be an in-flight interrupt even though the CSB
+	 * is now empty. synchronize_irq() makes sure that a residual interrupt
+	 * is completed before we continue, but it doesn't prevent the HW from
+	 * raising a spurious interrupt later. To complete the shield we should
+	 * coordinate disabling the CS irq with flushing the interrupts.
+	 */
+	synchronize_irq(i915->drm.irq);
+
+	intel_engines_park(i915);
+	i915_gem_timelines_park(i915);
+
+	i915_pmu_gt_parked(i915);
+
+	i915->gt.awake = false;
+
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_idle(i915);
+
+	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ);
+
+	intel_runtime_pm_put(i915);
+
+	return i915->gt.epoch;
+}
+
 int
 i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
@@ -3496,36 +3532,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
 	if (new_requests_since_last_retire(dev_priv))
 		goto out_unlock;
 
-	/*
-	 * Be paranoid and flush a concurrent interrupt to make sure
-	 * we don't reactivate any irq tasklets after parking.
-	 *
-	 * FIXME: Note that even though we have waited for execlists to be idle,
-	 * there may still be an in-flight interrupt even though the CSB
-	 * is now empty. synchronize_irq() makes sure that a residual interrupt
-	 * is completed before we continue, but it doesn't prevent the HW from
-	 * raising a spurious interrupt later. To complete the shield we should
-	 * coordinate disabling the CS irq with flushing the interrupts.
-	 */
-	synchronize_irq(dev_priv->drm.irq);
-
-	intel_engines_park(dev_priv);
-	i915_gem_timelines_park(dev_priv);
+	epoch = i915_gem_park(dev_priv);
 
-	i915_pmu_gt_parked(dev_priv);
-
-	GEM_BUG_ON(!dev_priv->gt.awake);
-	dev_priv->gt.awake = false;
-	epoch = dev_priv->gt.epoch;
-	GEM_BUG_ON(epoch == I915_EPOCH_INVALID);
 	rearm_hangcheck = false;
-
-	if (INTEL_GEN(dev_priv) >= 6)
-		gen6_rps_idle(dev_priv);
-
-	intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
-
-	intel_runtime_pm_put(dev_priv);
 out_unlock:
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
-- 
2.16.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] drm/i915: Park before resetting the submission backend
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
  2018-04-05 11:02 ` [PATCH 2/4] drm/i915: Always assume the GPU reset occurs Chris Wilson
  2018-04-05 11:02 ` [PATCH 3/4] drm/i915: Split out parking from the idle worker for reuse Chris Wilson
@ 2018-04-05 11:02 ` Chris Wilson
  2018-04-05 11:54   ` Sagar Arun Kamble
  2018-04-05 11:10 ` [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Maarten Lankhorst
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2018-04-05 11:02 UTC (permalink / raw)
  To: intel-gfx

As different backends may have different park/unpark callbacks, we
should only ever switch backends (reset_default_submission on wedge
recovery, or on enabling the guc) while parked.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c             | 11 +++++++++++
 drivers/gpu/drm/i915/intel_engine_cs.c      |  3 +++
 drivers/gpu/drm/i915/intel_guc_submission.c |  1 +
 3 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e148db310ea6..e2880de2fc7e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	i915_retire_requests(i915);
 	GEM_BUG_ON(i915->gt.active_requests);
 
+	/*
+	 * Park before disengaging the old submit mechanism as different
+	 * backends may have different park/unpack callbacks.
+	 *
+	 * We are idle; the idle-worker will be queued, but we need to run
+	 * it now. As we already hold the struct mutex, we can get park
+	 * the GPU right away, letting the lazy worker see that we are
+	 * already active again by the time it acquires the mutex.
+	 */
+	i915_gem_park(i915);
+
 	/*
 	 * Undo nop_submit_request. We prevent all new i915 requests from
 	 * being queued (by disallowing execbuf whilst wedged) so having
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 12486d8f534b..b4ea77a2896c 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1651,6 +1651,9 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915)
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
+	/* Must be parked first! */
+	GEM_BUG_ON(i915->gt.awake);
+
 	for_each_engine(engine, i915, id)
 		engine->set_default_submission(engine);
 }
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 97121230656c..225fa3927a02 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -1243,6 +1243,7 @@ int intel_guc_submission_enable(struct intel_guc *guc)
 	/* Take over from manual control of ELSP (execlists) */
 	guc_interrupts_capture(dev_priv);
 
+	GEM_BUG_ON(dev_priv->gt.awake); /* Must be idle switching park/unpark */
 	for_each_engine(engine, dev_priv, id) {
 		struct intel_engine_execlists * const execlists =
 			&engine->execlists;
-- 
2.16.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
                   ` (2 preceding siblings ...)
  2018-04-05 11:02 ` [PATCH 4/4] drm/i915: Park before resetting the submission backend Chris Wilson
@ 2018-04-05 11:10 ` Maarten Lankhorst
  2018-04-05 16:19   ` Chris Wilson
  2018-04-05 11:11 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] " Patchwork
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Maarten Lankhorst @ 2018-04-05 11:10 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Op 05-04-18 om 13:02 schreef Chris Wilson:
> If we skip the intel_prepare_reset(), we should also skip the
> intel_display_reset(). If we we use a flag set by intel_prepare_reset()
> then we do not have to second guess based on external user controlled
> state whether or not the prepare was called before deciding to finish
> it after the reset. igt/gem_eio is one such example that may tweak
> i915.reset faster than the code is expecting, leading to
>
> [  190.233528] =====================================
> [  190.233534] WARNING: bad unlock balance detected!
> [  190.233540] 4.16.0-rc7-g335ef9849310-drmtip_10+ #1 Tainted: G     U
> [  190.233547] -------------------------------------
> [  190.233553] gem_eio/1348 is trying to release lock (crtc_ww_class_acquire) at:
> [  190.233569] [<ffffffff895c7810>] drm_modeset_acquire_fini+0x0/0x60
> [  190.233575] but there are no more locks to release!
> [  190.233580]
>                other info that might help us debug this:
> [  190.233588] 3 locks held by gem_eio/1348:
> [  190.233592]  #0:  (&f->f_pos_lock){+.+.}, at: [<00000000ab90c784>] __fdget_pos+0x3a/0x50
> [  190.233607]  #1:  (sb_writers#11){.+.+}, at: [<00000000e1529265>] vfs_write+0x188/0x1a0
> [  190.233622]  #2:  (&attr->mutex){+.+.}, at: [<0000000011f40afe>] simple_attr_write+0x36/0xd0
> [  190.233635]
>                stack backtrace:
> [  190.233644] CPU: 0 PID: 1348 Comm: gem_eio Tainted: G     U           4.16.0-rc7-g335ef9849310-drmtip_10+ #1
> [  190.233655] Hardware name: Dell Inc.                 OptiPlex GX280               /0G8310, BIOS A04 02/09/2005
> [  190.233664] Call Trace:
> [  190.233674]  dump_stack+0x67/0x95
> [  190.233682]  ? drm_modeset_backoff+0x1b0/0x1b0
> [  190.233690]  print_unlock_imbalance_bug+0xd2/0xe0
> [  190.233698]  ? drm_modeset_backoff+0x1b0/0x1b0
> [  190.233704]  lock_release+0x23e/0x300
> [  190.233712]  drm_modeset_acquire_fini+0x16/0x60
> [  190.233835]  intel_finish_reset+0x72/0x160 [i915]
> [  190.233894]  i915_reset_device+0x1e9/0x240 [i915]
> [  190.233953]  ? __intel_get_crtc_scanline+0x1c0/0x1c0 [i915]
> [  190.233962]  ? work_on_cpu_safe+0x50/0x50
> [  190.234020]  i915_handle_error+0x1f2/0x470 [i915]
> [  190.234031]  ? __might_fault+0x39/0x90
> [  190.234037]  ? __might_fault+0x39/0x90
> [  190.234099]  i915_wedged_set+0x7f/0xc0 [i915]
> [  190.234107]  simple_attr_write+0xb0/0xd0
> [  190.234117]  full_proxy_write+0x51/0x80
> [  190.234125]  __vfs_write+0x21/0x140
> [  190.234133]  ? rcu_read_lock_sched_held+0x6f/0x80
> [  190.234140]  ? rcu_sync_lockdep_assert+0x29/0x50
> [  190.234147]  ? __sb_start_write+0x152/0x1f0
> [  190.234152]  ? __sb_start_write+0x168/0x1f0
> [  190.234159]  vfs_write+0xbd/0x1a0
> [  190.234166]  SyS_write+0x40/0xa0
> [  190.234173]  ? do_syscall_64+0x19/0x1b0
> [  190.234180]  do_syscall_64+0x6b/0x1b0
> [  190.234188]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> [  190.234196] RIP: 0033:0x7f84c1b392b7
> [  190.234201] RSP: 002b:00007f84b6755b00 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
> [  190.234211] RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f84c1b392b7
> [  190.234218] RDX: 0000000000000002 RSI: 000055ec20abc8d6 RDI: 0000000000000046
> [  190.234225] RBP: 000055ec20abc8d6 R08: 0000000000000000 R09: 0000000000000000
> [  190.234231] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
> [  190.234238] R13: 0000000000000000 R14: 00007f84b0000b20 R15: 000055ec20ce4eb8
>
> Testcase: igt/gem_eio
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_display.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 415fb8cf2cf4..9c6156216e5e 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3677,7 +3677,6 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
>  	struct drm_atomic_state *state;
>  	int ret;
>  
> -
>  	/* reset doesn't touch the display */
>  	if (!i915_modparams.force_reset_modeset_test &&
>  	    !gpu_reset_clobbers_display(dev_priv))
> @@ -3731,19 +3730,17 @@ void intel_finish_reset(struct drm_i915_private *dev_priv)
>  {
>  	struct drm_device *dev = &dev_priv->drm;
>  	struct drm_modeset_acquire_ctx *ctx = &dev_priv->reset_ctx;
> -	struct drm_atomic_state *state = dev_priv->modeset_restore_state;
> +	struct drm_atomic_state *state;
>  	int ret;
>  
>  	/* reset doesn't touch the display */
> -	if (!i915_modparams.force_reset_modeset_test &&
> -	    !gpu_reset_clobbers_display(dev_priv))
> +	if (!test_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags))
>  		return;
>  
> +	state = fetch_and_zero(&dev_priv->modeset_restore_state);
>  	if (!state)
>  		goto unlock;
>  
> -	dev_priv->modeset_restore_state = NULL;
> -
>  	/* reset doesn't touch the display */
>  	if (!gpu_reset_clobbers_display(dev_priv)) {
>  		/* for testing only restore the display */

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
                   ` (3 preceding siblings ...)
  2018-04-05 11:10 ` [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Maarten Lankhorst
@ 2018-04-05 11:11 ` Patchwork
  2018-04-05 11:26 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-04-05 11:11 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset
URL   : https://patchwork.freedesktop.org/series/41204/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
01a4ecdd9894 drm/i915: Only call finish_reset after a prepare_reset
7569332e6c11 drm/i915: Always assume the GPU reset occurs
-:9: WARNING:TYPO_SPELLING: 'occured' may be misspelled - perhaps 'occurred'?
#9: 
Don't trust the user controlled i915.reset, but assume the reset occured

total: 0 errors, 1 warnings, 0 checks, 9 lines checked
4622755d0372 drm/i915: Split out parking from the idle worker for reuse
eeaf38a1c096 drm/i915: Park before resetting the submission backend

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
                   ` (4 preceding siblings ...)
  2018-04-05 11:11 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] " Patchwork
@ 2018-04-05 11:26 ` Patchwork
  2018-04-05 13:03 ` ✗ Fi.CI.IGT: failure " Patchwork
  2018-04-06 13:09 ` ✗ Fi.CI.BAT: failure for series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset (rev2) Patchwork
  7 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-04-05 11:26 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset
URL   : https://patchwork.freedesktop.org/series/41204/
State : success

== Summary ==

Series 41204v1 series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset
https://patchwork.freedesktop.org/api/1.0/series/41204/revisions/1/mbox/

---- Known issues:

Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-glk-1) fdo#104238
Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                fail       -> PASS       (fi-gdg-551) fdo#102575
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                pass       -> FAIL       (fi-skl-6770hq) fdo#100368
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-c:
                dmesg-warn -> PASS       (fi-glk-j4005) fdo#105644
Test prime_vgem:
        Subgroup basic-fence-flip:
                fail       -> PASS       (fi-ilk-650) fdo#104008

fdo#104238 https://bugs.freedesktop.org/show_bug.cgi?id=104238
fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#105644 https://bugs.freedesktop.org/show_bug.cgi?id=105644
fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008

fi-bdw-5557u     total:285  pass:264  dwarn:0   dfail:0   fail:0   skip:21  time:432s
fi-bdw-gvtdvm    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:441s
fi-blb-e6850     total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:380s
fi-bsw-n3050     total:285  pass:239  dwarn:0   dfail:0   fail:0   skip:46  time:539s
fi-bwr-2160      total:285  pass:180  dwarn:0   dfail:0   fail:0   skip:105 time:299s
fi-bxt-dsi       total:285  pass:255  dwarn:0   dfail:0   fail:0   skip:30  time:519s
fi-bxt-j4205     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:512s
fi-byt-j1900     total:285  pass:250  dwarn:0   dfail:0   fail:0   skip:35  time:520s
fi-byt-n2820     total:285  pass:246  dwarn:0   dfail:0   fail:0   skip:39  time:512s
fi-cfl-8700k     total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:413s
fi-cfl-s3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:561s
fi-cfl-u         total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:512s
fi-cnl-y3        total:285  pass:259  dwarn:0   dfail:0   fail:0   skip:26  time:593s
fi-elk-e7500     total:285  pass:226  dwarn:0   dfail:0   fail:0   skip:59  time:426s
fi-gdg-551       total:285  pass:177  dwarn:0   dfail:0   fail:0   skip:108 time:312s
fi-glk-1         total:285  pass:256  dwarn:1   dfail:0   fail:0   skip:28  time:538s
fi-glk-j4005     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:484s
fi-hsw-4770      total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:405s
fi-ilk-650       total:285  pass:225  dwarn:0   dfail:0   fail:0   skip:60  time:422s
fi-ivb-3520m     total:285  pass:256  dwarn:0   dfail:0   fail:0   skip:29  time:470s
fi-ivb-3770      total:285  pass:252  dwarn:0   dfail:0   fail:0   skip:33  time:432s
fi-kbl-7500u     total:285  pass:260  dwarn:1   dfail:0   fail:0   skip:24  time:472s
fi-kbl-7567u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:463s
fi-kbl-r         total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:510s
fi-pnv-d510      total:285  pass:220  dwarn:1   dfail:0   fail:0   skip:64  time:665s
fi-skl-6260u     total:285  pass:265  dwarn:0   dfail:0   fail:0   skip:20  time:440s
fi-skl-6600u     total:285  pass:258  dwarn:0   dfail:0   fail:0   skip:27  time:542s
fi-skl-6700k2    total:285  pass:261  dwarn:0   dfail:0   fail:0   skip:24  time:501s
fi-skl-6770hq    total:285  pass:264  dwarn:0   dfail:0   fail:1   skip:20  time:482s
fi-skl-guc       total:285  pass:257  dwarn:0   dfail:0   fail:0   skip:28  time:431s
fi-skl-gvtdvm    total:285  pass:262  dwarn:0   dfail:0   fail:0   skip:23  time:447s
fi-snb-2520m     total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:570s
fi-snb-2600      total:285  pass:245  dwarn:0   dfail:0   fail:0   skip:40  time:403s
Blacklisted hosts:
fi-cnl-psr       total:285  pass:256  dwarn:3   dfail:0   fail:0   skip:26  time:511s

0eddede73765b01ec287cad00e23bee23c216a16 drm-tip: 2018y-04m-05d-09h-51m-03s UTC integration manifest
eeaf38a1c096 drm/i915: Park before resetting the submission backend
4622755d0372 drm/i915: Split out parking from the idle worker for reuse
7569332e6c11 drm/i915: Always assume the GPU reset occurs
01a4ecdd9894 drm/i915: Only call finish_reset after a prepare_reset

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8591/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] drm/i915: Split out parking from the idle worker for reuse
  2018-04-05 11:02 ` [PATCH 3/4] drm/i915: Split out parking from the idle worker for reuse Chris Wilson
@ 2018-04-05 11:32   ` Mika Kuoppala
  0 siblings, 0 replies; 17+ messages in thread
From: Mika Kuoppala @ 2018-04-05 11:32 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> We will want to park GEM before disengaging the drive^W^W^W unwedging.
> Since we already do the work for idling, expose the guts as a new
> function that we can then reuse.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 65 +++++++++++++++++++++++------------------
>  1 file changed, 37 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9650a7b10c5f..e148db310ea6 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -136,6 +136,42 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  	return 0;
>  }
>  
> +static u32 i915_gem_park(struct drm_i915_private *i915)
> +{
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +	GEM_BUG_ON(!i915->gt.awake);
> +	GEM_BUG_ON(i915->gt.epoch == I915_EPOCH_INVALID);
> +
> +	/*
> +	 * Be paranoid and flush a concurrent interrupt to make sure
> +	 * we don't reactivate any irq tasklets after parking.
> +	 *
> +	 * FIXME: Note that even though we have waited for execlists to be idle,
> +	 * there may still be an in-flight interrupt even though the CSB
> +	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> +	 * is completed before we continue, but it doesn't prevent the HW from
> +	 * raising a spurious interrupt later. To complete the shield we should
> +	 * coordinate disabling the CS irq with flushing the interrupts.
> +	 */
> +	synchronize_irq(i915->drm.irq);
> +
> +	intel_engines_park(i915);
> +	i915_gem_timelines_park(i915);
> +
> +	i915_pmu_gt_parked(i915);
> +
> +	i915->gt.awake = false;
> +
> +	if (INTEL_GEN(i915) >= 6)
> +		gen6_rps_idle(i915);
> +
> +	intel_display_power_put(i915, POWER_DOMAIN_GT_IRQ);
> +
> +	intel_runtime_pm_put(i915);
> +
> +	return i915->gt.epoch;
> +}
> +
>  int
>  i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>  			    struct drm_file *file)
> @@ -3496,36 +3532,9 @@ i915_gem_idle_work_handler(struct work_struct *work)
>  	if (new_requests_since_last_retire(dev_priv))
>  		goto out_unlock;
>  
> -	/*
> -	 * Be paranoid and flush a concurrent interrupt to make sure
> -	 * we don't reactivate any irq tasklets after parking.
> -	 *
> -	 * FIXME: Note that even though we have waited for execlists to be idle,
> -	 * there may still be an in-flight interrupt even though the CSB
> -	 * is now empty. synchronize_irq() makes sure that a residual interrupt
> -	 * is completed before we continue, but it doesn't prevent the HW from
> -	 * raising a spurious interrupt later. To complete the shield we should
> -	 * coordinate disabling the CS irq with flushing the interrupts.
> -	 */
> -	synchronize_irq(dev_priv->drm.irq);
> -
> -	intel_engines_park(dev_priv);
> -	i915_gem_timelines_park(dev_priv);
> +	epoch = i915_gem_park(dev_priv);
>  
> -	i915_pmu_gt_parked(dev_priv);
> -
> -	GEM_BUG_ON(!dev_priv->gt.awake);
> -	dev_priv->gt.awake = false;
> -	epoch = dev_priv->gt.epoch;
> -	GEM_BUG_ON(epoch == I915_EPOCH_INVALID);
>  	rearm_hangcheck = false;
> -
> -	if (INTEL_GEN(dev_priv) >= 6)
> -		gen6_rps_idle(dev_priv);
> -
> -	intel_display_power_put(dev_priv, POWER_DOMAIN_GT_IRQ);
> -
> -	intel_runtime_pm_put(dev_priv);
>  out_unlock:
>  	mutex_unlock(&dev_priv->drm.struct_mutex);
>  
> -- 
> 2.16.3
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] drm/i915: Park before resetting the submission backend
  2018-04-05 11:02 ` [PATCH 4/4] drm/i915: Park before resetting the submission backend Chris Wilson
@ 2018-04-05 11:54   ` Sagar Arun Kamble
  2018-04-05 12:17     ` Chris Wilson
  2018-04-06 12:25     ` Chris Wilson
  0 siblings, 2 replies; 17+ messages in thread
From: Sagar Arun Kamble @ 2018-04-05 11:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 4/5/2018 4:32 PM, Chris Wilson wrote:
> As different backends may have different park/unpark callbacks, we
> should only ever switch backends (reset_default_submission on wedge
> recovery, or on enabling the guc) while parked.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem.c             | 11 +++++++++++
>   drivers/gpu/drm/i915/intel_engine_cs.c      |  3 +++
>   drivers/gpu/drm/i915/intel_guc_submission.c |  1 +
>   3 files changed, 15 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e148db310ea6..e2880de2fc7e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
>   	i915_retire_requests(i915);
>   	GEM_BUG_ON(i915->gt.active_requests);
>   
> +	/*
> +	 * Park before disengaging the old submit mechanism as different
> +	 * backends may have different park/unpack callbacks.
> +	 *
> +	 * We are idle; the idle-worker will be queued, but we need to run
> +	 * it now. As we already hold the struct mutex, we can get park
> +	 * the GPU right away, letting the lazy worker see that we are
> +	 * already active again by the time it acquires the mutex.
> +	 */
> +	i915_gem_park(i915);
I think we should be calling this before gem_unset_wedged in i915_reset.
With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable.
Also idle_work can execute just before reset so GEM_BUG_ON(!awake) in 
gem_park can be hit I think.
> +
>   	/*
>   	 * Undo nop_submit_request. We prevent all new i915 requests from
>   	 * being queued (by disallowing execbuf whilst wedged) so having
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 12486d8f534b..b4ea77a2896c 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1651,6 +1651,9 @@ void intel_engines_reset_default_submission(struct drm_i915_private *i915)
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
>   
> +	/* Must be parked first! */
> +	GEM_BUG_ON(i915->gt.awake);
> +
>   	for_each_engine(engine, i915, id)
>   		engine->set_default_submission(engine);
>   }
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 97121230656c..225fa3927a02 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -1243,6 +1243,7 @@ int intel_guc_submission_enable(struct intel_guc *guc)
>   	/* Take over from manual control of ELSP (execlists) */
>   	guc_interrupts_capture(dev_priv);
>   
> +	GEM_BUG_ON(dev_priv->gt.awake); /* Must be idle switching park/unpark */
>   	for_each_engine(engine, dev_priv, id) {
>   		struct intel_engine_execlists * const execlists =
>   			&engine->execlists;

-- 
Thanks,
Sagar

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] drm/i915: Park before resetting the submission backend
  2018-04-05 11:54   ` Sagar Arun Kamble
@ 2018-04-05 12:17     ` Chris Wilson
  2018-04-05 12:36       ` Chris Wilson
  2018-04-06 12:25     ` Chris Wilson
  1 sibling, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2018-04-05 12:17 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx

Quoting Sagar Arun Kamble (2018-04-05 12:54:38)
> 
> 
> On 4/5/2018 4:32 PM, Chris Wilson wrote:
> > As different backends may have different park/unpark callbacks, we
> > should only ever switch backends (reset_default_submission on wedge
> > recovery, or on enabling the guc) while parked.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem.c             | 11 +++++++++++
> >   drivers/gpu/drm/i915/intel_engine_cs.c      |  3 +++
> >   drivers/gpu/drm/i915/intel_guc_submission.c |  1 +
> >   3 files changed, 15 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index e148db310ea6..e2880de2fc7e 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
> >       i915_retire_requests(i915);
> >       GEM_BUG_ON(i915->gt.active_requests);
> >   
> > +     /*
> > +      * Park before disengaging the old submit mechanism as different
> > +      * backends may have different park/unpack callbacks.
> > +      *
> > +      * We are idle; the idle-worker will be queued, but we need to run
> > +      * it now. As we already hold the struct mutex, we can get park
> > +      * the GPU right away, letting the lazy worker see that we are
> > +      * already active again by the time it acquires the mutex.
> > +      */
> > +     i915_gem_park(i915);
> I think we should be calling this before gem_unset_wedged in i915_reset.

We can't, we aren't yet conclusively idle at that point. This is the
earliest in that sequence where we know we are.

> With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable.

Hmm. We definitely need to keep park/unpark balanced. Brute force would
be to force idle at that point. Or we do if (awake) engine->unpark on
the takeover.

> Also idle_work can execute just before reset so GEM_BUG_ON(!awake) in 
> gem_park can be hit I think.

True, we could just ignore the whole wait-for-idle if already !gt.awake.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] drm/i915: Park before resetting the submission backend
  2018-04-05 12:17     ` Chris Wilson
@ 2018-04-05 12:36       ` Chris Wilson
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2018-04-05 12:36 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx

Quoting Chris Wilson (2018-04-05 13:17:24)
> Quoting Sagar Arun Kamble (2018-04-05 12:54:38)
> > With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable.
> 
> Hmm. We definitely need to keep park/unpark balanced. Brute force would
> be to force idle at that point. Or we do if (awake) engine->unpark on
> the takeover.

I think since we are only gt.awake due to the early context copying, we
are actually idle here just haven't yet run the idle worker. Brute
forcing the park doesn't seem so bad.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
                   ` (5 preceding siblings ...)
  2018-04-05 11:26 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-04-05 13:03 ` Patchwork
  2018-04-06 13:09 ` ✗ Fi.CI.BAT: failure for series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset (rev2) Patchwork
  7 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-04-05 13:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset
URL   : https://patchwork.freedesktop.org/series/41204/
State : failure

== Summary ==

---- Possible new issues:

Test gem_eio:
        Subgroup execbuf:
                pass       -> INCOMPLETE (shard-apl)
                pass       -> INCOMPLETE (shard-hsw)
                pass       -> INCOMPLETE (shard-snb)
        Subgroup hibernate:
                pass       -> INCOMPLETE (shard-snb)
        Subgroup suspend:
                pass       -> INCOMPLETE (shard-snb)
        Subgroup throttle:
                pass       -> INCOMPLETE (shard-apl)
                pass       -> INCOMPLETE (shard-hsw)
                pass       -> INCOMPLETE (shard-snb)
Test kms_cursor_legacy:
        Subgroup cursor-vs-flip-toggle:
                fail       -> PASS       (shard-hsw)

---- Known issues:

Test gem_eio:
        Subgroup hibernate:
                pass       -> INCOMPLETE (shard-hsw) fdo#103540
        Subgroup in-flight-suspend:
                pass       -> INCOMPLETE (shard-snb) fdo#103375 +2
        Subgroup suspend:
                pass       -> INCOMPLETE (shard-apl) fdo#103927 +1
                pass       -> INCOMPLETE (shard-hsw) fdo#105055
Test kms_flip:
        Subgroup modeset-vs-vblank-race-interruptible:
                pass       -> FAIL       (shard-hsw) fdo#103060 +2
        Subgroup wf_vblank-ts-check:
                pass       -> FAIL       (shard-hsw) fdo#100368 +1
Test kms_frontbuffer_tracking:
        Subgroup fbc-1p-primscrn-pri-indfb-draw-render:
                fail       -> PASS       (shard-snb) fdo#103167

fdo#103540 https://bugs.freedesktop.org/show_bug.cgi?id=103540
fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
fdo#103927 https://bugs.freedesktop.org/show_bug.cgi?id=103927
fdo#105055 https://bugs.freedesktop.org/show_bug.cgi?id=105055
fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167

shard-apl        total:2512 pass:1713 dwarn:1   dfail:0   fail:7   skip:785 time:10919s
shard-hsw        total:2512 pass:1659 dwarn:1   dfail:0   fail:6   skip:840 time:9633s
shard-snb        total:2512 pass:1283 dwarn:1   dfail:0   fail:3   skip:1220 time:5942s
Blacklisted hosts:
shard-kbl        total:2512 pass:1828 dwarn:1   dfail:1   fail:7   skip:670 time:7940s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8591/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset
  2018-04-05 11:10 ` [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Maarten Lankhorst
@ 2018-04-05 16:19   ` Chris Wilson
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2018-04-05 16:19 UTC (permalink / raw)
  To: Maarten Lankhorst, intel-gfx

Quoting Maarten Lankhorst (2018-04-05 12:10:23)
> Op 05-04-18 om 13:02 schreef Chris Wilson:
> > If we skip the intel_prepare_reset(), we should also skip the
> > intel_display_reset(). If we we use a flag set by intel_prepare_reset()
> > then we do not have to second guess based on external user controlled
> > state whether or not the prepare was called before deciding to finish
> > it after the reset. igt/gem_eio is one such example that may tweak
> > i915.reset faster than the code is expecting, leading to
> >
> > [  190.233528] =====================================
> > [  190.233534] WARNING: bad unlock balance detected!
> > [  190.233540] 4.16.0-rc7-g335ef9849310-drmtip_10+ #1 Tainted: G     U
> > [  190.233547] -------------------------------------
> > [  190.233553] gem_eio/1348 is trying to release lock (crtc_ww_class_acquire) at:
> > [  190.233569] [<ffffffff895c7810>] drm_modeset_acquire_fini+0x0/0x60
> > [  190.233575] but there are no more locks to release!
> > [  190.233580]
> >                other info that might help us debug this:
> > [  190.233588] 3 locks held by gem_eio/1348:
> > [  190.233592]  #0:  (&f->f_pos_lock){+.+.}, at: [<00000000ab90c784>] __fdget_pos+0x3a/0x50
> > [  190.233607]  #1:  (sb_writers#11){.+.+}, at: [<00000000e1529265>] vfs_write+0x188/0x1a0
> > [  190.233622]  #2:  (&attr->mutex){+.+.}, at: [<0000000011f40afe>] simple_attr_write+0x36/0xd0
> > [  190.233635]
> >                stack backtrace:
> > [  190.233644] CPU: 0 PID: 1348 Comm: gem_eio Tainted: G     U           4.16.0-rc7-g335ef9849310-drmtip_10+ #1
> > [  190.233655] Hardware name: Dell Inc.                 OptiPlex GX280               /0G8310, BIOS A04 02/09/2005
> > [  190.233664] Call Trace:
> > [  190.233674]  dump_stack+0x67/0x95
> > [  190.233682]  ? drm_modeset_backoff+0x1b0/0x1b0
> > [  190.233690]  print_unlock_imbalance_bug+0xd2/0xe0
> > [  190.233698]  ? drm_modeset_backoff+0x1b0/0x1b0
> > [  190.233704]  lock_release+0x23e/0x300
> > [  190.233712]  drm_modeset_acquire_fini+0x16/0x60
> > [  190.233835]  intel_finish_reset+0x72/0x160 [i915]
> > [  190.233894]  i915_reset_device+0x1e9/0x240 [i915]
> > [  190.233953]  ? __intel_get_crtc_scanline+0x1c0/0x1c0 [i915]
> > [  190.233962]  ? work_on_cpu_safe+0x50/0x50
> > [  190.234020]  i915_handle_error+0x1f2/0x470 [i915]
> > [  190.234031]  ? __might_fault+0x39/0x90
> > [  190.234037]  ? __might_fault+0x39/0x90
> > [  190.234099]  i915_wedged_set+0x7f/0xc0 [i915]
> > [  190.234107]  simple_attr_write+0xb0/0xd0
> > [  190.234117]  full_proxy_write+0x51/0x80
> > [  190.234125]  __vfs_write+0x21/0x140
> > [  190.234133]  ? rcu_read_lock_sched_held+0x6f/0x80
> > [  190.234140]  ? rcu_sync_lockdep_assert+0x29/0x50
> > [  190.234147]  ? __sb_start_write+0x152/0x1f0
> > [  190.234152]  ? __sb_start_write+0x168/0x1f0
> > [  190.234159]  vfs_write+0xbd/0x1a0
> > [  190.234166]  SyS_write+0x40/0xa0
> > [  190.234173]  ? do_syscall_64+0x19/0x1b0
> > [  190.234180]  do_syscall_64+0x6b/0x1b0
> > [  190.234188]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> > [  190.234196] RIP: 0033:0x7f84c1b392b7
> > [  190.234201] RSP: 002b:00007f84b6755b00 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
> > [  190.234211] RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f84c1b392b7
> > [  190.234218] RDX: 0000000000000002 RSI: 000055ec20abc8d6 RDI: 0000000000000046
> > [  190.234225] RBP: 000055ec20abc8d6 R08: 0000000000000000 R09: 0000000000000000
> > [  190.234231] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
> > [  190.234238] R13: 0000000000000000 R14: 00007f84b0000b20 R15: 000055ec20ce4eb8
> >
> > Testcase: igt/gem_eio
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[snip]
> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

Thanks for the review, applied this patch and hopefully we only need
this one by itself to clear up the warnings from the shards on these old
machine.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] drm/i915: Park before resetting the submission backend
  2018-04-05 11:54   ` Sagar Arun Kamble
  2018-04-05 12:17     ` Chris Wilson
@ 2018-04-06 12:25     ` Chris Wilson
  2018-04-06 13:39       ` Michal Wajdeczko
  1 sibling, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2018-04-06 12:25 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx

Quoting Sagar Arun Kamble (2018-04-05 12:54:38)
> 
> 
> On 4/5/2018 4:32 PM, Chris Wilson wrote:
> > As different backends may have different park/unpark callbacks, we
> > should only ever switch backends (reset_default_submission on wedge
> > recovery, or on enabling the guc) while parked.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem.c             | 11 +++++++++++
> >   drivers/gpu/drm/i915/intel_engine_cs.c      |  3 +++
> >   drivers/gpu/drm/i915/intel_guc_submission.c |  1 +
> >   3 files changed, 15 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index e148db310ea6..e2880de2fc7e 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
> >       i915_retire_requests(i915);
> >       GEM_BUG_ON(i915->gt.active_requests);
> >   
> > +     /*
> > +      * Park before disengaging the old submit mechanism as different
> > +      * backends may have different park/unpack callbacks.
> > +      *
> > +      * We are idle; the idle-worker will be queued, but we need to run
> > +      * it now. As we already hold the struct mutex, we can get park
> > +      * the GPU right away, letting the lazy worker see that we are
> > +      * already active again by the time it acquires the mutex.
> > +      */
> > +     i915_gem_park(i915);
> I think we should be calling this before gem_unset_wedged in i915_reset.
> With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable.

Right, we really do need to restore guc submission before restarting. So
how can we fit in something like

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9650a7b10c5f..95fa30d9aec6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3132,6 +3132,9 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
 
        i915_retire_requests(dev_priv);
 
+       if (USES_GUC_SUBMISSION(dev_priv))
+               (void)intel_guc_submission_enable(guc);
+
        for_each_engine(engine, dev_priv, id) {
                struct i915_gem_context *ctx;
 
?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset (rev2)
  2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
                   ` (6 preceding siblings ...)
  2018-04-05 13:03 ` ✗ Fi.CI.IGT: failure " Patchwork
@ 2018-04-06 13:09 ` Patchwork
  7 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2018-04-06 13:09 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset (rev2)
URL   : https://patchwork.freedesktop.org/series/41204/
State : failure

== Summary ==

Applying: drm/i915: Only call finish_reset after a prepare_reset
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/intel_display.c
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.
Applying: drm/i915: Always assume the GPU reset occurs
Applying: drm/i915: Split out parking from the idle worker for reuse
Applying: drm/i915: Park before resetting the submission backend
error: patch failed: drivers/gpu/drm/i915/i915_gem.c:3132
error: drivers/gpu/drm/i915/i915_gem.c: patch does not apply
error: Did you hand edit your patch?
It does not apply to blobs recorded in its index.
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_gem.c
Patch failed at 0004 drm/i915: Park before resetting the submission backend
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] drm/i915: Park before resetting the submission backend
  2018-04-06 12:25     ` Chris Wilson
@ 2018-04-06 13:39       ` Michal Wajdeczko
  2018-04-06 14:09         ` Chris Wilson
  0 siblings, 1 reply; 17+ messages in thread
From: Michal Wajdeczko @ 2018-04-06 13:39 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx, Chris Wilson

On Fri, 06 Apr 2018 14:25:48 +0200, Chris Wilson  
<chris@chris-wilson.co.uk> wrote:

> Quoting Sagar Arun Kamble (2018-04-05 12:54:38)
>>
>>
>> On 4/5/2018 4:32 PM, Chris Wilson wrote:
>> > As different backends may have different park/unpark callbacks, we
>> > should only ever switch backends (reset_default_submission on wedge
>> > recovery, or on enabling the guc) while parked.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
>> > ---
>> >   drivers/gpu/drm/i915/i915_gem.c             | 11 +++++++++++
>> >   drivers/gpu/drm/i915/intel_engine_cs.c      |  3 +++
>> >   drivers/gpu/drm/i915/intel_guc_submission.c |  1 +
>> >   3 files changed, 15 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/i915_gem.c  
>> b/drivers/gpu/drm/i915/i915_gem.c
>> > index e148db310ea6..e2880de2fc7e 100644
>> > --- a/drivers/gpu/drm/i915/i915_gem.c
>> > +++ b/drivers/gpu/drm/i915/i915_gem.c
>> > @@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct  
>> drm_i915_private *i915)
>> >       i915_retire_requests(i915);
>> >       GEM_BUG_ON(i915->gt.active_requests);
>> >
>> > +     /*
>> > +      * Park before disengaging the old submit mechanism as different
>> > +      * backends may have different park/unpack callbacks.
>> > +      *
>> > +      * We are idle; the idle-worker will be queued, but we need to  
>> run
>> > +      * it now. As we already hold the struct mutex, we can get park
>> > +      * the GPU right away, letting the lazy worker see that we are
>> > +      * already active again by the time it acquires the mutex.
>> > +      */
>> > +     i915_gem_park(i915);
>> I think we should be calling this before gem_unset_wedged in i915_reset.
>> With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable.
>
> Right, we really do need to restore guc submission before restarting. So
> how can we fit in something like
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c  
> b/drivers/gpu/drm/i915/i915_gem.c
> index 9650a7b10c5f..95fa30d9aec6 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3132,6 +3132,9 @@ void i915_gem_reset(struct drm_i915_private  
> *dev_priv)
>        i915_retire_requests(dev_priv);
> +       if (USES_GUC_SUBMISSION(dev_priv))
> +               (void)intel_guc_submission_enable(guc);
> +
>         for_each_engine(engine, dev_priv, id) {
>                 struct i915_gem_context *ctx;
> ?

In series [1] I was trying to fix symmetry in calls to uc_init_hw/fini_hw
where we are enabling/disabling GuC submission. In particular, patch [2]
fixes reset path.

So maybe we should try to merge* both series ?

Michal

*) without my patch 6/12 [3]

[1] https://patchwork.freedesktop.org/series/41159/
[2] https://patchwork.freedesktop.org/patch/214967/
[3] https://patchwork.freedesktop.org/patch/214976/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] drm/i915: Park before resetting the submission backend
  2018-04-06 13:39       ` Michal Wajdeczko
@ 2018-04-06 14:09         ` Chris Wilson
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2018-04-06 14:09 UTC (permalink / raw)
  To: Michal Wajdeczko, Sagar Arun Kamble, intel-gfx

Quoting Michal Wajdeczko (2018-04-06 14:39:28)
> On Fri, 06 Apr 2018 14:25:48 +0200, Chris Wilson  
> <chris@chris-wilson.co.uk> wrote:
> 
> > Quoting Sagar Arun Kamble (2018-04-05 12:54:38)
> >>
> >>
> >> On 4/5/2018 4:32 PM, Chris Wilson wrote:
> >> > As different backends may have different park/unpark callbacks, we
> >> > should only ever switch backends (reset_default_submission on wedge
> >> > recovery, or on enabling the guc) while parked.
> >> >
> >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> >> > Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> >> > ---
> >> >   drivers/gpu/drm/i915/i915_gem.c             | 11 +++++++++++
> >> >   drivers/gpu/drm/i915/intel_engine_cs.c      |  3 +++
> >> >   drivers/gpu/drm/i915/intel_guc_submission.c |  1 +
> >> >   3 files changed, 15 insertions(+)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_gem.c  
> >> b/drivers/gpu/drm/i915/i915_gem.c
> >> > index e148db310ea6..e2880de2fc7e 100644
> >> > --- a/drivers/gpu/drm/i915/i915_gem.c
> >> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> > @@ -3380,6 +3380,17 @@ bool i915_gem_unset_wedged(struct  
> >> drm_i915_private *i915)
> >> >       i915_retire_requests(i915);
> >> >       GEM_BUG_ON(i915->gt.active_requests);
> >> >
> >> > +     /*
> >> > +      * Park before disengaging the old submit mechanism as different
> >> > +      * backends may have different park/unpack callbacks.
> >> > +      *
> >> > +      * We are idle; the idle-worker will be queued, but we need to  
> >> run
> >> > +      * it now. As we already hold the struct mutex, we can get park
> >> > +      * the GPU right away, letting the lazy worker see that we are
> >> > +      * already active again by the time it acquires the mutex.
> >> > +      */
> >> > +     i915_gem_park(i915);
> >> I think we should be calling this before gem_unset_wedged in i915_reset.
> >> With GuC, hitting the GEM_BUG_ON(awake) in guc_submission_enable.
> >
> > Right, we really do need to restore guc submission before restarting. So
> > how can we fit in something like
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c  
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index 9650a7b10c5f..95fa30d9aec6 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3132,6 +3132,9 @@ void i915_gem_reset(struct drm_i915_private  
> > *dev_priv)
> >        i915_retire_requests(dev_priv);
> > +       if (USES_GUC_SUBMISSION(dev_priv))
> > +               (void)intel_guc_submission_enable(guc);
> > +
> >         for_each_engine(engine, dev_priv, id) {
> >                 struct i915_gem_context *ctx;
> > ?
> 
> In series [1] I was trying to fix symmetry in calls to uc_init_hw/fini_hw
> where we are enabling/disabling GuC submission. In particular, patch [2]
> fixes reset path.

Tbh, I first misread that as i915_gem_fini(). Hmm, but we still should
be re-enabling guc prior to submitting our first requests here in
i915_gem_reset() prior to calling i915_gem_init_hw(). We want to start
from the repopulated request lists, so sticking the current init_hw() is
overkill.

I guess I have this sketch in mind:

i915_reset:

	i915_gem_reset_prepare();
	i915_gem_disable_hw(); // placeholder!

	i915_gem_fini_hw(); // revert to default state

	intel_gpu_reset();

	i915_gem_init_hw(); // just the HW setup

	i915_gem_reset();

	i915_gem_enable_hw(); // tell the HW to run
	i915_gem_reset_finish();

That i915_gem_reset() sticks out like a sore thumb. I think that's what
we need to break up and rearrange. Hmm, there's some of that in the RPS
series. Looks like we're too early to try and do this neatly.

> So maybe we should try to merge* both series ?

Hmm, looking over that series doesn't patch 5 hit the GEM_BUG_ON removed
in patch 6? Otherwise, merging the first 5 with Sagar's blessing would
be on the cards as they look sane otherwise. Maybe start with the first
4 then?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-04-06 14:09 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-05 11:02 [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Chris Wilson
2018-04-05 11:02 ` [PATCH 2/4] drm/i915: Always assume the GPU reset occurs Chris Wilson
2018-04-05 11:02 ` [PATCH 3/4] drm/i915: Split out parking from the idle worker for reuse Chris Wilson
2018-04-05 11:32   ` Mika Kuoppala
2018-04-05 11:02 ` [PATCH 4/4] drm/i915: Park before resetting the submission backend Chris Wilson
2018-04-05 11:54   ` Sagar Arun Kamble
2018-04-05 12:17     ` Chris Wilson
2018-04-05 12:36       ` Chris Wilson
2018-04-06 12:25     ` Chris Wilson
2018-04-06 13:39       ` Michal Wajdeczko
2018-04-06 14:09         ` Chris Wilson
2018-04-05 11:10 ` [PATCH 1/4] drm/i915: Only call finish_reset after a prepare_reset Maarten Lankhorst
2018-04-05 16:19   ` Chris Wilson
2018-04-05 11:11 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] " Patchwork
2018-04-05 11:26 ` ✓ Fi.CI.BAT: success " Patchwork
2018-04-05 13:03 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-04-06 13:09 ` ✗ Fi.CI.BAT: failure for series starting with [1/4] drm/i915: Only call finish_reset after a prepare_reset (rev2) Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.