All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/i915: fix SFC reset flow
@ 2019-09-16 21:41 Daniele Ceraolo Spurio
  2019-09-16 22:41 ` ✓ Fi.CI.BAT: success for " Patchwork
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-16 21:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: Owen Zhang

Our assumption that the we can ask the HW to lock the SFC even if not
currently in use does not match the HW commitment. The expectation from
the HW is that SW will not try to lock the SFC if the engine is not
using it and if we do that the behavior is undefined; on ICL the HW
ends up to returning the ack and ignoring our lock request, but this is
not guaranteed and we shouldn't expect it going forward.

Reported-by: Owen Zhang <owen.zhang@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 8327220ac558..900958804bd5 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -352,13 +352,15 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
 	}
 
 	/*
-	 * Tell the engine that a software reset is going to happen. The engine
-	 * will then try to force lock the SFC (if currently locked, it will
-	 * remain so until we tell the engine it is safe to unlock; if currently
-	 * unlocked, it will ignore this and all new lock requests). If SFC
-	 * ends up being locked to the engine we want to reset, we have to reset
-	 * it as well (we will unlock it once the reset sequence is completed).
+	 * If the engine is using a SFC, tell the engine that a software reset
+	 * is going to happen. The engine will then try to force lock the SFC.
+	 * If SFC ends up being locked to the engine we want to reset, we have
+	 * to reset it as well (we will unlock it once the reset sequence is
+	 * completed).
 	 */
+	if (!(intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit))
+		return 0;
+
 	rmw_set_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
 
 	if (__intel_wait_for_register_fw(uncore,
@@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
 					 sfc_forced_lock_ack_bit,
 					 sfc_forced_lock_ack_bit,
 					 1000, 0, NULL)) {
-		DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
+		/* did we race the unlock? */
+		if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
+			DRM_ERROR("Wait for SFC forced lock ack failed\n");
 		return 0;
 	}
 
+	/* The HW could return the ack even if the sfc is not in use */
 	if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
 		return sfc_reset_bit;
 
@@ -382,6 +387,7 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
 	u8 vdbox_sfc_access = RUNTIME_INFO(engine->i915)->vdbox_sfc_access;
 	i915_reg_t sfc_forced_lock;
 	u32 sfc_forced_lock_bit;
+	u32 lock;
 
 	switch (engine->class) {
 	case VIDEO_DECODE_CLASS:
@@ -401,7 +407,10 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
 		return;
 	}
 
-	rmw_clear_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
+	lock = intel_uncore_read_fw(uncore, sfc_forced_lock);
+	if (lock & sfc_forced_lock_bit)
+		intel_uncore_write_fw(uncore, sfc_forced_lock,
+				      lock & ~sfc_forced_lock_bit);
 }
 
 static int gen11_reset_engines(struct intel_gt *gt,
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: fix SFC reset flow
  2019-09-16 21:41 [PATCH] drm/i915: fix SFC reset flow Daniele Ceraolo Spurio
@ 2019-09-16 22:41 ` Patchwork
  2019-09-17  5:37 ` ✓ Fi.CI.IGT: " Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2019-09-16 22:41 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: fix SFC reset flow
URL   : https://patchwork.freedesktop.org/series/66779/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6904 -> Patchwork_14421
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/

Known issues
------------

  Here are the changes found in Patchwork_14421 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_addfb_basic@bad-pitch-999:
    - fi-icl-u3:          [PASS][1] -> [DMESG-WARN][2] ([fdo#107724])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/fi-icl-u3/igt@kms_addfb_basic@bad-pitch-999.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/fi-icl-u3/igt@kms_addfb_basic@bad-pitch-999.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - fi-kbl-7500u:       [PASS][3] -> [FAIL][4] ([fdo#111407])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/fi-kbl-7500u/igt@kms_chamelium@hdmi-hpd-fast.html

  
#### Possible fixes ####

  * igt@gem_ctx_create@basic-files:
    - fi-bxt-dsi:         [INCOMPLETE][5] ([fdo#103927]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/fi-bxt-dsi/igt@gem_ctx_create@basic-files.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/fi-bxt-dsi/igt@gem_ctx_create@basic-files.html

  * igt@kms_busy@basic-flip-c:
    - fi-skl-6770hq:      [SKIP][7] ([fdo#109271] / [fdo#109278]) -> [PASS][8] +2 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/fi-skl-6770hq/igt@kms_busy@basic-flip-c.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/fi-skl-6770hq/igt@kms_busy@basic-flip-c.html

  * igt@kms_flip@basic-flip-vs-dpms:
    - fi-skl-6770hq:      [SKIP][9] ([fdo#109271]) -> [PASS][10] +23 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/fi-skl-6770hq/igt@kms_flip@basic-flip-vs-dpms.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/fi-skl-6770hq/igt@kms_flip@basic-flip-vs-dpms.html

  * igt@prime_vgem@basic-fence-flip:
    - fi-icl-u3:          [DMESG-WARN][11] ([fdo#107724]) -> [PASS][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/fi-icl-u3/igt@prime_vgem@basic-fence-flip.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/fi-icl-u3/igt@prime_vgem@basic-fence-flip.html

  
#### Warnings ####

  * igt@i915_pm_rpm@basic-pci-d3-state:
    - fi-kbl-guc:         [SKIP][13] ([fdo#109271]) -> [FAIL][14] ([fdo#107707])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#107707]: https://bugs.freedesktop.org/show_bug.cgi?id=107707
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#109100]: https://bugs.freedesktop.org/show_bug.cgi?id=109100
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
  [fdo#111562]: https://bugs.freedesktop.org/show_bug.cgi?id=111562
  [fdo#111597]: https://bugs.freedesktop.org/show_bug.cgi?id=111597


Participating hosts (54 -> 48)
------------------------------

  Additional (2): fi-icl-u4 fi-pnv-d510 
  Missing    (8): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-apl-guc fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6904 -> Patchwork_14421

  CI-20190529: 20190529
  CI_DRM_6904: ebd25d396bc39c10a4b6947fc42d1bb12ef36cd9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5186: 0008b3e1b2cf7a73b1e995031c9a73fc97b35aad @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14421: 2cea2a30eedfc2cf93d4ca59d7dc191bd779293c @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

2cea2a30eedf drm/i915: fix SFC reset flow

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: fix SFC reset flow
  2019-09-16 21:41 [PATCH] drm/i915: fix SFC reset flow Daniele Ceraolo Spurio
  2019-09-16 22:41 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2019-09-17  5:37 ` Patchwork
  2019-09-17  7:57 ` [PATCH] " Chris Wilson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2019-09-17  5:37 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: fix SFC reset flow
URL   : https://patchwork.freedesktop.org/series/66779/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6904_full -> Patchwork_14421_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_14421_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_capture@capture-bsd:
    - shard-iclb:         [PASS][1] -> [SKIP][2] ([fdo#111325])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb3/igt@gem_exec_capture@capture-bsd.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb2/igt@gem_exec_capture@capture-bsd.html

  * igt@gem_exec_schedule@fifo-bsd1:
    - shard-iclb:         [PASS][3] -> [SKIP][4] ([fdo#109276]) +16 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb2/igt@gem_exec_schedule@fifo-bsd1.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb5/igt@gem_exec_schedule@fifo-bsd1.html

  * igt@gem_tiled_swapping@non-threaded:
    - shard-glk:          [PASS][5] -> [DMESG-WARN][6] ([fdo#108686])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-glk1/igt@gem_tiled_swapping@non-threaded.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-glk4/igt@gem_tiled_swapping@non-threaded.html

  * igt@gem_workarounds@suspend-resume-context:
    - shard-apl:          [PASS][7] -> [DMESG-WARN][8] ([fdo#108566]) +6 similar issues
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-apl7/igt@gem_workarounds@suspend-resume-context.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-apl1/igt@gem_workarounds@suspend-resume-context.html

  * igt@i915_pm_rc6_residency@rc6-accuracy:
    - shard-snb:          [PASS][9] -> [SKIP][10] ([fdo#109271])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-snb1/igt@i915_pm_rc6_residency@rc6-accuracy.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-snb4/igt@i915_pm_rc6_residency@rc6-accuracy.html

  * igt@i915_selftest@live_hangcheck:
    - shard-iclb:         [PASS][11] -> [INCOMPLETE][12] ([fdo#107713] / [fdo#108569])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb3/igt@i915_selftest@live_hangcheck.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb3/igt@i915_selftest@live_hangcheck.html

  * igt@kms_cursor_crc@pipe-c-cursor-128x128-random:
    - shard-apl:          [PASS][13] -> [INCOMPLETE][14] ([fdo#103927]) +1 similar issue
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-apl2/igt@kms_cursor_crc@pipe-c-cursor-128x128-random.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-apl3/igt@kms_cursor_crc@pipe-c-cursor-128x128-random.html

  * igt@kms_cursor_crc@pipe-c-cursor-size-change:
    - shard-skl:          [PASS][15] -> [FAIL][16] ([fdo#103232])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl4/igt@kms_cursor_crc@pipe-c-cursor-size-change.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl1/igt@kms_cursor_crc@pipe-c-cursor-size-change.html

  * igt@kms_draw_crc@draw-method-xrgb8888-pwrite-ytiled:
    - shard-skl:          [PASS][17] -> [FAIL][18] ([fdo#103184] / [fdo#103232] / [fdo#108222])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl10/igt@kms_draw_crc@draw-method-xrgb8888-pwrite-ytiled.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl9/igt@kms_draw_crc@draw-method-xrgb8888-pwrite-ytiled.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-pwrite:
    - shard-iclb:         [PASS][19] -> [FAIL][20] ([fdo#103167]) +4 similar issues
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb5/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-pwrite.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-pwrite.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-rte:
    - shard-iclb:         [PASS][21] -> [FAIL][22] ([fdo#103167] / [fdo#110378])
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-rte.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-rte.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:
    - shard-skl:          [PASS][23] -> [FAIL][24] ([fdo#108145]) +1 similar issue
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl1/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl6/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          [PASS][25] -> [FAIL][26] ([fdo#108145] / [fdo#110403]) +1 similar issue
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl6/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl1/igt@kms_plane_alpha_blend@pipe-c-coverage-7efc.html

  * igt@kms_psr2_su@page_flip:
    - shard-iclb:         [PASS][27] -> [SKIP][28] ([fdo#109642] / [fdo#111068])
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb2/igt@kms_psr2_su@page_flip.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb5/igt@kms_psr2_su@page_flip.html

  * igt@kms_psr@psr2_cursor_blt:
    - shard-iclb:         [PASS][29] -> [SKIP][30] ([fdo#109441]) +3 similar issues
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb2/igt@kms_psr@psr2_cursor_blt.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb5/igt@kms_psr@psr2_cursor_blt.html

  * igt@kms_vblank@pipe-a-query-forked-busy-hang:
    - shard-hsw:          [PASS][31] -> [INCOMPLETE][32] ([fdo#103540])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-hsw7/igt@kms_vblank@pipe-a-query-forked-busy-hang.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-hsw5/igt@kms_vblank@pipe-a-query-forked-busy-hang.html

  * igt@kms_vblank@pipe-c-ts-continuation-suspend:
    - shard-skl:          [PASS][33] -> [INCOMPLETE][34] ([fdo#104108])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl1/igt@kms_vblank@pipe-c-ts-continuation-suspend.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl3/igt@kms_vblank@pipe-c-ts-continuation-suspend.html

  
#### Possible fixes ####

  * igt@gem_ctx_shared@exec-single-timeline-bsd:
    - shard-iclb:         [SKIP][35] ([fdo#110841]) -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb2/igt@gem_ctx_shared@exec-single-timeline-bsd.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb5/igt@gem_ctx_shared@exec-single-timeline-bsd.html

  * igt@gem_exec_balancer@smoke:
    - shard-iclb:         [SKIP][37] ([fdo#110854]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb6/igt@gem_exec_balancer@smoke.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb4/igt@gem_exec_balancer@smoke.html

  * igt@gem_exec_schedule@preempt-contexts-bsd2:
    - shard-iclb:         [SKIP][39] ([fdo#109276]) -> [PASS][40] +16 similar issues
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb8/igt@gem_exec_schedule@preempt-contexts-bsd2.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb2/igt@gem_exec_schedule@preempt-contexts-bsd2.html

  * igt@gem_exec_schedule@reorder-wide-bsd:
    - shard-iclb:         [SKIP][41] ([fdo#111325]) -> [PASS][42] +7 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb1/igt@gem_exec_schedule@reorder-wide-bsd.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb5/igt@gem_exec_schedule@reorder-wide-bsd.html

  * igt@i915_suspend@fence-restore-tiled2untiled:
    - shard-apl:          [DMESG-WARN][43] ([fdo#108566]) -> [PASS][44] +8 similar issues
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-apl7/igt@i915_suspend@fence-restore-tiled2untiled.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-apl1/igt@i915_suspend@fence-restore-tiled2untiled.html

  * igt@kms_busy@extended-modeset-hang-oldfb-with-reset-render-b:
    - shard-snb:          [INCOMPLETE][45] ([fdo#105411]) -> [PASS][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-snb1/igt@kms_busy@extended-modeset-hang-oldfb-with-reset-render-b.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-snb2/igt@kms_busy@extended-modeset-hang-oldfb-with-reset-render-b.html

  * igt@kms_cursor_crc@pipe-c-cursor-128x128-onscreen:
    - shard-apl:          [INCOMPLETE][47] ([fdo#103927]) -> [PASS][48] +2 similar issues
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-apl6/igt@kms_cursor_crc@pipe-c-cursor-128x128-onscreen.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-apl3/igt@kms_cursor_crc@pipe-c-cursor-128x128-onscreen.html

  * igt@kms_cursor_legacy@2x-long-nonblocking-modeset-vs-cursor-atomic:
    - shard-glk:          [FAIL][49] ([fdo#106509] / [fdo#107409]) -> [PASS][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-glk4/igt@kms_cursor_legacy@2x-long-nonblocking-modeset-vs-cursor-atomic.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-glk1/igt@kms_cursor_legacy@2x-long-nonblocking-modeset-vs-cursor-atomic.html

  * igt@kms_draw_crc@draw-method-rgb565-pwrite-untiled:
    - shard-skl:          [FAIL][51] ([fdo#103184] / [fdo#103232] / [fdo#108472]) -> [PASS][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl6/igt@kms_draw_crc@draw-method-rgb565-pwrite-untiled.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl1/igt@kms_draw_crc@draw-method-rgb565-pwrite-untiled.html

  * igt@kms_flip@modeset-vs-vblank-race:
    - shard-hsw:          [FAIL][53] ([fdo#111609]) -> [PASS][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-hsw6/igt@kms_flip@modeset-vs-vblank-race.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-hsw6/igt@kms_flip@modeset-vs-vblank-race.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite:
    - shard-iclb:         [FAIL][55] ([fdo#103167]) -> [PASS][56] +5 similar issues
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb5/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-pwrite.html

  * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes:
    - shard-skl:          [INCOMPLETE][57] ([fdo#104108]) -> [PASS][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl7/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl7/igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min:
    - shard-skl:          [FAIL][59] ([fdo#108145]) -> [PASS][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl9/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl8/igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min.html

  * igt@kms_psr@psr2_primary_page_flip:
    - shard-iclb:         [SKIP][61] ([fdo#109441]) -> [PASS][62] +3 similar issues
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb3/igt@kms_psr@psr2_primary_page_flip.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb2/igt@kms_psr@psr2_primary_page_flip.html

  * igt@perf@polling:
    - shard-skl:          [FAIL][63] ([fdo#110728]) -> [PASS][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-skl8/igt@perf@polling.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-skl3/igt@perf@polling.html

  
#### Warnings ####

  * igt@gem_mocs_settings@mocs-reset-bsd2:
    - shard-iclb:         [SKIP][65] ([fdo#109276]) -> [FAIL][66] ([fdo#111330])
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb8/igt@gem_mocs_settings@mocs-reset-bsd2.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb1/igt@gem_mocs_settings@mocs-reset-bsd2.html

  * igt@gem_mocs_settings@mocs-settings-bsd2:
    - shard-iclb:         [FAIL][67] ([fdo#111330]) -> [SKIP][68] ([fdo#109276])
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6904/shard-iclb2/igt@gem_mocs_settings@mocs-settings-bsd2.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/shard-iclb5/igt@gem_mocs_settings@mocs-settings-bsd2.html

  
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103184]: https://bugs.freedesktop.org/show_bug.cgi?id=103184
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103540]: https://bugs.freedesktop.org/show_bug.cgi?id=103540
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105411]: https://bugs.freedesktop.org/show_bug.cgi?id=105411
  [fdo#106509]: https://bugs.freedesktop.org/show_bug.cgi?id=106509
  [fdo#107409]: https://bugs.freedesktop.org/show_bug.cgi?id=107409
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108222]: https://bugs.freedesktop.org/show_bug.cgi?id=108222
  [fdo#108472]: https://bugs.freedesktop.org/show_bug.cgi?id=108472
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#108686]: https://bugs.freedesktop.org/show_bug.cgi?id=108686
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109276]: https://bugs.freedesktop.org/show_bug.cgi?id=109276
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110378]: https://bugs.freedesktop.org/show_bug.cgi?id=110378
  [fdo#110403]: https://bugs.freedesktop.org/show_bug.cgi?id=110403
  [fdo#110728]: https://bugs.freedesktop.org/show_bug.cgi?id=110728
  [fdo#110841]: https://bugs.freedesktop.org/show_bug.cgi?id=110841
  [fdo#110854]: https://bugs.freedesktop.org/show_bug.cgi?id=110854
  [fdo#111068]: https://bugs.freedesktop.org/show_bug.cgi?id=111068
  [fdo#111325]: https://bugs.freedesktop.org/show_bug.cgi?id=111325
  [fdo#111330]: https://bugs.freedesktop.org/show_bug.cgi?id=111330
  [fdo#111609]: https://bugs.freedesktop.org/show_bug.cgi?id=111609


Participating hosts (9 -> 9)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6904 -> Patchwork_14421

  CI-20190529: 20190529
  CI_DRM_6904: ebd25d396bc39c10a4b6947fc42d1bb12ef36cd9 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5186: 0008b3e1b2cf7a73b1e995031c9a73fc97b35aad @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14421: 2cea2a30eedfc2cf93d4ca59d7dc191bd779293c @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14421/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-16 21:41 [PATCH] drm/i915: fix SFC reset flow Daniele Ceraolo Spurio
  2019-09-16 22:41 ` ✓ Fi.CI.BAT: success for " Patchwork
  2019-09-17  5:37 ` ✓ Fi.CI.IGT: " Patchwork
@ 2019-09-17  7:57 ` Chris Wilson
  2019-09-17 18:36   ` Daniele Ceraolo Spurio
  2019-09-17 10:22 ` Tvrtko Ursulin
  2019-09-17 16:06 ` Chris Wilson
  4 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2019-09-17  7:57 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Owen Zhang

Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
> Our assumption that the we can ask the HW to lock the SFC even if not
> currently in use does not match the HW commitment. The expectation from
> the HW is that SW will not try to lock the SFC if the engine is not
> using it and if we do that the behavior is undefined; on ICL the HW
> ends up to returning the ack and ignoring our lock request, but this is
> not guaranteed and we shouldn't expect it going forward.
> 
> Reported-by: Owen Zhang <owen.zhang@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>                                          sfc_forced_lock_ack_bit,
>                                          sfc_forced_lock_ack_bit,
>                                          1000, 0, NULL)) {
> -               DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
> +               /* did we race the unlock? */
> +               if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
> +                       DRM_ERROR("Wait for SFC forced lock ack failed\n");

What's our plan if this *ERROR* is ever triggered?

If it remains do nothing and check the logs on death, then it remains
just a debug splat. If there is a plan to actually do something to
handle the error, do it!
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-16 21:41 [PATCH] drm/i915: fix SFC reset flow Daniele Ceraolo Spurio
                   ` (2 preceding siblings ...)
  2019-09-17  7:57 ` [PATCH] " Chris Wilson
@ 2019-09-17 10:22 ` Tvrtko Ursulin
  2019-09-17 18:29   ` Daniele Ceraolo Spurio
  2019-09-17 16:06 ` Chris Wilson
  4 siblings, 1 reply; 13+ messages in thread
From: Tvrtko Ursulin @ 2019-09-17 10:22 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Owen Zhang


On 16/09/2019 22:41, Daniele Ceraolo Spurio wrote:
> Our assumption that the we can ask the HW to lock the SFC even if not
> currently in use does not match the HW commitment. The expectation from
> the HW is that SW will not try to lock the SFC if the engine is not
> using it and if we do that the behavior is undefined; on ICL the HW
> ends up to returning the ack and ignoring our lock request, but this is
> not guaranteed and we shouldn't expect it going forward.
> 
> Reported-by: Owen Zhang <owen.zhang@intel.com>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_reset.c | 25 +++++++++++++++++--------
>   1 file changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 8327220ac558..900958804bd5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -352,13 +352,15 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>   	}
>   
>   	/*
> -	 * Tell the engine that a software reset is going to happen. The engine
> -	 * will then try to force lock the SFC (if currently locked, it will
> -	 * remain so until we tell the engine it is safe to unlock; if currently
> -	 * unlocked, it will ignore this and all new lock requests). If SFC
> -	 * ends up being locked to the engine we want to reset, we have to reset
> -	 * it as well (we will unlock it once the reset sequence is completed).
> +	 * If the engine is using a SFC, tell the engine that a software reset
> +	 * is going to happen. The engine will then try to force lock the SFC.
> +	 * If SFC ends up being locked to the engine we want to reset, we have
> +	 * to reset it as well (we will unlock it once the reset sequence is
> +	 * completed).
>   	 */
> +	if (!(intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit))
> +		return 0;
> +
>   	rmw_set_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
>   
>   	if (__intel_wait_for_register_fw(uncore,
> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>   					 sfc_forced_lock_ack_bit,
>   					 sfc_forced_lock_ack_bit,
>   					 1000, 0, NULL)) {
> -		DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
> +		/* did we race the unlock? */

How do we race here? Are we not in complete control of the engine at 
this point so the status of this engine using SFC or not should be 
static, no?

> +		if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
> +			DRM_ERROR("Wait for SFC forced lock ack failed\n");
>   		return 0;
>   	}
>   
> +	/* The HW could return the ack even if the sfc is not in use */

But the function checked whether SFC wasn't in use and bailed out early 
- so is this comment relevant? (I understand it is true against the 
specs just wondering about our exact code.)

>   	if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>   		return sfc_reset_bit;
>   
> @@ -382,6 +387,7 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
>   	u8 vdbox_sfc_access = RUNTIME_INFO(engine->i915)->vdbox_sfc_access;
>   	i915_reg_t sfc_forced_lock;
>   	u32 sfc_forced_lock_bit;
> +	u32 lock;
>   
>   	switch (engine->class) {
>   	case VIDEO_DECODE_CLASS:
> @@ -401,7 +407,10 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
>   		return;
>   	}
>   
> -	rmw_clear_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
> +	lock = intel_uncore_read_fw(uncore, sfc_forced_lock);
> +	if (lock & sfc_forced_lock_bit)
> +		intel_uncore_write_fw(uncore, sfc_forced_lock,
> +				      lock & ~sfc_forced_lock_bit);

Here we can't rely on the return code from gen11_lock_sfc and have to 
read the register ourselves? I guess it depends on my question about the 
race comment.

In addition to this I now see that gen11_reset_engines does not use the 
return value from gen11_lock_sfc when deciding which engines it needs to 
unlock. Should we change that as well?


>   }
>   
>   static int gen11_reset_engines(struct intel_gt *gt,
> 

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-16 21:41 [PATCH] drm/i915: fix SFC reset flow Daniele Ceraolo Spurio
                   ` (3 preceding siblings ...)
  2019-09-17 10:22 ` Tvrtko Ursulin
@ 2019-09-17 16:06 ` Chris Wilson
  4 siblings, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2019-09-17 16:06 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Owen Zhang

Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
> @@ -401,7 +407,10 @@ static void gen11_unlock_sfc(struct intel_engine_cs *engine)
>                 return;
>         }
>  
> -       rmw_clear_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
> +       lock = intel_uncore_read_fw(uncore, sfc_forced_lock);
> +       if (lock & sfc_forced_lock_bit)
> +               intel_uncore_write_fw(uncore, sfc_forced_lock,
> +                                     lock & ~sfc_forced_lock_bit);

This is handled by rmw_clear_fw() itself now,
80fa64d62067 ("drm/i915: Only apply a rmw mmio update if the value changes")
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-17 10:22 ` Tvrtko Ursulin
@ 2019-09-17 18:29   ` Daniele Ceraolo Spurio
  2019-09-18 13:42     ` Tvrtko Ursulin
  0 siblings, 1 reply; 13+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-17 18:29 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx; +Cc: Owen Zhang



On 9/17/2019 3:22 AM, Tvrtko Ursulin wrote:
>
> On 16/09/2019 22:41, Daniele Ceraolo Spurio wrote:
>> Our assumption that the we can ask the HW to lock the SFC even if not
>> currently in use does not match the HW commitment. The expectation from
>> the HW is that SW will not try to lock the SFC if the engine is not
>> using it and if we do that the behavior is undefined; on ICL the HW
>> ends up to returning the ack and ignoring our lock request, but this is
>> not guaranteed and we shouldn't expect it going forward.
>>
>> Reported-by: Owen Zhang <owen.zhang@intel.com>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_reset.c | 25 +++++++++++++++++--------
>>   1 file changed, 17 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>> index 8327220ac558..900958804bd5 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>> @@ -352,13 +352,15 @@ static u32 gen11_lock_sfc(struct 
>> intel_engine_cs *engine)
>>       }
>>         /*
>> -     * Tell the engine that a software reset is going to happen. The 
>> engine
>> -     * will then try to force lock the SFC (if currently locked, it 
>> will
>> -     * remain so until we tell the engine it is safe to unlock; if 
>> currently
>> -     * unlocked, it will ignore this and all new lock requests). If SFC
>> -     * ends up being locked to the engine we want to reset, we have 
>> to reset
>> -     * it as well (we will unlock it once the reset sequence is 
>> completed).
>> +     * If the engine is using a SFC, tell the engine that a software 
>> reset
>> +     * is going to happen. The engine will then try to force lock 
>> the SFC.
>> +     * If SFC ends up being locked to the engine we want to reset, 
>> we have
>> +     * to reset it as well (we will unlock it once the reset 
>> sequence is
>> +     * completed).
>>        */
>> +    if (!(intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit))
>> +        return 0;
>> +
>>       rmw_set_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
>>         if (__intel_wait_for_register_fw(uncore,
>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct 
>> intel_engine_cs *engine)
>>                        sfc_forced_lock_ack_bit,
>>                        sfc_forced_lock_ack_bit,
>>                        1000, 0, NULL)) {
>> -        DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>> +        /* did we race the unlock? */
>
> How do we race here? Are we not in complete control of the engine at 
> this point so the status of this engine using SFC or not should be 
> static, no?

The hang detection might be due to a long non-preemptable batch, in 
which case there is in theory a chance for the batch to release the SFC 
while we try to lock it. The chance is incredibly small though, so am I 
being too paranoid?

>
>> +        if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>> +            DRM_ERROR("Wait for SFC forced lock ack failed\n");
>>           return 0;
>>       }
>>   +    /* The HW could return the ack even if the sfc is not in use */
>
> But the function checked whether SFC wasn't in use and bailed out 
> early - so is this comment relevant? (I understand it is true against 
> the specs just wondering about our exact code.)
>

Same rationale as the above, if the engine relased the SFC while we were 
locking it, the locking might have been rejected, but on ICL we still 
get the ack.

>>       if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>>           return sfc_reset_bit;
>>   @@ -382,6 +387,7 @@ static void gen11_unlock_sfc(struct 
>> intel_engine_cs *engine)
>>       u8 vdbox_sfc_access = 
>> RUNTIME_INFO(engine->i915)->vdbox_sfc_access;
>>       i915_reg_t sfc_forced_lock;
>>       u32 sfc_forced_lock_bit;
>> +    u32 lock;
>>         switch (engine->class) {
>>       case VIDEO_DECODE_CLASS:
>> @@ -401,7 +407,10 @@ static void gen11_unlock_sfc(struct 
>> intel_engine_cs *engine)
>>           return;
>>       }
>>   -    rmw_clear_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
>> +    lock = intel_uncore_read_fw(uncore, sfc_forced_lock);
>> +    if (lock & sfc_forced_lock_bit)
>> +        intel_uncore_write_fw(uncore, sfc_forced_lock,
>> +                      lock & ~sfc_forced_lock_bit);
>
> Here we can't rely on the return code from gen11_lock_sfc and have to 
> read the register ourselves? I guess it depends on my question about 
> the race comment.
>
> In addition to this I now see that gen11_reset_engines does not use 
> the return value from gen11_lock_sfc when deciding which engines it 
> needs to unlock. Should we change that as well?

Paranoia here as well, in case something went wrong with the locking I'd 
like to be sure the unlocking can still be performed independently so we 
can recover. e.g. the locking might have succeeded after we hit the 
timeout in gen11_lock_sfc , in which case the return from that function 
won't reflect the status of the HW.

Thanks,
Daniele

>
>
>>   }
>>     static int gen11_reset_engines(struct intel_gt *gt,
>>
>
> Regards,
>
> Tvrtko

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-17  7:57 ` [PATCH] " Chris Wilson
@ 2019-09-17 18:36   ` Daniele Ceraolo Spurio
  2019-09-17 18:57     ` Chris Wilson
  0 siblings, 1 reply; 13+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-17 18:36 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Owen Zhang



On 9/17/2019 12:57 AM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
>> Our assumption that the we can ask the HW to lock the SFC even if not
>> currently in use does not match the HW commitment. The expectation from
>> the HW is that SW will not try to lock the SFC if the engine is not
>> using it and if we do that the behavior is undefined; on ICL the HW
>> ends up to returning the ack and ignoring our lock request, but this is
>> not guaranteed and we shouldn't expect it going forward.
>>
>> Reported-by: Owen Zhang <owen.zhang@intel.com>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> ---
>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>>                                           sfc_forced_lock_ack_bit,
>>                                           sfc_forced_lock_ack_bit,
>>                                           1000, 0, NULL)) {
>> -               DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>> +               /* did we race the unlock? */
>> +               if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>> +                       DRM_ERROR("Wait for SFC forced lock ack failed\n");
> What's our plan if this *ERROR* is ever triggered?
>
> If it remains do nothing and check the logs on death, then it remains
> just a debug splat. If there is a plan to actually do something to
> handle the error, do it!
> -Chris

AFAIU the only thing we can do is escalate to full gpu reset. However, 
the probability of this failing should be next to non-existent (only one 
engine can use the SFC at any time so there is no lock contention), so 
I'm not convinced the fallback is worth the effort. The error is still 
useful IMO to catch unexpected behavior on new platforms, as it happened 
in this case with the media team reporting seeing this message on gen12 
with the previous behavior. This said, I'm happy to add the extra logic 
if you believe it is worth it.

Daniele

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-17 18:36   ` Daniele Ceraolo Spurio
@ 2019-09-17 18:57     ` Chris Wilson
  2019-09-17 19:45       ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2019-09-17 18:57 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Owen Zhang

Quoting Daniele Ceraolo Spurio (2019-09-17 19:36:35)
> 
> 
> On 9/17/2019 12:57 AM, Chris Wilson wrote:
> > Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
> >> Our assumption that the we can ask the HW to lock the SFC even if not
> >> currently in use does not match the HW commitment. The expectation from
> >> the HW is that SW will not try to lock the SFC if the engine is not
> >> using it and if we do that the behavior is undefined; on ICL the HW
> >> ends up to returning the ack and ignoring our lock request, but this is
> >> not guaranteed and we shouldn't expect it going forward.
> >>
> >> Reported-by: Owen Zhang <owen.zhang@intel.com>
> >> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >> ---
> >> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
> >>                                           sfc_forced_lock_ack_bit,
> >>                                           sfc_forced_lock_ack_bit,
> >>                                           1000, 0, NULL)) {
> >> -               DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
> >> +               /* did we race the unlock? */
> >> +               if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
> >> +                       DRM_ERROR("Wait for SFC forced lock ack failed\n");
> > What's our plan if this *ERROR* is ever triggered?
> >
> > If it remains do nothing and check the logs on death, then it remains
> > just a debug splat. If there is a plan to actually do something to
> > handle the error, do it!
> > -Chris
> 
> AFAIU the only thing we can do is escalate to full gpu reset. However, 
> the probability of this failing should be next to non-existent (only one 
> engine can use the SFC at any time so there is no lock contention), so 
> I'm not convinced the fallback is worth the effort. The error is still 
> useful IMO to catch unexpected behavior on new platforms, as it happened 
> in this case with the media team reporting seeing this message on gen12 
> with the previous behavior. This said, I'm happy to add the extra logic 
> if you believe it is worth it.

We've see this message on every icl run!
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-17 18:57     ` Chris Wilson
@ 2019-09-17 19:45       ` Daniele Ceraolo Spurio
  2019-09-17 19:49         ` Chris Wilson
  0 siblings, 1 reply; 13+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-17 19:45 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Owen Zhang



On 9/17/2019 11:57 AM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-09-17 19:36:35)
>>
>> On 9/17/2019 12:57 AM, Chris Wilson wrote:
>>> Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
>>>> Our assumption that the we can ask the HW to lock the SFC even if not
>>>> currently in use does not match the HW commitment. The expectation from
>>>> the HW is that SW will not try to lock the SFC if the engine is not
>>>> using it and if we do that the behavior is undefined; on ICL the HW
>>>> ends up to returning the ack and ignoring our lock request, but this is
>>>> not guaranteed and we shouldn't expect it going forward.
>>>>
>>>> Reported-by: Owen Zhang <owen.zhang@intel.com>
>>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>>> ---
>>>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>>>>                                            sfc_forced_lock_ack_bit,
>>>>                                            sfc_forced_lock_ack_bit,
>>>>                                            1000, 0, NULL)) {
>>>> -               DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>>>> +               /* did we race the unlock? */
>>>> +               if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>>>> +                       DRM_ERROR("Wait for SFC forced lock ack failed\n");
>>> What's our plan if this *ERROR* is ever triggered?
>>>
>>> If it remains do nothing and check the logs on death, then it remains
>>> just a debug splat. If there is a plan to actually do something to
>>> handle the error, do it!
>>> -Chris
>> AFAIU the only thing we can do is escalate to full gpu reset. However,
>> the probability of this failing should be next to non-existent (only one
>> engine can use the SFC at any time so there is no lock contention), so
>> I'm not convinced the fallback is worth the effort. The error is still
>> useful IMO to catch unexpected behavior on new platforms, as it happened
>> in this case with the media team reporting seeing this message on gen12
>> with the previous behavior. This said, I'm happy to add the extra logic
>> if you believe it is worth it.
> We've see this message on every icl run!
> -Chris

I've never noticed it, which tests are hitting it? My understanding from 
what the HW team said is that on ICL the ack will always come back (even 
if it is not part of the "official" SW/HW interface) and the HW tweak 
that stops that is a gen12 change. Something else might be wrong is this 
is firing off in our ICL CI, also because I don't think we have any test 
case that actually uses the SFC, do we?

Daniele

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-17 19:45       ` Daniele Ceraolo Spurio
@ 2019-09-17 19:49         ` Chris Wilson
  2019-09-17 20:53           ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 13+ messages in thread
From: Chris Wilson @ 2019-09-17 19:49 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Owen Zhang

Quoting Daniele Ceraolo Spurio (2019-09-17 20:45:02)
> 
> 
> On 9/17/2019 11:57 AM, Chris Wilson wrote:
> > Quoting Daniele Ceraolo Spurio (2019-09-17 19:36:35)
> >>
> >> On 9/17/2019 12:57 AM, Chris Wilson wrote:
> >>> Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
> >>>> Our assumption that the we can ask the HW to lock the SFC even if not
> >>>> currently in use does not match the HW commitment. The expectation from
> >>>> the HW is that SW will not try to lock the SFC if the engine is not
> >>>> using it and if we do that the behavior is undefined; on ICL the HW
> >>>> ends up to returning the ack and ignoring our lock request, but this is
> >>>> not guaranteed and we shouldn't expect it going forward.
> >>>>
> >>>> Reported-by: Owen Zhang <owen.zhang@intel.com>
> >>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> >>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >>>> ---
> >>>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
> >>>>                                            sfc_forced_lock_ack_bit,
> >>>>                                            sfc_forced_lock_ack_bit,
> >>>>                                            1000, 0, NULL)) {
> >>>> -               DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
> >>>> +               /* did we race the unlock? */
> >>>> +               if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
> >>>> +                       DRM_ERROR("Wait for SFC forced lock ack failed\n");
> >>> What's our plan if this *ERROR* is ever triggered?
> >>>
> >>> If it remains do nothing and check the logs on death, then it remains
> >>> just a debug splat. If there is a plan to actually do something to
> >>> handle the error, do it!
> >>> -Chris
> >> AFAIU the only thing we can do is escalate to full gpu reset. However,
> >> the probability of this failing should be next to non-existent (only one
> >> engine can use the SFC at any time so there is no lock contention), so
> >> I'm not convinced the fallback is worth the effort. The error is still
> >> useful IMO to catch unexpected behavior on new platforms, as it happened
> >> in this case with the media team reporting seeing this message on gen12
> >> with the previous behavior. This said, I'm happy to add the extra logic
> >> if you believe it is worth it.
> > We've see this message on every icl run!
> > -Chris
> 
> I've never noticed it, which tests are hitting it? My understanding from 
> what the HW team said is that on ICL the ack will always come back (even 
> if it is not part of the "official" SW/HW interface) and the HW tweak 
> that stops that is a gen12 change. Something else might be wrong is this 
> is firing off in our ICL CI, also because I don't think we have any test 
> case that actually uses the SFC, do we?

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6911/fi-icl-u2/igt@i915_selftest@live_hangcheck.html

All icl, live_hangcheck or live_reset, for as long as I can remember.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-17 19:49         ` Chris Wilson
@ 2019-09-17 20:53           ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 13+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-17 20:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Owen Zhang



On 9/17/2019 12:49 PM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-09-17 20:45:02)
>>
>> On 9/17/2019 11:57 AM, Chris Wilson wrote:
>>> Quoting Daniele Ceraolo Spurio (2019-09-17 19:36:35)
>>>> On 9/17/2019 12:57 AM, Chris Wilson wrote:
>>>>> Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
>>>>>> Our assumption that the we can ask the HW to lock the SFC even if not
>>>>>> currently in use does not match the HW commitment. The expectation from
>>>>>> the HW is that SW will not try to lock the SFC if the engine is not
>>>>>> using it and if we do that the behavior is undefined; on ICL the HW
>>>>>> ends up to returning the ack and ignoring our lock request, but this is
>>>>>> not guaranteed and we shouldn't expect it going forward.
>>>>>>
>>>>>> Reported-by: Owen Zhang <owen.zhang@intel.com>
>>>>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>>>>> ---
>>>>>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>>>>>>                                             sfc_forced_lock_ack_bit,
>>>>>>                                             sfc_forced_lock_ack_bit,
>>>>>>                                             1000, 0, NULL)) {
>>>>>> -               DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>>>>>> +               /* did we race the unlock? */
>>>>>> +               if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>>>>>> +                       DRM_ERROR("Wait for SFC forced lock ack failed\n");
>>>>> What's our plan if this *ERROR* is ever triggered?
>>>>>
>>>>> If it remains do nothing and check the logs on death, then it remains
>>>>> just a debug splat. If there is a plan to actually do something to
>>>>> handle the error, do it!
>>>>> -Chris
>>>> AFAIU the only thing we can do is escalate to full gpu reset. However,
>>>> the probability of this failing should be next to non-existent (only one
>>>> engine can use the SFC at any time so there is no lock contention), so
>>>> I'm not convinced the fallback is worth the effort. The error is still
>>>> useful IMO to catch unexpected behavior on new platforms, as it happened
>>>> in this case with the media team reporting seeing this message on gen12
>>>> with the previous behavior. This said, I'm happy to add the extra logic
>>>> if you believe it is worth it.
>>> We've see this message on every icl run!
>>> -Chris
>> I've never noticed it, which tests are hitting it? My understanding from
>> what the HW team said is that on ICL the ack will always come back (even
>> if it is not part of the "official" SW/HW interface) and the HW tweak
>> that stops that is a gen12 change. Something else might be wrong is this
>> is firing off in our ICL CI, also because I don't think we have any test
>> case that actually uses the SFC, do we?
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6911/fi-icl-u2/igt@i915_selftest@live_hangcheck.html
>
> All icl, live_hangcheck or live_reset, for as long as I can remember.
> -Chris

Thanks. I'm going to check with the HW team to see what their 
recommended timeout value is, in case ours is too short. It could also 
be that even on ICL the ack is not always returned if the SFC is not 
actually in use.

Daniele

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] drm/i915: fix SFC reset flow
  2019-09-17 18:29   ` Daniele Ceraolo Spurio
@ 2019-09-18 13:42     ` Tvrtko Ursulin
  0 siblings, 0 replies; 13+ messages in thread
From: Tvrtko Ursulin @ 2019-09-18 13:42 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx; +Cc: Owen Zhang


On 17/09/2019 19:29, Daniele Ceraolo Spurio wrote:
> 
> 
> On 9/17/2019 3:22 AM, Tvrtko Ursulin wrote:
>>
>> On 16/09/2019 22:41, Daniele Ceraolo Spurio wrote:
>>> Our assumption that the we can ask the HW to lock the SFC even if not
>>> currently in use does not match the HW commitment. The expectation from
>>> the HW is that SW will not try to lock the SFC if the engine is not
>>> using it and if we do that the behavior is undefined; on ICL the HW
>>> ends up to returning the ack and ignoring our lock request, but this is
>>> not guaranteed and we shouldn't expect it going forward.
>>>
>>> Reported-by: Owen Zhang <owen.zhang@intel.com>
>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/gt/intel_reset.c | 25 +++++++++++++++++--------
>>>   1 file changed, 17 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
>>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> index 8327220ac558..900958804bd5 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> @@ -352,13 +352,15 @@ static u32 gen11_lock_sfc(struct 
>>> intel_engine_cs *engine)
>>>       }
>>>         /*
>>> -     * Tell the engine that a software reset is going to happen. The 
>>> engine
>>> -     * will then try to force lock the SFC (if currently locked, it 
>>> will
>>> -     * remain so until we tell the engine it is safe to unlock; if 
>>> currently
>>> -     * unlocked, it will ignore this and all new lock requests). If SFC
>>> -     * ends up being locked to the engine we want to reset, we have 
>>> to reset
>>> -     * it as well (we will unlock it once the reset sequence is 
>>> completed).
>>> +     * If the engine is using a SFC, tell the engine that a software 
>>> reset
>>> +     * is going to happen. The engine will then try to force lock 
>>> the SFC.
>>> +     * If SFC ends up being locked to the engine we want to reset, 
>>> we have
>>> +     * to reset it as well (we will unlock it once the reset 
>>> sequence is
>>> +     * completed).
>>>        */
>>> +    if (!(intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit))
>>> +        return 0;
>>> +
>>>       rmw_set_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
>>>         if (__intel_wait_for_register_fw(uncore,
>>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct 
>>> intel_engine_cs *engine)
>>>                        sfc_forced_lock_ack_bit,
>>>                        sfc_forced_lock_ack_bit,
>>>                        1000, 0, NULL)) {
>>> -        DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>>> +        /* did we race the unlock? */
>>
>> How do we race here? Are we not in complete control of the engine at 
>> this point so the status of this engine using SFC or not should be 
>> static, no?
> 
> The hang detection might be due to a long non-preemptable batch, in 
> which case there is in theory a chance for the batch to release the SFC 
> while we try to lock it. The chance is incredibly small though, so am I 
> being too paranoid?

I get it now, it is a legitimate race.

> 
>>
>>> +        if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>>> +            DRM_ERROR("Wait for SFC forced lock ack failed\n");
>>>           return 0;
>>>       }
>>>   +    /* The HW could return the ack even if the sfc is not in use */
>>
>> But the function checked whether SFC wasn't in use and bailed out 
>> early - so is this comment relevant? (I understand it is true against 
>> the specs just wondering about our exact code.)
>>
> 
> Same rationale as the above, if the engine relased the SFC while we were 
> locking it, the locking might have been rejected, but on ICL we still 
> get the ack.
> 
>>>       if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>>>           return sfc_reset_bit;
>>>   @@ -382,6 +387,7 @@ static void gen11_unlock_sfc(struct 
>>> intel_engine_cs *engine)
>>>       u8 vdbox_sfc_access = 
>>> RUNTIME_INFO(engine->i915)->vdbox_sfc_access;
>>>       i915_reg_t sfc_forced_lock;
>>>       u32 sfc_forced_lock_bit;
>>> +    u32 lock;
>>>         switch (engine->class) {
>>>       case VIDEO_DECODE_CLASS:
>>> @@ -401,7 +407,10 @@ static void gen11_unlock_sfc(struct 
>>> intel_engine_cs *engine)
>>>           return;
>>>       }
>>>   -    rmw_clear_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
>>> +    lock = intel_uncore_read_fw(uncore, sfc_forced_lock);
>>> +    if (lock & sfc_forced_lock_bit)
>>> +        intel_uncore_write_fw(uncore, sfc_forced_lock,
>>> +                      lock & ~sfc_forced_lock_bit);
>>
>> Here we can't rely on the return code from gen11_lock_sfc and have to 
>> read the register ourselves? I guess it depends on my question about 
>> the race comment.
>>
>> In addition to this I now see that gen11_reset_engines does not use 
>> the return value from gen11_lock_sfc when deciding which engines it 
>> needs to unlock. Should we change that as well?
> 
> Paranoia here as well, in case something went wrong with the locking I'd 
> like to be sure the unlocking can still be performed independently so we 
> can recover. e.g. the locking might have succeeded after we hit the 
> timeout in gen11_lock_sfc , in which case the return from that function 
> won't reflect the status of the HW.

Put in a comment here explaining what's the story and with that:

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Another option is to cross-check software vs hardware locked status at 
this point and to log a mismatch. Just so we get data how often this 
happens in practice. This is probably best as follow up.

Regards,

Tvrtko

> 
> Thanks,
> Daniele
> 
>>
>>
>>>   }
>>>     static int gen11_reset_engines(struct intel_gt *gt,
>>>
>>
>> Regards,
>>
>> Tvrtko
> 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-09-18 13:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-16 21:41 [PATCH] drm/i915: fix SFC reset flow Daniele Ceraolo Spurio
2019-09-16 22:41 ` ✓ Fi.CI.BAT: success for " Patchwork
2019-09-17  5:37 ` ✓ Fi.CI.IGT: " Patchwork
2019-09-17  7:57 ` [PATCH] " Chris Wilson
2019-09-17 18:36   ` Daniele Ceraolo Spurio
2019-09-17 18:57     ` Chris Wilson
2019-09-17 19:45       ` Daniele Ceraolo Spurio
2019-09-17 19:49         ` Chris Wilson
2019-09-17 20:53           ` Daniele Ceraolo Spurio
2019-09-17 10:22 ` Tvrtko Ursulin
2019-09-17 18:29   ` Daniele Ceraolo Spurio
2019-09-18 13:42     ` Tvrtko Ursulin
2019-09-17 16:06 ` Chris Wilson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.