[PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
@ 2019-09-09 22:55 Chris Wilson
  2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-09 22:55 UTC (permalink / raw)
  To: intel-gfx

If the display is inactive, we need not worry about the gpu reset
clobbering the display!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b9d84d52e986..fe57296b790c 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -7,6 +7,7 @@
 #include <linux/sched/mm.h>
 #include <linux/stop_machine.h>
 
+#include "display/intel_display.h"
 #include "display/intel_display_types.h"
 #include "display/intel_overlay.h"
 
@@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
 	intel_engine_queue_breadcrumbs(engine);
 }
 
+static bool reset_clobbers_display(struct drm_i915_private *i915)
+{
+	struct intel_crtc *crtc;
+
+	if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
+		return false;
+
+	for_each_intel_crtc(&i915->drm, crtc) {
+		if (crtc->active)
+			return true;
+	}
+
+	return false;
+}
+
 static void __intel_gt_set_wedged(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
@@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
 	awake = reset_prepare(gt);
 
 	/* Even if the GPU reset fails, it should still stop the engines */
-	if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
+	if (!reset_clobbers_display(gt->i915))
 		__intel_gt_reset(gt, ALL_ENGINES);
 
 	for_each_engine(engine, gt->i915, id)
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
  2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
@ 2019-09-09 22:55 ` Chris Wilson
  2019-09-10  0:59   ` Daniele Ceraolo Spurio
  2019-09-10  7:39   ` Janusz Krzysztofik
  2019-09-09 23:33 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Patchwork
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-09 22:55 UTC (permalink / raw)
  To: intel-gfx

Unwedging the GPU requires a successful GPU reset before we restore the
default submission, or else we may see residual context switch events
that we were not expecting.

Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index fe57296b790c..5242496a893a 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
 	struct intel_gt_timelines *timelines = &gt->timelines;
 	struct intel_timeline *tl;
 	unsigned long flags;
+	bool ok;
 
 	if (!test_bit(I915_WEDGED, &gt->reset.flags))
 		return true;
@@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
 	}
 	spin_unlock_irqrestore(&timelines->lock, flags);
 
-	intel_gt_sanitize(gt, false);
+	ok = false;
+	if (!reset_clobbers_display(gt->i915))
+		ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
+	if (!ok)
+		return false;
 
 	/*
 	 * Undo nop_submit_request. We prevent all new i915 requests from
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
  2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
@ 2019-09-09 23:33 ` Patchwork
  2019-09-10  7:39 ` [PATCH 1/2] " Ville Syrjälä
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2019-09-09 23:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
URL   : https://patchwork.freedesktop.org/series/66459/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6855 -> Patchwork_14334
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_14334 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_14334, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_14334:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_module_load@reload-with-fault-injection:
    - fi-blb-e6850:       NOTRUN -> [DMESG-WARN][1]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-blb-e6850/igt@i915_module_load@reload-with-fault-injection.html
    - fi-bwr-2160:        [PASS][2] -> [DMESG-WARN][3]
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-bwr-2160/igt@i915_module_load@reload-with-fault-injection.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-bwr-2160/igt@i915_module_load@reload-with-fault-injection.html
    - fi-pnv-d510:        NOTRUN -> [DMESG-WARN][4]
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-pnv-d510/igt@i915_module_load@reload-with-fault-injection.html
    - fi-gdg-551:         [PASS][5] -> [DMESG-WARN][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-gdg-551/igt@i915_module_load@reload-with-fault-injection.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-gdg-551/igt@i915_module_load@reload-with-fault-injection.html

  * igt@runner@aborted:
    - fi-pnv-d510:        NOTRUN -> [FAIL][7]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-pnv-d510/igt@runner@aborted.html
    - fi-gdg-551:         NOTRUN -> [FAIL][8]
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-gdg-551/igt@runner@aborted.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_sync@basic-store-all:
    - {fi-tgl-u}:         NOTRUN -> [INCOMPLETE][9]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-tgl-u/igt@gem_sync@basic-store-all.html

  
Known issues
------------

  Here are the changes found in Patchwork_14334 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_flink_basic@basic:
    - fi-icl-u3:          [PASS][10] -> [DMESG-WARN][11] ([fdo#107724])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-icl-u3/igt@gem_flink_basic@basic.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-icl-u3/igt@gem_flink_basic@basic.html

  * igt@i915_module_load@reload:
    - fi-icl-u3:          [PASS][12] -> [DMESG-WARN][13] ([fdo#107724] / [fdo#111214])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-icl-u3/igt@i915_module_load@reload.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-icl-u3/igt@i915_module_load@reload.html

  * igt@i915_selftest@live_gem_contexts:
    - fi-cfl-8700k:       [PASS][14] -> [INCOMPLETE][15] ([fdo#111514])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html

  
#### Possible fixes ####

  * igt@core_auth@basic-auth:
    - fi-icl-u3:          [DMESG-WARN][16] ([fdo#107724]) -> [PASS][17]
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-icl-u3/igt@core_auth@basic-auth.html
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-icl-u3/igt@core_auth@basic-auth.html

  * igt@gem_exec_gttfill@basic:
    - {fi-tgl-u}:         [INCOMPLETE][18] ([fdo#111593]) -> [PASS][19]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-tgl-u/igt@gem_exec_gttfill@basic.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-tgl-u/igt@gem_exec_gttfill@basic.html

  * igt@i915_module_load@reload:
    - fi-blb-e6850:       [INCOMPLETE][20] ([fdo#107718]) -> [PASS][21]
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-blb-e6850/igt@i915_module_load@reload.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-blb-e6850/igt@i915_module_load@reload.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#106107]: https://bugs.freedesktop.org/show_bug.cgi?id=106107
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#111214]: https://bugs.freedesktop.org/show_bug.cgi?id=111214
  [fdo#111514]: https://bugs.freedesktop.org/show_bug.cgi?id=111514
  [fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593


Participating hosts (53 -> 47)
------------------------------

  Additional (1): fi-pnv-d510 
  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6855 -> Patchwork_14334

  CI-20190529: 20190529
  CI_DRM_6855: 4cef087ee0994669a097115945717fedb413c368 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5177: 8f351d693352d21c96cef38c3fd77f778c6d7c33 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14334: bf285b1528cb5a10ba2ec677b9da45507c996e9f @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

bf285b1528cb drm/i915/gt: Only unwedge if we can reset first
d5e889771f59 drm/i915/gt: Allow clobbering gpu resets if the display is off

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
  2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
@ 2019-09-10  0:59   ` Daniele Ceraolo Spurio
  2019-09-10  6:06     ` Chris Wilson
  2019-09-10  7:39   ` Janusz Krzysztofik
  1 sibling, 1 reply; 18+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-10  0:59 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 9/9/19 3:55 PM, Chris Wilson wrote:
> Unwedging the GPU requires a successful GPU reset before we restore the
> default submission, or else we may see residual context switch events
> that we were not expecting.
> 
> Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>   drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index fe57296b790c..5242496a893a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   	struct intel_gt_timelines *timelines = &gt->timelines;
>   	struct intel_timeline *tl;
>   	unsigned long flags;
> +	bool ok;
>   
>   	if (!test_bit(I915_WEDGED, &gt->reset.flags))
>   		return true;
> @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>   	}
>   	spin_unlock_irqrestore(&timelines->lock, flags);
>   
> -	intel_gt_sanitize(gt, false);
> +	ok = false;
> +	if (!reset_clobbers_display(gt->i915))
> +		ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;

Of the thing we had in the gt_sanitize, we're ok skipping the 
uc_sanitize() because we take care of that during wedge (from 
intel_uc_reset_prepare), but what about the loop of 
__intel_engine_reset()? Is that safe to skip here?

Apart from that, the patch LGTM. Worth noting that with this change a 
successful reset is required to unwedge even after a suspend/resume 
cycle (in gem_sanitize), which is a good thing IMO.

Daniele

> +	if (!ok)
> +		return false;
>   
>   	/*
>   	 * Undo nop_submit_request. We prevent all new i915 requests from
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
  2019-09-10  0:59   ` Daniele Ceraolo Spurio
@ 2019-09-10  6:06     ` Chris Wilson
  2019-09-10 21:20       ` Daniele Ceraolo Spurio
  0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2019-09-10  6:06 UTC (permalink / raw)
  To: Daniele Ceraolo Spurio, intel-gfx

Quoting Daniele Ceraolo Spurio (2019-09-10 01:59:38)
> 
> 
> On 9/9/19 3:55 PM, Chris Wilson wrote:
> > Unwedging the GPU requires a successful GPU reset before we restore the
> > default submission, or else we may see residual context switch events
> > that we were not expecting.
> > 
> > Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
> >   1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index fe57296b790c..5242496a893a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> >       struct intel_gt_timelines *timelines = &gt->timelines;
> >       struct intel_timeline *tl;
> >       unsigned long flags;
> > +     bool ok;
> >   
> >       if (!test_bit(I915_WEDGED, &gt->reset.flags))
> >               return true;
> > @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> >       }
> >       spin_unlock_irqrestore(&timelines->lock, flags);
> >   
> > -     intel_gt_sanitize(gt, false);
> > +     ok = false;
> > +     if (!reset_clobbers_display(gt->i915))
> > +             ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
> 
> Of the thing we had in the gt_sanitize, we're ok skipping the 
> uc_sanitize() because we take care of that during wedge (from 
> intel_uc_reset_prepare), but what about the loop of 
> __intel_engine_reset()? Is that safe to skip here?

I think yes, because we always follow the unwedge with a GT restart. That
is either via the full reset or the sanitize+restart on resume. Both call
paths will also set the wedged bit if they fail. gem_eio/suspend should
be testing the recovery upon resume path, and even gem_eio/*-stress
should give responsible coverage of the normal recovery via full reset.
 
> Apart from that, the patch LGTM. Worth noting that with this change a 
> successful reset is required to unwedge even after a suspend/resume 
> cycle (in gem_sanitize), which is a good thing IMO.

Hence why relaxing the gpu_clobbers_display is important to retain the
ability to clear wedged across suspend on older devices.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
  2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
  2019-09-09 23:33 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Patchwork
@ 2019-09-10  7:39 ` Ville Syrjälä
  2019-09-10  8:32   ` Chris Wilson
  2019-09-11 10:19   ` [PATCH v2] " Chris Wilson
  2019-09-11 10:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2) Patchwork
  2019-09-11 11:35 ` ✗ Fi.CI.BAT: failure " Patchwork
  4 siblings, 2 replies; 18+ messages in thread
From: Ville Syrjälä @ 2019-09-10  7:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> If the display is inactive, we need not worry about the gpu reset
> clobbering the display!
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index b9d84d52e986..fe57296b790c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -7,6 +7,7 @@
>  #include <linux/sched/mm.h>
>  #include <linux/stop_machine.h>
>  
> +#include "display/intel_display.h"
>  #include "display/intel_display_types.h"
>  #include "display/intel_overlay.h"
>  
> @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
>  	intel_engine_queue_breadcrumbs(engine);
>  }
>  
> +static bool reset_clobbers_display(struct drm_i915_private *i915)
> +{
> +	struct intel_crtc *crtc;
> +
> +	if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> +		return false;
> +
> +	for_each_intel_crtc(&i915->drm, crtc) {
> +		if (crtc->active)
> +			return true;
> +	}

This feels racy. crtc->active gets set somewhere in the middle of the
modeset, so looks like now we could clobber all the stuff we've already
set up before crtc->active got set.

> +
> +	return false;
> +}
> +
>  static void __intel_gt_set_wedged(struct intel_gt *gt)
>  {
>  	struct intel_engine_cs *engine;
> @@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
>  	awake = reset_prepare(gt);
>  
>  	/* Even if the GPU reset fails, it should still stop the engines */
> -	if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
> +	if (!reset_clobbers_display(gt->i915))
>  		__intel_gt_reset(gt, ALL_ENGINES);
>  
>  	for_each_engine(engine, gt->i915, id)
> -- 
> 2.23.0
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
  2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
  2019-09-10  0:59   ` Daniele Ceraolo Spurio
@ 2019-09-10  7:39   ` Janusz Krzysztofik
  2019-09-13  7:41     ` Chris Wilson
  1 sibling, 1 reply; 18+ messages in thread
From: Janusz Krzysztofik @ 2019-09-10  7:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

Hi Chris,

On Tuesday, September 10, 2019 12:55:36 AM CEST Chris Wilson wrote:
> Unwedging the GPU requires a successful GPU reset before we restore the
> default submission, or else we may see residual context switch events
> that we were not expecting.
> 
> Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/
gt/intel_reset.c
> index fe57296b790c..5242496a893a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>  	struct intel_gt_timelines *timelines = &gt->timelines;
>  	struct intel_timeline *tl;
>  	unsigned long flags;
> +	bool ok;
>  
>  	if (!test_bit(I915_WEDGED, &gt->reset.flags))
>  		return true;
> @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt 
*gt)
>  	}
>  	spin_unlock_irqrestore(&timelines->lock, flags);
>  
> -	intel_gt_sanitize(gt, false);
> +	ok = false;
> +	if (!reset_clobbers_display(gt->i915))
> +		ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
> +	if (!ok)
> +		return false;

Before your change, that code was executed inside intel_gt_sanitize(gt, false) 
which unfortunately didn't return any result.  The same outcome could be 
achieved by redefining intel_gt_sanitize() to return that result and saying:

	if (!intel_gt_sanitize(gt, false)
		return false;

Is there any specific reason for intel_gt_sanitize() returning void?

Thanks,
Janusz

>  
>  	/*
>  	 * Undo nop_submit_request. We prevent all new i915 requests from
> 




_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-10  7:39 ` [PATCH 1/2] " Ville Syrjälä
@ 2019-09-10  8:32   ` Chris Wilson
  2019-09-10  8:58     ` Ville Syrjälä
  2019-09-11 10:19   ` [PATCH v2] " Chris Wilson
  1 sibling, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2019-09-10  8:32 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

Quoting Ville Syrjälä (2019-09-10 08:39:31)
> On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> > If the display is inactive, we need not worry about the gpu reset
> > clobbering the display!
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> >  1 file changed, 17 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index b9d84d52e986..fe57296b790c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/sched/mm.h>
> >  #include <linux/stop_machine.h>
> >  
> > +#include "display/intel_display.h"
> >  #include "display/intel_display_types.h"
> >  #include "display/intel_overlay.h"
> >  
> > @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> >       intel_engine_queue_breadcrumbs(engine);
> >  }
> >  
> > +static bool reset_clobbers_display(struct drm_i915_private *i915)
> > +{
> > +     struct intel_crtc *crtc;
> > +
> > +     if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> > +             return false;
> > +
> > +     for_each_intel_crtc(&i915->drm, crtc) {
> > +             if (crtc->active)
> > +                     return true;
> > +     }
> 
> This feels racy. crtc->active gets set somewhere in the middle of the
> modeset, so looks like now we could clobber all the stuff we've already
> set up before crtc->active got set.

The question is, does it matter? After we unwedge, we clobber again for
realz. Not that we have anything deliberately trying to race
wedge-vs-modeset(on/off).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-10  8:32   ` Chris Wilson
@ 2019-09-10  8:58     ` Ville Syrjälä
  2019-09-11 10:01       ` Chris Wilson
  0 siblings, 1 reply; 18+ messages in thread
From: Ville Syrjälä @ 2019-09-10  8:58 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Tue, Sep 10, 2019 at 09:32:45AM +0100, Chris Wilson wrote:
> Quoting Ville Syrjälä (2019-09-10 08:39:31)
> > On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> > > If the display is inactive, we need not worry about the gpu reset
> > > clobbering the display!
> > > 
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> > >  1 file changed, 17 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > index b9d84d52e986..fe57296b790c 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > @@ -7,6 +7,7 @@
> > >  #include <linux/sched/mm.h>
> > >  #include <linux/stop_machine.h>
> > >  
> > > +#include "display/intel_display.h"
> > >  #include "display/intel_display_types.h"
> > >  #include "display/intel_overlay.h"
> > >  
> > > @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> > >       intel_engine_queue_breadcrumbs(engine);
> > >  }
> > >  
> > > +static bool reset_clobbers_display(struct drm_i915_private *i915)
> > > +{
> > > +     struct intel_crtc *crtc;
> > > +
> > > +     if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> > > +             return false;
> > > +
> > > +     for_each_intel_crtc(&i915->drm, crtc) {
> > > +             if (crtc->active)
> > > +                     return true;
> > > +     }
> > 
> > This feels racy. crtc->active gets set somewhere in the middle of the
> > modeset, so looks like now we could clobber all the stuff we've already
> > set up before crtc->active got set.
> 
> The question is, does it matter? After we unwedge, we clobber again for
> realz. Not that we have anything deliberately trying to race
> wedge-vs-modeset(on/off).

Not sure. But I suspect the hw might decide to hang the box if eg.
the PLL is borked when we touch something that really needs the clock.

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
  2019-09-10  6:06     ` Chris Wilson
@ 2019-09-10 21:20       ` Daniele Ceraolo Spurio
  0 siblings, 0 replies; 18+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-10 21:20 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx



On 9/9/19 11:06 PM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-09-10 01:59:38)
>>
>>
>> On 9/9/19 3:55 PM, Chris Wilson wrote:
>>> Unwedging the GPU requires a successful GPU reset before we restore the
>>> default submission, or else we may see residual context switch events
>>> that we were not expecting.
>>>
>>> Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
>>>    1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> index fe57296b790c..5242496a893a 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>>>        struct intel_gt_timelines *timelines = &gt->timelines;
>>>        struct intel_timeline *tl;
>>>        unsigned long flags;
>>> +     bool ok;
>>>    
>>>        if (!test_bit(I915_WEDGED, &gt->reset.flags))
>>>                return true;
>>> @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>>>        }
>>>        spin_unlock_irqrestore(&timelines->lock, flags);
>>>    
>>> -     intel_gt_sanitize(gt, false);
>>> +     ok = false;
>>> +     if (!reset_clobbers_display(gt->i915))
>>> +             ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
>>
>> Of the thing we had in the gt_sanitize, we're ok skipping the
>> uc_sanitize() because we take care of that during wedge (from
>> intel_uc_reset_prepare), but what about the loop of
>> __intel_engine_reset()? Is that safe to skip here?
> 
> I think yes, because we always follow the unwedge with a GT restart. That
> is either via the full reset or the sanitize+restart on resume. Both call
> paths will also set the wedged bit if they fail. gem_eio/suspend should
> be testing the recovery upon resume path, and even gem_eio/*-stress
> should give responsible coverage of the normal recovery via full reset.
>   
>> Apart from that, the patch LGTM. Worth noting that with this change a
>> successful reset is required to unwedge even after a suspend/resume
>> cycle (in gem_sanitize), which is a good thing IMO.
> 
> Hence why relaxing the gpu_clobbers_display is important to retain the
> ability to clear wedged across suspend on older devices.
> -Chris
> 

Sold!

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

Daniele
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-10  8:58     ` Ville Syrjälä
@ 2019-09-11 10:01       ` Chris Wilson
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 10:01 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

Quoting Ville Syrjälä (2019-09-10 09:58:38)
> On Tue, Sep 10, 2019 at 09:32:45AM +0100, Chris Wilson wrote:
> > Quoting Ville Syrjälä (2019-09-10 08:39:31)
> > > On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> > > > If the display is inactive, we need not worry about the gpu reset
> > > > clobbering the display!
> > > > 
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > ---
> > > >  drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> > > >  1 file changed, 17 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > index b9d84d52e986..fe57296b790c 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > @@ -7,6 +7,7 @@
> > > >  #include <linux/sched/mm.h>
> > > >  #include <linux/stop_machine.h>
> > > >  
> > > > +#include "display/intel_display.h"
> > > >  #include "display/intel_display_types.h"
> > > >  #include "display/intel_overlay.h"
> > > >  
> > > > @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> > > >       intel_engine_queue_breadcrumbs(engine);
> > > >  }
> > > >  
> > > > +static bool reset_clobbers_display(struct drm_i915_private *i915)
> > > > +{
> > > > +     struct intel_crtc *crtc;
> > > > +
> > > > +     if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> > > > +             return false;
> > > > +
> > > > +     for_each_intel_crtc(&i915->drm, crtc) {
> > > > +             if (crtc->active)
> > > > +                     return true;
> > > > +     }
> > > 
> > > This feels racy. crtc->active gets set somewhere in the middle of the
> > > modeset, so looks like now we could clobber all the stuff we've already
> > > set up before crtc->active got set.
> > 
> > The question is, does it matter? After we unwedge, we clobber again for
> > realz. Not that we have anything deliberately trying to race
> > wedge-vs-modeset(on/off).
> 
> Not sure. But I suspect the hw might decide to hang the box if eg.
> the PLL is borked when we touch something that really needs the clock.

crtc->active only changes within the atomic_tail? Worst case I guess is
we add

srcu = -ENODEV;
if (i915->gpu_reset_clobbers_display)
	/* only fails if interrupted */
	srcu = intel_gt_reset_trylock(&i915->gt);

...

if (srcu > 0)
	intel_gt_reset_unlock(&i915->gt, srcu);

-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-10  7:39 ` [PATCH 1/2] " Ville Syrjälä
  2019-09-10  8:32   ` Chris Wilson
@ 2019-09-11 10:19   ` Chris Wilson
  2019-09-11 10:48     ` Chris Wilson
  2019-09-11 10:52     ` Ville Syrjälä
  1 sibling, 2 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 10:19 UTC (permalink / raw)
  To: intel-gfx

If the display is inactive, we need not worry about the gpu reset
clobbering the display! To prevent the display changing state between us
checking the active state and doing the hard reset, we introduce the
lightweight reset lock to the atomic commit for the affected (legacy)
platforms.

Testcase: igt/gem_eio/kms
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 11 +++++++++++
 drivers/gpu/drm/i915/gt/intel_reset.c        | 18 +++++++++++++++++-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 4ee750fa3ef0..a92487d8f4cf 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13986,6 +13986,7 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
 	struct intel_crtc *crtc;
 	u64 put_domains[I915_MAX_PIPES] = {};
 	intel_wakeref_t wakeref = 0;
+	int srcu;
 	int i;
 
 	intel_atomic_commit_fence_wait(state);
@@ -14005,6 +14006,12 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
 		}
 	}
 
+	/* Prevent starting a GPU reset while we change global display state */
+	srcu = -ENODEV;
+	if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
+		/* only fails if interrupted */
+		srcu = intel_gt_reset_trylock(&dev_priv->gt);
+
 	intel_commit_modeset_disables(state);
 
 	/* FIXME: Eventually get rid of our crtc->config pointer */
@@ -14049,6 +14056,10 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
 	/* Now enable the clocks, plane, pipe, and connectors that we set up. */
 	dev_priv->display.commit_modeset_enables(state);
 
+
+	if (srcu > 0)
+		intel_gt_reset_unlock(&dev_priv->gt, srcu);
+
 	if (state->modeset) {
 		intel_encoders_update_complete(state);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index d8b1498d7ab7..df4a86bdb6f6 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -7,6 +7,7 @@
 #include <linux/sched/mm.h>
 #include <linux/stop_machine.h>
 
+#include "display/intel_display.h"
 #include "display/intel_display_types.h"
 #include "display/intel_overlay.h"
 
@@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
 	intel_engine_queue_breadcrumbs(engine);
 }
 
+static bool reset_clobbers_display(struct drm_i915_private *i915)
+{
+	struct intel_crtc *crtc;
+
+	if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
+		return false;
+
+	for_each_intel_crtc(&i915->drm, crtc) {
+		if (crtc->active)
+			return true;
+	}
+
+	return false;
+}
+
 static void __intel_gt_set_wedged(struct intel_gt *gt)
 {
 	struct intel_engine_cs *engine;
@@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
 	awake = reset_prepare(gt);
 
 	/* Even if the GPU reset fails, it should still stop the engines */
-	if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
+	if (!reset_clobbers_display(gt->i915))
 		__intel_gt_reset(gt, ALL_ENGINES);
 
 	for_each_engine(engine, gt->i915, id)
-- 
2.23.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-11 10:19   ` [PATCH v2] " Chris Wilson
@ 2019-09-11 10:48     ` Chris Wilson
  2019-09-11 10:52     ` Ville Syrjälä
  1 sibling, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 10:48 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2019-09-11 11:19:59)
> If the display is inactive, we need not worry about the gpu reset
> clobbering the display! To prevent the display changing state between us
> checking the active state and doing the hard reset, we introduce the
> lightweight reset lock to the atomic commit for the affected (legacy)
> platforms.

> +       /* Prevent starting a GPU reset while we change global display state */
> +       srcu = -ENODEV;
> +       if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
> +               /* only fails if interrupted */
> +               srcu = intel_gt_reset_trylock(&dev_priv->gt);

This is overkill? Before starting the GPU reset, we serialise with the
modeset, do we not?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-11 10:19   ` [PATCH v2] " Chris Wilson
  2019-09-11 10:48     ` Chris Wilson
@ 2019-09-11 10:52     ` Ville Syrjälä
  2019-09-11 11:02       ` Chris Wilson
  1 sibling, 1 reply; 18+ messages in thread
From: Ville Syrjälä @ 2019-09-11 10:52 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Wed, Sep 11, 2019 at 11:19:59AM +0100, Chris Wilson wrote:
> If the display is inactive, we need not worry about the gpu reset
> clobbering the display! To prevent the display changing state between us
> checking the active state and doing the hard reset, we introduce the
> lightweight reset lock to the atomic commit for the affected (legacy)
> platforms.
> 
> Testcase: igt/gem_eio/kms
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 11 +++++++++++
>  drivers/gpu/drm/i915/gt/intel_reset.c        | 18 +++++++++++++++++-
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 4ee750fa3ef0..a92487d8f4cf 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13986,6 +13986,7 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
>  	struct intel_crtc *crtc;
>  	u64 put_domains[I915_MAX_PIPES] = {};
>  	intel_wakeref_t wakeref = 0;
> +	int srcu;
>  	int i;
>  
>  	intel_atomic_commit_fence_wait(state);
> @@ -14005,6 +14006,12 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
>  		}
>  	}
>  
> +	/* Prevent starting a GPU reset while we change global display state */
> +	srcu = -ENODEV;
> +	if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
> +		/* only fails if interrupted */
> +		srcu = intel_gt_reset_trylock(&dev_priv->gt);

Hmm. What happens if we're holding the modeset locks and there's an
ongoing reset blocked on said locks?

> +
>  	intel_commit_modeset_disables(state);
>  
>  	/* FIXME: Eventually get rid of our crtc->config pointer */
> @@ -14049,6 +14056,10 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
>  	/* Now enable the clocks, plane, pipe, and connectors that we set up. */
>  	dev_priv->display.commit_modeset_enables(state);
>  
> +
> +	if (srcu > 0)
> +		intel_gt_reset_unlock(&dev_priv->gt, srcu);
> +
>  	if (state->modeset) {
>  		intel_encoders_update_complete(state);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index d8b1498d7ab7..df4a86bdb6f6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -7,6 +7,7 @@
>  #include <linux/sched/mm.h>
>  #include <linux/stop_machine.h>
>  
> +#include "display/intel_display.h"
>  #include "display/intel_display_types.h"
>  #include "display/intel_overlay.h"
>  
> @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
>  	intel_engine_queue_breadcrumbs(engine);
>  }
>  
> +static bool reset_clobbers_display(struct drm_i915_private *i915)
> +{
> +	struct intel_crtc *crtc;
> +
> +	if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> +		return false;
> +
> +	for_each_intel_crtc(&i915->drm, crtc) {
> +		if (crtc->active)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
>  static void __intel_gt_set_wedged(struct intel_gt *gt)
>  {
>  	struct intel_engine_cs *engine;
> @@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
>  	awake = reset_prepare(gt);
>  
>  	/* Even if the GPU reset fails, it should still stop the engines */
> -	if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
> +	if (!reset_clobbers_display(gt->i915))
>  		__intel_gt_reset(gt, ALL_ENGINES);
>  
>  	for_each_engine(engine, gt->i915, id)
> -- 
> 2.23.0

-- 
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
  2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
                   ` (2 preceding siblings ...)
  2019-09-10  7:39 ` [PATCH 1/2] " Ville Syrjälä
@ 2019-09-11 10:58 ` Patchwork
  2019-09-11 11:35 ` ✗ Fi.CI.BAT: failure " Patchwork
  4 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2019-09-11 10:58 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
URL   : https://patchwork.freedesktop.org/series/66459/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
1b207a558e6e drm/i915/gt: Allow clobbering gpu resets if the display is off
-:49: CHECK:LINE_SPACING: Please don't use multiple blank lines
#49: FILE: drivers/gpu/drm/i915/display/intel_display.c:14071:
 
+

total: 0 errors, 0 warnings, 1 checks, 65 lines checked
ff9559a3849b drm/i915/gt: Only unwedge if we can reset first

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
  2019-09-11 10:52     ` Ville Syrjälä
@ 2019-09-11 11:02       ` Chris Wilson
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 11:02 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: intel-gfx

Quoting Ville Syrjälä (2019-09-11 11:52:29)
> On Wed, Sep 11, 2019 at 11:19:59AM +0100, Chris Wilson wrote:
> > If the display is inactive, we need not worry about the gpu reset
> > clobbering the display! To prevent the display changing state between us
> > checking the active state and doing the hard reset, we introduce the
> > lightweight reset lock to the atomic commit for the affected (legacy)
> > platforms.
> > 
> > Testcase: igt/gem_eio/kms
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c | 11 +++++++++++
> >  drivers/gpu/drm/i915/gt/intel_reset.c        | 18 +++++++++++++++++-
> >  2 files changed, 28 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index 4ee750fa3ef0..a92487d8f4cf 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -13986,6 +13986,7 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
> >       struct intel_crtc *crtc;
> >       u64 put_domains[I915_MAX_PIPES] = {};
> >       intel_wakeref_t wakeref = 0;
> > +     int srcu;
> >       int i;
> >  
> >       intel_atomic_commit_fence_wait(state);
> > @@ -14005,6 +14006,12 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
> >               }
> >       }
> >  
> > +     /* Prevent starting a GPU reset while we change global display state */
> > +     srcu = -ENODEV;
> > +     if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
> > +             /* only fails if interrupted */
> > +             srcu = intel_gt_reset_trylock(&dev_priv->gt);
> 
> Hmm. What happens if we're holding the modeset locks and there's an
> ongoing reset blocked on said locks?

I was about to say it's the inner lock in both cases, but it's not.
Anyway doesn't that interaction (intel_prepare_reset) prevent
crtc->active changing as we use it from inside intel_gt_reset (which
calls intel_gt_unset_wedged)?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
  2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
                   ` (3 preceding siblings ...)
  2019-09-11 10:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2) Patchwork
@ 2019-09-11 11:35 ` Patchwork
  4 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2019-09-11 11:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
URL   : https://patchwork.freedesktop.org/series/66459/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6868 -> Patchwork_14356
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_14356 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_14356, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_14356:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_hangman@error-state-basic:
    - fi-bwr-2160:        [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-bwr-2160/igt@i915_hangman@error-state-basic.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-bwr-2160/igt@i915_hangman@error-state-basic.html
    - fi-blb-e6850:       [PASS][3] -> [INCOMPLETE][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-blb-e6850/igt@i915_hangman@error-state-basic.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-blb-e6850/igt@i915_hangman@error-state-basic.html

  
Known issues
------------

  Here are the changes found in Patchwork_14356 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_mmap_gtt@basic-write-no-prefault:
    - fi-icl-u3:          [PASS][5] -> [DMESG-WARN][6] ([fdo#107724]) +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u3/igt@gem_mmap_gtt@basic-write-no-prefault.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u3/igt@gem_mmap_gtt@basic-write-no-prefault.html

  * igt@i915_hangman@error-state-basic:
    - fi-gdg-551:         [PASS][7] -> [INCOMPLETE][8] ([fdo#108316])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-gdg-551/igt@i915_hangman@error-state-basic.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-gdg-551/igt@i915_hangman@error-state-basic.html
    - fi-pnv-d510:        [PASS][9] -> [INCOMPLETE][10] ([fdo#110740])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-pnv-d510/igt@i915_hangman@error-state-basic.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-pnv-d510/igt@i915_hangman@error-state-basic.html

  * igt@kms_chamelium@dp-edid-read:
    - fi-kbl-7500u:       [PASS][11] -> [WARN][12] ([fdo#109483])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-kbl-7500u/igt@kms_chamelium@dp-edid-read.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-kbl-7500u/igt@kms_chamelium@dp-edid-read.html

  * igt@kms_frontbuffer_tracking@basic:
    - fi-icl-u3:          [PASS][13] -> [FAIL][14] ([fdo#103167])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u3/igt@kms_frontbuffer_tracking@basic.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u3/igt@kms_frontbuffer_tracking@basic.html

  
#### Possible fixes ####

  * igt@prime_vgem@basic-gtt:
    - fi-icl-u3:          [DMESG-WARN][15] ([fdo#107724]) -> [PASS][16] +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u3/igt@prime_vgem@basic-gtt.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u3/igt@prime_vgem@basic-gtt.html

  
#### Warnings ####

  * igt@i915_pm_rpm@module-reload:
    - fi-icl-u2:          [INCOMPLETE][17] ([fdo#107713] / [fdo#108840]) -> [DMESG-WARN][18] ([fdo#110595])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u2/igt@i915_pm_rpm@module-reload.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u2/igt@i915_pm_rpm@module-reload.html

  
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#108316]: https://bugs.freedesktop.org/show_bug.cgi?id=108316
  [fdo#108840]: https://bugs.freedesktop.org/show_bug.cgi?id=108840
  [fdo#109483]: https://bugs.freedesktop.org/show_bug.cgi?id=109483
  [fdo#110595]: https://bugs.freedesktop.org/show_bug.cgi?id=110595
  [fdo#110740]: https://bugs.freedesktop.org/show_bug.cgi?id=110740


Participating hosts (53 -> 46)
------------------------------

  Missing    (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_6868 -> Patchwork_14356

  CI-20190529: 20190529
  CI_DRM_6868: fb9bbe42526c1a8467c99f8137ed6b94c749681f @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5178: efb4539494d94f03374874d3b61bd04ef3802aaa @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_14356: ff9559a3849b3aab2d39d531140cdd329ae818bd @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ff9559a3849b drm/i915/gt: Only unwedge if we can reset first
1b207a558e6e drm/i915/gt: Allow clobbering gpu resets if the display is off

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
  2019-09-10  7:39   ` Janusz Krzysztofik
@ 2019-09-13  7:41     ` Chris Wilson
  0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-13  7:41 UTC (permalink / raw)
  To: Janusz Krzysztofik; +Cc: intel-gfx

Quoting Janusz Krzysztofik (2019-09-10 08:39:51)
> Hi Chris,
> 
> On Tuesday, September 10, 2019 12:55:36 AM CEST Chris Wilson wrote:
> > @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt 
> *gt)
> >       }
> >       spin_unlock_irqrestore(&timelines->lock, flags);
> >  
> > -     intel_gt_sanitize(gt, false);
> > +     ok = false;
> > +     if (!reset_clobbers_display(gt->i915))
> > +             ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
> > +     if (!ok)
> > +             return false;
> 
> Before your change, that code was executed inside intel_gt_sanitize(gt, false) 
> which unfortunately didn't return any result.  The same outcome could be 
> achieved by redefining intel_gt_sanitize() to return that result and saying:
> 
>         if (!intel_gt_sanitize(gt, false)
>                 return false;
> 
> Is there any specific reason for intel_gt_sanitize() returning void?

The intent is that sanitize scrubs the leftover BIOS state, failure is
not an option.  The biggest change with respect to intel_gt_sanitize() is
the game we play with reset_clobbers_display -- we need the reset,
whereas in sanitize, the reset is good to have (but realistically we do
not expect there to be any contexts to scrub and so can take the risk).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-09-13  7:43 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
2019-09-10  0:59   ` Daniele Ceraolo Spurio
2019-09-10  6:06     ` Chris Wilson
2019-09-10 21:20       ` Daniele Ceraolo Spurio
2019-09-10  7:39   ` Janusz Krzysztofik
2019-09-13  7:41     ` Chris Wilson
2019-09-09 23:33 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Patchwork
2019-09-10  7:39 ` [PATCH 1/2] " Ville Syrjälä
2019-09-10  8:32   ` Chris Wilson
2019-09-10  8:58     ` Ville Syrjälä
2019-09-11 10:01       ` Chris Wilson
2019-09-11 10:19   ` [PATCH v2] " Chris Wilson
2019-09-11 10:48     ` Chris Wilson
2019-09-11 10:52     ` Ville Syrjälä
2019-09-11 11:02       ` Chris Wilson
2019-09-11 10:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2) Patchwork
2019-09-11 11:35 ` ✗ Fi.CI.BAT: failure " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.