* [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
@ 2019-09-09 22:55 Chris Wilson
2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-09 22:55 UTC (permalink / raw)
To: intel-gfx
If the display is inactive, we need not worry about the gpu reset
clobbering the display!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index b9d84d52e986..fe57296b790c 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -7,6 +7,7 @@
#include <linux/sched/mm.h>
#include <linux/stop_machine.h>
+#include "display/intel_display.h"
#include "display/intel_display_types.h"
#include "display/intel_overlay.h"
@@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
intel_engine_queue_breadcrumbs(engine);
}
+static bool reset_clobbers_display(struct drm_i915_private *i915)
+{
+ struct intel_crtc *crtc;
+
+ if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
+ return false;
+
+ for_each_intel_crtc(&i915->drm, crtc) {
+ if (crtc->active)
+ return true;
+ }
+
+ return false;
+}
+
static void __intel_gt_set_wedged(struct intel_gt *gt)
{
struct intel_engine_cs *engine;
@@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
awake = reset_prepare(gt);
/* Even if the GPU reset fails, it should still stop the engines */
- if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
+ if (!reset_clobbers_display(gt->i915))
__intel_gt_reset(gt, ALL_ENGINES);
for_each_engine(engine, gt->i915, id)
--
2.23.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
@ 2019-09-09 22:55 ` Chris Wilson
2019-09-10 0:59 ` Daniele Ceraolo Spurio
2019-09-10 7:39 ` Janusz Krzysztofik
2019-09-09 23:33 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Patchwork
` (3 subsequent siblings)
4 siblings, 2 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-09 22:55 UTC (permalink / raw)
To: intel-gfx
Unwedging the GPU requires a successful GPU reset before we restore the
default submission, or else we may see residual context switch events
that we were not expecting.
Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
---
drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index fe57296b790c..5242496a893a 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
struct intel_gt_timelines *timelines = >->timelines;
struct intel_timeline *tl;
unsigned long flags;
+ bool ok;
if (!test_bit(I915_WEDGED, >->reset.flags))
return true;
@@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
}
spin_unlock_irqrestore(&timelines->lock, flags);
- intel_gt_sanitize(gt, false);
+ ok = false;
+ if (!reset_clobbers_display(gt->i915))
+ ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
+ if (!ok)
+ return false;
/*
* Undo nop_submit_request. We prevent all new i915 requests from
--
2.23.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 18+ messages in thread
* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
@ 2019-09-09 23:33 ` Patchwork
2019-09-10 7:39 ` [PATCH 1/2] " Ville Syrjälä
` (2 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2019-09-09 23:33 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
URL : https://patchwork.freedesktop.org/series/66459/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_6855 -> Patchwork_14334
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_14334 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_14334, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_14334:
### IGT changes ###
#### Possible regressions ####
* igt@i915_module_load@reload-with-fault-injection:
- fi-blb-e6850: NOTRUN -> [DMESG-WARN][1]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-blb-e6850/igt@i915_module_load@reload-with-fault-injection.html
- fi-bwr-2160: [PASS][2] -> [DMESG-WARN][3]
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-bwr-2160/igt@i915_module_load@reload-with-fault-injection.html
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-bwr-2160/igt@i915_module_load@reload-with-fault-injection.html
- fi-pnv-d510: NOTRUN -> [DMESG-WARN][4]
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-pnv-d510/igt@i915_module_load@reload-with-fault-injection.html
- fi-gdg-551: [PASS][5] -> [DMESG-WARN][6]
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-gdg-551/igt@i915_module_load@reload-with-fault-injection.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-gdg-551/igt@i915_module_load@reload-with-fault-injection.html
* igt@runner@aborted:
- fi-pnv-d510: NOTRUN -> [FAIL][7]
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-pnv-d510/igt@runner@aborted.html
- fi-gdg-551: NOTRUN -> [FAIL][8]
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-gdg-551/igt@runner@aborted.html
#### Suppressed ####
The following results come from untrusted machines, tests, or statuses.
They do not affect the overall result.
* igt@gem_sync@basic-store-all:
- {fi-tgl-u}: NOTRUN -> [INCOMPLETE][9]
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-tgl-u/igt@gem_sync@basic-store-all.html
Known issues
------------
Here are the changes found in Patchwork_14334 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_flink_basic@basic:
- fi-icl-u3: [PASS][10] -> [DMESG-WARN][11] ([fdo#107724])
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-icl-u3/igt@gem_flink_basic@basic.html
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-icl-u3/igt@gem_flink_basic@basic.html
* igt@i915_module_load@reload:
- fi-icl-u3: [PASS][12] -> [DMESG-WARN][13] ([fdo#107724] / [fdo#111214])
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-icl-u3/igt@i915_module_load@reload.html
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-icl-u3/igt@i915_module_load@reload.html
* igt@i915_selftest@live_gem_contexts:
- fi-cfl-8700k: [PASS][14] -> [INCOMPLETE][15] ([fdo#111514])
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-cfl-8700k/igt@i915_selftest@live_gem_contexts.html
#### Possible fixes ####
* igt@core_auth@basic-auth:
- fi-icl-u3: [DMESG-WARN][16] ([fdo#107724]) -> [PASS][17]
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-icl-u3/igt@core_auth@basic-auth.html
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-icl-u3/igt@core_auth@basic-auth.html
* igt@gem_exec_gttfill@basic:
- {fi-tgl-u}: [INCOMPLETE][18] ([fdo#111593]) -> [PASS][19]
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-tgl-u/igt@gem_exec_gttfill@basic.html
[19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-tgl-u/igt@gem_exec_gttfill@basic.html
* igt@i915_module_load@reload:
- fi-blb-e6850: [INCOMPLETE][20] ([fdo#107718]) -> [PASS][21]
[20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6855/fi-blb-e6850/igt@i915_module_load@reload.html
[21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/fi-blb-e6850/igt@i915_module_load@reload.html
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
[fdo#106107]: https://bugs.freedesktop.org/show_bug.cgi?id=106107
[fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
[fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
[fdo#111214]: https://bugs.freedesktop.org/show_bug.cgi?id=111214
[fdo#111514]: https://bugs.freedesktop.org/show_bug.cgi?id=111514
[fdo#111593]: https://bugs.freedesktop.org/show_bug.cgi?id=111593
Participating hosts (53 -> 47)
------------------------------
Additional (1): fi-pnv-d510
Missing (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus
Build changes
-------------
* CI: CI-20190529 -> None
* Linux: CI_DRM_6855 -> Patchwork_14334
CI-20190529: 20190529
CI_DRM_6855: 4cef087ee0994669a097115945717fedb413c368 @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5177: 8f351d693352d21c96cef38c3fd77f778c6d7c33 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_14334: bf285b1528cb5a10ba2ec677b9da45507c996e9f @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
bf285b1528cb drm/i915/gt: Only unwedge if we can reset first
d5e889771f59 drm/i915/gt: Allow clobbering gpu resets if the display is off
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14334/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
@ 2019-09-10 0:59 ` Daniele Ceraolo Spurio
2019-09-10 6:06 ` Chris Wilson
2019-09-10 7:39 ` Janusz Krzysztofik
1 sibling, 1 reply; 18+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-10 0:59 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 9/9/19 3:55 PM, Chris Wilson wrote:
> Unwedging the GPU requires a successful GPU reset before we restore the
> default submission, or else we may see residual context switch events
> that we were not expecting.
>
> Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index fe57296b790c..5242496a893a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> struct intel_gt_timelines *timelines = >->timelines;
> struct intel_timeline *tl;
> unsigned long flags;
> + bool ok;
>
> if (!test_bit(I915_WEDGED, >->reset.flags))
> return true;
> @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> }
> spin_unlock_irqrestore(&timelines->lock, flags);
>
> - intel_gt_sanitize(gt, false);
> + ok = false;
> + if (!reset_clobbers_display(gt->i915))
> + ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
Of the thing we had in the gt_sanitize, we're ok skipping the
uc_sanitize() because we take care of that during wedge (from
intel_uc_reset_prepare), but what about the loop of
__intel_engine_reset()? Is that safe to skip here?
Apart from that, the patch LGTM. Worth noting that with this change a
successful reset is required to unwedge even after a suspend/resume
cycle (in gem_sanitize), which is a good thing IMO.
Daniele
> + if (!ok)
> + return false;
>
> /*
> * Undo nop_submit_request. We prevent all new i915 requests from
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
2019-09-10 0:59 ` Daniele Ceraolo Spurio
@ 2019-09-10 6:06 ` Chris Wilson
2019-09-10 21:20 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2019-09-10 6:06 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-gfx
Quoting Daniele Ceraolo Spurio (2019-09-10 01:59:38)
>
>
> On 9/9/19 3:55 PM, Chris Wilson wrote:
> > Unwedging the GPU requires a successful GPU reset before we restore the
> > default submission, or else we may see residual context switch events
> > that we were not expecting.
> >
> > Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> > ---
> > drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index fe57296b790c..5242496a893a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> > struct intel_gt_timelines *timelines = >->timelines;
> > struct intel_timeline *tl;
> > unsigned long flags;
> > + bool ok;
> >
> > if (!test_bit(I915_WEDGED, >->reset.flags))
> > return true;
> > @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> > }
> > spin_unlock_irqrestore(&timelines->lock, flags);
> >
> > - intel_gt_sanitize(gt, false);
> > + ok = false;
> > + if (!reset_clobbers_display(gt->i915))
> > + ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
>
> Of the thing we had in the gt_sanitize, we're ok skipping the
> uc_sanitize() because we take care of that during wedge (from
> intel_uc_reset_prepare), but what about the loop of
> __intel_engine_reset()? Is that safe to skip here?
I think yes, because we always follow the unwedge with a GT restart. That
is either via the full reset or the sanitize+restart on resume. Both call
paths will also set the wedged bit if they fail. gem_eio/suspend should
be testing the recovery upon resume path, and even gem_eio/*-stress
should give responsible coverage of the normal recovery via full reset.
> Apart from that, the patch LGTM. Worth noting that with this change a
> successful reset is required to unwedge even after a suspend/resume
> cycle (in gem_sanitize), which is a good thing IMO.
Hence why relaxing the gpu_clobbers_display is important to retain the
ability to clear wedged across suspend on older devices.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
2019-09-09 23:33 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Patchwork
@ 2019-09-10 7:39 ` Ville Syrjälä
2019-09-10 8:32 ` Chris Wilson
2019-09-11 10:19 ` [PATCH v2] " Chris Wilson
2019-09-11 10:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2) Patchwork
2019-09-11 11:35 ` ✗ Fi.CI.BAT: failure " Patchwork
4 siblings, 2 replies; 18+ messages in thread
From: Ville Syrjälä @ 2019-09-10 7:39 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> If the display is inactive, we need not worry about the gpu reset
> clobbering the display!
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
> drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index b9d84d52e986..fe57296b790c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -7,6 +7,7 @@
> #include <linux/sched/mm.h>
> #include <linux/stop_machine.h>
>
> +#include "display/intel_display.h"
> #include "display/intel_display_types.h"
> #include "display/intel_overlay.h"
>
> @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> intel_engine_queue_breadcrumbs(engine);
> }
>
> +static bool reset_clobbers_display(struct drm_i915_private *i915)
> +{
> + struct intel_crtc *crtc;
> +
> + if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> + return false;
> +
> + for_each_intel_crtc(&i915->drm, crtc) {
> + if (crtc->active)
> + return true;
> + }
This feels racy. crtc->active gets set somewhere in the middle of the
modeset, so looks like now we could clobber all the stuff we've already
set up before crtc->active got set.
> +
> + return false;
> +}
> +
> static void __intel_gt_set_wedged(struct intel_gt *gt)
> {
> struct intel_engine_cs *engine;
> @@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
> awake = reset_prepare(gt);
>
> /* Even if the GPU reset fails, it should still stop the engines */
> - if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
> + if (!reset_clobbers_display(gt->i915))
> __intel_gt_reset(gt, ALL_ENGINES);
>
> for_each_engine(engine, gt->i915, id)
> --
> 2.23.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
2019-09-10 0:59 ` Daniele Ceraolo Spurio
@ 2019-09-10 7:39 ` Janusz Krzysztofik
2019-09-13 7:41 ` Chris Wilson
1 sibling, 1 reply; 18+ messages in thread
From: Janusz Krzysztofik @ 2019-09-10 7:39 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
Hi Chris,
On Tuesday, September 10, 2019 12:55:36 AM CEST Chris Wilson wrote:
> Unwedging the GPU requires a successful GPU reset before we restore the
> default submission, or else we may see residual context switch events
> that we were not expecting.
>
> Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> ---
> drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/
gt/intel_reset.c
> index fe57296b790c..5242496a893a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> struct intel_gt_timelines *timelines = >->timelines;
> struct intel_timeline *tl;
> unsigned long flags;
> + bool ok;
>
> if (!test_bit(I915_WEDGED, >->reset.flags))
> return true;
> @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt
*gt)
> }
> spin_unlock_irqrestore(&timelines->lock, flags);
>
> - intel_gt_sanitize(gt, false);
> + ok = false;
> + if (!reset_clobbers_display(gt->i915))
> + ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
> + if (!ok)
> + return false;
Before your change, that code was executed inside intel_gt_sanitize(gt, false)
which unfortunately didn't return any result. The same outcome could be
achieved by redefining intel_gt_sanitize() to return that result and saying:
if (!intel_gt_sanitize(gt, false)
return false;
Is there any specific reason for intel_gt_sanitize() returning void?
Thanks,
Janusz
>
> /*
> * Undo nop_submit_request. We prevent all new i915 requests from
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-10 7:39 ` [PATCH 1/2] " Ville Syrjälä
@ 2019-09-10 8:32 ` Chris Wilson
2019-09-10 8:58 ` Ville Syrjälä
2019-09-11 10:19 ` [PATCH v2] " Chris Wilson
1 sibling, 1 reply; 18+ messages in thread
From: Chris Wilson @ 2019-09-10 8:32 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx
Quoting Ville Syrjälä (2019-09-10 08:39:31)
> On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> > If the display is inactive, we need not worry about the gpu reset
> > clobbering the display!
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> > 1 file changed, 17 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index b9d84d52e986..fe57296b790c 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -7,6 +7,7 @@
> > #include <linux/sched/mm.h>
> > #include <linux/stop_machine.h>
> >
> > +#include "display/intel_display.h"
> > #include "display/intel_display_types.h"
> > #include "display/intel_overlay.h"
> >
> > @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> > intel_engine_queue_breadcrumbs(engine);
> > }
> >
> > +static bool reset_clobbers_display(struct drm_i915_private *i915)
> > +{
> > + struct intel_crtc *crtc;
> > +
> > + if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> > + return false;
> > +
> > + for_each_intel_crtc(&i915->drm, crtc) {
> > + if (crtc->active)
> > + return true;
> > + }
>
> This feels racy. crtc->active gets set somewhere in the middle of the
> modeset, so looks like now we could clobber all the stuff we've already
> set up before crtc->active got set.
The question is, does it matter? After we unwedge, we clobber again for
realz. Not that we have anything deliberately trying to race
wedge-vs-modeset(on/off).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-10 8:32 ` Chris Wilson
@ 2019-09-10 8:58 ` Ville Syrjälä
2019-09-11 10:01 ` Chris Wilson
0 siblings, 1 reply; 18+ messages in thread
From: Ville Syrjälä @ 2019-09-10 8:58 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Tue, Sep 10, 2019 at 09:32:45AM +0100, Chris Wilson wrote:
> Quoting Ville Syrjälä (2019-09-10 08:39:31)
> > On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> > > If the display is inactive, we need not worry about the gpu reset
> > > clobbering the display!
> > >
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > > drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> > > 1 file changed, 17 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > index b9d84d52e986..fe57296b790c 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > @@ -7,6 +7,7 @@
> > > #include <linux/sched/mm.h>
> > > #include <linux/stop_machine.h>
> > >
> > > +#include "display/intel_display.h"
> > > #include "display/intel_display_types.h"
> > > #include "display/intel_overlay.h"
> > >
> > > @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> > > intel_engine_queue_breadcrumbs(engine);
> > > }
> > >
> > > +static bool reset_clobbers_display(struct drm_i915_private *i915)
> > > +{
> > > + struct intel_crtc *crtc;
> > > +
> > > + if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> > > + return false;
> > > +
> > > + for_each_intel_crtc(&i915->drm, crtc) {
> > > + if (crtc->active)
> > > + return true;
> > > + }
> >
> > This feels racy. crtc->active gets set somewhere in the middle of the
> > modeset, so looks like now we could clobber all the stuff we've already
> > set up before crtc->active got set.
>
> The question is, does it matter? After we unwedge, we clobber again for
> realz. Not that we have anything deliberately trying to race
> wedge-vs-modeset(on/off).
Not sure. But I suspect the hw might decide to hang the box if eg.
the PLL is borked when we touch something that really needs the clock.
--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
2019-09-10 6:06 ` Chris Wilson
@ 2019-09-10 21:20 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 18+ messages in thread
From: Daniele Ceraolo Spurio @ 2019-09-10 21:20 UTC (permalink / raw)
To: Chris Wilson, intel-gfx
On 9/9/19 11:06 PM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-09-10 01:59:38)
>>
>>
>> On 9/9/19 3:55 PM, Chris Wilson wrote:
>>> Unwedging the GPU requires a successful GPU reset before we restore the
>>> default submission, or else we may see residual context switch events
>>> that we were not expecting.
>>>
>>> Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
>>> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> ---
>>> drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> index fe57296b790c..5242496a893a 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>>> @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>>> struct intel_gt_timelines *timelines = >->timelines;
>>> struct intel_timeline *tl;
>>> unsigned long flags;
>>> + bool ok;
>>>
>>> if (!test_bit(I915_WEDGED, >->reset.flags))
>>> return true;
>>> @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
>>> }
>>> spin_unlock_irqrestore(&timelines->lock, flags);
>>>
>>> - intel_gt_sanitize(gt, false);
>>> + ok = false;
>>> + if (!reset_clobbers_display(gt->i915))
>>> + ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
>>
>> Of the thing we had in the gt_sanitize, we're ok skipping the
>> uc_sanitize() because we take care of that during wedge (from
>> intel_uc_reset_prepare), but what about the loop of
>> __intel_engine_reset()? Is that safe to skip here?
>
> I think yes, because we always follow the unwedge with a GT restart. That
> is either via the full reset or the sanitize+restart on resume. Both call
> paths will also set the wedged bit if they fail. gem_eio/suspend should
> be testing the recovery upon resume path, and even gem_eio/*-stress
> should give responsible coverage of the normal recovery via full reset.
>
>> Apart from that, the patch LGTM. Worth noting that with this change a
>> successful reset is required to unwedge even after a suspend/resume
>> cycle (in gem_sanitize), which is a good thing IMO.
>
> Hence why relaxing the gpu_clobbers_display is important to retain the
> ability to clear wedged across suspend on older devices.
> -Chris
>
Sold!
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Daniele
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-10 8:58 ` Ville Syrjälä
@ 2019-09-11 10:01 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 10:01 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx
Quoting Ville Syrjälä (2019-09-10 09:58:38)
> On Tue, Sep 10, 2019 at 09:32:45AM +0100, Chris Wilson wrote:
> > Quoting Ville Syrjälä (2019-09-10 08:39:31)
> > > On Mon, Sep 09, 2019 at 11:55:35PM +0100, Chris Wilson wrote:
> > > > If the display is inactive, we need not worry about the gpu reset
> > > > clobbering the display!
> > > >
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > ---
> > > > drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> > > > 1 file changed, 17 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > index b9d84d52e986..fe57296b790c 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > @@ -7,6 +7,7 @@
> > > > #include <linux/sched/mm.h>
> > > > #include <linux/stop_machine.h>
> > > >
> > > > +#include "display/intel_display.h"
> > > > #include "display/intel_display_types.h"
> > > > #include "display/intel_overlay.h"
> > > >
> > > > @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> > > > intel_engine_queue_breadcrumbs(engine);
> > > > }
> > > >
> > > > +static bool reset_clobbers_display(struct drm_i915_private *i915)
> > > > +{
> > > > + struct intel_crtc *crtc;
> > > > +
> > > > + if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> > > > + return false;
> > > > +
> > > > + for_each_intel_crtc(&i915->drm, crtc) {
> > > > + if (crtc->active)
> > > > + return true;
> > > > + }
> > >
> > > This feels racy. crtc->active gets set somewhere in the middle of the
> > > modeset, so looks like now we could clobber all the stuff we've already
> > > set up before crtc->active got set.
> >
> > The question is, does it matter? After we unwedge, we clobber again for
> > realz. Not that we have anything deliberately trying to race
> > wedge-vs-modeset(on/off).
>
> Not sure. But I suspect the hw might decide to hang the box if eg.
> the PLL is borked when we touch something that really needs the clock.
crtc->active only changes within the atomic_tail? Worst case I guess is
we add
srcu = -ENODEV;
if (i915->gpu_reset_clobbers_display)
/* only fails if interrupted */
srcu = intel_gt_reset_trylock(&i915->gt);
...
if (srcu > 0)
intel_gt_reset_unlock(&i915->gt, srcu);
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-10 7:39 ` [PATCH 1/2] " Ville Syrjälä
2019-09-10 8:32 ` Chris Wilson
@ 2019-09-11 10:19 ` Chris Wilson
2019-09-11 10:48 ` Chris Wilson
2019-09-11 10:52 ` Ville Syrjälä
1 sibling, 2 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 10:19 UTC (permalink / raw)
To: intel-gfx
If the display is inactive, we need not worry about the gpu reset
clobbering the display! To prevent the display changing state between us
checking the active state and doing the hard reset, we introduce the
lightweight reset lock to the atomic commit for the affected (legacy)
platforms.
Testcase: igt/gem_eio/kms
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_display.c | 11 +++++++++++
drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 4ee750fa3ef0..a92487d8f4cf 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -13986,6 +13986,7 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
struct intel_crtc *crtc;
u64 put_domains[I915_MAX_PIPES] = {};
intel_wakeref_t wakeref = 0;
+ int srcu;
int i;
intel_atomic_commit_fence_wait(state);
@@ -14005,6 +14006,12 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
}
}
+ /* Prevent starting a GPU reset while we change global display state */
+ srcu = -ENODEV;
+ if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
+ /* only fails if interrupted */
+ srcu = intel_gt_reset_trylock(&dev_priv->gt);
+
intel_commit_modeset_disables(state);
/* FIXME: Eventually get rid of our crtc->config pointer */
@@ -14049,6 +14056,10 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
/* Now enable the clocks, plane, pipe, and connectors that we set up. */
dev_priv->display.commit_modeset_enables(state);
+
+ if (srcu > 0)
+ intel_gt_reset_unlock(&dev_priv->gt, srcu);
+
if (state->modeset) {
intel_encoders_update_complete(state);
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index d8b1498d7ab7..df4a86bdb6f6 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -7,6 +7,7 @@
#include <linux/sched/mm.h>
#include <linux/stop_machine.h>
+#include "display/intel_display.h"
#include "display/intel_display_types.h"
#include "display/intel_overlay.h"
@@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
intel_engine_queue_breadcrumbs(engine);
}
+static bool reset_clobbers_display(struct drm_i915_private *i915)
+{
+ struct intel_crtc *crtc;
+
+ if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
+ return false;
+
+ for_each_intel_crtc(&i915->drm, crtc) {
+ if (crtc->active)
+ return true;
+ }
+
+ return false;
+}
+
static void __intel_gt_set_wedged(struct intel_gt *gt)
{
struct intel_engine_cs *engine;
@@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
awake = reset_prepare(gt);
/* Even if the GPU reset fails, it should still stop the engines */
- if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
+ if (!reset_clobbers_display(gt->i915))
__intel_gt_reset(gt, ALL_ENGINES);
for_each_engine(engine, gt->i915, id)
--
2.23.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-11 10:19 ` [PATCH v2] " Chris Wilson
@ 2019-09-11 10:48 ` Chris Wilson
2019-09-11 10:52 ` Ville Syrjälä
1 sibling, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 10:48 UTC (permalink / raw)
To: intel-gfx
Quoting Chris Wilson (2019-09-11 11:19:59)
> If the display is inactive, we need not worry about the gpu reset
> clobbering the display! To prevent the display changing state between us
> checking the active state and doing the hard reset, we introduce the
> lightweight reset lock to the atomic commit for the affected (legacy)
> platforms.
> + /* Prevent starting a GPU reset while we change global display state */
> + srcu = -ENODEV;
> + if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
> + /* only fails if interrupted */
> + srcu = intel_gt_reset_trylock(&dev_priv->gt);
This is overkill? Before starting the GPU reset, we serialise with the
modeset, do we not?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-11 10:19 ` [PATCH v2] " Chris Wilson
2019-09-11 10:48 ` Chris Wilson
@ 2019-09-11 10:52 ` Ville Syrjälä
2019-09-11 11:02 ` Chris Wilson
1 sibling, 1 reply; 18+ messages in thread
From: Ville Syrjälä @ 2019-09-11 10:52 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Wed, Sep 11, 2019 at 11:19:59AM +0100, Chris Wilson wrote:
> If the display is inactive, we need not worry about the gpu reset
> clobbering the display! To prevent the display changing state between us
> checking the active state and doing the hard reset, we introduce the
> lightweight reset lock to the atomic commit for the affected (legacy)
> platforms.
>
> Testcase: igt/gem_eio/kms
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_display.c | 11 +++++++++++
> drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> 2 files changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 4ee750fa3ef0..a92487d8f4cf 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -13986,6 +13986,7 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
> struct intel_crtc *crtc;
> u64 put_domains[I915_MAX_PIPES] = {};
> intel_wakeref_t wakeref = 0;
> + int srcu;
> int i;
>
> intel_atomic_commit_fence_wait(state);
> @@ -14005,6 +14006,12 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
> }
> }
>
> + /* Prevent starting a GPU reset while we change global display state */
> + srcu = -ENODEV;
> + if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
> + /* only fails if interrupted */
> + srcu = intel_gt_reset_trylock(&dev_priv->gt);
Hmm. What happens if we're holding the modeset locks and there's an
ongoing reset blocked on said locks?
> +
> intel_commit_modeset_disables(state);
>
> /* FIXME: Eventually get rid of our crtc->config pointer */
> @@ -14049,6 +14056,10 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
> /* Now enable the clocks, plane, pipe, and connectors that we set up. */
> dev_priv->display.commit_modeset_enables(state);
>
> +
> + if (srcu > 0)
> + intel_gt_reset_unlock(&dev_priv->gt, srcu);
> +
> if (state->modeset) {
> intel_encoders_update_complete(state);
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> index d8b1498d7ab7..df4a86bdb6f6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -7,6 +7,7 @@
> #include <linux/sched/mm.h>
> #include <linux/stop_machine.h>
>
> +#include "display/intel_display.h"
> #include "display/intel_display_types.h"
> #include "display/intel_overlay.h"
>
> @@ -729,6 +730,21 @@ static void nop_submit_request(struct i915_request *request)
> intel_engine_queue_breadcrumbs(engine);
> }
>
> +static bool reset_clobbers_display(struct drm_i915_private *i915)
> +{
> + struct intel_crtc *crtc;
> +
> + if (!INTEL_INFO(i915)->gpu_reset_clobbers_display)
> + return false;
> +
> + for_each_intel_crtc(&i915->drm, crtc) {
> + if (crtc->active)
> + return true;
> + }
> +
> + return false;
> +}
> +
> static void __intel_gt_set_wedged(struct intel_gt *gt)
> {
> struct intel_engine_cs *engine;
> @@ -755,7 +771,7 @@ static void __intel_gt_set_wedged(struct intel_gt *gt)
> awake = reset_prepare(gt);
>
> /* Even if the GPU reset fails, it should still stop the engines */
> - if (!INTEL_INFO(gt->i915)->gpu_reset_clobbers_display)
> + if (!reset_clobbers_display(gt->i915))
> __intel_gt_reset(gt, ALL_ENGINES);
>
> for_each_engine(engine, gt->i915, id)
> --
> 2.23.0
--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
` (2 preceding siblings ...)
2019-09-10 7:39 ` [PATCH 1/2] " Ville Syrjälä
@ 2019-09-11 10:58 ` Patchwork
2019-09-11 11:35 ` ✗ Fi.CI.BAT: failure " Patchwork
4 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2019-09-11 10:58 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
URL : https://patchwork.freedesktop.org/series/66459/
State : warning
== Summary ==
$ dim checkpatch origin/drm-tip
1b207a558e6e drm/i915/gt: Allow clobbering gpu resets if the display is off
-:49: CHECK:LINE_SPACING: Please don't use multiple blank lines
#49: FILE: drivers/gpu/drm/i915/display/intel_display.c:14071:
+
total: 0 errors, 0 warnings, 1 checks, 65 lines checked
ff9559a3849b drm/i915/gt: Only unwedge if we can reset first
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] drm/i915/gt: Allow clobbering gpu resets if the display is off
2019-09-11 10:52 ` Ville Syrjälä
@ 2019-09-11 11:02 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-11 11:02 UTC (permalink / raw)
To: Ville Syrjälä; +Cc: intel-gfx
Quoting Ville Syrjälä (2019-09-11 11:52:29)
> On Wed, Sep 11, 2019 at 11:19:59AM +0100, Chris Wilson wrote:
> > If the display is inactive, we need not worry about the gpu reset
> > clobbering the display! To prevent the display changing state between us
> > checking the active state and doing the hard reset, we introduce the
> > lightweight reset lock to the atomic commit for the affected (legacy)
> > platforms.
> >
> > Testcase: igt/gem_eio/kms
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/display/intel_display.c | 11 +++++++++++
> > drivers/gpu/drm/i915/gt/intel_reset.c | 18 +++++++++++++++++-
> > 2 files changed, 28 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index 4ee750fa3ef0..a92487d8f4cf 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -13986,6 +13986,7 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
> > struct intel_crtc *crtc;
> > u64 put_domains[I915_MAX_PIPES] = {};
> > intel_wakeref_t wakeref = 0;
> > + int srcu;
> > int i;
> >
> > intel_atomic_commit_fence_wait(state);
> > @@ -14005,6 +14006,12 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
> > }
> > }
> >
> > + /* Prevent starting a GPU reset while we change global display state */
> > + srcu = -ENODEV;
> > + if (INTEL_INFO(dev_priv)->gpu_reset_clobbers_display)
> > + /* only fails if interrupted */
> > + srcu = intel_gt_reset_trylock(&dev_priv->gt);
>
> Hmm. What happens if we're holding the modeset locks and there's an
> ongoing reset blocked on said locks?
I was about to say it's the inner lock in both cases, but it's not.
Anyway doesn't that interaction (intel_prepare_reset) prevent
crtc->active changing as we use it from inside intel_gt_reset (which
calls intel_gt_unset_wedged)?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* ✗ Fi.CI.BAT: failure for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
` (3 preceding siblings ...)
2019-09-11 10:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2) Patchwork
@ 2019-09-11 11:35 ` Patchwork
4 siblings, 0 replies; 18+ messages in thread
From: Patchwork @ 2019-09-11 11:35 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2)
URL : https://patchwork.freedesktop.org/series/66459/
State : failure
== Summary ==
CI Bug Log - changes from CI_DRM_6868 -> Patchwork_14356
====================================================
Summary
-------
**FAILURE**
Serious unknown changes coming with Patchwork_14356 absolutely need to be
verified manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_14356, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/
Possible new issues
-------------------
Here are the unknown changes that may have been introduced in Patchwork_14356:
### IGT changes ###
#### Possible regressions ####
* igt@i915_hangman@error-state-basic:
- fi-bwr-2160: [PASS][1] -> [INCOMPLETE][2]
[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-bwr-2160/igt@i915_hangman@error-state-basic.html
[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-bwr-2160/igt@i915_hangman@error-state-basic.html
- fi-blb-e6850: [PASS][3] -> [INCOMPLETE][4]
[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-blb-e6850/igt@i915_hangman@error-state-basic.html
[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-blb-e6850/igt@i915_hangman@error-state-basic.html
Known issues
------------
Here are the changes found in Patchwork_14356 that come from known issues:
### IGT changes ###
#### Issues hit ####
* igt@gem_mmap_gtt@basic-write-no-prefault:
- fi-icl-u3: [PASS][5] -> [DMESG-WARN][6] ([fdo#107724]) +1 similar issue
[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u3/igt@gem_mmap_gtt@basic-write-no-prefault.html
[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u3/igt@gem_mmap_gtt@basic-write-no-prefault.html
* igt@i915_hangman@error-state-basic:
- fi-gdg-551: [PASS][7] -> [INCOMPLETE][8] ([fdo#108316])
[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-gdg-551/igt@i915_hangman@error-state-basic.html
[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-gdg-551/igt@i915_hangman@error-state-basic.html
- fi-pnv-d510: [PASS][9] -> [INCOMPLETE][10] ([fdo#110740])
[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-pnv-d510/igt@i915_hangman@error-state-basic.html
[10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-pnv-d510/igt@i915_hangman@error-state-basic.html
* igt@kms_chamelium@dp-edid-read:
- fi-kbl-7500u: [PASS][11] -> [WARN][12] ([fdo#109483])
[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-kbl-7500u/igt@kms_chamelium@dp-edid-read.html
[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-kbl-7500u/igt@kms_chamelium@dp-edid-read.html
* igt@kms_frontbuffer_tracking@basic:
- fi-icl-u3: [PASS][13] -> [FAIL][14] ([fdo#103167])
[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u3/igt@kms_frontbuffer_tracking@basic.html
[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u3/igt@kms_frontbuffer_tracking@basic.html
#### Possible fixes ####
* igt@prime_vgem@basic-gtt:
- fi-icl-u3: [DMESG-WARN][15] ([fdo#107724]) -> [PASS][16] +1 similar issue
[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u3/igt@prime_vgem@basic-gtt.html
[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u3/igt@prime_vgem@basic-gtt.html
#### Warnings ####
* igt@i915_pm_rpm@module-reload:
- fi-icl-u2: [INCOMPLETE][17] ([fdo#107713] / [fdo#108840]) -> [DMESG-WARN][18] ([fdo#110595])
[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6868/fi-icl-u2/igt@i915_pm_rpm@module-reload.html
[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/fi-icl-u2/igt@i915_pm_rpm@module-reload.html
[fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
[fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
[fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
[fdo#108316]: https://bugs.freedesktop.org/show_bug.cgi?id=108316
[fdo#108840]: https://bugs.freedesktop.org/show_bug.cgi?id=108840
[fdo#109483]: https://bugs.freedesktop.org/show_bug.cgi?id=109483
[fdo#110595]: https://bugs.freedesktop.org/show_bug.cgi?id=110595
[fdo#110740]: https://bugs.freedesktop.org/show_bug.cgi?id=110740
Participating hosts (53 -> 46)
------------------------------
Missing (7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-icl-y fi-byt-clapper fi-bdw-samus
Build changes
-------------
* CI: CI-20190529 -> None
* Linux: CI_DRM_6868 -> Patchwork_14356
CI-20190529: 20190529
CI_DRM_6868: fb9bbe42526c1a8467c99f8137ed6b94c749681f @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_5178: efb4539494d94f03374874d3b61bd04ef3802aaa @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_14356: ff9559a3849b3aab2d39d531140cdd329ae818bd @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
ff9559a3849b drm/i915/gt: Only unwedge if we can reset first
1b207a558e6e drm/i915/gt: Allow clobbering gpu resets if the display is off
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14356/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first
2019-09-10 7:39 ` Janusz Krzysztofik
@ 2019-09-13 7:41 ` Chris Wilson
0 siblings, 0 replies; 18+ messages in thread
From: Chris Wilson @ 2019-09-13 7:41 UTC (permalink / raw)
To: Janusz Krzysztofik; +Cc: intel-gfx
Quoting Janusz Krzysztofik (2019-09-10 08:39:51)
> Hi Chris,
>
> On Tuesday, September 10, 2019 12:55:36 AM CEST Chris Wilson wrote:
> > @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt
> *gt)
> > }
> > spin_unlock_irqrestore(&timelines->lock, flags);
> >
> > - intel_gt_sanitize(gt, false);
> > + ok = false;
> > + if (!reset_clobbers_display(gt->i915))
> > + ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
> > + if (!ok)
> > + return false;
>
> Before your change, that code was executed inside intel_gt_sanitize(gt, false)
> which unfortunately didn't return any result. The same outcome could be
> achieved by redefining intel_gt_sanitize() to return that result and saying:
>
> if (!intel_gt_sanitize(gt, false)
> return false;
>
> Is there any specific reason for intel_gt_sanitize() returning void?
The intent is that sanitize scrubs the leftover BIOS state, failure is
not an option. The biggest change with respect to intel_gt_sanitize() is
the game we play with reset_clobbers_display -- we need the reset,
whereas in sanitize, the reset is good to have (but realistically we do
not expect there to be any contexts to scrub and so can take the risk).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2019-09-13 7:43 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-09 22:55 [PATCH 1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Chris Wilson
2019-09-09 22:55 ` [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first Chris Wilson
2019-09-10 0:59 ` Daniele Ceraolo Spurio
2019-09-10 6:06 ` Chris Wilson
2019-09-10 21:20 ` Daniele Ceraolo Spurio
2019-09-10 7:39 ` Janusz Krzysztofik
2019-09-13 7:41 ` Chris Wilson
2019-09-09 23:33 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915/gt: Allow clobbering gpu resets if the display is off Patchwork
2019-09-10 7:39 ` [PATCH 1/2] " Ville Syrjälä
2019-09-10 8:32 ` Chris Wilson
2019-09-10 8:58 ` Ville Syrjälä
2019-09-11 10:01 ` Chris Wilson
2019-09-11 10:19 ` [PATCH v2] " Chris Wilson
2019-09-11 10:48 ` Chris Wilson
2019-09-11 10:52 ` Ville Syrjälä
2019-09-11 11:02 ` Chris Wilson
2019-09-11 10:58 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2] drm/i915/gt: Allow clobbering gpu resets if the display is off (rev2) Patchwork
2019-09-11 11:35 ` ✗ Fi.CI.BAT: failure " Patchwork
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.