* [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
@ 2017-11-29 13:59 Chris Wilson
  2017-11-29 14:05 ` [PATCH v2] " Chris Wilson
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Chris Wilson @ 2017-11-29 13:59 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

History tells us that if we cannot reset the GPU now, we never will. This
then impacts everything that is run subsequently. On failing the reset,
we mark the driver as wedged, trying to prevent further execution on the
GPU, forcing userspace to fall back to using the CPU to update its
framebuffers and let the user know what happened.

We also want to go one step further and add a taint to the kernel so that
any subsequent faults can be traced back to this failure. This is
important for igt, where if the GPU/driver fails we want to reboot and
restart testing rather than continue on into oblivion.

TAINT_DIE is colloquially known as "system on fire", which seems
appropriate for unresponsive hardware.

References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 696d5cdf2779..f08343be880c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1904,10 +1904,24 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 
 	ret = intel_gpu_reset(i915, ALL_ENGINES);
 	if (ret) {
-		if (ret != -ENODEV)
-			DRM_ERROR("Failed to reset chip: %i\n", ret);
-		else
+		/*
+		 * History tells us that if we cannot reset the GPU now, we
+		 * never will. This then impacts everything that is run
+		 * subsequently. On failing the reset, we mark the driver
+		 * as wedged, preventing further execution on the GPU.
+		 * We also want to go one step further and add a taint to the
+		 * kernel so that any subsequent faults can be traced back to
+		 * this failure. This is important for igt, where if the
+		 * GPU/driver fails we want to reboot and restart testing
+		 * rather than continue on into oblivion.
+		 */
+		if (ret != -ENODEV) {
+			dev_err(i915->drm.dev,
+				"Failed to reset chip: %i\n", ret);
+			add_taint(TAINT_DIE, LOCKDEP_STILL_OK);
+		} else {
 			DRM_DEBUG_DRIVER("GPU reset disabled\n");
+		}
 		goto error;
 	}
 
-- 
2.15.0
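
For readers who want to observe the effect from userspace: a taint added with add_taint() becomes visible in the decimal bitmask exported at /proc/sys/kernel/tainted. The following is a minimal sketch, assuming the bit numbers from the kernel headers (TAINT_DIE is bit 7, TAINT_WARN is bit 9). The helper names taint_mask_has() and read_taint_mask() are illustrative only and are not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Bit numbers as defined by the kernel's TAINT_* constants. */
enum taint_bit {
	TAINT_BIT_DIE = 7,	/* set by an oops/BUG, and by this patch */
	TAINT_BIT_WARN = 9,	/* set whenever a WARN() fires */
};

/* Hypothetical helper: test one taint bit in a mask read from procfs. */
static bool taint_mask_has(unsigned long mask, enum taint_bit bit)
{
	assert((unsigned int)bit < 8 * sizeof(mask));
	return (mask >> bit) & 1UL;
}

/*
 * Hypothetical helper: parse the decimal mask stored in @path.
 * Returns 0 on success and -1 if the file is missing or malformed.
 */
static int read_taint_mask(const char *path, unsigned long *mask)
{
	FILE *f = fopen(path, "r");
	int parsed;

	if (!f)
		return -1;

	parsed = fscanf(f, "%lu", mask);
	fclose(f);

	return parsed == 1 ? 0 : -1;
}
```

Calling read_taint_mask("/proc/sys/kernel/tainted", &mask) and then taint_mask_has(mask, TAINT_BIT_DIE) would be one way to check whether the failure path in this patch has been hit since boot.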

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v2] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
  2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
@ 2017-11-29 14:05 ` Chris Wilson
  2017-11-30 12:24   ` Lofstedt, Marta
                     ` (2 more replies)
  2017-11-30 10:02 ` ✗ Fi.CI.BAT: failure for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev2) Patchwork
                   ` (5 subsequent siblings)
  6 siblings, 3 replies; 13+ messages in thread
From: Chris Wilson @ 2017-11-29 14:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

History tells us that if we cannot reset the GPU now, we never will. This
then impacts everything that is run subsequently. On failing the reset,
we mark the driver as wedged, trying to prevent further execution on the
GPU, forcing userspace to fall back to using the CPU to update its
framebuffers and let the user know what happened.

We also want to go one step further and add a taint to the kernel so that
any subsequent faults can be traced back to this failure. This is
important for igt, where if the GPU/driver fails we want to reboot and
restart testing rather than continue on into oblivion.

TAINT_DIE is colloquially known as "system on fire", which seems
appropriate for unresponsive hardware.

v2: Also taint if the recovery fails (again history shows us that is
typically fatal).

References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 696d5cdf2779..eb90ddac7f8b 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1897,18 +1897,21 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	disable_irq(i915->drm.irq);
 	ret = i915_gem_reset_prepare(i915);
 	if (ret) {
-		DRM_ERROR("GPU recovery failed\n");
+		dev_err(i915->drm.dev, "GPU recovery failed\n");
 		intel_gpu_reset(i915, ALL_ENGINES);
-		goto error;
+		goto taint;
 	}
 
 	ret = intel_gpu_reset(i915, ALL_ENGINES);
 	if (ret) {
-		if (ret != -ENODEV)
-			DRM_ERROR("Failed to reset chip: %i\n", ret);
-		else
+		if (ret != -ENODEV) {
+			dev_err(i915->drm.dev,
+				"Failed to reset chip: %i\n", ret);
+			goto taint;
+		} else {
 			DRM_DEBUG_DRIVER("GPU reset disabled\n");
-		goto error;
+			goto error;
+		}
 	}
 
 	i915_gem_reset(i915);
@@ -1951,6 +1954,19 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	wake_up_bit(&error->flags, I915_RESET_HANDOFF);
 	return;
 
+taint:
+	/*
+	 * History tells us that if we cannot reset the GPU now, we
+	 * never will. This then impacts everything that is run
+	 * subsequently. On failing the reset, we mark the driver
+	 * as wedged, preventing further execution on the GPU.
+	 * We also want to go one step further and add a taint to the
+	 * kernel so that any subsequent faults can be traced back to
+	 * this failure. This is important for igt, where if the
+	 * GPU/driver fails we want to reboot and restart testing
+	 * rather than continue on into oblivion.
+	 */
+	add_taint(TAINT_DIE, LOCKDEP_STILL_OK);
 error:
 	i915_gem_set_wedged(i915);
 	i915_gem_retire_requests(i915);
-- 
2.15.0
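
The v2 control flow relies on the taint: label falling through into error:, so every tainting path also wedges the driver, while the -ENODEV path wedges without tainting. The following is a standalone sketch of just that label structure, not the driver code. It models only the two checks visible in the hunk above, and model_i915_reset() and the OUTCOME_* flags are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <errno.h>

enum {
	OUTCOME_OK = 0,
	OUTCOME_WEDGED = 1 << 0,	/* stands in for i915_gem_set_wedged() */
	OUTCOME_TAINTED = 1 << 1,	/* stands in for add_taint(TAINT_DIE, ...) */
};

/*
 * Hypothetical model: @prepare_ret plays the role of the return value of
 * i915_gem_reset_prepare(), @reset_ret that of intel_gpu_reset().
 */
static unsigned int model_i915_reset(int prepare_ret, int reset_ret)
{
	unsigned int outcome = OUTCOME_OK;

	if (prepare_ret)
		goto taint;

	if (reset_ret) {
		if (reset_ret != -ENODEV)
			goto taint;
		/* Reset disabled: wedge, but do not taint. */
		goto error;
	}

	return outcome;

taint:
	outcome |= OUTCOME_TAINTED;
	/* Deliberate fall through: a tainted reset is also wedged. */
error:
	outcome |= OUTCOME_WEDGED;
	return outcome;
}
```

Under this model, model_i915_reset(0, -ENODEV) yields only OUTCOME_WEDGED, whereas any other non-zero input yields both flags.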


* ✗ Fi.CI.BAT: failure for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev2)
  2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
  2017-11-29 14:05 ` [PATCH v2] " Chris Wilson
@ 2017-11-30 10:02 ` Patchwork
  2017-11-30 14:15 ` Patchwork
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2017-11-30 10:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev2)
URL   : https://patchwork.freedesktop.org/series/34623/
State : failure

== Summary ==

Series 34623v2 drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
https://patchwork.freedesktop.org/api/1.0/series/34623/revisions/2/mbox/

Test debugfs_test:
        Subgroup read_all_entries:
                dmesg-warn -> PASS       (fi-bdw-gvtdvm) fdo#103938 +1
Test gem_exec_reloc:
        Subgroup basic-write-gtt-active:
                fail       -> PASS       (fi-gdg-551) fdo#102582 +1
Test gem_mmap_gtt:
        Subgroup basic-small-bo-tiledx:
                fail       -> PASS       (fi-gdg-551) fdo#102575
Test kms_frontbuffer_tracking:
        Subgroup basic:
                pass       -> FAIL       (fi-glk-1)

fdo#103938 https://bugs.freedesktop.org/show_bug.cgi?id=103938
fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582
fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:437s
fi-bdw-gvtdvm    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:449s
fi-blb-e6850     total:288  pass:223  dwarn:1   dfail:0   fail:0   skip:64  time:384s
fi-bsw-n3050     total:288  pass:242  dwarn:0   dfail:0   fail:0   skip:46  time:517s
fi-bwr-2160      total:288  pass:183  dwarn:0   dfail:0   fail:0   skip:105 time:280s
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:505s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:506s
fi-byt-j1900     total:288  pass:253  dwarn:0   dfail:0   fail:0   skip:35  time:496s
fi-byt-n2820     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:477s
fi-elk-e7500     total:224  pass:162  dwarn:16  dfail:0   fail:0   skip:45 
fi-gdg-551       total:288  pass:179  dwarn:1   dfail:0   fail:0   skip:108 time:267s
fi-glk-1         total:288  pass:259  dwarn:0   dfail:0   fail:1   skip:28  time:534s
fi-hsw-4770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:369s
fi-hsw-4770r     total:288  pass:224  dwarn:0   dfail:0   fail:0   skip:64  time:257s
fi-ilk-650       total:288  pass:228  dwarn:0   dfail:0   fail:0   skip:60  time:392s
fi-ivb-3520m     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:484s
fi-ivb-3770      total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:444s
fi-kbl-7500u     total:288  pass:263  dwarn:1   dfail:0   fail:0   skip:24  time:486s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:527s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:469s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:531s
fi-pnv-d510      total:288  pass:222  dwarn:1   dfail:0   fail:0   skip:65  time:596s
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:450s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:540s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:563s
fi-skl-6700k     total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:515s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:502s
fi-skl-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:446s
fi-snb-2520m     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:545s
fi-snb-2600      total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:412s
Blacklisted hosts:
fi-cfl-s2        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:600s
fi-glk-dsi       total:288  pass:257  dwarn:0   dfail:0   fail:1   skip:30  time:501s

a19f73d6fe96a9aaa7f71d25bbe9f897dc5e9ee1 drm-tip: 2017y-11m-30d-08h-12m-27s UTC integration manifest
333c192454e0 drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7359/

* Re: [PATCH v2] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
  2017-11-29 14:05 ` [PATCH v2] " Chris Wilson
@ 2017-11-30 12:24   ` Lofstedt, Marta
  2017-12-04 13:41   ` Joonas Lahtinen
  2017-12-05 17:06   ` Chris Wilson
  2 siblings, 0 replies; 13+ messages in thread
From: Lofstedt, Marta @ 2017-11-30 12:24 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter


> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Chris Wilson
> Sent: Wednesday, November 29, 2017 4:06 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Subject: [Intel-gfx] [PATCH v2] drm/i915: Taint (TAINT_DIE) the kernel if the
> GPU reset fails
> 
> History tells us that if we cannot reset the GPU now, we never will. This then
> impacts everything that is run subsequently. On failing the reset, we mark
> the driver as wedged, trying to prevent further execution on the GPU,
> forcing userspace to fallback to using the CPU to update its framebuffers and
> let the user know what happened.
> 
> We also want to go one step further and add a taint to the kernel so that any
> subsequent faults can be traced back to this failure. This is important for igt,
> where if the GPU/driver fails we want to reboot and restart testing rather
> than continue on into oblivion.

I am OK with this if the comment deciding what IGT should do when this state is detected is removed.
I just want to make it clear that the tainting feature is separate from what should be done when the system is tainted.

> 
> TAINT_DIE is colloquially known as "system on fire", which seems
> appropriate for unresponsive hardware.
> 
> v2: Also taint if the recovery fails (again history shows us that is typically
> fatal).
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 696d5cdf2779..eb90ddac7f8b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1897,18 +1897,21 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
>  	disable_irq(i915->drm.irq);
>  	ret = i915_gem_reset_prepare(i915);
>  	if (ret) {
> -		DRM_ERROR("GPU recovery failed\n");
> +		dev_err(i915->drm.dev, "GPU recovery failed\n");
>  		intel_gpu_reset(i915, ALL_ENGINES);
> -		goto error;
> +		goto taint;
>  	}
> 
>  	ret = intel_gpu_reset(i915, ALL_ENGINES);
>  	if (ret) {
> -		if (ret != -ENODEV)
> -			DRM_ERROR("Failed to reset chip: %i\n", ret);
> -		else
> +		if (ret != -ENODEV) {
> +			dev_err(i915->drm.dev,
> +				"Failed to reset chip: %i\n", ret);
> +			goto taint;
> +		} else {
>  			DRM_DEBUG_DRIVER("GPU reset disabled\n");
> -		goto error;
> +			goto error;
> +		}
>  	}
> 
>  	i915_gem_reset(i915);
> @@ -1951,6 +1954,19 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
>  	wake_up_bit(&error->flags, I915_RESET_HANDOFF);
>  	return;
> 
> +taint:
> +	/*
> +	 * History tells us that if we cannot reset the GPU now, we
> +	 * never will. This then impacts everything that is run
> +	 * subsequently. On failing the reset, we mark the driver
> +	 * as wedged, preventing further execution on the GPU.
> +	 * We also want to go one step further and add a taint to the
> +	 * kernel so that any subsequent faults can be traced back to
> +	 * this failure. This is important for igt, where if the
> +	 * GPU/driver fails we want to reboot and restart testing
> +	 * rather than continue on into oblivion.
> +	 */
> +	add_taint(TAINT_DIE, LOCKDEP_STILL_OK);
>  error:
>  	i915_gem_set_wedged(i915);
>  	i915_gem_retire_requests(i915);
> --
> 2.15.0
> 

* ✗ Fi.CI.BAT: failure for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev2)
  2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
  2017-11-29 14:05 ` [PATCH v2] " Chris Wilson
  2017-11-30 10:02 ` ✗ Fi.CI.BAT: failure for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev2) Patchwork
@ 2017-11-30 14:15 ` Patchwork
  2017-12-05 17:26 ` [PATCH v3] drm/i915: Taint (TAINT_WARN) the kernel if the GPU reset fails Chris Wilson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2017-11-30 14:15 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev2)
URL   : https://patchwork.freedesktop.org/series/34623/
State : failure

== Summary ==

Series 34623v2 drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
https://patchwork.freedesktop.org/api/1.0/series/34623/revisions/2/mbox/

Test gem_exec_reloc:
        Subgroup basic-cpu-active:
                fail       -> PASS       (fi-gdg-551) fdo#102582 +3
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                pass       -> FAIL       (fi-skl-6700k)

fdo#102582 https://bugs.freedesktop.org/show_bug.cgi?id=102582

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:439s
fi-bdw-gvtdvm    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:453s
fi-blb-e6850     total:288  pass:223  dwarn:1   dfail:0   fail:0   skip:64  time:382s
fi-bsw-n3050     total:288  pass:242  dwarn:0   dfail:0   fail:0   skip:46  time:509s
fi-bwr-2160      total:288  pass:183  dwarn:0   dfail:0   fail:0   skip:105 time:281s
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:497s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:512s
fi-byt-j1900     total:288  pass:253  dwarn:0   dfail:0   fail:0   skip:35  time:490s
fi-gdg-551       total:288  pass:176  dwarn:1   dfail:0   fail:3   skip:108 time:271s
fi-glk-1         total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:543s
fi-hsw-4770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:381s
fi-hsw-4770r     total:288  pass:224  dwarn:0   dfail:0   fail:0   skip:64  time:261s
fi-ilk-650       total:288  pass:228  dwarn:0   dfail:0   fail:0   skip:60  time:393s
fi-ivb-3520m     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:471s
fi-ivb-3770      total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:450s
fi-kbl-7500u     total:288  pass:263  dwarn:1   dfail:0   fail:0   skip:24  time:485s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:531s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:481s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:533s
fi-pnv-d510      total:288  pass:222  dwarn:1   dfail:0   fail:0   skip:65  time:591s
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:447s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:537s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:560s
fi-skl-6700k     total:288  pass:263  dwarn:0   dfail:0   fail:1   skip:24  time:516s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:500s
fi-skl-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:446s
fi-snb-2520m     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:547s
fi-snb-2600      total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:422s
Blacklisted hosts:
fi-cfl-s2        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:606s
fi-glk-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:489s
fi-byt-n2820 failed to collect. IGT log at Patchwork_7368/fi-byt-n2820/igt.log
fi-elk-e7500 failed to collect. IGT log at Patchwork_7368/fi-elk-e7500/igt.log

6d6c48b9b35806aba461d2c8285db2689de9095f drm-tip: 2017y-11m-30d-12h-22m-59s UTC integration manifest
c7289df822c9 drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7368/

* Re: [PATCH v2] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
  2017-11-29 14:05 ` [PATCH v2] " Chris Wilson
  2017-11-30 12:24   ` Lofstedt, Marta
@ 2017-12-04 13:41   ` Joonas Lahtinen
  2017-12-04 13:45     ` Chris Wilson
  2017-12-05 16:56     ` Chris Wilson
  2017-12-05 17:06   ` Chris Wilson
  2 siblings, 2 replies; 13+ messages in thread
From: Joonas Lahtinen @ 2017-12-04 13:41 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter

On Wed, 2017-11-29 at 14:05 +0000, Chris Wilson wrote:
> History tells us that if we cannot reset the GPU now, we never will. This
> then impacts everything that is run subsequently. On failing the reset,
> we mark the driver as wedged, trying to prevent further execution on the
> GPU, forcing userspace to fallback to using the CPU to update its
> framebuffers and let the user know what happened.
> 
> We also want to go one step further and add a taint to the kernel so that
> any subsequent faults can be traced back to this failure. This is
> important for igt, where if the GPU/driver fails we want to reboot and
> restart testing rather than continue on into oblivion.
> 
> TAINT_DIE is colloquially known as "system on fire", which seems
> appropriate for unresponsive hardware.
> 
> v2: Also taint if the recovery fails (again history shows us that is
> typically fatal).
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Michał Winiarski <michal.winiarski@intel.com>

<SNIP>

> @@ -1951,6 +1954,19 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
>  	wake_up_bit(&error->flags, I915_RESET_HANDOFF);
>  	return;
>  
> +taint:
> +	/*
> +	 * History tells us that if we cannot reset the GPU now, we
> +	 * never will. This then impacts everything that is run
> +	 * subsequently. On failing the reset, we mark the driver
> +	 * as wedged, preventing further execution on the GPU.
> +	 * We also want to go one step further and add a taint to the
> +	 * kernel so that any subsequent faults can be traced back to
> +	 * this failure. This is important for igt, where if the
> +	 * GPU/driver fails we want to reboot and restart testing
> +	 * rather than continue on into oblivion.
> +	 */

As Marta mentioned too, how igt works on a given day is a bit too volatile to
document in the kernel comments.

With that dropped;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

* Re: [PATCH v2] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
  2017-12-04 13:41   ` Joonas Lahtinen
@ 2017-12-04 13:45     ` Chris Wilson
  2017-12-05 16:56     ` Chris Wilson
  1 sibling, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2017-12-04 13:45 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx; +Cc: Daniel Vetter

Quoting Joonas Lahtinen (2017-12-04 13:41:11)
> On Wed, 2017-11-29 at 14:05 +0000, Chris Wilson wrote:
> > History tells us that if we cannot reset the GPU now, we never will. This
> > then impacts everything that is run subsequently. On failing the reset,
> > we mark the driver as wedged, trying to prevent further execution on the
> > GPU, forcing userspace to fallback to using the CPU to update its
> > framebuffers and let the user know what happened.
> > 
> > We also want to go one step further and add a taint to the kernel so that
> > any subsequent faults can be traced back to this failure. This is
> > important for igt, where if the GPU/driver fails we want to reboot and
> > restart testing rather than continue on into oblivion.
> > 
> > TAINT_DIE is colloquially known as "system on fire", which seems
> > appropriate for unresponsive hardware.
> > 
> > v2: Also taint if the recovery fails (again history shows us that is
> > typically fatal).
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> 
> <SNIP>
> 
> > @@ -1951,6 +1954,19 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> >       wake_up_bit(&error->flags, I915_RESET_HANDOFF);
> >       return;
> >  
> > +taint:
> > +     /*
> > +      * History tells us that if we cannot reset the GPU now, we
> > +      * never will. This then impacts everything that is run
> > +      * subsequently. On failing the reset, we mark the driver
> > +      * as wedged, preventing further execution on the GPU.
> > +      * We also want to go one step further and add a taint to the
> > +      * kernel so that any subsequent faults can be traced back to
> > +      * this failure. This is important for igt, where if the
> > +      * GPU/driver fails we want to reboot and restart testing
> > +      * rather than continue on into oblivion.
> > +      */
> 
> As Marta mentioned too, How igt works on a given day is bit volatile to
> document in the kernel comments.

Without a use case, this isn't going anywhere.
-Chris

* Re: [PATCH v2] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
  2017-12-04 13:41   ` Joonas Lahtinen
  2017-12-04 13:45     ` Chris Wilson
@ 2017-12-05 16:56     ` Chris Wilson
  1 sibling, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2017-12-05 16:56 UTC (permalink / raw)
  To: Joonas Lahtinen, intel-gfx; +Cc: Daniel Vetter

Quoting Joonas Lahtinen (2017-12-04 13:41:11)
> On Wed, 2017-11-29 at 14:05 +0000, Chris Wilson wrote:
> > History tells us that if we cannot reset the GPU now, we never will. This
> > then impacts everything that is run subsequently. On failing the reset,
> > we mark the driver as wedged, trying to prevent further execution on the
> > GPU, forcing userspace to fallback to using the CPU to update its
> > framebuffers and let the user know what happened.
> > 
> > We also want to go one step further and add a taint to the kernel so that
> > any subsequent faults can be traced back to this failure. This is
> > important for igt, where if the GPU/driver fails we want to reboot and
> > restart testing rather than continue on into oblivion.
> > 
> > TAINT_DIE is colloquially known as "system on fire", which seems
> > appropriate for unresponsive hardware.
> > 
> > v2: Also taint if the recovery fails (again history shows us that is
> > typically fatal).
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> 
> <SNIP>
> 
> > @@ -1951,6 +1954,19 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> >       wake_up_bit(&error->flags, I915_RESET_HANDOFF);
> >       return;
> >  
> > +taint:
> > +     /*
> > +      * History tells us that if we cannot reset the GPU now, we
> > +      * never will. This then impacts everything that is run
> > +      * subsequently. On failing the reset, we mark the driver
> > +      * as wedged, preventing further execution on the GPU.
> > +      * We also want to go one step further and add a taint to the
> > +      * kernel so that any subsequent faults can be traced back to
> > +      * this failure. This is important for igt, where if the
> > +      * GPU/driver fails we want to reboot and restart testing
> > +      * rather than continue on into oblivion.
> > +      */
> 
> As Marta mentioned too, How igt works on a given day is bit volatile to
> document in the kernel comments.

More to the point, CI now implements the described response to
TAINT_DIE, without which this is pointless (userspace sees the wedged
and either handles it or dies; CI sees the wedged as a challenge).
-Chris
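
To make the "reboot and restart" response concrete, here is a purely hypothetical sketch of how a test runner could act on the taint mask. It is not a description of what igt or the CI system actually does. It assumes the runner samples /proc/sys/kernel/tainted before and after each test and that TAINT_DIE is bit 7; the names taint_newly_set() and runner_should_reboot() are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define TAINT_DIE_BIT 7	/* bit number of TAINT_DIE in the kernel headers */

/* True if @bit was clear before the test and is set afterwards. */
static bool taint_newly_set(unsigned long before, unsigned long after,
			    unsigned int bit)
{
	assert(bit < 8 * sizeof(before));
	return ((after & ~before) >> bit) & 1UL;
}

/*
 * Hypothetical policy: once TAINT_DIE is present, whether or not the
 * most recent test introduced it, stop and reboot before running more.
 */
static bool runner_should_reboot(unsigned long after)
{
	return (after >> TAINT_DIE_BIT) & 1UL;
}
```

Keeping the two questions separate lets a runner blame the test that introduced the taint while still refusing to continue on a machine that was already tainted.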

* Re: [PATCH v2] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
  2017-11-29 14:05 ` [PATCH v2] " Chris Wilson
  2017-11-30 12:24   ` Lofstedt, Marta
  2017-12-04 13:41   ` Joonas Lahtinen
@ 2017-12-05 17:06   ` Chris Wilson
  2 siblings, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2017-12-05 17:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika

Quoting Chris Wilson (2017-11-29 14:05:33)
> History tells us that if we cannot reset the GPU now, we never will. This
> then impacts everything that is run subsequently. On failing the reset,
> we mark the driver as wedged, trying to prevent further execution on the
> GPU, forcing userspace to fallback to using the CPU to update its
> framebuffers and let the user know what happened.
> 
> We also want to go one step further and add a taint to the kernel so that
> any subsequent faults can be traced back to this failure. This is
> important for igt, where if the GPU/driver fails we want to reboot and
> restart testing rather than continue on into oblivion.
> 
> TAINT_DIE is colloquially known as "system on fire", which seems
> appropriate for unresponsive hardware.
> 
> v2: Also taint if the recovery fails (again history shows us that is
> typically fatal).
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Michał Winiarski <michal.winiarski@intel.com>

From IRC:
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
-Chris

* [PATCH v3] drm/i915: Taint (TAINT_WARN) the kernel if the GPU reset fails
  2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
                   ` (2 preceding siblings ...)
  2017-11-30 14:15 ` Patchwork
@ 2017-12-05 17:26 ` Chris Wilson
  2017-12-05 17:27 ` [PATCH v4] " Chris Wilson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2017-12-05 17:26 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

History tells us that if we cannot reset the GPU now, we never will. This
then impacts everything that is run subsequently. On failing the reset,
we mark the driver as wedged, trying to prevent further execution on the
GPU, forcing userspace to fall back to using the CPU to update its
framebuffers and let the user know what happened.

We also want to go one step further and add a taint to the kernel so that
any subsequent faults can be traced back to this failure. This is
useful for CI, where if the GPU/driver fails we want to reboot and
restart testing rather than continue on into oblivion. For everyone
else, the warning taint is a testament to the system's unreliability.

TAINT_WARN is used anytime a WARN() is emitted, which is suitable for
our purposes here as well; the driver/system may behave unexpectedly
after the failure.

v2: Also taint if the recovery fails (again history shows us that is
typically fatal).
v3: Use TAINT_WARN

References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7faf20aff25a..71213c4a13a8 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1897,9 +1897,9 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	disable_irq(i915->drm.irq);
 	ret = i915_gem_reset_prepare(i915);
 	if (ret) {
-		DRM_ERROR("GPU recovery failed\n");
+		dev_err(i915->drm.dev, "GPU recovery failed\n");
 		intel_gpu_reset(i915, ALL_ENGINES);
-		goto error;
+		goto taint;
 	}
 
 	if (!intel_has_gpu_reset(i915)) {
@@ -1916,7 +1916,7 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	}
 	if (ret) {
 		dev_err(i915->drm.dev, "Failed to reset chip\n");
-		goto error;
+		goto taint;
 	}
 
 	i915_gem_reset(i915);
@@ -1959,6 +1959,20 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	wake_up_bit(&error->flags, I915_RESET_HANDOFF);
 	return;
 
+taint:
+	/*
+	 * History tells us that if we cannot reset the GPU now, we
+	 * never will. This then impacts everything that is run
+	 * subsequently. On failing the reset, we mark the driver
+	 * as wedged, preventing further execution on the GPU.
+	 * We also want to go one step further and add a taint to the
+	 * kernel so that any subsequent faults can be traced back to
+	 * this failure. This is important for CI, where if the
+	 * GPU/driver fails we would like to reboot and restart testing
+	 * rather than continue on into oblivion. For everyone else,
+	 * the system should still plod around, but they have been warned!
+	 */
+	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
 error:
 	i915_gem_set_wedged(i915);
 	i915_gem_retire_requests(i915);
-- 
2.15.1
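
The control flow this patch introduces (a new `taint:` label that falls through into the existing `error:` label) can be modelled in userspace. This is a minimal sketch, not driver code: `struct fake_gpu` and `fake_reset()` are hypothetical stand-ins, and it assumes the untouched `intel_has_gpu_reset()` branch still jumps straight to `error`, which the hunks above do not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; this is not the i915 API. */
struct fake_gpu {
	bool has_reset;   /* models intel_has_gpu_reset() */
	int reset_result; /* models the return value of intel_gpu_reset() */
	bool wedged;      /* models the effect of i915_gem_set_wedged() */
	bool tainted;     /* models the effect of add_taint() */
};

static void fake_reset(struct fake_gpu *gpu)
{
	assert(gpu);

	/* Assumption: no reset support means wedge only, without a taint. */
	if (!gpu->has_reset)
		goto error;

	/* A reset was attempted and failed: taint first, then wedge. */
	if (gpu->reset_result)
		goto taint;

	return; /* reset succeeded, nothing to mark */

taint:
	gpu->tainted = true;
	/* fall through so that a tainted device is always wedged as well */
error:
	gpu->wedged = true;
}
```

Placing `taint:` directly above `error:` means every tainting path also wedges the device, while paths that only jump to `error` wedge without tainting.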


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4] drm/i915: Taint (TAINT_WARN) the kernel if the GPU reset fails
  2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
                   ` (3 preceding siblings ...)
  2017-12-05 17:26 ` [PATCH v3] drm/i915: Taint (TAINT_WARN) the kernel if the GPU reset fails Chris Wilson
@ 2017-12-05 17:27 ` Chris Wilson
  2017-12-05 18:34 ` ✓ Fi.CI.BAT: success for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev4) Patchwork
  2017-12-05 21:09 ` ✓ Fi.CI.IGT: " Patchwork
  6 siblings, 0 replies; 13+ messages in thread
From: Chris Wilson @ 2017-12-05 17:27 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

History tells us that if we cannot reset the GPU now, we never will. This
then impacts everything that is run subsequently. On failing the reset,
we mark the driver as wedged, trying to prevent further execution on the
GPU, forcing userspace to fall back to using the CPU to update its
framebuffers and let the user know what happened.

We also want to go one step further and add a taint to the kernel so that
any subsequent faults can be traced back to this failure. This is
useful for CI, where if the GPU/driver fails we want to reboot and
restart testing rather than continue on into oblivion. For everyone
else, the warning taint is a testament to the system's unreliability.

TAINT_WARN is used anytime a WARN() is emitted, which is suitable for
our purposes here as well; the driver/system may behave unexpectedly
after the failure.

v2: Also taint if the recovery fails (again history shows us that is
typically fatal).
v3: Use TAINT_WARN

References: https://bugs.freedesktop.org/show_bug.cgi?id=103514
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 7faf20aff25a..5b1fd5f1defb 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1897,9 +1897,9 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	disable_irq(i915->drm.irq);
 	ret = i915_gem_reset_prepare(i915);
 	if (ret) {
-		DRM_ERROR("GPU recovery failed\n");
+		dev_err(i915->drm.dev, "GPU recovery failed\n");
 		intel_gpu_reset(i915, ALL_ENGINES);
-		goto error;
+		goto taint;
 	}
 
 	if (!intel_has_gpu_reset(i915)) {
@@ -1916,7 +1916,7 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	}
 	if (ret) {
 		dev_err(i915->drm.dev, "Failed to reset chip\n");
-		goto error;
+		goto taint;
 	}
 
 	i915_gem_reset(i915);
@@ -1959,6 +1959,20 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 	wake_up_bit(&error->flags, I915_RESET_HANDOFF);
 	return;
 
+taint:
+	/*
+	 * History tells us that if we cannot reset the GPU now, we
+	 * never will. This then impacts everything that is run
+	 * subsequently. On failing the reset, we mark the driver
+	 * as wedged, preventing further execution on the GPU.
+	 * We also want to go one step further and add a taint to the
+	 * kernel so that any subsequent faults can be traced back to
+	 * this failure. This is important for CI, where if the
+	 * GPU/driver fails we would like to reboot and restart testing
+	 * rather than continue on into oblivion. For everyone else,
+	 * the system should still plod along, but they have been warned!
+	 */
+	add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
 error:
 	i915_gem_set_wedged(i915);
 	i915_gem_retire_requests(i915);
-- 
2.15.1
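
For the CI use case described in the commit message, a test runner could inspect the kernel taint mask between tests and reboot once TAINT_WARN is set. The sketch below is an assumption about how such a check might look, not how the CI system actually does it; the helper names are hypothetical. It relies only on the taint mask being exported as a decimal number in /proc/sys/kernel/tainted and on TAINT_WARN being bit 9 of that mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TAINT_WARN_BIT 9 /* bit index of TAINT_WARN in the taint mask */

/* Return true if the given bit is set in a taint mask. */
static bool taint_bit_set(unsigned long long mask, unsigned int bit)
{
	assert(bit < 64);
	return (mask >> bit) & 1ULL;
}

/*
 * Read the taint mask from a file such as /proc/sys/kernel/tainted.
 * Returns 0 and stores the mask in *mask on success, -1 on failure.
 */
static int read_taint_mask(const char *path, unsigned long long *mask)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%llu", mask) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A runner would call `read_taint_mask("/proc/sys/kernel/tainted", &mask)` and then `taint_bit_set(mask, TAINT_WARN_BIT)`. Any WARN() sets the same bit, so a set bit indicates that the system is suspect, not specifically that a GPU reset failed.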


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev4)
  2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
                   ` (4 preceding siblings ...)
  2017-12-05 17:27 ` [PATCH v4] " Chris Wilson
@ 2017-12-05 18:34 ` Patchwork
  2017-12-05 21:09 ` ✓ Fi.CI.IGT: " Patchwork
  6 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2017-12-05 18:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev4)
URL   : https://patchwork.freedesktop.org/series/34623/
State : success

== Summary ==

Series 34623v4 drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails
https://patchwork.freedesktop.org/api/1.0/series/34623/revisions/4/mbox/

Test debugfs_test:
        Subgroup read_all_entries:
                dmesg-warn -> FAIL       (fi-elk-e7500) fdo#103989 +1
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> INCOMPLETE (fi-snb-2520m) fdo#103713

fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:435s
fi-blb-e6850     total:288  pass:223  dwarn:1   dfail:0   fail:0   skip:64  time:382s
fi-bsw-n3050     total:288  pass:242  dwarn:0   dfail:0   fail:0   skip:46  time:520s
fi-bwr-2160      total:288  pass:183  dwarn:0   dfail:0   fail:0   skip:105 time:282s
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:503s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:505s
fi-byt-j1900     total:288  pass:253  dwarn:0   dfail:0   fail:0   skip:35  time:490s
fi-byt-n2820     total:288  pass:249  dwarn:0   dfail:0   fail:0   skip:39  time:472s
fi-elk-e7500     total:224  pass:162  dwarn:15  dfail:0   fail:1   skip:45 
fi-glk-1         total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:538s
fi-hsw-4770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:369s
fi-hsw-4770r     total:288  pass:224  dwarn:0   dfail:0   fail:0   skip:64  time:261s
fi-ivb-3520m     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:475s
fi-ivb-3770      total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:444s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:520s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:473s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:532s
fi-pnv-d510      total:288  pass:222  dwarn:1   dfail:0   fail:0   skip:65  time:592s
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:451s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:544s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:568s
fi-skl-6700k     total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:517s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:508s
fi-snb-2520m     total:245  pass:211  dwarn:0   dfail:0   fail:0   skip:33 
fi-snb-2600      total:288  pass:248  dwarn:0   dfail:0   fail:0   skip:40  time:414s
Blacklisted hosts:
fi-cfl-s2        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:609s
fi-cnl-y         total:197  pass:185  dwarn:0   dfail:0   fail:0   skip:11 
fi-glk-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:489s

0d0fe916f52ad8f05dddab384ae7c90bb62ebac4 drm-tip: 2017y-12m-05d-14h-52m-17s UTC integration manifest
50f3430f4200 drm/i915: Taint (TAINT_WARN) the kernel if the GPU reset fails

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7416/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev4)
  2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
                   ` (5 preceding siblings ...)
  2017-12-05 18:34 ` ✓ Fi.CI.BAT: success for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev4) Patchwork
@ 2017-12-05 21:09 ` Patchwork
  6 siblings, 0 replies; 13+ messages in thread
From: Patchwork @ 2017-12-05 21:09 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev4)
URL   : https://patchwork.freedesktop.org/series/34623/
State : success

== Summary ==

Test drv_suspend:
        Subgroup fence-restore-untiled-hibernate:
                fail       -> SKIP       (shard-snb) fdo#103375 +1
Test kms_frontbuffer_tracking:
        Subgroup fbc-rgb101010-draw-render:
                skip       -> PASS       (shard-snb) fdo#103167
        Subgroup fbc-1p-offscren-pri-shrfb-draw-blt:
                fail       -> PASS       (shard-snb) fdo#101623 +1
Test kms_flip:
        Subgroup vblank-vs-modeset-suspend-interruptible:
                skip       -> PASS       (shard-snb)
Test drv_module_reload:
        Subgroup basic-no-display:
                dmesg-warn -> PASS       (shard-snb) fdo#102707
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> INCOMPLETE (shard-hsw) fdo#103706
Test kms_chv_cursor_fail:
        Subgroup pipe-b-128x128-top-edge:
                incomplete -> PASS       (shard-hsw)
Test prime_mmap_kms:
        Subgroup buffer-sharing:
                skip       -> PASS       (shard-snb)

fdo#103375 https://bugs.freedesktop.org/show_bug.cgi?id=103375
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#102707 https://bugs.freedesktop.org/show_bug.cgi?id=102707
fdo#103706 https://bugs.freedesktop.org/show_bug.cgi?id=103706

shard-hsw        total:2672 pass:1529 dwarn:2   dfail:0   fail:9   skip:1131 time:9254s
shard-snb        total:2679 pass:1308 dwarn:1   dfail:0   fail:10  skip:1360 time:8030s
Blacklisted hosts:
shard-apl        total:2657 pass:1650 dwarn:1   dfail:1   fail:27  skip:977 time:13244s
shard-kbl        total:2679 pass:1792 dwarn:2   dfail:0   fail:26  skip:859 time:10862s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7416/shards.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-12-05 21:09 UTC | newest]

Thread overview: 13+ messages
2017-11-29 13:59 [PATCH] drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails Chris Wilson
2017-11-29 14:05 ` [PATCH v2] " Chris Wilson
2017-11-30 12:24   ` Lofstedt, Marta
2017-12-04 13:41   ` Joonas Lahtinen
2017-12-04 13:45     ` Chris Wilson
2017-12-05 16:56     ` Chris Wilson
2017-12-05 17:06   ` Chris Wilson
2017-11-30 10:02 ` ✗ Fi.CI.BAT: failure for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev2) Patchwork
2017-11-30 14:15 ` Patchwork
2017-12-05 17:26 ` [PATCH v3] drm/i915: Taint (TAINT_WARN) the kernel if the GPU reset fails Chris Wilson
2017-12-05 17:27 ` [PATCH v4] " Chris Wilson
2017-12-05 18:34 ` ✓ Fi.CI.BAT: success for drm/i915: Taint (TAINT_DIE) the kernel if the GPU reset fails (rev4) Patchwork
2017-12-05 21:09 ` ✓ Fi.CI.IGT: " Patchwork
