* [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
@ 2017-12-06 14:19 Chris Wilson
2017-12-06 14:37 ` ✓ Fi.CI.BAT: success for " Patchwork
` (11 more replies)
0 siblings, 12 replies; 19+ messages in thread
From: Chris Wilson @ 2017-12-06 14:19 UTC (permalink / raw)
To: intel-gfx
Since capturing the error state requires fiddling around with the GGTT
to read arbitrary buffers and is itself run under stop_machine(), it
deadlocks the machine (effectively a hard hang) when run in conjunction
with Broxton's VTd workaround to serialize GGTT access.
Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
drivers/gpu/drm/i915/i915_gpu_error.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 48418fb81066..e6c7e8e53815 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1813,6 +1813,10 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
 	if (!i915_modparams.error_capture)
 		return;
+	/* Prevent recursively calling stop_machine() and deadlocking. */
+	if (intel_ggtt_update_needs_vtd_wa(dev_priv))
+		return;
+
 	if (READ_ONCE(dev_priv->gpu_error.first_error))
 		return;
--
2.15.1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply related [flat|nested] 19+ messages in thread
* ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
@ 2017-12-06 14:37 ` Patchwork
2017-12-06 14:43 ` [PATCH] " Daniel Vetter
` (10 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2017-12-06 14:37 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
URL : https://patchwork.freedesktop.org/series/34969/
State : success
== Summary ==
Series 34969v1 drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
https://patchwork.freedesktop.org/api/1.0/series/34969/revisions/1/mbox/
Test debugfs_test:
Subgroup read_all_entries:
dmesg-warn -> PASS (fi-elk-e7500) fdo#103989 +1
Test gem_mmap_gtt:
Subgroup basic-small-bo-tiledx:
pass -> FAIL (fi-gdg-551) fdo#102575
fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:436s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:383s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:523s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:284s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:503s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:509s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:488s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:471s
fi-elk-e7500 total:224 pass:163 dwarn:15 dfail:0 fail:0 skip:45
fi-gdg-551 total:288 pass:178 dwarn:1 dfail:0 fail:1 skip:108 time:266s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:538s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:369s
fi-hsw-4770r total:288 pass:224 dwarn:0 dfail:0 fail:0 skip:64 time:259s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:471s
fi-ivb-3770 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:445s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:528s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:472s
fi-kbl-r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:535s
fi-pnv-d510 total:288 pass:222 dwarn:1 dfail:0 fail:0 skip:65 time:586s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:451s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:540s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:575s
fi-skl-6700k total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:518s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:496s
fi-snb-2520m total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:552s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:415s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:614s
fi-cnl-y total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:633s
fi-glk-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:487s
fi-kbl-7500u failed to connect after reboot
1a0d67efb4cc5611887c79adc5c3315790f78df5 drm-tip: 2017y-12m-06d-00h-51m-07s UTC integration manifest
1cb49b831f1d drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7426/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
2017-12-06 14:37 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-12-06 14:43 ` Daniel Vetter
2017-12-06 14:48 ` Chris Wilson
2017-12-06 15:26 ` ✗ Fi.CI.IGT: warning for " Patchwork
` (9 subsequent siblings)
11 siblings, 1 reply; 19+ messages in thread
From: Daniel Vetter @ 2017-12-06 14:43 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Wed, Dec 06, 2017 at 02:19:03PM +0000, Chris Wilson wrote:
> Since capturing the error state requires fiddling around with the GGTT
> to read arbitrary buffers and is itself run under stop_machine(), it
> deadlocks the machine (effectively a hard hang) when run in conjunction
> with Broxton's VTd workaround to serialize GGTT access.
>
> Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Cc: John Harrison <john.C.Harrison@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> ---
> drivers/gpu/drm/i915/i915_gpu_error.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 48418fb81066..e6c7e8e53815 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1813,6 +1813,10 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
> if (!i915_modparams.error_capture)
> return;
>
> + /* Prevent recursively calling stop_machine() and deadlocking. */
> + if (intel_ggtt_update_needs_vtd_wa(dev_priv))
> + return;
I'd put this closer to the stop_machine() call, at the head of
i915_capture_gpu_state(). If the bogus debug output is annoying, we could
switch that to a PTR_ERR return value, I guess. But this placement is
OK too, so either way:
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> +
> if (READ_ONCE(dev_priv->gpu_error.first_error))
> return;
>
> --
> 2.15.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:43 ` [PATCH] " Daniel Vetter
@ 2017-12-06 14:48 ` Chris Wilson
2017-12-06 14:51 ` Daniel Vetter
0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2017-12-06 14:48 UTC (permalink / raw)
To: Daniel Vetter; +Cc: intel-gfx
Quoting Daniel Vetter (2017-12-06 14:43:39)
> On Wed, Dec 06, 2017 at 02:19:03PM +0000, Chris Wilson wrote:
> > Since capturing the error state requires fiddling around with the GGTT
> > to read arbitrary buffers and is itself run under stop_machine(), it
> > deadlocks the machine (effectively a hard hang) when run in conjunction
> > with Broxton's VTd workaround to serialize GGTT access.
> >
> > Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> > Cc: John Harrison <john.C.Harrison@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gpu_error.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index 48418fb81066..e6c7e8e53815 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -1813,6 +1813,10 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
> > if (!i915_modparams.error_capture)
> > return;
> >
> > + /* Prevent recursively calling stop_machine() and deadlocking. */
> > + if (intel_ggtt_update_needs_vtd_wa(dev_priv))
> > + return;
>
> I'd put this closer to the stop_machine() call, at the head of
> i915_capture_gpu_state(). If the bogus debug output is annoying, we could
> switch that to a PTR_ERR return value, I guess. But this placement is
> OK too, so either way:
I was considering doing some of the capture, skipping the buffers, but
nowadays those buffers tend to be the crux of triaging. My only real concern
is how to explain to the user that the error state cannot exist, for
which we could go and add -ENODEV to sysfs/debugfs just to be clear.
-Chris
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:48 ` Chris Wilson
@ 2017-12-06 14:51 ` Daniel Vetter
0 siblings, 0 replies; 19+ messages in thread
From: Daniel Vetter @ 2017-12-06 14:51 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
On Wed, Dec 06, 2017 at 02:48:36PM +0000, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-12-06 14:43:39)
> > On Wed, Dec 06, 2017 at 02:19:03PM +0000, Chris Wilson wrote:
> > > Since capturing the error state requires fiddling around with the GGTT
> > > to read arbitrary buffers and is itself run under stop_machine(), it
> > > deadlocks the machine (effectively a hard hang) when run in conjunction
> > > with Broxton's VTd workaround to serialize GGTT access.
> > >
> > > Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> > > Cc: John Harrison <john.C.Harrison@intel.com>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > > ---
> > > drivers/gpu/drm/i915/i915_gpu_error.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > index 48418fb81066..e6c7e8e53815 100644
> > > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > > @@ -1813,6 +1813,10 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
> > > if (!i915_modparams.error_capture)
> > > return;
> > >
> > > + /* Prevent recursively calling stop_machine() and deadlocking. */
> > > + if (intel_ggtt_update_needs_vtd_wa(dev_priv))
> > > + return;
> >
> > I'd put this closer to the stop_machine() call, at the head of
> > i915_capture_gpu_state(). If the bogus debug output is annoying, we could
> > switch that to a PTR_ERR return value, I guess. But this placement is
> > OK too, so either way:
>
> I was considering doing some of the capture, skipping the buffers, but
> nowadays those buffers tend to be the crux of triaging. My only real concern
> is how to explain to the user that the error state cannot exist, for
> which we could go and add -ENODEV to sysfs/debugfs just to be clear.
Fancy idea: store the PTR_ERR in ->first_error and return that? That would
address both my bikeshed and your suggestion.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 19+ messages in thread
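The idea floated above, storing an error pointer in first_error so that readers get an errno instead of a silent "no error state", can be modelled in user space roughly as follows. This is only a sketch: err_ptr(), is_err() and ptr_err() are stand-ins written here for the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), and struct gpu_state, capture_error_state(), disable_error_state(), read_error_state() and reset_error_state() are hypothetical names, not the driver's functions. Locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-ins for the kernel's ERR_PTR helpers (illustrative). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	/* Only small negative errno values fit in the reserved range. */
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical stand-in for the captured GPU state. */
struct gpu_state {
	const char *msg;
};

/* Hypothetical stand-in for the first_error slot. */
static struct gpu_state *first_error;

/* Mark capture as unavailable unless the slot is already occupied. */
void disable_error_state(int err)
{
	if (!first_error)
		first_error = err_ptr(err);
}

/* Capture is skipped when the slot holds a state or an error pointer. */
void capture_error_state(const char *msg)
{
	struct gpu_state *error;

	if (first_error)
		return;

	error = malloc(sizeof(*error));
	if (!error) {
		disable_error_state(-ENOMEM);
		return;
	}
	error->msg = msg;
	first_error = error;
}

/*
 * Model of a sysfs/debugfs read: 0 when nothing was recorded, a
 * negative errno when capture is unavailable, 1 when a state exists.
 */
int read_error_state(void)
{
	if (!first_error)
		return 0;
	if (is_err(first_error))
		return (int)ptr_err(first_error);
	return 1;
}

/* Clear the slot; an error pointer must never be freed. */
void reset_error_state(void)
{
	struct gpu_state *error = first_error;

	first_error = NULL;
	if (!is_err(error))
		free(error);
}
```

In this model, calling disable_error_state(-ENODEV) once at probe time makes every later capture_error_state() return early, which is how the sentinel would keep the capture path away from stop_machine(), while a reader sees -ENODEV rather than an empty state.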
* ✗ Fi.CI.IGT: warning for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
2017-12-06 14:37 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-12-06 14:43 ` [PATCH] " Daniel Vetter
@ 2017-12-06 15:26 ` Patchwork
2017-12-06 15:37 ` [PATCH v2] " Chris Wilson
` (8 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2017-12-06 15:26 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
URL : https://patchwork.freedesktop.org/series/34969/
State : warning
== Summary ==
Test kms_plane:
Subgroup plane-position-covered-pipe-c-planes:
skip -> PASS (shard-hsw)
Test kms_frontbuffer_tracking:
Subgroup fbc-1p-offscren-pri-shrfb-draw-blt:
fail -> PASS (shard-snb) fdo#101623 +1
Test kms_rotation_crc:
Subgroup cursor-rotation-180:
pass -> SKIP (shard-snb)
pass -> SKIP (shard-hsw)
Test perf:
Subgroup oa-exponents:
fail -> PASS (shard-hsw) fdo#102254
Test kms_flip:
Subgroup vblank-vs-suspend-interruptible:
incomplete -> PASS (shard-hsw) fdo#100368
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#102254 https://bugs.freedesktop.org/show_bug.cgi?id=102254
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
shard-hsw total:2679 pass:1535 dwarn:1 dfail:0 fail:10 skip:1133 time:9424s
shard-snb total:2679 pass:1308 dwarn:1 dfail:0 fail:11 skip:1359 time:8079s
Blacklisted hosts:
shard-apl total:2679 pass:1677 dwarn:2 dfail:0 fail:23 skip:977 time:13525s
shard-kbl total:2571 pass:1722 dwarn:6 dfail:0 fail:22 skip:820 time:10349s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7426/shards.html
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (2 preceding siblings ...)
2017-12-06 15:26 ` ✗ Fi.CI.IGT: warning for " Patchwork
@ 2017-12-06 15:37 ` Chris Wilson
2017-12-06 17:01 ` Bloomfield, Jon
2017-12-06 16:11 ` ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2) Patchwork
` (7 subsequent siblings)
11 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2017-12-06 15:37 UTC (permalink / raw)
To: intel-gfx; +Cc: Daniel Vetter
Since capturing the error state requires fiddling around with the GGTT
to read arbitrary buffers and is itself run under stop_machine(), it
deadlocks the machine (effectively a hard hang) when run in conjunction
with Broxton's VTd workaround to serialize GGTT access.
v2: Store the ERR_PTR in first_error so that the error can be reported
to the user via sysfs.
Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
drivers/gpu/drm/i915/i915_drv.h | 8 +++++++-
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
drivers/gpu/drm/i915/i915_gpu_error.c | 15 ++++++++++++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 594fd14e66c5..1eca4f954050 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3990,6 +3990,7 @@ static inline void i915_gpu_state_put(struct i915_gpu_state *gpu)
 struct i915_gpu_state *i915_first_error_state(struct drm_i915_private *i915);
 void i915_reset_error_state(struct drm_i915_private *i915);
+void i915_disable_error_state(struct drm_i915_private *i915, int err);
 #else
@@ -4002,13 +4003,18 @@ static inline void i915_capture_error_state(struct drm_i915_private *dev_priv,
 static inline struct i915_gpu_state *
 i915_first_error_state(struct drm_i915_private *i915)
 {
-	return NULL;
+	return ERR_PTR(-ENODEV);
 }
 static inline void i915_reset_error_state(struct drm_i915_private *i915)
 {
 }
+static inline void i915_disable_error_state(struct drm_i915_private *i915,
+					    int err)
+{
+}
+
 #endif
 const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f3c35e826321..0264d88b4cff 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3373,6 +3373,9 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 		ggtt->base.insert_page = bxt_vtd_ggtt_insert_page__BKL;
 		if (ggtt->base.clear_range != nop_clear_range)
 			ggtt->base.clear_range = bxt_vtd_ggtt_clear_range__BKL;
+
+		/* Prevent recursively calling stop_machine() and deadlocking. */
+		i915_disable_error_state(dev_priv, -ENODEV);
 	}
 	ggtt->invalidate = gen6_ggtt_invalidate;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 48418fb81066..0b45d28624b7 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -633,6 +633,9 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 		return 0;
 	}
+	if (IS_ERR(error))
+		return PTR_ERR(error);
+
 	if (*error->error_msg)
 		err_printf(m, "%s\n", error->error_msg);
 	err_printf(m, "Kernel: " UTS_RELEASE "\n");
@@ -1819,6 +1822,7 @@ void i915_capture_error_state(struct drm_i915_private *dev_priv,
 	error = i915_capture_gpu_state(dev_priv);
 	if (!error) {
 		DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
+		i915_disable_error_state(dev_priv, -ENOMEM);
 		return;
 	}
@@ -1874,5 +1878,14 @@ void i915_reset_error_state(struct drm_i915_private *i915)
 	i915->gpu_error.first_error = NULL;
 	spin_unlock_irq(&i915->gpu_error.lock);
-	i915_gpu_state_put(error);
+	if (!IS_ERR(error))
+		i915_gpu_state_put(error);
+}
+
+void i915_disable_error_state(struct drm_i915_private *i915, int err)
+{
+	spin_lock_irq(&i915->gpu_error.lock);
+	if (!i915->gpu_error.first_error)
+		i915->gpu_error.first_error = ERR_PTR(err);
+	spin_unlock_irq(&i915->gpu_error.lock);
 }
--
2.15.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2)
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (3 preceding siblings ...)
2017-12-06 15:37 ` [PATCH v2] " Chris Wilson
@ 2017-12-06 16:11 ` Patchwork
2017-12-06 17:43 ` ✓ Fi.CI.IGT: " Patchwork
` (6 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2017-12-06 16:11 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2)
URL : https://patchwork.freedesktop.org/series/34969/
State : success
== Summary ==
Series 34969v2 drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
https://patchwork.freedesktop.org/api/1.0/series/34969/revisions/2/mbox/
Test debugfs_test:
Subgroup read_all_entries:
dmesg-warn -> DMESG-FAIL (fi-elk-e7500) fdo#103989
Test gem_mmap_gtt:
Subgroup basic-small-bo-tiledx:
fail -> PASS (fi-gdg-551) fdo#102575
fdo#103989 https://bugs.freedesktop.org/show_bug.cgi?id=103989
fdo#102575 https://bugs.freedesktop.org/show_bug.cgi?id=102575
fi-bdw-5557u total:288 pass:267 dwarn:0 dfail:0 fail:0 skip:21 time:438s
fi-blb-e6850 total:288 pass:223 dwarn:1 dfail:0 fail:0 skip:64 time:383s
fi-bsw-n3050 total:288 pass:242 dwarn:0 dfail:0 fail:0 skip:46 time:513s
fi-bwr-2160 total:288 pass:183 dwarn:0 dfail:0 fail:0 skip:105 time:281s
fi-bxt-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:501s
fi-bxt-j4205 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:509s
fi-byt-j1900 total:288 pass:253 dwarn:0 dfail:0 fail:0 skip:35 time:489s
fi-byt-n2820 total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:472s
fi-elk-e7500 total:224 pass:163 dwarn:14 dfail:1 fail:0 skip:45
fi-gdg-551 total:288 pass:179 dwarn:1 dfail:0 fail:0 skip:108 time:272s
fi-glk-1 total:288 pass:260 dwarn:0 dfail:0 fail:0 skip:28 time:538s
fi-hsw-4770 total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:383s
fi-hsw-4770r total:288 pass:224 dwarn:0 dfail:0 fail:0 skip:64 time:261s
fi-ivb-3520m total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:480s
fi-ivb-3770 total:288 pass:259 dwarn:0 dfail:0 fail:0 skip:29 time:451s
fi-kbl-7560u total:288 pass:269 dwarn:0 dfail:0 fail:0 skip:19 time:531s
fi-kbl-7567u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:476s
fi-kbl-r total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:529s
fi-skl-6260u total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:448s
fi-skl-6600u total:288 pass:261 dwarn:0 dfail:0 fail:0 skip:27 time:544s
fi-skl-6700hq total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:567s
fi-skl-6700k total:288 pass:264 dwarn:0 dfail:0 fail:0 skip:24 time:517s
fi-skl-6770hq total:288 pass:268 dwarn:0 dfail:0 fail:0 skip:20 time:499s
fi-snb-2520m total:288 pass:249 dwarn:0 dfail:0 fail:0 skip:39 time:550s
fi-snb-2600 total:288 pass:248 dwarn:0 dfail:0 fail:0 skip:40 time:419s
Blacklisted hosts:
fi-cfl-s2 total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:631s
fi-cnl-y total:288 pass:262 dwarn:0 dfail:0 fail:0 skip:26 time:619s
fi-glk-dsi total:288 pass:258 dwarn:0 dfail:0 fail:0 skip:30 time:491s
01b30547063a8ba25114041e6caf41fc98ea7ddb drm-tip: 2017y-12m-06d-15h-18m-33s UTC integration manifest
fce7cec98532 drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7429/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 15:37 ` [PATCH v2] " Chris Wilson
@ 2017-12-06 17:01 ` Bloomfield, Jon
2017-12-06 17:25 ` Bloomfield, Jon
0 siblings, 1 reply; 19+ messages in thread
From: Bloomfield, Jon @ 2017-12-06 17:01 UTC (permalink / raw)
To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter
> -----Original Message-----
> From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> Sent: Wednesday, December 6, 2017 7:38 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson <chris@chris-wilson.co.uk>; Bloomfield, Jon
> <jon.bloomfield@intel.com>; Harrison, John C <john.c.harrison@intel.com>;
> Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; Joonas Lahtinen
> <joonas.lahtinen@linux.intel.com>; Daniel Vetter <daniel.vetter@ffwll.ch>
> Subject: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a
> and error capture
>
> Since capturing the error state requires fiddling around with the GGTT
> to read arbitrary buffers and is itself run under stop_machine(), it
> deadlocks the machine (effectively a hard hang) when run in conjunction
> with Broxton's VTd workaround to serialize GGTT access.
>
> v2: Store the ERR_PTR in first_error so that the error can be reported
> to the user via sysfs.
>
> Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> Cc: John Harrison <john.C.Harrison@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
It's a real shame to lose error capture on BXT. Can we wrap stop_machine() to
make it recursive?
Something like...
static cpumask_t sm_mask;

struct sm_args {
	cpu_stop_fn_t fn;
	void *data;
};

/* Runs with the other CPUs stopped; signature matches cpu_stop_fn_t. */
static int do_recursive_stop(void *sm_arg_data)
{
	struct sm_args *args = sm_arg_data;
	int ret;

	/* We're stopped - flag the fact to prevent recursion */
	cpumask_set_cpu(smp_processor_id(), &sm_mask);

	ret = args->fn(args->data);

	/* Re-enable recursion */
	cpumask_clear_cpu(smp_processor_id(), &sm_mask);

	return ret;
}

int recursive_stop_machine(cpu_stop_fn_t fn, void *data)
{
	struct sm_args args = { fn, data };

	/* We were already stopped, so can just call directly */
	if (cpumask_test_cpu(smp_processor_id(), &sm_mask))
		return fn(data);

	/* Our CPU is not currently stopped */
	return stop_machine(do_recursive_stop, &args, NULL);
}
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 17:01 ` Bloomfield, Jon
@ 2017-12-06 17:25 ` Bloomfield, Jon
0 siblings, 0 replies; 19+ messages in thread
From: Bloomfield, Jon @ 2017-12-06 17:25 UTC (permalink / raw)
To: Bloomfield, Jon, Chris Wilson, intel-gfx; +Cc: Daniel Vetter
> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf
> Of Bloomfield, Jon
> Sent: Wednesday, December 6, 2017 9:01 AM
> To: Chris Wilson <chris@chris-wilson.co.uk>; intel-gfx@lists.freedesktop.org
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Subject: Re: [Intel-gfx] [PATCH v2] drm/i915: Prevent machine hang from
> Broxton's vtd w/a and error capture
>
> > -----Original Message-----
> > From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> > Sent: Wednesday, December 6, 2017 7:38 AM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>; Bloomfield, Jon
> > <jon.bloomfield@intel.com>; Harrison, John C
> <john.c.harrison@intel.com>;
> > Ursulin, Tvrtko <tvrtko.ursulin@intel.com>; Joonas Lahtinen
> > <joonas.lahtinen@linux.intel.com>; Daniel Vetter <daniel.vetter@ffwll.ch>
> > Subject: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd
> w/a
> > and error capture
> >
> > Since capturing the error state requires fiddling around with the GGTT
> > to read arbitrary buffers and is itself run under stop_machine(), it
> > deadlocks the machine (effectively a hard hang) when run in conjunction
> > with Broxton's VTd workaround to serialize GGTT access.
> >
> > v2: Store the ERR_PTR in first_error so that the error can be reported
> > to the user via sysfs.
> >
> > Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Jon Bloomfield <jon.bloomfield@intel.com>
> > Cc: John Harrison <john.C.Harrison@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> It's a real shame to lose error capture on BXT. Can we wrap stop_machine() to
> make it recursive?
>
> Something like...
>
> static cpumask_t sm_mask;
>
> struct sm_args {
> 	cpu_stop_fn_t fn;
> 	void *data;
> };
>
> /* Runs with the other CPUs stopped; signature matches cpu_stop_fn_t. */
> static int do_recursive_stop(void *sm_arg_data)
> {
> 	struct sm_args *args = sm_arg_data;
> 	int ret;
>
> 	/* We're stopped - flag the fact to prevent recursion */
> 	cpumask_set_cpu(smp_processor_id(), &sm_mask);
>
> 	ret = args->fn(args->data);
>
> 	/* Re-enable recursion */
> 	cpumask_clear_cpu(smp_processor_id(), &sm_mask);
>
> 	return ret;
> }
>
> int recursive_stop_machine(cpu_stop_fn_t fn, void *data)
> {
> 	struct sm_args args = { fn, data };
>
> 	/* We were already stopped, so can just call directly */
> 	if (cpumask_test_cpu(smp_processor_id(), &sm_mask))
> 		return fn(data);
>
> 	/* Our CPU is not currently stopped */
> 	return stop_machine(do_recursive_stop, &args, NULL);
> }
... I think a single bool is sufficient in place of the cpumask, since it is set and cleared
within stop_machine - I started out trying to set/clear outside.
^ permalink raw reply [flat|nested] 19+ messages in thread
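The single-bool variant suggested above can be tried out in user space with a mock in place of stop_machine(). This is a rough model only: mock_stop_machine(), mock_stop_machine_entries(), sm_stopped, ggtt_update() and error_capture() are hypothetical names invented for the sketch, the mock runs the callback on the calling thread, and nothing here exercises real CPU stopping. The assumption carried over from the mail is that the flag is written only while the machine is stopped.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the kernel's cpu_stop_fn_t. */
typedef int (*stop_fn_t)(void *data);

static int mock_depth;   /* current nesting inside the mock */
static int mock_entries; /* total calls into the mock */

/*
 * Mock of stop_machine(). A nested call would hang the real machine;
 * the mock asserts instead so that a recursion bug is visible.
 */
int mock_stop_machine(stop_fn_t fn, void *data)
{
	int ret;

	assert(mock_depth == 0);
	mock_depth++;
	mock_entries++;
	ret = fn(data);
	mock_depth--;
	return ret;
}

int mock_stop_machine_entries(void)
{
	return mock_entries;
}

struct sm_args {
	stop_fn_t fn;
	void *data;
};

/* Only written while "stopped", hence a plain bool in this model. */
static bool sm_stopped;

static int do_recursive_stop(void *sm_arg_data)
{
	struct sm_args *args = sm_arg_data;
	int ret;

	sm_stopped = true;
	ret = args->fn(args->data);
	sm_stopped = false;
	return ret;
}

int recursive_stop_machine(stop_fn_t fn, void *data)
{
	struct sm_args args = { fn, data };

	/* Already stopped: call directly rather than nesting. */
	if (sm_stopped)
		return fn(data);

	return mock_stop_machine(do_recursive_stop, &args);
}

/* Inner work, standing in for a serialized GGTT update. */
int ggtt_update(void *data)
{
	++*(int *)data;
	return 0;
}

/* Outer work, standing in for error capture that nests a GGTT update. */
int error_capture(void *data)
{
	return recursive_stop_machine(ggtt_update, data);
}
```

Calling recursive_stop_machine(error_capture, &counter) enters the mock once and runs the nested update directly, so the assert in mock_stop_machine() never fires. Whether a plain bool is sufficient in the real driver depends on which CPU executes the callback and on memory ordering, which this model does not cover.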
* ✓ Fi.CI.IGT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2)
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (4 preceding siblings ...)
2017-12-06 16:11 ` ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2) Patchwork
@ 2017-12-06 17:43 ` Patchwork
2018-10-11 11:21 ` ✗ Fi.CI.BAT: failure " Patchwork
` (5 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2017-12-06 17:43 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2)
URL : https://patchwork.freedesktop.org/series/34969/
State : success
== Summary ==
Test kms_frontbuffer_tracking:
Subgroup fbc-1p-offscren-pri-shrfb-draw-render:
pass -> FAIL (shard-snb) fdo#101623
Test kms_flip:
Subgroup vblank-vs-dpms-suspend-interruptible:
incomplete -> PASS (shard-hsw) fdo#103706 +1
Subgroup vblank-vs-modeset-suspend:
skip -> PASS (shard-hsw)
Test kms_setmode:
Subgroup basic:
pass -> FAIL (shard-hsw) fdo#99912
fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#103706 https://bugs.freedesktop.org/show_bug.cgi?id=103706
fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912
shard-hsw total:2527 pass:1455 dwarn:1 dfail:0 fail:10 skip:1061 time:8932s
shard-snb total:2679 pass:1308 dwarn:1 dfail:0 fail:12 skip:1358 time:8101s
Blacklisted hosts:
shard-apl total:2679 pass:1676 dwarn:2 dfail:0 fail:24 skip:977 time:13526s
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7429/shards.html
^ permalink raw reply [flat|nested] 19+ messages in thread
* ✗ Fi.CI.BAT: failure for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2)
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (5 preceding siblings ...)
2017-12-06 17:43 ` ✓ Fi.CI.IGT: " Patchwork
@ 2018-10-11 11:21 ` Patchwork
2018-10-11 11:37 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (4 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-10-11 11:21 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2)
URL : https://patchwork.freedesktop.org/series/34969/
State : failure
== Summary ==
Applying: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
Using index info to reconstruct a base tree...
M drivers/gpu/drm/i915/i915_drv.h
M drivers/gpu/drm/i915/i915_gem_gtt.c
M drivers/gpu/drm/i915/i915_gpu_error.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/i915_gpu_error.c
Auto-merging drivers/gpu/drm/i915/i915_gem_gtt.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/i915_gem_gtt.c
Auto-merging drivers/gpu/drm/i915/i915_drv.h
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/i915_drv.h
error: Failed to merge in the changes.
Patch failed at 0001 drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7429/
* [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (6 preceding siblings ...)
2018-10-11 11:21 ` ✗ Fi.CI.BAT: failure " Patchwork
@ 2018-10-11 11:37 ` Chris Wilson
2018-10-11 22:03 ` kbuild test robot
2018-10-12 1:13 ` kbuild test robot
2018-10-11 11:43 ` ✗ Fi.CI.BAT: failure for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev3) Patchwork
` (3 subsequent siblings)
11 siblings, 2 replies; 19+ messages in thread
From: Chris Wilson @ 2018-10-11 11:37 UTC (permalink / raw)
To: intel-gfx
Since capturing the error state requires fiddling around with the GGTT
to read arbitrary buffers and is itself run under stop_machine(), it
deadlocks the machine (effectively a hard hang) when run in conjunction
with Broxton's VTd workaround to serialize GGTT access.
v2: Store the ERR_PTR in first_error so that the error can be reported
to the user via sysfs.
Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
drivers/gpu/drm/i915/i915_gpu_error.c | 15 ++++++++++++++-
drivers/gpu/drm/i915/i915_gpu_error.h | 8 +++++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 29ca9007a704..47b003daa6f3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3339,6 +3339,9 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
if (ggtt->vm.clear_range != nop_clear_range)
ggtt->vm.clear_range = bxt_vtd_ggtt_clear_range__BKL;
+
+ /* Prevent recursively calling stop_machine() and deadlocks. */
+ i915_disable_error_state(dev_priv, -ENODEV);
}
ggtt->invalidate = gen6_ggtt_invalidate;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index c8d8f79688a8..f5b9914e9c6d 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -648,6 +648,9 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
return 0;
}
+ if (IS_ERR(error))
+ return PTR_ERR(error);
+
if (*error->error_msg)
err_printf(m, "%s\n", error->error_msg);
err_printf(m, "Kernel: " UTS_RELEASE "\n");
@@ -1867,6 +1870,7 @@ void i915_capture_error_state(struct drm_i915_private *i915,
error = i915_capture_gpu_state(i915);
if (!error) {
DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
+ i915_disable_error_state(dev_priv, -ENOMEM);
return;
}
@@ -1922,5 +1926,14 @@ void i915_reset_error_state(struct drm_i915_private *i915)
i915->gpu_error.first_error = NULL;
spin_unlock_irq(&i915->gpu_error.lock);
- i915_gpu_state_put(error);
+ if (!IS_ERR(error))
+ i915_gpu_state_put(error);
+}
+
+void i915_disable_error_state(struct drm_i915_private *i915, int err)
+{
+ spin_lock_irq(&i915->gpu_error.lock);
+ if (!i915->gpu_error.first_error)
+ i915->gpu_error.first_error = ERR_PTR(err);
+ spin_unlock_irq(&i915->gpu_error.lock);
}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 8710fb18ed74..3ec89a504de5 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -343,6 +343,7 @@ static inline void i915_gpu_state_put(struct i915_gpu_state *gpu)
struct i915_gpu_state *i915_first_error_state(struct drm_i915_private *i915);
void i915_reset_error_state(struct drm_i915_private *i915);
+void i915_disable_error_state(struct drm_i915_private *i915, int err);
#else
@@ -355,13 +356,18 @@ static inline void i915_capture_error_state(struct drm_i915_private *dev_priv,
static inline struct i915_gpu_state *
i915_first_error_state(struct drm_i915_private *i915)
{
- return NULL;
+ return ERR_PTR(-ENODEV);
}
static inline void i915_reset_error_state(struct drm_i915_private *i915)
{
}
+static inline void i915_disable_error_state(struct drm_i915_private *i915,
+ int err)
+{
+}
+
#endif /* IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) */
#endif /* _I915_GPU_ERROR_H_ */
--
2.19.1
* ✗ Fi.CI.BAT: failure for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev3)
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (7 preceding siblings ...)
2018-10-11 11:37 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
@ 2018-10-11 11:43 ` Patchwork
2018-10-11 11:51 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (2 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-10-11 11:43 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev3)
URL : https://patchwork.freedesktop.org/series/34969/
State : failure
== Summary ==
CALL scripts/checksyscalls.sh
DESCEND objtool
CHK include/generated/compile.h
CC [M] drivers/gpu/drm/i915/i915_gpu_error.o
drivers/gpu/drm/i915/i915_gpu_error.c: In function ‘i915_capture_error_state’:
drivers/gpu/drm/i915/i915_gpu_error.c:1873:28: error: ‘dev_priv’ undeclared (first use in this function); did you mean ‘dev_crit’?
i915_disable_error_state(dev_priv, -ENOMEM);
^~~~~~~~
dev_crit
drivers/gpu/drm/i915/i915_gpu_error.c:1873:28: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:305: recipe for target 'drivers/gpu/drm/i915/i915_gpu_error.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_gpu_error.o] Error 1
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1050: recipe for target 'drivers' failed
make: *** [drivers] Error 2
* [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (8 preceding siblings ...)
2018-10-11 11:43 ` ✗ Fi.CI.BAT: failure for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev3) Patchwork
@ 2018-10-11 11:51 ` Chris Wilson
2018-10-11 12:29 ` ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev4) Patchwork
2018-10-11 17:13 ` ✓ Fi.CI.IGT: " Patchwork
11 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2018-10-11 11:51 UTC (permalink / raw)
To: intel-gfx
Since capturing the error state requires fiddling around with the GGTT
to read arbitrary buffers and is itself run under stop_machine(), it
deadlocks the machine (effectively a hard hang) when run in conjunction
with Broxton's VTd workaround to serialize GGTT access.
v2: Store the ERR_PTR in first_error so that the error can be reported
to the user via sysfs.
Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: John Harrison <john.C.Harrison@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
drivers/gpu/drm/i915/i915_gpu_error.c | 15 ++++++++++++++-
drivers/gpu/drm/i915/i915_gpu_error.h | 8 +++++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 29ca9007a704..47b003daa6f3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3339,6 +3339,9 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
ggtt->vm.insert_page = bxt_vtd_ggtt_insert_page__BKL;
if (ggtt->vm.clear_range != nop_clear_range)
ggtt->vm.clear_range = bxt_vtd_ggtt_clear_range__BKL;
+
+ /* Prevent recursively calling stop_machine() and deadlocks. */
+ i915_disable_error_state(dev_priv, -ENODEV);
}
ggtt->invalidate = gen6_ggtt_invalidate;
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index c8d8f79688a8..21b5c8765015 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -648,6 +648,9 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
return 0;
}
+ if (IS_ERR(error))
+ return PTR_ERR(error);
+
if (*error->error_msg)
err_printf(m, "%s\n", error->error_msg);
err_printf(m, "Kernel: " UTS_RELEASE "\n");
@@ -1867,6 +1870,7 @@ void i915_capture_error_state(struct drm_i915_private *i915,
error = i915_capture_gpu_state(i915);
if (!error) {
DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
+ i915_disable_error_state(i915, -ENOMEM);
return;
}
@@ -1922,5 +1926,14 @@ void i915_reset_error_state(struct drm_i915_private *i915)
i915->gpu_error.first_error = NULL;
spin_unlock_irq(&i915->gpu_error.lock);
- i915_gpu_state_put(error);
+ if (!IS_ERR(error))
+ i915_gpu_state_put(error);
+}
+
+void i915_disable_error_state(struct drm_i915_private *i915, int err)
+{
+ spin_lock_irq(&i915->gpu_error.lock);
+ if (!i915->gpu_error.first_error)
+ i915->gpu_error.first_error = ERR_PTR(err);
+ spin_unlock_irq(&i915->gpu_error.lock);
}
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 8710fb18ed74..3ec89a504de5 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -343,6 +343,7 @@ static inline void i915_gpu_state_put(struct i915_gpu_state *gpu)
struct i915_gpu_state *i915_first_error_state(struct drm_i915_private *i915);
void i915_reset_error_state(struct drm_i915_private *i915);
+void i915_disable_error_state(struct drm_i915_private *i915, int err);
#else
@@ -355,13 +356,18 @@ static inline void i915_capture_error_state(struct drm_i915_private *dev_priv,
static inline struct i915_gpu_state *
i915_first_error_state(struct drm_i915_private *i915)
{
- return NULL;
+ return ERR_PTR(-ENODEV);
}
static inline void i915_reset_error_state(struct drm_i915_private *i915)
{
}
+static inline void i915_disable_error_state(struct drm_i915_private *i915,
+ int err)
+{
+}
+
#endif /* IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR) */
#endif /* _I915_GPU_ERROR_H_ */
--
2.19.1
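
A note on the v2 approach in the patch above: it parks an ERR_PTR() in gpu_error.first_error so that later captures are skipped and readers get an errno back instead of a dump. What follows is a minimal userspace sketch of that sentinel pattern, not the driver code. The ERR_PTR/IS_ERR/PTR_ERR helpers are simplified stand-ins for the kernel's, struct gpu_state and the global first_error slot are hypothetical, and the spin_lock_irq() the patch takes around the slot is omitted.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers.
 * Assumption: the top MAX_ERRNO addresses are never valid pointers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(intptr_t err) { return (void *)err; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, stripped-down stand-in for struct i915_gpu_state. */
struct gpu_state {
	char msg[64];
};

/* Hypothetical stand-in for i915->gpu_error.first_error (no locking). */
static struct gpu_state *first_error;

/* Park an error code in the slot unless something is already there. */
static void disable_error_state(int err)
{
	if (!first_error)
		first_error = ERR_PTR(err);
}

/* Capture is skipped whenever the slot is occupied, sentinel included.
 * Returns 1 if a new state was stored, 0 otherwise. */
static int capture_error_state(const char *msg)
{
	struct gpu_state *error;

	if (first_error)
		return 0;

	error = calloc(1, sizeof(*error));
	if (!error) {
		disable_error_state(-ENOMEM);
		return 0;
	}

	snprintf(error->msg, sizeof(error->msg), "%s", msg);
	first_error = error;
	return 1;
}

/* Reader side: report the parked errno instead of dereferencing it. */
static long error_state_to_str(char *buf, size_t len)
{
	if (!first_error)
		return 0;
	if (IS_ERR(first_error))
		return PTR_ERR(first_error);

	return snprintf(buf, len, "%s\n", first_error->msg);
}

/* Reset clears the slot but must not free the sentinel. */
static void reset_error_state(void)
{
	struct gpu_state *error = first_error;

	first_error = NULL;
	if (!IS_ERR(error))
		free(error);
}
```

In this sketch, calling disable_error_state(-ENODEV) once at probe time makes every later capture_error_state() a no-op and makes the reader return -ENODEV, which mirrors what the patch intends for the Broxton VT-d case. Reusing the existing slot avoids adding a separate "disabled" flag, at the cost of every consumer of first_error having to check IS_ERR() before touching it.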
* ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev4)
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (9 preceding siblings ...)
2018-10-11 11:51 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
@ 2018-10-11 12:29 ` Patchwork
2018-10-11 17:13 ` ✓ Fi.CI.IGT: " Patchwork
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-10-11 12:29 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev4)
URL : https://patchwork.freedesktop.org/series/34969/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4969 -> Patchwork_10427 =
== Summary - SUCCESS ==
No regressions found.
External URL: https://patchwork.freedesktop.org/api/1.0/series/34969/revisions/4/mbox/
== Known issues ==
Here are the changes found in Patchwork_10427 that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@kms_flip@basic-flip-vs-modeset:
fi-hsw-4770r: PASS -> DMESG-WARN (fdo#105602) +1
igt@kms_frontbuffer_tracking@basic:
{fi-icl-u2}: SKIP -> FAIL (fdo#103167)
fi-byt-clapper: PASS -> FAIL (fdo#103167)
igt@kms_pipe_crc_basic@read-crc-pipe-b:
fi-byt-clapper: PASS -> FAIL (fdo#107362)
==== Possible fixes ====
igt@gem_exec_suspend@basic-s3:
fi-cfl-8109u: INCOMPLETE (fdo#107187, fdo#108126) -> PASS
igt@kms_chamelium@dp-edid-read:
fi-kbl-7500u: WARN (fdo#102672) -> PASS
igt@kms_pipe_crc_basic@read-crc-pipe-b-frame-sequence:
fi-byt-clapper: FAIL (fdo#107362, fdo#103191) -> PASS +1
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
fdo#102672 https://bugs.freedesktop.org/show_bug.cgi?id=102672
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
fdo#105602 https://bugs.freedesktop.org/show_bug.cgi?id=105602
fdo#107187 https://bugs.freedesktop.org/show_bug.cgi?id=107187
fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362
fdo#108126 https://bugs.freedesktop.org/show_bug.cgi?id=108126
== Participating hosts (44 -> 39) ==
Missing (5): fi-bsw-cyan fi-ilk-m540 fi-byt-squawks fi-gdg-551 fi-pnv-d510
== Build changes ==
* Linux: CI_DRM_4969 -> Patchwork_10427
CI_DRM_4969: 1121d2889e57dedacc0885deaaa9de614832e62f @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4673: 54cb1aeb4e50dea9f3abae632e317875d147c4ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_10427: ea61179bb5e951736f94721fa7359e98e78a3906 @ git://anongit.freedesktop.org/gfx-ci/linux
== Linux commits ==
ea61179bb5e9 drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10427/issues.html
* ✓ Fi.CI.IGT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev4)
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
` (10 preceding siblings ...)
2018-10-11 12:29 ` ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev4) Patchwork
@ 2018-10-11 17:13 ` Patchwork
11 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2018-10-11 17:13 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev4)
URL : https://patchwork.freedesktop.org/series/34969/
State : success
== Summary ==
= CI Bug Log - changes from CI_DRM_4969_full -> Patchwork_10427_full =
== Summary - WARNING ==
Minor unknown changes coming with Patchwork_10427_full need to be verified
manually.
If you think the reported changes have nothing to do with the changes
introduced in Patchwork_10427_full, please notify your bug team to allow them
to document this new failure mode, which will reduce false positives in CI.
== Possible new issues ==
Here are the unknown changes that may have been introduced in Patchwork_10427_full:
=== IGT changes ===
==== Warnings ====
igt@perf_pmu@rc6:
shard-kbl: PASS -> SKIP
igt@pm_rc6_residency@rc6-accuracy:
shard-snb: SKIP -> PASS
== Known issues ==
Here are the changes found in Patchwork_10427_full that come from known issues:
=== IGT changes ===
==== Issues hit ====
igt@drv_hangman@error-state-capture-render:
shard-glk: PASS -> INCOMPLETE (k.org#198133, fdo#103359)
igt@gem_exec_schedule@pi-ringfull-blt:
shard-skl: NOTRUN -> FAIL (fdo#103158) +2
igt@gem_ppgtt@blt-vs-render-ctxn:
shard-skl: NOTRUN -> TIMEOUT (fdo#108039)
igt@gem_userptr_blits@readonly-unsync:
shard-skl: NOTRUN -> INCOMPLETE (fdo#108074)
igt@kms_available_modes_crc@available_mode_test_crc:
shard-apl: PASS -> FAIL (fdo#106641)
igt@kms_busy@extended-pageflip-hang-newfb-render-a:
shard-hsw: PASS -> DMESG-WARN (fdo#102614)
igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-c:
shard-skl: NOTRUN -> DMESG-WARN (fdo#107956)
igt@kms_cursor_crc@cursor-128x128-random:
shard-apl: PASS -> FAIL (fdo#103232)
igt@kms_cursor_crc@cursor-64x64-dpms:
shard-glk: PASS -> FAIL (fdo#103232) +1
igt@kms_draw_crc@draw-method-xrgb2101010-pwrite-xtiled:
shard-skl: PASS -> FAIL (fdo#103184)
igt@kms_fbcon_fbt@psr:
shard-skl: NOTRUN -> FAIL (fdo#107882)
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-cpu:
shard-skl: NOTRUN -> FAIL (fdo#103167)
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-fullscreen:
shard-apl: PASS -> FAIL (fdo#103167) +1
igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-mmap-cpu:
shard-glk: PASS -> FAIL (fdo#103167) +2
igt@kms_frontbuffer_tracking@fbc-stridechange:
shard-skl: NOTRUN -> FAIL (fdo#105683)
igt@kms_panel_fitting@legacy:
shard-skl: NOTRUN -> FAIL (fdo#105456)
igt@kms_pipe_crc_basic@read-crc-pipe-c:
shard-skl: NOTRUN -> FAIL (fdo#107362) +1
igt@kms_plane@plane-position-covered-pipe-c-planes:
shard-glk: PASS -> FAIL (fdo#103166) +1
{igt@kms_plane_alpha_blend@pipe-b-coverage-7efc}:
shard-skl: NOTRUN -> FAIL (fdo#108146)
{igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min}:
shard-skl: NOTRUN -> FAIL (fdo#108145) +1
igt@kms_plane_multiple@atomic-pipe-a-tiling-y:
shard-apl: PASS -> FAIL (fdo#103166) +1
igt@perf_pmu@rc6-runtime-pm:
shard-glk: PASS -> FAIL (fdo#105010)
shard-apl: PASS -> FAIL (fdo#105010)
==== Possible fixes ====
igt@gem_exec_await@wide-contexts:
shard-glk: DMESG-FAIL (fdo#106680) -> PASS
igt@kms_cursor_crc@cursor-128x128-dpms:
shard-apl: FAIL (fdo#103232) -> PASS +1
igt@kms_flip@flip-vs-expired-vblank:
shard-kbl: FAIL (fdo#105363, fdo#102887) -> PASS
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-mmap-cpu:
shard-glk: FAIL (fdo#103167) -> PASS +3
igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-mmap-wc:
shard-apl: FAIL (fdo#103167) -> PASS
{igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max}:
shard-glk: FAIL (fdo#108145) -> PASS
igt@pm_rpm@dpms-non-lpsp:
shard-skl: INCOMPLETE (fdo#107807) -> SKIP
==== Warnings ====
igt@kms_vblank@pipe-b-wait-forked:
shard-snb: DMESG-WARN (fdo#107469) -> INCOMPLETE (fdo#105411)
{name}: This element is suppressed. This means it is ignored when computing
the status of the difference (SUCCESS, WARNING, or FAILURE).
fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103158 https://bugs.freedesktop.org/show_bug.cgi?id=103158
fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#103184 https://bugs.freedesktop.org/show_bug.cgi?id=103184
fdo#103232 https://bugs.freedesktop.org/show_bug.cgi?id=103232
fdo#103359 https://bugs.freedesktop.org/show_bug.cgi?id=103359
fdo#105010 https://bugs.freedesktop.org/show_bug.cgi?id=105010
fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363
fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
fdo#105456 https://bugs.freedesktop.org/show_bug.cgi?id=105456
fdo#105683 https://bugs.freedesktop.org/show_bug.cgi?id=105683
fdo#106641 https://bugs.freedesktop.org/show_bug.cgi?id=106641
fdo#106680 https://bugs.freedesktop.org/show_bug.cgi?id=106680
fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362
fdo#107469 https://bugs.freedesktop.org/show_bug.cgi?id=107469
fdo#107807 https://bugs.freedesktop.org/show_bug.cgi?id=107807
fdo#107882 https://bugs.freedesktop.org/show_bug.cgi?id=107882
fdo#107956 https://bugs.freedesktop.org/show_bug.cgi?id=107956
fdo#108039 https://bugs.freedesktop.org/show_bug.cgi?id=108039
fdo#108074 https://bugs.freedesktop.org/show_bug.cgi?id=108074
fdo#108145 https://bugs.freedesktop.org/show_bug.cgi?id=108145
fdo#108146 https://bugs.freedesktop.org/show_bug.cgi?id=108146
k.org#198133 https://bugzilla.kernel.org/show_bug.cgi?id=198133
== Participating hosts (6 -> 6) ==
No changes in participating hosts
== Build changes ==
* Linux: CI_DRM_4969 -> Patchwork_10427
CI_DRM_4969: 1121d2889e57dedacc0885deaaa9de614832e62f @ git://anongit.freedesktop.org/gfx-ci/linux
IGT_4673: 54cb1aeb4e50dea9f3abae632e317875d147c4ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
Patchwork_10427: ea61179bb5e951736f94721fa7359e98e78a3906 @ git://anongit.freedesktop.org/gfx-ci/linux
piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit
== Logs ==
For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10427/shards.html
* Re: [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2018-10-11 11:37 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
@ 2018-10-11 22:03 ` kbuild test robot
2018-10-12 1:13 ` kbuild test robot
1 sibling, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2018-10-11 22:03 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, kbuild-all
[-- Attachment #1: Type: text/plain, Size: 3795 bytes --]
Hi Chris,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc7 next-20181011]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Prevent-machine-hang-from-Broxton-s-vtd-w-a-and-error-capture/20181012-053134
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x019-201840 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
drivers/gpu/drm/i915/i915_gpu_error.c: In function 'i915_capture_error_state':
>> drivers/gpu/drm/i915/i915_gpu_error.c:1827:28: error: 'dev_priv' undeclared (first use in this function); did you mean 'dev_crit'?
i915_disable_error_state(dev_priv, -ENOMEM);
^~~~~~~~
dev_crit
drivers/gpu/drm/i915/i915_gpu_error.c:1827:28: note: each undeclared identifier is reported only once for each function it appears in
vim +1827 drivers/gpu/drm/i915/i915_gpu_error.c
1798
1799 /**
1800 * i915_capture_error_state - capture an error record for later analysis
1801 * @i915: i915 device
1802 * @engine_mask: the mask of engines triggering the hang
1803 * @error_msg: a message to insert into the error capture header
1804 *
1805 * Should be called when an error is detected (either a hang or an error
1806 * interrupt) to capture error state from the time of the error. Fills
1807 * out a structure which becomes available in debugfs for user level tools
1808 * to pick up.
1809 */
1810 void i915_capture_error_state(struct drm_i915_private *i915,
1811 u32 engine_mask,
1812 const char *error_msg)
1813 {
1814 static bool warned;
1815 struct i915_gpu_state *error;
1816 unsigned long flags;
1817
1818 if (!i915_modparams.error_capture)
1819 return;
1820
1821 if (READ_ONCE(i915->gpu_error.first_error))
1822 return;
1823
1824 error = i915_capture_gpu_state(i915);
1825 if (!error) {
1826 DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
> 1827 i915_disable_error_state(dev_priv, -ENOMEM);
1828 return;
1829 }
1830
1831 i915_error_capture_msg(i915, error, engine_mask, error_msg);
1832 DRM_INFO("%s\n", error->error_msg);
1833
1834 if (!error->simulated) {
1835 spin_lock_irqsave(&i915->gpu_error.lock, flags);
1836 if (!i915->gpu_error.first_error) {
1837 i915->gpu_error.first_error = error;
1838 error = NULL;
1839 }
1840 spin_unlock_irqrestore(&i915->gpu_error.lock, flags);
1841 }
1842
1843 if (error) {
1844 __i915_gpu_state_free(&error->ref);
1845 return;
1846 }
1847
1848 if (!warned &&
1849 ktime_get_real_seconds() - DRIVER_TIMESTAMP < DAY_AS_SECONDS(180)) {
1850 DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
1851 DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
1852 DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
1853 DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n");
1854 DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n",
1855 i915->drm.primary->index);
1856 warned = true;
1857 }
1858 }
1859
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30015 bytes --]
[-- Attachment #3: Type: text/plain, Size: 160 bytes --]
* Re: [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
2018-10-11 11:37 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
2018-10-11 22:03 ` kbuild test robot
@ 2018-10-12 1:13 ` kbuild test robot
1 sibling, 0 replies; 19+ messages in thread
From: kbuild test robot @ 2018-10-12 1:13 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, kbuild-all
[-- Attachment #1: Type: text/plain, Size: 3753 bytes --]
Hi Chris,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc7 next-20181011]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-Prevent-machine-hang-from-Broxton-s-vtd-w-a-and-error-capture/20181012-053134
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-s1-10111203 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
drivers/gpu//drm/i915/i915_gpu_error.c: In function 'i915_capture_error_state':
>> drivers/gpu//drm/i915/i915_gpu_error.c:1827:28: error: 'dev_priv' undeclared (first use in this function)
i915_disable_error_state(dev_priv, -ENOMEM);
^~~~~~~~
drivers/gpu//drm/i915/i915_gpu_error.c:1827:28: note: each undeclared identifier is reported only once for each function it appears in
vim +/dev_priv +1827 drivers/gpu//drm/i915/i915_gpu_error.c
1798
1799 /**
1800 * i915_capture_error_state - capture an error record for later analysis
1801 * @i915: i915 device
1802 * @engine_mask: the mask of engines triggering the hang
1803 * @error_msg: a message to insert into the error capture header
1804 *
1805 * Should be called when an error is detected (either a hang or an error
1806 * interrupt) to capture error state from the time of the error. Fills
1807 * out a structure which becomes available in debugfs for user level tools
1808 * to pick up.
1809 */
1810 void i915_capture_error_state(struct drm_i915_private *i915,
1811 u32 engine_mask,
1812 const char *error_msg)
1813 {
1814 static bool warned;
1815 struct i915_gpu_state *error;
1816 unsigned long flags;
1817
1818 if (!i915_modparams.error_capture)
1819 return;
1820
1821 if (READ_ONCE(i915->gpu_error.first_error))
1822 return;
1823
1824 error = i915_capture_gpu_state(i915);
1825 if (!error) {
1826 DRM_DEBUG_DRIVER("out of memory, not capturing error state\n");
> 1827 i915_disable_error_state(dev_priv, -ENOMEM);
1828 return;
1829 }
1830
1831 i915_error_capture_msg(i915, error, engine_mask, error_msg);
1832 DRM_INFO("%s\n", error->error_msg);
1833
1834 if (!error->simulated) {
1835 spin_lock_irqsave(&i915->gpu_error.lock, flags);
1836 if (!i915->gpu_error.first_error) {
1837 i915->gpu_error.first_error = error;
1838 error = NULL;
1839 }
1840 spin_unlock_irqrestore(&i915->gpu_error.lock, flags);
1841 }
1842
1843 if (error) {
1844 __i915_gpu_state_free(&error->ref);
1845 return;
1846 }
1847
1848 if (!warned &&
1849 ktime_get_real_seconds() - DRIVER_TIMESTAMP < DAY_AS_SECONDS(180)) {
1850 DRM_INFO("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
1851 DRM_INFO("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
1852 DRM_INFO("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
1853 DRM_INFO("The gpu crash dump is required to analyze gpu hangs, so please always attach it.\n");
1854 DRM_INFO("GPU crash dump saved to /sys/class/drm/card%d/error\n",
1855 i915->drm.primary->index);
1856 warned = true;
1857 }
1858 }
1859
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29926 bytes --]
[-- Attachment #3: Type: text/plain, Size: 160 bytes --]
Thread overview: 19+ messages
2017-12-06 14:19 [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
2017-12-06 14:37 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-12-06 14:43 ` [PATCH] " Daniel Vetter
2017-12-06 14:48 ` Chris Wilson
2017-12-06 14:51 ` Daniel Vetter
2017-12-06 15:26 ` ✗ Fi.CI.IGT: warning for " Patchwork
2017-12-06 15:37 ` [PATCH v2] " Chris Wilson
2017-12-06 17:01 ` Bloomfield, Jon
2017-12-06 17:25 ` Bloomfield, Jon
2017-12-06 16:11 ` ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev2) Patchwork
2017-12-06 17:43 ` ✓ Fi.CI.IGT: " Patchwork
2018-10-11 11:21 ` ✗ Fi.CI.BAT: failure " Patchwork
2018-10-11 11:37 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
2018-10-11 22:03 ` kbuild test robot
2018-10-12 1:13 ` kbuild test robot
2018-10-11 11:43 ` ✗ Fi.CI.BAT: failure for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev3) Patchwork
2018-10-11 11:51 ` [PATCH] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture Chris Wilson
2018-10-11 12:29 ` ✓ Fi.CI.BAT: success for drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture (rev4) Patchwork
2018-10-11 17:13 ` ✓ Fi.CI.IGT: " Patchwork