* [PATCH 1/3] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init
@ 2020-10-07 12:03 Ville Syrjala
2020-10-13 15:47 ` Chris Wilson
0 siblings, 1 reply; 3+ messages in thread
From: Ville Syrjala @ 2020-10-07 12:03 UTC (permalink / raw)
To: intel-gfx; +Cc: stable, Chris Wilson
From: Ville Syrjälä <ville.syrjala@linux.intel.com>
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <stable@vger.kernel.org> # v5.7+
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 907e1d155443..00c08600c60a 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -3445,6 +3445,14 @@ initial_plane_vma(struct drm_i915_private *i915,
if (IS_ERR(obj))
return NULL;
+ /*
+ * Mark it WT ahead of time to avoid changing the
+ * cache_level during fbdev initialization. The
+ * unbind there would get stuck waiting for rcu.
+ */
+ i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ?
+ I915_CACHE_WT : I915_CACHE_NONE);
+
switch (plane_config->tiling) {
case I915_TILING_NONE:
break;
--
2.26.2
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH 1/3] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init
2020-10-07 12:03 [PATCH 1/3] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init Ville Syrjala
@ 2020-10-13 15:47 ` Chris Wilson
2020-10-13 16:12 ` Ville Syrjälä
0 siblings, 1 reply; 3+ messages in thread
From: Chris Wilson @ 2020-10-13 15:47 UTC (permalink / raw)
To: Ville Syrjala, intel-gfx; +Cc: stable
See subject, s/ininitial/iniital/
Quoting Ville Syrjala (2020-10-07 13:03:27)
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>
> Currently we leave the cache_level of the initial fb obj
> set to NONE. This means on eLLC machines the first pin_to_display()
> will try to switch it to WT which requires a vma unbind+bind.
> If that happens during the fbdev initialization rcu does not
> seem operational which causes the unbind to get stuck. To
> most appearances this looks like a dead machine on boot.
>
> Avoid the unbind by already marking the object cache_level
> as WT when creating it. We still do an excplicit ggtt pin
> which will rewrite the PTEs anyway, so they will match whatever
> cache level we set.
>
> Cc: <stable@vger.kernel.org> # v5.7+
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
> drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 907e1d155443..00c08600c60a 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -3445,6 +3445,14 @@ initial_plane_vma(struct drm_i915_private *i915,
> if (IS_ERR(obj))
> return NULL;
>
> + /*
> + * Mark it WT ahead of time to avoid changing the
> + * cache_level during fbdev initialization. The
> + * unbind there would get stuck waiting for rcu.
> + */
> + i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ?
> + I915_CACHE_WT : I915_CACHE_NONE);
Ok, I've been worrying about whether there were any more side-effects,
but I think it all comes out in the wash. The proof is definitely in the
eating, and we will know soon enough if we break someone's virtual
terminal.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH 1/3] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init
2020-10-13 15:47 ` Chris Wilson
@ 2020-10-13 16:12 ` Ville Syrjälä
0 siblings, 0 replies; 3+ messages in thread
From: Ville Syrjälä @ 2020-10-13 16:12 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx, stable
On Tue, Oct 13, 2020 at 04:47:49PM +0100, Chris Wilson wrote:
> See subject, s/ininitial/iniital/
>
> Quoting Ville Syrjala (2020-10-07 13:03:27)
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> >
> > Currently we leave the cache_level of the initial fb obj
> > set to NONE. This means on eLLC machines the first pin_to_display()
> > will try to switch it to WT which requires a vma unbind+bind.
> > If that happens during the fbdev initialization rcu does not
> > seem operational which causes the unbind to get stuck. To
> > most appearances this looks like a dead machine on boot.
> >
> > Avoid the unbind by already marking the object cache_level
> > as WT when creating it. We still do an excplicit ggtt pin
> > which will rewrite the PTEs anyway, so they will match whatever
> > cache level we set.
> >
> > Cc: <stable@vger.kernel.org> # v5.7+
> > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > index 907e1d155443..00c08600c60a 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -3445,6 +3445,14 @@ initial_plane_vma(struct drm_i915_private *i915,
> > if (IS_ERR(obj))
> > return NULL;
> >
> > + /*
> > + * Mark it WT ahead of time to avoid changing the
> > + * cache_level during fbdev initialization. The
> > + * unbind there would get stuck waiting for rcu.
> > + */
> > + i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ?
> > + I915_CACHE_WT : I915_CACHE_NONE);
>
> Ok, I've been worrying about whether there were any more side-effects,
> but I think it all comes out in the wash. The proof is definitely in the
> eating, and we will know soon enough if we break someone's virtual
> terminal.
At least it seems to work on my CFL with eLLC caching enabled.
>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Ta.
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-10-13 16:12 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-07 12:03 [PATCH 1/3] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init Ville Syrjala
2020-10-13 15:47 ` Chris Wilson
2020-10-13 16:12 ` Ville Syrjälä
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).