All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/i915: Force reset on unready engine
Date: Fri, 10 Aug 2018 15:13:28 +0100	[thread overview]
Message-ID: <153391040793.12983.11406349176984550701@skylake-alporthouse-com> (raw)
In-Reply-To: <20180810140036.24240-2-mika.kuoppala@linux.intel.com>

Quoting Mika Kuoppala (2018-08-10 15:00:36)
> If engine reports that it is not ready for reset, we
> give up. Evidence shows that forcing a per engine reset
> on an engine which is not reporting to be ready for reset,
> can bring it back into a working order. There is risk that
> we corrupt the context image currently executing on that
> engine. But that is a risk worth taking as if we unblock
> the engine, we prevent a whole device wedging in a case
> of full gpu reset.
> 
> Reset individual engine even if it reports that it is not
> prepared for reset, but only if we aim for full gpu reset
> and not on first reset attempt.
> 
> v2: force reset only on later attempts, readability (Chris)
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/intel_uncore.c | 49 +++++++++++++++++++++++------
>  1 file changed, 39 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 027d14574bfa..d24026839b17 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -2099,9 +2099,6 @@ static int gen8_reset_engine_start(struct intel_engine_cs *engine)
>                                            RESET_CTL_READY_TO_RESET,
>                                            700, 0,
>                                            NULL);
> -       if (ret)
> -               DRM_ERROR("%s: reset request timeout\n", engine->name);
> -
>         return ret;
>  }
>  
> @@ -2113,6 +2110,42 @@ static void gen8_reset_engine_cancel(struct intel_engine_cs *engine)
>                       _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET));
>  }
>  
> +static int reset_engines(struct drm_i915_private *i915,
> +                        unsigned int engine_mask,
> +                        unsigned int retry)
> +{
> +       if (INTEL_GEN(i915) >= 11)
> +               return gen11_reset_engines(i915, engine_mask);
> +       else
> +               return gen6_reset_engines(i915, engine_mask, retry);
> +}
> +
> +static int gen8_prepare_engine_for_reset(struct intel_engine_cs *engine,
> +                                        unsigned int retry)
> +{
> +       const bool force_reset_on_non_ready = retry >= 1;
> +       int ret;
> +
> +       ret = gen8_reset_engine_start(engine);
> +
> +       if (ret && force_reset_on_non_ready) {
> +               /*
> +                * Try to unblock a single non-ready engine by risking
> +                * context corruption.
> +                */
> +               ret = reset_engines(engine->i915,
> +                                   intel_engine_flag(engine),
> +                                   retry);
> +               if (!ret)
> +                       ret = gen8_reset_engine_start(engine);
> +
> +               DRM_ERROR("%s: reset request timeout, forcing reset (%d)\n",
> +                         engine->name, ret);

This looks dubious now ;)

After the force you then do a reset in the caller. Twice the reset for
twice the unpreparedness.

> +       }
> +
> +       return ret;
> +}
> +
>  static int gen8_reset_engines(struct drm_i915_private *dev_priv,
>                               unsigned int engine_mask,
>                               unsigned int retry)
> @@ -2122,16 +2155,12 @@ static int gen8_reset_engines(struct drm_i915_private *dev_priv,
>         int ret;
>  
>         for_each_engine_masked(engine, dev_priv, engine_mask, tmp) {
> -               if (gen8_reset_engine_start(engine)) {
> -                       ret = -EIO;
> +               ret = gen8_prepare_engine_for_reset(engine, retry);
> +               if (ret)

if (ret && !force_reset_unread)

>                         goto not_ready;
> -               }
>         }
>  
> -       if (INTEL_GEN(dev_priv) >= 11)
> -               ret = gen11_reset_engines(dev_priv, engine_mask);
> -       else
> -               ret = gen6_reset_engines(dev_priv, engine_mask, retry);
> +       ret = reset_engines(dev_priv, engine_mask, retry);
>  
>  not_ready:
>         for_each_engine_masked(engine, dev_priv, engine_mask, tmp)
> -- 
> 2.17.1
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2018-08-10 14:13 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-10 14:00 [PATCH 1/2] drm/i915: Expose retry count to per gen reset logic Mika Kuoppala
2018-08-10 14:00 ` [PATCH 2/2] drm/i915: Force reset on unready engine Mika Kuoppala
2018-08-10 14:13   ` Chris Wilson [this message]
2018-08-13  8:18     ` Mika Kuoppala
2018-08-13  8:32       ` Chris Wilson
2018-08-13  9:58         ` Mika Kuoppala
2018-08-13 10:03           ` Chris Wilson
2018-08-13 10:42             ` Mika Kuoppala
2018-08-13 10:51               ` Chris Wilson
2018-08-13 11:02                 ` Mika Kuoppala
2018-08-13 11:08                   ` Chris Wilson
2018-08-13 13:01                     ` Mika Kuoppala
2018-08-10 14:14 ` [PATCH 1/2] drm/i915: Expose retry count to per gen reset logic Chris Wilson
2018-08-13 14:03   ` Mika Kuoppala
2018-08-10 14:37 ` ✓ Fi.CI.BAT: success for series starting with [1/2] " Patchwork
2018-08-10 19:10 ` ✓ Fi.CI.IGT: " Patchwork
2018-08-13 11:19 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Expose retry count to per gen reset logic (rev3) Patchwork
2018-08-13 12:51 ` ✓ Fi.CI.IGT: " Patchwork
2018-08-13 13:54 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Expose retry count to per gen reset logic (rev4) Patchwork
2018-08-13 15:37 ` ✓ Fi.CI.IGT: " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=153391040793.12983.11406349176984550701@skylake-alporthouse-com \
    --to=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=mika.kuoppala@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.