From: Mika Kuoppala <mika.kuoppala@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2] drm/i915: Stop engines when declaring the machine wedged
Date: Fri, 16 Mar 2018 10:58:28 +0200
Message-ID: <87fu50mm6j.fsf@gaia.fi.intel.com>
In-Reply-To: <20180315151015.22741-1-chris@chris-wilson.co.uk>

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If we fail to reset the GPU, we declare the machine wedged. However, the
> GPU may well still be running in the background with an in-flight
> request. So despite our efforts in cleaning up the request queue and
> faking the breadcrumb in the HWSP, the GPU may eventually write the
> in-flight seqno there breaking all of our assumptions and throwing the
> driver into a deep turmoil, wedging beyond wedged.
>
> To avoid this we ideally want to reset the GPU. Since that has already
> failed, make sure the rings have the stop bit set instead. This is part
> of the normal GPU reset sequence, but that is actually disabled by
> igt/gem_eio to force the wedged state. If we assume the worst, we must
> poke at the bit again before we give up.
>
> v2: Move the intel_gpu_reset() from set-wedged in the reset error path
> into i915_gem_set_wedged() itself. Even if the reset fails (e.g. if it is
> disabled by gem_eio), it still tries to make sure the engines are
> stopped. For i915_gem_set_wedged() callers from outside of i915_reset(),
> this should make sure the GPU is disabled while the driver is marked as
> being wedged.
>
> Testcase: igt/gem_eio
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 1 -
>  drivers/gpu/drm/i915/i915_gem.c | 3 +++
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index f03555efc520..3df5193487f3 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1995,7 +1995,6 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
>  error:
>  	i915_gem_set_wedged(i915);
>  	i915_retire_requests(i915);
> -	intel_gpu_reset(i915, ALL_ENGINES);
>  	goto finish;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2fbd622bba30..802df8e1a544 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3246,6 +3246,9 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
>  	}
>  	i915->caps.scheduler = 0;
>  
> +	/* Even if the GPU reset fails, it should still stop the engines */
> +	intel_gpu_reset(i915, ALL_ENGINES);
> +

The comment is very welcome here, as the modparm.reset usage isn't
so transparent.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>  	/*
>  	 * Make sure no one is running the old callback before we proceed with
>  	 * cancelling requests and resetting the completion tracking. Otherwise
> -- 
> 2.16.2
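
For readers following the review, the point of the patch can be sketched
as a small userspace model. This is not driver code: struct model_gpu,
model_stop_engines(), model_gpu_reset() and model_set_wedged() are all
hypothetical stand-ins, and the assumption that the reset helper stops the
engines before it attempts (and possibly fails) the reset proper is taken
from the commit message above, not from the i915 sources.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_NUM_ENGINES 4	/* hypothetical engine count */

/* Hypothetical stand-in for device state; not struct drm_i915_private. */
struct model_gpu {
	bool reset_enabled;	/* models a reset that a test may disable */
	bool engine_running[MODEL_NUM_ENGINES];
	bool wedged;
};

/* Models setting the stop bit on every ring. */
static void model_stop_engines(struct model_gpu *gpu)
{
	for (int i = 0; i < MODEL_NUM_ENGINES; i++)
		gpu->engine_running[i] = false;
}

/*
 * Models a reset helper that always stops the engines first and only
 * then attempts the reset. Returns 0 on success, -ENODEV if the reset
 * itself is disabled.
 */
static int model_gpu_reset(struct model_gpu *gpu)
{
	model_stop_engines(gpu);

	if (!gpu->reset_enabled)
		return -ENODEV;

	return 0;
}

/*
 * Models declaring the machine wedged. The return value of the reset is
 * deliberately ignored: even a failed reset has stopped the engines,
 * which is the side effect the caller relies on.
 */
static void model_set_wedged(struct model_gpu *gpu)
{
	(void)model_gpu_reset(gpu);
	gpu->wedged = true;
}

/* True if any engine could still write a seqno behind our back. */
static bool model_any_engine_running(const struct model_gpu *gpu)
{
	for (int i = 0; i < MODEL_NUM_ENGINES; i++)
		if (gpu->engine_running[i])
			return true;
	return false;
}
```

In this model, calling model_set_wedged() with reset_enabled == false still
leaves every engine stopped, which is the property the v2 changelog
describes for callers outside of the reset error path.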

Thread overview: 12+ messages
2018-03-15 13:14 [PATCH 1/2] drm/i915: Trace GEM steps between submit and wedging Chris Wilson
2018-03-15 13:14 ` [PATCH 2/2] drm/i915: Stop engines when declaring the machine wedged Chris Wilson
2018-03-15 13:20   ` [PATCH] " Chris Wilson
2018-03-15 15:10     ` [PATCH v2] " Chris Wilson
2018-03-16  8:58       ` Mika Kuoppala [this message]
2018-03-16 10:29         ` Chris Wilson
2018-03-15 15:44     ` [PATCH] " Mika Kuoppala
2018-03-15 13:26 ` [PATCH 1/2] drm/i915: Trace GEM steps between submit and wedging Chris Wilson
2018-03-15 13:41 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Trace GEM steps between submit and wedging (rev2) Patchwork
2018-03-15 15:44 ` ✓ Fi.CI.BAT: success for series starting with [1/2] drm/i915: Trace GEM steps between submit and wedging (rev3) Patchwork
2018-03-15 15:59 ` ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915: Trace GEM steps between submit and wedging (rev2) Patchwork
2018-03-15 19:45 ` ✓ Fi.CI.IGT: success for series starting with [1/2] drm/i915: Trace GEM steps between submit and wedging (rev3) Patchwork
