All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout
Date: Wed, 28 Mar 2018 09:59:09 +0100	[thread overview]
Message-ID: <152222754937.10679.7601261106064537411@mail.alporthouse.com> (raw)
In-Reply-To: <152214260622.10679.12320767910224384399@mail.alporthouse.com>

Quoting Chris Wilson (2018-03-27 10:23:26)
> Quoting Tvrtko Ursulin (2018-03-27 09:51:20)
> > 
> > On 26/03/2018 12:50, Chris Wilson wrote:
> > > +static enum hrtimer_restart preempt_timeout(struct hrtimer *hrtimer)
> > > +{
> > > +     struct intel_engine_execlists *execlists =
> > > +             container_of(hrtimer, typeof(*execlists), preempt_timer);
> > > +
> > > +     GEM_TRACE("%s\n",
> > > +               container_of(execlists,
> > > +                            struct intel_engine_cs,
> > > +                            execlists)->name);
> > > +
> > > +     queue_work(system_highpri_wq, &execlists->preempt_reset);
> > 
> > I suppose indirection from hrtimer to worker is for better than jiffie 
> > timeout granularity? But then queue_work might introduce some delay to 
> > defeat that.
> 
> Yes. It's betting on highpri_wq being just that. We can do better with
> our own RT kthread and a wakeup from the hrtimer if required.
> 
> Hmm, the original plan (watchdog TDR) was to avoid using the global
> reset entirely. The per-engine reset (although needs serialising with
> itself) doesn't need struct_mutex, so we should be able to do that from
> the timer directly, and just kick off the global reset on failure.
> 
> That sounds a whole lot better, let's dust off that code and see what
> breaks.

So I think something along the lines of

+static int try_execlists_reset(struct intel_engine_execlists *execlists)
+{
+       struct intel_engine_cs *engine =
+               container_of(execlists, typeof(*engine), execlists);
+       int err = -EBUSY;
+
+       if (!test_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted) &&
+           tasklet_trylock(&execlists->tasklet)) {
+               unsigned long *lock = &engine->i915->gpu_error.flags;
+               unsigned int bit = I915_RESET_ENGINE + engine->id;
+
+               if (!test_and_set_bit(bit, lock)) {
+                       tasklet_disable(&engine->execlists.tasklet);
+                       err = i915_reset_engine(engine,
+                                               "preemption timeout");
+                       tasklet_enable(&engine->execlists.tasklet);
+
+                       clear_bit(bit, lock);
+                       wake_up_bit(lock, bit);
+               }
+
+               tasklet_unlock(&execlists->tasklet);
+       }
+
+       return err;
+}

should do the trick, with a fallback to worker+i915_handle_error for the
cases where we can't claim ensure softirq safety.

There's still the issue of two resets in quick succession being treated
as a failure. That's also an issue for the current per-engine failover
to whole-device reset.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2018-03-28  8:59 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-26 11:50 Forced preemption gedankenexperiment Chris Wilson
2018-03-26 11:50 ` [PATCH 01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Chris Wilson
2018-03-27 11:34   ` Mika Kuoppala
2018-03-27 11:47     ` Chris Wilson
2018-03-27 12:18       ` Mika Kuoppala
2018-03-27 13:34         ` Chris Wilson
2018-03-27 11:42   ` Mika Kuoppala
2018-03-26 11:50 ` [PATCH 02/11] drm/i915/execlists: Clear user-active flag on preemption completion Chris Wilson
2018-03-27 10:00   ` Chris Wilson
2018-03-27 10:01     ` Chris Wilson
2018-03-26 11:50 ` [PATCH 03/11] drm/i915: Include submission tasklet state in engine dump Chris Wilson
2018-03-27  8:37   ` Mika Kuoppala
2018-03-26 11:50 ` [PATCH 04/11] drm/i915/execlists: Refactor out complete_preempt_context() Chris Wilson
2018-03-26 11:50 ` [PATCH 05/11] drm/i915: Move engine reset prepare/finish to backends Chris Wilson
2018-03-26 11:50 ` [PATCH 06/11] drm/i915: Split execlists/guc reset prepartions Chris Wilson
2018-03-26 11:50 ` [PATCH 07/11] drm/i915/execlists: Flush pending preemption events during reset Chris Wilson
2018-03-27 11:44   ` [PATCH v2] " Chris Wilson
2018-03-27 15:33     ` Jeff McGee
2018-03-26 11:50 ` [PATCH 08/11] drm/i915/execlists: Force preemption via reset on timeout Chris Wilson
2018-03-27  8:51   ` Tvrtko Ursulin
2018-03-27  9:10     ` Chris Wilson
2018-03-27  9:23     ` Chris Wilson
2018-03-28  8:59       ` Chris Wilson [this message]
2018-03-27 15:40     ` Jeff McGee
2018-03-27 15:50   ` Jeff McGee
2018-03-26 11:50 ` [PATCH 09/11] drm/i915/preemption: Select timeout when scheduling Chris Wilson
2018-03-26 11:50 ` [PATCH 10/11] drm/i915: Use a preemption timeout to enforce interactivity Chris Wilson
2018-03-27  8:35   ` Tvrtko Ursulin
2018-03-27  8:39     ` Chris Wilson
2018-03-27  8:57       ` Tvrtko Ursulin
2018-03-26 11:50 ` [PATCH 11/11] drm/i915: Allow user control over preempt timeout on the important context Chris Wilson
2018-03-26 17:09   ` Tvrtko Ursulin
2018-03-26 19:52     ` Chris Wilson
2018-03-26 20:49       ` Chris Wilson
2018-03-26 12:08 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling Patchwork
2018-03-26 12:23 ` ✓ Fi.CI.BAT: success " Patchwork
2018-03-26 14:56 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-03-27 12:28 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/11] drm/i915/execlists: Avoid kicking the submission too early for rescheduling (rev2) Patchwork
2018-03-27 12:44 ` ✓ Fi.CI.BAT: success " Patchwork
2018-03-27 15:30 ` ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=152222754937.10679.7601261106064537411@mail.alporthouse.com \
    --to=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=tvrtko.ursulin@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.