From: Chris Wilson <chris@chris-wilson.co.uk>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 12/32] drm/i915: Invert the GEM wakeref hierarchy
Date: Thu, 18 Apr 2019 14:07:49 +0100	[thread overview]
Message-ID: <155559286924.4314.12445180445128039391@skylake-alporthouse-com> (raw)
In-Reply-To: <07b1d8d6-eb75-e5c1-1792-6d1d0ebfa499@linux.intel.com>

Quoting Tvrtko Ursulin (2019-04-18 13:42:59)
> 
> On 17/04/2019 08:56, Chris Wilson wrote:
> > +static bool switch_to_kernel_context(struct intel_engine_cs *engine)
> > +{
> > +     struct i915_request *rq;
> > +
> > +     /* Already inside the kernel context, safe to power down. */
> > +     if (engine->wakeref_serial == engine->serial)
> > +             return true;
> > +
> > +     /* GPU is pointing to the void, as good as in the kernel context. */
> > +     if (i915_reset_failed(engine->i915))
> > +             return true;
> > +
> > +     /*
> > +      * Note, we do this without taking the timeline->mutex. We cannot
> > +      * as we may be called while retiring the kernel context and so
> > +      * already underneath the timeline->mutex. Instead we rely on the
> > +      * exclusive property of the intel_engine_park that prevents anyone
> > +      * else from creating a request on this engine. This also requires
> > +      * that the ring is empty and we avoid any waits while constructing
> > +      * the context, as they assume protection by the timeline->mutex.
> > +      * This should hold true as we can only park the engine after
> > +      * retiring the last request, thus all rings should be empty and
> > +      * all timelines idle.
> > +      */
> > +     rq = __i915_request_create(engine->kernel_context, GFP_NOWAIT);
> > +     if (IS_ERR(rq))
> > +             /* Context switch failed, hope for the best! Maybe reset? */
> > +             return true;
> > +
> > +     /* Check again on the next retirement. */
> > +     engine->wakeref_serial = engine->serial + 1;
> 
> Is engine->serial guaranteed to be stable at this point? I guess so 
> since there can only be one park at a time.

Yes, we're inside the engine park routine, so we are serialised against
all other users of the engine.

> > +     __i915_request_commit(rq);
> > +
> > +     return false;
> > +}
> > +
> > +static int intel_engine_park(struct intel_wakeref *wf)
> > +{
> > +     struct intel_engine_cs *engine =
> > +             container_of(wf, typeof(*engine), wakeref);
> > +
> > +     /*
> > +      * If one and only one request is completed between pm events,
> > +      * we know that we are inside the kernel context and it is
> > +      * safe to power down. (We are paranoid in case that runtime
> > +      * suspend causes corruption to the active context image, and
> > +      * want to avoid that impacting userspace.)
> > +      */
> > +     if (!switch_to_kernel_context(engine))
> > +             return -EBUSY;
> 
> But it is ignored by intel_engine_pm_put. Should it be a WARN_ON or 
> something?

The intel_wakeref takes action and defers the put/parking. That's all we
need here, as the GEM layer stays awake with its background retire
worker still poking occasionally.

> > @@ -718,6 +721,7 @@ static void reset_prepare(struct drm_i915_private *i915)
> >       struct intel_engine_cs *engine;
> >       enum intel_engine_id id;
> >   
> > +     intel_gt_pm_get(i915);
> 
> Isn't it in the spirit of the patch to let the engines wake up the gt?

You could; this was just because I liked the look of it. In an operation
affecting all engines, it felt like the right thing to do.

> >       for_each_engine(engine, i915, id)
> >               reset_prepare_engine(engine);
> >   
> > @@ -755,48 +759,10 @@ static int gt_reset(struct drm_i915_private *i915,
> >   static void reset_finish_engine(struct intel_engine_cs *engine)
> >   {
> >       engine->reset.finish(engine);
> > +     intel_engine_pm_put(engine);
> >       intel_uncore_forcewake_put(engine->uncore, FORCEWAKE_ALL);
> >   }
> >   
> > -struct i915_gpu_restart {
> > -     struct work_struct work;
> > -     struct drm_i915_private *i915;
> > -};
> > -
> > -static void restart_work(struct work_struct *work)
> 
> Oh wow I did not see this part of the code so far. Ask a second pair of 
> eyes to check on it?

This is the best part! Resolved a very, very annoying thorn with the
reset requiring a worker.

> > @@ -1137,6 +1076,9 @@ int i915_reset_engine(struct intel_engine_cs *engine, const char *msg)
> >       GEM_TRACE("%s flags=%lx\n", engine->name, error->flags);
> >       GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
> >   
> > +     if (!intel_wakeref_active(&engine->wakeref))
> > +             return 0;
> 
> I guess there can't be any races here since a stuck engine can't be 
> parked. Do we have any tests which trigger this without a guilty 
> request? I kind of remember that isn't possible so probably not.

We do, we have the reset idle engine selftests. Not that it proves very
much, just that we don't die.

> > -static int init_render_ring(struct intel_engine_cs *engine)
> > +static int rcs_resume(struct intel_engine_cs *engine)
> >   {
> >       struct drm_i915_private *dev_priv = engine->i915;
> > -     int ret = init_ring_common(engine);
> > -     if (ret)
> > -             return ret;
> >   
> >       /* WaTimedSingleVertexDispatch:cl,bw,ctg,elk,ilk,snb */
> >       if (IS_GEN_RANGE(dev_priv, 4, 6))
> > @@ -875,7 +875,7 @@ static int init_render_ring(struct intel_engine_cs *engine)
> >       if (INTEL_GEN(dev_priv) >= 6)
> >               ENGINE_WRITE(engine, RING_IMR, ~engine->irq_keep_mask);
> >   
> > -     return 0;
> > +     return xcs_resume(engine);
> 
> This inverts the order between the common and rcs init. One thing which 
> jumps out is the RING_IMR write, which is now done after starting the 
> engine. Can we lose an interrupt now?

That write shouldn't be there, we take care of that inside the restart.

> >   static void idle_work_handler(struct work_struct *work)
> >   {
> >       struct drm_i915_private *i915 =
> >               container_of(work, typeof(*i915), gem.idle_work.work);
> > -     bool rearm_hangcheck;
> > -
> > -     if (!READ_ONCE(i915->gt.awake))
> > -             return;
> > -
> > -     if (READ_ONCE(i915->gt.active_requests))
> > -             return;
> > -
> > -     rearm_hangcheck =
> > -             cancel_delayed_work_sync(&i915->gpu_error.hangcheck_work);
> >   
> >       if (!mutex_trylock(&i915->drm.struct_mutex)) {
> 
> Should struct_mutex be taken by i915_gem_park, inside the wakeref lock? 
> Or that would create a lock inversion somewhere else?

NO! Yes, that would create the most mighty of inversions! Never, ever,
ever, take struct_mutex inside another lock, for it's the outer lock for
the entire driver (give or take an insignificant amount).

About the only lock it is under is the mmap_sem, and look at the pain
that causes. :(

> > +bool i915_retire_requests(struct drm_i915_private *i915)
> >   {
> >       struct intel_ring *ring, *tmp;
> >   
> >       lockdep_assert_held(&i915->drm.struct_mutex);
> >   
> > -     if (!i915->gt.active_requests)
> > -             return;
> 
> You don't want to replace this with a wakeref_active check?

I didn't feel it was worth it in the short term. At the end of the day,
the main caller of i915_retire_requests() should be the retirement
worker, with perhaps a call from the shrinker (but hopefully not).

[snip]

> As said before, the concept is very elegant and I like it.
> 
> But it is a monster refactor, and as much as I did cross-reference the 
> diff versus the patched tree to get a full picture, I have to say my 
> review is more about the high level, trusting the CI to catch any 
> details. :I
> 
> My main concern is lock nesting, especially the nested annotation in 
> the preceding patch. Does lockdep catch anything if you don't have that 
> annotation?

Yes. The shrinker calls the intel_wakeref_put, but we need to take more
locks inside the intel_wakeref_get (the pin_map, and more in unpark).
Hence they get caught in the same lock_map and lockdep gets quite angry
even though they cannot overlap.
-Chris
