From: Chris Wilson <chris@chris-wilson.co.uk>
To: Mika Kuoppala <mika.kuoppala@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 07/12] drm/i915: Revoke mmaps and prevent access to fence registers across reset
Date: Mon, 04 Feb 2019 13:47:17 +0000
Message-ID: <154928803745.14784.6537096810109423790@skylake-alporthouse-com>
In-Reply-To: <8736p34r1a.fsf@gaia.fi.intel.com>

Quoting Mika Kuoppala (2019-02-04 13:33:21)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> > @@ -272,6 +270,8 @@ struct i915_gpu_error {
> >        */
> >       wait_queue_head_t reset_queue;
> >  
> > +     struct srcu_struct srcu;
> > +
> 
> It is the only one in here so not causing confusion
> but could have been reset_backoff_srcu;

worksforme.

> > @@ -1274,9 +1272,12 @@ void i915_handle_error(struct drm_i915_private *i915,
> >               wait_event(i915->gpu_error.reset_queue,
> >                          !test_bit(I915_RESET_BACKOFF,
> >                                    &i915->gpu_error.flags));
> > -             goto out;
> > +             goto out; /* piggy-back on the other reset */
> >       }
> >  
> > +     /* Make sure i915_reset_trylock() sees the I915_RESET_BACKOFF */
> > +     synchronize_rcu_expedited();
> 
> Is the expedite here to minimize the time the faulted
> client can try to reacquire?

Simply to try and cap the amount of time it takes to issue a reset.
Without this we would regularly fail our assertion that userspace can do
a (full) reset in less than 250ms.

> >       /* Prevent any other reset-engine attempt. */
> >       for_each_engine(engine, i915, tmp) {
> >               while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
> > @@ -1300,6 +1301,36 @@ void i915_handle_error(struct drm_i915_private *i915,
> >       intel_runtime_pm_put(i915, wakeref);
> >  }
> >  
> > +int i915_reset_trylock(struct drm_i915_private *i915)
> > +{
> > +     struct i915_gpu_error *error = &i915->gpu_error;
> > +     int srcu;
> > +
> > +     rcu_read_lock();
> > +     while (test_bit(I915_RESET_BACKOFF, &error->flags)) {
> > +             rcu_read_unlock();
> > +
> > +             if (wait_event_interruptible(error->reset_queue,
> > +                                          !test_bit(I915_RESET_BACKOFF,
> > +                                                    &error->flags)))
> > +                     return -EINTR;
> > +
> > +             rcu_read_lock();
> > +     }
> > +     srcu = srcu_read_lock(&error->srcu);
> 
> In here are we piggybacking the grace period from srcu
> into the rcu domain by nesting?
> 
> The srcu is ours, but the rcu is everyone. So this
> bothers me.

rcu serialises the update of the I915_RESET_BACKOFF bit. That
coordinates with the reset to say nothing is in use at this moment, and
the reset is not allowed to begin until all rcu reads are complete.

The srcu then coordinates with the actual reset; it is the
mutex/semaphore that prevents both from running at the same time.

We acquire it under the rcu read lock, while we know that
I915_RESET_BACKOFF is not set, and cannot be set. Then as we release the
rcu read lock and let the reset start, it sets I915_RESET_BACKOFF,
putting the next faulter to sleep, but the reset is forced to sleep on
the active srcu readers. As soon as they are all complete, synchronize_srcu()
returns and we can do the reset uncontended.
-Chris
