From: Daniel Vetter <daniel@ffwll.ch>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx <intel-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/i915/userptr: Avoid struct_mutex recursion for mmu_invalidate_range_start
Date: Fri, 26 Oct 2018 15:48:07 +0200	[thread overview]
Message-ID: <CAKMK7uHQMfkKWqgbRJGQ4oEYXYdFLBeiBO6T+7Q9fW-RwB=rrQ@mail.gmail.com> (raw)
In-Reply-To: <CAKMK7uF6EULJp9sZsu8tJMXkKOx0fz6wjBiOSYhj+kknydcPwg@mail.gmail.com>

On Fri, Oct 26, 2018 at 11:02 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Oct 25, 2018 at 9:39 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Quoting Daniel Vetter (2018-10-25 20:16:50)
> > > On Thu, Oct 25, 2018 at 01:45:42PM +0100, Chris Wilson wrote:
> > > > Since commit 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu
> > > > notifiers") we have been able to report failure from
> > > > mmu_invalidate_range_start which allows us to use a trylock on the
> > > > struct_mutex to avoid potential recursion and report -EBUSY instead.
> > > > Furthermore, this allows us to pull the work into the main callback and
> > > > avoid the sleight-of-hand in using a workqueue to avoid lockdep.
> > > >
> > > > However, not all paths to mmu_invalidate_range_start are prepared to
> > > > handle failure, so instead of reporting the recursion, deal with it.
> > > >
> > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108375
> > > > References: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
> > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.h         |   4 +-
> > > >  drivers/gpu/drm/i915/i915_gem.c         |  12 +-
> > > >  drivers/gpu/drm/i915/i915_gem_userptr.c | 219 ++++++++++++------------
> > > >  3 files changed, 115 insertions(+), 120 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 2d7761b8ac07..c3b94c3f930f 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -3076,8 +3076,8 @@ enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock */
> > > >       I915_MM_SHRINKER
> > > >  };
> > > >
> > > > -void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > -                              enum i915_mm_subclass subclass);
> > > > +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > +                             enum i915_mm_subclass subclass);
> > > >  void __i915_gem_object_invalidate(struct drm_i915_gem_object *obj);
> > > >
> > > >  enum i915_map_type {
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 93d09282710d..de7ccb3eb7b8 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -2454,17 +2454,18 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
> > > >       return pages;
> > > >  }
> > > >
> > > > -void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > -                              enum i915_mm_subclass subclass)
> > > > +int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > > +                             enum i915_mm_subclass subclass)
> > > >  {
> > > >       struct sg_table *pages;
> > > > +     int ret = -EBUSY;
> > > >
> > > >       if (i915_gem_object_has_pinned_pages(obj))
> > > > -             return;
> > > > +             return -EBUSY;
> > > >
> > > >       GEM_BUG_ON(obj->bind_count);
> > > >       if (!i915_gem_object_has_pages(obj))
> > > > -             return;
> > > > +             return 0;
> > > >
> > > >       /* May be called by shrinker from within get_pages() (on another bo) */
> > > >       mutex_lock_nested(&obj->mm.lock, subclass);
> > > > @@ -2480,8 +2481,11 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> > > >       if (!IS_ERR(pages))
> > > >               obj->ops->put_pages(obj, pages);
> > > >
> > > > +     ret = 0;
> > > >  unlock:
> > > >       mutex_unlock(&obj->mm.lock);
> > > > +
> > > > +     return ret;
> > > >  }
> > > >
> > > >  bool i915_sg_trim(struct sg_table *orig_st)
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > > index 2c9b284036d1..a8f3c304db55 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > > @@ -50,79 +50,84 @@ struct i915_mmu_notifier {
> > > >       struct hlist_node node;
> > > >       struct mmu_notifier mn;
> > > >       struct rb_root_cached objects;
> > > > -     struct workqueue_struct *wq;
> > > > +     struct i915_mm_struct *mm;
> > > >  };
> > > >
> > > >  struct i915_mmu_object {
> > > >       struct i915_mmu_notifier *mn;
> > > >       struct drm_i915_gem_object *obj;
> > > >       struct interval_tree_node it;
> > > > -     struct list_head link;
> > > > -     struct work_struct work;
> > > > -     bool attached;
> > > >  };
> > > >
> > > > -static void cancel_userptr(struct work_struct *work)
> > > > -{
> > > > -     struct i915_mmu_object *mo = container_of(work, typeof(*mo), work);
> > > > -     struct drm_i915_gem_object *obj = mo->obj;
> > > > -     struct work_struct *active;
> > > > -
> > > > -     /* Cancel any active worker and force us to re-evaluate gup */
> > > > -     mutex_lock(&obj->mm.lock);
> > > > -     active = fetch_and_zero(&obj->userptr.work);
> > > > -     mutex_unlock(&obj->mm.lock);
> > > > -     if (active)
> > > > -             goto out;
> > > > -
> > > > -     i915_gem_object_wait(obj, I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT, NULL);
> > > > -
> > > > -     mutex_lock(&obj->base.dev->struct_mutex);
> > > > -
> > > > -     /* We are inside a kthread context and can't be interrupted */
> > > > -     if (i915_gem_object_unbind(obj) == 0)
> > > > -             __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> > > > -     WARN_ONCE(i915_gem_object_has_pages(obj),
> > > > -               "Failed to release pages: bind_count=%d, pages_pin_count=%d, pin_global=%d\n",
> > > > -               obj->bind_count,
> > > > -               atomic_read(&obj->mm.pages_pin_count),
> > > > -               obj->pin_global);
> > > > -
> > > > -     mutex_unlock(&obj->base.dev->struct_mutex);
> > > > -
> > > > -out:
> > > > -     i915_gem_object_put(obj);
> > > > -}
> > > > -
> > > >  static void add_object(struct i915_mmu_object *mo)
> > > >  {
> > > > -     if (mo->attached)
> > > > +     if (!RB_EMPTY_NODE(&mo->it.rb))
> > > >               return;
> > > >
> > > >       interval_tree_insert(&mo->it, &mo->mn->objects);
> > > > -     mo->attached = true;
> > > >  }
> > > >
> > > >  static void del_object(struct i915_mmu_object *mo)
> > > >  {
> > > > -     if (!mo->attached)
> > > > +     if (RB_EMPTY_NODE(&mo->it.rb))
> > > >               return;
> > > >
> > > >       interval_tree_remove(&mo->it, &mo->mn->objects);
> > > > -     mo->attached = false;
> > > > +     RB_CLEAR_NODE(&mo->it.rb);
> > > > +}
> > > > +
> > > > +static void
> > > > +__i915_gem_userptr_set_active(struct drm_i915_gem_object *obj, bool value)
> > > > +{
> > > > +     struct i915_mmu_object *mo = obj->userptr.mmu_object;
> > > > +
> > > > +     /*
> > > > +      * During mm_invalidate_range we need to cancel any userptr that
> > > > +      * overlaps the range being invalidated. Doing so requires the
> > > > +      * struct_mutex, and that risks recursion. In order to cause
> > > > +      * recursion, the user must alias the userptr address space with
> > > > +      * a GTT mmapping (possible with a MAP_FIXED) - then when we have
> > > > +      * to invalidate that mmapping, mm_invalidate_range is called with
> > > > +      * the userptr address *and* the struct_mutex held.  To prevent that
> > > > +      * we set a flag under the i915_mmu_notifier spinlock to indicate
> > > > +      * whether this object is valid.
> > > > +      */
> > > > +     if (!mo)
> > > > +             return;
> > > > +
> > > > +     spin_lock(&mo->mn->lock);
> > > > +     if (value)
> > > > +             add_object(mo);
> > > > +     else
> > > > +             del_object(mo);
> > > > +     spin_unlock(&mo->mn->lock);
> > > > +}
> > > > +
> > > > +static struct mutex *__i915_mutex_lock_recursive(struct mutex *m)
> > > > +{
> > > > +     switch (mutex_trylock_recursive(m)) {
> > > > +     default:
> > > > +     case MUTEX_TRYLOCK_FAILED:
> > > > +             mutex_lock(m);
> > > > +     case MUTEX_TRYLOCK_SUCCESS:
> > > > +             return m;
> > > > +
> > > > +     case MUTEX_TRYLOCK_RECURSIVE:
> > > > +             return ERR_PTR(-EEXIST);
> > > > +     }
> > > >  }
> > > >
> > > >  static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > > > -                                                    struct mm_struct *mm,
> > > > -                                                    unsigned long start,
> > > > -                                                    unsigned long end,
> > > > -                                                    bool blockable)
> > > > +                                                   struct mm_struct *mm,
> > > > +                                                   unsigned long start,
> > > > +                                                   unsigned long end,
> > > > +                                                   bool blockable)
> > > >  {
> > > >       struct i915_mmu_notifier *mn =
> > > >               container_of(_mn, struct i915_mmu_notifier, mn);
> > > > -     struct i915_mmu_object *mo;
> > > >       struct interval_tree_node *it;
> > > > -     LIST_HEAD(cancelled);
> > > > +     struct mutex *unlock = NULL;
> > > > +     int ret = 0;
> > > >
> > > >       if (RB_EMPTY_ROOT(&mn->objects.rb_root))
> > > >               return 0;
> > > > @@ -133,11 +138,16 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > > >       spin_lock(&mn->lock);
> > > >       it = interval_tree_iter_first(&mn->objects, start, end);
> > > >       while (it) {
> > > > +             struct drm_i915_gem_object *obj;
> > > > +             bool has_pages = false;
> > > > +
> > > >               if (!blockable) {
> > > > -                     spin_unlock(&mn->lock);
> > > > -                     return -EAGAIN;
> > > > +                     ret = -EAGAIN;
> > > > +                     break;
> > > >               }
> > > > -             /* The mmu_object is released late when destroying the
> > > > +
> > > > +             /*
> > > > +              * The mmu_object is released late when destroying the
> > > >                * GEM object so it is entirely possible to gain a
> > > >                * reference on an object in the process of being freed
> > > >                * since our serialisation is via the spinlock and not
> > > > @@ -146,21 +156,44 @@ static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > > >                * use-after-free we only acquire a reference on the
> > > >                * object if it is not in the process of being destroyed.
> > > >                */
> > > > -             mo = container_of(it, struct i915_mmu_object, it);
> > > > -             if (kref_get_unless_zero(&mo->obj->base.refcount))
> > > > -                     queue_work(mn->wq, &mo->work);
> > > > +             obj = container_of(it, struct i915_mmu_object, it)->obj;
> > > > +             if (!kref_get_unless_zero(&obj->base.refcount)) {
> > > > +                     it = interval_tree_iter_next(it, start, end);
> > > > +                     continue;
> > > > +             }
> > > > +             spin_unlock(&mn->lock);
> > > >
> > > > -             list_add(&mo->link, &cancelled);
> > > > -             it = interval_tree_iter_next(it, start, end);
> > > > +             /* Cancel any pending worker and force us to re-evaluate gup */
> > > > +             mutex_lock_nested(&obj->mm.lock, I915_MM_SHRINKER);
> > >
> > > mutex_lock_nested is meant for static nesting. This here is rather
> > > dynamic, and I don't really see where the guarantee is that the classic
> > > deadlock isn't possible:
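> > >
> > > Purely for illustration (obj_a/obj_b are made-up objects, not a trace
> > > from this patch), the classic two-thread interleaving would be:
> > >
> > >   Thread 1                              Thread 2
> > >   mutex_lock(&obj_a->mm.lock);          mutex_lock(&obj_b->mm.lock);
> > >   ...                                   ...
> > >   mutex_lock_nested(&obj_b->mm.lock,    mutex_lock_nested(&obj_a->mm.lock,
> > >                     I915_MM_SHRINKER);                    I915_MM_SHRINKER);
> > >
> > > The _nested annotation keeps lockdep quiet about the second lock, but
> > > the two threads still end up blocked on each other.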
> >
> > It's meant to be after put_pages == 0 to ensure it only applies to a
> > different object. It's a bit more involved to do it exactly how I want,
> > that is, to basically incorporate it into put_pages itself. Actually, if
> > it is in put_pages it should just work...
>
> So for the shrinker I guess you can make this work. For my taste
> there's not enough GEM_WARN_ON to check the invariants of the nesting
> scheme (I assume it is list_empty(obj->mm.link) -> use I915_MM_NORMAL,
> and I915_MM_SHRINKER for anyone who discovers an object through the
> obj->mm.link lists, but would be nice to double-check that every time
> we take the obj->mm.lock with a GEM_WARN_ON).

After a bit more reading around I think this isn't enough, and we also
rely on list_del(obj->mm.link) being serialized by dev->struct_mutex.
But I'm not entirely sure. On the shrinker side we definitely still seem
to rely on the magic rule that "holding dev->struct_mutex extends the
lifetime of anything with a non-empty obj->mm.link".
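
For the nesting rule itself, what I have in mind is roughly the sketch
below -- the helper name is made up and the list_empty() check is racy
on its own (obj->mm.link has its own serialization), so this documents
the assumed invariant rather than proving it:

/*
 * Sketch only: take obj->mm.lock while asserting the assumed nesting
 * rule, i.e. I915_MM_SHRINKER is reserved for callers that found the
 * object through the obj->mm.link lists; everyone else should be
 * using I915_MM_NORMAL.
 */
static void obj_mm_lock_checked(struct drm_i915_gem_object *obj,
				enum i915_mm_subclass subclass)
{
	GEM_WARN_ON(subclass == I915_MM_SHRINKER &&
		    list_empty(&obj->mm.link));

	mutex_lock_nested(&obj->mm.lock, subclass);
}
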
-Daniel

> But for the mmu notifier I'm not seeing that same guarantee, and
> didn't spot any other, so not clear why this works. And assuming I'm
> reading the code correctly, we do insert the object into the rb tree
> already when the gup worker is still doing its thing.
>
> Or maybe there's another invariant that guarantees the nesting, but I
> checked a few and they all don't seem to work (like pages_pin_count).
> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Thread overview: 13+ messages
2018-10-25 12:45 [PATCH] drm/i915/userptr: Avoid struct_mutex recursion for mmu_invalidate_range_start Chris Wilson
2018-10-25 13:32 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915/userptr: Avoid struct_mutex recursion for mmu_invalidate_range_start (rev2) Patchwork
2018-10-25 13:50 ` ✓ Fi.CI.BAT: success " Patchwork
2018-10-25 18:57 ` ✗ Fi.CI.IGT: failure " Patchwork
2018-10-25 19:16 ` [PATCH] drm/i915/userptr: Avoid struct_mutex recursion for mmu_invalidate_range_start Daniel Vetter
2018-10-25 19:39   ` Chris Wilson
2018-10-26  9:02     ` Daniel Vetter
2018-10-26 13:48       ` Daniel Vetter [this message]
2018-10-29 10:56         ` Chris Wilson
2018-11-06 10:48           ` Daniel Vetter
2018-10-25 21:04 ` Mark Janes
