From: Daniel Vetter <daniel@ffwll.ch>
To: "Christian König" <christian.koenig@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Lucas Stach <l.stach@pengutronix.de>,
	Russell King <linux+etnaviv@armlinux.org.uk>,
	Christian Gmeiner <christian.gmeiner@gmail.com>,
	Qiang Yu <yuq825@gmail.com>, "Anholt, Eric" <eric@anholt.net>,
	Thomas Hellstrom <thellstrom@vmware.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	The etnaviv authors <etnaviv@lists.freedesktop.org>,
	lima@lists.freedesktop.org
Subject: Re: [PATCH 3/6] drm/gem: use new ww_mutex_(un)lock_for_each macros
Date: Fri, 14 Jun 2019 20:24:27 +0200	[thread overview]
Message-ID: <CAKMK7uFcDCJ9sozny1RqqRATwcK39doZNq+NZekvrzuO63ap-Q@mail.gmail.com> (raw)
In-Reply-To: <094da0f7-a0f0-9ed4-d2da-8c6e6d165380@gmail.com>



On Fri, Jun 14, 2019 at 8:10 PM Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
>
> Am 14.06.19 um 17:22 schrieb Daniel Vetter:
> > On Fri, Jun 14, 2019 at 03:19:16PM +0200, Peter Zijlstra wrote:
> >> On Fri, Jun 14, 2019 at 02:41:22PM +0200, Christian König wrote:
> >>> Use the provided macros instead of implementing deadlock handling on our own.
> >>>
> >>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>> ---
> >>>   drivers/gpu/drm/drm_gem.c | 49 ++++++++++-----------------------------
> >>>   1 file changed, 12 insertions(+), 37 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> >>> index 50de138c89e0..6e4623d3bee2 100644
> >>> --- a/drivers/gpu/drm/drm_gem.c
> >>> +++ b/drivers/gpu/drm/drm_gem.c
> >>> @@ -1307,51 +1307,26 @@ int
> >>>   drm_gem_lock_reservations(struct drm_gem_object **objs, int count,
> >>>                       struct ww_acquire_ctx *acquire_ctx)
> >>>   {
> >>> -   int contended = -1;
> >>> +   struct ww_mutex *contended;
> >>>     int i, ret;
> >>>  
> >>>     ww_acquire_init(acquire_ctx, &reservation_ww_class);
> >>>  
> >>> -retry:
> >>> -   if (contended != -1) {
> >>> -           struct drm_gem_object *obj = objs[contended];
> >>> -
> >>> -           ret = ww_mutex_lock_slow_interruptible(&obj->resv->lock,
> >>> -                                                  acquire_ctx);
> >>> -           if (ret) {
> >>> -                   ww_acquire_done(acquire_ctx);
> >>> -                   return ret;
> >>> -           }
> >>> -   }
> >>> -
> >>> -   for (i = 0; i < count; i++) {
> >>> -           if (i == contended)
> >>> -                   continue;
> >>> -
> >>> -           ret = ww_mutex_lock_interruptible(&objs[i]->resv->lock,
> >>> -                                             acquire_ctx);
> >>> -           if (ret) {
> >>> -                   int j;
> >>> -
> >>> -                   for (j = 0; j < i; j++)
> >>> -                           ww_mutex_unlock(&objs[j]->resv->lock);
> >>> -
> >>> -                   if (contended != -1 && contended >= i)
> >>> -                           ww_mutex_unlock(&objs[contended]->resv->lock);
> >>> -
> >>> -                   if (ret == -EDEADLK) {
> >>> -                           contended = i;
> >>> -                           goto retry;
> >>> -                   }
> >>> -
> >>> -                   ww_acquire_done(acquire_ctx);
> >>> -                   return ret;
> >>> -           }
> >>> -   }
> >> I note all the sites you use this on are simple idx iterators; so how
> >> about something like so:
> >>
> >> void ww_mutex_unlock_all(int count, void *data, struct ww_mutex *(*func)(int, void *))
> >> {
> >>      struct ww_mutex *lock;
> >>      int i;
> >>
> >>      for (i = 0; i < count; i++) {
> >>              lock = func(i, data);
> >>              ww_mutex_unlock(lock);
> >>      }
> >> }
> >>
> >> int ww_mutex_lock_all(int count, struct ww_acquire_ctx *acquire_ctx, bool intr,
> >>                    void *data, struct ww_mutex *(*func)(int, void *))
> >> {
> >>      int i, ret, contended = -1;
> >>      struct ww_mutex *lock;
> >>
> >> retry:
> >>      if (contended != -1) {
> >>              lock = func(contended, data);
> >>              if (intr)
> >>                      ret = ww_mutex_lock_slow_interruptible(lock, acquire_ctx);
> >>              else
> >>                      ret = (ww_mutex_lock_slow(lock, acquire_ctx), 0);
> >>
> >>              if (ret) {
> >>                      ww_acquire_done(acquire_ctx);
> >>                      return ret;
> >>              }
> >>      }
> >>
> >>      for (i = 0; i < count; i++) {
> >>              if (i == contended)
> >>                      continue;
> >>
> >>              lock = func(i, data);
> >>              if (intr)
> >>                      ret = ww_mutex_lock_interruptible(lock, acquire_ctx);
> >>              else
> >>                      ret = ww_mutex_lock(lock, acquire_ctx);
> >>
> >>              if (ret) {
> >>                      ww_mutex_unlock_all(i, data, func);
> >>                      if (contended > i) {
> >>                              lock = func(contended, data);
> >>                              ww_mutex_unlock(lock);
> >>                      }
> >>
> >>                      if (ret == -EDEADLK) {
> >>                              contended = i;
> >>                              goto retry;
> >>                      }
> >>
> >>                      ww_acquire_done(acquire_ctx);
> >>                      return ret;
> >>              }
> >>      }
> >>
> >>      ww_acquire_done(acquire_ctx);
> >>      return 0;
> >> }
> >>
> >>> +   ww_mutex_lock_for_each(for (i = 0; i < count; i++),
> >>> +                          &objs[i]->resv->lock, contended, ret, true,
> >>> +                          acquire_ctx)
> >>> +           if (ret)
> >>> +                   goto error;
> >> which then becomes:
> >>
> >> struct ww_mutex *gem_ww_mutex_func(int i, void *data)
> >> {
> >>      struct drm_gem_object **objs = data;
> >>      return &objs[i]->resv->lock;
> >> }
> >>
> >>      ret = ww_mutex_lock_all(count, acquire_ctx, true, objs, gem_ww_mutex_func);
> >>
> >>>     ww_acquire_done(acquire_ctx);
> >>>  
> >>>     return 0;
> >>> +
> >>> +error:
> >>> +   ww_mutex_unlock_for_each(for (i = 0; i < count; i++),
> >>> +                            &objs[i]->resv->lock, contended);
> >>> +   ww_acquire_done(acquire_ctx);
> >>> +   return ret;
> >>>   }
> >>>   EXPORT_SYMBOL(drm_gem_lock_reservations);
> > Another idea, entirely untested (I guess making sure that we can use the
> > same iterator for both locking and unlocking in the contended case will be
> > fun), but maybe something like this:
> >
> >       WW_MUTEX_LOCK_BEGIN();
> >       driver_for_each_loop (iter, pos) {
> >               WW_MUTEX_LOCK(&pos->ww_mutex);
> >       }
> >       WW_MUTEX_LOCK_END();
> >
> > That way we can reuse any and all iterators that'll ever show up at least.
> > It's still horrible because the macros need to jump around between all of
> > them.
>
> Yeah, I tried this as well and that's exactly the reason why I discarded
> this approach.
>
> There is this hack with goto *void we could use, but I'm pretty sure
> that is actually not part of any C standard.

Nah, just invisible jump labels + all-uppercase macro names to scream
that into your face. You essentially trade one set of horrors for another,
and this one allows you to open-code any kind of loop or other algorithm
to find all the ww_mutexes you need.
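
Rough sketch of how I'd imagine the expansion, purely to show where the
hidden labels and jumps would sit. Every name below is invented, and the
unwind of the locks already held is hand-waved away in a comment, which
is of course exactly the hard part:

#define WW_MUTEX_LOCK_BEGIN(ctx)					\
{									\
	struct ww_acquire_ctx *__ctx = (ctx);				\
	struct ww_mutex *__contended = NULL;				\
__ww_retry:								\
	if (__contended)						\
		ww_mutex_lock_slow(__contended, __ctx);

#define WW_MUTEX_LOCK(lock, ret)					\
	do {								\
		if ((lock) == __contended) {				\
			/* relocked via the slow path above */		\
			__contended = NULL;				\
			(ret) = 0;					\
		} else {						\
			(ret) = ww_mutex_lock((lock), __ctx);		\
			if ((ret) == -EDEADLK) {			\
				__contended = (lock);			\
				/* unwind everything locked so far	\
				 * here, somehow, then retry */		\
				goto __ww_retry;			\
			}						\
		}							\
	} while (0)

#define WW_MUTEX_LOCK_END()						\
}

The __contended bookkeeping is the same dance drm_gem_lock_reservations
does by hand in the hunk above, just hidden behind the BEGIN/LOCK pair.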

> > Would also make this useful for more cases, where maybe you need to
> > trylock some lru lock to get at your next ww_mutex, or do some
> > kref_get_unless_zero. Buffer eviction loops tend to acquire these, and
> > that would all get ugly real fast if we'd need to stuff it into some
> > iterator argument.
>
> Well I don't see a use case with eviction in general. The dance there
> requires something different as far as I can see.

Current ttm doesn't really bother with multi-threaded contention for all
of memory. You're fixing that, but I think in the end we need a
slow-reclaim path which eats up the entire lru using the full ww_mutex
dance. Roughly:

WW_MUTEX_LOCK_BEGIN();

lock(lru_lock);

while ((bo = list_first(lru))) {
	if (kref_get_unless_zero(bo)) {
		unlock(lru_lock);
		WW_MUTEX_LOCK(&bo->ww_mutex);
		lock(lru_lock);
	} else {
		/* bo is getting freed, steal it from the freeing process
		 * or just ignore it */
	}
}
unlock(lru_lock);

WW_MUTEX_LOCK_END();


Also I think if we allow this, we could perhaps use it to implement the
modeset macros too.
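
For reference, the modeset helpers are already exactly this jump-label
shape; roughly (quoting from memory, double-check drm_modeset_lock.h):

#define DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx, flags, ret)		\
	drm_modeset_acquire_init(&ctx, flags);				\
modeset_lock_retry:							\
	ret = drm_modeset_lock_all_ctx(dev, &ctx);			\
	if (ret)							\
		goto modeset_lock_fail;

#define DRM_MODESET_LOCK_ALL_END(ctx, ret)				\
modeset_lock_fail:							\
	if (ret == -EDEADLK) {						\
		ret = drm_modeset_backoff(&ctx);			\
		if (!ret)						\
			goto modeset_lock_retry;			\
	}								\
	drm_modeset_acquire_fini(&ctx);

A ww_mutex BEGIN/LOCK/END with the backoff folded into END like this
would let the modeset versions become thin wrappers.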
-Daniel




> > This is kinda what we went with for modeset locks with
> > DRM_MODESET_LOCK_ALL_BEGIN/END, you can grab more locks in between the
> > pair at least. But it's a lot more limited use-cases, maybe too fragile an
> > idea for ww_mutex in full generality.
> >
> > Not going to type this out because too much w/e mode here already, but I
> > can give it a stab next week.
> > -Daniel
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

