Re: [PATCH 29/38] drm/i915: Expose user control over the ppGTT associated with a context

From: Chris Wilson <chris@chris-wilson.co.uk>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 29/38] drm/i915: Expose user control over the ppGTT associated with a context
Date: Wed, 23 Jan 2019 12:15:15 +0000
Message-ID: <154824571539.693.10232498231848989206@skylake-alporthouse-com>
In-Reply-To: <df9cf728-1aa3-2cfd-774d-72547190f776@linux.intel.com>

Quoting Tvrtko Ursulin (2019-01-23 12:00:49)
> 
> On 18/01/2019 14:01, Chris Wilson wrote:
> > Allow the user to share ppGTT between contexts on the same fd. This
> > gives the user the ability to relax their context isolation to share vm
> > between their own contexts, but does not allow them to import a vm from
> > another fd. The use case for sharing a vm is for the proposed virtual
> > engine work where a context may be created explicitly to set up a load
> > balancing engine, but always be run in conjunction with a second context
> > for rcs/compute etc. Giving the user control over how they set up the
> > vm allows them to have a single vm between all kernel contexts being
> > used to emulate a single client context, similarly to how we use a
> > single vm across all engines within a single kernel context today. It
> > also allows for future specification of a separate vm between engines
> > inside a single kernel context should that be desired.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem_context.c       | 118 ++++++++-
> >   drivers/gpu/drm/i915/i915_gem_gtt.c           |  17 +-
> >   drivers/gpu/drm/i915/i915_gem_gtt.h           |  14 +-
> >   drivers/gpu/drm/i915/selftests/huge_pages.c   |   1 -
> >   .../gpu/drm/i915/selftests/i915_gem_context.c | 247 ++++++++++++++----
> >   drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   1 -
> >   drivers/gpu/drm/i915/selftests/mock_context.c |   8 +-
> >   include/uapi/drm/i915_drm.h                   |   1 +
> >   8 files changed, 339 insertions(+), 68 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 7c90981704bf..f707241dbc78 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -109,6 +109,8 @@ static void lut_close(struct i915_gem_context *ctx)
> >               struct i915_vma *vma = rcu_dereference_raw(*slot);
> >   
> >               radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);
> > +
> > +             vma->open_count--;
> 
> I did not figure out what this is: a) why open-coded vma management
> without any comments, and b) the rest of the patch doesn't seem to touch
> this tree.

As the vm may be shared between multiple contexts, the same vma may be
opened by each of them and recorded in each context's lut. Each instance
needs to be accounted for, hence the open_count decrement in lut_close().
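
To illustrate the accounting, here is a minimal userspace sketch. The
sketch_* names and the fixed-size toy table are hypothetical stand-ins,
not the i915 structures; it only shows one count per context lut entry
and one decrement per entry when that context's lut is closed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the i915 structures. */
struct sketch_vma {
	unsigned int open_count; /* one count per context lut holding this vma */
};

struct sketch_ctx {
	struct sketch_vma *lut[4]; /* toy handle -> vma table */
};

/* Record the vma in this context's lut; each context doing so adds a count. */
static void sketch_lut_add(struct sketch_ctx *ctx, unsigned int handle,
			   struct sketch_vma *vma)
{
	assert(handle < 4 && !ctx->lut[handle]);
	ctx->lut[handle] = vma;
	vma->open_count++;
}

/* Loosely mirrors the lut_close() hunk: drop only this context's entries. */
static void sketch_lut_close(struct sketch_ctx *ctx)
{
	for (unsigned int i = 0; i < 4; i++) {
		struct sketch_vma *vma = ctx->lut[i];

		if (!vma)
			continue;
		ctx->lut[i] = NULL;
		assert(vma->open_count > 0);
		vma->open_count--;
	}
}
```

With two contexts sharing the vm and both recording the same vma, closing
one context's lut leaves the count at one for the other.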

> >               __i915_gem_object_release_unless_active(vma->obj);
> >       }
> >       rcu_read_unlock();
> > @@ -291,7 +293,7 @@ static void context_close(struct i915_gem_context *ctx)
> >        */
> >       lut_close(ctx);
> >       if (ctx->ppgtt)
> > -             i915_ppgtt_close(&ctx->ppgtt->vm);
> > +             i915_ppgtt_close(ctx->ppgtt);
> 
> I'll need to figure out if it is okay for context to close the ppgtt 
> instead of just dropping references to it. Like two contexts sharing 
> ppgtt and one closes it, the other one should continue to work fine, no? 
> Or even a third context is created sharing the same ppgtt.

That is what ppgtt->open_count is for. We don't close until every context
sharing the ppgtt has closed it.
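
Roughly, as a userspace sketch under stated assumptions: the sketch_*
names are hypothetical, the closed flag stands in for the real teardown,
and the assert on reopening is an assumption of this sketch rather than
a copy of the driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in; not struct i915_hw_ppgtt. */
struct sketch_ppgtt {
	unsigned int open_count; /* contexts currently using this vm */
	bool closed;             /* set once the last user closes it */
};

static void sketch_ppgtt_open(struct sketch_ppgtt *ppgtt)
{
	/* Assumed invariant: a closed vm is never handed out again. */
	assert(!ppgtt->closed);
	ppgtt->open_count++;
}

static void sketch_ppgtt_close(struct sketch_ppgtt *ppgtt)
{
	assert(ppgtt->open_count > 0);
	if (--ppgtt->open_count)
		return; /* another context still has it open */

	ppgtt->closed = true; /* last user: the real teardown would go here */
}
```

So with two contexts sharing the vm, the first close only drops the count
and the second one performs the actual close.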

> >       ctx->file_priv = ERR_PTR(-EBADF);
> >       i915_gem_context_put(ctx);
> > @@ -401,6 +403,23 @@ static void __destroy_hw_context(struct i915_gem_context *ctx,
> >       context_close(ctx);
> >   }
> >   
> > +static void __set_ppgtt(struct i915_gem_context *ctx,
> > +                     struct i915_hw_ppgtt *ppgtt)
> > +{
> > +     if (ppgtt == ctx->ppgtt)
> > +             return;
> > +
> > +     if (ctx->ppgtt) {
> > +             i915_ppgtt_close(ctx->ppgtt);
> 
> Feels incorrect to close it if it could be shared and in use elsewhere.
> 
> > +             i915_ppgtt_put(ctx->ppgtt);
> > +     }
> > +
> > +     i915_ppgtt_open(ppgtt);
> 
> Do we need some protection against trying to re-open a closed ppgtt here?

We BUG_ON as a closed ppgtt shouldn't have been accessible via the file_priv.

> > +     ctx->ppgtt = i915_ppgtt_get(ppgtt);
> > +
> > +     ctx->desc_template = default_desc_template(ctx->i915, ppgtt);
> > +}
> > +
> >   static struct i915_gem_context *
> >   i915_gem_create_context(struct drm_i915_private *dev_priv,
> >                       struct drm_i915_file_private *file_priv)
> > @@ -427,8 +446,8 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
> >                       return ERR_CAST(ppgtt);
> >               }
> >   
> > -             ctx->ppgtt = ppgtt;
> > -             ctx->desc_template = default_desc_template(dev_priv, ppgtt);
> > +             __set_ppgtt(ctx, ppgtt);
> > +             i915_ppgtt_put(ppgtt);
> >       }
> >   
> >       trace_i915_context_create(ctx);
> > @@ -784,6 +803,87 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915)
> >       return 0;
> >   }
> >   
> > +static int get_ppgtt(struct i915_gem_context *ctx, u64 *out)
> 
> u32 *out ?

Oh, left over from planning for setting different vm on each engine. In
the end, I thought a separate setter/getter was sensible rather than
trying to overload a simple interface with a more complex one.

> > +{
> > +     struct drm_i915_file_private *file_priv = ctx->file_priv;
> > +     struct i915_hw_ppgtt *ppgtt;
> > +     int ret;
> > +
> > +     /* XXX rcu acquire? */
> > +     ppgtt = ctx->ppgtt;
> > +     if (!ppgtt)
> > +             return -ENODEV;
> > +
> > +     ret = mutex_lock_interruptible(&file_priv->vm_lock);
> > +     if (ret)
> > +             return ret;
> > +
> > +     ret = ppgtt->user_handle;
> > +     if (!ret) {
> > +             ret = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
> 
> GEM_WARN_ON(ret == 0) just in case?
> 
> > +             if (ret > 0) {
> > +                     ppgtt->user_handle = ret;
> > +                     i915_ppgtt_get(ppgtt);
> > +             }
> > +     }
> > +
> > +     mutex_unlock(&file_priv->vm_lock);
> > +     if (ret < 0)
> > +             return ret;
> > +
> > +     *out = ret;
> > +     return 0;
> > +}
> > +
> > +static int set_ppgtt(struct i915_gem_context *ctx, u32 id)
> > +{
> > +     struct drm_i915_file_private *file_priv = ctx->file_priv;
> > +     struct i915_hw_ppgtt *ppgtt;
> > +     int err;
> > +
> > +     err = mutex_lock_interruptible(&file_priv->vm_lock);
> > +     if (err)
> > +             return err;
> > +
> > +     ppgtt = idr_find(&file_priv->vm_idr, id);
> > +     if (ppgtt)
> > +             i915_ppgtt_get(ppgtt);
> > +     mutex_unlock(&file_priv->vm_lock);
> > +     if (!ppgtt)
> > +             return -ENOENT;
> > +
> > +     err = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
> > +     if (err)
> > +             goto out;
> > +
> > +     /*
> > +      * We need to flush any requests using the current ppgtt before
> > +      * we release it as the requests do not hold a reference themselves,
> > +      * only indirectly through the context. By switching to the kernel
> > +      * context, we ensure that the TLBs are reloaded before using the
> > +      * same context again -- an extra layer of paranoia over wait_for_idle.
> > +      */
> > +     err = i915_gem_switch_to_kernel_context(ctx->i915);
> > +     if (err)
> > +             goto out_unlock;
> > +
> > +     err = i915_gem_wait_for_idle(ctx->i915,
> > +                                  I915_WAIT_LOCKED |
> > +                                  I915_WAIT_INTERRUPTIBLE,
> > +                                  MAX_SCHEDULE_TIMEOUT);
> 
> This is a bit worrying. Every new client setting up their contexts 
> causes a global sync point. Sounds bad for scalability. It may be that 
> in practice the event might be happening only every few seconds, rather 
> than multiple times per second, but I am only guessing. How difficult to 
> make requests own ppgtt directly?

Or just add a retire callback from the kernel-context. The more we look,
the more use cases we find.
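
For reference, the alternative raised in the question (requests owning
the ppgtt directly) could look something like the userspace sketch
below. This is not what the patch does; every sketch_* name is
hypothetical, and it deliberately ignores the TLB reload concern that
the switch to the kernel context is there for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the i915 request or ppgtt structures. */
struct sketch_vm {
	unsigned int ref;
	bool freed;
};

struct sketch_request {
	struct sketch_vm *vm;
};

static struct sketch_vm *sketch_vm_get(struct sketch_vm *vm)
{
	assert(vm->ref > 0 && !vm->freed);
	vm->ref++;
	return vm;
}

static void sketch_vm_put(struct sketch_vm *vm)
{
	assert(vm->ref > 0);
	if (--vm->ref == 0)
		vm->freed = true; /* stand-in for releasing the page tables */
}

/* The request pins the vm it will execute in at construction ... */
static void sketch_request_init(struct sketch_request *rq,
				struct sketch_vm *vm)
{
	rq->vm = sketch_vm_get(vm);
}

/* ... and drops that reference only when it is retired. */
static void sketch_request_retire(struct sketch_request *rq)
{
	sketch_vm_put(rq->vm);
	rq->vm = NULL;
}
```

In this arrangement the context could drop its own reference when the vm
is replaced without waiting for idle, since any in-flight request keeps
the old vm alive until it retires.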

> > +static inline struct i915_hw_ppgtt *i915_ppgtt_get(struct i915_hw_ppgtt *ppgtt)
> >   {
> > -     if (ppgtt)
> > -             kref_get(&ppgtt->ref);
> > +     kref_get(&ppgtt->ref);
> > +     return ppgtt;
> 
> Unrelated hunk?

Am I not allowed to tidy up as I go along? I use it in this patch :)

> > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > index 6a241745e78a..2864cfb82325 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> > @@ -243,6 +243,12 @@ static int live_nop_switch(void *arg)
> >       return err;
> >   }
> >   
> > +#define GEN8_HIGH_ADDRESS_BIT 47
> > +static inline u64 gen8_canonical_addr(u64 address)
> > +{
> > +     return sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
> > +}
> 
> We could move the copy from i915_gem_execbuffer.c to i915_gem_utils.h or 
> somewhere.

It's probably not even worth it; this is a debug hunk for another
problem, which is now its own selftest to expose the issue.
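
For anyone unfamiliar with what gen8_canonical_addr() in the quoted hunk
does: it replicates bit 47 into bits 48-63. A userspace sketch follows,
where sketch_sign_extend64() is a stand-in for the kernel helper and
relies on arithmetic right shift of signed values, which is what gcc and
clang provide.

```c
#include <stdint.h>

#define GEN8_HIGH_ADDRESS_BIT 47

/*
 * Userspace stand-in for the kernel's sign_extend64(): replicate bit
 * 'index' into all higher bits. Assumes arithmetic right shift of
 * negative values (implementation-defined in ISO C).
 */
static inline int64_t sketch_sign_extend64(uint64_t value, int index)
{
	int shift = 63 - index;

	return (int64_t)(value << shift) >> shift;
}

/* Canonical form: bits 48-63 are copies of bit 47. */
static inline uint64_t sketch_canonical_addr(uint64_t address)
{
	return (uint64_t)sketch_sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
}
```

An address with bit 47 clear comes back with its upper 16 bits zeroed;
one with bit 47 set comes back with the upper 16 bits all ones.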

> Okay I went a bit back and forth with this patch since I didn't really
> understand the ctx and ppgtt lifetime/open-close rules.

And I didn't even notice I was replying to a reply.
-Chris