From: Chris Wilson <chris@chris-wilson.co.uk>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 3/3] drm/i915: Store a pointer to intel_context in i915_request
Date: Thu, 19 Apr 2018 12:10:22 +0100	[thread overview]
Message-ID: <152413622285.27610.8888272949330694993@mail.alporthouse.com> (raw)
In-Reply-To: <4bed3d0d-81e2-aa87-e804-16a26e9aed31@linux.intel.com>

Quoting Tvrtko Ursulin (2018-04-19 11:40:06)
> 
> On 19/04/2018 08:17, Chris Wilson wrote:
> > To ease the frequent and ugly pointer dance of
> > &request->gem_context->engine[request->engine->id] during request
> > submission, store that pointer as request->hw_context. One major
> > advantage that we will exploit later is that this decouples the logical
> > context state from the engine itself.
> 
> I can see merit in making all the places which work on engines or 
> requests more logically and elegantly grab the engine specific context.
> 
> Authors of current unmerged work will probably be not so thrilled though.

> > @@ -353,60 +346,56 @@ int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
> >       struct intel_vgpu_submission *s = &vgpu->submission;
> >       struct i915_gem_context *shadow_ctx = s->shadow_ctx;
> >       struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
> > -     int ring_id = workload->ring_id;
> > -     struct intel_engine_cs *engine = dev_priv->engine[ring_id];
> > -     struct intel_ring *ring;
> > +     struct intel_engine_cs *engine = dev_priv->engine[workload->ring_id];
> > +     struct intel_context *ce;
> >       int ret;
> >   
> >       lockdep_assert_held(&dev_priv->drm.struct_mutex);
> >   
> > -     if (workload->shadowed)
> > +     if (workload->req)
> >               return 0;
> 
> GVT team will need to look at this hunk/function.
...
> At which point I skip over all GVT and continue further down. :)

That code doesn't merit your attention (or ours, really, until it is
improved; it really is staging/ quality).

> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 2fe779cab298..ed1c591ee866 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -125,16 +125,10 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
> >       i915_ppgtt_put(ctx->ppgtt);
> >   
> >       for (i = 0; i < I915_NUM_ENGINES; i++) {
> > -             struct intel_context *ce = &ctx->engine[i];
> > +             struct intel_context *ce = &ctx->__engine[i];
> >   
> > -             if (!ce->state)
> > -                     continue;
> > -
> > -             WARN_ON(ce->pin_count);
> > -             if (ce->ring)
> > -                     intel_ring_free(ce->ring);
> > -
> > -             __i915_gem_object_release_unless_active(ce->state->obj);
> > +             if (ce->ops)
> 
> Is ce->ops now the gate which was "if (!ce->state) continue" before, to 
> avoid GEM_BUG_ON in execlists_context_destroy? I am wondering if this 
> loop should not just be for_each_engine. Places which walk until 
> I915_NUM_ENGINES should be very special and this one doesn't feel like 
> one.

s/I915_NUM_ENGINES/ARRAY_SIZE(ctx->__engine)/ That's the semantic intent
here: we aren't just thinking about engines, but about the contents of
the context struct. We tear down the struct.
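
Roughly (sketch only, not the exact hunk):

        for (i = 0; i < ARRAY_SIZE(ctx->__engine); i++) {
                struct intel_context *ce = &ctx->__engine[i];

                if (ce->ops)
                        ce->ops->destroy(ce);
        }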

> > @@ -1010,9 +1018,12 @@ bool intel_engine_has_kernel_context(const struct intel_engine_cs *engine)
> >        */
> >       rq = __i915_gem_active_peek(&engine->timeline->last_request);
> >       if (rq)
> > -             return rq->gem_context == kernel_context;
> > -     else
> > -             return engine->last_retired_context == kernel_context;
> > +             return rq->gem_context == kctx;
> > +
> > +     if (!engine->last_retired_context)
> > +             return false;
> > +
> > +     return engine->last_retired_context->gem_context == kctx;
> 
> You thought this will be a bit more readable/obvious?

Hmm, I wrote this before I wrote to_intel_context(). If we use that, it
looks like

        const struct intel_context *kernel_context =
                to_intel_context(engine->i915->kernel_context, engine);
        struct i915_request *rq;

        rq = __i915_gem_active_peek(&engine->timeline->last_request);
        if (rq)
                return rq->hw_context == kernel_context;
        else
                return engine->last_retired_context == kernel_context;

which isn't as bad as the first pass was...
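
For reference, to_intel_context() here is just the array lookup wrapped
up, i.e. something like (sketch):

        static inline struct intel_context *
        to_intel_context(struct i915_gem_context *ctx,
                         const struct intel_engine_cs *engine)
        {
                return &ctx->__engine[engine->id];
        }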

> > @@ -511,9 +511,7 @@ static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
> >   {
> >       struct intel_guc_client *client = guc->execbuf_client;
> >       struct intel_engine_cs *engine = rq->engine;
> > -     u32 ctx_desc =
> > -             lower_32_bits(intel_lr_context_descriptor(rq->gem_context,
> > -                                                       engine));
> > +     u32 ctx_desc = lower_32_bits(rq->hw_context->lrc_desc);
> 
> Ha...
> 
> >       u32 ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
> >   
> >       spin_lock(&client->wq_lock);
> > @@ -551,8 +549,8 @@ static void inject_preempt_context(struct work_struct *work)
> >                                            preempt_work[engine->id]);
> >       struct intel_guc_client *client = guc->preempt_client;
> >       struct guc_stage_desc *stage_desc = __get_stage_desc(client);
> > -     u32 ctx_desc = lower_32_bits(intel_lr_context_descriptor(client->owner,
> > -                                                              engine));
> > +     u32 ctx_desc = lower_32_bits(to_intel_context(client->owner,
> > +                                                   engine)->lrc_desc);
> 
> ..hum.

Just pointing out that this code is horrible ;)

One of the goals of having rq->hw_context was to avoid exactly this kind
of pointer dance, as exemplified by intel_lr_context_descriptor(). But
here we should just stick the ctx_desc in the intel_guc_client.
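
Purely illustrative (preempt_ctx_desc is a made-up name, not in this
patch): give the client a per-engine cache of the descriptor,

        struct intel_guc_client {
                ...
                u32 preempt_ctx_desc[I915_NUM_ENGINES];
        };

fill it once alongside guc_fill_preempt_context(),

        client->preempt_ctx_desc[id] =
                lower_32_bits(to_intel_context(client->owner, engine)->lrc_desc);

and inject_preempt_context() then just reads

        u32 ctx_desc = client->preempt_ctx_desc[engine->id];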

> > @@ -708,7 +706,7 @@ static void guc_dequeue(struct intel_engine_cs *engine)
> >               struct i915_request *rq, *rn;
> >   
> >               list_for_each_entry_safe(rq, rn, &p->requests, sched.link) {
> > -                     if (last && rq->gem_context != last->gem_context) {
> > +                     if (last && rq->hw_context != last->hw_context) {
> 
> Same thing really since this is already per engine.

Not always...

> >                               if (port == last_port) {
> >                                       __list_del_many(&p->requests,
> >                                                       &rq->sched.link);
> > @@ -991,7 +989,8 @@ static void guc_fill_preempt_context(struct intel_guc *guc)
> >       enum intel_engine_id id;
> >   
> >       for_each_engine(engine, dev_priv, id) {
> > -             struct intel_context *ce = &client->owner->engine[id];
> > +             struct intel_context *ce =
> > +                     to_intel_context(client->owner, engine);
> >               u32 addr = intel_hws_preempt_done_address(engine);
> >               u32 *cs;
> >   
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 0777226e65a6..e6725690b5b6 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -164,7 +164,8 @@
> >   #define WA_TAIL_BYTES (sizeof(u32) * WA_TAIL_DWORDS)
> >   
> >   static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
> > -                                         struct intel_engine_cs *engine);
> > +                                         struct intel_engine_cs *engine,
> > +                                         struct intel_context *ce);
> 
> Feels a bit redundant to pass in everything but OK.

You do remember which patch this was split out of ;)

> >   {
> >       return (IS_ENABLED(CONFIG_DRM_I915_GVT) &&
> > -             i915_gem_context_force_single_submission(ctx));
> > +             i915_gem_context_force_single_submission(ctx->gem_context));
> >   }
> 
> But the whole function change is again not needed I think.
> 
> >   
> > -static bool can_merge_ctx(const struct i915_gem_context *prev,
> > -                       const struct i915_gem_context *next)
> > +static bool can_merge_ctx(const struct intel_context *prev,
> > +                       const struct intel_context *next)
> 
> This one neither.

Consider what it means. The rule is that we are not allowed to submit
the same lrc descriptor to subsequent ports; that corresponds to
the hw context, not gem_context. The payoff, aside from the better
semantics now, is in subsequent patches.
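
The body of the check is unchanged, it just compares hw contexts now
(sketch):

        static bool can_merge_ctx(const struct intel_context *prev,
                                  const struct intel_context *next)
        {
                /* never the same lrc descriptor in adjacent ELSP ports */
                if (prev != next)
                        return false;

                if (ctx_single_port_submission(prev))
                        return false;

                return true;
        }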

> > @@ -686,14 +687,14 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> >                                * the same context (even though a different
> >                                * request) to the second port.
> >                                */
> > -                             if (ctx_single_port_submission(last->gem_context) ||
> > -                                 ctx_single_port_submission(rq->gem_context)) {
> > +                             if (ctx_single_port_submission(last->hw_context) ||
> > +                                 ctx_single_port_submission(rq->hw_context)) {
> >                                       __list_del_many(&p->requests,
> >                                                       &rq->sched.link);
> >                                       goto done;
> >                               }
> >   
> > -                             GEM_BUG_ON(last->gem_context == rq->gem_context);
> > +                             GEM_BUG_ON(last->hw_context == rq->hw_context);
> 
> And if you agree with the above then these two hunks do not have to be 
> in the patch.

Respectfully disagree, as this is groundwork for subsequent logical
contexts that float between engines.

> > +static struct intel_context *
> > +execlists_context_pin(struct intel_engine_cs *engine,
> > +                   struct i915_gem_context *ctx)
> >   {
> > -     struct intel_context *ce = &ctx->engine[engine->id];
> > +     struct intel_context *ce = to_intel_context(ctx, engine);
> >   
> >       lockdep_assert_held(&ctx->i915->drm.struct_mutex);
> > -     GEM_BUG_ON(ce->pin_count == 0);
> > -
> > -     if (--ce->pin_count)
> > -             return;
> >   
> > -     intel_ring_unpin(ce->ring);
> > +     if (likely(ce->pin_count++))
> > +             return ce;
> > +     GEM_BUG_ON(!ce->pin_count); /* no overflow please! */
> >   
> > -     ce->state->obj->pin_global--;
> > -     i915_gem_object_unpin_map(ce->state->obj);
> > -     i915_vma_unpin(ce->state);
> > +     ce->ops = &execlists_context_ops;
> 
> If ops are set on pin it would be good to unset them on destroy.

No point on destroy, since it's being destroyed.

> Hm, or even set them not on pin but on creation.

engine->context_pin() is our intel_context factory; pin is the creation
function.

Definitely a bit asymmetrical, but the key bit was having ce->unpin() so
that we didn't try to release the intel_context via a stale engine
reference.
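
i.e. the release path only needs the ce itself, something like (sketch):

        static inline void intel_context_unpin(struct intel_context *ce)
        {
                ce->ops->unpin(ce); /* no engine lookup required */
        }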

What I have been considering is passing intel_context to
i915_request_alloc() instead and trying to avoid that asymmetry by
managing the creation in the caller and separating it from pin().

> Not a huge deal but it is a 
> bit unbalanced, and also if ce->ops is also a guard to distinguish 
> possible from impossible engines then it is a bit more strange.

Not impossible engines in this regard, but unused contexts.

> >   
> > -     i915_gem_context_put(ctx);
> > +     return __execlists_context_pin(engine, ctx, ce);
> >   }
> >   
> >   static int execlists_request_alloc(struct i915_request *request)
> >   {
> > -     struct intel_engine_cs *engine = request->engine;
> > -     struct intel_context *ce = &request->gem_context->engine[engine->id];
> > -     int ret;
> > +     int err;
> >   
> > -     GEM_BUG_ON(!ce->pin_count);
> > +     GEM_BUG_ON(!request->hw_context->pin_count);
> >   
> >       /* Flush enough space to reduce the likelihood of waiting after
> >        * we start building the request - in which case we will just
> > @@ -1388,9 +1413,9 @@ static int execlists_request_alloc(struct i915_request *request)
> >        */
> >       request->reserved_space += EXECLISTS_REQUEST_SIZE;
> >   
> > -     ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
> > -     if (ret)
> > -             return ret;
> > +     err = intel_ring_wait_for_space(request->ring, request->reserved_space);
> > +     if (err)
> > +             return err;
> 
> Rename of ret to err for a smaller diff is avoidable.

A couple of lines for trying to harmonise between ret and err.

> > @@ -1288,32 +1305,37 @@ intel_ring_context_pin(struct intel_engine_cs *engine,
> >   
> >       i915_gem_context_get(ctx);
> >   
> > -out:
> >       /* One ringbuffer to rule them all */
> > -     return engine->buffer;
> > +     GEM_BUG_ON(!engine->buffer);
> > +     ce->ring = engine->buffer;
> > +
> > +     return ce;
> >   
> >   err:
> >       ce->pin_count = 0;
> 
> Strictly speaking this should be done in the caller.

Sssh. Tail call, think of it as a single contiguous function ;)

> > @@ -1323,10 +1345,6 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
> >   
> >       intel_engine_setup_common(engine);
> >   
> > -     err = intel_engine_init_common(engine);
> > -     if (err)
> > -             goto err;
> > -
> >       ring = intel_engine_create_ring(engine, 32 * PAGE_SIZE);
> >       if (IS_ERR(ring)) {
> >               err = PTR_ERR(ring);
> > @@ -1341,8 +1359,14 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
> >       GEM_BUG_ON(engine->buffer);
> >       engine->buffer = ring;
> >   
> > +     err = intel_engine_init_common(engine);
> > +     if (err)
> > +             goto err_unpin;
> > +
> 
> This ordering change doesn't look like it has to be done in this patch.

It does. Do you want to take another look ;)

The problem here is the legacy ringbuffer submission uses a common
engine->buffer, and the kernel contexts are pinned in init_common; so we
have to create that buffer first.
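
Pulling the two hunks together, the resulting order is roughly (sketch,
error handling trimmed):

        intel_engine_setup_common(engine);

        ring = intel_engine_create_ring(engine, 32 * PAGE_SIZE);
        ...
        engine->buffer = ring; /* shared by every legacy ce->ring */

        /* now safe: pins the kernel contexts, which find engine->buffer */
        err = intel_engine_init_common(engine);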

> > diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> > index e6d4b882599a..47a9de741c5f 100644
> > --- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> > +++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
> > @@ -243,6 +243,11 @@ struct drm_i915_private *mock_gem_device(void)
> >       if (!i915->kernel_context)
> >               goto err_engine;
> >   
> > +     /* Fake the initial load for the powersaving context */
> > +     i915->engine[RCS]->last_retired_context =
> > +             intel_context_pin(i915->kernel_context, i915->engine[RCS]);
> > +     GEM_BUG_ON(IS_ERR_OR_NULL(i915->engine[RCS]->last_retired_context));
> 
> What is this?

Keeping selftests happy. While chasing ce->ops I was asserting that
engine->last_retired_context was set and found an instance where it
wasn't, which caused this patch to blow up. And had I written
intel_engine_has_kernel_context() differently, this chunk wouldn't have
been required.
-Chris