From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 29/32] drm/i915: Apply an execution_mask to the virtual_engine
Date: Wed, 17 Apr 2019 14:32:03 +0100	[thread overview]
Message-ID: <826d688c-39bc-c98a-c507-77152ea3c7e4@linux.intel.com> (raw)
In-Reply-To: <155550518086.2264.3894599782756787444@skylake-alporthouse-com>


On 17/04/2019 13:46, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-04-17 13:35:29)
>>
>> On 17/04/2019 12:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-04-17 12:43:49)
>>>>
>>>> On 17/04/2019 08:56, Chris Wilson wrote:
>>>>> Allow the user to direct which physical engines of the virtual engine
>>>>> they wish to execute on, as sometimes it is necessary to override the
>>>>> load balancing algorithm.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c    |  58 +++++++++++
>>>>>     drivers/gpu/drm/i915/gt/selftest_lrc.c | 131 +++++++++++++++++++++++++
>>>>>     drivers/gpu/drm/i915/i915_request.c    |   1 +
>>>>>     drivers/gpu/drm/i915/i915_request.h    |   3 +
>>>>>     4 files changed, 193 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> index d6efd6aa67cb..560a18bb4cbb 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> @@ -552,6 +552,18 @@ execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
>>>>>         intel_engine_context_out(rq->engine);
>>>>>         execlists_context_status_change(rq, status);
>>>>>         trace_i915_request_out(rq);
>>>>> +
>>>>> +     /*
>>>>> +      * If this is part of a virtual engine, its next request may have
>>>>> +      * been blocked waiting for access to the active context. We have
>>>>> +      * to kick all the siblings again in case we need to switch (e.g.
>>>>> +      * the next request is not runnable on this engine). Hopefully,
>>>>> +      * we will already have submitted the next request before the
>>>>> +      * tasklet runs and do not need to rebuild each virtual tree
>>>>> +      * and kick everyone again.
>>>>> +      */
>>>>> +     if (rq->engine != rq->hw_context->engine)
>>>>> +             tasklet_schedule(&rq->hw_context->engine->execlists.tasklet);
>>>>
>>>> Is this needed only for non-default execution_mask? If so it would be
>>>> good to limit it to avoid tasklet storm with plain veng.
>>>
>>> The issue is not just with this rq but the next one. If that has a
>>> restricted mask that prevents it running on this engine, we may have
>>> missed the opportunity to queue it (and so never run it under just the
>>> right circumstances).
>>>
>>> Something like
>>>        to_virtual_engine(rq->hw_context->engine)->request->execution_mask & ~rq->execution_mask
>>>
>>> The storm isn't quite so bad, it's only on context-out, and we often do
>>> succeed in keeping it busy. I was just trying to avoid pulling in ve here.
>>
>> What do you mean by the "pulling in ve" bit? Avoiding using
>> to_virtual_engine like in the line you wrote above?
> 
> Just laziness hiding behind an excuse of trying to not to smear veng too
> widely.
> 
>>>>> +
>>>>> +     rq = READ_ONCE(ve->request);
>>>>> +     if (!rq)
>>>>> +             return 0;
>>>>> +
>>>>> +     /* The rq is ready for submission; rq->execution_mask is now stable. */
>>>>> +     mask = rq->execution_mask;
>>>>> +     if (unlikely(!mask)) {
>>>>> +             /* Invalid selection, submit to a random engine in error */
>>>>> +             i915_request_skip(rq, -ENODEV);
>>>>
>>>> When can this happen? It looks like if it can happen we should reject it
>>>> earlier. Or if it can't then just assert.
>>>
>>> Many submit fences can end up with an intersection of 0. This is the
>>> convenient point to do the rejection, as with any other asynchronous
>>> error.
>>
>> Which ones are many? Why would we have uAPI which allows setting
>> impossible things where all requests will fail with -ENODEV?
> 
> But we are rejecting them in the uAPI, right here. This is the earliest
> point where all the information for a particular execbuf is available
> and we have the means of reporting that back.

In the tasklet? I may just be extra slow today, but could you please 
explain how we allowed a submission that can't be rejected any earlier 
than in the tasklet. What sequence of events leads to it?

> 
>>>>> +             mask = ve->siblings[0]->mask;
>>>>> +     }
>>>>> +
>>>>> +     GEM_TRACE("%s: rq=%llx:%lld, mask=%x, prio=%d\n",
>>>>> +               ve->base.name,
>>>>> +               rq->fence.context, rq->fence.seqno,
>>>>> +               mask, ve->base.execlists.queue_priority_hint);
>>>>> +
>>>>> +     return mask;
>>>>> +}
>>>>> +
>>>>>     static void virtual_submission_tasklet(unsigned long data)
>>>>>     {
>>>>>         struct virtual_engine * const ve = (struct virtual_engine *)data;
>>>>>         const int prio = ve->base.execlists.queue_priority_hint;
>>>>> +     intel_engine_mask_t mask;
>>>>>         unsigned int n;
>>>>>     
>>>>> +     rcu_read_lock();
>>>>> +     mask = virtual_submission_mask(ve);
>>>>> +     rcu_read_unlock();
>>>>
>>>> What is the RCU for?
>>>
>>> Accessing ve->request. There's nothing stopping another engine from
>>> spotting the ve->request still in its tree, submitting it and it being
>>> retired all during the read here.
>>
>> AFAIU there can only be one instance of virtual_submission_tasklet per
>> VE at a time and the code above is before the request is inserted into
>> physical engine trees. So I don't get it.
> 
> But the veng is being utilized by real engines concurrently, they are
> who take the ve->request and execute it and so may free the ve->request
> behind the submission tasklet's back. Later on the spinlock comes into
> play after we have decided there's a request ready.

How can real engines see this request at this point since it hasn't been 
put in the queue yet?

And if it is protecting against the tasklet then it should be 
local_bh_disable/enable. But wait... it is a tasklet already, so that 
also doesn't make sense.

So I just don't see it.

I guess it is related to the question of the zero intersected mask. If 
that were impossible you would be able to fetch the mask from inside the 
locked section in the hunk one down.

> 
>> Hm.. but going back to the veng patch, there is a
>> GEM_BUG_ON(ve->request) in virtual_submit_request. Why couldn't this be
>> called multiple times in parallel for different requests?
> 
> Because we strictly ordered submission into the veng so that it only
> considers one ready request at a time. Processing more requests
> decreased global throughput as load-balancing is no longer "late" (the
> physical engines then amalgamate all the ve requests into one submit).

I got temporarily confused into thinking submit_notify is at the 
queued->runnable transition. You see what you are dealing with here. :I

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
