From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 06/16] drm/i915/gt: Decouple completed requests on unwind
Date: Wed, 25 Nov 2020 16:21:27 +0000
Message-ID: <7157b2c0-0fde-8adb-95bd-84ae4573d5b9@linux.intel.com>
In-Reply-To: <160629970681.25068.9984672839751167059@build.alporthouse.com>


On 25/11/2020 10:21, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-11-25 09:15:25)
>>
>> On 24/11/2020 17:31, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2020-11-24 17:13:02)
>>>>
>>>> On 24/11/2020 11:42, Chris Wilson wrote:
>>>>> Since the introduction of preempt-to-busy, requests can complete in the
>>>>> background, even while they are not on the engine->active.requests list.
>>>>> As such, the engine->active.request list itself is not in strict
>>>>> retirement order, and we have to scan the entire list while unwinding to
>>>>> not miss any. However, if the request is completed we currently leave it
>>>>> on the list [until retirement], but we could just as simply remove it
>>>>> and stop treating it as active. We would only have to then traverse it
>>>>> once while unwinding in quick succession.
>>>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++++--
>>>>>     drivers/gpu/drm/i915/i915_request.c | 3 ++-
>>>>>     2 files changed, 6 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> index 30aa59fb7271..cf11cbac241b 100644
>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>> @@ -1116,8 +1116,10 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
>>>>>         list_for_each_entry_safe_reverse(rq, rn,
>>>>>                                          &engine->active.requests,
>>>>>                                          sched.link) {
>>>>> -             if (i915_request_completed(rq))
>>>>> -                     continue; /* XXX */
>>>>> +             if (i915_request_completed(rq)) {
>>>>> +                     list_del_init(&rq->sched.link);
>>>>> +                     continue;
>>>>> +             }
>>>>>     
>>>>>                 __i915_request_unsubmit(rq);
>>>>>     
>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>>> index 8d7d29c9e375..a9db1376b996 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>> @@ -321,7 +321,8 @@ bool i915_request_retire(struct i915_request *rq)
>>>>>          * after removing the breadcrumb and signaling it, so that we do not
>>>>>          * inadvertently attach the breadcrumb to a completed request.
>>>>>          */
>>>>> -     remove_from_engine(rq);
>>>>> +     if (!list_empty(&rq->sched.link))
>>>>> +             remove_from_engine(rq);
>>>>
>>>> The list_empty check is unlocked so is list_del_init in
>>>> remove_from_engine safe on potentially already unlinked request or it
>>>> needs to re-check under the lock?
>>>
>>> It's safe. The unwind is under the lock, and remove_from_engine takes
>>> the lock, and both do list_del_init() which is a no-op if already
>>> removed. And the end state of the flag bits is the same on each path. We
>>> can skip the __notify_execute_cb_imm() since we know in unwind it is
>>> executing and there should be no cb.
>>>
>>> The test before we take the lock is only allowed to skip the active.lock
>>> if it sees the list is already decoupled, in which case we can leave it
>>> to the unwind to remove it from the engine (and we know that the request
>>> can only have been inflight prior to completion). Since the test is not
>>> locked, we don't serialise with the removal, but the list_del_init is
>>> the last action on the request so there is no window where the unwind is
>>> accessing the request after it may have been retired.
>>>
>>> list_move() will not confuse list_empty(), as although it does a
>>> list_del_entry, it is not temporarily re-initialised to an empty list.
>>
>> List_del_init is indeed safe. List_move.. which one you think can race
>> with retire? Preempt-to-busy unwinding an almost completed request yet
>> again? Or even preempt timeout racing with completion?
> 
> Here in unwind. We pass the completion check, but the request may still
> be running and complete at any time (until we submit & ack the new ELSP).
> So an unlocked list_empty check during retire can race with any of the
> list_move during unwind and resubmit. (On resubmit, we check completed
> under the lock and drop the request in __i915_request_submit which
> should also leave it in a consistent state as if we had called
> remove_from_engine.)

Right, yes, that seems safe as well. The only new problem could have been
a false negative, meaning remove_from_engine _not_ being called by mistake
because of a transient false list_empty condition.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
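
For readers following the list_del_init()/list_move() argument above, here is a minimal userspace sketch of the list semantics the thread relies on. It is not the kernel's <linux/list.h> and not the i915 code: the list helpers are simplified re-implementations under the same names, and struct request, the priolist list and decouple_demo() are hypothetical names made up for illustration. It is single-threaded, so it only shows which states an entry can be observed in, not the locking or memory ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* An entry (or head) is "empty" when it points back at itself. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink only: the entry's own pointers still reference its old neighbours. */
static void __list_del_entry(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Unlink and re-initialise, so that list_empty(entry) becomes true. */
static void list_del_init(struct list_head *entry)
{
	__list_del_entry(entry);
	INIT_LIST_HEAD(entry);
}

/* Unlink and re-add without ever re-initialising the entry. */
static void list_move(struct list_head *entry, struct list_head *head)
{
	__list_del_entry(entry);
	list_add(entry, head);
}

/* Hypothetical request with a single link, standing in for rq->sched.link. */
struct request {
	struct list_head link;
};

static int decouple_demo(void)
{
	struct list_head active, priolist;
	struct request rq;

	INIT_LIST_HEAD(&active);
	INIT_LIST_HEAD(&priolist);
	INIT_LIST_HEAD(&rq.link);
	list_add(&rq.link, &active);

	/* The two halves of list_move(): the entry never looks empty. */
	__list_del_entry(&rq.link);
	assert(!list_empty(&rq.link));
	list_add(&rq.link, &priolist);
	assert(!list_empty(&rq.link));

	/* Resubmit: move it back onto the active list. */
	list_move(&rq.link, &active);
	assert(list_empty(&priolist) && !list_empty(&rq.link));

	/* Unwind sees the request completed and decouples it. */
	list_del_init(&rq.link);
	assert(list_empty(&rq.link) && list_empty(&active));

	/* Retire repeats the removal: a no-op on an already decoupled entry. */
	list_del_init(&rq.link);
	assert(list_empty(&rq.link) && list_empty(&active));

	return 0;
}
```

Calling decouple_demo() runs the whole sequence and returns 0 if every assertion holds. The point of splitting list_move() into its two steps is to show that an entry in transit still has a non-self next pointer, so an unlocked list_empty() on it cannot transiently report true; only list_del_init() produces that state.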