From: Chris Wilson <chris@chris-wilson.co.uk>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: thomas.hellstrom@intel.com
Subject: Re: [Intel-gfx] [PATCH 05/41] drm/i915: Restructure priority inheritance
Date: Tue, 26 Jan 2021 13:24:07 +0000
Message-ID: <161166744720.2943.1804423973098975957@build.alporthouse.com>
In-Reply-To: <10f2ca0b-f18a-47c7-b038-9c40135cc040@linux.intel.com>

Quoting Tvrtko Ursulin (2021-01-26 13:15:29)
> 
> On 26/01/2021 11:55, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2021-01-26 11:40:24)
> >>
> >> On 26/01/2021 11:30, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2021-01-26 11:12:53)
> >>>>
> >>>>
> >>>> On 25/01/2021 14:01, Chris Wilson wrote:
> >>>>> +static void ipi_schedule(struct work_struct *wrk)
> >>>>> +{
> >>>>> +     struct i915_sched_ipi *ipi = container_of(wrk, typeof(*ipi), work);
> >>>>> +     struct i915_request *rq = xchg(&ipi->list, NULL);
> >>>>> +
> >>>>> +     do {
> >>>>> +             struct i915_request *rn = xchg(&rq->sched.ipi_link, NULL);
> >>>>> +             int prio;
> >>>>> +
> >>>>> +             prio = ipi_get_prio(rq);
> >>>>> +
> >>>>> +             /*
> >>>>> +              * For cross-engine scheduling to work we rely on one of two
> >>>>> +              * things:
> >>>>> +              *
> >>>>> +              * a) The requests are using dma-fence fences and so will not
> >>>>> +              * be scheduled until the previous engine is completed, and
> >>>>> +              * so we cannot cross back onto the original engine and end up
> >>>>> +              * queuing an earlier request after the first (due to the
> >>>>> +              * interrupted DFS).
> >>>>> +              *
> >>>>> +              * b) The requests are using semaphores and so may already
> >>>>> +              * be in flight, in which case if we cross back onto the same
> >>>>> +              * engine, we will already have put the interrupted DFS into
> >>>>> +              * the priolist, and the continuation will now be queued
> >>>>> +              * afterwards [out-of-order]. However, since we are using
> >>>>> +              * semaphores in this case, we also perform yield on semaphore
> >>>>> +              * waits and so will reorder the requests back into the correct
> >>>>> +              * sequence. This occurrence (of promoting a request chain
> >>>>> +              * that crosses the engines using semaphores back unto itself)
> >>>>> +              * should be unlikely enough that it probably does not matter...
> >>>>> +              */
> >>>>> +             local_bh_disable();
> >>>>> +             i915_request_set_priority(rq, prio);
> >>>>> +             local_bh_enable();
> >>>>
> >>>> Is it that important and wouldn't the priority order restore eventually
> >>>> due to timeslicing?
> >>>
> >>> There would be a window in which we executed userspace code
> >>> out-of-order. That's enough to scare me! However, for our PI dependency
> >>> chains it should not matter as the only time we do submit out-of-order,
> >>> we are stuck on _our_ semaphore that cannot be resolved until the
> >>> requests are back in-order.
> >>
> >> Out of order how? Within a single timeline?! I thought only with
> >> incomplete view of priority inheritance, which in my mind could only
> >> cause deadlocks (if no timeslicing). But really really out of order?
> > 
> > Fences between timelines. Let's say we have 3 requests, A,B,C all with
> > sequential fencing (C depends on B depends on A), but B is on a
> > different engine to (A, C) and we are using semaphores to submit early.
> > If we bump the priority of C, we see it crosses the engine to B, and send
> > an ipi_priority, but set C to be higher priority than A. So we now
> > schedule C before A!
> 
> Yeah so different timelines, I think that's not a huge problem to start 
> with. Only if things were non-preemptable.

And for the special case where it may occur, it's inside a preemptible
section (under our control).

> > However, since C depends on B which depends on A, C is stuck on its
> > semaphore from B, and B is waiting for A. As soon as A is set to the
> > same priority as C (after a couple of ipi_priority()), we rerun the
> > scheduler, see that C has a semaphore-yield (or eventually timeslice
> > expired) and so run A before C, and order is restored.
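
The A/B/C sequence above can be sketched as a toy model. This is only an
illustration under loud assumptions: toy_request, toy_propagate_once() and
toy_pick_first() are hypothetical names, not i915 code, and the tie-break on
submission order stands in for the semaphore-yield/timeslice step that
actually puts A back in front of C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy request: a priority, a submission order, one signaler. */
struct toy_request {
	int prio;
	int seqno;                    /* submission order, lower is earlier */
	struct toy_request *signaler; /* request we wait on, NULL if none */
};

/* One deferred step (think ipi_priority): lift the signaler to our prio. */
static bool toy_propagate_once(struct toy_request *rq)
{
	struct toy_request *s = rq->signaler;

	if (!s || s->prio >= rq->prio)
		return false;

	s->prio = rq->prio;
	return true;
}

/*
 * Choose between two requests on the same engine: higher priority wins,
 * equal priority falls back to submission order (assumed stand-in for the
 * reordering done on semaphore-yield or timeslice expiry).
 */
static struct toy_request *toy_pick_first(struct toy_request *x,
					  struct toy_request *y)
{
	if (x->prio != y->prio)
		return x->prio > y->prio ? x : y;

	return x->seqno < y->seqno ? x : y;
}

static void toy_demo(void)
{
	struct toy_request a = { .prio = 0, .seqno = 1, .signaler = NULL };
	struct toy_request b = { .prio = 0, .seqno = 2, .signaler = &a };
	struct toy_request c = { .prio = 0, .seqno = 3, .signaler = &b };

	c.prio = 10;                          /* bump C only */
	assert(toy_pick_first(&a, &c) == &c); /* window: C ahead of A */

	toy_propagate_once(&c);               /* B catches up */
	toy_propagate_once(&b);               /* A catches up */
	assert(toy_pick_first(&a, &c) == &a); /* order restored */
}
```

Between the bump and the second propagation step, C sorts ahead of A on the
shared engine; once A reaches the same priority, A is chosen first again.
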
> > 
> >>> I've tried to trick this into causing problems with the
> >>> i915_selftest/igt_schedule_cycle and gem_exec_schedule/noreorder.
> >>> Fortunately for my sanity, neither test has caught any problems.
> >>>
> >>> This is the handwaving part of removing the global lock.
> >>>
> >>>>> +     /*
> >>>>> +      * If we are setting the priority before being submitted, see if we
> >>>>> +      * can quickly adjust our own priority in-situ and avoid taking
> >>>>> +      * the contended engine->active.lock. If we need priority inheritance,
> >>>>> +      * take the slow route.
> >>>>> +      */
> >>>>> +     if (rq_prio(rq) == I915_PRIORITY_INVALID) {
> >>>>> +             struct i915_dependency *p;
> >>>>> +
> >>>>> +             rcu_read_lock();
> >>>>> +             for_each_signaler(p, rq) {
> >>>>> +                     struct i915_request *s =
> >>>>> +                             container_of(p->signaler, typeof(*s), sched);
> >>>>> +
> >>>>> +                     if (rq_prio(s) >= prio)
> >>>>> +                             continue;
> >>>>> +
> >>>>> +                     if (__i915_request_is_complete(s))
> >>>>> +                             continue;
> >>>>> +
> >>>>> +                     break;
> >>>>> +             }
> >>>>> +             rcu_read_unlock();
> >>>>
> >>>> Exit this loop with a first lower priority incomplete signaler. What
> >>>> does the block below then do? Feels like it needs a comment.
> >>>
> >>> I thought I had sufficiently explained that in the comment above.
> >>>
> >>> /* Update priority in place if no PI required */
> >>>>> +             if (&p->signal_link == &rq->sched.signalers_list &&
> >>>>> +                 cmpxchg(&rq->sched.attr.priority,
> >>>>> +                         I915_PRIORITY_INVALID,
> >>>>> +                         prio) == I915_PRIORITY_INVALID)
> >>>>> +                     return;
> >>>
> >>> It could do a few more tricks to change the priority in-place a second
> >>> time, but I did not think that would be frequent enough to matter.
> >>> Whereas we always adjust the priority from INVALID once before
> >>> submission, and avoiding taking the lock then does make a difference to
> >>> the profiles.
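
The "set once from INVALID without taking the lock" idea can be shown in
userspace with C11 atomics. This is a minimal sketch, not the driver code:
TOY_PRIORITY_INVALID and toy_set_initial_priority() are hypothetical names,
and atomic_compare_exchange_strong() stands in for the kernel's cmpxchg().

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed sentinel meaning "no priority assigned yet". */
#define TOY_PRIORITY_INVALID INT_MIN

/*
 * Lockless fast path: install new_prio only if the priority is still the
 * sentinel. Returns false if someone else already set a priority, in which
 * case the caller would have to fall back to the locked slow path.
 */
static bool toy_set_initial_priority(atomic_int *prio, int new_prio)
{
	int expected = TOY_PRIORITY_INVALID;

	return atomic_compare_exchange_strong(prio, &expected, new_prio);
}

static void toy_cmpxchg_demo(void)
{
	atomic_int prio;

	atomic_init(&prio, TOY_PRIORITY_INVALID);

	assert(toy_set_initial_priority(&prio, 5));  /* first set wins */
	assert(!toy_set_initial_priority(&prio, 7)); /* later set must go slow */
	assert(atomic_load(&prio) == 5);
}
```

Only the first transition away from the sentinel succeeds in place, which
matches the point that a second in-place change would need extra tricks.
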
> >>
> >> To start with, if p is NULL or un-initialized (can be, no?) then
> >> relationship of &p->signal_link to &rq->sched.signalers_list escapes me.
> > 
> > p is constrained to be a member of the signalers_list or its head.
> 
> Is it defined that list_for_each_entry exits with pos set? It is in the
> implementation, but I don't know why it would have to be. Could you
> change this to some form of list_empty or a descriptively named helper 
> for clarity?

It is as defined as the macro gets.

There's a list_entry_is_head(). That sounds new.

commit e130816164e244b692921de49771eeb28205152d
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Thu Oct 15 20:11:31 2020 -0700

    include/linux/list.h: add a macro to test if entry is pointing to the head
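
For illustration, here is roughly how the open-coded head comparison could
read with that helper. It is a userspace sketch under assumptions: the list
macros are cut-down re-implementations in the style of include/linux/list.h
(no RCU), and toy_dep and toy_no_pi_required() are hypothetical names rather
than anything in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
/* True once the cursor has wrapped around to the list head itself. */
#define list_entry_is_head(pos, head, member) (&(pos)->member == (head))
#define list_for_each_entry(pos, head, member) \
	for (pos = list_entry((head)->next, __typeof__(*pos), member); \
	     !list_entry_is_head(pos, head, member); \
	     pos = list_entry(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical stand-in for a signaler dependency. */
struct toy_dep {
	int prio;
	bool complete;
	struct list_head signal_link;
};

/*
 * Walk the signalers and stop at the first incomplete one with a lower
 * priority. If the walk ran off the end, the cursor sits on the head and
 * no priority inheritance is needed.
 */
static bool toy_no_pi_required(struct list_head *signalers, int prio)
{
	struct toy_dep *p;

	list_for_each_entry(p, signalers, signal_link) {
		if (p->prio >= prio)
			continue;
		if (p->complete)
			continue;
		break;
	}

	return list_entry_is_head(p, signalers, signal_link);
}

static void toy_list_demo(void)
{
	struct list_head head;
	struct toy_dep done = { .prio = 1, .complete = true };
	struct toy_dep busy = { .prio = 1, .complete = false };

	list_init(&head);
	assert(toy_no_pi_required(&head, 5)); /* empty list: cursor is head */

	list_add_tail(&done.signal_link, &head);
	assert(toy_no_pi_required(&head, 5)); /* completed signaler skipped */

	list_add_tail(&busy.signal_link, &head);
	assert(!toy_no_pi_required(&head, 5)); /* stopped early: PI needed */
}
```

As in the kernel macros, the cursor on exit is derived from the head via
container_of() and must only be compared, never dereferenced as an entry.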

-Chris