All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Gordon <david.s.gordon@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	intel-gfx@lists.freedesktop.org, "Rogozhkin,
	Dmitry V" <dmitry.v.rogozhkin@intel.com>,
	"Gong, Zhipeng" <zhipeng.gong@intel.com>,
	"Harrison, John C" <john.c.harrison@intel.com>
Subject: Re: [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd
Date: Tue, 8 Dec 2015 14:53:27 +0000	[thread overview]
Message-ID: <5666EEE7.7040106@intel.com> (raw)
In-Reply-To: <1448278471-31181-1-git-send-email-John.C.Harrison@Intel.com>

On 08/12/15 14:33, Chris Wilson wrote:
> On Tue, Dec 08, 2015 at 02:03:39PM +0000, Tvrtko Ursulin wrote:
>> On 08/12/15 10:44, Chris Wilson wrote:
>>> On Mon, Dec 07, 2015 at 03:08:28PM +0000, Tvrtko Ursulin wrote:
>>>> Equally, why wouldn't we wake up all waiters for which the requests
>>>> have been completed?
>>>
>>> Because we no longer track the requests to be completed, having moved to
>>> a chain of waiting processes instead of a chain of requests. I could
>>> insert a waitqueue into the intel_breadcrumb and that would indeed
>>> necessitate locking in the irq handler (and irq locks everywhere :(.
>>
>> You have a tree of seqnos each with a wait->task and current seqno
>> on the engine can be queried. So I don't see where is the problem?
>
> The "problem" is that every process will do its own post-irq seqno check
> after being woken up, then grab the common spinlock to remove itself
> from the tree.
>
> We could avoid that by using RB_NODE_EMPTY shortcircuiting I suppose.
>
>>>> Would be a cheap check here and it would save a cascading growing
>>>> latency as one task wakes up the next one.
>>>
>>> Well, it can't be here since we may remove_waiter after a signal
>>> (incomplete wait). So this part has to walk the chain of processes. Ugh,
>>> and have to move the waitqueue from one waiter to the next...
>>
>> Ok on interrupted waiters it makes no sense, but on normal waiter
>> removal it would just mean comparing engine->get_seqno() vs the
>> first waiter seqno and waking up all until the uncompleted one is
>> found?
>
> It's a tradeoff whether it can be written more neatly than:
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index be76086..474636f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1279,6 +1279,9 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>                          break;
>                  }
>
> +               if (RB_EMPTY_NODE(&wait.node))
> +                       break;
> +
>                  set_task_state(wait.task, state);
>
>                  /* Before we do the heavier coherent read of the seqno,
> @@ -1312,7 +1315,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>                          break;
>          }
>   out:
> -       intel_engine_remove_breadcrumb(req->ring, &wait);
> +       intel_engine_remove_breadcrumb(req->ring, &wait, ret);
>          __set_task_state(wait.task, TASK_RUNNING);
>          trace_i915_gem_request_wait_end(req);
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index ae3ee3c..421c214 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -169,10 +169,14 @@ void intel_engine_enable_fake_irq(struct intel_engine_cs *engine)
>   }
>
>   void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
> -                                   struct intel_breadcrumb *wait)
> +                                   struct intel_breadcrumb *wait,
> +                                   int ret)
>   {
>          struct intel_breadcrumbs *b = &engine->breadcrumbs;
>
> +       if (RB_EMPTY_NODE(&wait->node))
> +               return;
> +
>          spin_lock(&b->lock);
>
>          if (b->first_waiter == wait->task) {
> @@ -187,6 +191,18 @@ void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
>                   * completion check.
>                   */
>                  next = rb_next(&wait->node);
> +               if (ret == 0) {
> +                       u32 seqno = intel_ring_get_seqno(engine);
> +                       while (next &&
> +                              i915_seqno_passed(seqno,
> +                                                to_crumb(next)->seqno)) {
> +                               struct rb_node *n = rb_next(next);
> +                               wake_up_process(next->task);
> +                               rb_erase(next, &b->requests);
> +                               RB_CLEAR_NODE(next);
> +                               next = n;
> +                       }
> +               }
>                  task = next ? to_crumb(next)->task : NULL;
>
>                  b->first_waiter = task;
>
> My biggest complaint is that we are mixing request-complete and direct
> evaluation of i915_seqno_passed now. (I expect that we can tidy the code
> up somewhat.)
> -Chris

So how is this going to work in the new "struct fence" regime? (See John 
Harrison's RFC of 23rd November).

The fence structure contains the per-request completion indicator, so 
once a fence has been signalled the owner only needs to check that, and 
doesn't need to reevaluate any seqno comparisons after waking up.

Does that help?

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2015-12-08 14:53 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-29  8:47 i915_wait_request scaling Chris Wilson
2015-11-29  8:47 ` [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals Chris Wilson
2015-11-30 10:01   ` Tvrtko Ursulin
2015-11-29  8:48 ` [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms! Chris Wilson
2015-11-30 10:02   ` Tvrtko Ursulin
2015-11-30 10:08     ` Chris Wilson
2015-11-29  8:48 ` [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request Chris Wilson
2015-11-30 10:06   ` Tvrtko Ursulin
2015-12-01 15:47     ` Dave Gordon
2015-12-01 15:58       ` Chris Wilson
2015-12-01 16:44         ` Dave Gordon
2015-12-03  8:52           ` Daniel Vetter
2015-11-29  8:48 ` [PATCH 04/15] drm/i915: Cache the reset_counter for the request Chris Wilson
2015-12-01  8:31   ` Daniel Vetter
2015-12-01  8:47     ` Chris Wilson
2015-12-01  9:15       ` Chris Wilson
2015-12-01 11:05         ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
2015-12-01 11:05           ` [PATCH 2/3] drm/i915: Store the reset counter when constructing a request Chris Wilson
2015-12-03  8:59             ` Daniel Vetter
2015-12-01 11:05           ` [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
2015-12-03  9:14             ` Daniel Vetter
2015-12-03  9:41               ` Chris Wilson
2015-12-11  9:02               ` Chris Wilson
2015-12-11 16:46                 ` Daniel Vetter
2015-12-03  8:57           ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Daniel Vetter
2015-12-03  9:02             ` Chris Wilson
2015-12-03  9:20               ` Daniel Vetter
2015-11-29  8:48 ` [PATCH 05/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
2015-12-01  8:30   ` Daniel Vetter
2015-11-29  8:48 ` [PATCH 06/15] drm/i915: Delay queuing hangcheck to wait-request Chris Wilson
2015-11-29  8:48 ` [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request Chris Wilson
2015-11-30 10:14   ` Tvrtko Ursulin
2015-11-30 10:19     ` Chris Wilson
2015-11-30 10:27       ` Tvrtko Ursulin
2015-11-30 10:22   ` Chris Wilson
2015-11-30 10:28     ` Tvrtko Ursulin
2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
2015-11-30 10:53   ` Chris Wilson
2015-11-30 12:09     ` Tvrtko Ursulin
2015-11-30 12:38       ` Chris Wilson
2015-11-30 13:33         ` Tvrtko Ursulin
2015-11-30 14:30           ` Chris Wilson
2015-11-30 12:05   ` Tvrtko Ursulin
2015-11-30 12:30     ` Chris Wilson
2015-11-30 13:32       ` Tvrtko Ursulin
2015-11-30 14:18         ` Chris Wilson
2015-12-01 17:06           ` Dave Gordon
2015-11-30 14:26         ` Chris Wilson
2015-11-30 14:34   ` [PATCH v4] " Chris Wilson
2015-11-30 16:30     ` Chris Wilson
2015-11-30 16:40       ` Chris Wilson
2015-12-01 18:34     ` Dave Gordon
2015-12-03 16:22       ` [PATCH v7] " Chris Wilson
2015-12-07 15:08         ` Tvrtko Ursulin
2015-12-08 10:44           ` Chris Wilson
2015-12-08 14:03             ` Tvrtko Ursulin
2015-12-08 14:33               ` Chris Wilson
2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
2015-11-23 11:34                   ` [RFC 01/12] staging/android/sync: Support sync points created from dma-fences John.C.Harrison
2015-11-23 13:29                     ` Maarten Lankhorst
2015-11-23 13:31                     ` [Intel-gfx] " Tvrtko Ursulin
2015-11-23 11:34                   ` [RFC 02/12] staging/android/sync: add sync_fence_create_dma John.C.Harrison
2015-11-23 13:27                     ` Maarten Lankhorst
2015-11-23 13:38                       ` John Harrison
2015-11-23 13:44                         ` Tvrtko Ursulin
2015-11-23 13:48                           ` Maarten Lankhorst
2015-11-23 11:34                   ` [RFC 03/12] staging/android/sync: Move sync framework out of staging John.C.Harrison
2015-11-23 11:34                   ` [RFC 04/12] drm/i915: Convert requests to use struct fence John.C.Harrison
2015-11-23 11:34                   ` [RFC 05/12] drm/i915: Removed now redudant parameter to i915_gem_request_completed() John.C.Harrison
2015-11-23 11:34                   ` [RFC 06/12] drm/i915: Add per context timelines to fence object John.C.Harrison
2015-11-23 11:34                   ` [RFC 07/12] drm/i915: Delay the freeing of requests until retire time John.C.Harrison
2015-11-23 11:34                   ` [RFC 08/12] drm/i915: Interrupt driven fences John.C.Harrison
2015-12-11 12:17                     ` Tvrtko Ursulin
2015-11-23 11:34                   ` [RFC 09/12] drm/i915: Updated request structure tracing John.C.Harrison
2015-11-23 11:34                   ` [RFC 10/12] android/sync: Fix reversed sense of signaled fence John.C.Harrison
2015-11-23 11:34                   ` [RFC 11/12] drm/i915: Add sync framework support to execbuff IOCTL John.C.Harrison
2015-11-23 11:34                   ` [RFC 12/12] drm/i915: Cache last IRQ seqno to reduce IRQ overhead John.C.Harrison
2015-11-23 11:38                   ` [RFC 00/12] Convert requests to use struct fence John Harrison
2015-12-08 14:53                   ` Dave Gordon [this message]
2015-11-30 15:45   ` [PATCH] drm/i915: Convert trace-irq to the breadcrumb waiter Chris Wilson
2015-11-29  8:48 ` [PATCH 09/15] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
2015-11-29  8:48 ` [PATCH 10/15] drm/i915: Remove the lazy_coherency parameter from request-completed? Chris Wilson
2015-11-29  8:48 ` [PATCH 11/15] drm/i915: Use HWS for seqno tracking everywhere Chris Wilson
2015-11-29  8:48 ` [PATCH 12/15] drm/i915: Reduce seqno/irq barrier to a clflush on legacy gen6+ Chris Wilson
2015-11-29  8:48 ` [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation Chris Wilson
2015-12-01 16:57   ` Dave Gordon
2015-12-04  9:36     ` Daniel Vetter
2015-12-04  9:51       ` Chris Wilson
2015-11-29  8:48 ` [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time Chris Wilson
2015-11-30 10:19   ` Tvrtko Ursulin
2015-11-30 14:31     ` Chris Wilson
2015-11-29  8:48 ` [PATCH 15/15] drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno Chris Wilson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5666EEE7.7040106@intel.com \
    --to=david.s.gordon@intel.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=dmitry.v.rogozhkin@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=john.c.harrison@intel.com \
    --cc=tvrtko.ursulin@linux.intel.com \
    --cc=zhipeng.gong@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.