All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: John Harrison <John.C.Harrison@Intel.com>
Cc: Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands
Date: Wed, 24 Jun 2015 14:45:33 +0200	[thread overview]
Message-ID: <20150624124533.GG25769@phenom.ffwll.local> (raw)
In-Reply-To: <558AA028.4090908@Intel.com>

On Wed, Jun 24, 2015 at 01:18:48PM +0100, John Harrison wrote:
> On 23/06/2015 21:00, Daniel Vetter wrote:
> >On Tue, Jun 23, 2015 at 04:43:24PM +0100, John Harrison wrote:
> >>On 23/06/2015 14:24, Daniel Vetter wrote:
> >>>On Tue, Jun 23, 2015 at 12:38:01PM +0100, John Harrison wrote:
> >>>>On 22/06/2015 21:12, Daniel Vetter wrote:
> >>>>>On Fri, Jun 19, 2015 at 05:34:12PM +0100, John.C.Harrison@Intel.com wrote:
> >>>>>>From: John Harrison <John.C.Harrison@Intel.com>
> >>>>>>
> >>>>>>It is a bad idea for i915_add_request() to fail. The work will already have been
> >>>>>>send to the ring and will be processed, but there will not be any tracking or
> >>>>>>management of that work.
> >>>>>>
> >>>>>>The only way the add request call can fail is if it can't write its epilogue
> >>>>>>commands to the ring (cache flushing, seqno updates, interrupt signalling). The
> >>>>>>reasons for that are mostly down to running out of ring buffer space and the
> >>>>>>problems associated with trying to get some more. This patch prevents that
> >>>>>>situation from happening in the first place.
> >>>>>>
> >>>>>>When a request is created, it marks sufficient space as reserved for the
> >>>>>>epilogue commands. Thus guaranteeing that by the time the epilogue is written,
> >>>>>>there will be plenty of space for it. Note that a ring_begin() call is required
> >>>>>>to actually reserve the space (and do any potential waiting). However, that is
> >>>>>>not currently done at request creation time. This is because the ring_begin()
> >>>>>>code can allocate a request. Hence calling begin() from the request allocation
> >>>>>>code would lead to infinite recursion! Later patches in this series remove the
> >>>>>>need for begin() to do the allocate. At that point, it becomes safe for the
> >>>>>>allocate to call begin() and really reserve the space.
> >>>>>>
> >>>>>>Until then, there is a potential for insufficient space to be available at the
> >>>>>>point of calling i915_add_request(). However, that would only be in the case
> >>>>>>where the request was created and immediately submitted without ever calling
> >>>>>>ring_begin() and adding any work to that request. Which should never happen. And
> >>>>>>even if it does, and if that request happens to fall down the tiny window of
> >>>>>>opportunity for failing due to being out of ring space then does it really
> >>>>>>matter because the request wasn't doing anything in the first place?
> >>>>>>
> >>>>>>v2: Updated the 'reserved space too small' warning to include the offending
> >>>>>>sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
> >>>>>>re-initialisation of tracking state after a buffer wrap to keep the sanity
> >>>>>>checks accurate.
> >>>>>>
> >>>>>>v3: Incremented the reserved size to accommodate Ironlake (after finally
> >>>>>>managing to run on an ILK system). Also fixed missing wrap code in LRC mode.
> >>>>>>
> >>>>>>v4: Added extra comment and removed duplicate WARN (feedback from Tomas).
> >>>>>>
> >>>>>>v5: Re-write of wrap handling to prevent unnecessary early wraps (feedback from
> >>>>>>Daniel Vetter).
> >>>>>This didn't actually implement what I suggested (wrapping is the worst
> >>>>>case, hence skipping the check for that is breaking the sanity check) and
> >>>>>so changed the patch from "correct, but a bit fragile" to broken. I've
> >>>>>merged the previous version instead.
> >>>>>-Daniel
> >>>>I'm confused. I thought your main issue was the early wrapping not the
> >>>>sanity check. The check is to ensure that the reservation is large enough to
> >>>>cover all the commands written during request submission. That should not be
> >>>>affected by whether a wrap occurs or not. Wrapping does not magically add an
> >>>>extra bunch of dwords to the emit_request() call. Whereas making the check
> >>>>work with the wrap condition requires adding in extra tracking state of
> >>>>exactly where the wrap occurred. That is extra code that only exists to
> >>>>catch something in the very rare case which should already have been caught
> >>>>in the very common case. I.e. if your reserved size is too small then you
> >>>>will hit the warning on every batch buffer submission.
> >>>The problem is that if you allow a wrap in the reserve size then the
> >>>ringspace requirements are bigger than if you don't wrap. And since the
> >>>add request is split up into many intel_ring_begin that's possible. Hence
> >>>if you allow wrapping in the reserved space, then the most important case
> >>>for the debug check is to make sure that it catches any kind of
> >>>reservation overflow while wrapping. The not-wrapped case is probably the
> >>>boring one.
> >>>
> >>>And indeed eventually we should overflow since according to your comment
> >>>the worst case add request on ilk is 136 dwords. And the largest
> >>>intel_ring_begin in there is 32 dwords, which means at most we'll throw
> >>>away 31 dwords when wrapping. Which means the 160 dwords of reservation
> >>>are not enough since we'd need 167 dwords of space for the worst case. But
> >>>since the space_end debug check was a no-op for the wrapped case you won't
> >>>catch this one.
> >>The minimum reservation size in this case is still only 136. The prepare
> >>code checks for the 32 words actually requested and wraps if necessary. It
> >>then checks for 136+32 words of space. If that would cause a wrap it will
> >>then add on the amount of space actually left in the ring and wait for that
> >>bigger total. That guarantees that it has waited for the 136 at the start of
> >>the ring. The caller is then free to fill in the 32 words and there is still
> >>guaranteed to be a minimum of 136 words available (with or without wrapping)
> >>before any further wait for space is necessary. Thus the add_request() code
> >>is safe from fear of failure irrespective of where any wrap might occur.
> >>
> >>
> >>>Wrt keeping track of wrapping while the reservation is in use, the
> >>>following should do that without any need of additional tracking:
> >>>
> >>>
> >>>	int used_size = ringbuf->tail - ringbuf->reserved_tail;
> >>>
> >>>	if (used_size < 0)
> >>>		used_size += ringbuf->size;
> >>>
> >>>	WARN(used_size < ringbuf->reserved_size,
> >>>	     "request reserved size too small: %d vs %d!\n",
> >>>	     used_size, ringbuf->reserved_size);
> >>>
> >>>I was mistaken that you can reuse __intel_ring_space (since that has
> >>>slightly different requirements), but this gives you a nicely localized
> >>>check for reservation overflow which works even when you wrap. Ofc it
> >>>won't work if an add_request is bigger than the entire ring, but that's
> >>>impossible anyway since we can at most reserve ringbuf->size -
> >>>I915_RING_FREE_SPACE.
> >>The problem with the above calculation is that it includes the wasted space
> >>at the end of the ring. Thus it will complain the reserved size was too
> >>small when in fact it was just fine.
> >Ok I again misunderstood your patch a bit since it didn't quite do what I
> >expect, and I stand corrected that v5 works too. But I still seem to fail
> >to get my main concern across. I'll see whether I can whip up a patch as a
> >short demonstration, maybe that helps to unconfuse this dicussion.
> >
> >For now I think we're covered with either v4 or v5 so sticking with either
> >is ok with me.
> >-Daniel
> 
> I think v5 is much better. It reduces the ring space wastage which I thought
> was your main concern.

Ok with me too - I simply didn't pick it up when merging yesterday because
I couldn't immediately convince myself it's correct, but really wanted to
pull in your series. Unfortunately it's now burried below piles of
patches, so can you please do a delta patch?

> The problem with a more simplistic approach that just doubles the minimum
> reserve size to ensure that it will fit before or after a wrap is that you
> are doubling the reserve size. That too is rather wasteful of ring space. It
> also means that you only find out when the reserve size is too small when
> you hit the maximum usage coincident with a worst case wrap point. Whereas
> the v5 method means that you notice a too small reserve whether wrapping or
> not.

We don't need to double the reservation since the add_request tail is
split up into many individual intel_ring_begin. And we'd only need to wrap
for the largest of those, which is substantially less than the entire
reservation. Furthermore with the reservation these commands can't ever
fail, so for those we know are only used in the add_request tail we could
go to a wrap-only intel_ring_begin which never waits and have one at a
dword cmd boundary. That means we'd need to overestimate the needed
ringbuffer space by just a few dwords (namely the size of the longest CS
cmd we emit under reservation). Which is around 6 dwords or so iirc. And
to avoid changing ilk we could just special case that in reserve_space().

In practice I don't think there would be any difference with your v5 since
especially with the scheduler we shouldn't ever overfill rings really. But
the clear upside is that the reserve_space_end check would be independent
of any implementation details of how reservation vs. wrapping is done
exactly. And hence robust against any future fumbles in this area. Looking
at our history of the relevant code we can expect a lot of those.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2015-06-24 12:42 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-29 16:43 [PATCH 00/55] Remove the outstanding_lazy_request John.C.Harrison
2015-05-29 16:43 ` [PATCH 01/55] drm/i915: Re-instate request->uniq becuase it is extremely useful John.C.Harrison
2015-06-03 11:14   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 02/55] drm/i915: Reserve ring buffer space for i915_add_request() commands John.C.Harrison
2015-06-02 18:14   ` Tomas Elf
2015-06-04 12:06   ` John.C.Harrison
2015-06-09 16:00     ` Tomas Elf
2015-06-18 12:10       ` John.C.Harrison
2015-06-17 14:04     ` Daniel Vetter
2015-06-18 10:43       ` John Harrison
2015-06-19 16:34   ` John.C.Harrison
2015-06-22 20:12     ` Daniel Vetter
2015-06-23 11:38       ` John Harrison
2015-06-23 13:24         ` Daniel Vetter
2015-06-23 15:43           ` John Harrison
2015-06-23 20:00             ` Daniel Vetter
2015-06-24 12:18               ` John Harrison
2015-06-24 12:45                 ` Daniel Vetter [this message]
2015-06-24 17:05                   ` John Harrison
2015-05-29 16:43 ` [PATCH 03/55] drm/i915: i915_add_request must not fail John.C.Harrison
2015-06-02 18:16   ` Tomas Elf
2015-06-04 14:07     ` John Harrison
2015-06-05 10:55       ` Tomas Elf
2015-06-23 10:16   ` Chris Wilson
2015-06-23 10:47     ` John Harrison
2015-05-29 16:43 ` [PATCH 04/55] drm/i915: Early alloc request in execbuff John.C.Harrison
2015-05-29 16:43 ` [PATCH 05/55] drm/i915: Set context in request from creation even in legacy mode John.C.Harrison
2015-05-29 16:43 ` [PATCH 06/55] drm/i915: Merged the many do_execbuf() parameters into a structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 07/55] drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters John.C.Harrison
2015-05-29 16:43 ` [PATCH 08/55] drm/i915: Update alloc_request to return the allocated request John.C.Harrison
2015-05-29 16:43 ` [PATCH 09/55] drm/i915: Add request to execbuf params and add explicit cleanup John.C.Harrison
2015-05-29 16:43 ` [PATCH 10/55] drm/i915: Update the dispatch tracepoint to use params->request John.C.Harrison
2015-05-29 16:43 ` [PATCH 11/55] drm/i915: Update move_to_gpu() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 12/55] drm/i915: Update execbuffer_move_to_active() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 13/55] drm/i915: Add flag to i915_add_request() to skip the cache flush John.C.Harrison
2015-06-02 18:19   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 14/55] drm/i915: Update i915_gpu_idle() to manage its own request John.C.Harrison
2015-05-29 16:43 ` [PATCH 15/55] drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring John.C.Harrison
2015-06-18 12:11   ` John.C.Harrison
2015-05-29 16:43 ` [PATCH 16/55] drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable() John.C.Harrison
2015-05-29 16:43 ` [PATCH 17/55] drm/i915: Don't tag kernel batches as user batches John.C.Harrison
2015-05-29 16:43 ` [PATCH 18/55] drm/i915: Add explicit request management to i915_gem_init_hw() John.C.Harrison
2015-06-02 18:20   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 19/55] drm/i915: Update ppgtt_init_ring() & context_enable() to take requests John.C.Harrison
2015-05-29 16:43 ` [PATCH 20/55] drm/i915: Update i915_switch_context() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 21/55] drm/i915: Update do_switch() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 22/55] drm/i915: Update deferred context creation to do explicit request management John.C.Harrison
2015-06-02 18:22   ` Tomas Elf
2015-05-29 16:43 ` [PATCH 23/55] drm/i915: Update init_context() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 24/55] drm/i915: Update render_state_init() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 25/55] drm/i915: Update i915_gem_object_sync() " John.C.Harrison
2015-06-02 18:26   ` Tomas Elf
2015-06-04 12:57     ` John Harrison
2015-06-18 12:14       ` John.C.Harrison
2015-06-18 12:21         ` Chris Wilson
2015-06-18 12:59           ` John Harrison
2015-06-18 14:24             ` Daniel Vetter
2015-06-18 15:39               ` Chris Wilson
2015-06-18 16:16                 ` John Harrison
2015-06-22 20:03                   ` Daniel Vetter
2015-06-22 20:14                     ` Chris Wilson
2015-06-18 16:36         ` 3.16 backlight kernel options Stéphane ANCELOT
2015-05-29 16:43 ` [PATCH 26/55] drm/i915: Update overlay code to do explicit request management John.C.Harrison
2015-05-29 16:43 ` [PATCH 27/55] drm/i915: Update queue_flip() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 28/55] drm/i915: Update add_request() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 29/55] drm/i915: Update [vma|object]_move_to_active() to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 30/55] drm/i915: Update l3_remap to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 31/55] drm/i915: Update mi_set_context() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 32/55] drm/i915: Update a bunch of execbuffer helpers to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 33/55] drm/i915: Update workarounds_emit() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 34/55] drm/i915: Update flush_all_caches() " John.C.Harrison
2015-05-29 16:43 ` [PATCH 35/55] drm/i915: Update switch_mm() to take a request structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 36/55] drm/i915: Update ring->flush() to take a requests structure John.C.Harrison
2015-05-29 16:43 ` [PATCH 37/55] drm/i915: Update some flush helpers to take request structures John.C.Harrison
2015-05-29 16:43 ` [PATCH 38/55] drm/i915: Update ring->emit_flush() to take a request structure John.C.Harrison
2015-05-29 16:44 ` [PATCH 39/55] drm/i915: Update ring->add_request() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 40/55] drm/i915: Update ring->emit_request() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 41/55] drm/i915: Update ring->dispatch_execbuffer() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 42/55] drm/i915: Update ring->emit_bb_start() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 43/55] drm/i915: Update ring->sync_to() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 44/55] drm/i915: Update ring->signal() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 45/55] drm/i915: Update cacheline_align() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 46/55] drm/i915: Update intel_ring_begin() " John.C.Harrison
2015-06-23 10:24   ` Chris Wilson
2015-06-23 10:37     ` John Harrison
2015-06-23 13:25       ` Daniel Vetter
2015-06-23 15:27         ` John Harrison
2015-06-23 15:34           ` Daniel Vetter
2015-05-29 16:44 ` [PATCH 47/55] drm/i915: Update intel_logical_ring_begin() " John.C.Harrison
2015-05-29 16:44 ` [PATCH 48/55] drm/i915: Add *_ring_begin() to request allocation John.C.Harrison
2015-06-17 13:31   ` Daniel Vetter
2015-06-17 14:27     ` Chris Wilson
2015-06-17 14:54       ` Daniel Vetter
2015-06-17 15:52         ` Chris Wilson
2015-06-18 11:21           ` John Harrison
2015-06-18 13:29             ` Daniel Vetter
2015-06-19 16:34               ` John Harrison
2015-05-29 16:44 ` [PATCH 49/55] drm/i915: Remove the now obsolete intel_ring_get_request() John.C.Harrison
2015-05-29 16:44 ` [PATCH 50/55] drm/i915: Remove the now obsolete 'outstanding_lazy_request' John.C.Harrison
2015-05-29 16:44 ` [PATCH 51/55] drm/i915: Move the request/file and request/pid association to creation time John.C.Harrison
2015-06-03 11:15   ` Tomas Elf
2015-05-29 16:44 ` [PATCH 52/55] drm/i915: Remove 'faked' request from LRC submission John.C.Harrison
2015-05-29 16:44 ` [PATCH 53/55] drm/i915: Update a bunch of LRC functions to take requests John.C.Harrison
2015-05-29 16:44 ` [PATCH 54/55] drm/i915: Remove the now obsolete 'i915_gem_check_olr()' John.C.Harrison
2015-06-02 18:27   ` Tomas Elf
2015-06-23 10:23   ` Chris Wilson
2015-06-23 10:39     ` John Harrison
2015-05-29 16:44 ` [PATCH 55/55] drm/i915: Rename the somewhat reduced i915_gem_object_flush_active() John.C.Harrison
2015-06-02 18:27   ` Tomas Elf
2015-06-17 14:06   ` Daniel Vetter
2015-06-17 14:21     ` Chris Wilson
2015-06-18 11:03       ` John Harrison
2015-06-18 11:10         ` Chris Wilson
2015-06-18 11:27           ` John Harrison
2015-06-18 10:57     ` John Harrison
2015-06-04 18:23 ` [PATCH 14/56] drm/i915: Make retire condition check for requests not objects John.C.Harrison
2015-06-04 18:24   ` John Harrison
2015-06-09 15:56   ` Tomas Elf
2015-06-17 15:01     ` Daniel Vetter
2015-06-22 21:04 ` [PATCH 00/55] Remove the outstanding_lazy_request Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150624124533.GG25769@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=Intel-GFX@Lists.FreeDesktop.Org \
    --cc=John.C.Harrison@Intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.