Re: [PATCH] drm/i915: Reserve space improvements

From: John Harrison <John.C.Harrison@Intel.com>
To: Tomas Elf <tomas.elf@intel.com>, Intel-GFX@Lists.FreeDesktop.Org
Subject: Re: [PATCH] drm/i915: Reserve space improvements
Date: Tue, 30 Jun 2015 16:53:39 +0100	[thread overview]
Message-ID: <5592BB83.3070001@Intel.com> (raw)
In-Reply-To: <5592AB2C.8050303@intel.com>

On 30/06/2015 15:43, Tomas Elf wrote:
> On 30/06/2015 12:40, John.C.Harrison@Intel.com wrote:
>> From: John Harrison <John.C.Harrison@Intel.com>
>>
>> An earlier patch was added to reserve space in the ring buffer for the
>> commands issued during 'add_request()'. The initial version was
>> pessimistic in the way it handled buffer wrapping and would cause
>> premature wraps and thus waste ring space.
>>
>> This patch updates the code to better handle the wrap case. It no
>> longer enforces that the space being asked for and the reserved space
>> are a single contiguous block. Instead, it allows the reserve to be on
>> the far end of a wrap operation. It still guarantees that the space is
>> available so when the wrap occurs, no wait will happen. Thus the wrap
>> cannot fail which is the whole point of the exercise.
>>
>> Also fixed a merge failure with some comments from the original patch.
>>
>> v2: Incorporated suggestion by David Gordon to move the wrap code
>> inside the prepare function and thus allow a single combined
>> wait_for_space() call rather than doing one before the wrap and
>> another after. This also makes the prepare code much simpler and
>> easier to follow.
>>
>> v3: Fix for 'effective_size' vs 'size' during ring buffer remainder
>> calculations (spotted by Tomas Elf).
>>
>> For: VIZ-5115
>> CC: Daniel Vetter <daniel@ffwll.ch>
>> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
>> ---
>>   drivers/gpu/drm/i915/intel_lrc.c        | 73 
>> +++++++++++++-------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.c | 90 
>> +++++++++++++++++++--------------
>>   drivers/gpu/drm/i915/intel_ringbuffer.h |  4 +-
>>   3 files changed, 90 insertions(+), 77 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index b36e064..7d9515d 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -663,12 +663,12 @@ static int logical_ring_wait_for_space(struct 
>> drm_i915_gem_request *req,
>>       unsigned space;
>>       int ret;
>>
>> -    /* The whole point of reserving space is to not wait! */
>> -    WARN_ON(ringbuf->reserved_in_use);
>> -
>>       if (intel_ring_space(ringbuf) >= bytes)
>>           return 0;
>>
>> +    /* The whole point of reserving space is to not wait! */
>> +    WARN_ON(ringbuf->reserved_in_use);
>> +
>>       list_for_each_entry(target, &ring->request_list, list) {
>>           /*
>>            * The request queue is per-engine, so can contain requests
>> @@ -718,22 +718,11 @@ intel_logical_ring_advance_and_submit(struct 
>> drm_i915_gem_request *request)
>>       execlists_context_queue(request);
>>   }
>>
>> -static int logical_ring_wrap_buffer(struct drm_i915_gem_request *req)
>> +static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
>>   {
>> -    struct intel_ringbuffer *ringbuf = req->ringbuf;
>>       uint32_t __iomem *virt;
>>       int rem = ringbuf->size - ringbuf->tail;
>>
>> -    /* Can't wrap if space has already been reserved! */
>> -    WARN_ON(ringbuf->reserved_in_use);
>> -
>> -    if (ringbuf->space < rem) {
>> -        int ret = logical_ring_wait_for_space(req, rem);
>> -
>> -        if (ret)
>> -            return ret;
>> -    }
>> -
>>       virt = ringbuf->virtual_start + ringbuf->tail;
>>       rem /= 4;
>>       while (rem--)
>> @@ -741,40 +730,50 @@ static int logical_ring_wrap_buffer(struct 
>> drm_i915_gem_request *req)
>>
>>       ringbuf->tail = 0;
>>       intel_ring_update_space(ringbuf);
>> -
>> -    return 0;
>>   }
>>
>>   static int logical_ring_prepare(struct drm_i915_gem_request *req, 
>> int bytes)
>>   {
>>       struct intel_ringbuffer *ringbuf = req->ringbuf;
>> -    int ret;
>> -
>> -    /*
>> -     * Add on the reserved size to the request to make sure that after
>> -     * the intended commands have been emitted, there is guaranteed to
>> -     * still be enough free space to send them to the hardware.
>> -     */
>> -    if (!ringbuf->reserved_in_use)
>> -        bytes += ringbuf->reserved_size;
>> -
>> -    if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>> -        ret = logical_ring_wrap_buffer(req);
>> -        if (unlikely(ret))
>> -            return ret;
>> +    int remain_usable = ringbuf->effective_size - ringbuf->tail;
>> +    int remain_actual = ringbuf->size - ringbuf->tail;
>
> You could add a comment here (and in the legacy implementation) 
> explaining why we make the distinction between remain_usable and 
> remain_actual. Or you could add the comment when we actually use 
> remain_actual further down in the function. It's up to you.
>
> We first need to be pessimistic about how much space is left in the 
> ring buffer when determining the need for wrapping - therefore we use 
> remain_usable, which disregards the end-of-buffer padding - and then 
> we need to be pessimistic about how much ring space we need to wait 
> for - therefore we use remain_actual, which takes the end-of-buffer 
> padding into consideration to make sure that we're actually not 
> waiting for too little space.

It's not about being pessimistic or optimistic. That implies there is 
some choice, that one version is a little bit better in one situation 
but either would do. Whereas this is about functional correctness. The 
difference between the actual ring size and the usable ring size is not 
some reserved space thing to make something go faster. This is about the 
hardware locking up if the entire buffer is used. I think 'usable' and 
'actual' are fairly obvious names. If you want to know the details about 
why there is an 'effective_size' in the first place then look up 
'effective_size' in the code and read the comment about i830 hangs.

> If you add those comments maybe you could also rename the variables to 
> something like "remaining_space_usable" etc. since "remain_usable" is 
> more of a verb than a noun. But that's seriously nitpicking.
Or maybe 
'remaining_space_that_is_usable_without_causing_an_i830_to_hang_because_it_skips_the_last_two_cachelines'? 
There is such a thing as being too verbose and making the code hard to read.

> If you at least add the comments to make it crystal clear why we do it 
> this way then I'm fine.

I don't see what comment could be added that would actually make things 
clearer without being a long and unnecessary description of the i830 
issue. There are two variables declared on consecutive lines that cache 
pretty simple calculations and have fairly clear names. It looks quite 
self explanatory to me! The code then tests to see if it can use the 
usable space, if not then it wraps around the actual buffer size. Again, 
seems pretty obvious as to what is going on and why - you can only use 
the usable bit, but when traversing the whole buffer you pretty 
obviously need to traverse the whole buffer not just the part that may 
or may not have been used.

>
> Reviewed-by: Tomas Elf <tomas.elf@intel.com>
>
> Thanks,
> Tomas
>
>> +    int ret, total_bytes, wait_bytes = 0;
>> +    bool need_wrap = false;
>>
>> -        if(ringbuf->reserved_size) {
>> -            uint32_t size = ringbuf->reserved_size;
>> +    if (ringbuf->reserved_in_use)
>> +        total_bytes = bytes;
>> +    else
>> +        total_bytes = bytes + ringbuf->reserved_size;
>>
>> -            intel_ring_reserved_space_cancel(ringbuf);
>> -            intel_ring_reserved_space_reserve(ringbuf, size);
>> +    if (unlikely(bytes > remain_usable)) {
>> +        /*
>> +         * Not enough space for the basic request. So need to flush
>> +         * out the remainder and then wait for base + reserved.
>> +         */
>> +        wait_bytes = remain_actual + total_bytes;
>> +        need_wrap = true;
>> +    } else {
>> +        if (unlikely(total_bytes > remain_usable)) {
>> +            /*
>> +             * The base request will fit but the reserved space
>> +             * falls off the end. So only need to to wait for the
>> +             * reserved size after flushing out the remainder.
>> +             */
>> +            wait_bytes = remain_actual + ringbuf->reserved_size;
>> +            need_wrap = true;
>> +        } else if (total_bytes > ringbuf->space) {
>> +            /* No wrapping required, just waiting. */
>> +            wait_bytes = total_bytes;
>>           }
>>       }
>>
>> -    if (unlikely(ringbuf->space < bytes)) {
>> -        ret = logical_ring_wait_for_space(req, bytes);
>> +    if (wait_bytes) {
>> +        ret = logical_ring_wait_for_space(req, wait_bytes);
>>           if (unlikely(ret))
>>               return ret;
>> +
>> +        if (need_wrap)
>> +            __wrap_ring_buffer(ringbuf);
>>       }
>>
>>       return 0;
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index af7c12e..e39c891 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -2121,12 +2121,12 @@ static int ring_wait_for_space(struct 
>> intel_engine_cs *ring, int n)
>>       unsigned space;
>>       int ret;
>>
>> -    /* The whole point of reserving space is to not wait! */
>> -    WARN_ON(ringbuf->reserved_in_use);
>> -
>>       if (intel_ring_space(ringbuf) >= n)
>>           return 0;
>>
>> +    /* The whole point of reserving space is to not wait! */
>> +    WARN_ON(ringbuf->reserved_in_use);
>> +
>>       list_for_each_entry(request, &ring->request_list, list) {
>>           space = __intel_ring_space(request->postfix, ringbuf->tail,
>>                          ringbuf->size);
>> @@ -2145,21 +2145,11 @@ static int ring_wait_for_space(struct 
>> intel_engine_cs *ring, int n)
>>       return 0;
>>   }
>>
>> -static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>> +static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
>>   {
>>       uint32_t __iomem *virt;
>> -    struct intel_ringbuffer *ringbuf = ring->buffer;
>>       int rem = ringbuf->size - ringbuf->tail;
>>
>> -    /* Can't wrap if space has already been reserved! */
>> -    WARN_ON(ringbuf->reserved_in_use);
>> -
>> -    if (ringbuf->space < rem) {
>> -        int ret = ring_wait_for_space(ring, rem);
>> -        if (ret)
>> -            return ret;
>> -    }
>> -
>>       virt = ringbuf->virtual_start + ringbuf->tail;
>>       rem /= 4;
>>       while (rem--)
>> @@ -2167,8 +2157,6 @@ static int intel_wrap_ring_buffer(struct 
>> intel_engine_cs *ring)
>>
>>       ringbuf->tail = 0;
>>       intel_ring_update_space(ringbuf);
>> -
>> -    return 0;
>>   }
>>
>>   int intel_ring_idle(struct intel_engine_cs *ring)
>> @@ -2238,9 +2226,21 @@ void intel_ring_reserved_space_use(struct 
>> intel_ringbuffer *ringbuf)
>>   void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
>>   {
>>       WARN_ON(!ringbuf->reserved_in_use);
>> -    WARN(ringbuf->tail > ringbuf->reserved_tail + 
>> ringbuf->reserved_size,
>> -         "request reserved size too small: %d vs %d!\n",
>> -         ringbuf->tail - ringbuf->reserved_tail, 
>> ringbuf->reserved_size);
>> +    if (ringbuf->tail > ringbuf->reserved_tail) {
>> +        WARN(ringbuf->tail > ringbuf->reserved_tail + 
>> ringbuf->reserved_size,
>> +             "request reserved size too small: %d vs %d!\n",
>> +             ringbuf->tail - ringbuf->reserved_tail, 
>> ringbuf->reserved_size);
>> +    } else {
>> +        /*
>> +         * The ring was wrapped while the reserved space was in use.
>> +         * That means that some unknown amount of the ring tail was
>> +         * no-op filled and skipped. Thus simply adding the ring size
>> +         * to the tail and doing the above space check will not work.
>> +         * Rather than attempt to track how much tail was skipped,
>> +         * it is much simpler to say that also skipping the sanity
>> +         * check every once in a while is not a big issue.
>> +         */
>> +    }
>>
>>       ringbuf->reserved_size   = 0;
>>       ringbuf->reserved_in_use = false;
>> @@ -2249,33 +2249,45 @@ void intel_ring_reserved_space_end(struct 
>> intel_ringbuffer *ringbuf)
>>   static int __intel_ring_prepare(struct intel_engine_cs *ring, int 
>> bytes)
>>   {
>>       struct intel_ringbuffer *ringbuf = ring->buffer;
>> -    int ret;
>> -
>> -    /*
>> -     * Add on the reserved size to the request to make sure that after
>> -     * the intended commands have been emitted, there is guaranteed to
>> -     * still be enough free space to send them to the hardware.
>> -     */
>> -    if (!ringbuf->reserved_in_use)
>> -        bytes += ringbuf->reserved_size;
>> -
>> -    if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>> -        ret = intel_wrap_ring_buffer(ring);
>> -        if (unlikely(ret))
>> -            return ret;
>> +    int remain_usable = ringbuf->effective_size - ringbuf->tail;
>> +    int remain_actual = ringbuf->size - ringbuf->tail;
>> +    int ret, total_bytes, wait_bytes = 0;
>> +    bool need_wrap = false;
>>
>> -        if(ringbuf->reserved_size) {
>> -            uint32_t size = ringbuf->reserved_size;
>> +    if (ringbuf->reserved_in_use)
>> +        total_bytes = bytes;
>> +    else
>> +        total_bytes = bytes + ringbuf->reserved_size;
>>
>> -            intel_ring_reserved_space_cancel(ringbuf);
>> -            intel_ring_reserved_space_reserve(ringbuf, size);
>> +    if (unlikely(bytes > remain_usable)) {
>> +        /*
>> +         * Not enough space for the basic request. So need to flush
>> +         * out the remainder and then wait for base + reserved.
>> +         */
>> +        wait_bytes = remain_actual + total_bytes;
>> +        need_wrap = true;
>> +    } else {
>> +        if (unlikely(total_bytes > remain_usable)) {
>> +            /*
>> +             * The base request will fit but the reserved space
>> +             * falls off the end. So only need to to wait for the
>> +             * reserved size after flushing out the remainder.
>> +             */
>> +            wait_bytes = remain_actual + ringbuf->reserved_size;
>> +            need_wrap = true;
>> +        } else if (total_bytes > ringbuf->space) {
>> +            /* No wrapping required, just waiting. */
>> +            wait_bytes = total_bytes;
>>           }
>>       }
>>
>> -    if (unlikely(ringbuf->space < bytes)) {
>> -        ret = ring_wait_for_space(ring, bytes);
>> +    if (wait_bytes) {
>> +        ret = ring_wait_for_space(ring, wait_bytes);
>>           if (unlikely(ret))
>>               return ret;
>> +
>> +        if (need_wrap)
>> +            __wrap_ring_buffer(ringbuf);
>>       }
>>
>>       return 0;
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> index 0e2bbc6..304cac4 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>> @@ -473,7 +473,6 @@ static inline u32 intel_ring_get_tail(struct 
>> intel_ringbuffer *ringbuf)
>>    * will always have sufficient room to do its stuff. The request 
>> creation
>>    * code calls this automatically.
>>    */
>> -int intel_ring_reserve_space(struct drm_i915_gem_request *request);
>>   void intel_ring_reserved_space_reserve(struct intel_ringbuffer 
>> *ringbuf, int size);
>>   /* Cancel the reservation, e.g. because the request is being 
>> discarded. */
>>   void intel_ring_reserved_space_cancel(struct intel_ringbuffer 
>> *ringbuf);
>> @@ -482,4 +481,7 @@ void intel_ring_reserved_space_use(struct 
>> intel_ringbuffer *ringbuf);
>>   /* Finish with the reserved space - for use by i915_add_request() 
>> only. */
>>   void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
>>
>> +/* Legacy ringbuffer specific portion of reservation code: */
>> +int intel_ring_reserve_space(struct drm_i915_gem_request *request);
>> +
>>   #endif /* _INTEL_RINGBUFFER_H_ */
>>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx