From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Vetter <daniel@ffwll.ch>
Subject: Re: [RFC 15/44] drm/i915: Added deferred work handler
	for scheduler
Date: Wed, 23 Jul 2014 20:50:05 +0200
Message-ID: <CAKMK7uE7WiWY-G+=R1WCzq1GgS+wC9iwU0acC6DYFvLKMJ089A@mail.gmail.com>
References: <1403803475-16337-1-git-send-email-John.C.Harrison@Intel.com>
 <1403803475-16337-16-git-send-email-John.C.Harrison@Intel.com>
 <20140707191435.GD5821@phenom.ffwll.local>
 <53CFD6C2.2010600@Intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <intel-gfx-bounces@lists.freedesktop.org>
Received: from mail-ig0-f175.google.com (mail-ig0-f175.google.com
 [209.85.213.175])
 by gabe.freedesktop.org (Postfix) with ESMTP id 0ABEE6E373
 for <Intel-GFX@lists.freedesktop.org>; Wed, 23 Jul 2014 11:50:05 -0700 (PDT)
Received: by mail-ig0-f175.google.com with SMTP id uq10so5582716igb.8
 for <Intel-GFX@lists.freedesktop.org>; Wed, 23 Jul 2014 11:50:05 -0700 (PDT)
In-Reply-To: <53CFD6C2.2010600@Intel.com>
List-Unsubscribe: <http://lists.freedesktop.org/mailman/options/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <http://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <http://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
To: John Harrison <John.C.Harrison@intel.com>
Cc: intel-gfx <Intel-GFX@lists.freedesktop.org>
List-Id: intel-gfx@lists.freedesktop.org

On Wed, Jul 23, 2014 at 5:37 PM, John Harrison
<John.C.Harrison@intel.com> wrote:
>>>   diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>> b/drivers/gpu/drm/i915/i915_drv.h
>>> index 0977653..fbafa68 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -1075,6 +1075,16 @@ struct i915_gem_mm {
>>>         struct delayed_work idle_work;
>>>         /**
>>> +        * New scheme is to get an interrupt after every work packet
>>> +        * in order to allow the low latency scheduling of pending
>>> +        * packets. The idea behind adding new packets to a pending
>>> +        * queue rather than directly into the hardware ring buffer
>>> +        * is to allow high priority packets to over take low priority
>>> +        * ones.
>>> +        */
>>> +       struct work_struct scheduler_work;
>>
>> Latency for work items isn't too awesome, and e.g. Oscar's execlist code
>> latches the next context right away from the irq handler. Why can't we do
>> something similar for the scheduler? Fishing the next item out of a
>> priority queue shouldn't be expensive ...
>> -Daniel
>
>
> The problem is that taking batch buffers from the scheduler's queue and
> submitting them to the hardware requires lots of processing that is not IRQ
> compatible. It isn't just a simple register write. Half of the code in
> 'i915_gem_do_execbuffer()' must be executed. Probably/possibly it could be
> made IRQ friendly but that would place a lot of restrictions on a lot of
> code that currently doesn't expect to be restricted. Instead, the submission
> is done via a work handler that acquires the driver mutex lock.
>
> In order to cover the extra latency, the scheduler operates in a
> multi-buffered mode and aims to keep eight batch buffers in flight at all
> times. That number being obtained empirically by running lots of benchmarks
> on Android with lots of different settings and seeing where the buffer size
> stopped making a difference.

So I've tried to stitch together that part of the scheduler from the
patch series. Afaics you do the actual scheduling under the protection
of irqsave spinlocks (well you also hold the dev->struct_mutex). That
means you disable local interrupts. Up to the actual submit point I
spotted two such critcial sections encompassing pretty much all the
code.

If we'd run the same code from the interrupt handler then only our own
interrupt handler is blocked, all other interrupt processing can
continue. So that's actually a lot nicer than what you have. In any
case you can't do expensive operations under an irqsave spinlock
anyway.

So either I've missed something big here, or this justification doesn't hold up.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch