All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Sagar Arun Kamble <sagar.a.kamble@intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: Eero@freedesktop.org
Subject: Re: [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request
Date: Mon, 27 Nov 2017 09:39:44 +0000	[thread overview]
Message-ID: <151177558460.28317.17370751551160535157@mail.alporthouse.com> (raw)
In-Reply-To: <b583c05f-9df2-8f7f-7ab4-d02812a3e71e@intel.com>

Quoting Sagar Arun Kamble (2017-11-27 07:20:01)
> 
> 
> On 11/26/2017 5:50 PM, Chris Wilson wrote:
> > An interesting discussion regarding "hybrid interrupt polling" for NVMe
> > came to the conclusion that the ideal busyspin before sleeping
> 
> I think hybrid approach suggests sleep (1/2 duration) and then busyspin (1/2 duration)
> (for small I/O size), so this should be "busyspin after sleeping" although we are not doing exactly same.

It does, we are not. For reasons I thought I had described ... But not
here apparently. Differences between hybrid interrupt polling and
ourselves we should include in the comments as well.
 
> >   was half
> > of the expected request latency (and better if it was already halfway
> > through that request). This suggested that we too should look again at
> > our tradeoff between spinning and waiting. Currently, our spin simply
> > tries to hide the cost of enabling the interrupt, which is good to avoid
> > penalising nop requests (i.e. test throughput) and not much else.
> > Studying real world workloads suggests that a spin of upto 500us can
> > dramatically boost performance, but the suggestion is that this is not
> > from avoiding interrupt latency per-se, but from secondary effects of
> > sleeping such as allowing the CPU reduce cstate and context switch away.
> >
> > v2: Expose the spin setting via Kconfig options for easier adjustment
> > and testing.
> > v3: Don't get caught sneaking in a change to the busyspin parameters.
> >
> > Suggested-by: Sagar Kamble <sagar.a.kamble@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Sagar Kamble <sagar.a.kamble@intel.com>
> > Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> >   drivers/gpu/drm/i915/Kconfig            |  6 ++++++
> >   drivers/gpu/drm/i915/Kconfig.profile    | 23 +++++++++++++++++++++++
> >   drivers/gpu/drm/i915/i915_gem_request.c | 28 ++++++++++++++++++++++++----
> >   3 files changed, 53 insertions(+), 4 deletions(-)
> >   create mode 100644 drivers/gpu/drm/i915/Kconfig.profile
> >
> > diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> > index dfd95889f4b7..eae90783f8f9 100644
> > --- a/drivers/gpu/drm/i915/Kconfig
> > +++ b/drivers/gpu/drm/i915/Kconfig
> > @@ -131,3 +131,9 @@ depends on DRM_I915
> >   depends on EXPERT
> >   source drivers/gpu/drm/i915/Kconfig.debug
> >   endmenu
> > +
> > +menu "drm/i915 Profile Guided Optimisation"
> > +     visible if EXPERT
> > +     depends on DRM_I915
> > +     source drivers/gpu/drm/i915/Kconfig.profile
> > +endmenu
> > diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> > new file mode 100644
> > index 000000000000..a1aed0e2aad5
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/Kconfig.profile
> > @@ -0,0 +1,23 @@
> > +config DRM_I915_SPIN_REQUEST_IRQ
> > +     int
> > +     default 5 # microseconds
> > +     help
> > +       Before sleeping waiting for a request (GPU operation) to complete,
> > +       we may spend some time polling for its completion. As the IRQ may
> > +       take a non-negligible time to setup, we do a short spin first to
> > +       check if the request will complete quickly.
> > +
> > +       May be 0 to disable the initial spin.
> > +
> > +config DRM_I915_SPIN_REQUEST_CS
> > +     int
> > +     default 2 # microseconds
> > +     help
> > +       After sleeping for a request (GPU operation) to complete, we will
> > +       be woken up on the completion of every request prior to the one
> > +       being waited on. For very short requests, going back to sleep and
> > +       be woken up again may add considerably to the wakeup latency. To
> > +       avoid incurring extra latency from the scheduler, we may choose to
> > +       spin prior to sleeping again.
> > +
> > +       May be 0 to disable spinning after being woken.
> > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > index a90bdd26571f..7ac72a0a949c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > @@ -1198,8 +1198,21 @@ long i915_wait_request(struct drm_i915_gem_request *req,
> >       GEM_BUG_ON(!intel_wait_has_seqno(&wait));
> >       GEM_BUG_ON(!i915_sw_fence_signaled(&req->submit));
> >   
> > -     /* Optimistic short spin before touching IRQs */
> > -     if (__i915_spin_request(req, wait.seqno, state, 5))
> > +     /* Optimistic spin before touching IRQs.
> > +      *
> > +      * We may use a rather large value here to offset the penalty of
> > +      * switching away from the active task. Frequently, the client will
> > +      * wait upon an old swapbuffer to throttle itself to remain within a
> > +      * frame of the gpu. If the client is running in lockstep with the gpu,
> > +      * then it should not be waiting long at all, and a sleep now will incur
> > +      * extra scheduler latency in producing the next frame. So we sleep
> > +      * for longer to try and keep the client running.
> > +      *
> > +      * We need ~5us to enable the irq, ~20us to hide a context switch.
> 
> This comment fits more with next patch.

It just a general overview of the short of timescales we expect. I want
to explain the rationale behind having a spin and what spins we have in
mind. Maybe if I just add "upto"
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2017-11-27  9:39 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
2017-11-26 12:20 ` [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch Chris Wilson
2017-11-28 17:15   ` Tvrtko Ursulin
2017-11-26 12:41 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request Patchwork
2017-11-26 13:30 ` ✓ Fi.CI.IGT: " Patchwork
2017-11-27  7:20 ` [PATCH v3 1/2] " Sagar Arun Kamble
2017-11-27  7:25   ` Sagar Arun Kamble
2017-11-27  9:39   ` Chris Wilson [this message]
2017-11-27 10:10 ` [PATCH v4] " Chris Wilson
2017-11-28 17:18   ` Tvrtko Ursulin
2017-11-27 10:33 ` ✓ Fi.CI.BAT: success for series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request (rev2) Patchwork
2017-11-27 12:07 ` ✗ Fi.CI.IGT: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=151177558460.28317.17370751551160535157@mail.alporthouse.com \
    --to=chris@chris-wilson.co.uk \
    --cc=Eero@freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=sagar.a.kamble@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.