From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org
Cc: Ben Widawsky <ben@bwidawsk.net>,
	Eero Tamminen <eero.t.tamminen@intel.com>
Subject: Re: [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch
Date: Tue, 28 Nov 2017 17:15:52 +0000
Message-ID: <d4c3b2c1-1ebc-b140-7010-8aee44a58593@linux.intel.com>
In-Reply-To: <20171126122059.2556-2-chris@chris-wilson.co.uk>


On 26/11/2017 12:20, Chris Wilson wrote:
> Looking at the distribution of i915_wait_request for a set of GL
> benchmarks, we see:
> 
> broadwell# python bcc/tools/funclatency.py -u i915_wait_request
>     usecs               : count     distribution
>         0 -> 1          : 29184    |****************************************|
>         2 -> 3          : 5767     |*******                                 |
>         4 -> 7          : 3000     |****                                    |
>         8 -> 15         : 491      |                                        |
>        16 -> 31         : 140      |                                        |
>        32 -> 63         : 203      |                                        |
>        64 -> 127        : 543      |                                        |
>       128 -> 255        : 881      |*                                       |
>       256 -> 511        : 1209     |*                                       |
>       512 -> 1023       : 1739     |**                                      |
>      1024 -> 2047       : 22855    |*******************************         |
>      2048 -> 4095       : 1725     |**                                      |
>      4096 -> 8191       : 5813     |*******                                 |
>      8192 -> 16383      : 5348     |*******                                 |
>     16384 -> 32767      : 1000     |*                                       |
>     32768 -> 65535      : 4400     |******                                  |
>     65536 -> 131071     : 296      |                                        |
>    131072 -> 262143     : 225      |                                        |
>    262144 -> 524287     : 4        |                                        |
>    524288 -> 1048575    : 1        |                                        |
>   1048576 -> 2097151    : 1        |                                        |
>   2097152 -> 4194303    : 1        |                                        |
> 
> broxton# python bcc/tools/funclatency.py -u i915_wait_request
>     usecs               : count     distribution
>         0 -> 1          : 5523     |*************************************   |
>         2 -> 3          : 1340     |*********                               |
>         4 -> 7          : 2100     |**************                          |
>         8 -> 15         : 755      |*****                                   |
>        16 -> 31         : 211      |*                                       |
>        32 -> 63         : 53       |                                        |
>        64 -> 127        : 71       |                                        |
>       128 -> 255        : 113      |                                        |
>       256 -> 511        : 262      |*                                       |
>       512 -> 1023       : 358      |**                                      |
>      1024 -> 2047       : 1105     |*******                                 |
>      2048 -> 4095       : 848      |*****                                   |
>      4096 -> 8191       : 1295     |********                                |
>      8192 -> 16383      : 5894     |****************************************|
>     16384 -> 32767      : 4270     |****************************            |
>     32768 -> 65535      : 5622     |**************************************  |
>     65536 -> 131071     : 306      |**                                      |
>    131072 -> 262143     : 50       |                                        |
>    262144 -> 524287     : 76       |                                        |
>    524288 -> 1048575    : 34       |                                        |
>   1048576 -> 2097151    : 0        |                                        |
>   2097152 -> 4194303    : 1        |                                        |
> 
> Picking 20us for the context-switch busyspin has the dual advantage of
> catching most frequent short waits while avoiding the cost of a context
> switch. 20us is a typical latency of 2 context-switches, i.e. the cost
> of taking the sleep, without the secondary effects of cache flushing.

The next thing I wanted to ask about is the cumulative time spent
spinning versus the test duration, or in other words, the CPU usage
before and after this change.

And of course, was the benefit on benchmark results measurable, and if
so by how much? What does the perf per Watt say?
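
A rough first pass at these questions can be made from the quoted
histograms alone. The Python sketch below is illustrative only and not
part of the patch; the helper names (bucket_bounds, covered_fraction,
spin_budget_us) are hypothetical. It assumes that funclatency's power-of-2
buckets can be read as whole-call latencies, and, for the spin budget,
that each call spins at most once. The second assumption is a
simplification, since the context-switch spin may repeat after every
wakeup, so the budget is only a crude estimate and no substitute for
measuring CPU usage.

```python
# Bucket counts copied from the funclatency histograms quoted above.
# Bucket i covers 2**i -> 2**(i+1) - 1 usecs, except bucket 0 (0 -> 1).
BROADWELL = [29184, 5767, 3000, 491, 140, 203, 543, 881, 1209, 1739, 22855,
             1725, 5813, 5348, 1000, 4400, 296, 225, 4, 1, 1, 1]
BROXTON = [5523, 1340, 2100, 755, 211, 53, 71, 113, 262, 358, 1105,
           848, 1295, 5894, 4270, 5622, 306, 50, 76, 34, 0, 1]


def bucket_bounds(i):
    """Return the (low, high) usec bounds of histogram bucket i."""
    low = 0 if i == 0 else 2 ** i
    return low, 2 ** (i + 1) - 1


def covered_fraction(counts, spin_us):
    """Bound the fraction of calls whose latency is within spin_us.

    The lower bound counts buckets lying entirely at or below spin_us;
    the upper bound also counts the bucket that straddles spin_us.
    """
    total = sum(counts)
    lower = upper = 0
    for i, count in enumerate(counts):
        low, high = bucket_bounds(i)
        if high <= spin_us:
            lower += count
        if low <= spin_us:
            upper += count
    return lower / total, upper / total


def spin_budget_us(counts, spin_us):
    """Crude estimate of total spin time in usecs.

    ASSUMPTION: each call spins at most once, for no longer than
    min(its latency, spin_us). Repeated spins after wakeups are ignored.
    """
    budget = 0
    for i, count in enumerate(counts):
        _, high = bucket_bounds(i)
        budget += count * min(high + 1, spin_us)
    return budget


if __name__ == "__main__":
    for name, counts in (("broadwell", BROADWELL), ("broxton", BROXTON)):
        for spin_us in (2, 20):
            lower, upper = covered_fraction(counts, spin_us)
            budget_ms = spin_budget_us(counts, spin_us) / 1000.0
            print(f"{name} spin={spin_us}us: "
                  f"{lower:.1%}..{upper:.1%} of calls covered, "
                  f"spin budget <= {budget_ms:.1f} ms (single-spin model)")
```

Dividing the spin budget by the wall-clock duration of the benchmark run
would give an approximate CPU-usage figure under the same assumptions,
but the run duration is not stated in the commit message.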

Regards,

Tvrtko

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Sagar Kamble <sagar.a.kamble@intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> ---
>   drivers/gpu/drm/i915/Kconfig.profile | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index a1aed0e2aad5..c8fe5754466c 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -11,7 +11,7 @@ config DRM_I915_SPIN_REQUEST_IRQ
>   
>   config DRM_I915_SPIN_REQUEST_CS
>   	int
> -	default 2 # microseconds
> +	default 20 # microseconds
>   	help
>   	  After sleeping for a request (GPU operation) to complete, we will
>   	  be woken up on the completion of every request prior to the one
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
