* [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request
@ 2017-11-26 12:20 Chris Wilson
  2017-11-26 12:20 ` [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch Chris Wilson
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: Chris Wilson @ 2017-11-26 12:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Eero Tamminen

An interesting discussion regarding "hybrid interrupt polling" for NVMe
came to the conclusion that the ideal busyspin before sleeping was half
of the expected request latency (and better if it was already halfway
through that request). This suggested that we too should look again at
our tradeoff between spinning and waiting. Currently, our spin simply
tries to hide the cost of enabling the interrupt, which is good to avoid
penalising nop requests (i.e. test throughput) and not much else.
Studying real world workloads suggests that a spin of up to 500us can
dramatically boost performance, but the suggestion is that this is not
from avoiding interrupt latency per se, but from avoiding the secondary
effects of sleeping, such as allowing the CPU to drop into a deeper
C-state and to context switch away.
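
The spin-then-sleep tradeoff described above can be sketched in userspace C; the names and timings below are illustrative assumptions, not the i915 implementation:

```c
#define _POSIX_C_SOURCE 199309L

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

/*
 * Poll @done for up to @timeout_us before the caller falls back to an
 * interrupt-driven sleep. Returns true if the spin caught the completion,
 * i.e. the cost of setting up the interrupt was avoided entirely.
 */
static bool spin_before_sleep(atomic_bool *done, unsigned int timeout_us)
{
	uint64_t deadline = now_ns() + (uint64_t)timeout_us * 1000;

	do {
		if (atomic_load_explicit(done, memory_order_acquire))
			return true;
	} while (now_ns() < deadline);

	return false; /* caller now enables the IRQ and sleeps */
}
```

If the spin misses, the caller pays for the interrupt setup anyway, so the timeout is kept close to the latency it is trying to hide.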

v2: Expose the spin setting via Kconfig options for easier adjustment
and testing.
v3: Don't get caught sneaking in a change to the busyspin parameters.

Suggested-by: Sagar Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sagar Kamble <sagar.a.kamble@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/Kconfig            |  6 ++++++
 drivers/gpu/drm/i915/Kconfig.profile    | 23 +++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_request.c | 28 ++++++++++++++++++++++++----
 3 files changed, 53 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/Kconfig.profile

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index dfd95889f4b7..eae90783f8f9 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -131,3 +131,9 @@ depends on DRM_I915
 depends on EXPERT
 source drivers/gpu/drm/i915/Kconfig.debug
 endmenu
+
+menu "drm/i915 Profile Guided Optimisation"
+	visible if EXPERT
+	depends on DRM_I915
+	source drivers/gpu/drm/i915/Kconfig.profile
+endmenu
diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
new file mode 100644
index 000000000000..a1aed0e2aad5
--- /dev/null
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -0,0 +1,23 @@
+config DRM_I915_SPIN_REQUEST_IRQ
+	int
+	default 5 # microseconds
+	help
+	  Before sleeping while waiting for a request (GPU operation) to
+	  complete, we may spend some time polling for its completion. As the
+	  IRQ may take a non-negligible time to set up, we do a short spin
+	  first to check if the request will complete quickly.
+
+	  May be 0 to disable the initial spin.
+
+config DRM_I915_SPIN_REQUEST_CS
+	int
+	default 2 # microseconds
+	help
+	  After sleeping for a request (GPU operation) to complete, we will
+	  be woken up on the completion of every request prior to the one
+	  being waited on. For very short requests, going back to sleep and
+	  being woken up again may add considerably to the wakeup latency. To
+	  avoid incurring extra latency from the scheduler, we may choose to
+	  spin prior to sleeping again.
+
+	  May be 0 to disable spinning after being woken.
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index a90bdd26571f..7ac72a0a949c 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -1198,8 +1198,21 @@ long i915_wait_request(struct drm_i915_gem_request *req,
 	GEM_BUG_ON(!intel_wait_has_seqno(&wait));
 	GEM_BUG_ON(!i915_sw_fence_signaled(&req->submit));
 
-	/* Optimistic short spin before touching IRQs */
-	if (__i915_spin_request(req, wait.seqno, state, 5))
+	/* Optimistic spin before touching IRQs.
+	 *
+	 * We may use a rather large value here to offset the penalty of
+	 * switching away from the active task. Frequently, the client will
+	 * wait upon an old swapbuffer to throttle itself to remain within a
+	 * frame of the gpu. If the client is running in lockstep with the gpu,
+	 * then it should not be waiting long at all, and a sleep now will incur
+	 * extra scheduler latency in producing the next frame. So we spin
+	 * for longer to try and keep the client running.
+	 *
+	 * We need ~5us to enable the irq, ~20us to hide a context switch.
+	 */
+	if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ &&
+	    __i915_spin_request(req, wait.seqno, state,
+				CONFIG_DRM_I915_SPIN_REQUEST_IRQ))
 		goto complete;
 
 	set_current_state(state);
@@ -1255,8 +1268,15 @@ long i915_wait_request(struct drm_i915_gem_request *req,
 		    __i915_wait_request_check_and_reset(req))
 			continue;
 
-		/* Only spin if we know the GPU is processing this request */
-		if (__i915_spin_request(req, wait.seqno, state, 2))
+		/*
+		 * A quick spin now that we are on the CPU to offset the cost of
+		 * context switching away (and so spin for roughly the same as
+		 * the scheduler latency). We only spin if we know the GPU is
+		 * processing this request, and so likely to finish shortly.
+		 */
+		if (CONFIG_DRM_I915_SPIN_REQUEST_CS &&
+		    __i915_spin_request(req, wait.seqno, state,
+					CONFIG_DRM_I915_SPIN_REQUEST_CS))
 			break;
 
 		if (!intel_wait_check_request(&wait, req)) {
-- 
2.15.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch
  2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
@ 2017-11-26 12:20 ` Chris Wilson
  2017-11-28 17:15   ` Tvrtko Ursulin
  2017-11-26 12:41 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request Patchwork
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2017-11-26 12:20 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Eero Tamminen

Looking at the distribution of i915_wait_request for a set of GL
benchmarks, we see:

broadwell# python bcc/tools/funclatency.py -u i915_wait_request
   usecs               : count     distribution
       0 -> 1          : 29184    |****************************************|
       2 -> 3          : 5767     |*******                                 |
       4 -> 7          : 3000     |****                                    |
       8 -> 15         : 491      |                                        |
      16 -> 31         : 140      |                                        |
      32 -> 63         : 203      |                                        |
      64 -> 127        : 543      |                                        |
     128 -> 255        : 881      |*                                       |
     256 -> 511        : 1209     |*                                       |
     512 -> 1023       : 1739     |**                                      |
    1024 -> 2047       : 22855    |*******************************         |
    2048 -> 4095       : 1725     |**                                      |
    4096 -> 8191       : 5813     |*******                                 |
    8192 -> 16383      : 5348     |*******                                 |
   16384 -> 32767      : 1000     |*                                       |
   32768 -> 65535      : 4400     |******                                  |
   65536 -> 131071     : 296      |                                        |
  131072 -> 262143     : 225      |                                        |
  262144 -> 524287     : 4        |                                        |
  524288 -> 1048575    : 1        |                                        |
 1048576 -> 2097151    : 1        |                                        |
 2097152 -> 4194303    : 1        |                                        |

broxton# python bcc/tools/funclatency.py -u i915_wait_request
   usecs               : count     distribution
       0 -> 1          : 5523     |*************************************   |
       2 -> 3          : 1340     |*********                               |
       4 -> 7          : 2100     |**************                          |
       8 -> 15         : 755      |*****                                   |
      16 -> 31         : 211      |*                                       |
      32 -> 63         : 53       |                                        |
      64 -> 127        : 71       |                                        |
     128 -> 255        : 113      |                                        |
     256 -> 511        : 262      |*                                       |
     512 -> 1023       : 358      |**                                      |
    1024 -> 2047       : 1105     |*******                                 |
    2048 -> 4095       : 848      |*****                                   |
    4096 -> 8191       : 1295     |********                                |
    8192 -> 16383      : 5894     |****************************************|
   16384 -> 32767      : 4270     |****************************            |
   32768 -> 65535      : 5622     |**************************************  |
   65536 -> 131071     : 306      |**                                      |
  131072 -> 262143     : 50       |                                        |
  262144 -> 524287     : 76       |                                        |
  524288 -> 1048575    : 34       |                                        |
 1048576 -> 2097151    : 0        |                                        |
 2097152 -> 4194303    : 1        |                                        |

Picking 20us for the context-switch busyspin has the dual advantage of
catching the most frequent short waits while still avoiding the cost of
a context switch. 20us is roughly the latency of two context switches,
i.e. the cost of taking the sleep, without the secondary effects of
cache flushing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sagar Kamble <sagar.a.kamble@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.profile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
index a1aed0e2aad5..c8fe5754466c 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -11,7 +11,7 @@ config DRM_I915_SPIN_REQUEST_IRQ
 
 config DRM_I915_SPIN_REQUEST_CS
 	int
-	default 2 # microseconds
+	default 20 # microseconds
 	help
 	  After sleeping for a request (GPU operation) to complete, we will
 	  be woken up on the completion of every request prior to the one
-- 
2.15.0



* ✓ Fi.CI.BAT: success for series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request
  2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
  2017-11-26 12:20 ` [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch Chris Wilson
@ 2017-11-26 12:41 ` Patchwork
  2017-11-26 13:30 ` ✓ Fi.CI.IGT: " Patchwork
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-11-26 12:41 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request
URL   : https://patchwork.freedesktop.org/series/34404/
State : success

== Summary ==

Series 34404v1 series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request
https://patchwork.freedesktop.org/api/1.0/series/34404/revisions/1/mbox/

Test gem_ringfill:
        Subgroup basic-default-hang:
                dmesg-warn -> INCOMPLETE (fi-pnv-d510) fdo#101600

fdo#101600 https://bugs.freedesktop.org/show_bug.cgi?id=101600

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:445s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:458s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:380s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:537s
fi-bwr-2160      total:289  pass:183  dwarn:0   dfail:0   fail:0   skip:106 time:280s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:493s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:508s
fi-byt-j1900     total:289  pass:254  dwarn:0   dfail:0   fail:0   skip:35  time:490s
fi-byt-n2820     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:485s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:419s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:270s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:540s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:426s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:433s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:425s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:474s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:463s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:482s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:535s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:475s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:531s
fi-pnv-d510      total:147  pass:113  dwarn:0   dfail:0   fail:0   skip:33 
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:454s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:541s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:573s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:510s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:492s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:457s
fi-snb-2520m     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:558s
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:418s
Blacklisted hosts:
fi-cfl-s2        total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:610s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:556s
fi-glk-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:498s

b4a607bc6d5295f19d3bf15c6fb2024251dcc0ea drm-tip: 2017y-11m-25d-18h-37m-40s UTC integration manifest
1af026f891e0 drm/i915: Increase busyspin limit before a context-switch
d2dadfd2d1cd drm/i915: Expose the busyspin durations for i915_wait_request

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7292/


* ✓ Fi.CI.IGT: success for series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request
  2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
  2017-11-26 12:20 ` [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch Chris Wilson
  2017-11-26 12:41 ` ✓ Fi.CI.BAT: success for series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request Patchwork
@ 2017-11-26 13:30 ` Patchwork
  2017-11-27  7:20 ` [PATCH v3 1/2] " Sagar Arun Kamble
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-11-26 13:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v3,1/2] drm/i915: Expose the busyspin durations for i915_wait_request
URL   : https://patchwork.freedesktop.org/series/34404/
State : success

== Summary ==

Warning: bzip CI_DRM_3389/shard-glkb3/results31.json.bz2 wasn't in correct JSON format
Test kms_frontbuffer_tracking:
        Subgroup fbc-1p-offscren-pri-shrfb-draw-render:
                pass       -> FAIL       (shard-snb) fdo#101623 +1
Test kms_plane:
        Subgroup plane-panning-bottom-right-pipe-b-planes:
                skip       -> PASS       (shard-snb)
Test kms_universal_plane:
        Subgroup disable-primary-vs-flip-pipe-b:
                skip       -> PASS       (shard-snb)
Test kms_draw_crc:
        Subgroup draw-method-rgb565-mmap-cpu-xtiled:
                skip       -> PASS       (shard-snb)

fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623

shard-hsw        total:2662 pass:1535 dwarn:1   dfail:0   fail:10  skip:1116 time:9506s
shard-snb        total:2662 pass:1307 dwarn:1   dfail:0   fail:12  skip:1342 time:8079s
Blacklisted hosts:
shard-apl        total:2640 pass:1666 dwarn:1   dfail:0   fail:23  skip:949 time:13216s
shard-kbl        total:2662 pass:1802 dwarn:1   dfail:0   fail:24  skip:835 time:10725s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7292/shards.html


* Re: [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request
  2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
                   ` (2 preceding siblings ...)
  2017-11-26 13:30 ` ✓ Fi.CI.IGT: " Patchwork
@ 2017-11-27  7:20 ` Sagar Arun Kamble
  2017-11-27  7:25   ` Sagar Arun Kamble
  2017-11-27  9:39   ` Chris Wilson
  2017-11-27 10:10 ` [PATCH v4] " Chris Wilson
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 12+ messages in thread
From: Sagar Arun Kamble @ 2017-11-27  7:20 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Eero Tamminen, Ben Widawsky



On 11/26/2017 5:50 PM, Chris Wilson wrote:
> An interesting discussion regarding "hybrid interrupt polling" for NVMe
> came to the conclusion that the ideal busyspin before sleeping

I think the hybrid approach suggests a sleep (for 1/2 the duration) and then a busyspin (for the remaining 1/2)
for small I/O sizes, so this should be "busyspin after sleeping", although we are not doing exactly the same thing.
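
For contrast, the NVMe-style hybrid scheme referred to here could be sketched like this in userspace C (hypothetical names, not kernel code): sleep for roughly half the expected request latency first, then busy-poll for the remainder.

```c
#define _POSIX_C_SOURCE 199309L

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

static uint64_t ns_now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + ts.tv_nsec;
}

static void sleep_us(unsigned int us)
{
	struct timespec ts = { 0, (long)us * 1000 };

	nanosleep(&ts, NULL);
}

/* Sleep-then-spin: the opposite ordering to the patch's spin-then-sleep. */
static bool hybrid_wait(atomic_bool *done, unsigned int expected_us)
{
	uint64_t deadline = ns_now() + (uint64_t)expected_us * 1000;

	sleep_us(expected_us / 2);	/* sleep phase: cede the CPU */

	do {				/* poll phase: catch the completion */
		if (atomic_load_explicit(done, memory_order_acquire))
			return true;
	} while (ns_now() < deadline);

	return false;			/* give up and wait on the interrupt */
}
```

The scheme assumes a usable estimate of the request latency, which is the key difference from the patch, where no such estimate is available up front.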

>   was half
> of the expected request latency (and better if it was already halfway
> through that request). This suggested that we too should look again at
> our tradeoff between spinning and waiting. Currently, our spin simply
> tries to hide the cost of enabling the interrupt, which is good to avoid
> penalising nop requests (i.e. test throughput) and not much else.
> Studying real world workloads suggests that a spin of upto 500us can
> dramatically boost performance, but the suggestion is that this is not
> from avoiding interrupt latency per-se, but from secondary effects of
> sleeping such as allowing the CPU reduce cstate and context switch away.
>
> v2: Expose the spin setting via Kconfig options for easier adjustment
> and testing.
> v3: Don't get caught sneaking in a change to the busyspin parameters.
>
[snip]
> -	/* Optimistic short spin before touching IRQs */
> -	if (__i915_spin_request(req, wait.seqno, state, 5))
> +	/* Optimistic spin before touching IRQs.
> +	 *
> +	 * We may use a rather large value here to offset the penalty of
> +	 * switching away from the active task. Frequently, the client will
> +	 * wait upon an old swapbuffer to throttle itself to remain within a
> +	 * frame of the gpu. If the client is running in lockstep with the gpu,
> +	 * then it should not be waiting long at all, and a sleep now will incur
> +	 * extra scheduler latency in producing the next frame. So we sleep
> +	 * for longer to try and keep the client running.
> +	 *
> +	 * We need ~5us to enable the irq, ~20us to hide a context switch.

This comment fits more with next patch.

> +	 */
> +	if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ &&

If this is set to 0 we still want to sleep and not go to complete.

> +	    __i915_spin_request(req, wait.seqno, state,
> +				CONFIG_DRM_I915_SPIN_REQUEST_IRQ))
>   		goto complete;
>   
>   	set_current_state(state);
> @@ -1255,8 +1268,15 @@ long i915_wait_request(struct drm_i915_gem_request *req,
>   		    __i915_wait_request_check_and_reset(req))
>   			continue;
>   
> -		/* Only spin if we know the GPU is processing this request */
> -		if (__i915_spin_request(req, wait.seqno, state, 2))
> +		/*
> +		 * A quick spin now we are on the CPU to offset the cost of
> +		 * context switching away (and so spin for roughly the same as
> +		 * the scheduler latency). We only spin if we know the GPU is
> +		 * processing this request, and so likely to finish shortly.
> +		 */
> +		if (CONFIG_DRM_I915_SPIN_REQUEST_CS &&

Same here.

> +		    __i915_spin_request(req, wait.seqno, state,
> +					CONFIG_DRM_I915_SPIN_REQUEST_CS))
>   			break;
>   
>   		if (!intel_wait_check_request(&wait, req)) {



* Re: [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request
  2017-11-27  7:20 ` [PATCH v3 1/2] " Sagar Arun Kamble
@ 2017-11-27  7:25   ` Sagar Arun Kamble
  2017-11-27  9:39   ` Chris Wilson
  1 sibling, 0 replies; 12+ messages in thread
From: Sagar Arun Kamble @ 2017-11-27  7:25 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Eero Tamminen, Ben Widawsky



On 11/27/2017 12:50 PM, Sagar Arun Kamble wrote:
>
>
> On 11/26/2017 5:50 PM, Chris Wilson wrote:
>> An interesting discussion regarding "hybrid interrupt polling" for NVMe
>> came to the conclusion that the ideal busyspin before sleeping
[snip]
> This comment fits more with next patch.
>
>> +     */
>> +    if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ &&
>
> If this is set to 0 we still want to sleep and not go to complete.
Hit wicket :) ... Ignore this comment.
With the above comments on the commit message and the code comment
addressed, this change looks fine to me.
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
[snip]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request
  2017-11-27  7:20 ` [PATCH v3 1/2] " Sagar Arun Kamble
  2017-11-27  7:25   ` Sagar Arun Kamble
@ 2017-11-27  9:39   ` Chris Wilson
  1 sibling, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-11-27  9:39 UTC (permalink / raw)
  To: Sagar Arun Kamble, intel-gfx; +Cc: Eero

Quoting Sagar Arun Kamble (2017-11-27 07:20:01)
> 
> 
> On 11/26/2017 5:50 PM, Chris Wilson wrote:
> > An interesting discussion regarding "hybrid interrupt polling" for NVMe
> > came to the conclusion that the ideal busyspin before sleeping
> 
I think the hybrid approach suggests sleep (1/2 duration) and then busyspin (1/2 duration)
(for small I/O sizes), so this should be "busyspin after sleeping", although we are not doing exactly the same.

It does, and we are not. For reasons I thought I had described ... but not
here, apparently. We should include the differences between hybrid
interrupt polling and our approach in the comments as well.
 
> >   was half
> > of the expected request latency (and better if it was already halfway
> > through that request). This suggested that we too should look again at
> > our tradeoff between spinning and waiting. Currently, our spin simply
> > tries to hide the cost of enabling the interrupt, which is good to avoid
> > penalising nop requests (i.e. test throughput) and not much else.
> > Studying real world workloads suggests that a spin of up to 500us can
> > dramatically boost performance, but the suggestion is that this is not
> > from avoiding interrupt latency per se, but from secondary effects of
> > sleeping, such as allowing the CPU to reduce cstate and context switch away.
> >
> > v2: Expose the spin setting via Kconfig options for easier adjustment
> > and testing.
> > v3: Don't get caught sneaking in a change to the busyspin parameters.
> >
> > Suggested-by: Sagar Kamble <sagar.a.kamble@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Sagar Kamble <sagar.a.kamble@intel.com>
> > Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Ben Widawsky <ben@bwidawsk.net>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > Cc: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> >   drivers/gpu/drm/i915/Kconfig            |  6 ++++++
> >   drivers/gpu/drm/i915/Kconfig.profile    | 23 +++++++++++++++++++++++
> >   drivers/gpu/drm/i915/i915_gem_request.c | 28 ++++++++++++++++++++++++----
> >   3 files changed, 53 insertions(+), 4 deletions(-)
> >   create mode 100644 drivers/gpu/drm/i915/Kconfig.profile
> >
> > diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> > index dfd95889f4b7..eae90783f8f9 100644
> > --- a/drivers/gpu/drm/i915/Kconfig
> > +++ b/drivers/gpu/drm/i915/Kconfig
> > @@ -131,3 +131,9 @@ depends on DRM_I915
> >   depends on EXPERT
> >   source drivers/gpu/drm/i915/Kconfig.debug
> >   endmenu
> > +
> > +menu "drm/i915 Profile Guided Optimisation"
> > +     visible if EXPERT
> > +     depends on DRM_I915
> > +     source drivers/gpu/drm/i915/Kconfig.profile
> > +endmenu
> > diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> > new file mode 100644
> > index 000000000000..a1aed0e2aad5
> > --- /dev/null
> > +++ b/drivers/gpu/drm/i915/Kconfig.profile
> > @@ -0,0 +1,23 @@
> > +config DRM_I915_SPIN_REQUEST_IRQ
> > +     int
> > +     default 5 # microseconds
> > +     help
> > +       Before sleeping waiting for a request (GPU operation) to complete,
> > +       we may spend some time polling for its completion. As the IRQ may
> > +       take a non-negligible time to setup, we do a short spin first to
> > +       check if the request will complete quickly.
> > +
> > +       May be 0 to disable the initial spin.
> > +
> > +config DRM_I915_SPIN_REQUEST_CS
> > +     int
> > +     default 2 # microseconds
> > +     help
> > +       After sleeping for a request (GPU operation) to complete, we will
> > +       be woken up on the completion of every request prior to the one
> > +       being waited on. For very short requests, going back to sleep and
> > +       being woken up again may add considerably to the wakeup latency. To
> > +       avoid incurring extra latency from the scheduler, we may choose to
> > +       spin prior to sleeping again.
> > +
> > +       May be 0 to disable spinning after being woken.
> > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> > index a90bdd26571f..7ac72a0a949c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_request.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> > @@ -1198,8 +1198,21 @@ long i915_wait_request(struct drm_i915_gem_request *req,
> >       GEM_BUG_ON(!intel_wait_has_seqno(&wait));
> >       GEM_BUG_ON(!i915_sw_fence_signaled(&req->submit));
> >   
> > -     /* Optimistic short spin before touching IRQs */
> > -     if (__i915_spin_request(req, wait.seqno, state, 5))
> > +     /* Optimistic spin before touching IRQs.
> > +      *
> > +      * We may use a rather large value here to offset the penalty of
> > +      * switching away from the active task. Frequently, the client will
> > +      * wait upon an old swapbuffer to throttle itself to remain within a
> > +      * frame of the gpu. If the client is running in lockstep with the gpu,
> > +      * then it should not be waiting long at all, and a sleep now will incur
> > +      * extra scheduler latency in producing the next frame. So we sleep
> > +      * for longer to try and keep the client running.
> > +      *
> > +      * We need ~5us to enable the irq, ~20us to hide a context switch.
> 
> This comment fits more with next patch.

It's just a general overview of the sort of timescales we expect. I want
to explain the rationale behind having a spin and what spins we have in
mind. Maybe if I just add "upto"
-Chris


* [PATCH v4] drm/i915: Expose the busyspin durations for i915_wait_request
  2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
                   ` (3 preceding siblings ...)
  2017-11-27  7:20 ` [PATCH v3 1/2] " Sagar Arun Kamble
@ 2017-11-27 10:10 ` Chris Wilson
  2017-11-28 17:18   ` Tvrtko Ursulin
  2017-11-27 10:33 ` ✓ Fi.CI.BAT: success for series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request (rev2) Patchwork
  2017-11-27 12:07 ` ✗ Fi.CI.IGT: failure " Patchwork
  6 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2017-11-27 10:10 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Eero Tamminen

An interesting discussion regarding "hybrid interrupt polling" for NVMe
came to the conclusion that the ideal busyspin before sleeping was half
of the expected request latency (and better if it was already halfway
through that request). This suggested that we too should look again at
our tradeoff between spinning and waiting. Currently, our spin simply
tries to hide the cost of enabling the interrupt, which is good to avoid
penalising nop requests (i.e. test throughput) and not much else.
Studying real world workloads suggests that a spin of up to 500us can
dramatically boost performance, but the suggestion is that this is not
from avoiding interrupt latency per se, but from secondary effects of
sleeping, such as allowing the CPU to reduce cstate and context switch away.

In a truly hybrid interrupt polling scheme, we would aim to sleep until
just before the request completed and then wake up in advance of the
interrupt and do a quick poll to handle completion. This is tricky for
ourselves at the moment as we are not recording request times, and since
we allow preemption, our requests are not on as nicely ordered a
timeline as IO. However, the idea is interesting, for it will certainly
help us decide when busyspinning is worthwhile.
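As a rough sketch of the missing predictor (illustrative only; these names do not exist in i915), recording request durations for hybrid polling could be as simple as an integer EWMA over completed request times:

```c
#include <stdint.h>

/* Illustrative sketch, not part of this patch: the sort of lightweight
 * request-duration predictor hybrid polling would need. An
 * exponentially weighted moving average over completed request times
 * could supply the "sleep until just before completion" estimate. */

struct duration_ewma {
	uint64_t avg_ns;	/* current estimate, 0 until first sample */
};

static void ewma_record(struct duration_ewma *e, uint64_t sample_ns)
{
	if (!e->avg_ns)
		e->avg_ns = sample_ns;
	else	/* weight 7/8 old, 1/8 new: cheap integer arithmetic */
		e->avg_ns = (e->avg_ns * 7 + sample_ns) / 8;
}

/* Sleep target: half the expected latency, as the NVMe discussion
 * suggests, leaving the other half for the busy-poll. */
static uint64_t ewma_sleep_ns(const struct duration_ewma *e)
{
	return e->avg_ns / 2;
}
```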

v2: Expose the spin setting via Kconfig options for easier adjustment
and testing.
v3: Don't get caught sneaking in a change to the busyspin parameters.
v4: Explain more about the "hybrid interrupt polling" scheme that we
want to migrate towards.

Suggested-by: Sagar Kamble <sagar.a.kamble@intel.com>
References: http://events.linuxfoundation.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sagar Kamble <sagar.a.kamble@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Sagar Kamble <sagar.a.kamble@intel.com>
---
 drivers/gpu/drm/i915/Kconfig            |  6 +++++
 drivers/gpu/drm/i915/Kconfig.profile    | 26 ++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_request.c | 39 +++++++++++++++++++++++++++++----
 3 files changed, 67 insertions(+), 4 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/Kconfig.profile

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index dfd95889f4b7..eae90783f8f9 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -131,3 +131,9 @@ depends on DRM_I915
 depends on EXPERT
 source drivers/gpu/drm/i915/Kconfig.debug
 endmenu
+
+menu "drm/i915 Profile Guided Optimisation"
+	visible if EXPERT
+	depends on DRM_I915
+	source drivers/gpu/drm/i915/Kconfig.profile
+endmenu
diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
new file mode 100644
index 000000000000..8a230eeb98df
--- /dev/null
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -0,0 +1,26 @@
+config DRM_I915_SPIN_REQUEST_IRQ
+	int
+	default 5 # microseconds
+	help
+	  Before sleeping waiting for a request (GPU operation) to complete,
+	  we may spend some time polling for its completion. As the IRQ may
+	  take a non-negligible time to setup, we do a short spin first to
+	  check if the request will complete in the time it would have taken
+	  us to enable the interrupt.
+
+	  May be 0 to disable the initial spin. In practice, we estimate
+	  the cost of enabling the interrupt (if currently disabled) to be
+	  a few microseconds.
+
+config DRM_I915_SPIN_REQUEST_CS
+	int
+	default 2 # microseconds
+	help
+	  After sleeping for a request (GPU operation) to complete, we will
+	  be woken up on the completion of every request prior to the one
+	  being waited on. For very short requests, going back to sleep and
+	  being woken up again may add considerably to the wakeup latency. To
+	  avoid incurring extra latency from the scheduler, we may choose to
+	  spin prior to sleeping again.
+
+	  May be 0 to disable spinning after being woken.
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index a90bdd26571f..be84ea6a56d7 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -1198,8 +1198,32 @@ long i915_wait_request(struct drm_i915_gem_request *req,
 	GEM_BUG_ON(!intel_wait_has_seqno(&wait));
 	GEM_BUG_ON(!i915_sw_fence_signaled(&req->submit));
 
-	/* Optimistic short spin before touching IRQs */
-	if (__i915_spin_request(req, wait.seqno, state, 5))
+	/*
+	 * Optimistic spin before touching IRQs.
+	 *
+	 * We may use a rather large value here to offset the penalty of
+	 * switching away from the active task. Frequently, the client will
+	 * wait upon an old swapbuffer to throttle itself to remain within a
+	 * frame of the gpu. If the client is running in lockstep with the gpu,
+	 * then it should not be waiting long at all, and a sleep now will incur
+	 * extra scheduler latency in producing the next frame. To try to
+	 * avoid adding the cost of enabling/disabling the interrupt to the
+	 * short wait, we first spin to see if the request would have completed
+	 * in the time taken to setup the interrupt.
+	 *
+	 * We need up to 5us to enable the irq, and up to 20us to hide the
+	 * scheduler latency of a context switch, ignoring the secondary
+	 * impacts from a context switch such as cache eviction.
+	 *
+	 * The scheme used for low-latency IO is called "hybrid interrupt
+	 * polling". The suggestion there is to sleep until just before you
+	 * expect to be woken by the device interrupt and then poll for its
+	 * completion. That requires having a good predictor for the request
+	 * duration, which we currently lack.
+	 */
+	if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ &&
+	    __i915_spin_request(req, wait.seqno, state,
+				CONFIG_DRM_I915_SPIN_REQUEST_IRQ))
 		goto complete;
 
 	set_current_state(state);
@@ -1255,8 +1279,15 @@ long i915_wait_request(struct drm_i915_gem_request *req,
 		    __i915_wait_request_check_and_reset(req))
 			continue;
 
-		/* Only spin if we know the GPU is processing this request */
-		if (__i915_spin_request(req, wait.seqno, state, 2))
+		/*
+		 * A quick spin now we are on the CPU to offset the cost of
+		 * context switching away (and so spin for roughly the same as
+		 * the scheduler latency). We only spin if we know the GPU is
+		 * processing this request, and so likely to finish shortly.
+		 */
+		if (CONFIG_DRM_I915_SPIN_REQUEST_CS &&
+		    __i915_spin_request(req, wait.seqno, state,
+					CONFIG_DRM_I915_SPIN_REQUEST_CS))
 			break;
 
 		if (!intel_wait_check_request(&wait, req)) {
-- 
2.15.0



* ✓ Fi.CI.BAT: success for series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request (rev2)
  2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
                   ` (4 preceding siblings ...)
  2017-11-27 10:10 ` [PATCH v4] " Chris Wilson
@ 2017-11-27 10:33 ` Patchwork
  2017-11-27 12:07 ` ✗ Fi.CI.IGT: failure " Patchwork
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-11-27 10:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request (rev2)
URL   : https://patchwork.freedesktop.org/series/34404/
State : success

== Summary ==

Series 34404v2 series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request
https://patchwork.freedesktop.org/api/1.0/series/34404/revisions/2/mbox/

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> INCOMPLETE (fi-snb-2520m) fdo#103713
Test drv_module_reload:
        Subgroup basic-reload-inject:
                pass       -> DMESG-WARN (fi-bwr-2160) fdo#103923

fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
fdo#103923 https://bugs.freedesktop.org/show_bug.cgi?id=103923

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:442s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:462s
fi-blb-e6850     total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:378s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:527s
fi-bwr-2160      total:289  pass:182  dwarn:1   dfail:0   fail:0   skip:106 time:319s
fi-bxt-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:502s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:502s
fi-byt-j1900     total:289  pass:254  dwarn:0   dfail:0   fail:0   skip:35  time:489s
fi-byt-n2820     total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:481s
fi-elk-e7500     total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:425s
fi-gdg-551       total:289  pass:178  dwarn:1   dfail:0   fail:1   skip:109 time:262s
fi-glk-1         total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:539s
fi-hsw-4770      total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:420s
fi-hsw-4770r     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:433s
fi-ilk-650       total:289  pass:228  dwarn:0   dfail:0   fail:0   skip:61  time:427s
fi-ivb-3520m     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:483s
fi-ivb-3770      total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:454s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:478s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:534s
fi-kbl-7567u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:479s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:534s
fi-pnv-d510      total:289  pass:222  dwarn:1   dfail:0   fail:0   skip:66  time:578s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:448s
fi-skl-6600u     total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:541s
fi-skl-6700hq    total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:567s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:512s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:491s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:457s
fi-snb-2520m     total:246  pass:212  dwarn:0   dfail:0   fail:0   skip:33 
fi-snb-2600      total:289  pass:249  dwarn:0   dfail:0   fail:0   skip:40  time:418s
Blacklisted hosts:
fi-cfl-s2        total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:609s
fi-cnl-y         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:552s
fi-glk-dsi       total:289  pass:259  dwarn:0   dfail:0   fail:0   skip:30  time:490s

2ac27dcb0eb2b64104e890031705714051be9ef8 drm-tip: 2017y-11m-26d-19h-47m-14s UTC integration manifest
a6d6a0e50c9d drm/i915: Increase busyspin limit before a context-switch
37c966e049bd drm/i915: Expose the busyspin durations for i915_wait_request

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7301/


* ✗ Fi.CI.IGT: failure for series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request (rev2)
  2017-11-26 12:20 [PATCH v3 1/2] drm/i915: Expose the busyspin durations for i915_wait_request Chris Wilson
                   ` (5 preceding siblings ...)
  2017-11-27 10:33 ` ✓ Fi.CI.BAT: success for series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request (rev2) Patchwork
@ 2017-11-27 12:07 ` Patchwork
  6 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-11-27 12:07 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v4] drm/i915: Expose the busyspin durations for i915_wait_request (rev2)
URL   : https://patchwork.freedesktop.org/series/34404/
State : failure

== Summary ==

Test kms_frontbuffer_tracking:
        Subgroup fbc-1p-offscren-pri-shrfb-draw-render:
                fail       -> PASS       (shard-snb) fdo#101623 +1
        Subgroup fbc-1p-shrfb-fliptrack:
                pass       -> SKIP       (shard-snb) fdo#103167
Test kms_flip:
        Subgroup flip-vs-expired-vblank-interruptible:
                fail       -> PASS       (shard-hsw) fdo#102887
Test kms_chv_cursor_fail:
        Subgroup pipe-c-64x64-right-edge:
                pass       -> INCOMPLETE (shard-hsw)
Test drv_selftest:
        Subgroup mock_sanitycheck:
                dmesg-warn -> PASS       (shard-snb) fdo#103717
Test perf:
        Subgroup blocking:
                fail       -> PASS       (shard-hsw) fdo#102252
Test drv_module_reload:
        Subgroup basic-no-display:
                dmesg-warn -> PASS       (shard-snb) fdo#102707

fdo#101623 https://bugs.freedesktop.org/show_bug.cgi?id=101623
fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#103717 https://bugs.freedesktop.org/show_bug.cgi?id=103717
fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
fdo#102707 https://bugs.freedesktop.org/show_bug.cgi?id=102707

shard-hsw        total:2588 pass:1495 dwarn:1   dfail:0   fail:9   skip:1082 time:9231s
shard-snb        total:2662 pass:1306 dwarn:1   dfail:0   fail:12  skip:1343 time:8055s
Blacklisted hosts:
shard-apl        total:2640 pass:1664 dwarn:2   dfail:0   fail:23  skip:949 time:13116s
shard-kbl        total:2588 pass:1749 dwarn:9   dfail:0   fail:26  skip:804 time:10772s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7301/shards.html


* Re: [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch
  2017-11-26 12:20 ` [PATCH v3 2/2] drm/i915: Increase busyspin limit before a context-switch Chris Wilson
@ 2017-11-28 17:15   ` Tvrtko Ursulin
  0 siblings, 0 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2017-11-28 17:15 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Ben Widawsky, Eero Tamminen


On 26/11/2017 12:20, Chris Wilson wrote:
> Looking at the distribution of i915_wait_request for a set of GL
> benchmarks, we see:
> 
> broadwell# python bcc/tools/funclatency.py -u i915_wait_request
>     usecs               : count     distribution
>         0 -> 1          : 29184    |****************************************|
>         2 -> 3          : 5767     |*******                                 |
>         4 -> 7          : 3000     |****                                    |
>         8 -> 15         : 491      |                                        |
>        16 -> 31         : 140      |                                        |
>        32 -> 63         : 203      |                                        |
>        64 -> 127        : 543      |                                        |
>       128 -> 255        : 881      |*                                       |
>       256 -> 511        : 1209     |*                                       |
>       512 -> 1023       : 1739     |**                                      |
>      1024 -> 2047       : 22855    |*******************************         |
>      2048 -> 4095       : 1725     |**                                      |
>      4096 -> 8191       : 5813     |*******                                 |
>      8192 -> 16383      : 5348     |*******                                 |
>     16384 -> 32767      : 1000     |*                                       |
>     32768 -> 65535      : 4400     |******                                  |
>     65536 -> 131071     : 296      |                                        |
>    131072 -> 262143     : 225      |                                        |
>    262144 -> 524287     : 4        |                                        |
>    524288 -> 1048575    : 1        |                                        |
>   1048576 -> 2097151    : 1        |                                        |
>   2097152 -> 4194303    : 1        |                                        |
> 
> broxton# python bcc/tools/funclatency.py -u i915_wait_request
>     usecs               : count     distribution
>         0 -> 1          : 5523     |*************************************   |
>         2 -> 3          : 1340     |*********                               |
>         4 -> 7          : 2100     |**************                          |
>         8 -> 15         : 755      |*****                                   |
>        16 -> 31         : 211      |*                                       |
>        32 -> 63         : 53       |                                        |
>        64 -> 127        : 71       |                                        |
>       128 -> 255        : 113      |                                        |
>       256 -> 511        : 262      |*                                       |
>       512 -> 1023       : 358      |**                                      |
>      1024 -> 2047       : 1105     |*******                                 |
>      2048 -> 4095       : 848      |*****                                   |
>      4096 -> 8191       : 1295     |********                                |
>      8192 -> 16383      : 5894     |****************************************|
>     16384 -> 32767      : 4270     |****************************            |
>     32768 -> 65535      : 5622     |**************************************  |
>     65536 -> 131071     : 306      |**                                      |
>    131072 -> 262143     : 50       |                                        |
>    262144 -> 524287     : 76       |                                        |
>    524288 -> 1048575    : 34       |                                        |
>   1048576 -> 2097151    : 0        |                                        |
>   2097152 -> 4194303    : 1        |                                        |
> 
> Picking 20us for the context-switch busyspin has the dual advantage of
> catching most frequent short waits while avoiding the cost of a context
> switch. 20us is a typical latency of 2 context-switches, i.e. the cost
> of taking the sleep, without the secondary effects of cache flushing.

Next thing I wanted to ask is cumulative time spent spinning vs test 
duration, or in other words, CPU usage before and after.

And of course was the benefit on benchmarks results measurable, by how 
much, and what does the perf per Watt say?

Regards,

Tvrtko

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Sagar Kamble <sagar.a.kamble@intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> ---
>   drivers/gpu/drm/i915/Kconfig.profile | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> index a1aed0e2aad5..c8fe5754466c 100644
> --- a/drivers/gpu/drm/i915/Kconfig.profile
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -11,7 +11,7 @@ config DRM_I915_SPIN_REQUEST_IRQ
>   
>   config DRM_I915_SPIN_REQUEST_CS
>   	int
> -	default 2 # microseconds
> +	default 20 # microseconds
>   	help
>   	  After sleeping for a request (GPU operation) to complete, we will
>   	  be woken up on the completion of every request prior to the one
> 


* Re: [PATCH v4] drm/i915: Expose the busyspin durations for i915_wait_request
  2017-11-27 10:10 ` [PATCH v4] " Chris Wilson
@ 2017-11-28 17:18   ` Tvrtko Ursulin
  0 siblings, 0 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2017-11-28 17:18 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Ben Widawsky, Eero Tamminen


On 27/11/2017 10:10, Chris Wilson wrote:
> An interesting discussion regarding "hybrid interrupt polling" for NVMe
> came to the conclusion that the ideal busyspin before sleeping was half
> of the expected request latency (and better if it was already halfway
> through that request). This suggested that we too should look again at
> our tradeoff between spinning and waiting. Currently, our spin simply
> tries to hide the cost of enabling the interrupt, which is good to avoid
> penalising nop requests (i.e. test throughput) and not much else.
> Studying real world workloads suggests that a spin of up to 500us can
> dramatically boost performance, but the suggestion is that this is not
> from avoiding interrupt latency per se, but from secondary effects of
> sleeping, such as allowing the CPU to reduce cstate and context switch away.
> 
> In a truly hybrid interrupt polling scheme, we would aim to sleep until
> just before the request completed and then wake up in advance of the
> interrupt and do a quick poll to handle completion. This is tricky for
> ourselves at the moment as we are not recording request times, and since
> we allow preemption, our requests are not on as nicely ordered a
> timeline as IO. However, the idea is interesting, for it will certainly
> help us decide when busyspinning is worthwhile.
> 
> v2: Expose the spin setting via Kconfig options for easier adjustment
> and testing.
> v3: Don't get caught sneaking in a change to the busyspin parameters.
> v4: Explain more about the "hybrid interrupt polling" scheme that we
> want to migrate towards.
> 
> Suggested-by: Sagar Kamble <sagar.a.kamble@intel.com>
> References: http://events.linuxfoundation.org/sites/events/files/slides/lemoal-nvme-polling-vault-2017-final_0.pdf
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Sagar Kamble <sagar.a.kamble@intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Ben Widawsky <ben@bwidawsk.net>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> Reviewed-by: Sagar Kamble <sagar.a.kamble@intel.com>
> ---
>   drivers/gpu/drm/i915/Kconfig            |  6 +++++
>   drivers/gpu/drm/i915/Kconfig.profile    | 26 ++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_request.c | 39 +++++++++++++++++++++++++++++----
>   3 files changed, 67 insertions(+), 4 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/Kconfig.profile
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> index dfd95889f4b7..eae90783f8f9 100644
> --- a/drivers/gpu/drm/i915/Kconfig
> +++ b/drivers/gpu/drm/i915/Kconfig
> @@ -131,3 +131,9 @@ depends on DRM_I915
>   depends on EXPERT
>   source drivers/gpu/drm/i915/Kconfig.debug
>   endmenu
> +
> +menu "drm/i915 Profile Guided Optimisation"
> +	visible if EXPERT
> +	depends on DRM_I915
> +	source drivers/gpu/drm/i915/Kconfig.profile
> +endmenu
> diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile
> new file mode 100644
> index 000000000000..8a230eeb98df
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/Kconfig.profile
> @@ -0,0 +1,26 @@
> +config DRM_I915_SPIN_REQUEST_IRQ
> +	int
> +	default 5 # microseconds
> +	help
> +	  Before sleeping waiting for a request (GPU operation) to complete,
> +	  we may spend some time polling for its completion. As the IRQ may
> +	  take a non-negligible time to setup, we do a short spin first to
> +	  check if the request will complete in the time it would have taken
> +	  us to enable the interrupt.
> +
> +	  May be 0 to disable the initial spin. In practice, we estimate
> +	  the cost of enabling the interrupt (if currently disabled) to be
> +	  a few microseconds.
> +
> +config DRM_I915_SPIN_REQUEST_CS
> +	int
> +	default 2 # microseconds
> +	help
> +	  After sleeping for a request (GPU operation) to complete, we will
> +	  be woken up on the completion of every request prior to the one
> +	  being waited on. For very short requests, going back to sleep and
> +	  being woken up again may add considerably to the wakeup latency. To
> +	  avoid incurring extra latency from the scheduler, we may choose to
> +	  spin prior to sleeping again.
> +
> +	  May be 0 to disable spinning after being woken.
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index a90bdd26571f..be84ea6a56d7 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -1198,8 +1198,32 @@ long i915_wait_request(struct drm_i915_gem_request *req,
>   	GEM_BUG_ON(!intel_wait_has_seqno(&wait));
>   	GEM_BUG_ON(!i915_sw_fence_signaled(&req->submit));
>   
> -	/* Optimistic short spin before touching IRQs */
> -	if (__i915_spin_request(req, wait.seqno, state, 5))
> +	/*
> +	 * Optimistic spin before touching IRQs.
> +	 *
> +	 * We may use a rather large value here to offset the penalty of
> +	 * switching away from the active task. Frequently, the client will
> +	 * wait upon an old swapbuffer to throttle itself to remain within a
> +	 * frame of the gpu. If the client is running in lockstep with the gpu,
> +	 * then it should not be waiting long at all, and a sleep now will incur
> +	 * extra scheduler latency in producing the next frame. To try to
> +	 * avoid adding the cost of enabling/disabling the interrupt to the
> +	 * short wait, we first spin to see if the request would have completed
> +	 * in the time taken to setup the interrupt.
> +	 *
> +	 * We need up to 5us to enable the irq, and up to 20us to hide the
> +	 * scheduler latency of a context switch, ignoring the secondary
> +	 * impacts from a context switch such as cache eviction.
> +	 *
> +	 * The scheme used for low-latency IO is called "hybrid interrupt
> +	 * polling". The suggestion there is to sleep until just before you
> +	 * expect to be woken by the device interrupt and then poll for its
> +	 * completion. That requires having a good predictor for the request
> +	 * duration, which we currently lack.
> +	 */
> +	if (CONFIG_DRM_I915_SPIN_REQUEST_IRQ &&
> +	    __i915_spin_request(req, wait.seqno, state,
> +				CONFIG_DRM_I915_SPIN_REQUEST_IRQ))
>   		goto complete;
>   
>   	set_current_state(state);
> @@ -1255,8 +1279,15 @@ long i915_wait_request(struct drm_i915_gem_request *req,
>   		    __i915_wait_request_check_and_reset(req))
>   			continue;
>   
> -		/* Only spin if we know the GPU is processing this request */
> -		if (__i915_spin_request(req, wait.seqno, state, 2))
> +		/*
> +		 * A quick spin now we are on the CPU to offset the cost of
> +		 * context switching away (and so spin for roughly the same as
> +		 * the scheduler latency). We only spin if we know the GPU is
> +		 * processing this request, and so likely to finish shortly.
> +		 */
> +		if (CONFIG_DRM_I915_SPIN_REQUEST_CS &&
> +		    __i915_spin_request(req, wait.seqno, state,
> +					CONFIG_DRM_I915_SPIN_REQUEST_CS))
>   			break;
>   
>   		if (!intel_wait_check_request(&wait, req)) {
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko