* [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation
@ 2018-09-14  8:00 Chris Wilson
  2018-09-14  8:00 ` [PATCH 2/3] drm/i915: Flush the tasklet when checking for idle Chris Wilson
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Chris Wilson @ 2018-09-14  8:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

If we try and fail to allocate an i915_request, we apply some
backpressure on the clients to throttle the memory allocations coming
from i915.ko. Currently, we wait until completely idle, but this is far
too heavy and leads to some situations where the only escape is to
declare a client hung and reset the GPU. The intent is to only ratelimit
the allocation requests and to allow ourselves to recycle requests and
memory from any long queues built up by a client hog.

Although system memory is inherently a global resource, we don't
want to overly penalize an unlucky client by making it pay the price of
reaping a hog. To reduce the influence of one client on another, instead
of waiting for the entire GPU to idle, we can impose a barrier on the
local client.
(One end goal for request allocation is for scalability to many
concurrent allocators; simultaneous execbufs.)

To prevent ourselves from getting caught out by long running requests
(requests that may never finish without userspace intervention, whom we
are blocking) we need to impose a finite timeout, ideally shorter than
hangcheck. A long time ago Paul McKenney suggested that RCU users should
ratelimit themselves using judicious use of cond_synchronize_rcu(). This
gives us the opportunity to reduce our indefinite wait for the GPU to
idle to a wait for the RCU grace period of the previous allocation along
this timeline to expire, satisfying both the local and finite properties
we desire for our ratelimiting.
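
To illustrate the ratelimiting idea above, here is a minimal userspace
sketch in C. It is not kernel code and not the patch itself: the names
model_completed_gps, model_get_state(), model_cond_synchronize() and
struct model_request are hypothetical stand-ins for the RCU grace-period
state, get_state_synchronize_rcu(), cond_synchronize_rcu() and
i915_request, and the "wait" is simulated by bumping a counter. Each
request records a cookie when it is allocated; when a later allocation
on the same timeline fails, the allocator stalls only if the grace
period named by the previous request's cookie has not yet elapsed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (assumption): count of RCU grace periods completed so far. */
static unsigned long model_completed_gps;

/* Stand-in for get_state_synchronize_rcu(): a cookie naming the next
 * grace period that must complete after this point. */
static unsigned long model_get_state(void)
{
	return model_completed_gps + 1;
}

/* Models a grace period elapsing on its own as time passes. */
static void model_grace_period_elapses(void)
{
	model_completed_gps++;
}

/* Stand-in for cond_synchronize_rcu(): wait only if the grace period
 * named by the cookie has not completed. Returns true if we stalled. */
static bool model_cond_synchronize(unsigned long cookie)
{
	if (model_completed_gps >= cookie)
		return false;

	/* Stands in for synchronize_rcu(): wait for that grace period. */
	model_completed_gps = cookie;
	return true;
}

/* Hypothetical request carrying the cookie taken at allocation. */
struct model_request {
	unsigned long rcustate;
};

static void model_request_init(struct model_request *rq)
{
	rq->rcustate = model_get_state();
}

/* Slow path after a failed allocation: throttle against the previous
 * request on the same timeline, if there is one. */
static bool model_throttle(const struct model_request *last)
{
	return last ? model_cond_synchronize(last->rcustate) : false;
}
```

In this model a client only ever waits on its own previous request, and
the wait is bounded by one grace period rather than by the GPU going
idle.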

There are still a few global steps (reclaim not least amongst those!)
when we exhaust the immediate slab pool, but at least now the wait itself
is decoupled from struct_mutex for our glorious highly parallel future!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106680
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_request.c | 14 ++++++++------
 drivers/gpu/drm/i915/i915_request.h |  8 ++++++++
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 09ed48833b54..a492385b2089 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -732,13 +732,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	rq = kmem_cache_alloc(i915->requests,
 			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
 	if (unlikely(!rq)) {
+		i915_retire_requests(i915);
+
 		/* Ratelimit ourselves to prevent oom from malicious clients */
-		ret = i915_gem_wait_for_idle(i915,
-					     I915_WAIT_LOCKED |
-					     I915_WAIT_INTERRUPTIBLE,
-					     MAX_SCHEDULE_TIMEOUT);
-		if (ret)
-			goto err_unreserve;
+		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
+					 &i915->drm.struct_mutex);
+		if (rq)
+			cond_synchronize_rcu(rq->rcustate);
 
 		/*
 		 * We've forced the client to stall and catch up with whatever
@@ -758,6 +758,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 		}
 	}
 
+	rq->rcustate = get_state_synchronize_rcu();
+
 	INIT_LIST_HEAD(&rq->active_list);
 	rq->i915 = i915;
 	rq->engine = engine;
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 9898301ab7ef..7fa94b024968 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -100,6 +100,14 @@ struct i915_request {
 	struct i915_timeline *timeline;
 	struct intel_signal_node signaling;
 
+	/*
+	 * The rcu epoch of when this request was allocated. Used to judiciously
+	 * apply backpressure on future allocations to ensure that under
+	 * mempressure there is sufficient RCU ticks for us to reclaim our
+	 * RCU protected slabs.
+	 */
+	unsigned long rcustate;
+
 	/*
 	 * Fences for the various phases in the request's lifetime.
 	 *
-- 
2.19.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 2/3] drm/i915: Flush the tasklet when checking for idle
  2018-09-14  8:00 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
@ 2018-09-14  8:00 ` Chris Wilson
  2018-09-14 10:21   ` Tvrtko Ursulin
  2018-09-14  8:00 ` [PATCH 3/3] drm/i915/execlists: Reset CSB pointers on canceling requests (wedging) Chris Wilson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2018-09-14  8:00 UTC (permalink / raw)
  To: intel-gfx

In order to reduce latency when checking for idle we kick the tasklet
directly. Sometimes this is not enough, as the tasklet may be queued on
another cpu. To improve the accuracy of this idle-check (and so reduce
latency overall by avoiding another pass, or worse, declaring a timeout!),
wait for the tasklet to complete.
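
The effect of waiting for a tasklet that is running elsewhere can be
sketched in userspace C with a thread standing in for the other cpu.
This is only a model under stated assumptions: struct model_engine,
model_tasklet_fn() and model_tasklet_unlock_wait() are hypothetical
names, an atomic flag stands in for the tasklet's RUN state, and the
flag is set up front to represent a tasklet that is already mid-run on
another cpu when the idle check starts.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the idle check looks at. */
struct model_engine {
	atomic_bool tasklet_running; /* stand-in for the tasklet RUN bit */
	atomic_bool active;          /* stand-in for execlists.active */
};

/* The "tasklet" on another cpu: finish the work, then drop RUN. */
static void *model_tasklet_fn(void *arg)
{
	struct model_engine *e = arg;

	atomic_store(&e->active, false);
	atomic_store(&e->tasklet_running, false);
	return NULL;
}

/* Models tasklet_unlock_wait(): spin until the tasklet is not running. */
static void model_tasklet_unlock_wait(struct model_engine *e)
{
	while (atomic_load(&e->tasklet_running))
		sched_yield();
}

/* Idle check that flushes the tasklet before sampling 'active'. */
static bool model_engine_is_idle(struct model_engine *e)
{
	model_tasklet_unlock_wait(e);
	return !atomic_load(&e->active);
}

/* Start the tasklet on another thread while the engine still looks
 * active, then run the idle check against it. */
static bool model_check_idle_during_tasklet(void)
{
	struct model_engine e;
	pthread_t thread;
	bool idle;

	atomic_init(&e.tasklet_running, true);
	atomic_init(&e.active, true);
	if (pthread_create(&thread, NULL, model_tasklet_fn, &e))
		return false;

	idle = model_engine_is_idle(&e);
	pthread_join(thread, NULL);
	return idle;
}
```

Without the wait, the check in this model could sample 'active' before
the other thread has cleared it and report the engine as busy, which is
the extra pass (or timeout) the patch is trying to avoid.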

References: https://bugs.freedesktop.org/show_bug.cgi?id=107916
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
---
 drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 10cd051ba29e..217ed3ee1cab 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -990,6 +990,9 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 		}
 		local_bh_enable();
 
+		/* Otherwise flush the tasklet if it was on another cpu */
+		tasklet_unlock_wait(t);
+
 		if (READ_ONCE(engine->execlists.active))
 			return false;
 	}
-- 
2.19.0


* [PATCH 3/3] drm/i915/execlists: Reset CSB pointers on canceling requests (wedging)
  2018-09-14  8:00 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
  2018-09-14  8:00 ` [PATCH 2/3] drm/i915: Flush the tasklet when checking for idle Chris Wilson
@ 2018-09-14  8:00 ` Chris Wilson
  2018-09-14 14:17   ` Mika Kuoppala
  2018-09-14 10:02 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation Patchwork
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2018-09-14  8:00 UTC (permalink / raw)
  To: intel-gfx

The prior assumption was that we did not need to reset the CSB on
wedging when cancelling the outstanding requests as it would be cleaned
up in the subsequent reset prior to restarting the GPU. However, what
was not accounted for was that in performing the reset, we would try to
process the outstanding CSB entries. If the GPU happened to complete a
CS event just as we were performing the cancellation of requests, that
event would be kept in the CSB until the reset -- but our bookkeeping
was cleared, causing confusion when trying to complete the CS event.
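
The race described above can be modelled with a small ring buffer in
userspace C. This is a sketch under assumptions, not the driver's code:
struct model_csb, MODEL_CSB_ENTRIES and the model_*() helpers are
hypothetical, the ring size is arbitrary, and "confusion" is reduced to
processing an event while the bookkeeping says nothing is in flight.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CSB_ENTRIES 6 /* arbitrary ring size for the model */

/* Hypothetical model of the context-status buffer and bookkeeping. */
struct model_csb {
	unsigned int write;    /* advanced by the modelled GPU */
	unsigned int read;     /* advanced by the driver */
	unsigned int inflight; /* requests the driver believes are executing */
};

/* The GPU completes a CS event and records it in the ring. */
static void model_gpu_completes(struct model_csb *csb)
{
	csb->write = (csb->write + 1) % MODEL_CSB_ENTRIES;
}

/* Cancelling requests on wedging clears the driver's bookkeeping. */
static void model_cancel_requests(struct model_csb *csb)
{
	csb->inflight = 0;
}

/* Resetting the pointers discards any events left in the ring. */
static void model_sanitize(struct model_csb *csb)
{
	csb->read = csb->write;
}

/* Drain the ring. Returns false if an event turned up while the
 * bookkeeping claimed nothing was in flight. */
static bool model_process_csb(struct model_csb *csb)
{
	bool ok = true;

	while (csb->read != csb->write) {
		csb->read = (csb->read + 1) % MODEL_CSB_ENTRIES;
		if (csb->inflight == 0)
			ok = false;
		else
			csb->inflight--;
	}
	return ok;
}
```

In the model, an event that lands just before the cancellation is still
sitting in the ring when it is next drained; discarding it along with
the bookkeeping keeps the two consistent.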

v2: Use a sanitize on unwedge to avoid interfering with eio suspend
(where we intentionally disable GPU reset).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107925
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d6f2bbd6a0dc..c8020719bcfb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3438,6 +3438,9 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	i915_retire_requests(i915);
 	GEM_BUG_ON(i915->gt.active_requests);
 
+	if (!intel_gpu_reset(i915, ALL_ENGINES))
+		intel_engines_sanitize(i915);
+
 	/*
 	 * Undo nop_submit_request. We prevent all new i915 requests from
 	 * being queued (by disallowing execbuf whilst wedged) so having
-- 
2.19.0


* ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
  2018-09-14  8:00 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
  2018-09-14  8:00 ` [PATCH 2/3] drm/i915: Flush the tasklet when checking for idle Chris Wilson
  2018-09-14  8:00 ` [PATCH 3/3] drm/i915/execlists: Reset CSB pointers on canceling requests (wedging) Chris Wilson
@ 2018-09-14 10:02 ` Patchwork
  2018-09-14 10:09 ` [PATCH 1/3] " Tvrtko Ursulin
  2018-09-14 11:30 ` ✓ Fi.CI.IGT: success for series starting with [1/3] " Patchwork
  4 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2018-09-14 10:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
URL   : https://patchwork.freedesktop.org/series/49688/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4823 -> Patchwork_10187 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/49688/revisions/1/mbox/

== Known issues ==

  Here are the changes found in Patchwork_10187 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_getparams_basic@basic-subslice-total:
      fi-snb-2520m:       NOTRUN -> DMESG-WARN (fdo#103713) +10

    igt@gem_exec_suspend@basic-s3:
      fi-blb-e6850:       PASS -> INCOMPLETE (fdo#107718)

    igt@kms_frontbuffer_tracking@basic:
      fi-hsw-peppy:       PASS -> DMESG-WARN (fdo#102614)
      fi-byt-clapper:     PASS -> FAIL (fdo#103167)

    igt@kms_pipe_crc_basic@hang-read-crc-pipe-b:
      fi-byt-clapper:     PASS -> FAIL (fdo#107362, fdo#103191) +1

    igt@kms_psr@primary_page_flip:
      fi-kbl-r:           PASS -> FAIL (fdo#107336)

    
    ==== Possible fixes ====

    igt@drv_selftest@mock_hugepages:
      fi-bwr-2160:        DMESG-FAIL -> PASS

    igt@kms_flip@basic-flip-vs-dpms:
      fi-hsw-4770r:       DMESG-WARN (fdo#105602) -> PASS

    
  fdo#102614 https://bugs.freedesktop.org/show_bug.cgi?id=102614
  fdo#103167 https://bugs.freedesktop.org/show_bug.cgi?id=103167
  fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
  fdo#103713 https://bugs.freedesktop.org/show_bug.cgi?id=103713
  fdo#105602 https://bugs.freedesktop.org/show_bug.cgi?id=105602
  fdo#107336 https://bugs.freedesktop.org/show_bug.cgi?id=107336
  fdo#107362 https://bugs.freedesktop.org/show_bug.cgi?id=107362
  fdo#107718 https://bugs.freedesktop.org/show_bug.cgi?id=107718


== Participating hosts (47 -> 44) ==

  Additional (1): fi-snb-2520m 
  Missing    (4): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-hsw-4200u 


== Build changes ==

    * Linux: CI_DRM_4823 -> Patchwork_10187

  CI_DRM_4823: 0b1b218e81709c9930d44cb1afdff052442fc843 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4640: 9a8da36e708f9ed15b20689dfe305e41f9a19008 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10187: 3bfe8147b34102c706e64dd2b038d4aa9bf1429e @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

3bfe8147b341 drm/i915/execlists: Reset CSB pointers on canceling requests (wedging)
c30bb3a15a90 drm/i915: Flush the tasklet when checking for idle
d9c3e5d225d7 drm/i915: Limit the backpressure for i915_request allocation

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10187/issues.html

* Re: [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation
  2018-09-14  8:00 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
                   ` (2 preceding siblings ...)
  2018-09-14 10:02 ` ✓ Fi.CI.BAT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation Patchwork
@ 2018-09-14 10:09 ` Tvrtko Ursulin
  2018-09-14 11:30 ` ✓ Fi.CI.IGT: success for series starting with [1/3] " Patchwork
  4 siblings, 0 replies; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-09-14 10:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter


On 14/09/2018 09:00, Chris Wilson wrote:
> If we try and fail to allocate a i915_request, we apply some
> backpressure on the clients to throttle the memory allocations coming
> from i915.ko. Currently, we wait until completely idle, but this is far
> too heavy and leads to some situations where the only escape is to
> declare a client hung and reset the GPU. The intent is to only ratelimit
> the allocation requests and to allow ourselves to recycle requests and
> memory from any long queues built up by a client hog.
> 
> Although the system memory is inherently a global resources, we don't
> want to overly penalize an unlucky client to pay the price of reaping a
> hog. To reduce the influence of one client on another, we can instead of
> waiting for the entire GPU to idle, impose a barrier on the local client.
> (One end goal for request allocation is for scalability to many
> concurrent allocators; simultaneous execbufs.)
> 
> To prevent ourselves from getting caught out by long running requests
> (requests that may never finish without userspace intervention, whom we
> are blocking) we need to impose a finite timeout, ideally shorter than
> hangcheck. A long time ago Paul McKenney suggested that RCU users should
> ratelimit themselves using judicious use of cond_synchronize_rcu(). This
> gives us the opportunity to reduce our indefinite wait for the GPU to
> idle to a wait for the RCU grace period of the previous allocation along
> this timeline to expire, satisfying both the local and finite properties
> we desire for our ratelimiting.
> 
> There are still a few global steps (reclaim not least amongst those!)
> when we exhaust the immediate slab pool, at least now the wait is itself
> decoupled from struct_mutex for our glorious highly parallel future!
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106680
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   drivers/gpu/drm/i915/i915_request.c | 14 ++++++++------
>   drivers/gpu/drm/i915/i915_request.h |  8 ++++++++
>   2 files changed, 16 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 09ed48833b54..a492385b2089 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -732,13 +732,13 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	rq = kmem_cache_alloc(i915->requests,
>   			      GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
>   	if (unlikely(!rq)) {
> +		i915_retire_requests(i915);
> +
>   		/* Ratelimit ourselves to prevent oom from malicious clients */
> -		ret = i915_gem_wait_for_idle(i915,
> -					     I915_WAIT_LOCKED |
> -					     I915_WAIT_INTERRUPTIBLE,
> -					     MAX_SCHEDULE_TIMEOUT);
> -		if (ret)
> -			goto err_unreserve;
> +		rq = i915_gem_active_raw(&ce->ring->timeline->last_request,
> +					 &i915->drm.struct_mutex);
> +		if (rq)
> +			cond_synchronize_rcu(rq->rcustate);
>   
>   		/*
>   		 * We've forced the client to stall and catch up with whatever
> @@ -758,6 +758,8 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   		}
>   	}
>   
> +	rq->rcustate = get_state_synchronize_rcu();
> +
>   	INIT_LIST_HEAD(&rq->active_list);
>   	rq->i915 = i915;
>   	rq->engine = engine;
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 9898301ab7ef..7fa94b024968 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -100,6 +100,14 @@ struct i915_request {
>   	struct i915_timeline *timeline;
>   	struct intel_signal_node signaling;
>   
> +	/*
> +	 * The rcu epoch of when this request was allocated. Used to judiciously
> +	 * apply backpressure on future allocations to ensure that under
> +	 * mempressure there is sufficient RCU ticks for us to reclaim our
> +	 * RCU protected slabs.
> +	 */
> +	unsigned long rcustate;
> +
>   	/*
>   	 * Fences for the various phases in the request's lifetime.
>   	 *
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 2/3] drm/i915: Flush the tasklet when checking for idle
  2018-09-14  8:00 ` [PATCH 2/3] drm/i915: Flush the tasklet when checking for idle Chris Wilson
@ 2018-09-14 10:21   ` Tvrtko Ursulin
  2018-09-14 11:40     ` Chris Wilson
  0 siblings, 1 reply; 9+ messages in thread
From: Tvrtko Ursulin @ 2018-09-14 10:21 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 14/09/2018 09:00, Chris Wilson wrote:
> In order to reduce latency when checking for idle we kick the tasklet
> directly. Sometimes this is not enough as it is queued on another cpu
> and so to improve the accuracy of this idle-check (and so to reduce
> latency overall by avoiding another pass, or worse declaring a timeout!)
> wait for the tasklet to complete.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=107916
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Cc: Michel Thierry <michel.thierry@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 10cd051ba29e..217ed3ee1cab 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -990,6 +990,9 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
>   		}
>   		local_bh_enable();
>   
> +		/* Otherwise flush the tasklet if it was on another cpu */
> +		tasklet_unlock_wait(t);

That's one bizarre api! I was expecting it to mess up the state here but 
nope, apparently it is actually what one would expect to be named 
tasklet_sync.

> +
>   		if (READ_ONCE(engine->execlists.active))
>   			return false;
>   	}
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* ✓ Fi.CI.IGT: success for series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
  2018-09-14  8:00 [PATCH 1/3] drm/i915: Limit the backpressure for i915_request allocation Chris Wilson
                   ` (3 preceding siblings ...)
  2018-09-14 10:09 ` [PATCH 1/3] " Tvrtko Ursulin
@ 2018-09-14 11:30 ` Patchwork
  4 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2018-09-14 11:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/3] drm/i915: Limit the backpressure for i915_request allocation
URL   : https://patchwork.freedesktop.org/series/49688/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4823_full -> Patchwork_10187_full =

== Summary - SUCCESS ==

  No regressions found.

  

== Known issues ==

  Here are the changes found in Patchwork_10187_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_suspend@forcewake:
      shard-snb:          PASS -> INCOMPLETE (fdo#105411)

    igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-blt:
      shard-glk:          PASS -> DMESG-FAIL (fdo#106538) +1

    igt@kms_setmode@basic:
      shard-kbl:          PASS -> FAIL (fdo#99912)

    igt@kms_vblank@pipe-a-query-forked-hang:
      shard-glk:          PASS -> DMESG-WARN (fdo#106538, fdo#105763) +2

    
    ==== Possible fixes ====

    igt@drv_suspend@shrink:
      shard-kbl:          INCOMPLETE (fdo#106886, fdo#103665) -> PASS

    igt@kms_cursor_legacy@2x-long-cursor-vs-flip-legacy:
      shard-hsw:          FAIL (fdo#105767) -> PASS

    igt@kms_setmode@basic:
      shard-apl:          FAIL (fdo#99912) -> PASS

    
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#105411 https://bugs.freedesktop.org/show_bug.cgi?id=105411
  fdo#105763 https://bugs.freedesktop.org/show_bug.cgi?id=105763
  fdo#105767 https://bugs.freedesktop.org/show_bug.cgi?id=105767
  fdo#106538 https://bugs.freedesktop.org/show_bug.cgi?id=106538
  fdo#106886 https://bugs.freedesktop.org/show_bug.cgi?id=106886
  fdo#99912 https://bugs.freedesktop.org/show_bug.cgi?id=99912


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4823 -> Patchwork_10187

  CI_DRM_4823: 0b1b218e81709c9930d44cb1afdff052442fc843 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4640: 9a8da36e708f9ed15b20689dfe305e41f9a19008 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10187: 3bfe8147b34102c706e64dd2b038d4aa9bf1429e @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10187/shards.html

* Re: [PATCH 2/3] drm/i915: Flush the tasklet when checking for idle
  2018-09-14 10:21   ` Tvrtko Ursulin
@ 2018-09-14 11:40     ` Chris Wilson
  0 siblings, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2018-09-14 11:40 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-09-14 11:21:07)
> 
> On 14/09/2018 09:00, Chris Wilson wrote:
> > In order to reduce latency when checking for idle we kick the tasklet
> > directly. Sometimes this is not enough as it is queued on another cpu
> > and so to improve the accuracy of this idle-check (and so to reduce
> > latency overall by avoiding another pass, or worse declaring a timeout!)
> > wait for the tasklet to complete.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=107916
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> > Cc: Michel Thierry <michel.thierry@intel.com>
> > ---
> >   drivers/gpu/drm/i915/intel_engine_cs.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> > index 10cd051ba29e..217ed3ee1cab 100644
> > --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> > @@ -990,6 +990,9 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
> >               }
> >               local_bh_enable();
> >   
> > +             /* Otherwise flush the tasklet if it was on another cpu */
> > +             tasklet_unlock_wait(t);
> 
> That's one bizarre api! I was expecting it to mess up the state here but 
> nope, apparently it is actually what one would expect to be named 
> tasklet_sync.
> 
> > +
> >               if (READ_ONCE(engine->execlists.active))
> >                       return false;
> >       }
> > 
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Ta, first two applied. I'm in the market for a victim for number 3.
Mika!
-Chris

* Re: [PATCH 3/3] drm/i915/execlists: Reset CSB pointers on canceling requests (wedging)
  2018-09-14  8:00 ` [PATCH 3/3] drm/i915/execlists: Reset CSB pointers on canceling requests (wedging) Chris Wilson
@ 2018-09-14 14:17   ` Mika Kuoppala
  0 siblings, 0 replies; 9+ messages in thread
From: Mika Kuoppala @ 2018-09-14 14:17 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> The prior assumption was that we did not need to reset the CSB on
> wedging when cancelling the outstanding requests as it would be cleaned
> up in the subsequent reset prior to restarting the GPU. However, what
> was not accounted for was that in performing the reset, we would try to

'performing the reset' could be 'preparing engine for reset'

> process the outstanding CSB entries. If the GPU happened to complete a
> CS event just as we were performing the cancellation of requests, that
> event would be kept in the CSB until the reset -- but our bookkeeping
> was cleared, causing confusion when trying to complete the CS event.
>
> v2: Use a sanitize on unwedge to avoid interfering with eio suspend
> (where we intentionally disable GPU reset).
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107925
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

I was glad to notice that there were quality comments
on resetting/clearing the csb/ports.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>