* [PATCH] drm/i915: Priority boost for locked waits
@ 2017-01-18 16:41 Chris Wilson
  2017-01-18 16:53 ` [PATCH v2] " Chris Wilson
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2017-01-18 16:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

We always try to do an unlocked wait before resorting to a blocking
wait under the mutex, so we very rarely have to sleep under the
struct_mutex. However, when we do, we want that wait to be as short as
possible, as the struct_mutex is our BKL and will stall the driver and
all clients.

There should be no impact on typical workloads.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c1c2765bb8d0..47d17b227a05 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -395,6 +395,15 @@ i915_gem_object_wait_fence(struct dma_fence *fence,
 			rps = NULL;
 	}
 
+	/* Very rarely do we wait whilst holding the mutex. We try to always
+	 * do an unlocked wait before using a locked wait. However, when we
+	 * have to resort to a locked wait, we want that wait to be as short
+	 * as possible as the struct_mutex is our BKL that will stall the
+	 * driver and all clients.
+	 */
+	if (flags & I915_WAIT_LOCKED && rq->engine->schedule)
+		rq->engine->schedule(rq, I915_PRIORITY_MAX);
+
 	timeout = i915_wait_request(rq, flags, timeout);
 
 out:
-- 
2.11.0
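
For illustration, the two-stage pattern the commit message refers to is
sketched below. The wrapper function is hypothetical; only
i915_wait_request(), the I915_WAIT_* flags and I915_PRIORITY_MAX are taken
from the driver code shown in this thread, and the locking detail is
simplified.

	/*
	 * Hypothetical caller sketch (not from the patch): prefer an
	 * unlocked wait, only then fall back to waiting with struct_mutex
	 * held. In the real driver the second wait is usually on work
	 * queued after the mutex was taken; the same request is reused
	 * here purely for brevity.
	 */
	static long example_wait_then_lock(struct drm_i915_gem_request *rq,
					   struct drm_i915_private *i915)
	{
		long ret;

		/* Preferred path: sleep without holding struct_mutex. */
		ret = i915_wait_request(rq, I915_WAIT_INTERRUPTIBLE,
					MAX_SCHEDULE_TIMEOUT);
		if (ret < 0)
			return ret;

		/*
		 * Rare fallback: wait under struct_mutex. The branch added
		 * by this patch bumps the request to I915_PRIORITY_MAX here,
		 * so the mutex is held for as short a time as possible.
		 */
		mutex_lock(&i915->drm.struct_mutex);
		ret = i915_wait_request(rq,
					I915_WAIT_INTERRUPTIBLE |
					I915_WAIT_LOCKED,
					MAX_SCHEDULE_TIMEOUT);
		mutex_unlock(&i915->drm.struct_mutex);

		return ret;
	}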


* [PATCH v2] drm/i915: Priority boost for locked waits
  2017-01-18 16:41 [PATCH] drm/i915: Priority boost for locked waits Chris Wilson
@ 2017-01-18 16:53 ` Chris Wilson
  2017-01-19  6:18   ` Tvrtko Ursulin
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2017-01-18 16:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

We always try to do an unlocked wait before resorting to a blocking
wait under the mutex, so we very rarely have to sleep under the
struct_mutex. However, when we do, we want that wait to be as short as
possible, as the struct_mutex is our BKL and will stall the driver and
all clients.

There should be no impact on typical workloads.

v2: Move down a layer to apply to all waits.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem_request.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index bacb875a6ef3..7be17d9c304b 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -1054,6 +1054,15 @@ long i915_wait_request(struct drm_i915_gem_request *req,
 	if (!timeout)
 		return -ETIME;
 
+	/* Very rarely do we wait whilst holding the mutex. We try to always
+	 * do an unlocked wait before using a locked wait. However, when we
+	 * have to resort to a locked wait, we want that wait to be as short
+	 * as possible as the struct_mutex is our BKL that will stall the
+	 * driver and all clients.
+	 */
+	if (flags & I915_WAIT_LOCKED && req->engine->schedule)
+		req->engine->schedule(req, I915_PRIORITY_MAX);
+
 	trace_i915_gem_request_wait_begin(req);
 
 	add_wait_queue(&req->execute, &exec);
-- 
2.11.0
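
Compared to v1, the boost now applies to any locked wait that reaches
i915_wait_request(), not just the object waits that go through
i915_gem_object_wait_fence(). As a purely illustrative example (the list
wrapper below is made up; only i915_wait_request() and its flags come from
the driver), a loop waiting on many requests with struct_mutex already held
would now boost each request it sleeps on:

	struct example_wait_node {
		struct drm_i915_gem_request *rq;
		struct list_head link;
	};

	/* Caller is assumed to hold struct_mutex, hence I915_WAIT_LOCKED. */
	static long example_wait_for_all_locked(struct list_head *nodes)
	{
		struct example_wait_node *node;
		long ret;

		list_for_each_entry(node, nodes, link) {
			ret = i915_wait_request(node->rq,
						I915_WAIT_INTERRUPTIBLE |
						I915_WAIT_LOCKED,
						MAX_SCHEDULE_TIMEOUT);
			if (ret < 0)
				return ret; /* propagate the error */
		}

		return 0;
	}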


* Re: [PATCH v2] drm/i915: Priority boost for locked waits
  2017-01-18 16:53 ` [PATCH v2] " Chris Wilson
@ 2017-01-19  6:18   ` Tvrtko Ursulin
  2017-01-19  8:02     ` Chris Wilson
  0 siblings, 1 reply; 4+ messages in thread
From: Tvrtko Ursulin @ 2017-01-19  6:18 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Daniel Vetter


On 18/01/2017 16:53, Chris Wilson wrote:
> We always try to do an unlocked wait before resorting to a blocking
> wait under the mutex, so we very rarely have to sleep under the
> struct_mutex. However, when we do, we want that wait to be as short as
> possible, as the struct_mutex is our BKL and will stall the driver and
> all clients.
>
> There should be no impact on typical workloads.
>
> v2: Move down a layer to apply to all waits.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_gem_request.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> index bacb875a6ef3..7be17d9c304b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -1054,6 +1054,15 @@ long i915_wait_request(struct drm_i915_gem_request *req,
>  	if (!timeout)
>  		return -ETIME;
>
> +	/* Very rarely do we wait whilst holding the mutex. We try to always
> +	 * do an unlocked wait before using a locked wait. However, when we
> +	 * have to resort to a locked wait, we want that wait to be as short
> +	 * as possible as the struct_mutex is our BKL that will stall the
> +	 * driver and all clients.
> +	 */
> +	if (flags & I915_WAIT_LOCKED && req->engine->schedule)
> +		req->engine->schedule(req, I915_PRIORITY_MAX);
> +
>  	trace_i915_gem_request_wait_begin(req);
>
>  	add_wait_queue(&req->execute, &exec);
>

Would it be worth moving it to after the wait_begin tracepoint? Just
thinking that we may want to account for as much as possible of the time
spent between wait_begin and wait_end, in case someone is looking at that.
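
For reference, the ordering being suggested would look roughly like this;
the context lines are the ones from the hunk above and this is only a
sketch, not a tested change:

	trace_i915_gem_request_wait_begin(req);

	/* Boosting after wait_begin attributes the reprioritisation time
	 * to the wait_begin/wait_end window in the trace.
	 */
	if (flags & I915_WAIT_LOCKED && req->engine->schedule)
		req->engine->schedule(req, I915_PRIORITY_MAX);

	add_wait_queue(&req->execute, &exec);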

I did not like the locked wait from i915_gem_gtt_finish_pages triggering
the priority bump, but it seems irrelevant on platforms with the scheduler.

Another concern is the set cache level ioctl, which does not do an unlocked
wait first and so might be used for illegal priority bumping?

The bump is also on the VMA unbind path, so everything which can go through
there should first be evaluated.

Regards,

Tvrtko

* Re: [PATCH v2] drm/i915: Priority boost for locked waits
  2017-01-19  6:18   ` Tvrtko Ursulin
@ 2017-01-19  8:02     ` Chris Wilson
  0 siblings, 0 replies; 4+ messages in thread
From: Chris Wilson @ 2017-01-19  8:02 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Daniel Vetter, intel-gfx

On Thu, Jan 19, 2017 at 06:18:56AM +0000, Tvrtko Ursulin wrote:
> 
> On 18/01/2017 16:53, Chris Wilson wrote:
> >We always try to do an unlocked wait before resorting to a blocking
> >wait under the mutex, so we very rarely have to sleep under the
> >struct_mutex. However, when we do, we want that wait to be as short as
> >possible, as the struct_mutex is our BKL and will stall the driver and
> >all clients.
> >
> >There should be no impact on typical workloads.
> >
> >v2: Move down a layer to apply to all waits.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >---
> > drivers/gpu/drm/i915/i915_gem_request.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
> >index bacb875a6ef3..7be17d9c304b 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_request.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_request.c
> >@@ -1054,6 +1054,15 @@ long i915_wait_request(struct drm_i915_gem_request *req,
> > 	if (!timeout)
> > 		return -ETIME;
> >
> >+	/* Very rarely do we wait whilst holding the mutex. We try to always
> >+	 * do an unlocked wait before using a locked wait. However, when we
> >+	 * have to resort to a locked wait, we want that wait to be as short
> >+	 * as possible as the struct_mutex is our BKL that will stall the
> >+	 * driver and all clients.
> >+	 */
> >+	if (flags & I915_WAIT_LOCKED && req->engine->schedule)
> >+		req->engine->schedule(req, I915_PRIORITY_MAX);
> >+
> > 	trace_i915_gem_request_wait_begin(req);
> >
> > 	add_wait_queue(&req->execute, &exec);
> >
> 
> Would it be worth moving it to after the wait_begin tracepoint? Just
> thinking to maybe account as much as the time spent between
> wait_begin and wait_end in case someone would be looking at that.

I put it before to avoid attributing this time to the wait itself.
Either works for me; there used to be a tracepoint here for when the
thread actually slept.
 
> I did not like the locked wait from i915_gem_gtt_finish_pages
> triggering the priority bump but it seem irrelevant on platforms
> with the scheduler.

That is a wait for everything anyway, but even a hint of reordering
along those wait-for-idle paths is a waste.
 
> Another concern is set cache level ioctl which does not have an
> unlocked wait first so might be used for illegal priority bumping?

I've done unlocked waits first here; it is pretty ugly if you try to be
accurate and only wait prior to a rebind. On the internal use case, we
priority bump the regular path precisely because any stall here is a
user-visible delay. Throwing in a preemptive wait on the ioctl would be
ok.
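
A preemptive wait of that sort might take roughly the following shape; both
helper names below are placeholders rather than real i915 entry points, so
this is only a sketch of the idea:

	static int example_set_cache_level(struct drm_i915_gem_object *obj,
					   enum i915_cache_level level)
	{
		int err;

		/*
		 * Flush outstanding work without holding struct_mutex first,
		 * so the ioctl cannot be (ab)used purely to trigger the
		 * locked-wait priority boost.
		 */
		err = example_object_wait_unlocked(obj, MAX_SCHEDULE_TIMEOUT);
		if (err)
			return err;

		mutex_lock(&obj->base.dev->struct_mutex);
		/* May still need to wait (e.g. before a rebind); that wait
		 * is now the rare, boosted case.
		 */
		err = example_do_set_cache_level(obj, level);
		mutex_unlock(&obj->base.dev->struct_mutex);

		return err;
	}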
 
> Also on the VMA unbind path, so everything which can go through
> there first should be evaluated.

Everywhere we take this, we are holding the struct_mutex and so
blocking. The only time we unbind whilst active is during eviction and
purging, both good candidates. The normal unbind paths are only on
inactive vma.

Fences are not used by the gpu on gen4+ and so are unaffected. The only
path I thought could be reliably hit from igt was the full ring. And I
still think it makes sense to minimise how long we hold struct_mutex.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
