All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] drm/i915: recheck ringspace after wait+retire
@ 2014-12-18 17:03 Dave Gordon
  2014-12-18 17:18 ` Chris Wilson
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Gordon @ 2014-12-18 17:03 UTC (permalink / raw)
  To: intel-gfx

In {intel,logical}_ring_wait_request(), we try to find a request
whose completion will release the amount of ring space required.
If we find such a request, we wait for it, and then retire it, in
the expectation that we will now have at least the required amount
of free space in the ring. But it's good to check that this has
actually happened, so we can back out with a meaningful error
code if something unexpected has happened, such as wait_request
returning early.

This code was already in the execlist version, so the change to
intel_lrc.c is just to add a comment; but we want the same check
in the legacy ringbuffer mode too.

v2: rebase to latest drm-intel-nightly

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        |    9 +++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |   12 +++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 90867b2..329cb5c 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -955,6 +955,15 @@ static int logical_ring_wait_request(struct intel_ringbuffer *ringbuf,
 
 	i915_gem_retire_requests_ring(ring);
 
+	/*
+	 * According to our calculation above, retiring the request we just
+	 * waited for should have resulted in there being enough space in
+	 * the ringbuffer; but let's check.
+	 *
+	 * If there is not now enough space, something has gone horribly worng
+	 * (such as wait_request returning early, but with no error, or
+	 * retire_requests failing to retire the request we expected it to).
+	 */
 	return intel_ring_space(ringbuf) >= bytes ? 0 : -ENOSPC;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 018a37e..c95bf3a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1932,6 +1932,7 @@ static int intel_ring_wait_request(struct intel_engine_cs *ring, int n)
 		return 0;
 
 	list_for_each_entry(request, &ring->request_list, list) {
+		/* Would completion of this request free enough space? */
 		if (__intel_ring_space(request->tail, ringbuf->tail,
 				       ringbuf->effective_size) >= n) {
 			break;
@@ -1947,7 +1948,16 @@ static int intel_ring_wait_request(struct intel_engine_cs *ring, int n)
 
 	i915_gem_retire_requests_ring(ring);
 
-	return 0;
+	/*
+	 * According to our calculation above, retiring the request we just
+	 * waited for should have resulted in there being enough space in
+	 * the ringbuffer; but let's check.
+	 *
+	 * If there is not now enough space, something has gone horribly worng
+	 * (such as wait_request returning early, but with no error, or
+	 * retire_requests failing to retire the request we expected it to).
+	 */
+	return intel_ring_space(ringbuf) >= n ? 0 : -ENOSPC;
 }
 
 static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
-- 
1.7.9.5

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] drm/i915: recheck ringspace after wait+retire
  2014-12-18 17:03 [PATCH v2] drm/i915: recheck ringspace after wait+retire Dave Gordon
@ 2014-12-18 17:18 ` Chris Wilson
  2014-12-18 18:09   ` Dave Gordon
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Wilson @ 2014-12-18 17:18 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Dec 18, 2014 at 05:03:41PM +0000, Dave Gordon wrote:
> In {intel,logical}_ring_wait_request(), we try to find a request
> whose completion will release the amount of ring space required.
> If we find such a request, we wait for it, and then retire it, in
> the expectation that we will now have at least the required amount
> of free space in the ring. But it's good to check that this has
> actually happened, so we can back out with a meaningful error
> code if something unexpected has happened, such as wait_request
> returning early.

It's pretty pointless. It's a programming bug, if anything it should
WARN if it happens.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] drm/i915: recheck ringspace after wait+retire
  2014-12-18 17:18 ` Chris Wilson
@ 2014-12-18 18:09   ` Dave Gordon
  2014-12-18 20:19     ` Daniel Vetter
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Gordon @ 2014-12-18 18:09 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On 18/12/14 17:18, Chris Wilson wrote:
> On Thu, Dec 18, 2014 at 05:03:41PM +0000, Dave Gordon wrote:
>> In {intel,logical}_ring_wait_request(), we try to find a request
>> whose completion will release the amount of ring space required.
>> If we find such a request, we wait for it, and then retire it, in
>> the expectation that we will now have at least the required amount
>> of free space in the ring. But it's good to check that this has
>> actually happened, so we can back out with a meaningful error
>> code if something unexpected has happened, such as wait_request
>> returning early.
> 
> It's pretty pointless. It's a programming bug, if anything it should
> WARN if it happens.
> -Chris

I don't regard preventing the propagation of errors as pointless,
whether or not it's a programming bug. If this case arose, then without
this check we would go ahead and trample over whatever was next in the
ring. Quite likely TAIL would overtake HEAD, and then all sorts of
bizarre things would happen some time later.

In particular, this may be able to catch bugs introduced by /future
features/, such as TDR, scheduling, and preemption, all of which are in
preparation now and all of which have the potential to disrupt the
predictable flow of requests. So it's good to be resilient to upcoming
changes and make it as easy as possible to catch bugs at the earliest
opportunity.

This is also a step towards the deduplication of the ringbuffer and LRC
paths by making sure that they are as similar as possible, and thus
simple to merge at some later opportunity.

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] drm/i915: recheck ringspace after wait+retire
  2014-12-18 18:09   ` Dave Gordon
@ 2014-12-18 20:19     ` Daniel Vetter
  0 siblings, 0 replies; 4+ messages in thread
From: Daniel Vetter @ 2014-12-18 20:19 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Thu, Dec 18, 2014 at 06:09:18PM +0000, Dave Gordon wrote:
> On 18/12/14 17:18, Chris Wilson wrote:
> > On Thu, Dec 18, 2014 at 05:03:41PM +0000, Dave Gordon wrote:
> >> In {intel,logical}_ring_wait_request(), we try to find a request
> >> whose completion will release the amount of ring space required.
> >> If we find such a request, we wait for it, and then retire it, in
> >> the expectation that we will now have at least the required amount
> >> of free space in the ring. But it's good to check that this has
> >> actually happened, so we can back out with a meaningful error
> >> code if something unexpected has happened, such as wait_request
> >> returning early.
> > 
> > It's pretty pointless. It's a programming bug, if anything it should
> > WARN if it happens.
> > -Chris
> 
> I don't regard preventing the propagation of errors as pointless,
> whether or not it's a programming bug. If this case arose, then without
> this check we would go ahead and trample over whatever was next in the
> ring. Quite likely TAIL would overtake HEAD, and then all sorts of
> bizarre things would happen some time later.
> 
> In particular, this may be able to catch bugs introduced by /future
> features/, such as TDR, scheduling, and preemption, all of which are in
> preparation now and all of which have the potential to disrupt the
> predictable flow of requests. So it's good to be resilient to upcoming
> changes and make it as easy as possible to catch bugs at the earliest
> opportunity.
> 
> This is also a step towards the deduplication of the ringbuffer and LRC
> paths by making sure that they are as similar as possible, and thus
> simple to merge at some later opportunity.

Overwriting the TAIL is a fairly benign disaster as far as gpus are
concerned, the gpu tends to survive those. I concur with Chris that a
simple WARN_ON for paranoia is good enough. And I don't think we need the
comments, imo the code is clear enough as-is.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-12-18 20:20 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-18 17:03 [PATCH v2] drm/i915: recheck ringspace after wait+retire Dave Gordon
2014-12-18 17:18 ` Chris Wilson
2014-12-18 18:09   ` Dave Gordon
2014-12-18 20:19     ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.