All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
@ 2009-08-21  9:56 skeggsb-Re5JQEeQqe8AvxtiuMwx3w
       [not found] ` <1250848611-21642-1-git-send-email-skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: skeggsb-Re5JQEeQqe8AvxtiuMwx3w @ 2009-08-21  9:56 UTC (permalink / raw)
  To: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Ben Skeggs

From: Ben Skeggs <bskeggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

There was at least one situation we could get into in the old code where
we'd end up overrunning the push buffer with the jumps back to the start
of the buffer in an attempt to get more space.

In addition, the new code prevents misdetecting a GPU hang by resetting
the timeout counter if we see the GPU GET pointer advancing, and
simplifies the handling of a confusing corner-case.

There's also a significant amount of commenting added to the code, as some
of the interactions are quite complex and hard to grasp on first looking
at the code.
---
 drivers/gpu/drm/nouveau/nouveau_dma.c |  114 +++++++++++++++++++++------------
 1 files changed, 72 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dma.c b/drivers/gpu/drm/nouveau/nouveau_dma.c
index e1a0adb..8930420 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dma.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dma.c
@@ -113,8 +113,13 @@ READ_GET(struct nouveau_channel *chan, uint32_t *get)
 
 	val = nvchan_rd32(chan->user_get);
 	if (val < chan->pushbuf_base ||
-	    val >= chan->pushbuf_base + chan->pushbuf_bo->bo.mem.size)
+	    val >= chan->pushbuf_base + chan->pushbuf_bo->bo.mem.size) {
+		/* meaningless to dma_wait() except to know whether the
+		 * GPU has stalled or not
+		 */
+		*get = val;
 		return false;
+	}
 
 	*get = (val - chan->pushbuf_base) >> 2;
 	return true;
@@ -123,54 +128,79 @@ READ_GET(struct nouveau_channel *chan, uint32_t *get)
 int
 nouveau_dma_wait(struct nouveau_channel *chan, int size)
 {
-	const int us_timeout = 100000;
-	uint32_t get;
-	int ret = -EBUSY, i;
+	uint32_t get, prev_get = 0, cnt = 0;
+	bool get_valid;
+
+	while (chan->dma.free < size) {
+		/* reset counter as long as GET is still advancing, this is
+		 * to avoid misdetecting a GPU lockup if the GPU happens to
+		 * just be processing an operation that takes a long time
+		 */
+		get_valid = READ_GET(chan, &get);
+		if (get != prev_get) {
+			prev_get = get;
+			cnt = 0;
+		}
 
-	for (i = 0; i < us_timeout; i++) {
-		if (!READ_GET(chan, &get)) {
+		if ((++cnt & 0xff) == 0) {
 			DRM_UDELAY(1);
-			continue;
+			if (cnt > 10000)
+				return -EBUSY;
 		}
 
-		if (chan->dma.put >= get) {
-			chan->dma.free = chan->dma.max - chan->dma.cur;
-
-			if (chan->dma.free < size) {
-				OUT_RING(chan, 0x20000000|chan->pushbuf_base);
-				if (get <= NOUVEAU_DMA_SKIPS) {
-					/*corner case - will be idle*/
-					if (chan->dma.put <= NOUVEAU_DMA_SKIPS)
-						WRITE_PUT(NOUVEAU_DMA_SKIPS + 1);
-
-					for (; i < us_timeout; i++) {
-						if (READ_GET(chan, &get) &&
-						    get > NOUVEAU_DMA_SKIPS)
-							break;
-
-						DRM_UDELAY(1);
-					}
-
-					if (i >= us_timeout)
-						break;
-				}
-
-				WRITE_PUT(NOUVEAU_DMA_SKIPS);
-				chan->dma.cur  =
-				chan->dma.put  = NOUVEAU_DMA_SKIPS;
-				chan->dma.free = get - (NOUVEAU_DMA_SKIPS + 1);
-			}
-		} else {
-			chan->dma.free = get - chan->dma.cur - 1;
-		}
+		/* loop until we have a usable GET pointer.  the value
+		 * we read from the GPU may be outside the main ring if
+		 * PFIFO is processing a buffer called from the main ring,
+		 * discard these values until something sensible is seen.
+		 *
+		 * the other case we discard GET is while the GPU is fetching
+		 * from the SKIPS area, so the code below doesn't have to deal
+		 * with some fun corner cases.
+		 */
+		if (!get_valid || get < NOUVEAU_DMA_SKIPS)
+			continue;
 
-		if (chan->dma.free >= size) {
-			ret = 0;
-			break;
+		if (get <= chan->dma.cur) {
+			/* engine is fetching behind us, or is completely
+			 * idle (GET == PUT) so we have free space up until
+			 * the end of the push buffer
+			 *
+			 * we can only hit that path once per call due to
+			 * looping back to the beginning of the push buffer,
+			 * we'll hit the fetching-ahead-of-us path from that
+			 * point on.
+			 *
+			 * the *one* exception to that rule is if we read
+			 * GET==PUT, in which case the below conditional will
+			 * always succeed and break us out of the wait loop.
+			 */
+			chan->dma.free = chan->dma.max - chan->dma.cur;
+			if (chan->dma.free >= size)
+				break;
+
+			/* not enough space left at the end of the push buffer,
+			 * instruct the GPU to jump back to the start right
+			 * after processing the currently pending commands.
+			 */
+			OUT_RING  (chan, chan->pushbuf_base | 0x20000000);
+			WRITE_PUT (NOUVEAU_DMA_SKIPS);
+
+			/* we're now submitting commands at the start of
+			 * the push buffer.
+			 */
+			chan->dma.cur  =
+			chan->dma.put  = NOUVEAU_DMA_SKIPS;
 		}
 
-		DRM_UDELAY(1);
+		/* engine fetching ahead of us, we have space up until the
+		 * current GET pointer.  the "- 1" is to ensure there's
+		 * space left to emit a jump back to the beginning of the
+		 * push buffer if we require it.  we can never get GET == PUT
+		 * here, so this is safe.
+		 */
+		chan->dma.free = get - chan->dma.cur - 1;
 	}
 
-	return ret;
+	return 0;
 }
+
-- 
1.6.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
       [not found] ` <1250848611-21642-1-git-send-email-skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2009-08-21 19:30   ` Maarten Maathuis
       [not found]     ` <6d4bc9fc0908211230l7f63207fm673559e3b4f656bb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2009-08-22 10:25   ` Pekka Paalanen
  1 sibling, 1 reply; 8+ messages in thread
From: Maarten Maathuis @ 2009-08-21 19:30 UTC (permalink / raw)
  To: skeggsb-Re5JQEeQqe8AvxtiuMwx3w
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

The timeout is too short, the gallium driver will easily trigger an assert.

Maarten.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
       [not found]     ` <6d4bc9fc0908211230l7f63207fm673559e3b4f656bb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-08-21 23:26       ` Ben Skeggs
       [not found]         ` <1250897199.20095.1.camel-oFTkVezJ1HzzwxVw7NwMvg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Skeggs @ 2009-08-21 23:26 UTC (permalink / raw)
  To: Maarten Maathuis; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Fri, 2009-08-21 at 21:30 +0200, Maarten Maathuis wrote:
> The timeout is too short, the gallium driver will easily trigger an assert.
Oops, that was a typo.  Thanks, fixed.

Any other comments?
> 
> Maarten.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
       [not found]         ` <1250897199.20095.1.camel-oFTkVezJ1HzzwxVw7NwMvg@public.gmane.org>
@ 2009-08-21 23:42           ` Maarten Maathuis
  0 siblings, 0 replies; 8+ messages in thread
From: Maarten Maathuis @ 2009-08-21 23:42 UTC (permalink / raw)
  To: skeggsb-Re5JQEeQqe8AvxtiuMwx3w; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Sat, Aug 22, 2009 at 1:26 AM, Ben Skeggs<skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Fri, 2009-08-21 at 21:30 +0200, Maarten Maathuis wrote:
>> The timeout is too short, the gallium driver will easily trigger an assert.
> Oops, that was a typo.  Thanks, fixed.
>
> Any other comments?

Not really, it's been running fine apart from the mentioned issue.

>>
>> Maarten.
>
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
       [not found] ` <1250848611-21642-1-git-send-email-skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2009-08-21 19:30   ` Maarten Maathuis
@ 2009-08-22 10:25   ` Pekka Paalanen
       [not found]     ` <20090822132559.7e404f88-cxYvVS3buNOdIgDiPM52R8c4bpwCjbIv@public.gmane.org>
  1 sibling, 1 reply; 8+ messages in thread
From: Pekka Paalanen @ 2009-08-22 10:25 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Fri, 21 Aug 2009 19:56:51 +1000
skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:

> From: Ben Skeggs <bskeggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> 
> There was at least one situation we could get into in the old code where
> we'd end up overrunning the push buffer with the jumps back to the start
> of the buffer in an attempt to get more space.
> 
> In addition, the new code prevents misdetecting a GPU hang by resetting
> the timeout counter if we see the GPU GET pointer advancing, and
> simplifies the handling of a confusing corner-case.
> 
> There's also a significant amount of commenting added to the code, as some
> of the interactions are quite complex and hard to grasp on first looking
> at the code.

Excellent, now even I may understand it. :-)

> ---
>  drivers/gpu/drm/nouveau/nouveau_dma.c |  114 +++++++++++++++++++++------------
>  1 files changed, 72 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dma.c b/drivers/gpu/drm/nouveau/nouveau_dma.c
> index e1a0adb..8930420 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dma.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dma.c
> @@ -113,8 +113,13 @@ READ_GET(struct nouveau_channel *chan, uint32_t *get)
>  
>  	val = nvchan_rd32(chan->user_get);
>  	if (val < chan->pushbuf_base ||
> -	    val >= chan->pushbuf_base + chan->pushbuf_bo->bo.mem.size)
> +	    val >= chan->pushbuf_base + chan->pushbuf_bo->bo.mem.size) {
> +		/* meaningless to dma_wait() except to know whether the
> +		 * GPU has stalled or not
> +		 */
> +		*get = val;

*get = val? Why not *get = (val - chan->pushbuf_base) >> 2 since it
is used for comparing with an old value in the caller? Not that it
likely matters much, but it just strikes as odd.

>  		return false;
> +	}
>  
>  	*get = (val - chan->pushbuf_base) >> 2;
>  	return true;
> @@ -123,54 +128,79 @@ READ_GET(struct nouveau_channel *chan, uint32_t *get)
>  int
>  nouveau_dma_wait(struct nouveau_channel *chan, int size)
>  {
> -	const int us_timeout = 100000;
> -	uint32_t get;
> -	int ret = -EBUSY, i;
> +	uint32_t get, prev_get = 0, cnt = 0;
> +	bool get_valid;
> +
> +	while (chan->dma.free < size) {
> +		/* reset counter as long as GET is still advancing, this is
> +		 * to avoid misdetecting a GPU lockup if the GPU happens to
> +		 * just be processing an operation that takes a long time
> +		 */
> +		get_valid = READ_GET(chan, &get);
> +		if (get != prev_get) {
> +			prev_get = get;
> +			cnt = 0;
> +		}
>  
> -	for (i = 0; i < us_timeout; i++) {
> -		if (!READ_GET(chan, &get)) {
> +		if ((++cnt & 0xff) == 0) {
>  			DRM_UDELAY(1);
> -			continue;
> +			if (cnt > 10000)
> +				return -EBUSY;
>  		}

DRM_UDELAY is a busy-wait. Since we are by default busy-waiting anyway,
why bother with udelay? Is it to give the PCI bus a rest once in a while?

How about adding a third level of waiting: a sleeping wait. Something like:

if (++cnt > CNT_MAX)
	return -EBUSY;

if (cnt & 0xfff == 0)
	msleep(5);
else if (cnt & 0xf == 0)
	udelay(1);

In pseudocode:
repeat until timeout:
- repeat 256 times:
  * repeat 16 times READ_GET
  * 1 us busywait
- at least 5 ms sleep (schedules)

The total expected timeout is roughly
cnt * READ_GET + (cnt / 16) * 1us + (cnt / 4096) * 5ms

In other words:
CNT_MAX = TIMEOUT / (READ_GET + 1us / 16 + 5000us / 4096)

Is 1 us busywait enough considering PCI latency? Might not.
Also the 5 ms might be too long. We probably have to adjust
those based on e.g. EXA and OpenGL benchmarks.

The big question here is, do we need nouveau_dma_wait() in a
context where sleeping is not allowed?
We probably want might_sleep() in the beginning of nouveau_dma_wait().
(See include/linux/kernel.h)

<clip>
> +		/* loop until we have a usable GET pointer.  the value
> +		 * we read from the GPU may be outside the main ring if
> +		 * PFIFO is processing a buffer called from the main ring,
> +		 * discard these values until something sensible is seen.
> +		 *
> +		 * the other case we discard GET is while the GPU is fetching
> +		 * from the SKIPS area, so the code below doesn't have to deal
> +		 * with some fun corner cases.
> +		 */
> +		if (!get_valid || get < NOUVEAU_DMA_SKIPS)
> +			continue;
>  
> -		if (chan->dma.free >= size) {
> -			ret = 0;
> -			break;
> +		if (get <= chan->dma.cur) {
> +			/* engine is fetching behind us, or is completely
> +			 * idle (GET == PUT) so we have free space up until
> +			 * the end of the push buffer
> +			 *
> +			 * we can only hit that path once per call due to
> +			 * looping back to the beginning of the push buffer,
> +			 * we'll hit the fetching-ahead-of-us path from that
> +			 * point on.
> +			 *
> +			 * the *one* exception to that rule is if we read
> +			 * GET==PUT, in which case the below conditional will
> +			 * always succeed and break us out of the wait loop.
> +			 */
> +			chan->dma.free = chan->dma.max - chan->dma.cur;
> +			if (chan->dma.free >= size)
> +				break;

If dma.free == size, doesn't the jump command emitted on the next call
here get written past the end of the buffer? dma.cur will equal dma.max.

> +
> +			/* not enough space left at the end of the push buffer,
> +			 * instruct the GPU to jump back to the start right
> +			 * after processing the currently pending commands.
> +			 */
> +			OUT_RING  (chan, chan->pushbuf_base | 0x20000000);
> +			WRITE_PUT (NOUVEAU_DMA_SKIPS);
> +
> +			/* we're now submitting commands at the start of
> +			 * the push buffer.
> +			 */
> +			chan->dma.cur  =
> +			chan->dma.put  = NOUVEAU_DMA_SKIPS;
>  		}
>  
> -		DRM_UDELAY(1);
> +		/* engine fetching ahead of us, we have space up until the
> +		 * current GET pointer.  the "- 1" is to ensure there's
> +		 * space left to emit a jump back to the beginning of the
> +		 * push buffer if we require it.  we can never get GET == PUT
> +		 * here, so this is safe.
> +		 */
> +		chan->dma.free = get - chan->dma.cur - 1;
>  	}
>  
> -	return ret;
> +	return 0;
>  }
> +
> -- 
> 1.6.4


-- 
Pekka Paalanen
http://www.iki.fi/pq/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
       [not found]     ` <20090822132559.7e404f88-cxYvVS3buNOdIgDiPM52R8c4bpwCjbIv@public.gmane.org>
@ 2009-08-22 10:30       ` Maarten Maathuis
       [not found]         ` <6d4bc9fc0908220330l32803ce0qea995c59aa7f894c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Maarten Maathuis @ 2009-08-22 10:30 UTC (permalink / raw)
  Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

dma.max is adjusted to always have space for the jump.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
       [not found]         ` <6d4bc9fc0908220330l32803ce0qea995c59aa7f894c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2009-08-22 12:20           ` Pekka Paalanen
       [not found]             ` <20090822152021.45e535a5-cxYvVS3buNOdIgDiPM52R8c4bpwCjbIv@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Pekka Paalanen @ 2009-08-22 12:20 UTC (permalink / raw)
  To: Maarten Maathuis; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Sat, 22 Aug 2009 12:30:03 +0200
Maarten Maathuis <madman2003-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> dma.max is adjusted to always have space for the jump.

Ah, then that should be documented somewhere.

But if that is the case, why:
>>> chan->dma.free = get - chan->dma.cur - 1;

The extra space is not needed in this branch, since the available space
is limited by GET. If it is not locked up, GET will move forward and give
space. Otherwise, we would be in the get <= dma.cur branch.

Assuming the jump command is at index dma.max, isn't dma.max the maximum
possible value for GET inside the ring? Therefore
>>> chan->dma.free = get - chan->dma.cur - 1;
should become equivalent to
>>> chan->dma.free = chan->dma.max - chan->dma.cur;
but there is the -1.

Or do we observe GET == dma.max + 1 just before it actually jumps?

Or do we simply not know how it behaves and want to be safe?

-- 
Pekka Paalanen
http://www.iki.fi/pq/

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] drm/nouveau: rewrite nouveau_dma_wait()
       [not found]             ` <20090822152021.45e535a5-cxYvVS3buNOdIgDiPM52R8c4bpwCjbIv@public.gmane.org>
@ 2009-08-22 12:27               ` Maarten Maathuis
  0 siblings, 0 replies; 8+ messages in thread
From: Maarten Maathuis @ 2009-08-22 12:27 UTC (permalink / raw)
  To: Pekka Paalanen; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Sat, Aug 22, 2009 at 2:20 PM, Pekka Paalanen<pq-X3B1VOXEql0@public.gmane.org> wrote:
> On Sat, 22 Aug 2009 12:30:03 +0200
> Maarten Maathuis <madman2003-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> dma.max is adjusted to always have space for the jump.
>
> Ah, then that should be documented somewhere.
>
> But if that is the case, why:
>>>> chan->dma.free = get - chan->dma.cur - 1;
>
> The extra space is not needed in this branch, since the available space
> is limited by GET. If it is not locked up, GET will move forward and give
> space. Otherwise, we would be in the get <= dma.cur branch.
>
> Assuming the jump command is at index dma.max, isn't dma.max the maximum
> possible value for GET inside the ring? Therefore
>>>> chan->dma.free = get - chan->dma.cur - 1;
> should become equivalent to
>>>> chan->dma.free = chan->dma.max - chan->dma.cur;
> but there is the -1.
>
> Or do we observe GET == dma.max + 1 just before it actually jumps?

I guess it's just a safety, i do not think our get is ever read as
dma.max + 1, because we read get before jumping.

>
> Or do we simply not know how it behaves and want to be safe?
>
> --
> Pekka Paalanen
> http://www.iki.fi/pq/
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2009-08-22 12:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-21  9:56 [PATCH] drm/nouveau: rewrite nouveau_dma_wait() skeggsb-Re5JQEeQqe8AvxtiuMwx3w
     [not found] ` <1250848611-21642-1-git-send-email-skeggsb-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2009-08-21 19:30   ` Maarten Maathuis
     [not found]     ` <6d4bc9fc0908211230l7f63207fm673559e3b4f656bb-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-21 23:26       ` Ben Skeggs
     [not found]         ` <1250897199.20095.1.camel-oFTkVezJ1HzzwxVw7NwMvg@public.gmane.org>
2009-08-21 23:42           ` Maarten Maathuis
2009-08-22 10:25   ` Pekka Paalanen
     [not found]     ` <20090822132559.7e404f88-cxYvVS3buNOdIgDiPM52R8c4bpwCjbIv@public.gmane.org>
2009-08-22 10:30       ` Maarten Maathuis
     [not found]         ` <6d4bc9fc0908220330l32803ce0qea995c59aa7f894c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-22 12:20           ` Pekka Paalanen
     [not found]             ` <20090822152021.45e535a5-cxYvVS3buNOdIgDiPM52R8c4bpwCjbIv@public.gmane.org>
2009-08-22 12:27               ` Maarten Maathuis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.