* [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms.
@ 2016-01-25 19:29 Rodrigo Vivi
  2016-01-25 19:36 ` Ben Widawsky
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Rodrigo Vivi @ 2016-01-25 19:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ben Widawsky, Rodrigo Vivi

Commit 7c17d3773 (drm/i915: Use ordered seqno write interrupt generation
on gen8+ execlists) moved two MI_NOOPs to the advance_and_submit functions
and hardcoded the WA_TAIL_DWORDS. With this we don't have a clean way to
implement or remove WaIdleLiteRestore for different platforms.

This patch aims to make it more flexible, so we just emit the number of
NOOPs that was initialized.

Also let's just include the platforms that we know need this Wa,
i.e. gen8 and gen9 platforms.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 31 +++++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 2 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index da97bc5..d0b253d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -764,18 +764,18 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
 {
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	struct drm_i915_private *dev_priv = request->i915;
+	int i;
 
 	intel_logical_ring_advance(ringbuf);
 	request->tail = ringbuf->tail;
 
 	/*
-	 * Here we add two extra NOOPs as padding to avoid
+	 * Here we add extra NOOPs as padding to avoid
 	 * lite restore of a context with HEAD==TAIL.
-	 *
-	 * Caller must reserve WA_TAIL_DWORDS for us!
 	 */
-	intel_logical_ring_emit(ringbuf, MI_NOOP);
-	intel_logical_ring_emit(ringbuf, MI_NOOP);
+	for (i = 0; i < ringbuf->wa_tail_dwords; i++)
+		intel_logical_ring_emit(ringbuf, MI_NOOP);
+
 	intel_logical_ring_advance(ringbuf);
 
 	if (intel_ring_stopped(request->ring))
@@ -876,6 +876,16 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 	if (ret)
 		return ret;
 
+	if (IS_GEN8(req->ring->dev) || IS_GEN9(req->ring->dev))
+		/*
+		 * Reserve space for 2 NOOPs at the end of each request to be
+		 * used as a workaround for not being allowed to do lite
+		 * restore with HEAD==TAIL (WaIdleLiteRestore).
+		 */
+		req->ringbuf->wa_tail_dwords = 2;
+
+	num_dwords += req->ringbuf->wa_tail_dwords;
+
 	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
@@ -1858,13 +1868,6 @@ static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
-/*
- * Reserve space for 2 NOOPs at the end of each request to be
- * used as a workaround for not being allowed to do lite
- * restore with HEAD==TAIL (WaIdleLiteRestore).
- */
-#define WA_TAIL_DWORDS 2
-
 static inline u32 hws_seqno_address(struct intel_engine_cs *engine)
 {
 	return engine->status_page.gfx_addr + I915_GEM_HWS_INDEX_ADDR;
@@ -1875,7 +1878,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	int ret;
 
-	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
+	ret = intel_logical_ring_begin(request, 6);
 	if (ret)
 		return ret;
 
@@ -1899,7 +1902,7 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	int ret;
 
-	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
+	ret = intel_logical_ring_begin(request, 6);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 566b0ae..62b4e1b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -122,6 +122,8 @@ struct intel_ringbuffer {
 	 * we can detect new retirements.
 	 */
 	u32 last_retired_head;
+
+	int wa_tail_dwords;
 };
 
 struct	intel_context;
-- 
2.4.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms.
  2016-01-25 19:29 [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms Rodrigo Vivi
@ 2016-01-25 19:36 ` Ben Widawsky
  2016-01-25 21:17 ` Chris Wilson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Ben Widawsky @ 2016-01-25 19:36 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Mon, Jan 25, 2016 at 11:29:19AM -0800, Rodrigo Vivi wrote:
> Commit 7c17d3773 (drm/i915: Use ordered seqno write interrupt generation
> on gen8+ execlists) moved two MI_NOOPs to the advance_and_submit functions
> and hardcoded the WA_TAIL_DWORDS. With this we don't have a clean way to
> implement or remove WaIdleLiteRestore for different platforms.
> 
> This patch aims to make it more flexible, so we just emit the number of
> NOOPs that was initialized.
> 
> Also let's just include the platforms that we know need this Wa,
> i.e. gen8 and gen9 platforms.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Ben Widawsky <benjamin.widawsky@intel.com>
> 
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

I don't claim to understand the reason this patch ends up being required, but it
looks fine to me. I would have gone with a bool for the workaround instead of a
count of dwords - but it's up to you. Since you allude to already knowing what
future hardware does, we know we don't need a variable length here.

> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 31 +++++++++++++++++--------------
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
>  2 files changed, 19 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index da97bc5..d0b253d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -764,18 +764,18 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
>  {
>  	struct intel_ringbuffer *ringbuf = request->ringbuf;
>  	struct drm_i915_private *dev_priv = request->i915;
> +	int i;
>  
>  	intel_logical_ring_advance(ringbuf);
>  	request->tail = ringbuf->tail;
>  
>  	/*
> -	 * Here we add two extra NOOPs as padding to avoid
> +	 * Here we add extra NOOPs as padding to avoid
>  	 * lite restore of a context with HEAD==TAIL.
> -	 *
> -	 * Caller must reserve WA_TAIL_DWORDS for us!
>  	 */
> -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> +	for (i = 0; i < ringbuf->wa_tail_dwords; i++)
> +		intel_logical_ring_emit(ringbuf, MI_NOOP);
> +
>  	intel_logical_ring_advance(ringbuf);
>  
>  	if (intel_ring_stopped(request->ring))
> @@ -876,6 +876,16 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>  	if (ret)
>  		return ret;
>  
> +	if (IS_GEN8(req->ring->dev) || IS_GEN9(req->ring->dev))
> +		/*
> +		 * Reserve space for 2 NOOPs at the end of each request to be
> +		 * used as a workaround for not being allowed to do lite
> +		 * restore with HEAD==TAIL (WaIdleLiteRestore).
> +		 */
> +		req->ringbuf->wa_tail_dwords = 2;

This should be set at ring_init, not here.

> +
> +	num_dwords += req->ringbuf->wa_tail_dwords;
> +
>  	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
>  	if (ret)
>  		return ret;
> @@ -1858,13 +1868,6 @@ static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
>  	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
>  }
>  
> -/*
> - * Reserve space for 2 NOOPs at the end of each request to be
> - * used as a workaround for not being allowed to do lite
> - * restore with HEAD==TAIL (WaIdleLiteRestore).
> - */
> -#define WA_TAIL_DWORDS 2
> -
>  static inline u32 hws_seqno_address(struct intel_engine_cs *engine)
>  {
>  	return engine->status_page.gfx_addr + I915_GEM_HWS_INDEX_ADDR;
> @@ -1875,7 +1878,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
>  	struct intel_ringbuffer *ringbuf = request->ringbuf;
>  	int ret;
>  
> -	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
> +	ret = intel_logical_ring_begin(request, 6);
>  	if (ret)
>  		return ret;
>  
> @@ -1899,7 +1902,7 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
>  	struct intel_ringbuffer *ringbuf = request->ringbuf;
>  	int ret;
>  
> -	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
> +	ret = intel_logical_ring_begin(request, 6);
>  	if (ret)
>  		return ret;
>  

I think it's a lot more straightforward to do the + ringbuf->wa_tail_dwords
here, so that ring_begin can be as dumb as possible.

> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 566b0ae..62b4e1b 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -122,6 +122,8 @@ struct intel_ringbuffer {
>  	 * we can detect new retirements.
>  	 */
>  	u32 last_retired_head;
> +
> +	int wa_tail_dwords;
>  };
>  
>  struct	intel_context;
> -- 
> 2.4.3
> 

-- 
Ben Widawsky, Intel Open Source Technology Center

* Re: [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms.
  2016-01-25 19:29 [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms Rodrigo Vivi
  2016-01-25 19:36 ` Ben Widawsky
@ 2016-01-25 21:17 ` Chris Wilson
  2016-01-26  8:29   ` Chris Wilson
  2016-01-27 16:14 ` ✓ Fi.CI.BAT: success for drm/i915: Make wa_tail_dwords flexible for future platforms Patchwork
  2016-01-28 17:08 ` ✗ Fi.CI.BAT: failure for drm/i915: Make wa_tail_dwords flexible for future platforms. (rev2) Patchwork
  3 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2016-01-25 21:17 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Ben Widawsky

On Mon, Jan 25, 2016 at 11:29:19AM -0800, Rodrigo Vivi wrote:
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -764,18 +764,18 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
>  {
>  	struct intel_ringbuffer *ringbuf = request->ringbuf;
>  	struct drm_i915_private *dev_priv = request->i915;
> +	int i;
>  
>  	intel_logical_ring_advance(ringbuf);
>  	request->tail = ringbuf->tail;
>  
>  	/*
> -	 * Here we add two extra NOOPs as padding to avoid
> +	 * Here we add extra NOOPs as padding to avoid
>  	 * lite restore of a context with HEAD==TAIL.
> -	 *
> -	 * Caller must reserve WA_TAIL_DWORDS for us!
>  	 */
> -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> +	for (i = 0; i < ringbuf->wa_tail_dwords; i++)
> +		intel_logical_ring_emit(ringbuf, MI_NOOP);
> +
>  	intel_logical_ring_advance(ringbuf);
>  
>  	if (intel_ring_stopped(request->ring))
> @@ -876,6 +876,16 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>  	if (ret)
>  		return ret;
>  
> +	if (IS_GEN8(req->ring->dev) || IS_GEN9(req->ring->dev))

req->i915

This is atrocious. Just allocate the extra space when required.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms.
  2016-01-25 21:17 ` Chris Wilson
@ 2016-01-26  8:29   ` Chris Wilson
  2016-01-26 13:51     ` Rodrigo Vivi
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2016-01-26  8:29 UTC (permalink / raw)
  To: Rodrigo Vivi, intel-gfx, Ben Widawsky

On Mon, Jan 25, 2016 at 09:17:15PM +0000, Chris Wilson wrote:
> On Mon, Jan 25, 2016 at 11:29:19AM -0800, Rodrigo Vivi wrote:
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -764,18 +764,18 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
> >  {
> >  	struct intel_ringbuffer *ringbuf = request->ringbuf;
> >  	struct drm_i915_private *dev_priv = request->i915;
> > +	int i;
> >  
> >  	intel_logical_ring_advance(ringbuf);
> >  	request->tail = ringbuf->tail;
> >  
> >  	/*
> > -	 * Here we add two extra NOOPs as padding to avoid
> > +	 * Here we add extra NOOPs as padding to avoid
> >  	 * lite restore of a context with HEAD==TAIL.
> > -	 *
> > -	 * Caller must reserve WA_TAIL_DWORDS for us!
> >  	 */
> > -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> > -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> > +	for (i = 0; i < ringbuf->wa_tail_dwords; i++)
> > +		intel_logical_ring_emit(ringbuf, MI_NOOP);
> > +
> >  	intel_logical_ring_advance(ringbuf);
> >  
> >  	if (intel_ring_stopped(request->ring))
> > @@ -876,6 +876,16 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> >  	if (ret)
> >  		return ret;
> >  
> > +	if (IS_GEN8(req->ring->dev) || IS_GEN9(req->ring->dev))
> 
> req->i915
> 
> > This is atrocious. Just allocate the extra space when required.

Slightly less grumpy this morning.

1. This is duplicating the reserved-space mechanism, by open-coding the
requirements for execlists. Fine-tuning the reserved space per ring may
be worth it, but probably not. Over-reserving space is not a huge issue
(it just effectively reduces the size of the ring), and the granularity
is the size of the average request.

2. You are hiding how much space is actually used during request
emission. This makes review impossible, and we depend upon review to
verify that the intel_ring_begin() matches the number of dwords emitted.

3. Is this even the right mechanism considering the number of other ways
of automatically emitting instructions between batches and contexts? We
cannot answer that as this patch is out of context.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms.
  2016-01-26  8:29   ` Chris Wilson
@ 2016-01-26 13:51     ` Rodrigo Vivi
  2016-01-26 14:06       ` Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Rodrigo Vivi @ 2016-01-26 13:51 UTC (permalink / raw)
  To: Chris Wilson, Rodrigo Vivi, intel-gfx, Ben Widawsky



On Tue, Jan 26, 2016 at 12:30 AM Chris Wilson <chris@chris-wilson.co.uk>
wrote:

> On Mon, Jan 25, 2016 at 09:17:15PM +0000, Chris Wilson wrote:
> > On Mon, Jan 25, 2016 at 11:29:19AM -0800, Rodrigo Vivi wrote:
> > > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > > @@ -764,18 +764,18 @@ intel_logical_ring_advance_and_submit(struct
> drm_i915_gem_request *request)
> > >  {
> > >     struct intel_ringbuffer *ringbuf = request->ringbuf;
> > >     struct drm_i915_private *dev_priv = request->i915;
> > > +   int i;
> > >
> > >     intel_logical_ring_advance(ringbuf);
> > >     request->tail = ringbuf->tail;
> > >
> > >     /*
> > > -    * Here we add two extra NOOPs as padding to avoid
> > > +    * Here we add extra NOOPs as padding to avoid
> > >      * lite restore of a context with HEAD==TAIL.
> > > -    *
> > > -    * Caller must reserve WA_TAIL_DWORDS for us!
> > >      */
> > > -   intel_logical_ring_emit(ringbuf, MI_NOOP);
> > > -   intel_logical_ring_emit(ringbuf, MI_NOOP);
> > > +   for (i = 0; i < ringbuf->wa_tail_dwords; i++)
> > > +           intel_logical_ring_emit(ringbuf, MI_NOOP);
> > > +
> > >     intel_logical_ring_advance(ringbuf);
> > >
> > >     if (intel_ring_stopped(request->ring))
> > > @@ -876,6 +876,16 @@ int intel_logical_ring_begin(struct
> drm_i915_gem_request *req, int num_dwords)
> > >     if (ret)
> > >             return ret;
> > >
> > > +   if (IS_GEN8(req->ring->dev) || IS_GEN9(req->ring->dev))
> >
> > req->i915
> >
> > > This is atrocious. Just allocate the extra space when required.
>
>
by this logic I should just emit the mi_noops when required as well, right?


> Slightly less grumpy this morning.
>

thanks

>
> 1. This is duplicating the reserved-space mechanism, by open-coding the
> requirements for execlists. Fine-tuning the reserved space per ring may
> be worth it, but probably not. Over-reserving space is not a huge issue
> (it just effectively reduces the size of the ring), and the granularity
> is the size of the average request.
>

forgive this clueless mind here, but I don't see how I'm duplicating the
reserved-space...


>
> 2. You are hiding how much space is actually used during request
> emission. This makes review impossible, and we depend upon review to
> verify that the intel_ring_begin() matches the number of dwords emitted.
>

but the mi_noops are hidden in the advance and submit... shouldn't we move
them back to the places that allocate the space for them?


>
> 3. Is this even the right mechanism considering the number of other ways
> of automatically emitting instructions between batches and contexts? We
> cannot answer that as this patch is out of context.
>

yeap, sorry again, I was just going to the easiest path to be able to avoid
the nulls per platform without adding 3 ifs..

But I wonder if you mean on comment "1." that we can live with
WA_TAIL_DWORDS 2 and avoid only the NULLs when needed... Is this the case?


> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms.
  2016-01-26 13:51     ` Rodrigo Vivi
@ 2016-01-26 14:06       ` Chris Wilson
  2016-01-27 12:27         ` Dave Gordon
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2016-01-26 14:06 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Ben Widawsky, Rodrigo Vivi

On Tue, Jan 26, 2016 at 01:51:19PM +0000, Rodrigo Vivi wrote:
>    On Tue, Jan 26, 2016 at 12:30 AM Chris Wilson
>    <chris@chris-wilson.co.uk> wrote:
> 
>      On Mon, Jan 25, 2016 at 09:17:15PM +0000, Chris Wilson wrote:
>      > On Mon, Jan 25, 2016 at 11:29:19AM -0800, Rodrigo Vivi wrote:
>      > > +++ b/drivers/gpu/drm/i915/intel_lrc.c
>      > > @@ -764,18 +764,18 @@ intel_logical_ring_advance_and_submit(struct
>      drm_i915_gem_request *request)
>      > >  {
>      > >     struct intel_ringbuffer *ringbuf = request->ringbuf;
>      > >     struct drm_i915_private *dev_priv = request->i915;
>      > > +   int i;
>      > >
>      > >     intel_logical_ring_advance(ringbuf);
>      > >     request->tail = ringbuf->tail;
>      > >
>      > >     /*
>      > > -    * Here we add two extra NOOPs as padding to avoid
>      > > +    * Here we add extra NOOPs as padding to avoid
>      > >      * lite restore of a context with HEAD==TAIL.
>      > > -    *
>      > > -    * Caller must reserve WA_TAIL_DWORDS for us!
>      > >      */
>      > > -   intel_logical_ring_emit(ringbuf, MI_NOOP);
>      > > -   intel_logical_ring_emit(ringbuf, MI_NOOP);
>      > > +   for (i = 0; i < ringbuf->wa_tail_dwords; i++)
>      > > +           intel_logical_ring_emit(ringbuf, MI_NOOP);
>      > > +
>      > >     intel_logical_ring_advance(ringbuf);
>      > >
>      > >     if (intel_ring_stopped(request->ring))
>      > > @@ -876,6 +876,16 @@ int intel_logical_ring_begin(struct
>      drm_i915_gem_request *req, int num_dwords)
>      > >     if (ret)
>      > >             return ret;
>      > >
>      > > +   if (IS_GEN8(req->ring->dev) || IS_GEN9(req->ring->dev))
>      >
>      > req->i915
>      >
>      > This is atrocious. Just allocate the extra space when required.
> 
>    by this logic I should just emit the mi_noops when required as well,
>    right?

Yes, I didn't like the placement of the wa_tail but I went with that to
avoid the code duplication.

>      Slightly less grumpy this morning.
> 
>    thanks 
> 
>      1. This is duplicating the reserved-space mechanism, by open-coding the
>      requirements for execlists. Fine-tuning the reserved space per ring may
>      be worth it, but probably not. Over-reserving space is not a huge issue
>      (it just effectively reduces the size of the ring), and the granularity
>      is the size of the average request.
> 
>    forgive this clueless mind here, but I don't see how I'm duplicating the
>    reserved-space... 

You are extending every begin by the overallocation required to emit
the tail dwords. We already extend every begin by the overallocation
required to emit the request (until we come to emit the request, where
there is no more overallocation applied).

>      2. You are hiding how much space is actually used during request
>      emission. This makes review impossible, and we depend upon review to
>      verify that the intel_ring_begin() matches the number of dwords emitted.
> 
>    but the mi_noops are hidden on the submit and advance... shouldn't we move
>    it back to the places that allocates it.

Hence why I stressed that in the comments - but it is a tail call, just
read it as one function. The important sequence is that

intel_ring_begin(count)
...
count x intel_ring_emit
...
intel_ring_advance()

is clear to the reader. Yes, this breaks that rule by replacing
intel_ring_advance() with a custom lr_ring_advance_and_submit() and
perhaps it would be clearer to add lr_ring_begin_for_submit() or
something to stress the slight discrepancy, but still make the pairing
clear.
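
As an illustration only, the pairing rule can be modelled with a toy ring
that records how many dwords the last begin() claimed and refuses to
advance until exactly that many have been emitted. This is a sketch under
stated assumptions: every name below (toy_ring, toy_ring_begin, and so on)
is hypothetical and none of it is i915 code.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_RING_DWORDS 64u	/* hypothetical ring size */

/* Toy model of a command ring; not the i915 intel_ringbuffer. */
struct toy_ring {
	uint32_t buf[TOY_RING_DWORDS];
	uint32_t tail;		/* index of the next dword to write */
	uint32_t space;		/* free dwords left in the ring */
	uint32_t claimed;	/* dwords promised by begin() but not yet written */
};

void toy_ring_init(struct toy_ring *r)
{
	r->tail = 0;
	r->space = TOY_RING_DWORDS;
	r->claimed = 0;
}

/* Claim room for exactly 'count' dwords. Returns 0 on success, -1 otherwise. */
int toy_ring_begin(struct toy_ring *r, uint32_t count)
{
	if (r->claimed != 0 || count > r->space)
		return -1;
	r->space -= count;
	r->claimed = count;
	return 0;
}

/* Write one dword; writing more than was claimed is a programming error. */
void toy_ring_emit(struct toy_ring *r, uint32_t dword)
{
	assert(r->claimed > 0);
	r->buf[r->tail] = dword;
	r->tail = (r->tail + 1) % TOY_RING_DWORDS;
	r->claimed--;
}

/* Close the access. Returns 0 only if every claimed dword was written. */
int toy_ring_advance(struct toy_ring *r)
{
	return r->claimed == 0 ? 0 : -1;
}
```

With this model a reviewer's check becomes mechanical: begin(2) followed by
two emits and an advance succeeds, while begin(3) followed by one emit makes
the advance report a mismatch.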

>      3. Is this even the right mechanism considering the number of other ways
>      of automatically emitting instructions between batches and contexts? We
>      cannot answer that as this patch is out of context.
> 
>    yeap, sorry again, I was just going to the easiest path to be able to
>    avoid the nulls per platform without adding 3 ifs..
>    But I wonder if you mean on comment "1." that we can live with
>    WA_TAIL_DWORDS 2 and avoid only the NULLs when needed... Is this the case?

If you want more dwords in the add_request callback, we need to add
those to the MIN_SPACE_FOR_ADD_REQUEST. If we need to add a lot, then
making it variable seems fine - but it should just hook into the common
mechanism i.e. the minimum space should be computed during engine
initialisation and the reservation applied at i915_gem_eequest_alloc().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms.
  2016-01-26 14:06       ` Chris Wilson
@ 2016-01-27 12:27         ` Dave Gordon
  2016-01-27 16:57           ` [PATCH] drm/i915/execlists: Move WA_TAIL_DWORDS to callee Chris Wilson
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Gordon @ 2016-01-27 12:27 UTC (permalink / raw)
  To: Chris Wilson, Rodrigo Vivi, Rodrigo Vivi, intel-gfx, Ben Widawsky

On 26/01/16 14:06, Chris Wilson wrote:
> On Tue, Jan 26, 2016 at 01:51:19PM +0000, Rodrigo Vivi wrote:
>>     On Tue, Jan 26, 2016 at 12:30 AM Chris Wilson
>>     <chris@chris-wilson.co.uk> wrote:
>>
>>       On Mon, Jan 25, 2016 at 09:17:15PM +0000, Chris Wilson wrote:
>>       > On Mon, Jan 25, 2016 at 11:29:19AM -0800, Rodrigo Vivi wrote:
>>       > > +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>       > > @@ -764,18 +764,18 @@ intel_logical_ring_advance_and_submit(struct
>>       drm_i915_gem_request *request)
>>       > >  {
>>       > >     struct intel_ringbuffer *ringbuf = request->ringbuf;
>>       > >     struct drm_i915_private *dev_priv = request->i915;
>>       > > +   int i;
>>       > >
>>       > >     intel_logical_ring_advance(ringbuf);
>>       > >     request->tail = ringbuf->tail;
>>       > >
>>       > >     /*
>>       > > -    * Here we add two extra NOOPs as padding to avoid
>>       > > +    * Here we add extra NOOPs as padding to avoid
>>       > >      * lite restore of a context with HEAD==TAIL.
>>       > > -    *
>>       > > -    * Caller must reserve WA_TAIL_DWORDS for us!
>>       > >      */
>>       > > -   intel_logical_ring_emit(ringbuf, MI_NOOP);
>>       > > -   intel_logical_ring_emit(ringbuf, MI_NOOP);
>>       > > +   for (i = 0; i < ringbuf->wa_tail_dwords; i++)
>>       > > +           intel_logical_ring_emit(ringbuf, MI_NOOP);
>>       > > +
>>       > >     intel_logical_ring_advance(ringbuf);
>>       > >
>>       > >     if (intel_ring_stopped(request->ring))
>>       > > @@ -876,6 +876,16 @@ int intel_logical_ring_begin(struct
>>       drm_i915_gem_request *req, int num_dwords)
>>       > >     if (ret)
>>       > >             return ret;
>>       > >
>>       > > +   if (IS_GEN8(req->ring->dev) || IS_GEN9(req->ring->dev))
>>       >
>>       > req->i915
>>       >
>>       > This is atrocious. Just allocate the extra space when required.
>>
>>     by this logic I should just emit the mi_noops when required as well,
>>     right?
>
> Yes, I didn't like the placement of the wa_tail but I went with that to
> avoid the code duplication.
>
>>       Slightly less grumpy this morning.
>>
>>     thanks
>>
>>       1. This is duplicating the reserved-space mechanism, by open-coding the
>>       requirements for execlists. Fine-tuning the reserved space per ring may
>>       be worth it, but probably not. Over-reserving space is not a huge issue
>>       (it just effectively reduces the size of the ring), and the granularity
>>       is the size of the average request.
>>
>>     forgive this clueless mind here, but I don't see how I'm duplicating the
>>     reserved-space...
>
> You are extending every begin by the overallocation required to emit
> the tail dwords. We already extend every begin by the overallocation
> required to emit the request (until we come to emit the request, where
> there is no more overallocation applied).
>
>>       2. You are hiding how much space is actually used during request
>>       emission. This makes review impossible, and we depend upon review to
>>       verify that the intel_ring_begin() matches the number of dwords emitted.
>>
>>     but the mi_noops are hidden on the submit and advance... shouldn't we move
>>     it back to the places that allocates it.
>
> Hence why I stressed that in the comments - but it is a tail call, just
> read it as one function. The important sequence is that
>
> intel_ring_begin(count)
> ...
> count x intel_ring_emit
> ...
> intel_ring_advance()
>
> is clear to the reader. Yes, this breaks that rule by replacing
> intel_ring_advance() with a custom lr_ring_advance_and_submit() and
> perhaps it would be clearer to add lr_ring_begin_for_submit() or
> something to stress the slight discrepancy, but still make the pairing
> clear.
>
>>       3. Is this even the right mechanism considering the number of other ways
>>       of automatically emitting instructions between batches and contexts? We
>>       cannot answer that as this patch is out of context.
>>
>>     yeap, sorry again, I was just going to the easiest path to be able to
>>     avoid the nulls per platform without adding 3 ifs..
>>     But I wonder if you mean on comment "1." that we can live with
>>     WA_TAIL_DWORDS 2 and avoid only the NULLs when needed... Is this the case?
>
> If you want more dwords in the add_request callback, we need to add
> those to the MIN_SPACE_FOR_ADD_REQUEST. If we need to add a lot, then
> making it variable seems fine - but it should just hook into the common
> mechanism i.e. the minimum space should be computed during engine
> initialisation and the reservation applied at i915_gem_request_alloc().
> -Chris

I think the cleanest partitioning of the functionality would be:
     1. The space for the NOOPs should be accounted for in the reserved
        space, because it's just part of the total space required to
        complete an add_request/emit_request(). Since the amount
        reserved is determined in intel_{logical_}ring_reserve_space()
        it could be added only in the LRC path, if we were concerned
        about the extra space (which I don't think we should be).

     2. callers do begin(N), N*emit(), advance(), add_request(). They
        don't bother about extra NOOPs.

     3. gen8_emit_request() shouldn't have to bother with them either, or
        even with claiming the space for them.

     4. advance_and_submit() (which is execlist specific) can do an extra
        begin() just to keep begin/advance balanced -- it can't fail or
        wait, 'cos it's in the reserved space -- and emits the extra
        NOOPs. This is where it can be made conditional on specific GENs,
        if you want that to be explicit, though since the overhead is so
        small I'd be inclined to always enable it here, and only check
        whether to actually apply the TAIL-bump in the ELSP-poking code.

In summary: mostly as Chris had it, but without the extra space being 
added to the begin() call in gen8_emit_request() (as Rodrigo has it).
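
The partitioning in points 1 to 4 can be sketched roughly as below. This is
a toy model under stated assumptions, not driver code: all names are
hypothetical, the 6 and 2 dword counts are taken from the patch under
discussion, and emission is reduced to a counter.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_EMIT_REQUEST_DWORDS	6u	/* as in begin(request, 6) in the patch */
#define TOY_WA_TAIL_DWORDS	2u	/* NOOP padding for the workaround */

/* Toy ring accounting; not the i915 intel_ringbuffer. */
struct toy_ring {
	uint32_t space;		/* free dwords, including the reservation */
	uint32_t reserved;	/* held back to finish the current request */
	uint32_t emitted;	/* dwords written so far */
};

/* Point 1: reserve room for the request tail, NOOPs included, up front. */
int toy_request_alloc(struct toy_ring *r)
{
	uint32_t need = TOY_EMIT_REQUEST_DWORDS + TOY_WA_TAIL_DWORDS;

	if (r->space < need)
		return -1;
	r->reserved = need;
	return 0;
}

/* Point 2: ordinary callers may fail here but never eat the reservation. */
int toy_begin(struct toy_ring *r, uint32_t count)
{
	if (count > r->space - r->reserved)
		return -1;
	r->space -= count;
	return 0;
}

/* Draws on the reservation, so it cannot fail or wait. */
void toy_begin_reserved(struct toy_ring *r, uint32_t count)
{
	assert(count <= r->reserved);
	r->reserved -= count;
	r->space -= count;
}

/* Point 3: emit-request claims only what it writes itself. */
void toy_emit_request(struct toy_ring *r)
{
	toy_begin_reserved(r, TOY_EMIT_REQUEST_DWORDS);
	r->emitted += TOY_EMIT_REQUEST_DWORDS;	/* stands in for six emits */
}

/* Point 4: the submit step does its own begin for the padding NOOPs. */
void toy_advance_and_submit(struct toy_ring *r, int needs_wa)
{
	if (needs_wa) {
		toy_begin_reserved(r, TOY_WA_TAIL_DWORDS);
		r->emitted += TOY_WA_TAIL_DWORDS;	/* stands in for two NOOPs */
	}
	r->reserved = 0;	/* release whatever was not used */
}
```

In this model a ring with 16 free dwords accepts at most 8 dwords of payload
once a request is allocated; the remaining 8 are guaranteed to be there for
the request tail and the workaround NOOPs.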

.Dave.

* ✓ Fi.CI.BAT: success for drm/i915: Make wa_tail_dwords flexible for future platforms.
  2016-01-25 19:29 [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms Rodrigo Vivi
  2016-01-25 19:36 ` Ben Widawsky
  2016-01-25 21:17 ` Chris Wilson
@ 2016-01-27 16:14 ` Patchwork
  2016-01-28 17:08 ` ✗ Fi.CI.BAT: failure for drm/i915: Make wa_tail_dwords flexible for future platforms. (rev2) Patchwork
  3 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2016-01-27 16:14 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

== Summary ==

Built on 5ae916607e3e12ba18c848dff42baaad5b118c4b drm-intel-nightly: 2016y-01m-27d-12h-48m-36s UTC integration manifest

Test kms_flip:
        Subgroup basic-flip-vs-dpms:
                pass       -> DMESG-WARN (ilk-hp8440p) UNSTABLE

bdw-nuci7        total:141  pass:132  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:144  pass:138  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:144  pass:120  dwarn:0   dfail:0   fail:0   skip:24 
byt-nuc          total:144  pass:129  dwarn:0   dfail:0   fail:0   skip:15 
hsw-brixbox      total:144  pass:137  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:144  pass:140  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:144  pass:104  dwarn:1   dfail:0   fail:1   skip:38 
ivb-t430s        total:144  pass:138  dwarn:0   dfail:0   fail:0   skip:6  
skl-i5k-2        total:144  pass:135  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:144  pass:130  dwarn:0   dfail:0   fail:0   skip:14 
snb-x220t        total:144  pass:130  dwarn:0   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1264/

* [PATCH] drm/i915/execlists: Move WA_TAIL_DWORDS to callee
  2016-01-27 12:27         ` Dave Gordon
@ 2016-01-27 16:57           ` Chris Wilson
  2016-02-01 14:00             ` Dave Gordon
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wilson @ 2016-01-27 16:57 UTC (permalink / raw)
  To: intel-gfx

Currently emit-request starts writing to the ring and reserves space for
a workaround to be emitted later whilst submitting the request. It is
easier to read if the caller only allocates sufficient space for its
access (then the reader can quickly verify that the ring begin allocates
the exact space for the number of dwords emitted) and closes the access
to the ring. During submit, if we need to add the workaround, we can
reacquire ring access, in the assurance that we reserved space for
ourselves when beginning the request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 41 ++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index da97bc5666b5..74fcf0f8d97a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -760,23 +760,27 @@ static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
  * point, the tail *inside* the context is updated and the ELSP written to.
  */
 static int
-intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
+intel_logical_ring_submit(struct drm_i915_gem_request *request)
 {
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	struct drm_i915_private *dev_priv = request->i915;
 
-	intel_logical_ring_advance(ringbuf);
 	request->tail = ringbuf->tail;
 
 	/*
-	 * Here we add two extra NOOPs as padding to avoid
-	 * lite restore of a context with HEAD==TAIL.
-	 *
-	 * Caller must reserve WA_TAIL_DWORDS for us!
+	 * Reserve space for 2 NOOPs at the end of each request to be
+	 * used as a workaround for not being allowed to do lite
+	 * restore with HEAD==TAIL (WaIdleLiteRestore).
 	 */
-	intel_logical_ring_emit(ringbuf, MI_NOOP);
-	intel_logical_ring_emit(ringbuf, MI_NOOP);
-	intel_logical_ring_advance(ringbuf);
+	if (1 /* need WaIdleLiteRestore */) {
+		int ret = intel_logical_ring_begin(request, 2);
+		if (ret)
+			return ret;
+
+		intel_logical_ring_emit(ringbuf, MI_NOOP);
+		intel_logical_ring_emit(ringbuf, MI_NOOP);
+		intel_logical_ring_advance(ringbuf);
+	}
 
 	if (intel_ring_stopped(request->ring))
 		return 0;
@@ -1858,13 +1862,6 @@ static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
-/*
- * Reserve space for 2 NOOPs at the end of each request to be
- * used as a workaround for not being allowed to do lite
- * restore with HEAD==TAIL (WaIdleLiteRestore).
- */
-#define WA_TAIL_DWORDS 2
-
 static inline u32 hws_seqno_address(struct intel_engine_cs *engine)
 {
 	return engine->status_page.gfx_addr + I915_GEM_HWS_INDEX_ADDR;
@@ -1875,7 +1872,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	int ret;
 
-	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
+	ret = intel_logical_ring_begin(request, 6);
 	if (ret)
 		return ret;
 
@@ -1891,7 +1888,9 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
 	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
 	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
-	return intel_logical_ring_advance_and_submit(request);
+	intel_logical_ring_advance(ringbuf);
+
+	return intel_logical_ring_submit(request);
 }
 
 static int gen8_emit_request_render(struct drm_i915_gem_request *request)
@@ -1899,7 +1898,7 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
 	int ret;
 
-	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
+	ret = intel_logical_ring_begin(request, 6);
 	if (ret)
 		return ret;
 
@@ -1916,7 +1915,9 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
 	intel_logical_ring_emit(ringbuf, 0);
 	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
 	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
-	return intel_logical_ring_advance_and_submit(request);
+	intel_logical_ring_advance(ringbuf);
+
+	return intel_logical_ring_submit(request);
 }
 
 static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req)
-- 
2.7.0

* ✗ Fi.CI.BAT: failure for drm/i915: Make wa_tail_dwords flexible for future platforms. (rev2)
  2016-01-25 19:29 [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms Rodrigo Vivi
                   ` (2 preceding siblings ...)
  2016-01-27 16:14 ` ✓ Fi.CI.BAT: success for drm/i915: Make wa_tail_dwords flexible for future platforms Patchwork
@ 2016-01-28 17:08 ` Patchwork
  3 siblings, 0 replies; 11+ messages in thread
From: Patchwork @ 2016-01-28 17:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Summary ==

Series 2818v2 drm/i915: Make wa_tail_dwords flexible for future platforms.
http://patchwork.freedesktop.org/api/1.0/series/2818/revisions/2/mbox/

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                pass       -> DMESG-WARN (ivb-t430s)

bdw-nuci7        total:156  pass:147  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:159  pass:153  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:159  pass:135  dwarn:0   dfail:0   fail:0   skip:24 
byt-nuc          total:159  pass:142  dwarn:0   dfail:0   fail:0   skip:17 
hsw-brixbox      total:159  pass:152  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:159  pass:155  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:159  pass:114  dwarn:0   dfail:0   fail:1   skip:44 
ivb-t430s        total:159  pass:150  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:159  pass:141  dwarn:0   dfail:0   fail:0   skip:18 
snb-x220t        total:159  pass:141  dwarn:0   dfail:0   fail:1   skip:17 

HANGED skl-i5k-2 in igt@gem_sync@basic-blt

Results at /archive/results/CI_IGT_test/Patchwork_1307/

b3f8ad64bc71f6236f05c2e9f4ad49a61745869a drm-intel-nightly: 2016y-01m-28d-10h-26m-23s UTC integration manifest
ba92b4fc17f9d80dcfb44de8cebffd957af4f73a drm/i915/execlists: Move WA_TAIL_DWORDS to callee

* Re: [PATCH] drm/i915/execlists: Move WA_TAIL_DWORDS to callee
  2016-01-27 16:57           ` [PATCH] drm/i915/execlists: Move WA_TAIL_DWORDS to callee Chris Wilson
@ 2016-02-01 14:00             ` Dave Gordon
  0 siblings, 0 replies; 11+ messages in thread
From: Dave Gordon @ 2016-02-01 14:00 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 27/01/16 16:57, Chris Wilson wrote:
> Currently emit-request starts writing to the ring and reserves space for
> a workaround to be emitted later whilst submitting the request. It is
> easier to read if the caller only allocates sufficient space for its
> access (then the reader can quickly verify that the ring begin allocates
> the exact space for the number of dwords emitted) and closes the access
> to the ring. During submit, if we need to add the workaround, we can
> reacquire ring access, in the assurance that we reserved space for
> ourselves when beginning the request.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Dave Gordon <david.s.gordon@intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
> ---

Generally, yes, but ...

>   drivers/gpu/drm/i915/intel_lrc.c | 41 ++++++++++++++++++++--------------------
>   1 file changed, 21 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index da97bc5666b5..74fcf0f8d97a 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -760,23 +760,27 @@ static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
>    * point, the tail *inside* the context is updated and the ELSP written to.
>    */
>   static int
> -intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
> +intel_logical_ring_submit(struct drm_i915_gem_request *request)

The comment above this function still has the old name
(intel_logical_ring_advance_and_submit).

>   {
>   	struct intel_ringbuffer *ringbuf = request->ringbuf;
>   	struct drm_i915_private *dev_priv = request->i915;
>
> -	intel_logical_ring_advance(ringbuf);
>   	request->tail = ringbuf->tail;
>
>   	/*
> -	 * Here we add two extra NOOPs as padding to avoid
> -	 * lite restore of a context with HEAD==TAIL.
> -	 *
> -	 * Caller must reserve WA_TAIL_DWORDS for us!
> +	 * Reserve space for 2 NOOPs at the end of each request to be
> +	 * used as a workaround for not being allowed to do lite
> +	 * restore with HEAD==TAIL (WaIdleLiteRestore).
>   	 */
> -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> -	intel_logical_ring_emit(ringbuf, MI_NOOP);
> -	intel_logical_ring_advance(ringbuf);
> +	if (1 /* need WaIdleLiteRestore */) {
> +		int ret = intel_logical_ring_begin(request, 2);
> +		if (ret)
> +			return ret;
> +
> +		intel_logical_ring_emit(ringbuf, MI_NOOP);
> +		intel_logical_ring_emit(ringbuf, MI_NOOP);
> +		intel_logical_ring_advance(ringbuf);
> +	}

How about keeping the generalisation of emitting WA_TAIL_DWORDS of NOOPs 
(and the test can be if this is greater than 0) ...

>
>   	if (intel_ring_stopped(request->ring))
>   		return 0;
> @@ -1858,13 +1862,6 @@ static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
>   	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
>   }
>
> -/*
> - * Reserve space for 2 NOOPs at the end of each request to be
> - * used as a workaround for not being allowed to do lite
> - * restore with HEAD==TAIL (WaIdleLiteRestore).
> - */
> -#define WA_TAIL_DWORDS 2
> -

... and keeping the define of WA_TAIL_DWORDS (but preferably moved to 
the top of the file), and changing intel_logical_ring_reserve_space() to 
add this many dwords to the space reserved.

That should make clear the connection between:
1. reserving the space (intel_logical_ring_reserve_space)
2. filling it with NOOPs (intel_logical_ring_submit)
3. using the space (execlists_context_unqueue)

because they would each mention WA_TAIL_DWORDS and WaIdleLiteRestore, 
and it will be obvious that it really is just a fix for a specific issue 
with execlist submission.
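
To illustrate, here is a minimal user-space sketch of those three steps
sharing one constant. It is a toy model, not the i915 code: struct toy_ring
and the toy_* helpers are hypothetical names made up for this example, the
ring does not wrap, and only WA_TAIL_DWORDS and MI_NOOP are borrowed from
the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MI_NOOP          0u
#define WA_TAIL_DWORDS   2   /* padding for WaIdleLiteRestore */
#define TOY_RING_DWORDS  64  /* hypothetical ring size, no wrap handling */

/* Hypothetical stand-in for the ringbuffer state. */
struct toy_ring {
	uint32_t buf[TOY_RING_DWORDS];
	unsigned int tail;     /* index of the next dword to write */
	unsigned int reserved; /* dwords set aside for the submit path */
};

/* 1. Reserve: account for the workaround padding up front. */
static int toy_ring_reserve_space(struct toy_ring *ring,
				  unsigned int request_dwords)
{
	if (ring->tail + request_dwords + WA_TAIL_DWORDS > TOY_RING_DWORDS)
		return -1;
	ring->reserved = WA_TAIL_DWORDS;
	return 0;
}

static void toy_ring_emit(struct toy_ring *ring, uint32_t dword)
{
	assert(ring->tail < TOY_RING_DWORDS);
	ring->buf[ring->tail++] = dword;
}

/*
 * 2. Fill: record the request tail, then emit WA_TAIL_DWORDS NOOPs
 * into the space reserved in step 1. Returns the request tail, which
 * points just before the padding.
 */
static unsigned int toy_ring_submit(struct toy_ring *ring)
{
	unsigned int request_tail = ring->tail;
	unsigned int i;

	if (WA_TAIL_DWORDS > 0) {
		assert(ring->reserved >= WA_TAIL_DWORDS);
		for (i = 0; i < WA_TAIL_DWORDS; i++)
			toy_ring_emit(ring, MI_NOOP);
		ring->reserved -= WA_TAIL_DWORDS;
	}
	return request_tail;
}

/*
 * 3. Use: when a resubmission would otherwise leave HEAD==TAIL,
 * move the tail past the padding so the hardware sees new work.
 */
static unsigned int toy_unqueue_tail(unsigned int request_tail,
				     int lite_restore)
{
	return lite_restore ? request_tail + WA_TAIL_DWORDS : request_tail;
}
```

With everything keyed off WA_TAIL_DWORDS, setting it to 0 for a platform
that does not need the workaround would disable all three steps at once in
this model.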

BTW the comment in execlists_context_unqueue() is (or will be) wrong 
about where the padding is added.

All the remaining changes below look good :)

.Dave.

>   static inline u32 hws_seqno_address(struct intel_engine_cs *engine)
>   {
>   	return engine->status_page.gfx_addr + I915_GEM_HWS_INDEX_ADDR;
> @@ -1875,7 +1872,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
>   	struct intel_ringbuffer *ringbuf = request->ringbuf;
>   	int ret;
>
> -	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
> +	ret = intel_logical_ring_begin(request, 6);
>   	if (ret)
>   		return ret;
>
> @@ -1891,7 +1888,9 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
>   	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
>   	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
>   	intel_logical_ring_emit(ringbuf, MI_NOOP);
> -	return intel_logical_ring_advance_and_submit(request);
> +	intel_logical_ring_advance(ringbuf);
> +
> +	return intel_logical_ring_submit(request);
>   }
>
>   static int gen8_emit_request_render(struct drm_i915_gem_request *request)
> @@ -1899,7 +1898,7 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
>   	struct intel_ringbuffer *ringbuf = request->ringbuf;
>   	int ret;
>
> -	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
> +	ret = intel_logical_ring_begin(request, 6);
>   	if (ret)
>   		return ret;
>
> @@ -1916,7 +1915,9 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
>   	intel_logical_ring_emit(ringbuf, 0);
>   	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
>   	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
> -	return intel_logical_ring_advance_and_submit(request);
> +	intel_logical_ring_advance(ringbuf);
> +
> +	return intel_logical_ring_submit(request);
>   }
>
>   static int intel_lr_context_render_state_init(struct drm_i915_gem_request *req)
>

end of thread, other threads:[~2016-02-01 14:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-25 19:29 [PATCH] drm/i915: Make wa_tail_dwords flexible for future platforms Rodrigo Vivi
2016-01-25 19:36 ` Ben Widawsky
2016-01-25 21:17 ` Chris Wilson
2016-01-26  8:29   ` Chris Wilson
2016-01-26 13:51     ` Rodrigo Vivi
2016-01-26 14:06       ` Chris Wilson
2016-01-27 12:27         ` Dave Gordon
2016-01-27 16:57           ` [PATCH] drm/i915/execlists: Move WA_TAIL_DWORDS to callee Chris Wilson
2016-02-01 14:00             ` Dave Gordon
2016-01-27 16:14 ` ✓ Fi.CI.BAT: success for drm/i915: Make wa_tail_dwords flexible for future platforms Patchwork
2016-01-28 17:08 ` ✗ Fi.CI.BAT: failure for drm/i915: Make wa_tail_dwords flexible for future platforms. (rev2) Patchwork
