* [PATCH] drm/i915/guc: Move GuC wq_reserve_space to ring_reserve_space
From: yu.dai @ 2015-12-09  1:04 UTC
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

Split the GuC work queue space reservation from submission and move
the reservation to where ring space is reserved. The reason is that a
failure in intel_logical_ring_advance_and_submit won't be handled. In
case a timeout happens, the driver can return early in order to handle
the error.
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 21 +++++++++------------
 drivers/gpu/drm/i915/intel_guc.h           |  1 +
 drivers/gpu/drm/i915/intel_lrc.c           | 10 +++++++++-
 3 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 8a71458..264fdf7 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 			     sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
 {
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
 	int ret = -ETIMEDOUT, timeout_counter = 200;
 
+	if (!gc)
+		return 0;
+
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
 		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-			*offset = gc->wq_tail;
-
-			/* advance the tail for next workqueue item */
-			gc->wq_tail += size;
-			gc->wq_tail &= gc->wq_size - 1;
-
 			/* this will break the loop */
 			timeout_counter = 0;
 			ret = 0;
@@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
 	struct guc_wq_item *wqi;
 	void *base;
 	u32 tail, wq_len, wq_off = 0;
-	int ret;
 
-	ret = guc_get_workqueue_space(gc, &wq_off);
-	if (ret)
-		return ret;
+	wq_off = gc->wq_tail;
+
+	/* advance the tail for next workqueue item */
+	gc->wq_tail += sizeof(struct guc_wq_item);
+	gc->wq_tail &= gc->wq_size - 1;
 
 	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 	 * should not have the case where structure wqi is across page, neither
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..59c8e21 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
 		    struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_reserve_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f96fb51..25cbeab 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -844,6 +844,8 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 
 int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
 {
+	int ret;
+
 	/*
 	 * The first call merely notes the reserve request and is common for
 	 * all back ends. The subsequent localised _begin() call actually
@@ -854,7 +856,13 @@ int intel_logical_ring_reserve_space(struct drm_i915_gem_request *request)
 	 */
 	intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
 
-	return intel_logical_ring_begin(request, 0);
+	ret = intel_logical_ring_begin(request, 0);
+	if (ret)
+		return ret;
+
+	ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
+
+	return ret;
 }
 
 /**
-- 
2.5.0


* Re: [PATCH] drm/i915/guc: Move GuC wq_reserve_space to ring_reserve_space
From: Chris Wilson @ 2015-12-09  9:05 UTC
  To: yu.dai; +Cc: intel-gfx

On Tue, Dec 08, 2015 at 05:04:50PM -0800, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> Split the GuC work queue space reservation from submission and move
> the reservation to where ring space is reserved. The reason is that a
> failure in intel_logical_ring_advance_and_submit won't be handled. In
> case a timeout happens, the driver can return early in order to handle
> the error.

Not here. You want the request_alloc_extras callback.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915/guc: Move GuC wq_reserve_space to ring_reserve_space
From: Yu Dai @ 2015-12-09 17:08 UTC
  To: Chris Wilson, intel-gfx



On 12/09/2015 01:05 AM, Chris Wilson wrote:
> On Tue, Dec 08, 2015 at 05:04:50PM -0800, yu.dai@intel.com wrote:
> > From: Alex Dai <yu.dai@intel.com>
> >
> > Split the GuC work queue space reservation from submission and move
> > the reservation to where ring space is reserved. The reason is that a
> > failure in intel_logical_ring_advance_and_submit won't be handled. In
> > case a timeout happens, the driver can return early in order to handle
> > the error.
>
> Not here. You want the request_alloc_extras callback.

OK. That is even before touching the ring.

Thanks,
Alex

* [PATCH v2] drm/i915/guc: Move GuC wq_reserve_space to alloc_request_extras
From: yu.dai @ 2015-12-09 18:50 UTC
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

Split the GuC work queue space reservation from submission and move
it to ring_alloc_request_extras. The reason is that a failure in the
later i915_add_request() won't be handled. In case a timeout happens,
the driver can return early in order to handle the error.

v1: Move wq_reserve_space to ring_reserve_space
v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)

Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 21 +++++++++------------
 drivers/gpu/drm/i915/intel_guc.h           |  1 +
 drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++++
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 226e9c0..f7bd038 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 			     sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
 {
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
 	int ret = -ETIMEDOUT, timeout_counter = 200;
 
+	if (!gc)
+		return 0;
+
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
 		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-			*offset = gc->wq_tail;
-
-			/* advance the tail for next workqueue item */
-			gc->wq_tail += size;
-			gc->wq_tail &= gc->wq_size - 1;
-
 			/* this will break the loop */
 			timeout_counter = 0;
 			ret = 0;
@@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
 	struct guc_wq_item *wqi;
 	void *base;
 	u32 tail, wq_len, wq_off = 0;
-	int ret;
 
-	ret = guc_get_workqueue_space(gc, &wq_off);
-	if (ret)
-		return ret;
+	wq_off = gc->wq_tail;
+
+	/* advance the tail for next workqueue item */
+	gc->wq_tail += sizeof(struct guc_wq_item);
+	gc->wq_tail &= gc->wq_size - 1;
 
 	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 	 * should not have the case where structure wqi is across page, neither
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..59c8e21 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
 		    struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_reserve_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f96fb51..7d53d27 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 			return ret;
 	}
 
+	/* Reserve GuC WQ space here (one request needs one WQ item) because
+	 * the later i915_add_request() call can't fail. */
+	ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
-- 
2.5.0


* Re: [PATCH v2] drm/i915/guc: Move GuC wq_reserve_space to alloc_request_extras
From: Dave Gordon @ 2015-12-10 17:14 UTC
  To: yu.dai, intel-gfx

On 09/12/15 18:50, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> Split the GuC work queue space reservation from submission and move
> it to ring_alloc_request_extras. The reason is that a failure in the
> later i915_add_request() won't be handled. In case a timeout happens,
> the driver can return early in order to handle the error.
>
> v1: Move wq_reserve_space to ring_reserve_space
> v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
>
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 21 +++++++++------------
>   drivers/gpu/drm/i915/intel_guc.h           |  1 +
>   drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++++
>   3 files changed, 16 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 226e9c0..f7bd038 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
>   			     sizeof(desc) * client->ctx_index);
>   }
>
> -/* Get valid workqueue item and return it back to offset */
> -static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> +int i915_guc_wq_reserve_space(struct i915_guc_client *gc)

I think the name is misleading, because we don't actually reserve 
anything here, just check that there is some free space in the WQ.

(We certainly don't WANT to reserve anything, because that would be 
difficult to clean up in the event of submission failure for any other 
reason.) So I think it's only the name that needs changing. Although ...

>   {
>   	struct guc_process_desc *desc;
>   	void *base;
>   	u32 size = sizeof(struct guc_wq_item);
>   	int ret = -ETIMEDOUT, timeout_counter = 200;
>
> +	if (!gc)
> +		return 0;
> +
>   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>   	desc = base + gc->proc_desc_offset;
>
>   	while (timeout_counter-- > 0) {
>   		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {

... as an alternative strategy, we could cache the calculated freespace 
in the client structure; then if we already know there's at least 1 slot 
free from last time we checked, we could then just decrement the cached 
value and avoid the kmap+spinwait overhead. Only when we reach 0 would 
we have to go through this code to refresh our view of desc->head and 
recalculate the actual current freespace. [NB: clear cached value on reset?]

Does that sound like a useful optimisation?
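
For concreteness, a minimal userspace sketch of that caching scheme --
hypothetical names, with CIRC_SPACE copied from the kernel's
<linux/circ_buf.h> so it builds standalone; not the actual i915 code:

	#include <stdbool.h>
	#include <stdint.h>

	/* From <linux/circ_buf.h>: free space in a power-of-two ring */
	#define CIRC_SPACE(head, tail, size) \
		(((tail) - ((head) + 1)) & ((size) - 1))

	struct wq_client {
		uint32_t wq_tail;	/* producer index (the driver) */
		uint32_t wq_head;	/* cached copy of the GuC's consumer head */
		uint32_t wq_size;	/* ring size in bytes, power of two */
	};

	/* Fast path: trust the cached head -- no kmap, no spin-wait. */
	static bool wq_space_cached(const struct wq_client *gc, uint32_t need)
	{
		return CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= need;
	}

	/* Slow path stand-in: refresh the cache from the shared descriptor
	 * (in the driver this is where kmap_atomic and the timeout loop
	 * would live), then re-check. */
	static bool wq_space_refresh(struct wq_client *gc, uint32_t fresh_head,
				     uint32_t need)
	{
		gc->wq_head = fresh_head;
		return wq_space_cached(gc, need);
	}

	int main(void)
	{
		struct wq_client gc = { .wq_tail = 0, .wq_head = 0, .wq_size = 8192 };

		/* empty 8 KiB ring: a 16-byte item trivially fits */
		return wq_space_refresh(&gc, 0, 16) ? 0 : 1;
	}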

> -			*offset = gc->wq_tail;
> -
> -			/* advance the tail for next workqueue item */
> -			gc->wq_tail += size;
> -			gc->wq_tail &= gc->wq_size - 1;
> -
>   			/* this will break the loop */
>   			timeout_counter = 0;
>   			ret = 0;
> @@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
>   	struct guc_wq_item *wqi;
>   	void *base;
>   	u32 tail, wq_len, wq_off = 0;
> -	int ret;
>
> -	ret = guc_get_workqueue_space(gc, &wq_off);
> -	if (ret)
> -		return ret;
> +	wq_off = gc->wq_tail;
> +
> +	/* advance the tail for next workqueue item */
> +	gc->wq_tail += sizeof(struct guc_wq_item);
> +	gc->wq_tail &= gc->wq_size - 1;

I was a bit unhappy about this code just assuming that there *must* be 
space (because we KNOW we've checked above) -- unless someone violated 
the proper calling sequence (TDR?). OTOH, it would be too expensive to 
go through the map-and-calculate code all over again just to catch an 
unlikely scenario. But, if we cache the last-calculated value as above, 
then the check could be cheap :) For example, just add a pre_checked 
size field that's set by the pre-check and then checked and decremented 
on submission; there shouldn't be more than one submission in progress 
at a time, because dev->struct_mutex is held across the whole sequence 
(but it's not an error to see two pre-checks in a row, because a request 
can be abandoned partway).
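
And a hedged sketch of the consume-side guard that such a pre_checked
field would enable -- hypothetical names, not driver code; it assumes
the pre-check already ran under the same struct_mutex:

	#include <errno.h>
	#include <stdint.h>

	struct wq_client {
		uint32_t wq_tail;	/* producer index */
		uint32_t wq_size;	/* ring size in bytes, power of two */
		uint32_t pre_checked;	/* bytes verified free by the pre-check */
	};

	/* The pre-check records how much space it confirmed; running two
	 * pre-checks in a row just overwrites the value, which is fine. */
	void wq_note_prechecked(struct wq_client *gc, uint32_t bytes)
	{
		gc->pre_checked = bytes;
	}

	/* At submission: cheap sanity check, then consume and advance. */
	int wq_consume_item(struct wq_client *gc, uint32_t item_size,
			    uint32_t *wq_off)
	{
		if (gc->pre_checked < item_size)
			return -ENOSPC;		/* calling sequence violated */

		gc->pre_checked -= item_size;
		*wq_off = gc->wq_tail;
		gc->wq_tail = (gc->wq_tail + item_size) & (gc->wq_size - 1);
		return 0;
	}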

>   	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
>   	 * should not have the case where structure wqi is across page, neither
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 0e048bf..59c8e21 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
>   		    struct drm_i915_gem_request *rq);
>   void i915_guc_submission_disable(struct drm_device *dev);
>   void i915_guc_submission_fini(struct drm_device *dev);
> +int i915_guc_wq_reserve_space(struct i915_guc_client *client);
>
>   #endif
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f96fb51..7d53d27 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
>   			return ret;
>   	}
>
> +	/* Reserve GuC WQ space here (one request needs one WQ item) because
> +	 * the later i915_add_request() call can't fail. */
> +	ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
> +	if (ret)
> +		return ret;
> +
>   	return 0;
>   }

Worth checking for GuC submission before that call? Maybe not ...

.Dave.

* Re: [PATCH v2] drm/i915/guc: Move GuC wq_reserve_space to alloc_request_extras
From: Yu Dai @ 2015-12-11  0:22 UTC
  To: Dave Gordon, intel-gfx



On 12/10/2015 09:14 AM, Dave Gordon wrote:
> On 09/12/15 18:50, yu.dai@intel.com wrote:
> > From: Alex Dai <yu.dai@intel.com>
> >
> > Split the GuC work queue space reservation from submission and move
> > it to ring_alloc_request_extras. The reason is that a failure in the
> > later i915_add_request() won't be handled. In case a timeout happens,
> > the driver can return early in order to handle the error.
> >
> > v1: Move wq_reserve_space to ring_reserve_space
> > v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
> >
> > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_guc_submission.c | 21 +++++++++------------
> >   drivers/gpu/drm/i915/intel_guc.h           |  1 +
> >   drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++++
> >   3 files changed, 16 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index 226e9c0..f7bd038 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -472,25 +472,21 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
> >   			     sizeof(desc) * client->ctx_index);
> >   }
> >
> > -/* Get valid workqueue item and return it back to offset */
> > -static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> > +int i915_guc_wq_reserve_space(struct i915_guc_client *gc)
>
> I think the name is misleading, because we don't actually reserve
> anything here, just check that there is some free space in the WQ.
>
> (We certainly don't WANT to reserve anything, because that would be
> difficult to clean up in the event of submission failure for any other
> reason.) So I think it's only the name that needs changing. Although ...

I was trying to use a similar name to ring_reserve_space. It follows
the same pattern as filling the ring buffer - reserve, emit and advance.
The only difference is that it reserves 4 dwords (or checks for space)
rather than a variable number of bytes as in the ring buffer case.
Maybe using 'check' is better here, because 4 dwords in the wq is all
we need for a submission via GuC.
> >   {
> >   	struct guc_process_desc *desc;
> >   	void *base;
> >   	u32 size = sizeof(struct guc_wq_item);
> >   	int ret = -ETIMEDOUT, timeout_counter = 200;
> >
> > +	if (!gc)
> > +		return 0;
> > +
> >   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> >   	desc = base + gc->proc_desc_offset;
> >
> >   	while (timeout_counter-- > 0) {
> >   		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
>
> ... as an alternative strategy, we could cache the calculated freespace
> in the client structure; then if we already know there's at least 1 slot
> free from last time we checked, we could then just decrement the cached
> value and avoid the kmap+spinwait overhead. Only when we reach 0 would
> we have to go through this code to refresh our view of desc->head and
> recalculate the actual current freespace. [NB: clear cached value on reset?]
>
> Does that sound like a useful optimisation?

I think it is a good idea.
> > -			*offset = gc->wq_tail;
> > -
> > -			/* advance the tail for next workqueue item */
> > -			gc->wq_tail += size;
> > -			gc->wq_tail &= gc->wq_size - 1;
> > -
> >   			/* this will break the loop */
> >   			timeout_counter = 0;
> >   			ret = 0;
> > @@ -512,11 +508,12 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
> >   	struct guc_wq_item *wqi;
> >   	void *base;
> >   	u32 tail, wq_len, wq_off = 0;
> > -	int ret;
> >
> > -	ret = guc_get_workqueue_space(gc, &wq_off);
> > -	if (ret)
> > -		return ret;
> > +	wq_off = gc->wq_tail;
> > +
> > +	/* advance the tail for next workqueue item */
> > +	gc->wq_tail += sizeof(struct guc_wq_item);
> > +	gc->wq_tail &= gc->wq_size - 1;
>
> I was a bit unhappy about this code just assuming that there *must* be
> space (because we KNOW we've checked above) -- unless someone violated
> the proper calling sequence (TDR?). OTOH, it would be too expensive to
> go through the map-and-calculate code all over again just to catch an
> unlikely scenario. But, if we cache the last-calculated value as above,
> then the check could be cheap :) For example, just add a pre_checked
> size field that's set by the pre-check and then checked and decremented
> on submission; there shouldn't be more than one submission in progress
> at a time, because dev->struct_mutex is held across the whole sequence
> (but it's not an error to see two pre-checks in a row, because a request
> can be abandoned partway).

I don't understand the concern here. As I said above, filling the GuC
WQ is the same as filling the ring buffer - reserve (check), emit and
advance. These two lines are the GuC version of intel_logical_ring_emit
and intel_logical_ring_advance. Maybe if I move these two lines to the
end of guc_add_workqueue_item, people will understand it more easily.
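
As a rough outline of that check/emit/advance split -- hypothetical
names and a stubbed-out slow path, not the actual driver code:

	#include <errno.h>
	#include <stdint.h>

	#define CIRC_SPACE(head, tail, size) \
		(((tail) - ((head) + 1)) & ((size) - 1))
	#define WQ_ITEM_SIZE 16u	/* one workqueue item: 4 DWs */

	struct wq_client { uint32_t wq_tail, wq_head, wq_size; };

	/* Phase 1, at request-allocation time: only verify space; this is
	 * the one step that is allowed to fail. */
	int wq_check_space(struct wq_client *gc)
	{
		if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= WQ_ITEM_SIZE)
			return 0;
		/* the driver would re-read desc->head here and spin-wait */
		return -ETIMEDOUT;
	}

	/* Phase 2, at submission time: emit at the current tail, then
	 * advance; this step must not fail. */
	uint32_t wq_emit_and_advance(struct wq_client *gc)
	{
		uint32_t wq_off = gc->wq_tail;	/* the item is written here */

		gc->wq_tail = (gc->wq_tail + WQ_ITEM_SIZE) & (gc->wq_size - 1);
		return wq_off;
	}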

> >   	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
> >   	 * should not have the case where structure wqi is across page, neither
> > diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> > index 0e048bf..59c8e21 100644
> > --- a/drivers/gpu/drm/i915/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/intel_guc.h
> > @@ -123,5 +123,6 @@ int i915_guc_submit(struct i915_guc_client *client,
> >   		    struct drm_i915_gem_request *rq);
> >   void i915_guc_submission_disable(struct drm_device *dev);
> >   void i915_guc_submission_fini(struct drm_device *dev);
> > +int i915_guc_wq_reserve_space(struct i915_guc_client *client);
> >
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index f96fb51..7d53d27 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> >   			return ret;
> >   	}
> >
> > +	/* Reserve GuC WQ space here (one request needs one WQ item) because
> > +	 * the later i915_add_request() call can't fail. */
> > +	ret = i915_guc_wq_reserve_space(request->i915->guc.execbuf_client);
> > +	if (ret)
> > +		return ret;
> > +
> >   	return 0;
> >   }
>
> Worth checking for GuC submission before that call? Maybe not ...

It is purely a code alignment issue if I add an if() block. The
execbuf_client is valid only when GuC submission is enabled, and the
check is done inside that function call.
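
For illustration, the two guard placements discussed here, as a hedged
sketch (hypothetical names, not the driver code):

	struct wq_client;			/* opaque for this sketch */
	extern int wq_check_space(struct wq_client *gc);

	/* Style in this patch: the callee tolerates a NULL client (it
	 * begins with "if (!gc) return 0;"), so the caller stays flat. */
	int alloc_extras_flat(struct wq_client *execbuf_client)
	{
		return wq_check_space(execbuf_client);
	}

	/* Alternative: test the submission mode at the call site, at the
	 * cost of one extra indentation level. */
	int alloc_extras_guarded(struct wq_client *execbuf_client,
				 int guc_submission_enabled)
	{
		if (guc_submission_enabled)
			return wq_check_space(execbuf_client);
		return 0;
	}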

Alex

* [PATCH v3] drm/i915/guc: Move GuC wq_check_space to alloc_request_extras
From: yu.dai @ 2015-12-11  1:17 UTC
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

Split the GuC work queue space check from submission and move it to
ring_alloc_request_extras. The reason is that a failure in the later
i915_add_request() won't be handled. In case a timeout happens, the
driver can return early in order to handle the error.

v1: Move wq_reserve_space to ring_reserve_space
v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
v3: The work queue head pointer is cached by driver now. So we can
    quickly return if space is available.
    s/reserve/check/g (Dave Gordon)

Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 34 ++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_guc.h           |  2 ++
 drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++++
 3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 226e9c0..cb8e1f71 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -207,6 +207,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
 	/* Update the tail so it is visible to GuC */
 	desc->tail = gc->wq_tail;
 
+	/* Cache the head where GuC is processing */
+	gc->wq_head = desc->head;
+
 	/* current cookie */
 	db_cmp.db_status = GUC_DOORBELL_ENABLED;
 	db_cmp.cookie = gc->cookie;
@@ -472,28 +475,30 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 			     sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_check_space(struct i915_guc_client *gc)
 {
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
 	int ret = -ETIMEDOUT, timeout_counter = 200;
 
+	if (!gc)
+		return 0;
+
+	/* Quickly return if wq space is available since the last time we
+	 * cached the head position. */
+	if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
+		return 0;
+
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
-		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-			*offset = gc->wq_tail;
+		gc->wq_head = desc->head;
 
-			/* advance the tail for next workqueue item */
-			gc->wq_tail += size;
-			gc->wq_tail &= gc->wq_size - 1;
-
-			/* this will break the loop */
-			timeout_counter = 0;
+		if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
 			ret = 0;
+			break;
 		}
 
 		if (timeout_counter)
@@ -512,11 +517,8 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
 	struct guc_wq_item *wqi;
 	void *base;
 	u32 tail, wq_len, wq_off = 0;
-	int ret;
 
-	ret = guc_get_workqueue_space(gc, &wq_off);
-	if (ret)
-		return ret;
+	wq_off = gc->wq_tail;
 
 	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 	 * should not have the case where structure wqi is across page, neither
@@ -551,6 +553,10 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
 
 	kunmap_atomic(base);
 
+	/* advance the tail for next workqueue item */
+	gc->wq_tail += sizeof(struct guc_wq_item);
+	gc->wq_tail &= gc->wq_size - 1;
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..5cf555d 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -43,6 +43,7 @@ struct i915_guc_client {
 	uint32_t wq_offset;
 	uint32_t wq_size;
 	uint32_t wq_tail;
+	uint32_t wq_head;
 
 	/* GuC submission statistics & status */
 	uint64_t submissions[I915_NUM_RINGS];
@@ -123,5 +124,6 @@ int i915_guc_submit(struct i915_guc_client *client,
 		    struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_check_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f96fb51..9605ce9 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 			return ret;
 	}
 
+	/* Reserve GuC WQ space here (one request needs one WQ item) because
+	 * the later i915_add_request() call can't fail. */
+	ret = i915_guc_wq_check_space(request->i915->guc.execbuf_client);
+	if (ret)
+		return ret;
+
 	return 0;
 }
 
-- 
2.5.0


* Re: [PATCH v3] drm/i915/guc: Move GuC wq_check_space to alloc_request_extras
From: Dave Gordon @ 2015-12-15 15:45 UTC
  To: yu.dai, intel-gfx

On 11/12/15 01:17, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> Split the GuC work queue space check from submission and move it to
> ring_alloc_request_extras. The reason is that a failure in the later
> i915_add_request() won't be handled. In case a timeout happens, the
> driver can return early in order to handle the error.
>
> v1: Move wq_reserve_space to ring_reserve_space
> v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
> v3: The work queue head pointer is cached by driver now. So we can
>      quickly return if space is available.
>      s/reserve/check/g (Dave Gordon)
>
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_guc_submission.c | 34 ++++++++++++++++++------------
>   drivers/gpu/drm/i915/intel_guc.h           |  2 ++
>   drivers/gpu/drm/i915/intel_lrc.c           |  6 ++++++
>   3 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 226e9c0..cb8e1f71 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -207,6 +207,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>   	/* Update the tail so it is visible to GuC */
>   	desc->tail = gc->wq_tail;
>
> +	/* Cache the head where GuC is processing */
> +	gc->wq_head = desc->head;
> +

This could be at the end of this function, thus:

		/* update the cookie to newly read cookie from GuC */
		db_cmp.cookie = db_ret.cookie;
		db_exc.cookie = db_ret.cookie + 1;
		if (db_exc.cookie == 0)
			db_exc.cookie = 1;
	}

+	/* Finally, update the cached copy of the GuC's WQ head */
+	gc->wq_head = desc->head;
+
	kunmap_atomic(base);
	return ret;
}

which might increase the chance that we will get an updated value and 
thus avoid having to spinwait on the next check().

>   	/* current cookie */
>   	db_cmp.db_status = GUC_DOORBELL_ENABLED;
>   	db_cmp.cookie = gc->cookie;
> @@ -472,28 +475,30 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
>   			     sizeof(desc) * client->ctx_index);
>   }
>
> -/* Get valid workqueue item and return it back to offset */
> -static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> +int i915_guc_wq_check_space(struct i915_guc_client *gc)
>   {
>   	struct guc_process_desc *desc;
>   	void *base;
>   	u32 size = sizeof(struct guc_wq_item);
>   	int ret = -ETIMEDOUT, timeout_counter = 200;
>
> +	if (!gc)
> +		return 0;
> +
> +	/* Quickly return if wq space is available since the last time we
> +	 * cached the head position. */
> +	if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
> +		return 0;
> +
>   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>   	desc = base + gc->proc_desc_offset;
>
>   	while (timeout_counter-- > 0) {
> -		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
> -			*offset = gc->wq_tail;
> +		gc->wq_head = desc->head;
>
> -			/* advance the tail for next workqueue item */
> -			gc->wq_tail += size;
> -			gc->wq_tail &= gc->wq_size - 1;
> -
> -			/* this will break the loop */
> -			timeout_counter = 0;
> +		if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
>   			ret = 0;
> +			break;
>   		}
>
>   		if (timeout_counter)
> @@ -512,11 +517,8 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
>   	struct guc_wq_item *wqi;
>   	void *base;
>   	u32 tail, wq_len, wq_off = 0;
> -	int ret;
>
> -	ret = guc_get_workqueue_space(gc, &wq_off);
> -	if (ret)
> -		return ret;
> +	wq_off = gc->wq_tail;

Conversely, I'd prefer the updates below to be here instead; also the 
check that I mentioned last time:

+	u32 space = CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size);
+	if (WARN_ON(space < sizeof(struct guc_wq_item)))
+		return -ENOSPC;		/* shouldn't happen	*/
+
+	/* postincrement WQ tail for next time */
+	wq_off = gc->wq_tail;
+	gc->wq_tail += sizeof(struct guc_wq_item);
+	gc->wq_tail &= gc->wq_size - 1;

>
>   	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
>   	 * should not have the case where structure wqi is across page, neither
> @@ -551,6 +553,10 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
>
>   	kunmap_atomic(base);
>
> +	/* advance the tail for next workqueue item */
> +	gc->wq_tail += sizeof(struct guc_wq_item);
> +	gc->wq_tail &= gc->wq_size - 1;

(moved to above)

> +
>   	return 0;
>   }
>
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 0e048bf..5cf555d 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -43,6 +43,7 @@ struct i915_guc_client {
>   	uint32_t wq_offset;
>   	uint32_t wq_size;
>   	uint32_t wq_tail;
> +	uint32_t wq_head;
>
>   	/* GuC submission statistics & status */
>   	uint64_t submissions[I915_NUM_RINGS];
> @@ -123,5 +124,6 @@ int i915_guc_submit(struct i915_guc_client *client,
>   		    struct drm_i915_gem_request *rq);
>   void i915_guc_submission_disable(struct drm_device *dev);
>   void i915_guc_submission_fini(struct drm_device *dev);
> +int i915_guc_wq_check_space(struct i915_guc_client *client);
>
>   #endif
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index f96fb51..9605ce9 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -667,6 +667,12 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
>   			return ret;
>   	}
>
> +	/* Reserve GuC WQ space here (one request needs one WQ item) because
> +	 * the later i915_add_request() call can't fail. */
> +	ret = i915_guc_wq_check_space(request->i915->guc.execbuf_client);
> +	if (ret)
> +		return ret;
> +
>   	return 0;
>   }

It looks odd to call a GuC-only function regardless of whether we're 
using the GuC, so:

+	if (i915.enable_guc_submission) {
+		/*
+		 * Check that the GuC has space for the request before
+		 * going any further, as the i915_add_request() call
+		 * later on mustn't fail ...
+		 */
+		struct intel_guc *guc = &request->i915->guc;
+
+		ret = i915_guc_wq_check_space(guc->execbuf_client);
+		if (ret)
+			return ret;
+	}

.Dave.

* [PATCH v4] drm/i915/guc: Move GuC wq_check_space to alloc_request_extras
From: yu.dai @ 2015-12-16 19:45 UTC
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

Split the GuC work queue space check from submission and move it to
ring_alloc_request_extras. The reason is that a failure in the later
i915_add_request() won't be handled. In case a timeout happens, the
driver can return early in order to handle the error.

v1: Move wq_reserve_space to ring_reserve_space
v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
v3: The work queue head pointer is cached by driver now. So we can
    quickly return if space is available.
    s/reserve/check/g (Dave Gordon)
v4: Update cached wq head after ring doorbell; check wq space before
    ring doorbell in case unexpected error happens; call wq space
    check only when GuC submission is enabled. (Dave Gordon)

Signed-off-by: Alex Dai <yu.dai@intel.com>

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ef20071..7554d16 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -247,6 +247,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
 			db_exc.cookie = 1;
 	}
 
+	/* Finally, update the cached copy of the GuC's WQ head */
+	gc->wq_head = desc->head;
+
 	kunmap_atomic(base);
 	return ret;
 }
@@ -472,28 +475,30 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
 			     sizeof(desc) * client->ctx_index);
 }
 
-/* Get valid workqueue item and return it back to offset */
-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
+int i915_guc_wq_check_space(struct i915_guc_client *gc)
 {
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
 	int ret = -ETIMEDOUT, timeout_counter = 200;
 
+	if (!gc)
+		return 0;
+
+	/* Quickly return if wq space is available since the last time we
+	 * cached the head position. */
+	if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
+		return 0;
+
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
-		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
-			*offset = gc->wq_tail;
+		gc->wq_head = desc->head;
 
-			/* advance the tail for next workqueue item */
-			gc->wq_tail += size;
-			gc->wq_tail &= gc->wq_size - 1;
-
-			/* this will break the loop */
-			timeout_counter = 0;
+		if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
 			ret = 0;
+			break;
 		}
 
 		if (timeout_counter)
@@ -511,12 +516,16 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
 	enum intel_ring_id ring_id = rq->ring->id;
 	struct guc_wq_item *wqi;
 	void *base;
-	u32 tail, wq_len, wq_off = 0;
-	int ret;
+	u32 tail, wq_len, wq_off, space;
+
+	space = CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size);
+	if (WARN_ON(space < sizeof(struct guc_wq_item)))
+		return -ENOSPC; /* shouldn't happen */
 
-	ret = guc_get_workqueue_space(gc, &wq_off);
-	if (ret)
-		return ret;
+	/* postincrement WQ tail for next time */
+	wq_off = gc->wq_tail;
+	gc->wq_tail += sizeof(struct guc_wq_item);
+	gc->wq_tail &= gc->wq_size - 1;
 
 	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
 	 * should not have the case where structure wqi is across page, neither
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 0e048bf..5cf555d 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -43,6 +43,7 @@ struct i915_guc_client {
 	uint32_t wq_offset;
 	uint32_t wq_size;
 	uint32_t wq_tail;
+	uint32_t wq_head;
 
 	/* GuC submission statistics & status */
 	uint64_t submissions[I915_NUM_RINGS];
@@ -123,5 +124,6 @@ int i915_guc_submit(struct i915_guc_client *client,
 		    struct drm_i915_gem_request *rq);
 void i915_guc_submission_disable(struct drm_device *dev);
 void i915_guc_submission_fini(struct drm_device *dev);
+int i915_guc_wq_check_space(struct i915_guc_client *client);
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 272f36f..cd232d2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -667,6 +667,19 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
 			return ret;
 	}
 
+	if (i915.enable_guc_submission) {
+		/*
+		 * Check that the GuC has space for the request before
+		 * going any further, as the i915_add_request() call
+		 * later on mustn't fail ...
+		 */
+		struct intel_guc *guc = &request->i915->guc;
+
+		ret = i915_guc_wq_check_space(guc->execbuf_client);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 
-- 
2.5.0


* Re: [PATCH v4] drm/i915/guc: Move GuC wq_check_space to alloc_request_extras
From: Dave Gordon @ 2015-12-23 15:39 UTC
  To: yu.dai, intel-gfx

On 16/12/15 19:45, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> Split the GuC work queue space check from submission and move it to
> ring_alloc_request_extras. The reason is that a failure in the later
> i915_add_request() won't be handled. In case a timeout happens, the
> driver can return early in order to handle the error.
>
> v1: Move wq_reserve_space to ring_reserve_space
> v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
> v3: The work queue head pointer is cached by driver now. So we can
>      quickly return if space is available.
>      s/reserve/check/g (Dave Gordon)
> v4: Update cached wq head after ring doorbell; check wq space before
>      ring doorbell in case unexpected error happens; call wq space
>      check only when GuC submission is enabled. (Dave Gordon)
>
> Signed-off-by: Alex Dai <yu.dai@intel.com>

LGTM.
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>

> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index ef20071..7554d16 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -247,6 +247,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>   			db_exc.cookie = 1;
>   	}
>
> +	/* Finally, update the cached copy of the GuC's WQ head */
> +	gc->wq_head = desc->head;
> +
>   	kunmap_atomic(base);
>   	return ret;
>   }
> @@ -472,28 +475,30 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
>   			     sizeof(desc) * client->ctx_index);
>   }
>
> -/* Get valid workqueue item and return it back to offset */
> -static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> +int i915_guc_wq_check_space(struct i915_guc_client *gc)
>   {
>   	struct guc_process_desc *desc;
>   	void *base;
>   	u32 size = sizeof(struct guc_wq_item);
>   	int ret = -ETIMEDOUT, timeout_counter = 200;
>
> +	if (!gc)
> +		return 0;
> +
> +	/* Quickly return if wq space is available since the last time we
> +	 * cached the head position. */
> +	if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
> +		return 0;
> +
>   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>   	desc = base + gc->proc_desc_offset;
>
>   	while (timeout_counter-- > 0) {
> -		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
> -			*offset = gc->wq_tail;
> +		gc->wq_head = desc->head;
>
> -			/* advance the tail for next workqueue item */
> -			gc->wq_tail += size;
> -			gc->wq_tail &= gc->wq_size - 1;
> -
> -			/* this will break the loop */
> -			timeout_counter = 0;
> +		if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
>   			ret = 0;
> +			break;
>   		}
>
>   		if (timeout_counter)
> @@ -511,12 +516,16 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
>   	enum intel_ring_id ring_id = rq->ring->id;
>   	struct guc_wq_item *wqi;
>   	void *base;
> -	u32 tail, wq_len, wq_off = 0;
> -	int ret;
> +	u32 tail, wq_len, wq_off, space;
> +
> +	space = CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size);
> +	if (WARN_ON(space < sizeof(struct guc_wq_item)))
> +		return -ENOSPC; /* shouldn't happen */
>
> -	ret = guc_get_workqueue_space(gc, &wq_off);
> -	if (ret)
> -		return ret;
> +	/* postincrement WQ tail for next time */
> +	wq_off = gc->wq_tail;
> +	gc->wq_tail += sizeof(struct guc_wq_item);
> +	gc->wq_tail &= gc->wq_size - 1;
>
>   	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
>   	 * should not have the case where structure wqi is across page, neither
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 0e048bf..5cf555d 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -43,6 +43,7 @@ struct i915_guc_client {
>   	uint32_t wq_offset;
>   	uint32_t wq_size;
>   	uint32_t wq_tail;
> +	uint32_t wq_head;
>
>   	/* GuC submission statistics & status */
>   	uint64_t submissions[I915_NUM_RINGS];
> @@ -123,5 +124,6 @@ int i915_guc_submit(struct i915_guc_client *client,
>   		    struct drm_i915_gem_request *rq);
>   void i915_guc_submission_disable(struct drm_device *dev);
>   void i915_guc_submission_fini(struct drm_device *dev);
> +int i915_guc_wq_check_space(struct i915_guc_client *client);
>
>   #endif
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 272f36f..cd232d2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -667,6 +667,19 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
>   			return ret;
>   	}
>
> +	if (i915.enable_guc_submission) {
> +		/*
> +		 * Check that the GuC has space for the request before
> +		 * going any further, as the i915_add_request() call
> +		 * later on mustn't fail ...
> +		 */
> +		struct intel_guc *guc = &request->i915->guc;
> +
> +		ret = i915_guc_wq_check_space(guc->execbuf_client);
> +		if (ret)
> +			return ret;
> +	}
> +
>   	return 0;
>   }
>
>


* Re: [PATCH v4] drm/i915/guc: Move GuC wq_check_space to alloc_request_extras
From: Daniel Vetter @ 2016-01-05 10:07 UTC
  To: Dave Gordon; +Cc: intel-gfx

On Wed, Dec 23, 2015 at 03:39:04PM +0000, Dave Gordon wrote:
> On 16/12/15 19:45, yu.dai@intel.com wrote:
> >From: Alex Dai <yu.dai@intel.com>
> >
> >Split the GuC work queue space check from submission and move it to
> >ring_alloc_request_extras. The reason is that a failure in the later
> >i915_add_request() won't be handled. In case a timeout happens, the
> >driver can return early in order to handle the error.
> >
> >v1: Move wq_reserve_space to ring_reserve_space
> >v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson)
> >v3: The work queue head pointer is cached by driver now. So we can
> >     quickly return if space is available.
> >     s/reserve/check/g (Dave Gordon)
> >v4: Update cached wq head after ring doorbell; check wq space before
> >     ring doorbell in case unexpected error happens; call wq space
> >     check only when GuC submission is enabled. (Dave Gordon)
> >
> >Signed-off-by: Alex Dai <yu.dai@intel.com>
> 
> LGTM.
> Reviewed-by: Dave Gordon <david.s.gordon@intel.com>

Queued for -next, thanks for the patch.
-Daniel

> 
> >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >index ef20071..7554d16 100644
> >--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >@@ -247,6 +247,9 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
> >  			db_exc.cookie = 1;
> >  	}
> >
> >+	/* Finally, update the cached copy of the GuC's WQ head */
> >+	gc->wq_head = desc->head;
> >+
> >  	kunmap_atomic(base);
> >  	return ret;
> >  }
> >@@ -472,28 +475,30 @@ static void guc_fini_ctx_desc(struct intel_guc *guc,
> >  			     sizeof(desc) * client->ctx_index);
> >  }
> >
> >-/* Get valid workqueue item and return it back to offset */
> >-static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> >+int i915_guc_wq_check_space(struct i915_guc_client *gc)
> >  {
> >  	struct guc_process_desc *desc;
> >  	void *base;
> >  	u32 size = sizeof(struct guc_wq_item);
> >  	int ret = -ETIMEDOUT, timeout_counter = 200;
> >
> >+	if (!gc)
> >+		return 0;
> >+
> >+	/* Quickly return if wq space is available since the last time we
> >+	 * cached the head position. */
> >+	if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size)
> >+		return 0;
> >+
> >  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> >  	desc = base + gc->proc_desc_offset;
> >
> >  	while (timeout_counter-- > 0) {
> >-		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
> >-			*offset = gc->wq_tail;
> >+		gc->wq_head = desc->head;
> >
> >-			/* advance the tail for next workqueue item */
> >-			gc->wq_tail += size;
> >-			gc->wq_tail &= gc->wq_size - 1;
> >-
> >-			/* this will break the loop */
> >-			timeout_counter = 0;
> >+		if (CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size) >= size) {
> >  			ret = 0;
> >+			break;
> >  		}
> >
> >  		if (timeout_counter)
> >@@ -511,12 +516,16 @@ static int guc_add_workqueue_item(struct i915_guc_client *gc,
> >  	enum intel_ring_id ring_id = rq->ring->id;
> >  	struct guc_wq_item *wqi;
> >  	void *base;
> >-	u32 tail, wq_len, wq_off = 0;
> >-	int ret;
> >+	u32 tail, wq_len, wq_off, space;
> >+
> >+	space = CIRC_SPACE(gc->wq_tail, gc->wq_head, gc->wq_size);
> >+	if (WARN_ON(space < sizeof(struct guc_wq_item)))
> >+		return -ENOSPC; /* shouldn't happen */
> >
> >-	ret = guc_get_workqueue_space(gc, &wq_off);
> >-	if (ret)
> >-		return ret;
> >+	/* postincrement WQ tail for next time */
> >+	wq_off = gc->wq_tail;
> >+	gc->wq_tail += sizeof(struct guc_wq_item);
> >+	gc->wq_tail &= gc->wq_size - 1;
> >
> >  	/* For now workqueue item is 4 DWs; workqueue buffer is 2 pages. So we
> >  	 * should not have the case where structure wqi is across page, neither
> >diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> >index 0e048bf..5cf555d 100644
> >--- a/drivers/gpu/drm/i915/intel_guc.h
> >+++ b/drivers/gpu/drm/i915/intel_guc.h
> >@@ -43,6 +43,7 @@ struct i915_guc_client {
> >  	uint32_t wq_offset;
> >  	uint32_t wq_size;
> >  	uint32_t wq_tail;
> >+	uint32_t wq_head;
> >
> >  	/* GuC submission statistics & status */
> >  	uint64_t submissions[I915_NUM_RINGS];
> >@@ -123,5 +124,6 @@ int i915_guc_submit(struct i915_guc_client *client,
> >  		    struct drm_i915_gem_request *rq);
> >  void i915_guc_submission_disable(struct drm_device *dev);
> >  void i915_guc_submission_fini(struct drm_device *dev);
> >+int i915_guc_wq_check_space(struct i915_guc_client *client);
> >
> >  #endif
> >diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >index 272f36f..cd232d2 100644
> >--- a/drivers/gpu/drm/i915/intel_lrc.c
> >+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >@@ -667,6 +667,19 @@ int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request
> >  			return ret;
> >  	}
> >
> >+	if (i915.enable_guc_submission) {
> >+		/*
> >+		 * Check that the GuC has space for the request before
> >+		 * going any further, as the i915_add_request() call
> >+		 * later on mustn't fail ...
> >+		 */
> >+		struct intel_guc *guc = &request->i915->guc;
> >+
> >+		ret = i915_guc_wq_check_space(guc->execbuf_client);
> >+		if (ret)
> >+			return ret;
> >+	}
> >+
> >  	return 0;
> >  }
> >
> >
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
