* [PATCH] drm/i915/guc: Keep irq enabled during GuC cmd submission
@ 2015-11-20  0:15 yu.dai
  2015-11-20  8:45 ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: yu.dai @ 2015-11-20  0:15 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

When the GuC Work Queue is full, the driver waits for the GuC to make
space available by calling wait_for_atomic. If irqs are disabled, a
lockup happens because jiffies won't be updated.

The issue was found in igt/gem_close_race.

Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0a6b007..bbfa6ed 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -597,14 +597,13 @@ int i915_guc_submit(struct i915_guc_client *client,
 {
 	struct intel_guc *guc = client->guc;
 	enum intel_ring_id ring_id = rq->ring->id;
-	unsigned long flags;
 	int q_ret, b_ret;
 
 	/* Need this because of the deferred pin ctx and ring */
 	/* Shall we move this right after ring is pinned? */
 	lr_context_update(rq);
 
-	spin_lock_irqsave(&client->wq_lock, flags);
+	spin_lock(&client->wq_lock);
 
 	q_ret = guc_add_workqueue_item(client, rq);
 	if (q_ret == 0)
@@ -620,7 +619,7 @@ int i915_guc_submit(struct i915_guc_client *client,
 	} else {
 		client->retcode = 0;
 	}
-	spin_unlock_irqrestore(&client->wq_lock, flags);
+	spin_unlock(&client->wq_lock);
 
 	spin_lock(&guc->host2guc_lock);
 	guc->submissions[ring_id] += 1;
-- 
2.5.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH] drm/i915/guc: Keep irq enabled during GuC cmd submission
  2015-11-20  0:15 [PATCH] drm/i915/guc: Keep irq enabled during GuC cmd submission yu.dai
@ 2015-11-20  8:45 ` Daniel Vetter
  2015-11-23 23:02   ` [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock yu.dai
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2015-11-20  8:45 UTC (permalink / raw)
  To: yu.dai; +Cc: intel-gfx

On Thu, Nov 19, 2015 at 04:15:22PM -0800, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> When the GuC Work Queue is full, the driver waits for the GuC to make
> space available by calling wait_for_atomic. If irqs are disabled, a
> lockup happens because jiffies won't be updated.
> 
> The issue was found in igt/gem_close_race.
> 
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 0a6b007..bbfa6ed 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -597,14 +597,13 @@ int i915_guc_submit(struct i915_guc_client *client,
>  {
>  	struct intel_guc *guc = client->guc;
>  	enum intel_ring_id ring_id = rq->ring->id;
> -	unsigned long flags;
>  	int q_ret, b_ret;
>  
>  	/* Need this because of the deferred pin ctx and ring */
>  	/* Shall we move this right after ring is pinned? */
>  	lr_context_update(rq);
>  
> -	spin_lock_irqsave(&client->wq_lock, flags);
> +	spin_lock(&client->wq_lock);

Oh dear, you're busy-spinning on the hw under a spinlock. That's horribly
broken design, please fix this asap. _irqsave is the least of your
concerns here.
-Daniel

>  
>  	q_ret = guc_add_workqueue_item(client, rq);
>  	if (q_ret == 0)
> @@ -620,7 +619,7 @@ int i915_guc_submit(struct i915_guc_client *client,
>  	} else {
>  		client->retcode = 0;
>  	}
> -	spin_unlock_irqrestore(&client->wq_lock, flags);
> +	spin_unlock(&client->wq_lock);
>  
>  	spin_lock(&guc->host2guc_lock);
>  	guc->submissions[ring_id] += 1;
> -- 
> 2.5.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-20  8:45 ` Daniel Vetter
@ 2015-11-23 23:02   ` yu.dai
  2015-11-24 13:04     ` Daniel Vetter
                       ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: yu.dai @ 2015-11-23 23:02 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

When the GuC Work Queue is full, the driver waits for the GuC to make
space available by delaying 1ms between polls. The wait needs to be
outside the spin_lock_irq / unlock section. Otherwise, a lockup happens
because jiffies won't be updated due to irqs being disabled.

The issue was found in igt/gem_close_race.

Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0a6b007..1418397 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
 	union guc_doorbell_qw *db;
 	void *base;
 	int attempt = 2, ret = -EAGAIN;
+	unsigned long flags;
 
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
+	spin_lock_irqsave(&gc->wq_lock, flags);
+
 	/* Update the tail so it is visible to GuC */
 	desc->tail = gc->wq_tail;
 
@@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
 			db_exc.cookie = 1;
 	}
 
+	spin_unlock_irqrestore(&gc->wq_lock, flags);
+
 	kunmap_atomic(base);
+
 	return ret;
 }
 
@@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
-	int ret = 0, timeout_counter = 200;
+	int ret = -ETIMEDOUT, timeout_counter = 200;
+	unsigned long flags;
 
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
-		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-				gc->wq_size) >= size, 1);
+		spin_lock_irqsave(&gc->wq_lock, flags);
 
-		if (!ret) {
+		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
 			*offset = gc->wq_tail;
 
 			/* advance the tail for next workqueue item */
@@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 
 			/* this will break the loop */
 			timeout_counter = 0;
+			ret = 0;
 		}
+
+		spin_unlock_irqrestore(&gc->wq_lock, flags);
+
+		if (timeout_counter)
+			usleep_range(1000, 2000);
 	};
 
 	kunmap_atomic(base);
@@ -597,19 +609,17 @@ int i915_guc_submit(struct i915_guc_client *client,
 {
 	struct intel_guc *guc = client->guc;
 	enum intel_ring_id ring_id = rq->ring->id;
-	unsigned long flags;
 	int q_ret, b_ret;
 
 	/* Need this because of the deferred pin ctx and ring */
 	/* Shall we move this right after ring is pinned? */
 	lr_context_update(rq);
 
-	spin_lock_irqsave(&client->wq_lock, flags);
-
 	q_ret = guc_add_workqueue_item(client, rq);
 	if (q_ret == 0)
 		b_ret = guc_ring_doorbell(client);
 
+	spin_lock(&guc->host2guc_lock);
 	client->submissions[ring_id] += 1;
 	if (q_ret) {
 		client->q_fail += 1;
@@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client *client,
 	} else {
 		client->retcode = 0;
 	}
-	spin_unlock_irqrestore(&client->wq_lock, flags);
-
-	spin_lock(&guc->host2guc_lock);
 	guc->submissions[ring_id] += 1;
 	guc->last_seqno[ring_id] = rq->seqno;
 	spin_unlock(&guc->host2guc_lock);
-- 
2.5.0



* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-23 23:02   ` [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock yu.dai
@ 2015-11-24 13:04     ` Daniel Vetter
  2015-11-24 13:26       ` Imre Deak
  2015-11-25 19:29     ` [PATCH v2] drm/i915/guc: Clean up locks in GuC yu.dai
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2015-11-24 13:04 UTC (permalink / raw)
  To: yu.dai; +Cc: intel-gfx

On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> When the GuC Work Queue is full, the driver waits for the GuC to make
> space available by delaying 1ms between polls. The wait needs to be
> outside the spin_lock_irq / unlock section. Otherwise, a lockup happens
> because jiffies won't be updated due to irqs being disabled.
> 
> The issue was found in igt/gem_close_race.
> 
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++----------
>  1 file changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 0a6b007..1418397 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>  	union guc_doorbell_qw *db;
>  	void *base;
>  	int attempt = 2, ret = -EAGAIN;
> +	unsigned long flags;
>  
>  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));

We don't need kmap_atomic anymore here now, since it's outside of the
spinlock.

>  	desc = base + gc->proc_desc_offset;
>  
> +	spin_lock_irqsave(&gc->wq_lock, flags);

Please don't use the super-generic _irqsave. It's expensive and results in
fragile code when someone accidentally reuses something in an interrupt
handler that was never meant to run in that context.

Instead please use the most specific function:
- spin_lock if you know you are in irq context.
- spin_lock_irq if you know you are not.
- spin_lock_irqsave should be a big warning sign that your code has
  layering issues.

Please audit the entire guc code for the above two issues.

> +
>  	/* Update the tail so it is visible to GuC */
>  	desc->tail = gc->wq_tail;
>  
> @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>  			db_exc.cookie = 1;
>  	}
>  
> +	spin_unlock_irqrestore(&gc->wq_lock, flags);
> +
>  	kunmap_atomic(base);
> +
>  	return ret;
>  }
>  
> @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>  	struct guc_process_desc *desc;
>  	void *base;
>  	u32 size = sizeof(struct guc_wq_item);
> -	int ret = 0, timeout_counter = 200;
> +	int ret = -ETIMEDOUT, timeout_counter = 200;
> +	unsigned long flags;
>  
>  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>  	desc = base + gc->proc_desc_offset;
>  
>  	while (timeout_counter-- > 0) {
> -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> -				gc->wq_size) >= size, 1);
> +		spin_lock_irqsave(&gc->wq_lock, flags);
>  
> -		if (!ret) {
> +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
>  			*offset = gc->wq_tail;
>  
>  			/* advance the tail for next workqueue item */
> @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>  
>  			/* this will break the loop */
>  			timeout_counter = 0;
> +			ret = 0;
>  		}
> +
> +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> +
> +		if (timeout_counter)
> +			usleep_range(1000, 2000);

Do we really not have an interrupt/signal from the guc when it has cleared
up some space?

>  	};
>  
>  	kunmap_atomic(base);
> @@ -597,19 +609,17 @@ int i915_guc_submit(struct i915_guc_client *client,
>  {
>  	struct intel_guc *guc = client->guc;
>  	enum intel_ring_id ring_id = rq->ring->id;
> -	unsigned long flags;
>  	int q_ret, b_ret;
>  
>  	/* Need this because of the deferred pin ctx and ring */
>  	/* Shall we move this right after ring is pinned? */
>  	lr_context_update(rq);
>  
> -	spin_lock_irqsave(&client->wq_lock, flags);
> -
>  	q_ret = guc_add_workqueue_item(client, rq);
>  	if (q_ret == 0)
>  		b_ret = guc_ring_doorbell(client);
>  
> +	spin_lock(&guc->host2guc_lock);

So at first I thought there's a race now, but then I looked at what
host2guc and wq_lock protect. It seems like the only thing they do is
protect against debugfs, all the real protection against inconsistent
state is done through dev->struct_mutex.

Can't we just rip out all this spinlock business from the guc code?
It would be easier than fixing up the races in here.
-Daniel

>  	client->submissions[ring_id] += 1;
>  	if (q_ret) {
>  		client->q_fail += 1;
> @@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client *client,
>  	} else {
>  		client->retcode = 0;
>  	}
> -	spin_unlock_irqrestore(&client->wq_lock, flags);
> -
> -	spin_lock(&guc->host2guc_lock);
>  	guc->submissions[ring_id] += 1;
>  	guc->last_seqno[ring_id] = rq->seqno;
>  	spin_unlock(&guc->host2guc_lock);
> -- 
> 2.5.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 13:04     ` Daniel Vetter
@ 2015-11-24 13:26       ` Imre Deak
  2015-11-24 17:00         ` Yu Dai
  0 siblings, 1 reply; 24+ messages in thread
From: Imre Deak @ 2015-11-24 13:26 UTC (permalink / raw)
  To: Daniel Vetter, yu.dai; +Cc: intel-gfx

On Tue, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> > From: Alex Dai <yu.dai@intel.com>
> > 
> > When the GuC Work Queue is full, the driver waits for the GuC to make
> > space available by delaying 1ms between polls. The wait needs to be
> > outside the spin_lock_irq / unlock section. Otherwise, a lockup happens
> > because jiffies won't be updated due to irqs being disabled.
> > 
> > The issue was found in igt/gem_close_race.
> > 
> > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++-
> > ---------
> >  1 file changed, 17 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index 0a6b007..1418397 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > i915_guc_client *gc)
> >  	union guc_doorbell_qw *db;
> >  	void *base;
> >  	int attempt = 2, ret = -EAGAIN;
> > +	unsigned long flags;
> >  
> >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > >client_obj, 0));
> 
> We don't need kmap_atomic anymore here now, since it's outside of the
> spinlock.
> 
> >  	desc = base + gc->proc_desc_offset;
> >  
> > +	spin_lock_irqsave(&gc->wq_lock, flags);
> 
> Please don't use the super-generic _irqsave. It's expensive and
> results in
> fragile code when someone accidentally reuses something in an
> interrupt
> handler that was never meant to run in that context.
> 
> Instead please use the most specific function:
> - spin_lock if you know you are in irq context.
> - spin_lock_irq if you know you are not.

Right, and simply spin_lock() if the lock is not taken in IRQ context
ever.

> - spin_lock_irqsave should be a big warning sign that your code has
>   layering issues.
> 
> Please audit the entire guc code for the above two issues.

Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither of
them are called from IRQ context AFAICS, in which case a simple
spin_lock() would do.

--Imre

> > +
> >  	/* Update the tail so it is visible to GuC */
> >  	desc->tail = gc->wq_tail;
> >  
> > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct
> > i915_guc_client *gc)
> >  			db_exc.cookie = 1;
> >  	}
> >  
> > +	spin_unlock_irqrestore(&gc->wq_lock, flags);
> > +
> >  	kunmap_atomic(base);
> > +
> >  	return ret;
> >  }
> >  
> > @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct
> > i915_guc_client *gc, u32 *offset)
> >  	struct guc_process_desc *desc;
> >  	void *base;
> >  	u32 size = sizeof(struct guc_wq_item);
> > -	int ret = 0, timeout_counter = 200;
> > +	int ret = -ETIMEDOUT, timeout_counter = 200;
> > +	unsigned long flags;
> >  
> >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > >client_obj, 0));
> >  	desc = base + gc->proc_desc_offset;
> >  
> >  	while (timeout_counter-- > 0) {
> > -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail,
> > desc->head,
> > -				gc->wq_size) >= size, 1);
> > +		spin_lock_irqsave(&gc->wq_lock, flags);
> >  
> > -		if (!ret) {
> > +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc-
> > >wq_size) >= size) {
> >  			*offset = gc->wq_tail;
> >  
> >  			/* advance the tail for next workqueue
> > item */
> > @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct
> > i915_guc_client *gc, u32 *offset)
> >  
> >  			/* this will break the loop */
> >  			timeout_counter = 0;
> > +			ret = 0;
> >  		}
> > +
> > +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> > +
> > +		if (timeout_counter)
> > +			usleep_range(1000, 2000);
> 
> Do we really not have an interrupt/signal from the guc when it has
> cleared up some space?
> 
> >  	};
> >  
> >  	kunmap_atomic(base);
> > @@ -597,19 +609,17 @@ int i915_guc_submit(struct i915_guc_client
> > *client,
> >  {
> >  	struct intel_guc *guc = client->guc;
> >  	enum intel_ring_id ring_id = rq->ring->id;
> > -	unsigned long flags;
> >  	int q_ret, b_ret;
> >  
> >  	/* Need this because of the deferred pin ctx and ring */
> >  	/* Shall we move this right after ring is pinned? */
> >  	lr_context_update(rq);
> >  
> > -	spin_lock_irqsave(&client->wq_lock, flags);
> > -
> >  	q_ret = guc_add_workqueue_item(client, rq);
> >  	if (q_ret == 0)
> >  		b_ret = guc_ring_doorbell(client);
> >  
> > +	spin_lock(&guc->host2guc_lock);
> 
> So at first I thought there's a race now, but then I looked at what
> host2guc and wq_lock protect. It seems like the only thing they do is
> protect against debugfs, all the real protection against inconsistent
> state is done through dev->struct_mutex.
> 
> Can't we just rip out all this spinlock business from the guc code?
> It would be easier than fixing up the races in here.



> -Daniel
> 
> >  	client->submissions[ring_id] += 1;
> >  	if (q_ret) {
> >  		client->q_fail += 1;
> > @@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client
> > *client,
> >  	} else {
> >  		client->retcode = 0;
> >  	}
> > -	spin_unlock_irqrestore(&client->wq_lock, flags);
> > -
> > -	spin_lock(&guc->host2guc_lock);
> >  	guc->submissions[ring_id] += 1;
> >  	guc->last_seqno[ring_id] = rq->seqno;
> >  	spin_unlock(&guc->host2guc_lock);
> > -- 
> > 2.5.0
> > 
> 


* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 13:26       ` Imre Deak
@ 2015-11-24 17:00         ` Yu Dai
  2015-11-24 17:05           ` Imre Deak
  0 siblings, 1 reply; 24+ messages in thread
From: Yu Dai @ 2015-11-24 17:00 UTC (permalink / raw)
  To: imre.deak, Daniel Vetter; +Cc: intel-gfx



On 11/24/2015 05:26 AM, Imre Deak wrote:
> On Tue, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> > > From: Alex Dai <yu.dai@intel.com>
> > >
> > > When the GuC Work Queue is full, the driver waits for the GuC to make
> > > space available by delaying 1ms between polls. The wait needs to be
> > > outside the spin_lock_irq / unlock section. Otherwise, a lockup
> > > happens because jiffies won't be updated due to irqs being disabled.
> > >
> > > The issue was found in igt/gem_close_race.
> > >
> > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27 +++++++++++++++++-
> > > ---------
> > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > index 0a6b007..1418397 100644
> > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > > i915_guc_client *gc)
> > >  	union guc_doorbell_qw *db;
> > >  	void *base;
> > >  	int attempt = 2, ret = -EAGAIN;
> > > +	unsigned long flags;
> > >
> > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > >client_obj, 0));
> >
> > We don't need kmap_atomic anymore here now, since it's outside of the
> > spinlock.
> >
> > >  	desc = base + gc->proc_desc_offset;
> > >
> > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> >
> > Please don't use the super-generic _irqsave. It's expensive and
> > results in
> > fragile code when someone accidentally reuses something in an
> > interrupt
> > handler that was never meant to run in that context.
> >
> > Instead please use the most specific function:
> > - spin_lock if you know you are in irq context.
> > - spin_lock_irq if you know you are not.
>
> Right, and simply spin_lock() if the lock is not taken in IRQ context
> ever.

This is not in IRQ context. So I will use spin_lock_irq instead.
> > - spin_lock_irqsave should be a big warning sign that your code has
> >   layering issues.
> >
> > Please audit the entire guc code for the above two issues.
>
> Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
> debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither of
> them are called from IRQ context AFAICS, in which case a simple
> spin_lock() would do.
>
> --Imre
>
> > > +
> > >  	/* Update the tail so it is visible to GuC */
> > >  	desc->tail = gc->wq_tail;
> > >
> > > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct
> > > i915_guc_client *gc)
> > >  			db_exc.cookie = 1;
> > >  	}
> > >
> > > +	spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > +
> > >  	kunmap_atomic(base);
> > > +
> > >  	return ret;
> > >  }
> > >
> > > @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct
> > > i915_guc_client *gc, u32 *offset)
> > >  	struct guc_process_desc *desc;
> > >  	void *base;
> > >  	u32 size = sizeof(struct guc_wq_item);
> > > -	int ret = 0, timeout_counter = 200;
> > > +	int ret = -ETIMEDOUT, timeout_counter = 200;
> > > +	unsigned long flags;
> > >
> > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > >client_obj, 0));
> > >  	desc = base + gc->proc_desc_offset;
> > >
> > >  	while (timeout_counter-- > 0) {
> > > -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail,
> > > desc->head,
> > > -				gc->wq_size) >= size, 1);
> > > +		spin_lock_irqsave(&gc->wq_lock, flags);
> > >
> > > -		if (!ret) {
> > > +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc-
> > > >wq_size) >= size) {
> > >  			*offset = gc->wq_tail;
> > >
> > >  			/* advance the tail for next workqueue
> > > item */
> > > @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct
> > > i915_guc_client *gc, u32 *offset)
> > >
> > >  			/* this will break the loop */
> > >  			timeout_counter = 0;
> > > +			ret = 0;
> > >  		}
> > > +
> > > +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > +
> > > +		if (timeout_counter)
> > > +			usleep_range(1000, 2000);
> >
> > Do we really not have an interrupt/signal from the guc when it has
> > cleared up some space?
> >

This is not implemented in the fw, although I think it could be done through
the guc-to-host interrupt. My worry is that if we implement this, the driver
will end up handling too many interrupts (maybe as many as there are context
switches). Ideally, though, we don't want to handle interrupts at all.
> > >  	};
> > >
> > >  	kunmap_atomic(base);
> > > @@ -597,19 +609,17 @@ int i915_guc_submit(struct i915_guc_client
> > > *client,
> > >  {
> > >  	struct intel_guc *guc = client->guc;
> > >  	enum intel_ring_id ring_id = rq->ring->id;
> > > -	unsigned long flags;
> > >  	int q_ret, b_ret;
> > >
> > >  	/* Need this because of the deferred pin ctx and ring */
> > >  	/* Shall we move this right after ring is pinned? */
> > >  	lr_context_update(rq);
> > >
> > > -	spin_lock_irqsave(&client->wq_lock, flags);
> > > -
> > >  	q_ret = guc_add_workqueue_item(client, rq);
> > >  	if (q_ret == 0)
> > >  		b_ret = guc_ring_doorbell(client);
> > >
> > > +	spin_lock(&guc->host2guc_lock);
> >
> > So at first I thought there's a race now, but then I looked at what
> > host2guc and wq_lock protect. It seems like the only thing they do is
> > protect against debugfs, all the real protection against inconsistent
> > state is done through dev->struct_mutex.
> >
> > Can't we just rip out all this spinlock business from the guc code?
> > It would be easier than fixing up the races in here.
>
>

Yes, the host2guc lock can be replaced by dev->struct_mutex. But we
definitely don't want the process to be interrupted while the driver
programs the guc work queue and rings the doorbell.
> > -Daniel
> >
> > >  	client->submissions[ring_id] += 1;
> > >  	if (q_ret) {
> > >  		client->q_fail += 1;
> > > @@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client
> > > *client,
> > >  	} else {
> > >  		client->retcode = 0;
> > >  	}
> > > -	spin_unlock_irqrestore(&client->wq_lock, flags);
> > > -
> > > -	spin_lock(&guc->host2guc_lock);
> > >  	guc->submissions[ring_id] += 1;
> > >  	guc->last_seqno[ring_id] = rq->seqno;
> > >  	spin_unlock(&guc->host2guc_lock);
> > > --
> > > 2.5.0
> > >
> >



* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 17:00         ` Yu Dai
@ 2015-11-24 17:05           ` Imre Deak
  2015-11-24 18:08             ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Imre Deak @ 2015-11-24 17:05 UTC (permalink / raw)
  To: Yu Dai, Daniel Vetter; +Cc: intel-gfx

On Tue, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> 
> On 11/24/2015 05:26 AM, Imre Deak wrote:
> > On Tue, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> > > > From: Alex Dai <yu.dai@intel.com>
> > > > 
> > > > When the GuC Work Queue is full, the driver waits for the GuC to
> > > > make space available by delaying 1ms between polls. The wait needs
> > > > to be outside the spin_lock_irq / unlock section. Otherwise, a
> > > > lockup happens because jiffies won't be updated due to irqs being
> > > > disabled.
> > > > 
> > > > The issue was found in igt/gem_close_race.
> > > > 
> > > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> > > > +++++++++++++++++-
> > > > ---------
> > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > index 0a6b007..1418397 100644
> > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > > > i915_guc_client *gc)
> > > >  	union guc_doorbell_qw *db;
> > > >  	void *base;
> > > >  	int attempt = 2, ret = -EAGAIN;
> > > > +	unsigned long flags;
> > > > 
> > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > > > client_obj, 0));
> > > 
> > > We don't need kmap_atomic anymore here now, since it's outside of
> > > the
> > > spinlock.
> > > 
> > > >  	desc = base + gc->proc_desc_offset;
> > > > 
> > > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> > > 
> > > Please don't use the super-generic _irqsave. It's expensive and
> > > results in
> > > fragile code when someone accidentally reuses something in an
> > > interrupt
> > > handler that was never meant to run in that context.
> > > 
> > > Instead please use the most specific function:
> > > - spin_lock if you know you are in irq context.
> > > - spin_lock_irq if you know you are not.
> > 
> > Right, and simply spin_lock() if the lock is not taken in IRQ
> > context
> > ever.
> 
> This is not in IRQ context. So I will use spin_lock_irq instead.

You can just use spin_lock(). spin_lock_irq() only makes sense if you
take the lock in IRQ context too, which is not the case.

> > > - spin_lock_irqsave should be a big warning sign that your code
> > > has
> > >   layering issues.
> > > 
> > > Please audit the entire guc code for the above two issues.
> > 
> > Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
> > debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither
> > of
> > them are called from IRQ context AFAICS, in which case a simple
> > spin_lock() would do.
> > 
> > --Imre
> > 
> > > > +
> > > >  	/* Update the tail so it is visible to GuC */
> > > >  	desc->tail = gc->wq_tail;
> > > > 
> > > > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct
> > > > i915_guc_client *gc)
> > > >  			db_exc.cookie = 1;
> > > >  	}
> > > > 
> > > > +	spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > +
> > > >  	kunmap_atomic(base);
> > > > +
> > > >  	return ret;
> > > >  }
> > > > 
> > > > @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct
> > > > i915_guc_client *gc, u32 *offset)
> > > >  	struct guc_process_desc *desc;
> > > >  	void *base;
> > > >  	u32 size = sizeof(struct guc_wq_item);
> > > > -	int ret = 0, timeout_counter = 200;
> > > > +	int ret = -ETIMEDOUT, timeout_counter = 200;
> > > > +	unsigned long flags;
> > > > 
> > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > > > client_obj, 0));
> > > >  	desc = base + gc->proc_desc_offset;
> > > > 
> > > >  	while (timeout_counter-- > 0) {
> > > > -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail,
> > > > desc->head,
> > > > -				gc->wq_size) >= size, 1);
> > > > +		spin_lock_irqsave(&gc->wq_lock, flags);
> > > > 
> > > > -		if (!ret) {
> > > > +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc-
> > > > > wq_size) >= size) {
> > > >  			*offset = gc->wq_tail;
> > > > 
> > > >  			/* advance the tail for next workqueue
> > > > item */
> > > > @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct
> > > > i915_guc_client *gc, u32 *offset)
> > > > 
> > > >  			/* this will break the loop */
> > > >  			timeout_counter = 0;
> > > > +			ret = 0;
> > > >  		}
> > > > +
> > > > +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > +
> > > > +		if (timeout_counter)
> > > > +			usleep_range(1000, 2000);
> > > 
> > > Do we really not have an interrupt/signal from the guc when it has
> > > cleared up some space?
> > > 
> 
> This is not implemented in the fw, although I think it could be done
> through the guc-to-host interrupt. My worry is that if we implement this,
> the driver will end up handling too many interrupts (maybe as many as
> there are context switches). Ideally, though, we don't want to handle
> interrupts at all.
> > > >  	};
> > > > 
> > > >  	kunmap_atomic(base);
> > > > @@ -597,19 +609,17 @@ int i915_guc_submit(struct
> > > > i915_guc_client
> > > > *client,
> > > >  {
> > > >  	struct intel_guc *guc = client->guc;
> > > >  	enum intel_ring_id ring_id = rq->ring->id;
> > > > -	unsigned long flags;
> > > >  	int q_ret, b_ret;
> > > > 
> > > >  	/* Need this because of the deferred pin ctx and ring
> > > > */
> > > >  	/* Shall we move this right after ring is pinned? */
> > > >  	lr_context_update(rq);
> > > > 
> > > > -	spin_lock_irqsave(&client->wq_lock, flags);
> > > > -
> > > >  	q_ret = guc_add_workqueue_item(client, rq);
> > > >  	if (q_ret == 0)
> > > >  		b_ret = guc_ring_doorbell(client);
> > > > 
> > > > +	spin_lock(&guc->host2guc_lock);
> > > 
> > > So at first I thought there's a race now, but then I looked at
> > > what
> > > host2guc and wq_lock protect. It seems like the only thing they
> > > do is
> > > protect against debugfs, all the real protection against
> > > inconsistent
> > > state is done through dev->struct_mutex.
> > > 
> > > Can't we just rip out all this spinlock business from the guc
> > > code?
> > > It would be easier than fixing up the races in here.
> > 
> > 
> 
> Yes, the host2guc lock can be replaced by dev->struct_mutex. But we
> definitely don't want the process interrupted while the driver is
> programming the GuC work queue and ringing the doorbell.
> > > -Daniel
> > > 
> > > >  	client->submissions[ring_id] += 1;
> > > >  	if (q_ret) {
> > > >  		client->q_fail += 1;
> > > > @@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client
> > > > *client,
> > > >  	} else {
> > > >  		client->retcode = 0;
> > > >  	}
> > > > -	spin_unlock_irqrestore(&client->wq_lock, flags);
> > > > -
> > > > -	spin_lock(&guc->host2guc_lock);
> > > >  	guc->submissions[ring_id] += 1;
> > > >  	guc->last_seqno[ring_id] = rq->seqno;
> > > >  	spin_unlock(&guc->host2guc_lock);
> > > > --
> > > > 2.5.0
> > > > 
> > > 
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 17:05           ` Imre Deak
@ 2015-11-24 18:08             ` Daniel Vetter
  2015-11-24 18:36               ` Yu Dai
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2015-11-24 18:08 UTC (permalink / raw)
  To: Imre Deak; +Cc: intel-gfx

On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> > 
> > On 11/24/2015 05:26 AM, Imre Deak wrote:
> > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> > > > > From: Alex Dai <yu.dai@intel.com>
> > > > > 
> > > > > When the GuC Work Queue is full, the driver waits for GuC to
> > > > > free space by delaying 1ms. The wait needs to be outside the
> > > > > spin_lock_irq / unlock pair. Otherwise, a lockup happens
> > > > > because jiffies won't be updated while irqs are disabled.
> > > > > 
> > > > > Issue is found in igt/gem_close_race.
> > > > > 
> > > > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> > > > > +++++++++++++++++-
> > > > > ---------
> > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > index 0a6b007..1418397 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > > > > i915_guc_client *gc)
> > > > >  	union guc_doorbell_qw *db;
> > > > >  	void *base;
> > > > >  	int attempt = 2, ret = -EAGAIN;
> > > > > +	unsigned long flags;
> > > > > 
> > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > > > > client_obj, 0));
> > > > 
> > > > We don't need kmap_atomic here anymore, since it's outside of
> > > > the
> > > > spinlock.
> > > > 
> > > > >  	desc = base + gc->proc_desc_offset;
> > > > > 
> > > > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> > > > 
> > > > Please don't use the super-generic _irqsave. It's expensive and
> > > > results in
> > > > fragile code when someone accidentally reuses something in an
> > > > interrupt
> > > > handler that was never meant to run in that context.
> > > > 
> > > > Instead please use the most specific function:
> > > > - spin_lock if you know you are in irq context.
> > > > - spin_lock_irq if you know you are not.
> > > 
> > > Right, and simply spin_lock() if the lock is not taken in IRQ
> > > context
> > > ever.
> > 
> > This is not in IRQ context. So I will use spin_lock_irq instead.
> 
> > You can just use spin_lock(). spin_lock_irq() only makes sense if you
> take the lock in IRQ context too, which is not the case.

Imo just drop both spinlocks; adding locks for debugfs is overkill.
-Daniel

> 
> > > > - spin_lock_irqsave should be a big warning sign that your code
> > > > has
> > > >   layering issues.
> > > > 
> > > > Please audit the entire guc code for the above two issues.
> > > 
> > > Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
> > > debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither
> > > of
> > > them are called from IRQ context AFAICS, in which case a simple
> > > spin_lock() would do.
> > > 
> > > --Imre
> > > 
> > > > > +
> > > > >  	/* Update the tail so it is visible to GuC */
> > > > >  	desc->tail = gc->wq_tail;
> > > > > 
> > > > > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct
> > > > > i915_guc_client *gc)
> > > > >  			db_exc.cookie = 1;
> > > > >  	}
> > > > > 
> > > > > +	spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > > +
> > > > >  	kunmap_atomic(base);
> > > > > +
> > > > >  	return ret;
> > > > >  }
> > > > > 
> > > > > @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct
> > > > > i915_guc_client *gc, u32 *offset)
> > > > >  	struct guc_process_desc *desc;
> > > > >  	void *base;
> > > > >  	u32 size = sizeof(struct guc_wq_item);
> > > > > -	int ret = 0, timeout_counter = 200;
> > > > > +	int ret = -ETIMEDOUT, timeout_counter = 200;
> > > > > +	unsigned long flags;
> > > > > 
> > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > > > > client_obj, 0));
> > > > >  	desc = base + gc->proc_desc_offset;
> > > > > 
> > > > >  	while (timeout_counter-- > 0) {
> > > > > -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail,
> > > > > desc->head,
> > > > > -				gc->wq_size) >= size, 1);
> > > > > +		spin_lock_irqsave(&gc->wq_lock, flags);
> > > > > 
> > > > > -		if (!ret) {
> > > > > +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc-
> > > > > > wq_size) >= size) {
> > > > >  			*offset = gc->wq_tail;
> > > > > 
> > > > >  			/* advance the tail for next workqueue
> > > > > item */
> > > > > @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct
> > > > > i915_guc_client *gc, u32 *offset)
> > > > > 
> > > > >  			/* this will break the loop */
> > > > >  			timeout_counter = 0;
> > > > > +			ret = 0;
> > > > >  		}
> > > > > +
> > > > > +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > > +
> > > > > +		if (timeout_counter)
> > > > > +			usleep_range(1000, 2000);
> > > > 
> > > > Do we really not have an interrupt/signal from the guc when it has
> > > > cleared
> > > > up some space?
> > > > 
> > 
> > This is not implemented in the firmware, although I think it could be
> > done through the GuC-to-host interrupt. I worry that if we implement
> > this, the driver will end up handling too many interrupts (maybe as
> > many as there are context switches). However, ideally we don't want
> > to handle interrupts at all.
> > > > >  	};
> > > > > 
> > > > >  	kunmap_atomic(base);
> > > > > @@ -597,19 +609,17 @@ int i915_guc_submit(struct
> > > > > i915_guc_client
> > > > > *client,
> > > > >  {
> > > > >  	struct intel_guc *guc = client->guc;
> > > > >  	enum intel_ring_id ring_id = rq->ring->id;
> > > > > -	unsigned long flags;
> > > > >  	int q_ret, b_ret;
> > > > > 
> > > > >  	/* Need this because of the deferred pin ctx and ring
> > > > > */
> > > > >  	/* Shall we move this right after ring is pinned? */
> > > > >  	lr_context_update(rq);
> > > > > 
> > > > > -	spin_lock_irqsave(&client->wq_lock, flags);
> > > > > -
> > > > >  	q_ret = guc_add_workqueue_item(client, rq);
> > > > >  	if (q_ret == 0)
> > > > >  		b_ret = guc_ring_doorbell(client);
> > > > > 
> > > > > +	spin_lock(&guc->host2guc_lock);
> > > > 
> > > > So at first I thought there's a race now, but then I looked at
> > > > what
> > > > host2guc and wq_lock protect. It seems like the only thing they
> > > > do is
> > > > protect against debugfs, all the real protection against
> > > > inconsistent
> > > > state is done through dev->struct_mutex.
> > > > 
> > > > Can't we just rip out all this spinlock business from the guc
> > > > code?
> > > > It would be easier than fixing up the races in here.
> > > 
> > > 
> > 
> > Yes, the host2guc lock can be replaced by dev->struct_mutex. But we
> > definitely don't want the process interrupted while the driver is
> > programming the GuC work queue and ringing the doorbell.
> > > > -Daniel
> > > > 
> > > > >  	client->submissions[ring_id] += 1;
> > > > >  	if (q_ret) {
> > > > >  		client->q_fail += 1;
> > > > > @@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client
> > > > > *client,
> > > > >  	} else {
> > > > >  		client->retcode = 0;
> > > > >  	}
> > > > > -	spin_unlock_irqrestore(&client->wq_lock, flags);
> > > > > -
> > > > > -	spin_lock(&guc->host2guc_lock);
> > > > >  	guc->submissions[ring_id] += 1;
> > > > >  	guc->last_seqno[ring_id] = rq->seqno;
> > > > >  	spin_unlock(&guc->host2guc_lock);
> > > > > --
> > > > > 2.5.0
> > > > > 
> > > > 
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 18:08             ` Daniel Vetter
@ 2015-11-24 18:36               ` Yu Dai
  2015-11-24 19:13                 ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Yu Dai @ 2015-11-24 18:36 UTC (permalink / raw)
  To: Daniel Vetter, Imre Deak; +Cc: intel-gfx



On 11/24/2015 10:08 AM, Daniel Vetter wrote:
> On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> > On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> > >
> > > On 11/24/2015 05:26 AM, Imre Deak wrote:
> > > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> > > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> > > > > > From: Alex Dai <yu.dai@intel.com>
> > > > > >
> > > > > > When the GuC Work Queue is full, the driver waits for GuC to
> > > > > > free space by delaying 1ms. The wait needs to be outside the
> > > > > > spin_lock_irq / unlock pair. Otherwise, a lockup happens
> > > > > > because jiffies won't be updated while irqs are disabled.
> > > > > >
> > > > > > Issue is found in igt/gem_close_race.
> > > > > >
> > > > > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> > > > > > +++++++++++++++++-
> > > > > > ---------
> > > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > > index 0a6b007..1418397 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > > > > > i915_guc_client *gc)
> > > > > >  	union guc_doorbell_qw *db;
> > > > > >  	void *base;
> > > > > >  	int attempt = 2, ret = -EAGAIN;
> > > > > > +	unsigned long flags;
> > > > > >
> > > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > > > > > client_obj, 0));
> > > > >
> > > > > We don't need kmap_atomic here anymore, since it's outside of
> > > > > the
> > > > > spinlock.
> > > > >
> > > > > >  	desc = base + gc->proc_desc_offset;
> > > > > >
> > > > > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> > > > >
> > > > > Please don't use the super-generic _irqsave. It's expensive and
> > > > > results in
> > > > > fragile code when someone accidentally reuses something in an
> > > > > interrupt
> > > > > handler that was never meant to run in that context.
> > > > >
> > > > > Instead please use the most specific function:
> > > > > - spin_lock if you know you are in irq context.
> > > > > - spin_lock_irq if you know you are not.
> > > >
> > > > Right, and simply spin_lock() if the lock is not taken in IRQ
> > > > context
> > > > ever.
> > >
> > > This is not in IRQ context. So I will use spin_lock_irq instead.
> >
> > You can just use spin_lock(). spin_lock_irq() only makes sense if you
> > take the lock in IRQ context too, which is not the case.
>
> Imo just drop both spinlocks; adding locks for debugfs is overkill.
>
How about using mutex_lock_interruptible(&dev->struct_mutex) in debugfs
instead, to replace the host2guc lock?

A spinlock around ringing the doorbell is still needed.

Alex
> >
> > > > > - spin_lock_irqsave should be a big warning sign that your code
> > > > > has
> > > > >   layering issues.
> > > > >
> > > > > Please audit the entire guc code for the above two issues.
> > > >
> > > > Agreed, it looks inconsistent atm: we do spin_lock(wq_lock) from
> > > > debugfs and spin_lock_irq(wq_lock) from i915_guc_submit(). Neither
> > > > of
> > > > them are called from IRQ context AFAICS, in which case a simple
> > > > spin_lock() would do.
> > > >
> > > > --Imre
> > > >
> > > > > > +
> > > > > >  	/* Update the tail so it is visible to GuC */
> > > > > >  	desc->tail = gc->wq_tail;
> > > > > >
> > > > > > @@ -248,7 +251,10 @@ static int guc_ring_doorbell(struct
> > > > > > i915_guc_client *gc)
> > > > > >  			db_exc.cookie = 1;
> > > > > >  	}
> > > > > >
> > > > > > +	spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > > > +
> > > > > >  	kunmap_atomic(base);
> > > > > > +
> > > > > >  	return ret;
> > > > > >  }
> > > > > >
> > > > > > @@ -487,16 +493,16 @@ static int guc_get_workqueue_space(struct
> > > > > > i915_guc_client *gc, u32 *offset)
> > > > > >  	struct guc_process_desc *desc;
> > > > > >  	void *base;
> > > > > >  	u32 size = sizeof(struct guc_wq_item);
> > > > > > -	int ret = 0, timeout_counter = 200;
> > > > > > +	int ret = -ETIMEDOUT, timeout_counter = 200;
> > > > > > +	unsigned long flags;
> > > > > >
> > > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > > > > > > client_obj, 0));
> > > > > >  	desc = base + gc->proc_desc_offset;
> > > > > >
> > > > > >  	while (timeout_counter-- > 0) {
> > > > > > -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail,
> > > > > > desc->head,
> > > > > > -				gc->wq_size) >= size, 1);
> > > > > > +		spin_lock_irqsave(&gc->wq_lock, flags);
> > > > > >
> > > > > > -		if (!ret) {
> > > > > > +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc-
> > > > > > > wq_size) >= size) {
> > > > > >  			*offset = gc->wq_tail;
> > > > > >
> > > > > >  			/* advance the tail for next workqueue
> > > > > > item */
> > > > > > @@ -505,7 +511,13 @@ static int guc_get_workqueue_space(struct
> > > > > > i915_guc_client *gc, u32 *offset)
> > > > > >
> > > > > >  			/* this will break the loop */
> > > > > >  			timeout_counter = 0;
> > > > > > +			ret = 0;
> > > > > >  		}
> > > > > > +
> > > > > > +		spin_unlock_irqrestore(&gc->wq_lock, flags);
> > > > > > +
> > > > > > +		if (timeout_counter)
> > > > > > +			usleep_range(1000, 2000);
> > > > >
> > > > > Do we really not have an interrupt/signal from the guc when it has
> > > > > cleared
> > > > > up some space?
> > > > >
> > >
> > > This is not implemented in the firmware, although I think it could
> > > be done through the GuC-to-host interrupt. I worry that if we
> > > implement this, the driver will end up handling too many interrupts
> > > (maybe as many as there are context switches). However, ideally we
> > > don't want to handle interrupts at all.
> > > > > >  	};
> > > > > >
> > > > > >  	kunmap_atomic(base);
> > > > > > @@ -597,19 +609,17 @@ int i915_guc_submit(struct
> > > > > > i915_guc_client
> > > > > > *client,
> > > > > >  {
> > > > > >  	struct intel_guc *guc = client->guc;
> > > > > >  	enum intel_ring_id ring_id = rq->ring->id;
> > > > > > -	unsigned long flags;
> > > > > >  	int q_ret, b_ret;
> > > > > >
> > > > > >  	/* Need this because of the deferred pin ctx and ring
> > > > > > */
> > > > > >  	/* Shall we move this right after ring is pinned? */
> > > > > >  	lr_context_update(rq);
> > > > > >
> > > > > > -	spin_lock_irqsave(&client->wq_lock, flags);
> > > > > > -
> > > > > >  	q_ret = guc_add_workqueue_item(client, rq);
> > > > > >  	if (q_ret == 0)
> > > > > >  		b_ret = guc_ring_doorbell(client);
> > > > > >
> > > > > > +	spin_lock(&guc->host2guc_lock);
> > > > >
> > > > > So at first I thought there's a race now, but then I looked at
> > > > > what
> > > > > host2guc and wq_lock protect. It seems like the only thing they
> > > > > do is
> > > > > protect against debugfs, all the real protection against
> > > > > inconsistent
> > > > > state is done through dev->struct_mutex.
> > > > >
> > > > > Can't we just rip out all this spinlock business from the guc
> > > > > code?
> > > > > It would be easier than fixing up the races in here.
> > > >
> > > >
> > >
> > > Yes, the host2guc lock can be replaced by dev->struct_mutex. But we
> > > definitely don't want the process interrupted while the driver is
> > > programming the GuC work queue and ringing the doorbell.
> > > > > -Daniel
> > > > >
> > > > > >  	client->submissions[ring_id] += 1;
> > > > > >  	if (q_ret) {
> > > > > >  		client->q_fail += 1;
> > > > > > @@ -620,9 +630,6 @@ int i915_guc_submit(struct i915_guc_client
> > > > > > *client,
> > > > > >  	} else {
> > > > > >  		client->retcode = 0;
> > > > > >  	}
> > > > > > -	spin_unlock_irqrestore(&client->wq_lock, flags);
> > > > > > -
> > > > > > -	spin_lock(&guc->host2guc_lock);
> > > > > >  	guc->submissions[ring_id] += 1;
> > > > > >  	guc->last_seqno[ring_id] = rq->seqno;
> > > > > >  	spin_unlock(&guc->host2guc_lock);
> > > > > > --
> > > > > > 2.5.0
> > > > > >
> > > > >
> > >
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 18:36               ` Yu Dai
@ 2015-11-24 19:13                 ` Daniel Vetter
  2015-11-24 22:34                   ` Yu Dai
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2015-11-24 19:13 UTC (permalink / raw)
  To: Yu Dai; +Cc: intel-gfx

On Tue, Nov 24, 2015 at 10:36:54AM -0800, Yu Dai wrote:
> 
> 
> On 11/24/2015 10:08 AM, Daniel Vetter wrote:
> >On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> >> On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> >> >
> >> > On 11/24/2015 05:26 AM, Imre Deak wrote:
> >> > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> >> > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> >> > > > > From: Alex Dai <yu.dai@intel.com>
> >> > > > >
> >> > > > > When the GuC Work Queue is full, the driver waits for GuC to
> >> > > > > free space by delaying 1ms. The wait needs to be outside the
> >> > > > > spin_lock_irq / unlock pair. Otherwise, a lockup happens
> >> > > > > because jiffies won't be updated while irqs are disabled.
> >> > > > >
> >> > > > > Issue is found in igt/gem_close_race.
> >> > > > >
> >> > > > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> >> > > > > ---
> >> > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> >> > > > > +++++++++++++++++-
> >> > > > > ---------
> >> > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> >> > > > >
> >> > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > index 0a6b007..1418397 100644
> >> > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> >> > > > > i915_guc_client *gc)
> >> > > > >  	union guc_doorbell_qw *db;
> >> > > > >  	void *base;
> >> > > > >  	int attempt = 2, ret = -EAGAIN;
> >> > > > > +	unsigned long flags;
> >> > > > >
> >> > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> >> > > > > > client_obj, 0));
> >> > > >
> >> > > > We don't need kmap_atomic here anymore, since it's outside of
> >> > > > the
> >> > > > spinlock.
> >> > > >
> >> > > > >  	desc = base + gc->proc_desc_offset;
> >> > > > >
> >> > > > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> >> > > >
> >> > > > Please don't use the super-generic _irqsave. It's expensive and
> >> > > > results in
> >> > > > fragile code when someone accidentally reuses something in an
> >> > > > interrupt
> >> > > > handler that was never meant to run in that context.
> >> > > >
> >> > > > Instead please use the most specific function:
> >> > > > - spin_lock if you know you are in irq context.
> >> > > > - spin_lock_irq if you know you are not.
> >> > >
> >> > > Right, and simply spin_lock() if the lock is not taken in IRQ
> >> > > context
> >> > > ever.
> >> >
> >> > This is not in IRQ context. So I will use spin_lock_irq instead.
> >>
> >> You can just use spin_lock(). spin_lock_irq() only makes sense if you
> >> take the lock in IRQ context too, which is not the case.
> >
> >Imo just drop both spinlocks; adding locks for debugfs is overkill.
> >
> How about using mutex_lock_interruptible(&dev->struct_mutex) in debugfs
> instead, to replace the host2guc lock?

Yes.

> A spinlock around ringing the doorbell is still needed.

Where/why is that needed? At least on a quick look I didn't notice
anything.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 19:13                 ` Daniel Vetter
@ 2015-11-24 22:34                   ` Yu Dai
  2015-11-25  8:45                     ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Yu Dai @ 2015-11-24 22:34 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx



On 11/24/2015 11:13 AM, Daniel Vetter wrote:
> On Tue, Nov 24, 2015 at 10:36:54AM -0800, Yu Dai wrote:
> >
> >
> > On 11/24/2015 10:08 AM, Daniel Vetter wrote:
> > >On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> > >> On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> > >> >
> > >> > On 11/24/2015 05:26 AM, Imre Deak wrote:
> > >> > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> > >> > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> > >> > > > > From: Alex Dai <yu.dai@intel.com>
> > >> > > > >
> > >> > > > > When the GuC Work Queue is full, the driver waits for GuC
> > >> > > > > to free space by delaying 1ms. The wait needs to be outside
> > >> > > > > the spin_lock_irq / unlock pair. Otherwise, a lockup happens
> > >> > > > > because jiffies won't be updated while irqs are disabled.
> > >> > > > >
> > >> > > > > Issue is found in igt/gem_close_race.
> > >> > > > >
> > >> > > > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > >> > > > > ---
> > >> > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> > >> > > > > +++++++++++++++++-
> > >> > > > > ---------
> > >> > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > >> > > > >
> > >> > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> > >> > > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > >> > > > > index 0a6b007..1418397 100644
> > >> > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > >> > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > >> > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> > >> > > > > i915_guc_client *gc)
> > >> > > > >  	union guc_doorbell_qw *db;
> > >> > > > >  	void *base;
> > >> > > > >  	int attempt = 2, ret = -EAGAIN;
> > >> > > > > +	unsigned long flags;
> > >> > > > >
> > >> > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> > >> > > > > > client_obj, 0));
> > >> > > >
> > >> > > > We don't need kmap_atomic here anymore, since it's outside of
> > >> > > > the
> > >> > > > spinlock.
> > >> > > >
> > >> > > > >  	desc = base + gc->proc_desc_offset;
> > >> > > > >
> > >> > > > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> > >> > > >
> > >> > > > Please don't use the super-generic _irqsave. It's expensive and
> > >> > > > results in
> > >> > > > fragile code when someone accidentally reuses something in an
> > >> > > > interrupt
> > >> > > > handler that was never meant to run in that context.
> > >> > > >
> > >> > > > Instead please use the most specific function:
> > >> > > > - spin_lock if you know you are in irq context.
> > >> > > > - spin_lock_irq if you know you are not.
> > >> > >
> > >> > > Right, and simply spin_lock() if the lock is not taken in IRQ
> > >> > > context
> > >> > > ever.
> > >> >
> > >> > This is not in IRQ context. So I will use spin_lock_irq instead.
> > >>
> > >> You can just use spin_lock(). spin_lock_irq() only makes sense if you
> > >> take the lock in IRQ context too, which is not the case.
> > >
> > >Imo just drop both spinlocks; adding locks for debugfs is overkill.
> > >
> > How about using mutex_lock_interruptible(&dev->struct_mutex) in debugfs
> > instead, to replace the host2guc lock?
>
> Yes.
>
> > A spinlock around ringing the doorbell is still needed.
>
> Where/why is that needed? At least on a quick look I didn't notice
> anything.
>

Currently there is only one GuC client doing command submission, so it 
appears we don't need the lock. When there are more clients all writing 
to the scratch registers or ringing the doorbell, we don't want them to 
interfere with each other. Also, if we implement a GuC-to-host interrupt 
(say, to handle the log-buffer-full event), we do need to protect the 
GuC client content. But none of that exists today. I can clean these up 
and test it out.

Alex

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock
  2015-11-24 22:34                   ` Yu Dai
@ 2015-11-25  8:45                     ` Daniel Vetter
  0 siblings, 0 replies; 24+ messages in thread
From: Daniel Vetter @ 2015-11-25  8:45 UTC (permalink / raw)
  To: Yu Dai; +Cc: intel-gfx

On Tue, Nov 24, 2015 at 02:34:46PM -0800, Yu Dai wrote:
> 
> 
> On 11/24/2015 11:13 AM, Daniel Vetter wrote:
> >On Tue, Nov 24, 2015 at 10:36:54AM -0800, Yu Dai wrote:
> >>
> >>
> >> On 11/24/2015 10:08 AM, Daniel Vetter wrote:
> >> >On Tue, Nov 24, 2015 at 07:05:47PM +0200, Imre Deak wrote:
> >> >> On ti, 2015-11-24 at 09:00 -0800, Yu Dai wrote:
> >> >> >
> >> >> > On 11/24/2015 05:26 AM, Imre Deak wrote:
> >> >> > > On ti, 2015-11-24 at 14:04 +0100, Daniel Vetter wrote:
> >> >> > > > On Mon, Nov 23, 2015 at 03:02:58PM -0800, yu.dai@intel.com wrote:
> >> >> > > > > From: Alex Dai <yu.dai@intel.com>
> >> >> > > > >
> > >> >> > > > > When the GuC Work Queue is full, the driver waits for
> > >> >> > > > > GuC to free space by delaying 1ms. The wait needs to be
> > >> >> > > > > outside the spin_lock_irq / unlock pair. Otherwise, a
> > >> >> > > > > lockup happens because jiffies won't be updated while
> > >> >> > > > > irqs are disabled.
> >> >> > > > >
> >> >> > > > > Issue is found in igt/gem_close_race.
> >> >> > > > >
> >> >> > > > > Signed-off-by: Alex Dai <yu.dai@intel.com>
> >> >> > > > > ---
> >> >> > > > >  drivers/gpu/drm/i915/i915_guc_submission.c | 27
> >> >> > > > > +++++++++++++++++-
> >> >> > > > > ---------
> >> >> > > > >  1 file changed, 17 insertions(+), 10 deletions(-)
> >> >> > > > >
> >> >> > > > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> >> > > > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> >> > > > > index 0a6b007..1418397 100644
> >> >> > > > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> >> > > > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> >> > > > > @@ -201,10 +201,13 @@ static int guc_ring_doorbell(struct
> >> >> > > > > i915_guc_client *gc)
> >> >> > > > >  	union guc_doorbell_qw *db;
> >> >> > > > >  	void *base;
> >> >> > > > >  	int attempt = 2, ret = -EAGAIN;
> >> >> > > > > +	unsigned long flags;
> >> >> > > > >
> >> >> > > > >  	base = kmap_atomic(i915_gem_object_get_page(gc-
> >> >> > > > > > client_obj, 0));
> >> >> > > >
> > >> >> > > > We don't need kmap_atomic here anymore, since it's outside of
> >> >> > > > the
> >> >> > > > spinlock.
> >> >> > > >
> >> >> > > > >  	desc = base + gc->proc_desc_offset;
> >> >> > > > >
> >> >> > > > > +	spin_lock_irqsave(&gc->wq_lock, flags);
> >> >> > > >
> >> >> > > > Please don't use the super-generic _irqsave. It's expensive and
> >> >> > > > results in
> >> >> > > > fragile code when someone accidentally reuses something in an
> >> >> > > > interrupt
> >> >> > > > handler that was never meant to run in that context.
> >> >> > > >
> > >> >> > > > Instead please use the most specific function:
> > >> >> > > > - spin_lock if you know you are in irq context.
> > >> >> > > > - spin_lock_irq if you know you are not.
> >> >> > >
> >> >> > > Right, and simply spin_lock() if the lock is not taken in IRQ
> >> >> > > context
> >> >> > > ever.
> >> >> >
> >> >> > This is not in IRQ context. So I will use spin_lock_irq instead.
> >> >>
> > >> >> You can just use spin_lock(). spin_lock_irq() only makes sense if you
> >> >> take the lock in IRQ context too, which is not the case.
> >> >
> > >> >Imo just drop both spinlocks; adding locks for debugfs is overkill.
> >> >
> > >> How about using mutex_lock_interruptible(&dev->struct_mutex) in debugfs
> > >> instead, to replace the host2guc lock?
> >
> >Yes.
> >
> > >> A spinlock around ringing the doorbell is still needed.
> >
> >Where/why is that needed? At least on a quick look I didn't notice
> >anything.
> >
> 
> Currently there is only one GuC client doing command submission, so it
> appears we don't need the lock. When there are more clients all writing to
> the scratch registers or ringing the doorbell, we don't want them to
> interfere with each other. Also, if we implement a GuC-to-host interrupt
> (say, to handle the log-buffer-full event), we do need to protect the GuC
> client content. But none of that exists today. I can clean these up and
> test it out.

Yeah I think it's better to add locking when we need it, since very likely
something will have changed in the codebase by then. Otherwise there's
only complication and confusion. So please remove the spinlocks and use
the usual interruptible mutex locking for debugfs for now.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2] drm/i915/guc: Clean up locks in GuC
  2015-11-23 23:02   ` [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock yu.dai
  2015-11-24 13:04     ` Daniel Vetter
@ 2015-11-25 19:29     ` yu.dai
  2015-11-26  9:16       ` Nick Hoath
  2015-12-01  0:15     ` [PATCH v3] " yu.dai
  2015-12-03  0:56     ` [PATCH v4] " yu.dai
  3 siblings, 1 reply; 24+ messages in thread
From: yu.dai @ 2015-11-25 19:29 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

When GuC Work Queue is full, driver will wait GuC for avaliable
space by delaying 1ms. The wait needs to be out of spinlockirq /
unlock. Otherwise, lockup happens because jiffies won't be updated
dur to irq is disabled. The unnecessary locks has been cleared.
dev->struct_mutex is used instead where needed.

Issue is found in igt/gem_close_race.

v2: Clean up wq_lock too
v1: Clean up host2guc lock as well

Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 12 +++++------
 drivers/gpu/drm/i915/i915_guc_submission.c | 32 +++++++-----------------------
 drivers/gpu/drm/i915/intel_guc.h           |  4 ----
 3 files changed, 13 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff1..d6b7817 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
 	if (!HAS_GUC_SCHED(dev_priv->dev))
 		return 0;
 
+	if (mutex_lock_interruptible(&dev->struct_mutex))
+		return 0;
+
 	/* Take a local copy of the GuC data, so we can dump it at leisure */
-	spin_lock(&dev_priv->guc.host2guc_lock);
 	guc = dev_priv->guc;
-	if (guc.execbuf_client) {
-		spin_lock(&guc.execbuf_client->wq_lock);
+	if (guc.execbuf_client)
 		client = *guc.execbuf_client;
-		spin_unlock(&guc.execbuf_client->wq_lock);
-	}
-	spin_unlock(&dev_priv->guc.host2guc_lock);
+
+	mutex_unlock(&dev->struct_mutex);
 
 	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
 	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ed9f100..97996e5 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 		return -EINVAL;
 
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-	spin_lock(&dev_priv->guc.host2guc_lock);
 
 	dev_priv->guc.action_count += 1;
 	dev_priv->guc.action_cmd = data[0];
@@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 	}
 	dev_priv->guc.action_status = status;
 
-	spin_unlock(&dev_priv->guc.host2guc_lock);
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
 	return ret;
@@ -249,6 +247,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
 	}
 
 	kunmap_atomic(base);
+
 	return ret;
 }
 
@@ -292,16 +291,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
 	const uint32_t cacheline_size = cache_line_size();
 	uint32_t offset;
 
-	spin_lock(&guc->host2guc_lock);
-
 	/* Doorbell uses a single cache line within a page */
 	offset = offset_in_page(guc->db_cacheline);
 
 	/* Moving to next cache line to reduce contention */
 	guc->db_cacheline += cacheline_size;
 
-	spin_unlock(&guc->host2guc_lock);
-
 	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
 			offset, guc->db_cacheline, cacheline_size);
 
@@ -322,13 +317,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
 	const uint16_t end = start + half;
 	uint16_t id;
 
-	spin_lock(&guc->host2guc_lock);
 	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
 	if (id == end)
 		id = GUC_INVALID_DOORBELL_ID;
 	else
 		bitmap_set(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);
 
 	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
 			hi_pri ? "high" : "normal", id);
@@ -338,9 +331,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
 
 static void release_doorbell(struct intel_guc *guc, uint16_t id)
 {
-	spin_lock(&guc->host2guc_lock);
 	bitmap_clear(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);
 }
 
 /*
@@ -487,16 +478,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
-	int ret = 0, timeout_counter = 200;
+	int ret = -ETIMEDOUT, timeout_counter = 200;
 
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
-		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-				gc->wq_size) >= size, 1);
-
-		if (!ret) {
+		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
 			*offset = gc->wq_tail;
 
 			/* advance the tail for next workqueue item */
@@ -505,7 +493,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 
 			/* this will break the loop */
 			timeout_counter = 0;
+			ret = 0;
 		}
+
+		if (timeout_counter)
+			usleep_range(1000, 2000);
 	};
 
 	kunmap_atomic(base);
@@ -597,15 +589,12 @@ int i915_guc_submit(struct i915_guc_client *client,
 {
 	struct intel_guc *guc = client->guc;
 	enum intel_ring_id ring_id = rq->ring->id;
-	unsigned long flags;
 	int q_ret, b_ret;
 
 	/* Need this because of the deferred pin ctx and ring */
 	/* Shall we move this right after ring is pinned? */
 	lr_context_update(rq);
 
-	spin_lock_irqsave(&client->wq_lock, flags);
-
 	q_ret = guc_add_workqueue_item(client, rq);
 	if (q_ret == 0)
 		b_ret = guc_ring_doorbell(client);
@@ -620,12 +609,8 @@ int i915_guc_submit(struct i915_guc_client *client,
 	} else {
 		client->retcode = 0;
 	}
-	spin_unlock_irqrestore(&client->wq_lock, flags);
-
-	spin_lock(&guc->host2guc_lock);
 	guc->submissions[ring_id] += 1;
 	guc->last_seqno[ring_id] = rq->seqno;
-	spin_unlock(&guc->host2guc_lock);
 
 	return q_ret;
 }
@@ -768,7 +753,6 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 	client->client_obj = obj;
 	client->wq_offset = GUC_DB_SIZE;
 	client->wq_size = GUC_WQ_SIZE;
-	spin_lock_init(&client->wq_lock);
 
 	client->doorbell_offset = select_doorbell_cacheline(guc);
 
@@ -871,8 +855,6 @@ int i915_guc_submission_init(struct drm_device *dev)
 	if (!guc->ctx_pool_obj)
 		return -ENOMEM;
 
-	spin_lock_init(&dev_priv->guc.host2guc_lock);
-
 	ida_init(&guc->ctx_ids);
 
 	guc_create_log(guc);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5ba5866..8229522 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -42,8 +42,6 @@ struct i915_guc_client {
 
 	uint32_t wq_offset;
 	uint32_t wq_size;
-
-	spinlock_t wq_lock;		/* Protects all data below	*/
 	uint32_t wq_tail;
 
 	/* GuC submission statistics & status */
@@ -95,8 +93,6 @@ struct intel_guc {
 
 	struct i915_guc_client *execbuf_client;
 
-	spinlock_t host2guc_lock;	/* Protects all data below	*/
-
 	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
 	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
 
-- 
2.5.0


* Re: [PATCH v2] drm/i915/guc: Clean up locks in GuC
  2015-11-25 19:29     ` [PATCH v2] drm/i915/guc: Clean up locks in GuC yu.dai
@ 2015-11-26  9:16       ` Nick Hoath
  0 siblings, 0 replies; 24+ messages in thread
From: Nick Hoath @ 2015-11-26  9:16 UTC (permalink / raw)
  To: Dai, Yu, intel-gfx

On 25/11/2015 19:29, Dai, Yu wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> When GuC Work Queue is full, driver will wait GuC for avaliable
                                                         available
> space by delaying 1ms. The wait needs to be out of spinlockirq /
> unlock. Otherwise, lockup happens because jiffies won't be updated
> dur to irq is disabled. The unnecessary locks has been cleared.
   due        being                              have
> dev->struct_mutex is used instead where needed.
>
> Issue is found in igt/gem_close_race.
>
> v2: Clean up wq_lock too
> v1: Clean up host2guc lock as well
>
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c        | 12 +++++------
>   drivers/gpu/drm/i915/i915_guc_submission.c | 32 +++++++-----------------------
>   drivers/gpu/drm/i915/intel_guc.h           |  4 ----
>   3 files changed, 13 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a728ff1..d6b7817 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
>   	if (!HAS_GUC_SCHED(dev_priv->dev))
>   		return 0;
>
> +	if (mutex_lock_interruptible(&dev->struct_mutex))
> +		return 0;
> +
>   	/* Take a local copy of the GuC data, so we can dump it at leisure */
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>   	guc = dev_priv->guc;
> -	if (guc.execbuf_client) {
> -		spin_lock(&guc.execbuf_client->wq_lock);
> +	if (guc.execbuf_client)
>   		client = *guc.execbuf_client;
> -		spin_unlock(&guc.execbuf_client->wq_lock);
> -	}
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
> +
> +	mutex_unlock(&dev->struct_mutex);
>
>   	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
>   	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index ed9f100..97996e5 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   		return -EINVAL;
>
>   	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>
>   	dev_priv->guc.action_count += 1;
>   	dev_priv->guc.action_cmd = data[0];
> @@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   	}
>   	dev_priv->guc.action_status = status;
>
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
>   	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>
>   	return ret;
> @@ -249,6 +247,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>   	}
>
>   	kunmap_atomic(base);
> +
Unnecessary whitespace churn
>   	return ret;
>   }
>
> @@ -292,16 +291,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
>   	const uint32_t cacheline_size = cache_line_size();
>   	uint32_t offset;
>
> -	spin_lock(&guc->host2guc_lock);
> -
>   	/* Doorbell uses a single cache line within a page */
>   	offset = offset_in_page(guc->db_cacheline);
>
>   	/* Moving to next cache line to reduce contention */
>   	guc->db_cacheline += cacheline_size;
>
> -	spin_unlock(&guc->host2guc_lock);
> -
>   	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
>   			offset, guc->db_cacheline, cacheline_size);
>
> @@ -322,13 +317,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>   	const uint16_t end = start + half;
>   	uint16_t id;
>
> -	spin_lock(&guc->host2guc_lock);
>   	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
>   	if (id == end)
>   		id = GUC_INVALID_DOORBELL_ID;
>   	else
>   		bitmap_set(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>
>   	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
>   			hi_pri ? "high" : "normal", id);
> @@ -338,9 +331,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>
>   static void release_doorbell(struct intel_guc *guc, uint16_t id)
>   {
> -	spin_lock(&guc->host2guc_lock);
>   	bitmap_clear(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>   }
>
>   /*
> @@ -487,16 +478,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>   	struct guc_process_desc *desc;
>   	void *base;
>   	u32 size = sizeof(struct guc_wq_item);
> -	int ret = 0, timeout_counter = 200;
> +	int ret = -ETIMEDOUT, timeout_counter = 200;
>
>   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>   	desc = base + gc->proc_desc_offset;
>
>   	while (timeout_counter-- > 0) {
> -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> -				gc->wq_size) >= size, 1);
> -
> -		if (!ret) {
> +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
>   			*offset = gc->wq_tail;
>
>   			/* advance the tail for next workqueue item */
> @@ -505,7 +493,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>
>   			/* this will break the loop */
>   			timeout_counter = 0;
> +			ret = 0;
>   		}
> +
> +		if (timeout_counter)
> +			usleep_range(1000, 2000);
>   	};
>
>   	kunmap_atomic(base);
> @@ -597,15 +589,12 @@ int i915_guc_submit(struct i915_guc_client *client,
>   {
>   	struct intel_guc *guc = client->guc;
>   	enum intel_ring_id ring_id = rq->ring->id;
> -	unsigned long flags;
>   	int q_ret, b_ret;
>
>   	/* Need this because of the deferred pin ctx and ring */
>   	/* Shall we move this right after ring is pinned? */
>   	lr_context_update(rq);
>
> -	spin_lock_irqsave(&client->wq_lock, flags);
> -
>   	q_ret = guc_add_workqueue_item(client, rq);
>   	if (q_ret == 0)
>   		b_ret = guc_ring_doorbell(client);
> @@ -620,12 +609,8 @@ int i915_guc_submit(struct i915_guc_client *client,
>   	} else {
>   		client->retcode = 0;
>   	}
> -	spin_unlock_irqrestore(&client->wq_lock, flags);
> -
> -	spin_lock(&guc->host2guc_lock);
>   	guc->submissions[ring_id] += 1;
>   	guc->last_seqno[ring_id] = rq->seqno;
> -	spin_unlock(&guc->host2guc_lock);
>
>   	return q_ret;
>   }
> @@ -768,7 +753,6 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
>   	client->client_obj = obj;
>   	client->wq_offset = GUC_DB_SIZE;
>   	client->wq_size = GUC_WQ_SIZE;
> -	spin_lock_init(&client->wq_lock);
>
>   	client->doorbell_offset = select_doorbell_cacheline(guc);
>
> @@ -871,8 +855,6 @@ int i915_guc_submission_init(struct drm_device *dev)
>   	if (!guc->ctx_pool_obj)
>   		return -ENOMEM;
>
> -	spin_lock_init(&dev_priv->guc.host2guc_lock);
> -
>   	ida_init(&guc->ctx_ids);
>
>   	guc_create_log(guc);
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 5ba5866..8229522 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -42,8 +42,6 @@ struct i915_guc_client {
>
>   	uint32_t wq_offset;
>   	uint32_t wq_size;
> -
> -	spinlock_t wq_lock;		/* Protects all data below	*/
>   	uint32_t wq_tail;
>
>   	/* GuC submission statistics & status */
> @@ -95,8 +93,6 @@ struct intel_guc {
>
>   	struct i915_guc_client *execbuf_client;
>
> -	spinlock_t host2guc_lock;	/* Protects all data below	*/
> -
>   	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
>   	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
>
>


* [PATCH v3] drm/i915/guc: Clean up locks in GuC
  2015-11-23 23:02   ` [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock yu.dai
  2015-11-24 13:04     ` Daniel Vetter
  2015-11-25 19:29     ` [PATCH v2] drm/i915/guc: Clean up locks in GuC yu.dai
@ 2015-12-01  0:15     ` yu.dai
  2015-12-02 19:50       ` Dave Gordon
  2015-12-03  0:56     ` [PATCH v4] " yu.dai
  3 siblings, 1 reply; 24+ messages in thread
From: yu.dai @ 2015-12-01  0:15 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

When GuC Work Queue is full, driver will wait GuC for avaliable
space by delaying 1ms. The wait needs to be out of spinlockirq /
unlock. Otherwise, lockup happens because jiffies won't be updated
due to irq is disabled. The unnecessary locks has been cleared.
dev->struct_mutex is used instead where needed.

Issue is found in igt/gem_close_race.

v3: Remove unnecessary whitespace churn
v2: Clean up wq_lock too
v1: Clean up host2guc lock as well

Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
 drivers/gpu/drm/i915/intel_guc.h           |  4 ----
 3 files changed, 12 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff1..d6b7817 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
 	if (!HAS_GUC_SCHED(dev_priv->dev))
 		return 0;
 
+	if (mutex_lock_interruptible(&dev->struct_mutex))
+		return 0;
+
 	/* Take a local copy of the GuC data, so we can dump it at leisure */
-	spin_lock(&dev_priv->guc.host2guc_lock);
 	guc = dev_priv->guc;
-	if (guc.execbuf_client) {
-		spin_lock(&guc.execbuf_client->wq_lock);
+	if (guc.execbuf_client)
 		client = *guc.execbuf_client;
-		spin_unlock(&guc.execbuf_client->wq_lock);
-	}
-	spin_unlock(&dev_priv->guc.host2guc_lock);
+
+	mutex_unlock(&dev->struct_mutex);
 
 	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
 	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ed9f100..a7f9785 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 		return -EINVAL;
 
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-	spin_lock(&dev_priv->guc.host2guc_lock);
 
 	dev_priv->guc.action_count += 1;
 	dev_priv->guc.action_cmd = data[0];
@@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 	}
 	dev_priv->guc.action_status = status;
 
-	spin_unlock(&dev_priv->guc.host2guc_lock);
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
 	return ret;
@@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
 	const uint32_t cacheline_size = cache_line_size();
 	uint32_t offset;
 
-	spin_lock(&guc->host2guc_lock);
-
 	/* Doorbell uses a single cache line within a page */
 	offset = offset_in_page(guc->db_cacheline);
 
 	/* Moving to next cache line to reduce contention */
 	guc->db_cacheline += cacheline_size;
 
-	spin_unlock(&guc->host2guc_lock);
-
 	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
 			offset, guc->db_cacheline, cacheline_size);
 
@@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
 	const uint16_t end = start + half;
 	uint16_t id;
 
-	spin_lock(&guc->host2guc_lock);
 	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
 	if (id == end)
 		id = GUC_INVALID_DOORBELL_ID;
 	else
 		bitmap_set(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);
 
 	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
 			hi_pri ? "high" : "normal", id);
@@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
 
 static void release_doorbell(struct intel_guc *guc, uint16_t id)
 {
-	spin_lock(&guc->host2guc_lock);
 	bitmap_clear(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);
 }
 
 /*
@@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
-	int ret = 0, timeout_counter = 200;
+	int ret = -ETIMEDOUT, timeout_counter = 200;
 
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
-		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-				gc->wq_size) >= size, 1);
-
-		if (!ret) {
+		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
 			*offset = gc->wq_tail;
 
 			/* advance the tail for next workqueue item */
@@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 
 			/* this will break the loop */
 			timeout_counter = 0;
+			ret = 0;
 		}
+
+		if (timeout_counter)
+			usleep_range(1000, 2000);
 	};
 
 	kunmap_atomic(base);
@@ -597,15 +588,12 @@ int i915_guc_submit(struct i915_guc_client *client,
 {
 	struct intel_guc *guc = client->guc;
 	enum intel_ring_id ring_id = rq->ring->id;
-	unsigned long flags;
 	int q_ret, b_ret;
 
 	/* Need this because of the deferred pin ctx and ring */
 	/* Shall we move this right after ring is pinned? */
 	lr_context_update(rq);
 
-	spin_lock_irqsave(&client->wq_lock, flags);
-
 	q_ret = guc_add_workqueue_item(client, rq);
 	if (q_ret == 0)
 		b_ret = guc_ring_doorbell(client);
@@ -620,12 +608,8 @@ int i915_guc_submit(struct i915_guc_client *client,
 	} else {
 		client->retcode = 0;
 	}
-	spin_unlock_irqrestore(&client->wq_lock, flags);
-
-	spin_lock(&guc->host2guc_lock);
 	guc->submissions[ring_id] += 1;
 	guc->last_seqno[ring_id] = rq->seqno;
-	spin_unlock(&guc->host2guc_lock);
 
 	return q_ret;
 }
@@ -768,7 +752,6 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 	client->client_obj = obj;
 	client->wq_offset = GUC_DB_SIZE;
 	client->wq_size = GUC_WQ_SIZE;
-	spin_lock_init(&client->wq_lock);
 
 	client->doorbell_offset = select_doorbell_cacheline(guc);
 
@@ -871,8 +854,6 @@ int i915_guc_submission_init(struct drm_device *dev)
 	if (!guc->ctx_pool_obj)
 		return -ENOMEM;
 
-	spin_lock_init(&dev_priv->guc.host2guc_lock);
-
 	ida_init(&guc->ctx_ids);
 
 	guc_create_log(guc);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5ba5866..8229522 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -42,8 +42,6 @@ struct i915_guc_client {
 
 	uint32_t wq_offset;
 	uint32_t wq_size;
-
-	spinlock_t wq_lock;		/* Protects all data below	*/
 	uint32_t wq_tail;
 
 	/* GuC submission statistics & status */
@@ -95,8 +93,6 @@ struct intel_guc {
 
 	struct i915_guc_client *execbuf_client;
 
-	spinlock_t host2guc_lock;	/* Protects all data below	*/
-
 	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
 	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
 
-- 
2.5.0


* Re: [PATCH v3] drm/i915/guc: Clean up locks in GuC
  2015-12-01  0:15     ` [PATCH v3] " yu.dai
@ 2015-12-02 19:50       ` Dave Gordon
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Gordon @ 2015-12-02 19:50 UTC (permalink / raw)
  To: Dai, Yu, intel-gfx

On 01/12/15 00:15, Dai, Yu wrote:
> From: Alex Dai <yu.dai@intel.com>
>
> When GuC Work Queue is full, driver will wait GuC for avaliable
> space by delaying 1ms. The wait needs to be out of spinlockirq /
> unlock. Otherwise, lockup happens because jiffies won't be updated
> due to irq is disabled. The unnecessary locks has been cleared.
> dev->struct_mutex is used instead where needed.
>
> Issue is found in igt/gem_close_race.
>
> v3: Remove unnecessary whitespace churn
> v2: Clean up wq_lock too
> v1: Clean up host2guc lock as well
>
> Signed-off-by: Alex Dai <yu.dai@intel.com>

A few typos in the commit message, as pointed out by Nick in the 
previous review cycle, but the code looks OK so

Reviewed-by: Dave Gordon <david.s.gordon@intel.com>

and here's a suggestion for the commit message, with a
bit more background and more 'why' rather than 'what':

: For now, remove the spinlocks that protected the GuC's
: statistics block and work queue; they are only accessed
: by code that already holds the global struct_mutex, and
: so are redundant (until the big struct_mutex rewrite!).
:
: The specific problem that the spinlocks caused was that
: if the work queue was full, the driver would try to
: spinwait for one jiffy, but with interrupts disabled the
: jiffy count would not advance, leading to a system hang.
: The issue was found using test case igt/gem_close_race.
:
: The new version will usleep() instead, still holding
: the struct_mutex but without any spinlocks.

> ---
>   drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
>   drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
>   drivers/gpu/drm/i915/intel_guc.h           |  4 ----
>   3 files changed, 12 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a728ff1..d6b7817 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
>   	if (!HAS_GUC_SCHED(dev_priv->dev))
>   		return 0;
>
> +	if (mutex_lock_interruptible(&dev->struct_mutex))
> +		return 0;
> +
>   	/* Take a local copy of the GuC data, so we can dump it at leisure */
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>   	guc = dev_priv->guc;
> -	if (guc.execbuf_client) {
> -		spin_lock(&guc.execbuf_client->wq_lock);
> +	if (guc.execbuf_client)
>   		client = *guc.execbuf_client;
> -		spin_unlock(&guc.execbuf_client->wq_lock);
> -	}
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
> +
> +	mutex_unlock(&dev->struct_mutex);
>
>   	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
>   	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index ed9f100..a7f9785 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   		return -EINVAL;
>
>   	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>
>   	dev_priv->guc.action_count += 1;
>   	dev_priv->guc.action_cmd = data[0];
> @@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   	}
>   	dev_priv->guc.action_status = status;
>
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
>   	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>
>   	return ret;
> @@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
>   	const uint32_t cacheline_size = cache_line_size();
>   	uint32_t offset;
>
> -	spin_lock(&guc->host2guc_lock);
> -
>   	/* Doorbell uses a single cache line within a page */
>   	offset = offset_in_page(guc->db_cacheline);
>
>   	/* Moving to next cache line to reduce contention */
>   	guc->db_cacheline += cacheline_size;
>
> -	spin_unlock(&guc->host2guc_lock);
> -
>   	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
>   			offset, guc->db_cacheline, cacheline_size);
>
> @@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>   	const uint16_t end = start + half;
>   	uint16_t id;
>
> -	spin_lock(&guc->host2guc_lock);
>   	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
>   	if (id == end)
>   		id = GUC_INVALID_DOORBELL_ID;
>   	else
>   		bitmap_set(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>
>   	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
>   			hi_pri ? "high" : "normal", id);
> @@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>
>   static void release_doorbell(struct intel_guc *guc, uint16_t id)
>   {
> -	spin_lock(&guc->host2guc_lock);
>   	bitmap_clear(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>   }
>
>   /*
> @@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>   	struct guc_process_desc *desc;
>   	void *base;
>   	u32 size = sizeof(struct guc_wq_item);
> -	int ret = 0, timeout_counter = 200;
> +	int ret = -ETIMEDOUT, timeout_counter = 200;
>
>   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>   	desc = base + gc->proc_desc_offset;
>
>   	while (timeout_counter-- > 0) {
> -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> -				gc->wq_size) >= size, 1);
> -
> -		if (!ret) {
> +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
>   			*offset = gc->wq_tail;
>
>   			/* advance the tail for next workqueue item */
> @@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>
>   			/* this will break the loop */
>   			timeout_counter = 0;
> +			ret = 0;
>   		}
> +
> +		if (timeout_counter)
> +			usleep_range(1000, 2000);
>   	};
>
>   	kunmap_atomic(base);
> @@ -597,15 +588,12 @@ int i915_guc_submit(struct i915_guc_client *client,
>   {
>   	struct intel_guc *guc = client->guc;
>   	enum intel_ring_id ring_id = rq->ring->id;
> -	unsigned long flags;
>   	int q_ret, b_ret;
>
>   	/* Need this because of the deferred pin ctx and ring */
>   	/* Shall we move this right after ring is pinned? */
>   	lr_context_update(rq);
>
> -	spin_lock_irqsave(&client->wq_lock, flags);
> -
>   	q_ret = guc_add_workqueue_item(client, rq);
>   	if (q_ret == 0)
>   		b_ret = guc_ring_doorbell(client);
> @@ -620,12 +608,8 @@ int i915_guc_submit(struct i915_guc_client *client,
>   	} else {
>   		client->retcode = 0;
>   	}
> -	spin_unlock_irqrestore(&client->wq_lock, flags);
> -
> -	spin_lock(&guc->host2guc_lock);
>   	guc->submissions[ring_id] += 1;
>   	guc->last_seqno[ring_id] = rq->seqno;
> -	spin_unlock(&guc->host2guc_lock);
>
>   	return q_ret;
>   }
> @@ -768,7 +752,6 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
>   	client->client_obj = obj;
>   	client->wq_offset = GUC_DB_SIZE;
>   	client->wq_size = GUC_WQ_SIZE;
> -	spin_lock_init(&client->wq_lock);
>
>   	client->doorbell_offset = select_doorbell_cacheline(guc);
>
> @@ -871,8 +854,6 @@ int i915_guc_submission_init(struct drm_device *dev)
>   	if (!guc->ctx_pool_obj)
>   		return -ENOMEM;
>
> -	spin_lock_init(&dev_priv->guc.host2guc_lock);
> -
>   	ida_init(&guc->ctx_ids);
>
>   	guc_create_log(guc);
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 5ba5866..8229522 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -42,8 +42,6 @@ struct i915_guc_client {
>
>   	uint32_t wq_offset;
>   	uint32_t wq_size;
> -
> -	spinlock_t wq_lock;		/* Protects all data below	*/
>   	uint32_t wq_tail;
>
>   	/* GuC submission statistics & status */
> @@ -95,8 +93,6 @@ struct intel_guc {
>
>   	struct i915_guc_client *execbuf_client;
>
> -	spinlock_t host2guc_lock;	/* Protects all data below	*/
> -
>   	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
>   	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
>
>


* [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2015-11-23 23:02   ` [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock yu.dai
                       ` (2 preceding siblings ...)
  2015-12-01  0:15     ` [PATCH v3] " yu.dai
@ 2015-12-03  0:56     ` yu.dai
  2015-12-03  7:52       ` Daniel Vetter
  2016-01-22 10:54       ` Tvrtko Ursulin
  3 siblings, 2 replies; 24+ messages in thread
From: yu.dai @ 2015-12-03  0:56 UTC (permalink / raw)
  To: intel-gfx

From: Alex Dai <yu.dai@intel.com>

For now, remove the spinlocks that protected the GuC's
statistics block and work queue; they are only accessed
by code that already holds the global struct_mutex, and
so are redundant (until the big struct_mutex rewrite!).

The specific problem that the spinlocks caused was that
if the work queue was full, the driver would try to
spinwait for one jiffy, but with interrupts disabled the
jiffy count would not advance, leading to a system hang.
The issue was found using test case igt/gem_close_race.

The new version will usleep() instead, still holding
the struct_mutex but without any spinlocks.

v4: Reorganize commit message (Dave Gordon)
v3: Remove unnecessary whitespace churn
v2: Clean up wq_lock too
v1: Clean up host2guc lock as well

Signed-off-by: Alex Dai <yu.dai@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
 drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
 drivers/gpu/drm/i915/intel_guc.h           |  4 ----
 3 files changed, 12 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff1..d6b7817 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
 	if (!HAS_GUC_SCHED(dev_priv->dev))
 		return 0;
 
+	if (mutex_lock_interruptible(&dev->struct_mutex))
+		return 0;
+
 	/* Take a local copy of the GuC data, so we can dump it at leisure */
-	spin_lock(&dev_priv->guc.host2guc_lock);
 	guc = dev_priv->guc;
-	if (guc.execbuf_client) {
-		spin_lock(&guc.execbuf_client->wq_lock);
+	if (guc.execbuf_client)
 		client = *guc.execbuf_client;
-		spin_unlock(&guc.execbuf_client->wq_lock);
-	}
-	spin_unlock(&dev_priv->guc.host2guc_lock);
+
+	mutex_unlock(&dev->struct_mutex);
 
 	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
 	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ed9f100..a7f9785 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 		return -EINVAL;
 
 	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-	spin_lock(&dev_priv->guc.host2guc_lock);
 
 	dev_priv->guc.action_count += 1;
 	dev_priv->guc.action_cmd = data[0];
@@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
 	}
 	dev_priv->guc.action_status = status;
 
-	spin_unlock(&dev_priv->guc.host2guc_lock);
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 
 	return ret;
@@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
 	const uint32_t cacheline_size = cache_line_size();
 	uint32_t offset;
 
-	spin_lock(&guc->host2guc_lock);
-
 	/* Doorbell uses a single cache line within a page */
 	offset = offset_in_page(guc->db_cacheline);
 
 	/* Moving to next cache line to reduce contention */
 	guc->db_cacheline += cacheline_size;
 
-	spin_unlock(&guc->host2guc_lock);
-
 	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
 			offset, guc->db_cacheline, cacheline_size);
 
@@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
 	const uint16_t end = start + half;
 	uint16_t id;
 
-	spin_lock(&guc->host2guc_lock);
 	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
 	if (id == end)
 		id = GUC_INVALID_DOORBELL_ID;
 	else
 		bitmap_set(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);
 
 	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
 			hi_pri ? "high" : "normal", id);
@@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
 
 static void release_doorbell(struct intel_guc *guc, uint16_t id)
 {
-	spin_lock(&guc->host2guc_lock);
 	bitmap_clear(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);
 }
 
 /*
@@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 	struct guc_process_desc *desc;
 	void *base;
 	u32 size = sizeof(struct guc_wq_item);
-	int ret = 0, timeout_counter = 200;
+	int ret = -ETIMEDOUT, timeout_counter = 200;
 
 	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
 	desc = base + gc->proc_desc_offset;
 
 	while (timeout_counter-- > 0) {
-		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-				gc->wq_size) >= size, 1);
-
-		if (!ret) {
+		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
 			*offset = gc->wq_tail;
 
 			/* advance the tail for next workqueue item */
@@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
 
 			/* this will break the loop */
 			timeout_counter = 0;
+			ret = 0;
 		}
+
+		if (timeout_counter)
+			usleep_range(1000, 2000);
 	};
 
 	kunmap_atomic(base);
@@ -597,15 +588,12 @@ int i915_guc_submit(struct i915_guc_client *client,
 {
 	struct intel_guc *guc = client->guc;
 	enum intel_ring_id ring_id = rq->ring->id;
-	unsigned long flags;
 	int q_ret, b_ret;
 
 	/* Need this because of the deferred pin ctx and ring */
 	/* Shall we move this right after ring is pinned? */
 	lr_context_update(rq);
 
-	spin_lock_irqsave(&client->wq_lock, flags);
-
 	q_ret = guc_add_workqueue_item(client, rq);
 	if (q_ret == 0)
 		b_ret = guc_ring_doorbell(client);
@@ -620,12 +608,8 @@ int i915_guc_submit(struct i915_guc_client *client,
 	} else {
 		client->retcode = 0;
 	}
-	spin_unlock_irqrestore(&client->wq_lock, flags);
-
-	spin_lock(&guc->host2guc_lock);
 	guc->submissions[ring_id] += 1;
 	guc->last_seqno[ring_id] = rq->seqno;
-	spin_unlock(&guc->host2guc_lock);
 
 	return q_ret;
 }
@@ -768,7 +752,6 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 	client->client_obj = obj;
 	client->wq_offset = GUC_DB_SIZE;
 	client->wq_size = GUC_WQ_SIZE;
-	spin_lock_init(&client->wq_lock);
 
 	client->doorbell_offset = select_doorbell_cacheline(guc);
 
@@ -871,8 +854,6 @@ int i915_guc_submission_init(struct drm_device *dev)
 	if (!guc->ctx_pool_obj)
 		return -ENOMEM;
 
-	spin_lock_init(&dev_priv->guc.host2guc_lock);
-
 	ida_init(&guc->ctx_ids);
 
 	guc_create_log(guc);
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 5ba5866..8229522 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -42,8 +42,6 @@ struct i915_guc_client {
 
 	uint32_t wq_offset;
 	uint32_t wq_size;
-
-	spinlock_t wq_lock;		/* Protects all data below	*/
 	uint32_t wq_tail;
 
 	/* GuC submission statistics & status */
@@ -95,8 +93,6 @@ struct intel_guc {
 
 	struct i915_guc_client *execbuf_client;
 
-	spinlock_t host2guc_lock;	/* Protects all data below	*/
-
 	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
 	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
 
-- 
2.5.0


* Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2015-12-03  0:56     ` [PATCH v4] " yu.dai
@ 2015-12-03  7:52       ` Daniel Vetter
  2016-01-22 10:54       ` Tvrtko Ursulin
  1 sibling, 0 replies; 24+ messages in thread
From: Daniel Vetter @ 2015-12-03  7:52 UTC (permalink / raw)
  To: yu.dai; +Cc: intel-gfx

On Wed, Dec 02, 2015 at 04:56:29PM -0800, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> For now, remove the spinlocks that protected the GuC's
> statistics block and work queue; they are only accessed
> by code that already holds the global struct_mutex, and
> so are redundant (until the big struct_mutex rewrite!).
> 
> The specific problem that the spinlocks caused was that
> if the work queue was full, the driver would try to
> spinwait for one jiffy, but with interrupts disabled the
> jiffy count would not advance, leading to a system hang.
> The issue was found using test case igt/gem_close_race.
> 
> The new version will usleep() instead, still holding
> the struct_mutex but without any spinlocks.
> 
> v4: Reorganize commit message (Dave Gordon)
> v3: Remove unnecessary whitespace churn
> v2: Clean up wq_lock too
> v1: Clean up host2guc lock as well
> 
> Signed-off-by: Alex Dai <yu.dai@intel.com>

Please add the r-b yourself when resending and you have one, for next time
around.

Queued for -next, thanks for the patch.
-Daniel
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
>  drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
>  drivers/gpu/drm/i915/intel_guc.h           |  4 ----
>  3 files changed, 12 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a728ff1..d6b7817 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
>  	if (!HAS_GUC_SCHED(dev_priv->dev))
>  		return 0;
>  
> +	if (mutex_lock_interruptible(&dev->struct_mutex))
> +		return 0;
> +
>  	/* Take a local copy of the GuC data, so we can dump it at leisure */
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>  	guc = dev_priv->guc;
> -	if (guc.execbuf_client) {
> -		spin_lock(&guc.execbuf_client->wq_lock);
> +	if (guc.execbuf_client)
>  		client = *guc.execbuf_client;
> -		spin_unlock(&guc.execbuf_client->wq_lock);
> -	}
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
> +
> +	mutex_unlock(&dev->struct_mutex);
>  
>  	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
>  	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index ed9f100..a7f9785 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>  		return -EINVAL;
>  
>  	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>  
>  	dev_priv->guc.action_count += 1;
>  	dev_priv->guc.action_cmd = data[0];
> @@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>  	}
>  	dev_priv->guc.action_status = status;
>  
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>  
>  	return ret;
> @@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
>  	const uint32_t cacheline_size = cache_line_size();
>  	uint32_t offset;
>  
> -	spin_lock(&guc->host2guc_lock);
> -
>  	/* Doorbell uses a single cache line within a page */
>  	offset = offset_in_page(guc->db_cacheline);
>  
>  	/* Moving to next cache line to reduce contention */
>  	guc->db_cacheline += cacheline_size;
>  
> -	spin_unlock(&guc->host2guc_lock);
> -
>  	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
>  			offset, guc->db_cacheline, cacheline_size);
>  
> @@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>  	const uint16_t end = start + half;
>  	uint16_t id;
>  
> -	spin_lock(&guc->host2guc_lock);
>  	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
>  	if (id == end)
>  		id = GUC_INVALID_DOORBELL_ID;
>  	else
>  		bitmap_set(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>  
>  	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
>  			hi_pri ? "high" : "normal", id);
> @@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>  
>  static void release_doorbell(struct intel_guc *guc, uint16_t id)
>  {
> -	spin_lock(&guc->host2guc_lock);
>  	bitmap_clear(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>  }
>  
>  /*
> @@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>  	struct guc_process_desc *desc;
>  	void *base;
>  	u32 size = sizeof(struct guc_wq_item);
> -	int ret = 0, timeout_counter = 200;
> +	int ret = -ETIMEDOUT, timeout_counter = 200;
>  
>  	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>  	desc = base + gc->proc_desc_offset;
>  
>  	while (timeout_counter-- > 0) {
> -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> -				gc->wq_size) >= size, 1);
> -
> -		if (!ret) {
> +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
>  			*offset = gc->wq_tail;
>  
>  			/* advance the tail for next workqueue item */
> @@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>  
>  			/* this will break the loop */
>  			timeout_counter = 0;
> +			ret = 0;
>  		}
> +
> +		if (timeout_counter)
> +			usleep_range(1000, 2000);
>  	};
>  
>  	kunmap_atomic(base);
> @@ -597,15 +588,12 @@ int i915_guc_submit(struct i915_guc_client *client,
>  {
>  	struct intel_guc *guc = client->guc;
>  	enum intel_ring_id ring_id = rq->ring->id;
> -	unsigned long flags;
>  	int q_ret, b_ret;
>  
>  	/* Need this because of the deferred pin ctx and ring */
>  	/* Shall we move this right after ring is pinned? */
>  	lr_context_update(rq);
>  
> -	spin_lock_irqsave(&client->wq_lock, flags);
> -
>  	q_ret = guc_add_workqueue_item(client, rq);
>  	if (q_ret == 0)
>  		b_ret = guc_ring_doorbell(client);
> @@ -620,12 +608,8 @@ int i915_guc_submit(struct i915_guc_client *client,
>  	} else {
>  		client->retcode = 0;
>  	}
> -	spin_unlock_irqrestore(&client->wq_lock, flags);
> -
> -	spin_lock(&guc->host2guc_lock);
>  	guc->submissions[ring_id] += 1;
>  	guc->last_seqno[ring_id] = rq->seqno;
> -	spin_unlock(&guc->host2guc_lock);
>  
>  	return q_ret;
>  }
> @@ -768,7 +752,6 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
>  	client->client_obj = obj;
>  	client->wq_offset = GUC_DB_SIZE;
>  	client->wq_size = GUC_WQ_SIZE;
> -	spin_lock_init(&client->wq_lock);
>  
>  	client->doorbell_offset = select_doorbell_cacheline(guc);
>  
> @@ -871,8 +854,6 @@ int i915_guc_submission_init(struct drm_device *dev)
>  	if (!guc->ctx_pool_obj)
>  		return -ENOMEM;
>  
> -	spin_lock_init(&dev_priv->guc.host2guc_lock);
> -
>  	ida_init(&guc->ctx_ids);
>  
>  	guc_create_log(guc);
> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
> index 5ba5866..8229522 100644
> --- a/drivers/gpu/drm/i915/intel_guc.h
> +++ b/drivers/gpu/drm/i915/intel_guc.h
> @@ -42,8 +42,6 @@ struct i915_guc_client {
>  
>  	uint32_t wq_offset;
>  	uint32_t wq_size;
> -
> -	spinlock_t wq_lock;		/* Protects all data below	*/
>  	uint32_t wq_tail;
>  
>  	/* GuC submission statistics & status */
> @@ -95,8 +93,6 @@ struct intel_guc {
>  
>  	struct i915_guc_client *execbuf_client;
>  
> -	spinlock_t host2guc_lock;	/* Protects all data below	*/
> -
>  	DECLARE_BITMAP(doorbell_bitmap, GUC_MAX_DOORBELLS);
>  	uint32_t db_cacheline;		/* Cyclic counter mod pagesize	*/
>  
> -- 
> 2.5.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2015-12-03  0:56     ` [PATCH v4] " yu.dai
  2015-12-03  7:52       ` Daniel Vetter
@ 2016-01-22 10:54       ` Tvrtko Ursulin
  2016-01-25 16:01         ` Daniel Vetter
  1 sibling, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-22 10:54 UTC (permalink / raw)
  To: yu.dai, intel-gfx


On 03/12/15 00:56, yu.dai@intel.com wrote:
> From: Alex Dai <yu.dai@intel.com>
> 
> For now, remove the spinlocks that protected the GuC's
> statistics block and work queue; they are only accessed
> by code that already holds the global struct_mutex, and
> so are redundant (until the big struct_mutex rewrite!).
> 
> The specific problem that the spinlocks caused was that
> if the work queue was full, the driver would try to
> spinwait for one jiffy, but with interrupts disabled the
> jiffy count would not advance, leading to a system hang.
> The issue was found using test case igt/gem_close_race.
> 
> The new version will usleep() instead, still holding
> the struct_mutex but without any spinlocks.
> 
> v4: Reorganize commit message (Dave Gordon)
> v3: Remove unnecessary whitespace churn
> v2: Clean up wq_lock too
> v1: Clean up host2guc lock as well
> 
> Signed-off-by: Alex Dai <yu.dai@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
>   drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
>   drivers/gpu/drm/i915/intel_guc.h           |  4 ----
>   3 files changed, 12 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a728ff1..d6b7817 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
>   	if (!HAS_GUC_SCHED(dev_priv->dev))
>   		return 0;
>   
> +	if (mutex_lock_interruptible(&dev->struct_mutex))
> +		return 0;
> +
>   	/* Take a local copy of the GuC data, so we can dump it at leisure */
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>   	guc = dev_priv->guc;
> -	if (guc.execbuf_client) {
> -		spin_lock(&guc.execbuf_client->wq_lock);
> +	if (guc.execbuf_client)
>   		client = *guc.execbuf_client;
> -		spin_unlock(&guc.execbuf_client->wq_lock);
> -	}
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
> +
> +	mutex_unlock(&dev->struct_mutex);
>   
>   	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
>   	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index ed9f100..a7f9785 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   		return -EINVAL;
>   
>   	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> -	spin_lock(&dev_priv->guc.host2guc_lock);
>   
>   	dev_priv->guc.action_count += 1;
>   	dev_priv->guc.action_cmd = data[0];
> @@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>   	}
>   	dev_priv->guc.action_status = status;
>   
> -	spin_unlock(&dev_priv->guc.host2guc_lock);
>   	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>   
>   	return ret;
> @@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
>   	const uint32_t cacheline_size = cache_line_size();
>   	uint32_t offset;
>   
> -	spin_lock(&guc->host2guc_lock);
> -
>   	/* Doorbell uses a single cache line within a page */
>   	offset = offset_in_page(guc->db_cacheline);
>   
>   	/* Moving to next cache line to reduce contention */
>   	guc->db_cacheline += cacheline_size;
>   
> -	spin_unlock(&guc->host2guc_lock);
> -
>   	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
>   			offset, guc->db_cacheline, cacheline_size);
>   
> @@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>   	const uint16_t end = start + half;
>   	uint16_t id;
>   
> -	spin_lock(&guc->host2guc_lock);
>   	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
>   	if (id == end)
>   		id = GUC_INVALID_DOORBELL_ID;
>   	else
>   		bitmap_set(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>   
>   	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
>   			hi_pri ? "high" : "normal", id);
> @@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>   
>   static void release_doorbell(struct intel_guc *guc, uint16_t id)
>   {
> -	spin_lock(&guc->host2guc_lock);
>   	bitmap_clear(guc->doorbell_bitmap, id, 1);
> -	spin_unlock(&guc->host2guc_lock);
>   }
>   
>   /*
> @@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>   	struct guc_process_desc *desc;
>   	void *base;
>   	u32 size = sizeof(struct guc_wq_item);
> -	int ret = 0, timeout_counter = 200;
> +	int ret = -ETIMEDOUT, timeout_counter = 200;
>   
>   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>   	desc = base + gc->proc_desc_offset;
>   
>   	while (timeout_counter-- > 0) {
> -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> -				gc->wq_size) >= size, 1);
> -
> -		if (!ret) {
> +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
>   			*offset = gc->wq_tail;
>   
>   			/* advance the tail for next workqueue item */
> @@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>   
>   			/* this will break the loop */
>   			timeout_counter = 0;
> +			ret = 0;
>   		}
> +
> +		if (timeout_counter)
> +			usleep_range(1000, 2000);

Sleeping is illegal while holding kmap_atomic mapping:

[   34.098798] BUG: scheduling while atomic: gem_close_race/1941/0x00000002
[   34.098822] Modules linked in: hid_generic usbhid i915 asix usbnet libphy mii i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid hid video pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1 e1000e ptp psmouse pps_core ahci libahci
[   34.098824] CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G     U          4.4.0-160121+ #123
[   34.098824] Hardware name: Intel Corporation Skylake Client platform/Skylake AIO DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
[   34.098825]  0000000000013e40 ffff880166c27a78 ffffffff81280d02 ffff880172c13e40
[   34.098826]  ffff880166c27a88 ffffffff810c203a ffff880166c27ac8 ffffffff814ec808
[   34.098827]  ffff88016b7c6000 ffff880166c28000 00000000000f4240 0000000000000001
[   34.098827] Call Trace:
[   34.098831]  [<ffffffff81280d02>] dump_stack+0x4b/0x79
[   34.098833]  [<ffffffff810c203a>] __schedule_bug+0x41/0x4f
[   34.098834]  [<ffffffff814ec808>] __schedule+0x5a8/0x690
[   34.098835]  [<ffffffff814ec927>] schedule+0x37/0x80
[   34.098836]  [<ffffffff814ef3fd>] schedule_hrtimeout_range_clock+0xad/0x130
[   34.098837]  [<ffffffff81090be0>] ? hrtimer_init+0x10/0x10
[   34.098838]  [<ffffffff814ef3f1>] ? schedule_hrtimeout_range_clock+0xa1/0x130
[   34.098839]  [<ffffffff814ef48e>] schedule_hrtimeout_range+0xe/0x10
[   34.098840]  [<ffffffff814eef9b>] usleep_range+0x3b/0x40
[   34.098853]  [<ffffffffa01ec109>] i915_guc_wq_check_space+0x119/0x210 [i915]
[   34.098861]  [<ffffffffa01da47c>] intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
[   34.098869]  [<ffffffffa01cdbf1>] i915_gem_request_alloc+0x91/0x170 [i915]
[   34.098875]  [<ffffffffa01c1c07>] i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
[   34.098882]  [<ffffffffa01cb785>] ? i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
[   34.098889]  [<ffffffffa01d1fb6>] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
[   34.098895]  [<ffffffffa01c2e68>] i915_gem_execbuffer2+0xa8/0x250 [i915]
[   34.098900]  [<ffffffffa00f65d8>] drm_ioctl+0x258/0x4f0 [drm]
[   34.098906]  [<ffffffffa01c2dc0>] ? i915_gem_execbuffer+0x340/0x340 [i915]
[   34.098908]  [<ffffffff8111590d>] do_vfs_ioctl+0x2cd/0x4a0
[   34.098909]  [<ffffffff8111eac2>] ? __fget+0x72/0xb0
[   34.098910]  [<ffffffff81115b1c>] SyS_ioctl+0x3c/0x70
[   34.098911]  [<ffffffff814effd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
[   34.100208] ------------[ cut here ]------------

I don't know the code well enough to suggest a suitable fix.

mdelay certainly seems out of the question. So maybe move the kmap so it holds
the mapping across the lifetime of the page it is mapping, or wherever is
convenient?

Regards,

Tvrtko



* Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2016-01-22 10:54       ` Tvrtko Ursulin
@ 2016-01-25 16:01         ` Daniel Vetter
  2016-01-26 11:53           ` Tvrtko Ursulin
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2016-01-25 16:01 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Fri, Jan 22, 2016 at 10:54:42AM +0000, Tvrtko Ursulin wrote:
> 
> On 03/12/15 00:56, yu.dai@intel.com wrote:
> > From: Alex Dai <yu.dai@intel.com>
> > 
> > For now, remove the spinlocks that protected the GuC's
> > statistics block and work queue; they are only accessed
> > by code that already holds the global struct_mutex, and
> > so are redundant (until the big struct_mutex rewrite!).
> > 
> > The specific problem that the spinlocks caused was that
> > if the work queue was full, the driver would try to
> > spinwait for one jiffy, but with interrupts disabled the
> > jiffy count would not advance, leading to a system hang.
> > The issue was found using test case igt/gem_close_race.
> > 
> > The new version will usleep() instead, still holding
> > the struct_mutex but without any spinlocks.
> > 
> > v4: Reorganize commit message (Dave Gordon)
> > v3: Remove unnecessary whitespace churn
> > v2: Clean up wq_lock too
> > v1: Clean up host2guc lock as well
> > 
> > Signed-off-by: Alex Dai <yu.dai@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
> >   drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
> >   drivers/gpu/drm/i915/intel_guc.h           |  4 ----
> >   3 files changed, 12 insertions(+), 35 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index a728ff1..d6b7817 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
> >   	if (!HAS_GUC_SCHED(dev_priv->dev))
> >   		return 0;
> >   
> > +	if (mutex_lock_interruptible(&dev->struct_mutex))
> > +		return 0;
> > +
> >   	/* Take a local copy of the GuC data, so we can dump it at leisure */
> > -	spin_lock(&dev_priv->guc.host2guc_lock);
> >   	guc = dev_priv->guc;
> > -	if (guc.execbuf_client) {
> > -		spin_lock(&guc.execbuf_client->wq_lock);
> > +	if (guc.execbuf_client)
> >   		client = *guc.execbuf_client;
> > -		spin_unlock(&guc.execbuf_client->wq_lock);
> > -	}
> > -	spin_unlock(&dev_priv->guc.host2guc_lock);
> > +
> > +	mutex_unlock(&dev->struct_mutex);
> >   
> >   	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
> >   	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
> > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> > index ed9f100..a7f9785 100644
> > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > @@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
> >   		return -EINVAL;
> >   
> >   	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> > -	spin_lock(&dev_priv->guc.host2guc_lock);
> >   
> >   	dev_priv->guc.action_count += 1;
> >   	dev_priv->guc.action_cmd = data[0];
> > @@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
> >   	}
> >   	dev_priv->guc.action_status = status;
> >   
> > -	spin_unlock(&dev_priv->guc.host2guc_lock);
> >   	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> >   
> >   	return ret;
> > @@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
> >   	const uint32_t cacheline_size = cache_line_size();
> >   	uint32_t offset;
> >   
> > -	spin_lock(&guc->host2guc_lock);
> > -
> >   	/* Doorbell uses a single cache line within a page */
> >   	offset = offset_in_page(guc->db_cacheline);
> >   
> >   	/* Moving to next cache line to reduce contention */
> >   	guc->db_cacheline += cacheline_size;
> >   
> > -	spin_unlock(&guc->host2guc_lock);
> > -
> >   	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
> >   			offset, guc->db_cacheline, cacheline_size);
> >   
> > @@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
> >   	const uint16_t end = start + half;
> >   	uint16_t id;
> >   
> > -	spin_lock(&guc->host2guc_lock);
> >   	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
> >   	if (id == end)
> >   		id = GUC_INVALID_DOORBELL_ID;
> >   	else
> >   		bitmap_set(guc->doorbell_bitmap, id, 1);
> > -	spin_unlock(&guc->host2guc_lock);
> >   
> >   	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
> >   			hi_pri ? "high" : "normal", id);
> > @@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
> >   
> >   static void release_doorbell(struct intel_guc *guc, uint16_t id)
> >   {
> > -	spin_lock(&guc->host2guc_lock);
> >   	bitmap_clear(guc->doorbell_bitmap, id, 1);
> > -	spin_unlock(&guc->host2guc_lock);
> >   }
> >   
> >   /*
> > @@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> >   	struct guc_process_desc *desc;
> >   	void *base;
> >   	u32 size = sizeof(struct guc_wq_item);
> > -	int ret = 0, timeout_counter = 200;
> > +	int ret = -ETIMEDOUT, timeout_counter = 200;
> >   
> >   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
> >   	desc = base + gc->proc_desc_offset;
> >   
> >   	while (timeout_counter-- > 0) {
> > -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
> > -				gc->wq_size) >= size, 1);
> > -
> > -		if (!ret) {
> > +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
> >   			*offset = gc->wq_tail;
> >   
> >   			/* advance the tail for next workqueue item */
> > @@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
> >   
> >   			/* this will break the loop */
> >   			timeout_counter = 0;
> > +			ret = 0;
> >   		}
> > +
> > +		if (timeout_counter)
> > +			usleep_range(1000, 2000);
> 
> Sleeping is illegal while holding kmap_atomic mapping:
> 
> [   34.098798] BUG: scheduling while atomic: gem_close_race/1941/0x00000002
> [   34.098822] Modules linked in: hid_generic usbhid i915 asix usbnet libphy mii i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid hid video pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1 e1000e ptp psmouse pps_core ahci libahci
> [   34.098824] CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G     U          4.4.0-160121+ #123
> [   34.098824] Hardware name: Intel Corporation Skylake Client platform/Skylake AIO DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
> [   34.098825]  0000000000013e40 ffff880166c27a78 ffffffff81280d02 ffff880172c13e40
> [   34.098826]  ffff880166c27a88 ffffffff810c203a ffff880166c27ac8 ffffffff814ec808
> [   34.098827]  ffff88016b7c6000 ffff880166c28000 00000000000f4240 0000000000000001
> [   34.098827] Call Trace:
> [   34.098831]  [<ffffffff81280d02>] dump_stack+0x4b/0x79
> [   34.098833]  [<ffffffff810c203a>] __schedule_bug+0x41/0x4f
> [   34.098834]  [<ffffffff814ec808>] __schedule+0x5a8/0x690
> [   34.098835]  [<ffffffff814ec927>] schedule+0x37/0x80
> [   34.098836]  [<ffffffff814ef3fd>] schedule_hrtimeout_range_clock+0xad/0x130
> [   34.098837]  [<ffffffff81090be0>] ? hrtimer_init+0x10/0x10
> [   34.098838]  [<ffffffff814ef3f1>] ? schedule_hrtimeout_range_clock+0xa1/0x130
> [   34.098839]  [<ffffffff814ef48e>] schedule_hrtimeout_range+0xe/0x10
> [   34.098840]  [<ffffffff814eef9b>] usleep_range+0x3b/0x40
> [   34.098853]  [<ffffffffa01ec109>] i915_guc_wq_check_space+0x119/0x210 [i915]
> [   34.098861]  [<ffffffffa01da47c>] intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
> [   34.098869]  [<ffffffffa01cdbf1>] i915_gem_request_alloc+0x91/0x170 [i915]
> [   34.098875]  [<ffffffffa01c1c07>] i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
> [   34.098882]  [<ffffffffa01cb785>] ? i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
> [   34.098889]  [<ffffffffa01d1fb6>] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
> [   34.098895]  [<ffffffffa01c2e68>] i915_gem_execbuffer2+0xa8/0x250 [i915]
> [   34.098900]  [<ffffffffa00f65d8>] drm_ioctl+0x258/0x4f0 [drm]
> [   34.098906]  [<ffffffffa01c2dc0>] ? i915_gem_execbuffer+0x340/0x340 [i915]
> [   34.098908]  [<ffffffff8111590d>] do_vfs_ioctl+0x2cd/0x4a0
> [   34.098909]  [<ffffffff8111eac2>] ? __fget+0x72/0xb0
> [   34.098910]  [<ffffffff81115b1c>] SyS_ioctl+0x3c/0x70
> [   34.098911]  [<ffffffff814effd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
> [   34.100208] ------------[ cut here ]------------
> 
> I don't know the code well enough to suggest a suitable fix.

Code really, really, really should be run with full debug tooling enabled
(and the above is one of the most basic checks).

I think the best approach would be to:
- install GuC firmware on all CI machines. Please coordinate with
  Daniela&Tomi on this, and cc: intel-gfx-ci@eclists.intel.com on any such
  mails.

- Every patch series that touches still disabled code needs to include a
  patch to enable GuC (or whatever it is) by default. Of course that patch
  must have a DONTMERGE or similar tag, if the feature isn't 100% ready
  for prime time yet. This way we can enlist CI to hunt for stuff like the
  above, and it's _really_ good at that.

Alex&Tvrtko, can you please take care of this?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2016-01-25 16:01         ` Daniel Vetter
@ 2016-01-26 11:53           ` Tvrtko Ursulin
  2016-01-26 12:08             ` Chris Wilson
  0 siblings, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 11:53 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


On 25/01/16 16:01, Daniel Vetter wrote:
> On Fri, Jan 22, 2016 at 10:54:42AM +0000, Tvrtko Ursulin wrote:
>>
>> On 03/12/15 00:56, yu.dai@intel.com wrote:
>>> From: Alex Dai <yu.dai@intel.com>
>>>
>>> For now, remove the spinlocks that protected the GuC's
>>> statistics block and work queue; they are only accessed
>>> by code that already holds the global struct_mutex, and
>>> so are redundant (until the big struct_mutex rewrite!).
>>>
>>> The specific problem that the spinlocks caused was that
>>> if the work queue was full, the driver would try to
>>> spinwait for one jiffy, but with interrupts disabled the
>>> jiffy count would not advance, leading to a system hang.
>>> The issue was found using test case igt/gem_close_race.
>>>
>>> The new version will usleep() instead, still holding
>>> the struct_mutex but without any spinlocks.
>>>
>>> v4: Reorganize commit message (Dave Gordon)
>>> v3: Remove unnecessary whitespace churn
>>> v2: Clean up wq_lock too
>>> v1: Clean up host2guc lock as well
>>>
>>> Signed-off-by: Alex Dai <yu.dai@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
>>>    drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
>>>    drivers/gpu/drm/i915/intel_guc.h           |  4 ----
>>>    3 files changed, 12 insertions(+), 35 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index a728ff1..d6b7817 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
>>>    	if (!HAS_GUC_SCHED(dev_priv->dev))
>>>    		return 0;
>>>
>>> +	if (mutex_lock_interruptible(&dev->struct_mutex))
>>> +		return 0;
>>> +
>>>    	/* Take a local copy of the GuC data, so we can dump it at leisure */
>>> -	spin_lock(&dev_priv->guc.host2guc_lock);
>>>    	guc = dev_priv->guc;
>>> -	if (guc.execbuf_client) {
>>> -		spin_lock(&guc.execbuf_client->wq_lock);
>>> +	if (guc.execbuf_client)
>>>    		client = *guc.execbuf_client;
>>> -		spin_unlock(&guc.execbuf_client->wq_lock);
>>> -	}
>>> -	spin_unlock(&dev_priv->guc.host2guc_lock);
>>> +
>>> +	mutex_unlock(&dev->struct_mutex);
>>>
>>>    	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
>>>    	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> index ed9f100..a7f9785 100644
>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>> @@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>>>    		return -EINVAL;
>>>
>>>    	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
>>> -	spin_lock(&dev_priv->guc.host2guc_lock);
>>>
>>>    	dev_priv->guc.action_count += 1;
>>>    	dev_priv->guc.action_cmd = data[0];
>>> @@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
>>>    	}
>>>    	dev_priv->guc.action_status = status;
>>>
>>> -	spin_unlock(&dev_priv->guc.host2guc_lock);
>>>    	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
>>>
>>>    	return ret;
>>> @@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
>>>    	const uint32_t cacheline_size = cache_line_size();
>>>    	uint32_t offset;
>>>
>>> -	spin_lock(&guc->host2guc_lock);
>>> -
>>>    	/* Doorbell uses a single cache line within a page */
>>>    	offset = offset_in_page(guc->db_cacheline);
>>>
>>>    	/* Moving to next cache line to reduce contention */
>>>    	guc->db_cacheline += cacheline_size;
>>>
>>> -	spin_unlock(&guc->host2guc_lock);
>>> -
>>>    	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
>>>    			offset, guc->db_cacheline, cacheline_size);
>>>
>>> @@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>>>    	const uint16_t end = start + half;
>>>    	uint16_t id;
>>>
>>> -	spin_lock(&guc->host2guc_lock);
>>>    	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
>>>    	if (id == end)
>>>    		id = GUC_INVALID_DOORBELL_ID;
>>>    	else
>>>    		bitmap_set(guc->doorbell_bitmap, id, 1);
>>> -	spin_unlock(&guc->host2guc_lock);
>>>
>>>    	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
>>>    			hi_pri ? "high" : "normal", id);
>>> @@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
>>>
>>>    static void release_doorbell(struct intel_guc *guc, uint16_t id)
>>>    {
>>> -	spin_lock(&guc->host2guc_lock);
>>>    	bitmap_clear(guc->doorbell_bitmap, id, 1);
>>> -	spin_unlock(&guc->host2guc_lock);
>>>    }
>>>
>>>    /*
>>> @@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>>>    	struct guc_process_desc *desc;
>>>    	void *base;
>>>    	u32 size = sizeof(struct guc_wq_item);
>>> -	int ret = 0, timeout_counter = 200;
>>> +	int ret = -ETIMEDOUT, timeout_counter = 200;
>>>
>>>    	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
>>>    	desc = base + gc->proc_desc_offset;
>>>
>>>    	while (timeout_counter-- > 0) {
>>> -		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
>>> -				gc->wq_size) >= size, 1);
>>> -
>>> -		if (!ret) {
>>> +		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
>>>    			*offset = gc->wq_tail;
>>>
>>>    			/* advance the tail for next workqueue item */
>>> @@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
>>>
>>>    			/* this will break the loop */
>>>    			timeout_counter = 0;
>>> +			ret = 0;
>>>    		}
>>> +
>>> +		if (timeout_counter)
>>> +			usleep_range(1000, 2000);
>>
>> Sleeping is illegal while holding kmap_atomic mapping:
>>
>> [   34.098798] BUG: scheduling while atomic: gem_close_race/1941/0x00000002
>> [   34.098822] Modules linked in: hid_generic usbhid i915 asix usbnet libphy mii i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid hid video pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1 e1000e ptp psmouse pps_core ahci libahci
>> [   34.098824] CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G     U          4.4.0-160121+ #123
>> [   34.098824] Hardware name: Intel Corporation Skylake Client platform/Skylake AIO DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
>> [   34.098825]  0000000000013e40 ffff880166c27a78 ffffffff81280d02 ffff880172c13e40
>> [   34.098826]  ffff880166c27a88 ffffffff810c203a ffff880166c27ac8 ffffffff814ec808
>> [   34.098827]  ffff88016b7c6000 ffff880166c28000 00000000000f4240 0000000000000001
>> [   34.098827] Call Trace:
>> [   34.098831]  [<ffffffff81280d02>] dump_stack+0x4b/0x79
>> [   34.098833]  [<ffffffff810c203a>] __schedule_bug+0x41/0x4f
>> [   34.098834]  [<ffffffff814ec808>] __schedule+0x5a8/0x690
>> [   34.098835]  [<ffffffff814ec927>] schedule+0x37/0x80
>> [   34.098836]  [<ffffffff814ef3fd>] schedule_hrtimeout_range_clock+0xad/0x130
>> [   34.098837]  [<ffffffff81090be0>] ? hrtimer_init+0x10/0x10
>> [   34.098838]  [<ffffffff814ef3f1>] ? schedule_hrtimeout_range_clock+0xa1/0x130
>> [   34.098839]  [<ffffffff814ef48e>] schedule_hrtimeout_range+0xe/0x10
>> [   34.098840]  [<ffffffff814eef9b>] usleep_range+0x3b/0x40
>> [   34.098853]  [<ffffffffa01ec109>] i915_guc_wq_check_space+0x119/0x210 [i915]
>> [   34.098861]  [<ffffffffa01da47c>] intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
>> [   34.098869]  [<ffffffffa01cdbf1>] i915_gem_request_alloc+0x91/0x170 [i915]
>> [   34.098875]  [<ffffffffa01c1c07>] i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
>> [   34.098882]  [<ffffffffa01cb785>] ? i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
>> [   34.098889]  [<ffffffffa01d1fb6>] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
>> [   34.098895]  [<ffffffffa01c2e68>] i915_gem_execbuffer2+0xa8/0x250 [i915]
>> [   34.098900]  [<ffffffffa00f65d8>] drm_ioctl+0x258/0x4f0 [drm]
>> [   34.098906]  [<ffffffffa01c2dc0>] ? i915_gem_execbuffer+0x340/0x340 [i915]
>> [   34.098908]  [<ffffffff8111590d>] do_vfs_ioctl+0x2cd/0x4a0
>> [   34.098909]  [<ffffffff8111eac2>] ? __fget+0x72/0xb0
>> [   34.098910]  [<ffffffff81115b1c>] SyS_ioctl+0x3c/0x70
>> [   34.098911]  [<ffffffff814effd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
>> [   34.100208] ------------[ cut here ]------------
>>
>> I don't know the code well enough to suggest a suitable fix.
>
> Code really, really, really should be run with full debug tooling enabled
> (and the above is one of the most basic checks).
>
> I think the best approach would be to:
> - install GuC firmware on all CI machines. Please coordinate with
>    Daniela&Tomi on this, and cc: intel-gfx-ci@eclists.intel.com on any such
>    mails.
>
> - Every patch series that touches still disabled code needs to include a
>    patch to enable GuC (or whatever it is) by default. Of course that patch
>    must have a DONTMERGE or similar tag, if the feature isn't 100% ready
>    for prime time yet. This way we can enlist CI to hunt for stuff like the
>    above, and it's _really_ good at that.
>
> Alex&Tvrtko, can you please take care of this?

Chris Harris has liaised with Tomi and apparently GuC firmware is 
already on the lab SKLs from ~two weeks ago.

Since the firmware is on all machines, presumably we would also want to 
have CI runs without the GuC enabled to ensure patches are not breaking 
the execlists mode?

How would you go about that? Submit patch series twice? Once with the 
final "enable GuC" patch and one without?

Regards,

Tvrtko


* Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2016-01-26 11:53           ` Tvrtko Ursulin
@ 2016-01-26 12:08             ` Chris Wilson
  2016-01-26 16:54               ` Daniel Vetter
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Wilson @ 2016-01-26 12:08 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Jan 26, 2016 at 11:53:45AM +0000, Tvrtko Ursulin wrote:
> Chris Harris has liaised with Tomi and apparently GuC firmware is
> already on the lab SKLs from ~two weeks ago.
> 
> Since the firmware is on all machines, presumably we would also want
> to have CI runs without the GuC enabled to ensure patches are not
> breaking the execlists mode?
> 
> How would you go about that? Submit patch series twice? Once with
> the final "enable GuC" patch and one without?

After the enable GuC patch, execlists will still be triggered on
machines without the right firmware. We either remove the firmware from
the CI image for that run, or reboot with i915.enable_guc=0

Yes, we should be checking that both modes "still" function.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2016-01-26 12:08             ` Chris Wilson
@ 2016-01-26 16:54               ` Daniel Vetter
  2016-01-26 17:11                 ` Chris Wilson
  0 siblings, 1 reply; 24+ messages in thread
From: Daniel Vetter @ 2016-01-26 16:54 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Daniel Vetter, intel-gfx

On Tue, Jan 26, 2016 at 12:08:37PM +0000, Chris Wilson wrote:
> On Tue, Jan 26, 2016 at 11:53:45AM +0000, Tvrtko Ursulin wrote:
> > Chris Harris has liaised with Tomi and apparently GuC firmware is
> > already on the lab SKLs from ~two weeks ago.
> > 
> > Since the firmware is on all machines, presumably we would also want
> > to have CI runs without the GuC enabled to ensure patches are not
> > breaking the execlists mode?
> > 
> > How would you go about that? Submit patch series twice? Once with
> > the final "enable GuC" patch and one without?
> 
> After the enable GuC patch, execlists will still be triggered on
> machines without the right firmware. We either remove the firmware from
> the CI image for that run, or reboot with i915.enable_guc=0
> 
> Yes, we should be checking that both modes "still" function.

Since guc mode will be for skl only, bdw will still do execlists tests for
us. I think that should be good enough. But I've also heard that the
private patch submission is coming along, so you could submit a 2nd time
without the "enable guc" patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC
  2016-01-26 16:54               ` Daniel Vetter
@ 2016-01-26 17:11                 ` Chris Wilson
  0 siblings, 0 replies; 24+ messages in thread
From: Chris Wilson @ 2016-01-26 17:11 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, Jan 26, 2016 at 05:54:21PM +0100, Daniel Vetter wrote:
> On Tue, Jan 26, 2016 at 12:08:37PM +0000, Chris Wilson wrote:
> > On Tue, Jan 26, 2016 at 11:53:45AM +0000, Tvrtko Ursulin wrote:
> > > Chris Harris has liaised with Tomi and apparently GuC firmware is
> > > already on the lab SKLs from ~two weeks ago.
> > > 
> > > Since the firmware is on all machines, presumably we would also want
> > > to have CI runs without the GuC enabled to ensure patches are not
> > > breaking the execlists mode?
> > > 
> > > How would you go about that? Submit patch series twice? Once with
> > > the final "enable GuC" patch and one without?
> > 
> > After the enable GuC patch, execlists will still be triggered on
> > machines without the right firmware. We either remove the firmware from
> > the CI image for that run, or reboot with i915.enable_guc=0
> > 
> > Yes, we should be checking that both modes "still" function.
> 
> Since guc mode will be for skl only, bdw will still do execlists tests for
> us. I think that should be good enough. But I've also heard that the
> private patch submission is coming along, so you could submit a 2nd time
> without the "enable guc" patch.

Since skl doesn't pass igt, I think it is presumptuous to assume that will
be fixed first if the results are totally ignored.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


end of thread, other threads:[~2016-01-26 17:11 UTC | newest]

Thread overview: 24+ messages
-- links below jump to the message on this page --
2015-11-20  0:15 [PATCH] drm/i915/guc: Keep irq enabled during GuC cmd submission yu.dai
2015-11-20  8:45 ` Daniel Vetter
2015-11-23 23:02   ` [PATCH] drm/i915/guc: Move wait for GuC out of spinlock/unlock yu.dai
2015-11-24 13:04     ` Daniel Vetter
2015-11-24 13:26       ` Imre Deak
2015-11-24 17:00         ` Yu Dai
2015-11-24 17:05           ` Imre Deak
2015-11-24 18:08             ` Daniel Vetter
2015-11-24 18:36               ` Yu Dai
2015-11-24 19:13                 ` Daniel Vetter
2015-11-24 22:34                   ` Yu Dai
2015-11-25  8:45                     ` Daniel Vetter
2015-11-25 19:29     ` [PATCH v2] drm/i915/guc: Clean up locks in GuC yu.dai
2015-11-26  9:16       ` Nick Hoath
2015-12-01  0:15     ` [PATCH v3] " yu.dai
2015-12-02 19:50       ` Dave Gordon
2015-12-03  0:56     ` [PATCH v4] " yu.dai
2015-12-03  7:52       ` Daniel Vetter
2016-01-22 10:54       ` Tvrtko Ursulin
2016-01-25 16:01         ` Daniel Vetter
2016-01-26 11:53           ` Tvrtko Ursulin
2016-01-26 12:08             ` Chris Wilson
2016-01-26 16:54               ` Daniel Vetter
2016-01-26 17:11                 ` Chris Wilson
