* [PATCH] drm/i915: Take reference for signaling the request from hardirq
@ 2017-03-03 14:45 Chris Wilson
  2017-03-03 14:52 ` Mika Kuoppala
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Chris Wilson @ 2017-03-03 14:45 UTC (permalink / raw)
  To: intel-gfx

Being inside a spinlock while signaling that the hardware just
completed a request doesn't prevent a second thread from spotting that
the request is complete, freeing it and reallocating it! The code
currently tries to prevent this using RCU -- but that only prevents the
request from being freed; it doesn't prevent it from being reallocated,
and guarding against that requires us to take a reference.

[  206.922985] BUG: spinlock already unlocked on CPU#4, gem_exec_parall/7796
[  206.922994]  lock: 0xffff8801c6047120, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
[  206.923000] CPU: 4 PID: 7796 Comm: gem_exec_parall Not tainted 4.10.0-CI-Patchwork_4008+ #1
[  206.923006] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 1805 06/20/2016
[  206.923012] Call Trace:
[  206.923014]  <IRQ>
[  206.923019]  dump_stack+0x67/0x92
[  206.923023]  spin_dump+0x73/0xc0
[  206.923027]  do_raw_spin_unlock+0x79/0xb0
[  206.923031]  _raw_spin_unlock_irqrestore+0x27/0x60
[  206.923042]  dma_fence_signal+0x160/0x230
[  206.923060]  notify_ring+0xae/0x2e0 [i915]
[  206.923073]  ? ibx_hpd_irq_handler+0xc0/0xc0 [i915]
[  206.923086]  gen8_gt_irq_handler+0x219/0x290 [i915]
[  206.923100]  gen8_irq_handler+0x8e/0x6b0 [i915]
[  206.923105]  __handle_irq_event_percpu+0x58/0x370
[  206.923109]  handle_irq_event_percpu+0x1e/0x50
[  206.923113]  handle_irq_event+0x34/0x60
[  206.923117]  handle_edge_irq+0xbe/0x150
[  206.923122]  handle_irq+0x15/0x20
[  206.923126]  do_IRQ+0x63/0x130
[  206.923142]  ? i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923148]  common_interrupt+0x90/0x90
[  206.923153] RIP: 0010:osq_lock+0x77/0x110
[  206.923157] RSP: 0018:ffffc90001cabaa0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff6e
[  206.923164] RAX: 0000000000000000 RBX: ffff880236d1abc0 RCX: ffff8801ef642fc0
[  206.923169] RDX: ffff8801ef6427c0 RSI: ffffffff81c6e7fd RDI: ffffffff81c7c848
[  206.923175] RBP: ffffc90001cabab8 R08: 00000000692bb19b R09: 08c1493200000000
[  206.923180] R10: 0000000000000001 R11: 0000000000000001 R12: ffff880236cdabc0
[  206.923185] R13: ffff8802207f00b0 R14: ffffffffa00b7cd9 R15: ffff8802207f0070
[  206.923191]  </IRQ>
[  206.923206]  ? i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923213]  __mutex_lock+0x649/0x990
[  206.923217]  ? __mutex_lock+0xb0/0x990
[  206.923221]  ? _raw_spin_unlock+0x2c/0x50
[  206.923226]  ? __pm_runtime_resume+0x56/0x80
[  206.923242]  ? i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923249]  mutex_lock_interruptible_nested+0x16/0x20
[  206.923264]  i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923270]  ? __pm_runtime_resume+0x56/0x80
[  206.923285]  i915_gem_do_execbuffer.isra.15+0x442/0x1d10 [i915]
[  206.923291]  ? __lock_acquire+0x449/0x1b50
[  206.923296]  ? __might_fault+0x3e/0x90
[  206.923301]  ? __might_fault+0x87/0x90
[  206.923305]  ? __might_fault+0x3e/0x90
[  206.923320]  i915_gem_execbuffer2+0xb5/0x220 [i915]
[  206.923327]  drm_ioctl+0x200/0x450
[  206.923341]  ? i915_gem_execbuffer+0x330/0x330 [i915]
[  206.923348]  do_vfs_ioctl+0x90/0x6e0
[  206.923352]  ? __fget+0x108/0x200
[  206.923356]  ? expand_files+0x2b0/0x2b0
[  206.923361]  SyS_ioctl+0x3c/0x70
[  206.923365]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  206.923369] RIP: 0033:0x7fdd75fc6357
[  206.923373] RSP: 002b:00007fdd20e59bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  206.923380] RAX: ffffffffffffffda RBX: ffffffff81481ff3 RCX: 00007fdd75fc6357
[  206.923385] RDX: 00007fdd20e59c70 RSI: 0000000040406469 RDI: 0000000000000003
[  206.923390] RBP: ffffc90001cabf88 R08: 0000000000000040 R09: 00000000000003f7
[  206.923396] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  206.923401] R13: 0000000000000003 R14: 0000000040406469 R15: 0000000001cf9cb0
[  206.923408]  ? __this_cpu_preempt_check+0x13/0x20

Fixes: 56299fb7d904 ("drm/i915: Signal first fence from irq handler if complete")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100051
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 5fa2c4c56b09..73ab5abf20d7 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1039,8 +1039,6 @@ static void notify_ring(struct intel_engine_cs *engine)
 	atomic_inc(&engine->irq_count);
 	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
 
-	rcu_read_lock();
-
 	spin_lock(&engine->breadcrumbs.lock);
 	wait = engine->breadcrumbs.first_wait;
 	if (wait) {
@@ -1057,7 +1055,7 @@ static void notify_ring(struct intel_engine_cs *engine)
 		 */
 		if (i915_seqno_passed(intel_engine_get_seqno(engine),
 				      wait->seqno))
-			rq = wait->request;
+			rq = i915_gem_request_get(wait->request);
 
 		wake_up_process(wait->tsk);
 	} else {
@@ -1065,10 +1063,10 @@ static void notify_ring(struct intel_engine_cs *engine)
 	}
 	spin_unlock(&engine->breadcrumbs.lock);
 
-	if (rq)
+	if (rq) {
 		dma_fence_signal(&rq->fence);
-
-	rcu_read_unlock();
+		i915_gem_request_put(rq);
+	}
 
 	trace_intel_engine_notify(engine, wait);
 }
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH] drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 14:45 [PATCH] drm/i915: Take reference for signaling the request from hardirq Chris Wilson
@ 2017-03-03 14:52 ` Mika Kuoppala
  2017-03-03 14:59   ` Chris Wilson
  2017-03-03 15:00   ` Mika Kuoppala
  2017-03-03 14:52 ` Chris Wilson
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 10+ messages in thread
From: Mika Kuoppala @ 2017-03-03 14:52 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Being inside a spinlock while signaling that the hardware just
> completed a request doesn't prevent a second thread from spotting that
> the request is complete, freeing it and reallocating it! The code
> currently tries to prevent this using RCU -- but that only prevents the
> request from being freed; it doesn't prevent it from being reallocated,
> and guarding against that requires us to take a reference.

How can it be reallocated if it was never freed in the first place?
-Mika


* Re: [PATCH] drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 14:45 [PATCH] drm/i915: Take reference for signaling the request from hardirq Chris Wilson
  2017-03-03 14:52 ` Mika Kuoppala
@ 2017-03-03 14:52 ` Chris Wilson
  2017-03-03 14:57   ` Chris Wilson
  2017-03-03 15:18 ` Mika Kuoppala
  2017-03-03 15:47 ` ✓ Fi.CI.BAT: success for " Patchwork
  3 siblings, 1 reply; 10+ messages in thread
From: Chris Wilson @ 2017-03-03 14:52 UTC (permalink / raw)
  To: intel-gfx

On Fri, Mar 03, 2017 at 02:45:57PM +0000, Chris Wilson wrote:
> ---
>  drivers/gpu/drm/i915/i915_irq.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 5fa2c4c56b09..73ab5abf20d7 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1039,8 +1039,6 @@ static void notify_ring(struct intel_engine_cs *engine)
>  	atomic_inc(&engine->irq_count);
>  	set_bit(ENGINE_IRQ_BREADCRUMB, &engine->irq_posted);
>  
> -	rcu_read_lock();
> -
>  	spin_lock(&engine->breadcrumbs.lock);
>  	wait = engine->breadcrumbs.first_wait;
>  	if (wait) {
> @@ -1057,7 +1055,7 @@ static void notify_ring(struct intel_engine_cs *engine)
>  		 */
>  		if (i915_seqno_passed(intel_engine_get_seqno(engine),
>  				      wait->seqno))
> -			rq = wait->request;
> +			rq = i915_gem_request_get(wait->request);
>  
>  		wake_up_process(wait->tsk);
>  	} else {
> @@ -1065,10 +1063,10 @@ static void notify_ring(struct intel_engine_cs *engine)
>  	}
>  	spin_unlock(&engine->breadcrumbs.lock);
>  
> -	if (rq)
> +	if (rq) {
>  		dma_fence_signal(&rq->fence);
> -
> -	rcu_read_unlock();
> +		i915_gem_request_put(rq);
> +	}

The alternative would be to place spin_unlock_wait() in
i915_fence_release. The advantage of that would be avoiding the extra
work inside the interrupt handler.
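That alternative would look roughly like this. This is a sketch only, not
code that was posted: i915_fence_release() as the fence release callback,
the breadcrumbs lock being reachable through the request's engine, and the
kmem_cache_free() call are all assumptions inferred from the discussion.

```c
/* Sketch of the proposed alternative: instead of taking a reference in
 * the IRQ handler, make the final fence release drain any signaler that
 * is still inside the breadcrumbs lock before the request memory is
 * handed back to the slab. All names here are assumptions. */
static void i915_fence_release(struct dma_fence *fence)
{
	struct drm_i915_gem_request *req = to_request(fence);

	/* Wait for a concurrent notify_ring() critical section to end,
	 * so the request cannot be recycled underneath it. */
	spin_unlock_wait(&req->engine->breadcrumbs.lock);

	kmem_cache_free(req->i915->requests, req);
}
```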
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 14:52 ` Chris Wilson
@ 2017-03-03 14:57   ` Chris Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2017-03-03 14:57 UTC (permalink / raw)
  To: intel-gfx, Tvrtko Ursulin, Mika Kuoppala

On Fri, Mar 03, 2017 at 02:52:59PM +0000, Chris Wilson wrote:
> On Fri, Mar 03, 2017 at 02:45:57PM +0000, Chris Wilson wrote:
> > [snip]
> 
> The alternative would be to place spin_unlock_wait() in
> i915_fence_release. The advantage of that would be avoiding the extra
> work inside the interrupt handler.

That's not enough as there's an open window before we acquire the
fence->lock.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 14:52 ` Mika Kuoppala
@ 2017-03-03 14:59   ` Chris Wilson
  2017-03-03 15:00   ` Mika Kuoppala
  1 sibling, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2017-03-03 14:59 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Fri, Mar 03, 2017 at 04:52:52PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Being inside a spinlock while signaling that the hardware just
> > completed a request doesn't prevent a second thread from spotting that
> > the request is complete, freeing it and reallocating it! The code
> > currently tries to prevent this using RCU -- but that only prevents the
> > request from being freed; it doesn't prevent it from being reallocated,
> > and guarding against that requires us to take a reference.
> 
> How can it be reallocated if it was never freed in the first place?

We use a slab-cache. The pages of the slab themselves are guarded by
RCU, but we reallocate from the freelist immediately.
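In other words, for a slab that recycles objects immediately
(SLAB_DESTROY_BY_RCU semantics in this era), RCU keeps the memory valid but
not the object's identity, so a lock-free reader has to take a reference and
then revalidate. A sketch of that usual pattern follows; the field names are
illustrative rather than the exact i915 code:

```c
/* Lookup pattern for an object in an RCU-typesafe slab: the read-side
 * critical section keeps the memory from returning to the page
 * allocator, but the object may be freed and reallocated at any time,
 * so acquire a reference and then recheck that it is still the object
 * we wanted. Names here are assumptions for illustration. */
rcu_read_lock();
rq = READ_ONCE(wait->request);
if (rq && !dma_fence_get_rcu(&rq->fence))
	rq = NULL;			/* refcount hit zero: being freed */
if (rq && rq->fence.seqno != wait->seqno) {
	/* same memory, but already reallocated to a new request */
	i915_gem_request_put(rq);
	rq = NULL;
}
rcu_read_unlock();
```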
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 14:52 ` Mika Kuoppala
  2017-03-03 14:59   ` Chris Wilson
@ 2017-03-03 15:00   ` Mika Kuoppala
  2017-03-03 15:13     ` Chris Wilson
  1 sibling, 1 reply; 10+ messages in thread
From: Mika Kuoppala @ 2017-03-03 15:00 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Mika Kuoppala <mika.kuoppala@linux.intel.com> writes:

> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
>> Being inside a spinlock while signaling that the hardware just
>> completed a request doesn't prevent a second thread from spotting that
>> the request is complete, freeing it and reallocating it! The code
>> currently tries to prevent this using RCU -- but that only prevents the
>> request from being freed; it doesn't prevent it from being reallocated,
>> and guarding against that requires us to take a reference.
>
> How can it be reallocated if it was never freed in the first place?

Oh, you mean it only prevents the request backing storage from being freed?

-Mika


* Re: [PATCH] drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 15:00   ` Mika Kuoppala
@ 2017-03-03 15:13     ` Chris Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2017-03-03 15:13 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Fri, Mar 03, 2017 at 05:00:50PM +0200, Mika Kuoppala wrote:
> Mika Kuoppala <mika.kuoppala@linux.intel.com> writes:
> 
> > Chris Wilson <chris@chris-wilson.co.uk> writes:
> >
> >> Being inside a spinlock while signaling that the hardware just
> >> completed a request doesn't prevent a second thread from spotting that
> >> the request is complete, freeing it and reallocating it! The code
> >> currently tries to prevent this using RCU -- but that only prevents the
> >> request from being freed; it doesn't prevent it from being reallocated,
> >> and guarding against that requires us to take a reference.
> >
> > How can it be reallocated if it was never freed in the first place?
> 
> Oh request backing storage from being freed?

Yes. "prevents the request slab from being freed, it doesn't prevent us
from reallocating an individual request".
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH] drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 14:45 [PATCH] drm/i915: Take reference for signaling the request from hardirq Chris Wilson
  2017-03-03 14:52 ` Mika Kuoppala
  2017-03-03 14:52 ` Chris Wilson
@ 2017-03-03 15:18 ` Mika Kuoppala
  2017-03-03 15:47 ` ✓ Fi.CI.BAT: success for " Patchwork
  3 siblings, 0 replies; 10+ messages in thread
From: Mika Kuoppala @ 2017-03-03 15:18 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Being inside a spinlock while signaling that the hardware just
> completed a request doesn't prevent a second thread from spotting that
> the request is complete, freeing it and reallocating it! The code
> currently tries to prevent this using RCU -- but that only prevents the
> request from being freed; it doesn't prevent it from being reallocated,
> and guarding against that requires us to take a reference.
>
> [snip]
>
> Fixes: 56299fb7d904 ("drm/i915: Signal first fence from irq handler if complete")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100051
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 14:45 [PATCH] drm/i915: Take reference for signaling the request from hardirq Chris Wilson
                   ` (2 preceding siblings ...)
  2017-03-03 15:18 ` Mika Kuoppala
@ 2017-03-03 15:47 ` Patchwork
  2017-03-03 16:00   ` Chris Wilson
  3 siblings, 1 reply; 10+ messages in thread
From: Patchwork @ 2017-03-03 15:47 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Take reference for signaling the request from hardirq
URL   : https://patchwork.freedesktop.org/series/20629/
State : success

== Summary ==

Series 20629v1 drm/i915: Take reference for signaling the request from hardirq
https://patchwork.freedesktop.org/api/1.0/series/20629/revisions/1/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                pass       -> FAIL       (fi-snb-2600) fdo#100007
Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b-frame-sequence:
                fail       -> PASS       (fi-skl-6770hq) fdo#99788

fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007
fdo#99788 https://bugs.freedesktop.org/show_bug.cgi?id=99788

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11 
fi-bsw-n3050     total:278  pass:239  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19 
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20 
fi-byt-j1900     total:278  pass:251  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:278  pass:247  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16 
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16 
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50 
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18 
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18 
fi-kbl-7500u     total:278  pass:259  dwarn:1   dfail:0   fail:0   skip:18 
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10 
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17 
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18 
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10 
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28 
fi-snb-2600      total:278  pass:248  dwarn:0   dfail:0   fail:1   skip:29 

d84f6bf3c67330736e0793ca429631c6081b0625 drm-tip: 2017y-03m-03d-14h-51m-19s UTC integration manifest
22ebaf6 drm/i915: Take reference for signaling the request from hardirq

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4055/


* Re: ✓ Fi.CI.BAT: success for drm/i915: Take reference for signaling the request from hardirq
  2017-03-03 15:47 ` ✓ Fi.CI.BAT: success for " Patchwork
@ 2017-03-03 16:00   ` Chris Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Wilson @ 2017-03-03 16:00 UTC (permalink / raw)
  To: intel-gfx

On Fri, Mar 03, 2017 at 03:47:51PM -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Take reference for signaling the request from hardirq
> URL   : https://patchwork.freedesktop.org/series/20629/
> State : success
> 
> == Summary ==
> 
> Series 20629v1 drm/i915: Take reference for signaling the request from hardirq
> https://patchwork.freedesktop.org/api/1.0/series/20629/revisions/1/mbox/
> 
> Test gem_exec_flush:
>         Subgroup basic-batch-kernel-default-uc:
>                 pass       -> FAIL       (fi-snb-2600) fdo#100007

See the other breadcrumbs series!

> Test kms_pipe_crc_basic:
>         Subgroup read-crc-pipe-b-frame-sequence:
>                 fail       -> PASS       (fi-skl-6770hq) fdo#99788
> 
> fdo#100007 https://bugs.freedesktop.org/show_bug.cgi?id=100007
> fdo#99788 https://bugs.freedesktop.org/show_bug.cgi?id=99788

Thanks for the bug report and reviewing,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


end of thread, other threads:[~2017-03-03 16:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-03 14:45 [PATCH] drm/i915: Take reference for signaling the request from hardirq Chris Wilson
2017-03-03 14:52 ` Mika Kuoppala
2017-03-03 14:59   ` Chris Wilson
2017-03-03 15:00   ` Mika Kuoppala
2017-03-03 15:13     ` Chris Wilson
2017-03-03 14:52 ` Chris Wilson
2017-03-03 14:57   ` Chris Wilson
2017-03-03 15:18 ` Mika Kuoppala
2017-03-03 15:47 ` ✓ Fi.CI.BAT: success for " Patchwork
2017-03-03 16:00   ` Chris Wilson
