intel-gfx.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] drm/i915: Fix rps irq warning
@ 2011-09-04  3:24 Ben Widawsky
  2011-09-04  9:03 ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04  3:24 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

I couldn't reproduce this one, but...  Interrupt mask state is lost if
three interrupts occur before the workqueue has run.

Should be straight forward to reproduce even without SMP. I'm pretty
sure Dan Vetter was trying to explain this to me, and I couldn't get it.
My solution I think is different than his though.

Here is a sample failure:
  <pm irq 1<<0 occurs>
  local iir = 1
  spin lock
  mask_pm_irq(1 << 0);
  dev_priv->iir = 1;
  spin_unlock
  clear_pm_irq(1 << 0);

  <pm irq 1 << 1 occurs>
  local iir = 2
  spin lock
  mask_pm_irq(1 << 1); // here is the bug
  dev_priv->iir = 3;
  clear_pm_irq(1 << 1);

  <pm irq 1 << 0 occurs again>
  local iir = 1
  spin lock
  WARN!!!

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9cbb0cd..55518e3 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -649,8 +649,8 @@ static irqreturn_t ironlake_irq_handler(DRM_IRQ_ARGS)
 		unsigned long flags;
 		spin_lock_irqsave(&dev_priv->rps_lock, flags);
 		WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
-		I915_WRITE(GEN6_PMIMR, pm_iir);
 		dev_priv->pm_iir |= pm_iir;
+		I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
 		spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
 		queue_work(dev_priv->wq, &dev_priv->rps_work);
 	}
-- 
1.7.6.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: Fix rps irq warning
  2011-09-04  3:24 [PATCH] drm/i915: Fix rps irq warning Ben Widawsky
@ 2011-09-04  9:03 ` Chris Wilson
  2011-09-04 15:49   ` Ben Widawsky
  0 siblings, 1 reply; 22+ messages in thread
From: Chris Wilson @ 2011-09-04  9:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

On Sat,  3 Sep 2011 20:24:15 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> I couldn't reproduce this one, but...  Interrupt mask state is lost if
> three interrupts occur before the workqueue has run.
> 
> Should be straight forward to reproduce even without SMP. I'm pretty
> sure Dan Vetter was trying to explain this to me, and I couldn't get it.
> My solution I think is different than his though.

This logic is now duplicated in ivybridge_irq_handler(). This simply fits
the scenario Daniel described, whilst also fitting in with our
understanding of IMR, IER and IIR. (A big assumption ;-)

Reported-by: Soeren Sonnenburg <sonne@debian.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/3] slaughter rps races some more
  2011-09-04 15:49   ` Ben Widawsky
@ 2011-09-04 15:34     ` Daniel Vetter
  2011-09-04 15:35       ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
                         ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 15:34 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

Hi all,

Let's have some new ideas for races around the rps code.

Not yet tested (I can't hit the WARN anyway in my testing), I've just
wanted to get this out for discussion.

Yours, Daniel

Daniel Vetter (3):
  drm/i915: close PM interrupt masking races in the irq handler
  drm/i915: close PM interrupt masking races in the rps work func
  drm/i915: close rps work vs. rps disable races

 drivers/gpu/drm/i915/i915_irq.c      |    8 +++++---
 drivers/gpu/drm/i915/intel_display.c |    8 +++++++-
 2 files changed, 12 insertions(+), 4 deletions(-)

-- 
1.7.6

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
  2011-09-04 15:34     ` [PATCH 0/3] slaughter rps races some more Daniel Vetter
@ 2011-09-04 15:35       ` Daniel Vetter
  2011-09-04 17:09         ` Ben Widawsky
  2011-09-04 15:35       ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
  2011-09-04 15:35       ` [PATCH 3/3] drm/i915: close rps work vs. rps disable races Daniel Vetter
  2 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 15:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

Quoting Chris Wilson's more concise description:

"Ah I think I see the problem. As you point out we only mask the
current interrupt received, so that if we have a task pending (and so
IMR != 0) we actually unmask the pending interrupt and so could
receive it again before the tasklet is finally kicked off by the
grumpy scheduler."

So we need the hw to issue PM interrupts A, B, A while the scheduler
is hating us and refuses to run the rps work item. On receiving PM
interrupt A we hit the
WARN because

dev_priv->pm_iir == PM_A | PM_B

Also add a posting read as suggested by Chris to ensure proper
ordering of the writes to PMIMR and PMIIR. Just in case somebody
weakens write ordering.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_irq.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9cbb0cd..2fdd9f9 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -536,8 +536,9 @@ static irqreturn_t ivybridge_irq_handler(DRM_IRQ_ARGS)
 		unsigned long flags;
 		spin_lock_irqsave(&dev_priv->rps_lock, flags);
 		WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
-		I915_WRITE(GEN6_PMIMR, pm_iir);
 		dev_priv->pm_iir |= pm_iir;
+		I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
+		POSTING_READ(GEN6_PMIMR);
 		spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
 		queue_work(dev_priv->wq, &dev_priv->rps_work);
 	}
@@ -649,8 +650,9 @@ static irqreturn_t ironlake_irq_handler(DRM_IRQ_ARGS)
 		unsigned long flags;
 		spin_lock_irqsave(&dev_priv->rps_lock, flags);
 		WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
-		I915_WRITE(GEN6_PMIMR, pm_iir);
 		dev_priv->pm_iir |= pm_iir;
+		I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
+		POSTING_READ(GEN6_PMIMR);
 		spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
 		queue_work(dev_priv->wq, &dev_priv->rps_work);
 	}
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-04 15:34     ` [PATCH 0/3] slaughter rps races some more Daniel Vetter
  2011-09-04 15:35       ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
@ 2011-09-04 15:35       ` Daniel Vetter
  2011-09-04 17:08         ` Ben Widawsky
  2011-09-04 15:35       ` [PATCH 3/3] drm/i915: close rps work vs. rps disable races Daniel Vetter
  2 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 15:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

This patch closes the following race:

We get a PM interrupt A, mask it, set dev_priv->iir = PM_A and kick of the
work item. Scheduler isn't grumpy, so the work queue takes rps_lock,
grabs pm_iir = dev_priv->pm_iir and pm_imr = READ(PMIMR). Note that
pm_imr == pm_iir because we've just masked the interrupt we've got.

Now hw sends out PM interrupt B (not masked), we process it and mask
it.  Later on the irq handler also clears PMIIR.

Then the work item proceeds and at the end clears PMIMR. Because
(local) pm_imr == pm_iir we have
        pm_imr & ~pm_iir == 0
so all interrupts are enabled.

Hardware is still interrupt-happy, and sends out a new PM interrupt B.
PMIMR doesn't mask B (it does not mask anything), PMIIR is cleared, so
we get it and hit the WARN in the interrupt handler (because
dev_priv->pm_iir == PM_B).

That's why I've moved the
        WRITE(PMIMR, 0)
up under the protection of the rps_lock. And write an uncoditional 0
to PMIMR, because that's what we'll do anyway.

This races looks much more likely because we can arbitrarily extend
the window by grabing dev->struct mutex right after the irq handler
has processed the first PM_B interrupt.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_irq.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2fdd9f9..21ebcbd 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -383,6 +383,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	pm_iir = dev_priv->pm_iir;
 	dev_priv->pm_iir = 0;
 	pm_imr = I915_READ(GEN6_PMIMR);
+	I915_WRITE(GEN6_PMIMR, 0);
 	spin_unlock_irq(&dev_priv->rps_lock);
 
 	if (!pm_iir)
@@ -420,7 +421,6 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	 * an *extremely* unlikely race with gen6_rps_enable() that is prevented
 	 * by holding struct_mutex for the duration of the write.
 	 */
-	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
 	mutex_unlock(&dev_priv->dev->struct_mutex);
 }
 
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/3] drm/i915: close rps work vs. rps disable races
  2011-09-04 15:34     ` [PATCH 0/3] slaughter rps races some more Daniel Vetter
  2011-09-04 15:35       ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
  2011-09-04 15:35       ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
@ 2011-09-04 15:35       ` Daniel Vetter
  2011-09-04 17:23         ` Ben Widawsky
  2 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 15:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

The rps disabling code wasn't properly cancelling outstanding work
items. Also add a comment that explains why we're not racing with
the work item, that could again unmask interrupts.

This also fixes a bug on module unload because nothing was properly
syncing up with that work item, possibly leading to it being run after
freeing dev_priv or removing the module code.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_display.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 56a8554..ccd4600 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7483,10 +7483,16 @@ void gen6_disable_rps(struct drm_device *dev)
 
 	I915_WRITE(GEN6_RPNSWREQ, 1 << 31);
 	I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
-	I915_WRITE(GEN6_PMIER, 0);
+	/* Complete PM interrupt masking here doesn't race with the rps work
+	 * item again unmasking PM interrupts because that is using PMIMR
+	 * (logically sitting atop of PMINTRMSK) to mask interrupts. */
+	cancel_work_sync(&dev_priv->rps_work);
 
+	/* Clear PMIMR and dev_priv->pm_iir in case outstanding work gets
+	 * cancelled before having run. */
 	spin_lock_irq(&dev_priv->rps_lock);
 	dev_priv->pm_iir = 0;
+	I915_WRITE(GEN6_PMIER, 0);
 	spin_unlock_irq(&dev_priv->rps_lock);
 
 	I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR));
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: Fix rps irq warning
  2011-09-04  9:03 ` Chris Wilson
@ 2011-09-04 15:49   ` Ben Widawsky
  2011-09-04 15:34     ` [PATCH 0/3] slaughter rps races some more Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04 15:49 UTC (permalink / raw)
  To: intel-gfx

On Sun, 04 Sep 2011 10:03:21 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:

> On Sat,  3 Sep 2011 20:24:15 -0700, Ben Widawsky <ben@bwidawsk.net>
> wrote:
> > I couldn't reproduce this one, but...  Interrupt mask state is lost
> > if three interrupts occur before the workqueue has run.
> > 
> > Should be straight forward to reproduce even without SMP. I'm pretty
> > sure Dan Vetter was trying to explain this to me, and I couldn't
> > get it. My solution I think is different than his though.
> 
> This logic is now duplicated in ivybridge_irq_handler(). This simply
> fits the scenario Daniel described, whilst also fitting in with our
> understanding of IMR, IER and IIR. (A big assumption ;-)
> 
> Reported-by: Soeren Sonnenburg <sonne@debian.org>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
> -Chris
> 

I completely screwed this one up because I was tired. Chris sent me this
exact patch, which I told him was wrong, and then proceeded  to rewrite
the same thing on my own. Chris, I suggest you resubmit what you sent me
privately with my reviewed by, and please accept my apology.

I am nak'ing my patch.

Ben

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-04 15:35       ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
@ 2011-09-04 17:08         ` Ben Widawsky
  2011-09-04 19:26           ` Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04 17:08 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Sun,  4 Sep 2011 17:35:01 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> This patch closes the following race:
> 
> We get a PM interrupt A, mask it, set dev_priv->iir = PM_A and kick
> of the work item. Scheduler isn't grumpy, so the work queue takes
> rps_lock, grabs pm_iir = dev_priv->pm_iir and pm_imr = READ(PMIMR).
> Note that pm_imr == pm_iir because we've just masked the interrupt
> we've got.
> 
> Now hw sends out PM interrupt B (not masked), we process it and mask
> it.  Later on the irq handler also clears PMIIR.
> 
> Then the work item proceeds and at the end clears PMIMR. Because
> (local) pm_imr == pm_iir we have
>         pm_imr & ~pm_iir == 0
> so all interrupts are enabled.
> 
> Hardware is still interrupt-happy, and sends out a new PM interrupt B.
> PMIMR doesn't mask B (it does not mask anything), PMIIR is cleared, so
> we get it and hit the WARN in the interrupt handler (because
> dev_priv->pm_iir == PM_B).
> 
> That's why I've moved the
>         WRITE(PMIMR, 0)
> up under the protection of the rps_lock. And write an uncoditional 0
> to PMIMR, because that's what we'll do anyway.
> 
> This races looks much more likely because we can arbitrarily extend
> the window by grabing dev->struct mutex right after the irq handler
> has processed the first PM_B interrupt.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_irq.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c
> b/drivers/gpu/drm/i915/i915_irq.c index 2fdd9f9..21ebcbd 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -383,6 +383,7 @@ static void gen6_pm_rps_work(struct work_struct
> *work) pm_iir = dev_priv->pm_iir;
>  	dev_priv->pm_iir = 0;
>  	pm_imr = I915_READ(GEN6_PMIMR);
> +	I915_WRITE(GEN6_PMIMR, 0);
>  	spin_unlock_irq(&dev_priv->rps_lock);
>  
>  	if (!pm_iir)
> @@ -420,7 +421,6 @@ static void gen6_pm_rps_work(struct work_struct
> *work)
>  	 * an *extremely* unlikely race with gen6_rps_enable() that
> is prevented
>  	 * by holding struct_mutex for the duration of the write.
>  	 */
> -	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
>  	mutex_unlock(&dev_priv->dev->struct_mutex);
>  }
>  

How about this:

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 55518e3..3bc1479 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -415,12 +415,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
        gen6_set_rps(dev_priv->dev, new_delay);
        dev_priv->cur_delay = new_delay;
 
-       /*
-        * rps_lock not held here because clearing is non-destructive. There is
-        * an *extremely* unlikely race with gen6_rps_enable() that is prevented
-        * by holding struct_mutex for the duration of the write.
-        */
-       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
+       I915_WRITE(GEN6_PMIMR, pm_imr & dev_priv->pm_iir);
        mutex_unlock(&dev_priv->dev->struct_mutex);
 }

Ben

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
  2011-09-04 15:35       ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
@ 2011-09-04 17:09         ` Ben Widawsky
  0 siblings, 0 replies; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04 17:09 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Sun,  4 Sep 2011 17:35:00 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Quoting Chris Wilson's more concise description:
> 
> "Ah I think I see the problem. As you point out we only mask the
> current interrupt received, so that if we have a task pending (and so
> IMR != 0) we actually unmask the pending interrupt and so could
> receive it again before the tasklet is finally kicked off by the
> grumpy scheduler."
> 
> So we need the hw to issue PM interrupts A, B, A while the scheduler
> is hating us and refuses to run the rps work item. On receiving PM
> interrupt A we hit the
> WARN because
> 
> dev_priv->pm_iir == PM_A | PM_B
> 
> Also add a posting read as suggested by Chris to ensure proper
> ordering of the writes to PMIMR and PMIIR. Just in case somebody
> weakens write ordering.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] drm/i915: close rps work vs. rps disable races
  2011-09-04 15:35       ` [PATCH 3/3] drm/i915: close rps work vs. rps disable races Daniel Vetter
@ 2011-09-04 17:23         ` Ben Widawsky
  2011-09-04 19:17           ` Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04 17:23 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Sun, Sep 04, 2011 at 05:35:02PM +0200, Daniel Vetter wrote:
> The rps disabling code wasn't properly cancelling outstanding work
> items. Also add a comment that explains why we're not racing with
> the work item, that could again unmask interrupts.
> 
> This also fixes a bug on module unload because nothing was properly
> syncing up with that work item, possibly leading to it being run after
> freeing dev_priv or removing the module code.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/intel_display.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 56a8554..ccd4600 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -7483,10 +7483,16 @@ void gen6_disable_rps(struct drm_device *dev)
>  
>  	I915_WRITE(GEN6_RPNSWREQ, 1 << 31);
>  	I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
> -	I915_WRITE(GEN6_PMIER, 0);
> +	/* Complete PM interrupt masking here doesn't race with the rps work
> +	 * item again unmasking PM interrupts because that is using PMIMR
> +	 * (logically sitting atop of PMINTRMSK) to mask interrupts. */
> +	cancel_work_sync(&dev_priv->rps_work);
>  
> +	/* Clear PMIMR and dev_priv->pm_iir in case outstanding work gets
> +	 * cancelled before having run. */

This comment is wrong, you are clearing PMIER, not IMR. IMR gets cleared
in enable() iirc

>  	spin_lock_irq(&dev_priv->rps_lock);
>  	dev_priv->pm_iir = 0;
> +	I915_WRITE(GEN6_PMIER, 0);
>  	spin_unlock_irq(&dev_priv->rps_lock);
>  
>  	I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR));
> -- 
> 1.7.6
> 

I'm not sure this actually fixes a problem. The existing code:
1. disables all interrupts (no more can occur).
2. sets pm_iir to 0 safe in rps lock
<workqueues can run at this point, but IMR has no effect with IER = 0>

I think you should do the cancel work sync somewhere in the code before
module unload (to be correct). I just don't think this fixes a race.

Ben

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] drm/i915: close rps work vs. rps disable races
  2011-09-04 17:23         ` Ben Widawsky
@ 2011-09-04 19:17           ` Daniel Vetter
  2011-09-04 19:50             ` Ben Widawsky
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 19:17 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 05:23:30PM +0000, Ben Widawsky wrote:
> On Sun, Sep 04, 2011 at 05:35:02PM +0200, Daniel Vetter wrote:
> > The rps disabling code wasn't properly cancelling outstanding work
> > items. Also add a comment that explains why we're not racing with
> > the work item, that could again unmask interrupts.
> > 
> > This also fixes a bug on module unload because nothing was properly
> > syncing up with that work item, possibly leading to it being run after
> > freeing dev_priv or removing the module code.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> > ---
> >  drivers/gpu/drm/i915/intel_display.c |    8 +++++++-
> >  1 files changed, 7 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 56a8554..ccd4600 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -7483,10 +7483,16 @@ void gen6_disable_rps(struct drm_device *dev)
> >  
> >  	I915_WRITE(GEN6_RPNSWREQ, 1 << 31);
> >  	I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
> > -	I915_WRITE(GEN6_PMIER, 0);
> > +	/* Complete PM interrupt masking here doesn't race with the rps work
> > +	 * item again unmasking PM interrupts because that is using PMIMR
> > +	 * (logically sitting atop of PMINTRMSK) to mask interrupts. */
> > +	cancel_work_sync(&dev_priv->rps_work);
> >  
> > +	/* Clear PMIMR and dev_priv->pm_iir in case outstanding work gets
> > +	 * cancelled before having run. */
> 
> This comment is wrong, you are clearing PMIER, not IMR. IMR gets cleared
> in enable() iirc

Bleh, now it's my turn to completely screw it up ;-) Still, I think some
comment that we muck around with different registers than the work item
would be good. And clearing PMIMR under the lock just for paranoia so
that setting dev_priv->pm_iir doesn't leave a strange interrupt mask
lingering around

> >  	spin_lock_irq(&dev_priv->rps_lock);
> >  	dev_priv->pm_iir = 0;
> > +	I915_WRITE(GEN6_PMIER, 0);
> >  	spin_unlock_irq(&dev_priv->rps_lock);
> >  
> >  	I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR));
> I'm not sure this actually fixes a problem. The existing code:
> 1. disables all interrupts (no more can occur).
> 2. sets pm_iir to 0 safe in rps lock
> <workqueues can run at this point, but IMR has no effect with IER = 0>
> 
> I think you should do the cancel work sync somewhere in the code before
> module unload (to be correct). I just don't think this fixes a race.

Well, we definitely need the cancel_work_sync in the unload path
somewhere. I prefer it in the rps disable function. In the context of the
other patches my patch description is a bit lousy because I've really only
wanted to fix the unload race (and the PMIMR inconsistency) with this
patch.

-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-04 17:08         ` Ben Widawsky
@ 2011-09-04 19:26           ` Daniel Vetter
  2011-09-04 19:56             ` Ben Widawsky
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 19:26 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 10:08:17AM -0700, Ben Widawsky wrote:
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 55518e3..3bc1479 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -415,12 +415,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
>         gen6_set_rps(dev_priv->dev, new_delay);
>         dev_priv->cur_delay = new_delay;
>  
> -       /*
> -        * rps_lock not held here because clearing is non-destructive. There is
> -        * an *extremely* unlikely race with gen6_rps_enable() that is prevented
> -        * by holding struct_mutex for the duration of the write.
> -        */
> -       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> +       I915_WRITE(GEN6_PMIMR, pm_imr & dev_priv->pm_iir);
>         mutex_unlock(&dev_priv->dev->struct_mutex);
>  }

For this to work we'd need to hold the rps_lock (to avoid racing with the
irq handler). But imo my approach is conceptually simpler: The work func
grabs all oustanding PM interrupts and then enables them again in one go
(protected by rps_lock). And because the dev_priv->wq workqueue is
single-threaded (no point in using multiple threads when all work items
grab dev->struct mutex) we also cannot make a mess by running work items
in the wrong order (or in parallel).

-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] drm/i915: close rps work vs. rps disable races
  2011-09-04 19:17           ` Daniel Vetter
@ 2011-09-04 19:50             ` Ben Widawsky
  2011-09-04 19:57               ` Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04 19:50 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 09:17:24PM +0200, Daniel Vetter wrote:
> > >  	spin_lock_irq(&dev_priv->rps_lock);
> > >  	dev_priv->pm_iir = 0;
> > > +	I915_WRITE(GEN6_PMIER, 0);
> > >  	spin_unlock_irq(&dev_priv->rps_lock);
> > >  
> > >  	I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR));
> > I'm not sure this actually fixes a problem. The existing code:
> > 1. disables all interrupts (no more can occur).
> > 2. sets pm_iir to 0 safe in rps lock
> > <workqueues can run at this point, but IMR has no effect with IER = 0>
> > 
> > I think you should do the cancel work sync somewhere in the code before
> > module unload (to be correct). I just don't think this fixes a race.
> 
> Well, we definitely need the cancel_work_sync in the unload path
> somewhere. I prefer it in the rps disable function. In the context of the
> other patches my patch description is a bit lousy because I've really only
> wanted to fix the unload race (and the PMIMR inconsistency) with this
> patch.
> 
> -Daniel

I still fail to see the inconsitency. The value of PMR no longer matters once
IER is cleared. What the wq or irq handler does after this point should
be fine so long as everything is setup up properly at enable time. This
is a minor detail.

So I would rework some of your comments a bit, and I also think it's
about time we create a central place to cancel workqueue items. Mostly
because I think this patch is subject to a deadlock with struct_mutex
(you're holding it when you call gen6_disable_rps(), but you're doing
cancel_work_sync on a workqueue which attempts to acquire struct_mutex.

Ben

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-04 19:26           ` Daniel Vetter
@ 2011-09-04 19:56             ` Ben Widawsky
  2011-09-04 20:10               ` Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04 19:56 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 09:26:48PM +0200, Daniel Vetter wrote:
> On Sun, Sep 04, 2011 at 10:08:17AM -0700, Ben Widawsky wrote:
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 55518e3..3bc1479 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -415,12 +415,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
> >         gen6_set_rps(dev_priv->dev, new_delay);
> >         dev_priv->cur_delay = new_delay;
> >  
> > -       /*
> > -        * rps_lock not held here because clearing is non-destructive. There is
> > -        * an *extremely* unlikely race with gen6_rps_enable() that is prevented
> > -        * by holding struct_mutex for the duration of the write.
> > -        */
> > -       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> > +       I915_WRITE(GEN6_PMIMR, pm_imr & dev_priv->pm_iir);
> >         mutex_unlock(&dev_priv->dev->struct_mutex);
> >  }
> 
> For this to work we'd need to hold the rps_lock (to avoid racing with the
> irq handler). But imo my approach is conceptually simpler: The work func
> grabs all oustanding PM interrupts and then enables them again in one go
> (protected by rps_lock).

I agree your approach is similar, but I think we should really consider
whether my approach actually requires the lock. I *think* it doesn't. At
least in my head my patch should fix the error you spotted. I don't
know, maybe I need to think some more.

The reason I worked so hard to avoid doing it the way you did in my
original implementation is I was trying really hard to not break the
cardinal rule about minimizing time holding spinlock_irqs. To go with
the other patch, you probably want a POSTING_READ also before releasing
the spin_lock (though I think this is being a bit paranoid).

Again I need to think on this some more, but I'm also not terribly fond
of ever unconditionally setting a value to 0 as it tends to complicate
things when adding new readers or writers to the system.

In summary, let me think about whether my solution actually won't work
(feel free to contribute to my thinking), and if it doesn't then the
decision is easy.

Ben

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 3/3] drm/i915: close rps work vs. rps disable races
  2011-09-04 19:50             ` Ben Widawsky
@ 2011-09-04 19:57               ` Daniel Vetter
  2011-09-05  8:15                 ` [PATCH] drm/i915: properly cancel rps_work on module unload Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 19:57 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 07:50:08PM +0000, Ben Widawsky wrote:
> So I would rework some of your comments a bit, and I also think it's
> about time we create a central place to cancel workqueue items. Mostly
> because I think this patch is subject to a deadlock with struct_mutex
> (you're holding it when you call gen6_disable_rps(), but you're doing
> cancel_work_sync on a workqueue which attempts to acquire struct_mutex.

Oh dang, you're correct. This will deadlock. And the PMIMR is really just
cosmetic. So putting the cancel_work_sync into intel_modeset_cleanup
sounds like the right approach. And hey, I've already put other work
cancelling stuff there in my last fury at fixing module unload, so it
can't be that bad ;-)

I'll try to write a better patch tomorrow.
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-04 19:56             ` Ben Widawsky
@ 2011-09-04 20:10               ` Daniel Vetter
  2011-09-04 21:38                 ` Ben Widawsky
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-04 20:10 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 07:56:57PM +0000, Ben Widawsky wrote:
> On Sun, Sep 04, 2011 at 09:26:48PM +0200, Daniel Vetter wrote:
> > On Sun, Sep 04, 2011 at 10:08:17AM -0700, Ben Widawsky wrote:
> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > index 55518e3..3bc1479 100644
> > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > @@ -415,12 +415,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
> > >         gen6_set_rps(dev_priv->dev, new_delay);
> > >         dev_priv->cur_delay = new_delay;
> > >  
> > > -       /*
> > > -        * rps_lock not held here because clearing is non-destructive. There is
> > > -        * an *extremely* unlikely race with gen6_rps_enable() that is prevented
> > > -        * by holding struct_mutex for the duration of the write.
> > > -        */
> > > -       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> > > +       I915_WRITE(GEN6_PMIMR, pm_imr & dev_priv->pm_iir);
> > >         mutex_unlock(&dev_priv->dev->struct_mutex);
> > >  }
> > 
> > For this to work we'd need to hold the rps_lock (to avoid racing with the
> > irq handler). But imo my approach is conceptually simpler: The work func
> > grabs all oustanding PM interrupts and then enables them again in one go
> > (protected by rps_lock).
> 
> I agree your approach is similar, but I think we should really consider
> whether my approach actually requires the lock. I *think* it doesn't. At
> least in my head my patch should fix the error you spotted. I don't
> know, maybe I need to think some more.

1. rps work reads dev_priv->pm_iir (anew in the line you've added).
2. irq handler runs, adds a new bit to dev_priv->pm_iir and sets PMIMR to
dev_priv->pm_iir (under irqsafe rps_lock).
3. rps work writes crap to PMIMR.

I.e. same race, you've just dramatically reduced the window ;-)

> The reason I worked so hard to avoid doing it the way you did in my
> original implementation is I was trying really hard to not break the
> cardinal rule about minimizing time holding spinlock_irqs. To go with
> the other patch, you probably want a POSTING_READ also before releasing
> the spin_lock (though I think this is being a bit paranoid).

There POSTING_READ was to order the PMIMR write with the PMIIR write (both
in the irq handler). There's no such ordering here (and the irq handler
can't be interrupted) so I think we're save.

-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-04 20:10               ` Daniel Vetter
@ 2011-09-04 21:38                 ` Ben Widawsky
  2011-09-05  6:38                   ` Daniel Vetter
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-04 21:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 10:10:30PM +0200, Daniel Vetter wrote:
> On Sun, Sep 04, 2011 at 07:56:57PM +0000, Ben Widawsky wrote:
> > On Sun, Sep 04, 2011 at 09:26:48PM +0200, Daniel Vetter wrote:
> > > On Sun, Sep 04, 2011 at 10:08:17AM -0700, Ben Widawsky wrote:
> > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > index 55518e3..3bc1479 100644
> > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > @@ -415,12 +415,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
> > > >         gen6_set_rps(dev_priv->dev, new_delay);
> > > >         dev_priv->cur_delay = new_delay;
> > > >  
> > > > -       /*
> > > > -        * rps_lock not held here because clearing is non-destructive. There is
> > > > -        * an *extremely* unlikely race with gen6_rps_enable() that is prevented
> > > > -        * by holding struct_mutex for the duration of the write.
> > > > -        */
> > > > -       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> > > > +       I915_WRITE(GEN6_PMIMR, pm_imr & dev_priv->pm_iir);
> > > >         mutex_unlock(&dev_priv->dev->struct_mutex);
> > > >  }
> > > 
> > > For this to work we'd need to hold the rps_lock (to avoid racing with the
> > > irq handler). But imo my approach is conceptually simpler: The work func
> > > grabs all oustanding PM interrupts and then enables them again in one go
> > > (protected by rps_lock).
> > 
> > I agree your approach is similar, but I think we should really consider
> > whether my approach actually requires the lock. I *think* it doesn't. At
> > least in my head my patch should fix the error you spotted. I don't
> > know, maybe I need to think some more.
> 
> 1. rps work reads dev_priv->pm_iir (anew in the line you've added).
> 2. irq handler runs, adds a new bit to dev_priv->pm_iir and sets PMIMR to
> dev_priv->pm_iir (under irqsafe rps_lock).
> 3. rps work writes crap to PMIMR.
> 
> I.e. same race, you've just dramatically reduced the window ;-)
> 
> > The reason I worked so hard to avoid doing it the way you did in my
> > original implementation is I was trying really hard to not break the
> > cardinal rule about minimizing time holding spinlock_irqs. To go with
> > the other patch, you probably want a POSTING_READ also before releasing
> > the spin_lock (though I think this is being a bit paranoid).
> 
> There POSTING_READ was to order the PMIMR write with the PMIIR write (both
> in the irq handler). There's no such ordering here (and the irq handler
> can't be interrupted) so I think we're save.
> 
> -Daniel

Oops, you're totally right, I think I meant:
-       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
+       I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);

With regarding to the POSTING_READ, the concern I had was if the write
to IMR doesn't land before releasing the spinlock, but I don't feel like
addressing that concern anymore.

Ben

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-04 21:38                 ` Ben Widawsky
@ 2011-09-05  6:38                   ` Daniel Vetter
  2011-09-05  6:51                     ` Ben Widawsky
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-05  6:38 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Daniel Vetter, intel-gfx

On Sun, Sep 04, 2011 at 09:38:56PM +0000, Ben Widawsky wrote:
> Oops, you're totally right, I think I meant:
> -       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> +       I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);

Imo still racy without the irqsafe rps_lock around it. gcc is free to
compile that into a separate load and store which the irq handler can get
in between and change dev_priv->pm_iir and PMIMR. The race is now only one
instruction wide, though ;-)
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-05  6:38                   ` Daniel Vetter
@ 2011-09-05  6:51                     ` Ben Widawsky
  2011-09-05 12:15                       ` Chris Wilson
  0 siblings, 1 reply; 22+ messages in thread
From: Ben Widawsky @ 2011-09-05  6:51 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Mon, 5 Sep 2011 08:38:07 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Sun, Sep 04, 2011 at 09:38:56PM +0000, Ben Widawsky wrote:
> > Oops, you're totally right, I think I meant:
> > -       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> > +       I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
> 
> Imo still racy without the irqsafe rps_lock around it. gcc is free to
> compile that into a separate load and store which the irq handler can
> get in between and change dev_priv->pm_iir and PMIMR. The race is now
> only one instruction wide, though ;-)
> -Daniel

You are absolutely correct. The modification to GEN6_PMIMR must be
within the protection of rps_lock.

Ben

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] drm/i915: properly cancel rps_work on module unload
  2011-09-04 19:57               ` Daniel Vetter
@ 2011-09-05  8:15                 ` Daniel Vetter
  2011-09-05 17:27                   ` Ben Widawsky
  0 siblings, 1 reply; 22+ messages in thread
From: Daniel Vetter @ 2011-09-05  8:15 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

The rps disabling code wasn't properly cancelling outstanding work
items. Also add a comment that explains why we're not racing with
the work item that could unmask interrupts - that piece of code
confused me quite a bit.

v2: Ben Widawsky pointed out that the first patch would deadlock
(and a few lesser problems). All corrected.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_display.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 56a8554..2e425be 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7484,6 +7484,10 @@ void gen6_disable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RPNSWREQ, 1 << 31);
 	I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
 	I915_WRITE(GEN6_PMIER, 0);
+	/* Complete PM interrupt masking here doesn't race with the rps work
+	 * item again unmasking PM interrupts because that is using a different
+	 * register (PMIMR) to mask PM interrupts. The only risk is in leaving
+	 * stale bits in PMIIR and PMIMR which gen6_enable_rps will clean up. */
 
 	spin_lock_irq(&dev_priv->rps_lock);
 	dev_priv->pm_iir = 0;
@@ -8478,6 +8482,7 @@ void intel_modeset_cleanup(struct drm_device *dev)
 	 * enqueue unpin/hotplug work. */
 	drm_irq_uninstall(dev);
 	cancel_work_sync(&dev_priv->hotplug_work);
+	cancel_work_sync(&dev_priv->rps_work);
 
 	/* flush any delayed tasks or pending work */
 	flush_scheduled_work();
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-05  6:51                     ` Ben Widawsky
@ 2011-09-05 12:15                       ` Chris Wilson
  0 siblings, 0 replies; 22+ messages in thread
From: Chris Wilson @ 2011-09-05 12:15 UTC (permalink / raw)
  To: Ben Widawsky, Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Sun, 4 Sep 2011 23:51:52 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Mon, 5 Sep 2011 08:38:07 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > On Sun, Sep 04, 2011 at 09:38:56PM +0000, Ben Widawsky wrote:
> > > Oops, you're totally right, I think I meant:
> > > -       I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> > > +       I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
> > 
> > Imo still racy without the irqsafe rps_lock around it. gcc is free to
> > compile that into a separate load and store which the irq handler can
> > get in between and change dev_priv->pm_iir and PMIMR. The race is now
> > only one instruction wide, though ;-)
> > -Daniel
> 
> You are absolutely correct. The modification to GEN6_PMIMR must be
> within the protection of rps_lock.

Right. However, I don't see why we need to hold the mutex though. Why
are we touching PMIMR in gen6_enable_rps()? Afaics, it is because
gen6_disable_rps() may leave a stale PMIMR (it sets dev_priv->pm_iir to
0, causing the existing work-handler to return early and not touch
PMIMR).

I believe everything is correct if we clear PMIMR on module load, remove
the setting of PMIMR from gen6_enable_rps() and clear PMIMR under the
rps lock at the top of the work handler (which both Daniel and I have
desired to do... ;-)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] drm/i915: properly cancel rps_work on module unload
  2011-09-05  8:15                 ` [PATCH] drm/i915: properly cancel rps_work on module unload Daniel Vetter
@ 2011-09-05 17:27                   ` Ben Widawsky
  0 siblings, 0 replies; 22+ messages in thread
From: Ben Widawsky @ 2011-09-05 17:27 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon,  5 Sep 2011 10:15:28 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> The rps disabling code wasn't properly cancelling outstanding work
> items. Also add a comment that explains why we're not racing with
> the work item that could unmask interrupts - that piece of code
> confused me quite a bit.
> 
> v2: Ben Widawsky pointed out that the first patch would deadlock
> (and a few lesser problems). All corrected.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

This looks good. I think we need to do something in i915_save_state()
too though.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2011-09-05 17:26 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-04  3:24 [PATCH] drm/i915: Fix rps irq warning Ben Widawsky
2011-09-04  9:03 ` Chris Wilson
2011-09-04 15:49   ` Ben Widawsky
2011-09-04 15:34     ` [PATCH 0/3] slaughter rps races some more Daniel Vetter
2011-09-04 15:35       ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
2011-09-04 17:09         ` Ben Widawsky
2011-09-04 15:35       ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
2011-09-04 17:08         ` Ben Widawsky
2011-09-04 19:26           ` Daniel Vetter
2011-09-04 19:56             ` Ben Widawsky
2011-09-04 20:10               ` Daniel Vetter
2011-09-04 21:38                 ` Ben Widawsky
2011-09-05  6:38                   ` Daniel Vetter
2011-09-05  6:51                     ` Ben Widawsky
2011-09-05 12:15                       ` Chris Wilson
2011-09-04 15:35       ` [PATCH 3/3] drm/i915: close rps work vs. rps disable races Daniel Vetter
2011-09-04 17:23         ` Ben Widawsky
2011-09-04 19:17           ` Daniel Vetter
2011-09-04 19:50             ` Ben Widawsky
2011-09-04 19:57               ` Daniel Vetter
2011-09-05  8:15                 ` [PATCH] drm/i915: properly cancel rps_work on module unload Daniel Vetter
2011-09-05 17:27                   ` Ben Widawsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).