[PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
@ 2011-09-08 12:00 Daniel Vetter
  2011-09-08 12:00 ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Daniel Vetter @ 2011-09-08 12:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

Quoting Chris Wilson's more concise description:

"Ah I think I see the problem. As you point out we only mask the current
interrupt received, so that if we have a task pending (and so IMR != 0) we
actually unmask the pending interrupt and so could receive it again before the
tasklet is finally kicked off by the grumpy scheduler."

We need the hw to issue PM interrupts A, B, A while the scheduler is hating us
and refuses to run the rps work item. On receiving PM interrupt A we hit the
WARN because

dev_priv->pm_iir == PM_A | PM_B

Also add a posting read as suggested by Chris to ensure proper ordering of the
writes to PMIMR and PMIIR. Just in case somebody weakens write ordering.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9cbb0cd..2fdd9f9 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -536,8 +536,9 @@ static irqreturn_t ivybridge_irq_handler(DRM_IRQ_ARGS)
 		unsigned long flags;
 		spin_lock_irqsave(&dev_priv->rps_lock, flags);
 		WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
-		I915_WRITE(GEN6_PMIMR, pm_iir);
 		dev_priv->pm_iir |= pm_iir;
+		I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
+		POSTING_READ(GEN6_PMIMR);
 		spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
 		queue_work(dev_priv->wq, &dev_priv->rps_work);
 	}
@@ -649,8 +650,9 @@ static irqreturn_t ironlake_irq_handler(DRM_IRQ_ARGS)
 		unsigned long flags;
 		spin_lock_irqsave(&dev_priv->rps_lock, flags);
 		WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
-		I915_WRITE(GEN6_PMIMR, pm_iir);
 		dev_priv->pm_iir |= pm_iir;
+		I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
+		POSTING_READ(GEN6_PMIMR);
 		spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
 		queue_work(dev_priv->wq, &dev_priv->rps_work);
 	}
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-08 12:00 [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
@ 2011-09-08 12:00 ` Daniel Vetter
  2011-09-08 15:19   ` Ben Widawsky
  2011-09-08 15:25   ` [PATCH] " Daniel Vetter
  2011-09-08 12:00 ` [PATCH 3/3] drm/i915: properly cancel rps_work on module unload v2 Daniel Vetter
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Daniel Vetter @ 2011-09-08 12:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

This patch closes the following race:

We get a PM interrupt A, mask it, set dev_priv->iir = PM_A and kick of the
work item. Scheduler isn't grumpy, so the work queue takes rps_lock,
grabs pm_iir = dev_priv->pm_iir and pm_imr = READ(PMIMR). Note that
pm_imr == pm_iir because we've just masked the interrupt we've got.

Now hw sends out PM interrupt B (not masked), we process it and mask
it.  Later on the irq handler also clears PMIIR.

Then the work item proceeds and at the end clears PMIMR. Because
(local) pm_imr == pm_iir we have
        pm_imr & ~pm_iir == 0
so all interrupts are enabled.

Hardware is still interrupt-happy, and sends out a new PM interrupt B.
PMIMR doesn't mask B (it does not mask anything), PMIIR is cleared, so
we get it and hit the WARN in the interrupt handler (because
dev_priv->pm_iir == PM_B).

That's why I've moved the
        WRITE(PMIMR, 0)
up under the protection of the rps_lock. And write an uncoditional 0
to PMIMR, because that's what we'll do anyway.

This races looks much more likely because we can arbitrarily extend
the window by grabing dev->struct mutex right after the irq handler
has processed the first PM_B interrupt.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2fdd9f9..21ebcbd 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -383,6 +383,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	pm_iir = dev_priv->pm_iir;
 	dev_priv->pm_iir = 0;
 	pm_imr = I915_READ(GEN6_PMIMR);
+	I915_WRITE(GEN6_PMIMR, 0);
 	spin_unlock_irq(&dev_priv->rps_lock);
 
 	if (!pm_iir)
@@ -420,7 +421,6 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	 * an *extremely* unlikely race with gen6_rps_enable() that is prevented
 	 * by holding struct_mutex for the duration of the write.
 	 */
-	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
 	mutex_unlock(&dev_priv->dev->struct_mutex);
 }
 
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] drm/i915: properly cancel rps_work on module unload v2
  2011-09-08 12:00 [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
  2011-09-08 12:00 ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
@ 2011-09-08 12:00 ` Daniel Vetter
  2011-09-08 20:51   ` Chris Wilson
  2011-09-08 20:43 ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Chris Wilson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2011-09-08 12:00 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

The rps disabling code wasn't properly cancelling outstanding work
items. Also add a comment that explains why we're not racing with
the work item that could unmask interrupts - that piece of code
confused me quite a bit.

v2: Ben Widawsky pointed out that the first patch would deadlock
(and a few lesser problems). All corrected.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/intel_display.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 56a8554..2e425be 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7484,6 +7484,10 @@ void gen6_disable_rps(struct drm_device *dev)
 	I915_WRITE(GEN6_RPNSWREQ, 1 << 31);
 	I915_WRITE(GEN6_PMINTRMSK, 0xffffffff);
 	I915_WRITE(GEN6_PMIER, 0);
+	/* Complete PM interrupt masking here doesn't race with the rps work
+	 * item again unmasking PM interrupts because that is using a different
+	 * register (PMIMR) to mask PM interrupts. The only risk is in leaving
+	 * stale bits in PMIIR and PMIMR which gen6_enable_rps will clean up. */
 
 	spin_lock_irq(&dev_priv->rps_lock);
 	dev_priv->pm_iir = 0;
@@ -8478,6 +8482,7 @@ void intel_modeset_cleanup(struct drm_device *dev)
 	 * enqueue unpin/hotplug work. */
 	drm_irq_uninstall(dev);
 	cancel_work_sync(&dev_priv->hotplug_work);
+	cancel_work_sync(&dev_priv->rps_work);
 
 	/* flush any delayed tasks or pending work */
 	flush_scheduled_work();
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-08 12:00 ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
@ 2011-09-08 15:19   ` Ben Widawsky
  2011-09-08 15:25   ` [PATCH] " Daniel Vetter
  1 sibling, 0 replies; 12+ messages in thread
From: Ben Widawsky @ 2011-09-08 15:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Thu, Sep 08, 2011 at 02:00:21PM +0200, Daniel Vetter wrote:
> This patch closes the following race:
> 
> We get a PM interrupt A, mask it, set dev_priv->iir = PM_A and kick of the
> work item. Scheduler isn't grumpy, so the work queue takes rps_lock,
> grabs pm_iir = dev_priv->pm_iir and pm_imr = READ(PMIMR). Note that
> pm_imr == pm_iir because we've just masked the interrupt we've got.
> 
> Now hw sends out PM interrupt B (not masked), we process it and mask
> it.  Later on the irq handler also clears PMIIR.
> 
> Then the work item proceeds and at the end clears PMIMR. Because
> (local) pm_imr == pm_iir we have
>         pm_imr & ~pm_iir == 0
> so all interrupts are enabled.
> 
> Hardware is still interrupt-happy, and sends out a new PM interrupt B.
> PMIMR doesn't mask B (it does not mask anything), PMIIR is cleared, so
> we get it and hit the WARN in the interrupt handler (because
> dev_priv->pm_iir == PM_B).
> 
> That's why I've moved the
>         WRITE(PMIMR, 0)
> up under the protection of the rps_lock. And write an uncoditional 0
> to PMIMR, because that's what we'll do anyway.
> 
> This races looks much more likely because we can arbitrarily extend
> the window by grabing dev->struct mutex right after the irq handler
> has processed the first PM_B interrupt.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-08 12:00 ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
  2011-09-08 15:19   ` Ben Widawsky
@ 2011-09-08 15:25   ` Daniel Vetter
  2011-09-08 20:49     ` Chris Wilson
  1 sibling, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2011-09-08 15:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

This patch closes the following race:

We get a PM interrupt A, mask it, set dev_priv->iir = PM_A and kick of the
work item. Scheduler isn't grumpy, so the work queue takes rps_lock,
grabs pm_iir = dev_priv->pm_iir and pm_imr = READ(PMIMR). Note that
pm_imr == pm_iir because we've just masked the interrupt we've got.

Now hw sends out PM interrupt B (not masked), we process it and mask
it.  Later on the irq handler also clears PMIIR.

Then the work item proceeds and at the end clears PMIMR. Because
(local) pm_imr == pm_iir we have
        pm_imr & ~pm_iir == 0
so all interrupts are enabled.

Hardware is still interrupt-happy, and sends out a new PM interrupt B.
PMIMR doesn't mask B (it does not mask anything), PMIIR is cleared, so
we get it and hit the WARN in the interrupt handler (because
dev_priv->pm_iir == PM_B).

That's why I've moved the
        WRITE(PMIMR, 0)
up under the protection of the rps_lock. And write an uncoditional 0
to PMIMR, because that's what we'll do anyway.

This races looks much more likely because we can arbitrarily extend
the window by grabing dev->struct mutex right after the irq handler
has processed the first PM_B interrupt.

v2: Chris Wilson pointed out some dead code and a now misleading
comment to kill.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c |   10 ++--------
 1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2fdd9f9..d5a2188 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -377,12 +377,12 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
 						    rps_work);
 	u8 new_delay = dev_priv->cur_delay;
-	u32 pm_iir, pm_imr;
+	u32 pm_iir;
 
 	spin_lock_irq(&dev_priv->rps_lock);
 	pm_iir = dev_priv->pm_iir;
 	dev_priv->pm_iir = 0;
-	pm_imr = I915_READ(GEN6_PMIMR);
+	I915_WRITE(GEN6_PMIMR, 0);
 	spin_unlock_irq(&dev_priv->rps_lock);
 
 	if (!pm_iir)
@@ -415,12 +415,6 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	gen6_set_rps(dev_priv->dev, new_delay);
 	dev_priv->cur_delay = new_delay;
 
-	/*
-	 * rps_lock not held here because clearing is non-destructive. There is
-	 * an *extremely* unlikely race with gen6_rps_enable() that is prevented
-	 * by holding struct_mutex for the duration of the write.
-	 */
-	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
 	mutex_unlock(&dev_priv->dev->struct_mutex);
 }
 
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
  2011-09-08 12:00 [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
  2011-09-08 12:00 ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
  2011-09-08 12:00 ` [PATCH 3/3] drm/i915: properly cancel rps_work on module unload v2 Daniel Vetter
@ 2011-09-08 20:43 ` Chris Wilson
  2011-09-23 16:08 ` Daniel Vetter
  2012-04-15  9:06 ` Chris Wilson
  4 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2011-09-08 20:43 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

On Thu,  8 Sep 2011 14:00:20 +0200, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Quoting Chris Wilson's more concise description:
> 
> "Ah I think I see the problem. As you point out we only mask the current
> interrupt received, so that if we have a task pending (and so IMR != 0) we
> actually unmask the pending interrupt and so could receive it again before the
> tasklet is finally kicked off by the grumpy scheduler."
> 
> We need the hw to issue PM interrupts A, B, A while the scheduler is hating us
> and refuses to run the rps work item. On receiving PM interrupt A we hit the
> WARN because
> 
> dev_priv->pm_iir == PM_A | PM_B
> 
> Also add a posting read as suggested by Chris to ensure proper ordering of the
> writes to PMIMR and PMIIR. Just in case somebody weakens write ordering.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

The bug Daniel found here is not that we do the reg write two lines too
early, but that we were writing the wrong value into the IMR. The effect
was to unmask an already pending IRQ and so we could hit the WARN. Other
than the WARN, the only other side-effect would be that we would kick
off more work functions than required - the bug should not be affecting
system stability...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] drm/i915: close PM interrupt masking races in the rps work func
  2011-09-08 15:25   ` [PATCH] " Daniel Vetter
@ 2011-09-08 20:49     ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2011-09-08 20:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

On Thu,  8 Sep 2011 17:25:05 +0200, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> This patch closes the following race:
> 
> We get a PM interrupt A, mask it, set dev_priv->iir = PM_A and kick of the
> work item. Scheduler isn't grumpy, so the work queue takes rps_lock,
> grabs pm_iir = dev_priv->pm_iir and pm_imr = READ(PMIMR). Note that
> pm_imr == pm_iir because we've just masked the interrupt we've got.
> 
> Now hw sends out PM interrupt B (not masked), we process it and mask
> it.  Later on the irq handler also clears PMIIR.
> 
> Then the work item proceeds and at the end clears PMIMR. Because
> (local) pm_imr == pm_iir we have
>         pm_imr & ~pm_iir == 0
> so all interrupts are enabled.
> 
> Hardware is still interrupt-happy, and sends out a new PM interrupt B.
> PMIMR doesn't mask B (it does not mask anything), PMIIR is cleared, so
> we get it and hit the WARN in the interrupt handler (because
> dev_priv->pm_iir == PM_B).
> 
> That's why I've moved the
>         WRITE(PMIMR, 0)
> up under the protection of the rps_lock. And write an uncoditional 0
> to PMIMR, because that's what we'll do anyway.
> 
> This races looks much more likely because we can arbitrarily extend
> the window by grabing dev->struct mutex right after the irq handler
> has processed the first PM_B interrupt.
> 
> v2: Chris Wilson pointed out some dead code and a now misleading
> comment to kill.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

I've stared at this a long time to find meaning to the warning about
having to do the IMR write under the mutex. Having found none, this was
also the patch that I wrote for myself to fix the bug identified by
Daniel.

In this case, the bug is simply that the value of PM_IMR may change since
caching it in pm_imr (and similarly dev_priv->pm_iir will have the
correspoding bit now set) and so we may loose an interrupt masking when
setting it later. Again, this opens the possibility for a second
interrupt to arrive and hit the WARN (and kick off a redundant work
function.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] drm/i915: properly cancel rps_work on module unload v2
  2011-09-08 12:00 ` [PATCH 3/3] drm/i915: properly cancel rps_work on module unload v2 Daniel Vetter
@ 2011-09-08 20:51   ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2011-09-08 20:51 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

On Thu,  8 Sep 2011 14:00:22 +0200, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> The rps disabling code wasn't properly cancelling outstanding work
> items. Also add a comment that explains why we're not racing with
> the work item that could unmask interrupts - that piece of code
> confused me quite a bit.
> 
> v2: Ben Widawsky pointed out that the first patch would deadlock
> (and a few lesser problems). All corrected.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

And since presumably the chip is in a completely different thermal
envelope and thawed, waking up the task function is pointless after
resume. ;-)

With the original deadlocks gone,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
  2011-09-08 12:00 [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
                   ` (2 preceding siblings ...)
  2011-09-08 20:43 ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Chris Wilson
@ 2011-09-23 16:08 ` Daniel Vetter
  2012-04-15  9:06 ` Chris Wilson
  4 siblings, 0 replies; 12+ messages in thread
From: Daniel Vetter @ 2011-09-23 16:08 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

Hi Keith,

I think these 3 patches (note that 2/3 has a resend) are ready for -next.
I don't think we need this for -fixes or Cc:stable, imo worst case we miss
a PM interrupt and hit the WARN. Afaik only Jesse reported seeing that on
his ivybrdige machine.

3/3 fixes an unload problem that might lead to an Oops, so that one would
make sense for -fixes. But imho module unload is a developer feature, so I
don't mind much.

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
  2011-09-08 12:00 [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
                   ` (3 preceding siblings ...)
  2011-09-23 16:08 ` Daniel Vetter
@ 2012-04-15  9:06 ` Chris Wilson
  4 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2012-04-15  9:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

Ping.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43107
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
  2011-09-04 15:35   ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
@ 2011-09-04 17:09     ` Ben Widawsky
  0 siblings, 0 replies; 12+ messages in thread
From: Ben Widawsky @ 2011-09-04 17:09 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Sun,  4 Sep 2011 17:35:00 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Quoting Chris Wilson's more concise description:
> 
> "Ah I think I see the problem. As you point out we only mask the
> current interrupt received, so that if we have a task pending (and so
> IMR != 0) we actually unmask the pending interrupt and so could
> receive it again before the tasklet is finally kicked off by the
> grumpy scheduler."
> 
> So we need the hw to issue PM interrupts A, B, A while the scheduler
> is hating us and refuses to run the rps work item. On receiving PM
> interrupt A we hit the
> WARN because
> 
> dev_priv->pm_iir == PM_A | PM_B
> 
> Also add a posting read as suggested by Chris to ensure proper
> ordering of the writes to PMIMR and PMIIR. Just in case somebody
> weakens write ordering.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler
  2011-09-04 15:34 ` [PATCH 0/3] slaughter rps races some more Daniel Vetter
@ 2011-09-04 15:35   ` Daniel Vetter
  2011-09-04 17:09     ` Ben Widawsky
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2011-09-04 15:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Ben Widawsky

Quoting Chris Wilson's more concise description:

"Ah I think I see the problem. As you point out we only mask the
current interrupt received, so that if we have a task pending (and so
IMR != 0) we actually unmask the pending interrupt and so could
receive it again before the tasklet is finally kicked off by the
grumpy scheduler."

So we need the hw to issue PM interrupts A, B, A while the scheduler
is hating us and refuses to run the rps work item. On receiving PM
interrupt A we hit the
WARN because

dev_priv->pm_iir == PM_A | PM_B

Also add a posting read as suggested by Chris to ensure proper
ordering of the writes to PMIMR and PMIIR. Just in case somebody
weakens write ordering.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_irq.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9cbb0cd..2fdd9f9 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -536,8 +536,9 @@ static irqreturn_t ivybridge_irq_handler(DRM_IRQ_ARGS)
 		unsigned long flags;
 		spin_lock_irqsave(&dev_priv->rps_lock, flags);
 		WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
-		I915_WRITE(GEN6_PMIMR, pm_iir);
 		dev_priv->pm_iir |= pm_iir;
+		I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
+		POSTING_READ(GEN6_PMIMR);
 		spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
 		queue_work(dev_priv->wq, &dev_priv->rps_work);
 	}
@@ -649,8 +650,9 @@ static irqreturn_t ironlake_irq_handler(DRM_IRQ_ARGS)
 		unsigned long flags;
 		spin_lock_irqsave(&dev_priv->rps_lock, flags);
 		WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
-		I915_WRITE(GEN6_PMIMR, pm_iir);
 		dev_priv->pm_iir |= pm_iir;
+		I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
+		POSTING_READ(GEN6_PMIMR);
 		spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
 		queue_work(dev_priv->wq, &dev_priv->rps_work);
 	}
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-04-15  9:06 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-08 12:00 [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
2011-09-08 12:00 ` [PATCH 2/3] drm/i915: close PM interrupt masking races in the rps work func Daniel Vetter
2011-09-08 15:19   ` Ben Widawsky
2011-09-08 15:25   ` [PATCH] " Daniel Vetter
2011-09-08 20:49     ` Chris Wilson
2011-09-08 12:00 ` [PATCH 3/3] drm/i915: properly cancel rps_work on module unload v2 Daniel Vetter
2011-09-08 20:51   ` Chris Wilson
2011-09-08 20:43 ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Chris Wilson
2011-09-23 16:08 ` Daniel Vetter
2012-04-15  9:06 ` Chris Wilson
  -- strict thread matches above, loose matches on Subject: below --
2011-09-04 15:49 [PATCH] drm/i915: Fix rps irq warning Ben Widawsky
2011-09-04 15:34 ` [PATCH 0/3] slaughter rps races some more Daniel Vetter
2011-09-04 15:35   ` [PATCH 1/3] drm/i915: close PM interrupt masking races in the irq handler Daniel Vetter
2011-09-04 17:09     ` Ben Widawsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.