All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] Display error reporting
@ 2013-02-08 19:35 Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 01/10] drm/i915: drm/i915: create macros for the "unclaimed register" checks Paulo Zanoni
                   ` (9 more replies)
  0 siblings, 10 replies; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Hi

This series is the first step to improve error reporting on our driver.

The first 3 patches were already sent to the list and they're a requirement for
the series (because of the relationship between our "unclaimed register" checks
and the display error interrupts).

Patch 4 just removes some code duplication and could be merged before
everything else.

Patches 5-7 are the complicated bits. They use the "I want to report Poison
interrupts" excuse, but the main goal of these 3 patches is: make sure our code
is prepared for a possible "interrupt tsunami", enable SERR_INT and enable
GEN7_ERR_INT. The big problem with GEN7_ERR_INT and SERR_INT is that you can't
enable/disable the specific sub-cases of this interrupt: either you enable all
of the error interrupts or you disable all of them.

Patches 8 and 9 are mainly about printing error messages. They try to make sure
dmesg won't be flooded in case the interrupt tsunami happens, but they still
leave GEN7_ERR_INT and SERR_INT enabled even if we decide to ignore some of the
interrupts, because this way we may detect one problem even if the other is
being ignored.

Patch 10 is just for my OCD.

After this series, enabling and printing more error interrupts will be easier.
So far, I tested this series on ILK, SNB and HSW and I haven't seen the "Poison"
interrupt message, but I have seen the "PCH transcoder FIFO underrun" message on
SNB and ILK, and this seems to be caused by wrong watermark values when using
multiple pipes. Future patches will fix the problems reported by the error
messages. I hope these error messages will help us identify, reproduce and fix
bugs in our driver.

Thanks,
Paulo

Paulo Zanoni (10):
  drm/i915: drm/i915: create macros for the "unclaimed register" checks
  drm/i915: use FPGA_DBG for the "unclaimed register" checks
  drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init
  drm/i915: add ibx_irq_postinstall
  drm/i915: also disable south interrupts when handling them
  drm/i915: print PCH poison interrupts
  drm/i915: print Gen5+ CPU poison interrupts
  drm/i915: print PCH FIFO underrun interrupts
  drm/i915: print CPU FIFO underruns
  drm/i915: also POSTING_READ(DEIER) on ivybridge_irq_handler

 drivers/gpu/drm/i915/i915_dma.c      |    4 +
 drivers/gpu/drm/i915/i915_drv.c      |   24 ++--
 drivers/gpu/drm/i915/i915_drv.h      |    5 +
 drivers/gpu/drm/i915/i915_irq.c      |  217 ++++++++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_reg.h      |   18 ++-
 drivers/gpu/drm/i915/intel_display.c |   16 ++-
 6 files changed, 223 insertions(+), 61 deletions(-)

-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 01/10] drm/i915: drm/i915: create macros for the "unclaimed register" checks
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-18 18:11   ` Daniel Vetter
  2013-02-08 19:35 ` [PATCH 02/10] drm/i915: use FPGA_DBG " Paulo Zanoni
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

This avoids polluting i915_write##x and also allows us to reuse code
on i915_read##x.

v2: Rebase

Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |   24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index c5b8c81..e24c337 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1131,6 +1131,20 @@ ilk_dummy_write(struct drm_i915_private *dev_priv)
 	I915_WRITE_NOTRACE(MI_MODE, 0);
 }
 
+#define UNCLAIMED_REG_CLEAR(dev_priv, reg) \
+	if (IS_HASWELL(dev_priv->dev) && \
+	    (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
+		DRM_ERROR("Unknown unclaimed register before writing to %x\n", reg); \
+		I915_WRITE_NOTRACE(GEN7_ERR_INT, ERR_INT_MMIO_UNCLAIMED); \
+	}
+
+#define UNCLAIMED_REG_CHECK(dev_priv, reg) \
+	if (IS_HASWELL(dev_priv->dev) && \
+	    (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
+		DRM_ERROR("Unclaimed write to %x\n", reg); \
+		writel(ERR_INT_MMIO_UNCLAIMED, dev_priv->regs + GEN7_ERR_INT); \
+	}
+
 #define __i915_read(x, y) \
 u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
 	u##x val = 0; \
@@ -1167,18 +1181,12 @@ void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val) { \
 	} \
 	if (IS_GEN5(dev_priv->dev)) \
 		ilk_dummy_write(dev_priv); \
-	if (IS_HASWELL(dev_priv->dev) && (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
-		DRM_ERROR("Unknown unclaimed register before writing to %x\n", reg); \
-		I915_WRITE_NOTRACE(GEN7_ERR_INT, ERR_INT_MMIO_UNCLAIMED); \
-	} \
+	UNCLAIMED_REG_CLEAR(dev_priv, reg); \
 	write##y(val, dev_priv->regs + reg); \
 	if (unlikely(__fifo_ret)) { \
 		gen6_gt_check_fifodbg(dev_priv); \
 	} \
-	if (IS_HASWELL(dev_priv->dev) && (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
-		DRM_ERROR("Unclaimed write to %x\n", reg); \
-		writel(ERR_INT_MMIO_UNCLAIMED, dev_priv->regs + GEN7_ERR_INT);	\
-	} \
+	UNCLAIMED_REG_CHECK(dev_priv, reg); \
 }
 __i915_write(8, b)
 __i915_write(16, w)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 02/10] drm/i915: use FPGA_DBG for the "unclaimed register" checks
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 01/10] drm/i915: drm/i915: create macros for the "unclaimed register" checks Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init Paulo Zanoni
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

We plan to treat GEN7_ERR_INT as an interrupt, so use this register
for the checks inside I915_WRITE. This way we can have the best of
both worlds: the error message with a register address and the
interrupt.

V2: Split in 2 patches: one for the macro, one for changing the
register, as requested by Ben.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |    8 ++++----
 drivers/gpu/drm/i915/i915_reg.h |    3 +++
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e24c337..24b4e5c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1133,16 +1133,16 @@ ilk_dummy_write(struct drm_i915_private *dev_priv)
 
 #define UNCLAIMED_REG_CLEAR(dev_priv, reg) \
 	if (IS_HASWELL(dev_priv->dev) && \
-	    (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
+	    (I915_READ_NOTRACE(FPGA_DBG) & FPGA_DBG_RM_NOCLAIM)) { \
 		DRM_ERROR("Unknown unclaimed register before writing to %x\n", reg); \
-		I915_WRITE_NOTRACE(GEN7_ERR_INT, ERR_INT_MMIO_UNCLAIMED); \
+		I915_WRITE_NOTRACE(FPGA_DBG, FPGA_DBG_RM_NOCLAIM); \
 	}
 
 #define UNCLAIMED_REG_CHECK(dev_priv, reg) \
 	if (IS_HASWELL(dev_priv->dev) && \
-	    (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
+	    (I915_READ_NOTRACE(FPGA_DBG) & FPGA_DBG_RM_NOCLAIM)) { \
 		DRM_ERROR("Unclaimed write to %x\n", reg); \
-		writel(ERR_INT_MMIO_UNCLAIMED, dev_priv->regs + GEN7_ERR_INT); \
+		I915_WRITE_NOTRACE(FPGA_DBG, FPGA_DBG_RM_NOCLAIM); \
 	}
 
 #define __i915_read(x, y) \
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index eaed90f..663b5c1 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -521,6 +521,9 @@
 #define GEN7_ERR_INT	0x44040
 #define   ERR_INT_MMIO_UNCLAIMED (1<<13)
 
+#define FPGA_DBG		0x42300
+#define FPGA_DBG_RM_NOCLAIM	(1<<31)
+
 /* GM45+ chicken bits -- debug workaround bits that may be required
  * for various sorts of correct behavior.  The top 16 bits of each are
  * the enables for writing to the corresponding low bit.
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 01/10] drm/i915: drm/i915: create macros for the "unclaimed register" checks Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 02/10] drm/i915: use FPGA_DBG " Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-09 17:19   ` Ben Widawsky
  2013-02-08 19:35 ` [PATCH 04/10] drm/i915: add ibx_irq_postinstall Paulo Zanoni
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Otherwise, if the BIOS did anything wrong, our first I915_{WRITE,READ}
will give us "unclaimed register"  messages.

V2: Even earlier.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=58897
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4fa6beb..6d8672e 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1542,6 +1542,10 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 		goto put_gmch;
 	}
 
+	/* This must happen before any I915_{READ,WRITE}: */
+	if (IS_HASWELL(dev))
+		I915_WRITE_NOTRACE(FPGA_DBG, FPGA_DBG_RM_NOCLAIM);
+
 	aperture_size = dev_priv->gtt.mappable_end;
 
 	dev_priv->gtt.mappable =
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 04/10] drm/i915: add ibx_irq_postinstall
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
                   ` (2 preceding siblings ...)
  2013-02-08 19:35 ` [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-09 17:27   ` Ben Widawsky
  2013-02-08 19:35 ` [PATCH 05/10] drm/i915: also disable south interrupts when handling them Paulo Zanoni
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

So we can remove duplicated code. Note that this function is used not
only on IBX, but also CPT and LPT.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c |   66 ++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index be5289b..f096ad9 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1926,6 +1926,28 @@ static void ironlake_enable_pch_hotplug(struct drm_device *dev)
 	I915_WRITE(PCH_PORT_HOTPLUG, hotplug);
 }
 
+static void ibx_irq_postinstall(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
+	u32 mask;
+
+	if (HAS_PCH_IBX(dev))
+		mask = SDE_HOTPLUG_MASK |
+		       SDE_GMBUS |
+		       SDE_AUX_MASK;
+	else
+		mask = SDE_HOTPLUG_MASK_CPT |
+		       SDE_GMBUS_CPT |
+		       SDE_AUX_MASK_CPT;
+
+	I915_WRITE(SDEIIR, I915_READ(SDEIIR));
+	I915_WRITE(SDEIMR, ~mask);
+	I915_WRITE(SDEIER, mask);
+	POSTING_READ(SDEIER);
+
+	ironlake_enable_pch_hotplug(dev);
+}
+
 static int ironlake_irq_postinstall(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
@@ -1934,8 +1956,6 @@ static int ironlake_irq_postinstall(struct drm_device *dev)
 			   DE_PLANEA_FLIP_DONE | DE_PLANEB_FLIP_DONE |
 			   DE_AUX_CHANNEL_A;
 	u32 render_irqs;
-	u32 hotplug_mask;
-	u32 pch_irq_mask;
 
 	dev_priv->irq_mask = ~display_mask;
 
@@ -1963,30 +1983,7 @@ static int ironlake_irq_postinstall(struct drm_device *dev)
 	I915_WRITE(GTIER, render_irqs);
 	POSTING_READ(GTIER);
 
-	if (HAS_PCH_CPT(dev)) {
-		hotplug_mask = (SDE_CRT_HOTPLUG_CPT |
-				SDE_PORTB_HOTPLUG_CPT |
-				SDE_PORTC_HOTPLUG_CPT |
-				SDE_PORTD_HOTPLUG_CPT |
-				SDE_GMBUS_CPT |
-				SDE_AUX_MASK_CPT);
-	} else {
-		hotplug_mask = (SDE_CRT_HOTPLUG |
-				SDE_PORTB_HOTPLUG |
-				SDE_PORTC_HOTPLUG |
-				SDE_PORTD_HOTPLUG |
-				SDE_GMBUS |
-				SDE_AUX_MASK);
-	}
-
-	pch_irq_mask = ~hotplug_mask;
-
-	I915_WRITE(SDEIIR, I915_READ(SDEIIR));
-	I915_WRITE(SDEIMR, pch_irq_mask);
-	I915_WRITE(SDEIER, hotplug_mask);
-	POSTING_READ(SDEIER);
-
-	ironlake_enable_pch_hotplug(dev);
+	ibx_irq_postinstall(dev);
 
 	if (IS_IRONLAKE_M(dev)) {
 		/* Clear & enable PCU event interrupts */
@@ -2009,8 +2006,6 @@ static int ivybridge_irq_postinstall(struct drm_device *dev)
 		DE_PLANEA_FLIP_DONE_IVB |
 		DE_AUX_CHANNEL_A_IVB;
 	u32 render_irqs;
-	u32 hotplug_mask;
-	u32 pch_irq_mask;
 
 	dev_priv->irq_mask = ~display_mask;
 
@@ -2034,20 +2029,7 @@ static int ivybridge_irq_postinstall(struct drm_device *dev)
 	I915_WRITE(GTIER, render_irqs);
 	POSTING_READ(GTIER);
 
-	hotplug_mask = (SDE_CRT_HOTPLUG_CPT |
-			SDE_PORTB_HOTPLUG_CPT |
-			SDE_PORTC_HOTPLUG_CPT |
-			SDE_PORTD_HOTPLUG_CPT |
-			SDE_GMBUS_CPT |
-			SDE_AUX_MASK_CPT);
-	pch_irq_mask = ~hotplug_mask;
-
-	I915_WRITE(SDEIIR, I915_READ(SDEIIR));
-	I915_WRITE(SDEIMR, pch_irq_mask);
-	I915_WRITE(SDEIER, hotplug_mask);
-	POSTING_READ(SDEIER);
-
-	ironlake_enable_pch_hotplug(dev);
+	ibx_irq_postinstall(dev);
 
 	return 0;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 05/10] drm/i915: also disable south interrupts when handling them
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
                   ` (3 preceding siblings ...)
  2013-02-08 19:35 ` [PATCH 04/10] drm/i915: add ibx_irq_postinstall Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-09 19:29   ` Daniel Vetter
  2013-02-08 19:35 ` [PATCH 06/10] drm/i915: print PCH poison interrupts Paulo Zanoni
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

>From the docs:
  "Only the rising edge of the PCH Display interrupt will cause the
  North Display IIR (DEIIR) PCH Display Interrupt even bit to be set,
  so all PCH Display Interrupts, including back to back interrupts,
  must be cleared before a new PCH Display interrupt can cause DEIIR
  to be set".

The current code works fine because we don't get many interrupts, but
if we enable the PCH FIFO underrun interrupts we'll start getting so
many interrupts that at some point new PCH interrupts won't cause
DEIIR to be set.

The initial implementation I tried was to turn the code that checks
SDEIIR into a loop, but we can still get interrupts even after the
loop is done (and before the irq handler finishes), so we have to
either disable the interrupts or mask them. In the end I concluded
that just disabling the PCH interrupts is enough, you don't even need
the loop, so this is what this patch implements. I've tested it and it
passes the 2 "PCH FIFO underrun interrupt storms" I can reproduce:
the "ironlake_crtc_disable" case and the "wrong watermarks" case.

In other words, here's how to reproduce the problem fixed by this
patch:
  1 - Enable PCH FIFO underrun interrupts (SERR_INT on SNB+)
  2 - Boot the machine
  3 - While booting we'll get tons of PCH FIFO underrun interrupts
  4 - Plug a new monitor
  5 - Run xrandr, notice it won't detect the new monitor
  6 - Read SDEIIR and notice it's not 0 while DEIIR is 0

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index f096ad9..500fd65 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -701,7 +701,7 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 de_iir, gt_iir, de_ier, pm_iir;
+	u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
 	irqreturn_t ret = IRQ_NONE;
 	int i;
 
@@ -711,6 +711,10 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
 	de_ier = I915_READ(DEIER);
 	I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
 
+	sde_ier = I915_READ(SDEIER);
+	I915_WRITE(SDEIER, 0);
+	POSTING_READ(SDEIER);
+
 	gt_iir = I915_READ(GTIIR);
 	if (gt_iir) {
 		snb_gt_irq_handler(dev, dev_priv, gt_iir);
@@ -759,6 +763,8 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
 
 	I915_WRITE(DEIER, de_ier);
 	POSTING_READ(DEIER);
+	I915_WRITE(SDEIER, sde_ier);
+	POSTING_READ(SDEIER);
 
 	return ret;
 }
@@ -778,7 +784,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 	int ret = IRQ_NONE;
-	u32 de_iir, gt_iir, de_ier, pm_iir;
+	u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
 
 	atomic_inc(&dev_priv->irq_received);
 
@@ -787,6 +793,10 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
 	I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
 	POSTING_READ(DEIER);
 
+	sde_ier = I915_READ(SDEIER);
+	I915_WRITE(SDEIER, 0);
+	POSTING_READ(SDEIER);
+
 	de_iir = I915_READ(DEIIR);
 	gt_iir = I915_READ(GTIIR);
 	pm_iir = I915_READ(GEN6_PMIIR);
@@ -849,6 +859,8 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
 done:
 	I915_WRITE(DEIER, de_ier);
 	POSTING_READ(DEIER);
+	I915_WRITE(SDEIER, sde_ier);
+	POSTING_READ(SDEIER);
 
 	return ret;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 06/10] drm/i915: print PCH poison interrupts
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
                   ` (4 preceding siblings ...)
  2013-02-08 19:35 ` [PATCH 05/10] drm/i915: also disable south interrupts when handling them Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 07/10] drm/i915: print Gen5+ CPU " Paulo Zanoni
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

This is bad news and shouldn't be happening.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c |   29 +++++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_reg.h |    4 ++++
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 500fd65..10aec0e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -665,6 +665,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
 		DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
 }
 
+static void serr_int_handler(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
+	u32 serr_int = I915_READ(SERR_INT);
+
+	if (serr_int & SERR_INT_POISON)
+		DRM_ERROR("PCH poison interrupt\n");
+
+	I915_WRITE(SERR_INT, serr_int);
+}
+
 static void cpt_irq_handler(struct drm_device *dev, u32 pch_iir)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
@@ -695,6 +706,9 @@ static void cpt_irq_handler(struct drm_device *dev, u32 pch_iir)
 			DRM_DEBUG_DRIVER("  pipe %c FDI IIR: 0x%08x\n",
 					 pipe_name(pipe),
 					 I915_READ(FDI_RX_IIR(pipe)));
+
+	if (pch_iir & SDE_ERROR_CPT)
+		serr_int_handler(dev);
 }
 
 static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
@@ -1943,14 +1957,19 @@ static void ibx_irq_postinstall(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
 	u32 mask;
 
-	if (HAS_PCH_IBX(dev))
+	if (HAS_PCH_IBX(dev)) {
 		mask = SDE_HOTPLUG_MASK |
 		       SDE_GMBUS |
-		       SDE_AUX_MASK;
-	else
+		       SDE_AUX_MASK |
+		       SDE_POISON;
+	} else {
 		mask = SDE_HOTPLUG_MASK_CPT |
 		       SDE_GMBUS_CPT |
-		       SDE_AUX_MASK_CPT;
+		       SDE_AUX_MASK_CPT |
+		       SDE_ERROR_CPT;
+
+		I915_WRITE(SERR_INT, I915_READ(SERR_INT));
+	}
 
 	I915_WRITE(SDEIIR, I915_READ(SDEIIR));
 	I915_WRITE(SDEIMR, ~mask);
@@ -2180,6 +2199,8 @@ static void ironlake_irq_uninstall(struct drm_device *dev)
 	I915_WRITE(SDEIMR, 0xffffffff);
 	I915_WRITE(SDEIER, 0x0);
 	I915_WRITE(SDEIIR, I915_READ(SDEIIR));
+	if (HAS_PCH_CPT(dev) || HAS_PCH_LPT(dev))
+		I915_WRITE(SERR_INT, I915_READ(SERR_INT));
 }
 
 static void i8xx_irq_preinstall(struct drm_device * dev)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 663b5c1..9cd59f7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3528,6 +3528,7 @@
 				 SDE_PORTC_HOTPLUG_CPT |	\
 				 SDE_PORTB_HOTPLUG_CPT)
 #define SDE_GMBUS_CPT		(1 << 17)
+#define SDE_ERROR_CPT		(1 << 16)
 #define SDE_AUDIO_CP_REQ_C_CPT	(1 << 10)
 #define SDE_AUDIO_CP_CHG_C_CPT	(1 << 9)
 #define SDE_FDI_RXC_CPT		(1 << 8)
@@ -3552,6 +3553,9 @@
 #define SDEIIR  0xc4008
 #define SDEIER  0xc400c
 
+#define SERR_INT		0xc4040
+#define  SERR_INT_POISON	(1 << 31)
+
 /* digital port hotplug */
 #define PCH_PORT_HOTPLUG        0xc4030		/* SHOTPLUG_CTL */
 #define PORTD_HOTPLUG_ENABLE            (1 << 20)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
                   ` (5 preceding siblings ...)
  2013-02-08 19:35 ` [PATCH 06/10] drm/i915: print PCH poison interrupts Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-08 19:42   ` Jesse Barnes
  2013-02-08 19:35 ` [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts Paulo Zanoni
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

On ILK/SNB all we need to do is to enable the "poison" bit, but on
IVB/HSW we need to enable the CPU error interrupt register, which is
responsible not only for poison interrupts, but also other things.
This includes the "unclaimed register" interrupt, so on the IVB irq
handler we now need to: (i) check whether the interrupt was triggered by an
unclaimed register and (ii) mask the error interrupt bit so we don't
risk generating "unclaimed register" interrupts form inside the
interrupt handler.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c |   49 ++++++++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_reg.h |    5 ++--
 2 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 10aec0e..703a08a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -665,6 +665,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
 		DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
 }
 
+static void err_int_handler(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
+	u32 err_int = I915_READ(GEN7_ERR_INT);
+
+	if (err_int & ERR_INT_POISON)
+		DRM_ERROR("Poison interrupt\n");
+
+	I915_WRITE(GEN7_ERR_INT, err_int);
+}
+
 static void serr_int_handler(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
@@ -715,16 +726,33 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *) arg;
 	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
-	u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
+	u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier, de_imr;
 	irqreturn_t ret = IRQ_NONE;
 	int i;
 
 	atomic_inc(&dev_priv->irq_received);
 
+	/* We get interrupts on unclaimed registers, so check for this before we
+	 * do any I915_{READ,WRITE}. */
+	if (IS_HASWELL(dev) &&
+	    (I915_READ_NOTRACE(FPGA_DBG) & FPGA_DBG_RM_NOCLAIM)) {
+		DRM_ERROR("Unclaimed register before interrupt\n");
+		I915_WRITE_NOTRACE(FPGA_DBG, FPGA_DBG_RM_NOCLAIM);
+	}
+
 	/* disable master interrupt before clearing iir  */
 	de_ier = I915_READ(DEIER);
 	I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
 
+	/* On Haswell, also mask ERR_INT because we don't want to risk
+	 * generating "unclaimed register" interrupts from inside the interrupt
+	 * handler. */
+	de_imr = I915_READ(DEIMR);
+	if (IS_HASWELL(dev)) {
+		I915_WRITE(DEIMR, de_imr | DE_ERR_INT_IVB);
+		POSTING_READ(DEIMR);
+	}
+
 	sde_ier = I915_READ(SDEIER);
 	I915_WRITE(SDEIER, 0);
 	POSTING_READ(SDEIER);
@@ -738,6 +766,9 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
 
 	de_iir = I915_READ(DEIIR);
 	if (de_iir) {
+		if (de_iir & DE_ERR_INT_IVB)
+			err_int_handler(dev);
+
 		if (de_iir & DE_AUX_CHANNEL_A_IVB)
 			dp_aux_irq_handler(dev);
 
@@ -775,6 +806,11 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
 		ret = IRQ_HANDLED;
 	}
 
+	if (IS_HASWELL(dev)) {
+		I915_WRITE(DEIMR, de_imr);
+		POSTING_READ(DEIMR);
+	}
+
 	I915_WRITE(DEIER, de_ier);
 	POSTING_READ(DEIER);
 	I915_WRITE(SDEIER, sde_ier);
@@ -837,6 +873,9 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
 	if (de_iir & DE_PIPEB_VBLANK)
 		drm_handle_vblank(dev, 1);
 
+	if (de_iir & DE_POISON)
+		DRM_ERROR("Poison interrupt\n");
+
 	if (de_iir & DE_PLANEA_FLIP_DONE) {
 		intel_prepare_page_flip(dev, 0);
 		intel_finish_page_flip_plane(dev, 0);
@@ -1985,7 +2024,7 @@ static int ironlake_irq_postinstall(struct drm_device *dev)
 	/* enable kind of interrupts always enabled */
 	u32 display_mask = DE_MASTER_IRQ_CONTROL | DE_GSE | DE_PCH_EVENT |
 			   DE_PLANEA_FLIP_DONE | DE_PLANEB_FLIP_DONE |
-			   DE_AUX_CHANNEL_A;
+			   DE_AUX_CHANNEL_A | DE_POISON;
 	u32 render_irqs;
 
 	dev_priv->irq_mask = ~display_mask;
@@ -2035,12 +2074,14 @@ static int ivybridge_irq_postinstall(struct drm_device *dev)
 		DE_PLANEC_FLIP_DONE_IVB |
 		DE_PLANEB_FLIP_DONE_IVB |
 		DE_PLANEA_FLIP_DONE_IVB |
-		DE_AUX_CHANNEL_A_IVB;
+		DE_AUX_CHANNEL_A_IVB |
+		DE_ERR_INT_IVB;
 	u32 render_irqs;
 
 	dev_priv->irq_mask = ~display_mask;
 
 	/* should always can generate irq */
+	I915_WRITE(GEN7_ERR_INT, I915_READ(GEN7_ERR_INT));
 	I915_WRITE(DEIIR, I915_READ(DEIIR));
 	I915_WRITE(DEIMR, dev_priv->irq_mask);
 	I915_WRITE(DEIER,
@@ -2191,6 +2232,8 @@ static void ironlake_irq_uninstall(struct drm_device *dev)
 	I915_WRITE(DEIMR, 0xffffffff);
 	I915_WRITE(DEIER, 0x0);
 	I915_WRITE(DEIIR, I915_READ(DEIIR));
+	if (IS_GEN7(dev))
+		I915_WRITE(GEN7_ERR_INT, I915_READ(GEN7_ERR_INT));
 
 	I915_WRITE(GTIMR, 0xffffffff);
 	I915_WRITE(GTIER, 0x0);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 9cd59f7..f22e27d 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -519,7 +519,8 @@
 
 #define ERROR_GEN6	0x040a0
 #define GEN7_ERR_INT	0x44040
-#define   ERR_INT_MMIO_UNCLAIMED (1<<13)
+#define   ERR_INT_POISON		(1<<31)
+#define   ERR_INT_MMIO_UNCLAIMED	(1<<13)
 
 #define FPGA_DBG		0x42300
 #define FPGA_DBG_RM_NOCLAIM	(1<<31)
@@ -3378,7 +3379,7 @@
 #define DE_PIPEA_FIFO_UNDERRUN  (1 << 0)
 
 /* More Ivybridge lolz */
-#define DE_ERR_DEBUG_IVB		(1<<30)
+#define DE_ERR_INT_IVB			(1<<30)
 #define DE_GSE_IVB			(1<<29)
 #define DE_PCH_EVENT_IVB		(1<<28)
 #define DE_DP_A_HOTPLUG_IVB		(1<<27)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
                   ` (6 preceding siblings ...)
  2013-02-08 19:35 ` [PATCH 07/10] drm/i915: print Gen5+ CPU " Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-09 19:43   ` Daniel Vetter
  2013-02-08 19:35 ` [PATCH 09/10] drm/i915: print CPU FIFO underruns Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 10/10] drm/i915: also POSTING_READ(DEIER) on ivybridge_irq_handler Paulo Zanoni
  9 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Also add an "ignore" bit that avoids printing the message in two
cases:
  - When the message is in fact expected.
  - After we get the first message. In tihs case, we expect to get
    hundreds of consecutive messages, so we just ignore all the
    subsequent messages until the next crtc_enable.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |    3 +++
 drivers/gpu/drm/i915/i915_irq.c      |   37 +++++++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_reg.h      |    7 +++++--
 drivers/gpu/drm/i915/intel_display.c |   14 ++++++++++++-
 4 files changed, 53 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 08c5def..e96d75e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -909,6 +909,9 @@ typedef struct drm_i915_private {
 	struct work_struct hotplug_work;
 	bool enable_hotplug_processing;
 
+	/* Bit 0: PCH transcoder A and so on. */
+	u8 ignore_pch_fifo_underrun;
+
 	int num_pipe;
 	int num_pch_pll;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 703a08a..7497589 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -659,10 +659,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
 	if (pch_iir & (SDE_TRANSB_CRC_ERR | SDE_TRANSA_CRC_ERR))
 		DRM_DEBUG_DRIVER("PCH transcoder CRC error interrupt\n");
 
-	if (pch_iir & SDE_TRANSB_FIFO_UNDER)
-		DRM_DEBUG_DRIVER("PCH transcoder B underrun interrupt\n");
-	if (pch_iir & SDE_TRANSA_FIFO_UNDER)
-		DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
+	if ((pch_iir & SDE_TRANSB_FIFO_UNDER) &&
+	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
+		DRM_DEBUG_DRIVER("PCH transcoder B underrun\n");
+		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
+	}
+
+	if ((pch_iir & SDE_TRANSA_FIFO_UNDER) &&
+	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
+		DRM_DEBUG_DRIVER("PCH transcoder A underrun\n");
+		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
+	}
 }
 
 static void err_int_handler(struct drm_device *dev)
@@ -684,6 +691,24 @@ static void serr_int_handler(struct drm_device *dev)
 	if (serr_int & SERR_INT_POISON)
 		DRM_ERROR("PCH poison interrupt\n");
 
+	if ((serr_int & SERR_INT_TRANS_A_FIFO_UNDERRUN) &&
+	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
+		DRM_ERROR("PCH transcoder A FIFO underrun\n");
+		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
+	}
+
+	if ((serr_int & SERR_INT_TRANS_B_FIFO_UNDERRUN) &&
+	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
+		DRM_ERROR("PCH transcoder B FIFO underrun\n");
+		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
+	}
+
+	if ((serr_int & SERR_INT_TRANS_C_FIFO_UNDERRUN) &&
+	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_C))) {
+		DRM_ERROR("PCH transcoder C FIFO underrun\n");
+		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_C);
+	}
+
 	I915_WRITE(SERR_INT, serr_int);
 }
 
@@ -2000,7 +2025,9 @@ static void ibx_irq_postinstall(struct drm_device *dev)
 		mask = SDE_HOTPLUG_MASK |
 		       SDE_GMBUS |
 		       SDE_AUX_MASK |
-		       SDE_POISON;
+		       SDE_POISON |
+		       SDE_TRANSB_FIFO_UNDER |
+		       SDE_TRANSA_FIFO_UNDER;
 	} else {
 		mask = SDE_HOTPLUG_MASK_CPT |
 		       SDE_GMBUS_CPT |
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index f22e27d..d565bd7 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3554,8 +3554,11 @@
 #define SDEIIR  0xc4008
 #define SDEIER  0xc400c
 
-#define SERR_INT		0xc4040
-#define  SERR_INT_POISON	(1 << 31)
+#define SERR_INT			0xc4040
+#define  SERR_INT_POISON		(1 << 31)
+#define  SERR_INT_TRANS_C_FIFO_UNDERRUN	(1 << 6)
+#define  SERR_INT_TRANS_B_FIFO_UNDERRUN	(1 << 3)
+#define  SERR_INT_TRANS_A_FIFO_UNDERRUN	(1 << 0)
 
 /* digital port hotplug */
 #define PCH_PORT_HOTPLUG        0xc4030		/* SHOTPLUG_CTL */
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index d75c6a0..67bfb58 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3274,6 +3274,9 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
 		return;
 
 	intel_crtc->active = true;
+
+	dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
+
 	intel_update_watermarks(dev);
 
 	if (intel_pipe_has_type(crtc, INTEL_OUTPUT_LVDS)) {
@@ -3366,11 +3369,15 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
 		return;
 
 	intel_crtc->active = true;
-	intel_update_watermarks(dev);
 
 	is_pch_port = haswell_crtc_driving_pch(crtc);
 
 	if (is_pch_port)
+		dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
+
+	intel_update_watermarks(dev);
+
+	if (is_pch_port)
 		dev_priv->display.fdi_link_train(crtc);
 
 	for_each_encoder_on_crtc(dev, crtc, encoder)
@@ -3453,6 +3460,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
 	if (dev_priv->cfb_plane == plane)
 		intel_disable_fbc(dev);
 
+	dev_priv->ignore_pch_fifo_underrun |= (1 << pipe);
 	intel_disable_pipe(dev_priv, pipe);
 
 	/* Disable PF */
@@ -3466,6 +3474,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
 	ironlake_fdi_disable(crtc);
 
 	ironlake_disable_pch_transcoder(dev_priv, pipe);
+	dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
 
 	if (HAS_PCH_CPT(dev)) {
 		/* disable TRANS_DP_CTL */
@@ -3535,6 +3544,8 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
 	if (dev_priv->cfb_plane == plane)
 		intel_disable_fbc(dev);
 
+	if (is_pch_port)
+		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
 	intel_disable_pipe(dev_priv, pipe);
 
 	intel_ddi_disable_transcoder_func(dev_priv, cpu_transcoder);
@@ -3551,6 +3562,7 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
 
 	if (is_pch_port) {
 		lpt_disable_pch_transcoder(dev_priv);
+		dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
 		intel_ddi_fdi_disable(crtc);
 	}
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 09/10] drm/i915: print CPU FIFO underruns
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
                   ` (7 preceding siblings ...)
  2013-02-08 19:35 ` [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-08 19:35 ` [PATCH 10/10] drm/i915: also POSTING_READ(DEIER) on ivybridge_irq_handler Paulo Zanoni
  9 siblings, 0 replies; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

Just like the PCH FIFO underruns, except that there's no place
where we expect it to happen.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |    2 ++
 drivers/gpu/drm/i915/i915_irq.c      |   33 ++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_reg.h      |    3 +++
 drivers/gpu/drm/i915/intel_display.c |    2 ++
 4 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e96d75e..f614ea8 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -909,6 +909,8 @@ typedef struct drm_i915_private {
 	struct work_struct hotplug_work;
 	bool enable_hotplug_processing;
 
+	/* Bit 0: pipe A and son on. */
+	u8 ignore_cpu_fifo_underrun;
 	/* Bit 0: PCH transcoder A and so on. */
 	u8 ignore_pch_fifo_underrun;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7497589..09bd8d4 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -680,6 +680,24 @@ static void err_int_handler(struct drm_device *dev)
 	if (err_int & ERR_INT_POISON)
 		DRM_ERROR("Poison interrupt\n");
 
+	if ((err_int & ERR_INT_FIFO_UNDERRUN_A) &&
+	    !(dev_priv->ignore_cpu_fifo_underrun & (1 << PIPE_A))) {
+		DRM_ERROR("Pipe A FIFO underrun\n");
+		dev_priv->ignore_cpu_fifo_underrun |= (1 << PIPE_A);
+	}
+
+	if ((err_int & ERR_INT_FIFO_UNDERRUN_B) &&
+	    !(dev_priv->ignore_cpu_fifo_underrun & (1 << PIPE_B))) {
+		DRM_ERROR("Pipe B FIFO underrun\n");
+		dev_priv->ignore_cpu_fifo_underrun |= (1 << PIPE_B);
+	}
+
+	if ((err_int & ERR_INT_FIFO_UNDERRUN_C) &&
+	    !(dev_priv->ignore_cpu_fifo_underrun & (1 << PIPE_C))) {
+		DRM_ERROR("Pipe C FIFO underrun\n");
+		dev_priv->ignore_cpu_fifo_underrun |= (1 << PIPE_C);
+	}
+
 	I915_WRITE(GEN7_ERR_INT, err_int);
 }
 
@@ -901,6 +919,18 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
 	if (de_iir & DE_POISON)
 		DRM_ERROR("Poison interrupt\n");
 
+	if ((de_iir & DE_PIPEA_FIFO_UNDERRUN) &&
+	    !(dev_priv->ignore_cpu_fifo_underrun & (1 << PIPE_A))) {
+		DRM_ERROR("Pipe A FIFO underrun\n");
+		dev_priv->ignore_cpu_fifo_underrun |= (1 << PIPE_A);
+	}
+
+	if ((de_iir & DE_PIPEB_FIFO_UNDERRUN) &&
+	    !(dev_priv->ignore_cpu_fifo_underrun & (1 << PIPE_B))) {
+		DRM_ERROR("Pipe B FIFO underrun\n");
+		dev_priv->ignore_cpu_fifo_underrun |= (1 << PIPE_B);
+	}
+
 	if (de_iir & DE_PLANEA_FLIP_DONE) {
 		intel_prepare_page_flip(dev, 0);
 		intel_finish_page_flip_plane(dev, 0);
@@ -2051,7 +2081,8 @@ static int ironlake_irq_postinstall(struct drm_device *dev)
 	/* enable kind of interrupts always enabled */
 	u32 display_mask = DE_MASTER_IRQ_CONTROL | DE_GSE | DE_PCH_EVENT |
 			   DE_PLANEA_FLIP_DONE | DE_PLANEB_FLIP_DONE |
-			   DE_AUX_CHANNEL_A | DE_POISON;
+			   DE_AUX_CHANNEL_A | DE_POISON |
+			   DE_PIPEB_FIFO_UNDERRUN | DE_PIPEA_FIFO_UNDERRUN;
 	u32 render_irqs;
 
 	dev_priv->irq_mask = ~display_mask;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d565bd7..c0db8b4 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -521,6 +521,9 @@
 #define GEN7_ERR_INT	0x44040
 #define   ERR_INT_POISON		(1<<31)
 #define   ERR_INT_MMIO_UNCLAIMED	(1<<13)
+#define   ERR_INT_FIFO_UNDERRUN_C	(1<<6)
+#define   ERR_INT_FIFO_UNDERRUN_B	(1<<3)
+#define   ERR_INT_FIFO_UNDERRUN_A	(1<<0)
 
 #define FPGA_DBG		0x42300
 #define FPGA_DBG_RM_NOCLAIM	(1<<31)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 67bfb58..030a8d3 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3275,6 +3275,7 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
 
 	intel_crtc->active = true;
 
+	dev_priv->ignore_cpu_fifo_underrun &= ~(1 << pipe);
 	dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
 
 	intel_update_watermarks(dev);
@@ -3372,6 +3373,7 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
 
 	is_pch_port = haswell_crtc_driving_pch(crtc);
 
+	dev_priv->ignore_cpu_fifo_underrun &= ~(1 << pipe);
 	if (is_pch_port)
 		dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 10/10] drm/i915: also POSTING_READ(DEIER) on ivybridge_irq_handler
  2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
                   ` (8 preceding siblings ...)
  2013-02-08 19:35 ` [PATCH 09/10] drm/i915: print CPU FIFO underruns Paulo Zanoni
@ 2013-02-08 19:35 ` Paulo Zanoni
  2013-02-09 17:34   ` Ben Widawsky
  9 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:35 UTC (permalink / raw)
  To: intel-gfx; +Cc: Paulo Zanoni

From: Paulo Zanoni <paulo.r.zanoni@intel.com>

This is already done on ironlake_irq_handler. We want to make sure the
interrupts are disabled before we check any of the other interrupt
registers.

Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 drivers/gpu/drm/i915/i915_irq.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 09bd8d4..e9a6ade 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -786,6 +786,7 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
 	/* disable master interrupt before clearing iir  */
 	de_ier = I915_READ(DEIER);
 	I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
+	POSTING_READ(DEIER);
 
 	/* On Haswell, also mask ERR_INT because we don't want to risk
 	 * generating "unclaimed register" interrupts from inside the interrupt
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
  2013-02-08 19:35 ` [PATCH 07/10] drm/i915: print Gen5+ CPU " Paulo Zanoni
@ 2013-02-08 19:42   ` Jesse Barnes
  2013-02-08 19:54     ` Paulo Zanoni
  2013-02-09 17:30     ` Ben Widawsky
  0 siblings, 2 replies; 31+ messages in thread
From: Jesse Barnes @ 2013-02-08 19:42 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri,  8 Feb 2013 17:35:18 -0200
Paulo Zanoni <przanoni@gmail.com> wrote:

> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> On ILK/SNB all we need to do is to enable the "poison" bit, but on
> IVB/HSW we need to enable the CPU error interrupt register, which is
> responsible not only for poison interrupts, but also other things.
> This includes the "unclaimed register" interrupt, so on the IVB irq
> handler we now need to: (i) check whether the interrupt was triggered by an
> unclaimed register and (ii) mask the error interrupt bit so we don't
> risk generating "unclaimed register" interrupts form inside the
> interrupt handler.
> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---

OTOH there's nothing the user can do about it... so we might do a
WARN_ONCE or something here instead.  But even then, I'm not sure
there's much *we* can do about these, as they indicate a corruption in
the communication between the CPU and PCH.

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
  2013-02-08 19:42   ` Jesse Barnes
@ 2013-02-08 19:54     ` Paulo Zanoni
  2013-02-08 20:01       ` Jesse Barnes
  2013-02-09 17:30     ` Ben Widawsky
  1 sibling, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-08 19:54 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: intel-gfx, Paulo Zanoni

Hi

2013/2/8 Jesse Barnes <jbarnes@virtuousgeek.org>:
> On Fri,  8 Feb 2013 17:35:18 -0200
> Paulo Zanoni <przanoni@gmail.com> wrote:
>
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> On ILK/SNB all we need to do is to enable the "poison" bit, but on
>> IVB/HSW we need to enable the CPU error interrupt register, which is
>> responsible not only for poison interrupts, but also other things.
>> This includes the "unclaimed register" interrupt, so on the IVB irq
>> handler we now need to: (i) check whether the interrupt was triggered by an
>> unclaimed register and (ii) mask the error interrupt bit so we don't
>> risk generating "unclaimed register" interrupts form inside the
>> interrupt handler.
>>
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> ---
>
> OTOH there's nothing the user can do about it... so we might do a
> WARN_ONCE or something here instead.

Well, so far I haven't seen the message. If we conclude it happens
*too much*, then we can use WARN_ONCE.

> But even then, I'm not sure
> there's much *we* can do about these, as they indicate a corruption in
> the communication between the CPU and PCH.

At least if we get the message we may be able to understand and/or
reproduce the problems. So far we don't even know whether the problem
is happening or not... And when there's a display bug, we don't know
if it's caused by "poison".

>
> --
> Jesse Barnes, Intel Open Source Technology Center



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
  2013-02-08 19:54     ` Paulo Zanoni
@ 2013-02-08 20:01       ` Jesse Barnes
  0 siblings, 0 replies; 31+ messages in thread
From: Jesse Barnes @ 2013-02-08 20:01 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri, 8 Feb 2013 17:54:23 -0200
Paulo Zanoni <przanoni@gmail.com> wrote:

> Hi
> 
> 2013/2/8 Jesse Barnes <jbarnes@virtuousgeek.org>:
> > On Fri,  8 Feb 2013 17:35:18 -0200
> > Paulo Zanoni <przanoni@gmail.com> wrote:
> >
> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>
> >> On ILK/SNB all we need to do is to enable the "poison" bit, but on
> >> IVB/HSW we need to enable the CPU error interrupt register, which is
> >> responsible not only for poison interrupts, but also other things.
> >> This includes the "unclaimed register" interrupt, so on the IVB irq
> >> handler we now need to: (i) check whether the interrupt was triggered by an
> >> unclaimed register and (ii) mask the error interrupt bit so we don't
> >> risk generating "unclaimed register" interrupts form inside the
> >> interrupt handler.
> >>
> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >> ---
> >
> > OTOH there's nothing the user can do about it... so we might do a
> > WARN_ONCE or something here instead.
> 
> Well, so far I haven't seen the message. If we conclude it happens
> *too much*, then we can use WARN_ONCE.
> 
> > But even then, I'm not sure
> > there's much *we* can do about these, as they indicate a corruption in
> > the communication between the CPU and PCH.
> 
> At least if we get the message we may be able to understand and/or
> reproduce the problems. So far we don't even know whether the problem
> is happening or not... And when there's a display bug, we don't know
> if it's caused by "poison".

Ok I guess the DRM_ERROR won't hurt if/until we see reports.  Then we
can dig in and see if keeping the message makes sense or not.

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init
  2013-02-08 19:35 ` [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init Paulo Zanoni
@ 2013-02-09 17:19   ` Ben Widawsky
  2013-02-14 20:26     ` Paulo Zanoni
  0 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-02-09 17:19 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri,  8 Feb 2013 17:35:14 -0200
Paulo Zanoni <przanoni@gmail.com> wrote:

> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> Otherwise, if the BIOS did anything wrong, our first I915_{WRITE,READ}
> will give us "unclaimed register"  messages.
> 
> V2: Even earlier.
> 
> Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=58897
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

I really wish we were allowed to call Haswell something like gen7.x, so
we can do INTEL_INFO(dev)->gen > 7

Also, I would have cleared all the bits in the register, not just
NOCLAIM.

Either way it's
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[snip]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/10] drm/i915: add ibx_irq_postinstall
  2013-02-08 19:35 ` [PATCH 04/10] drm/i915: add ibx_irq_postinstall Paulo Zanoni
@ 2013-02-09 17:27   ` Ben Widawsky
  2013-02-09 19:07     ` Daniel Vetter
  0 siblings, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-02-09 17:27 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri,  8 Feb 2013 17:35:15 -0200
Paulo Zanoni <przanoni@gmail.com> wrote:

> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> So we can remove duplicated code. Note that this function is used not
> only on IBX, but also CPT and LPT.
> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[snip]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
  2013-02-08 19:42   ` Jesse Barnes
  2013-02-08 19:54     ` Paulo Zanoni
@ 2013-02-09 17:30     ` Ben Widawsky
  2013-02-14 20:35       ` Paulo Zanoni
  1 sibling, 1 reply; 31+ messages in thread
From: Ben Widawsky @ 2013-02-09 17:30 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: intel-gfx, Paulo Zanoni

On Fri, 8 Feb 2013 11:42:39 -0800
Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> On Fri,  8 Feb 2013 17:35:18 -0200
> Paulo Zanoni <przanoni@gmail.com> wrote:
> 
> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > 
> > On ILK/SNB all we need to do is to enable the "poison" bit, but on
> > IVB/HSW we need to enable the CPU error interrupt register, which is
> > responsible not only for poison interrupts, but also other things.
> > This includes the "unclaimed register" interrupt, so on the IVB irq
> > handler we now need to: (i) check whether the interrupt was
> > triggered by an unclaimed register and (ii) mask the error
> > interrupt bit so we don't risk generating "unclaimed register"
> > interrupts form inside the interrupt handler.
> > 
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > ---
> 
> OTOH there's nothing the user can do about it... so we might do a
> WARN_ONCE or something here instead.  But even then, I'm not sure
> there's much *we* can do about these, as they indicate a corruption in
> the communication between the CPU and PCH.
> 

I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE
isn't helpful since the backtrace wouldn't really be meaningful.

If OTOH, you wanted to save away this information into error state; I
could get behind that.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 10/10] drm/i915: also POSTING_READ(DEIER) on ivybridge_irq_handler
  2013-02-08 19:35 ` [PATCH 10/10] drm/i915: also POSTING_READ(DEIER) on ivybridge_irq_handler Paulo Zanoni
@ 2013-02-09 17:34   ` Ben Widawsky
  0 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-02-09 17:34 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri,  8 Feb 2013 17:35:21 -0200
Paulo Zanoni <przanoni@gmail.com> wrote:

> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> This is already done on ironlake_irq_handler. We want to make sure the
> interrupts are disabled before we check any of the other interrupt
> registers.
> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_irq.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c
> b/drivers/gpu/drm/i915/i915_irq.c index 09bd8d4..e9a6ade 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -786,6 +786,7 @@ static irqreturn_t ivybridge_irq_handler(int irq,
> void *arg) /* disable master interrupt before clearing iir  */
>  	de_ier = I915_READ(DEIER);
>  	I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
> +	POSTING_READ(DEIER);
>  
>  	/* On Haswell, also mask ERR_INT because we don't want to
> risk
>  	 * generating "unclaimed register" interrupts from inside
> the interrupt


The POSTING_READ isn't needed because the very next line will serve as
a posting read, assuming that hasn't changed throughout the series, but
it still looks to be the case as of patch 5.

Also note, the only thing DEIER should effect is the propagation of the
interrupt to the CPU, and not the status bits in registers.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/10] drm/i915: add ibx_irq_postinstall
  2013-02-09 17:27   ` Ben Widawsky
@ 2013-02-09 19:07     ` Daniel Vetter
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Vetter @ 2013-02-09 19:07 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Paulo Zanoni

On Sat, Feb 09, 2013 at 09:27:22AM -0800, Ben Widawsky wrote:
> On Fri,  8 Feb 2013 17:35:15 -0200
> Paulo Zanoni <przanoni@gmail.com> wrote:
> 
> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > 
> > So we can remove duplicated code. Note that this function is used not
> > only on IBX, but also CPT and LPT.
> > 
> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> [snip]
Queued for -next, thanks for the patch. I've applied a tiny bikeshed on
top to rename the hotplug enable function for the pch to also use the ibx
prefix. My ocd simply couldn't resist ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 05/10] drm/i915: also disable south interrupts when handling them
  2013-02-08 19:35 ` [PATCH 05/10] drm/i915: also disable south interrupts when handling them Paulo Zanoni
@ 2013-02-09 19:29   ` Daniel Vetter
  2013-02-20 20:06     ` Paulo Zanoni
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2013-02-09 19:29 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri, Feb 08, 2013 at 05:35:16PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> From the docs:
>   "Only the rising edge of the PCH Display interrupt will cause the
>   North Display IIR (DEIIR) PCH Display Interrupt even bit to be set,
>   so all PCH Display Interrupts, including back to back interrupts,
>   must be cleared before a new PCH Display interrupt can cause DEIIR
>   to be set".
> 
> The current code works fine because we don't get many interrupts, but
> if we enable the PCH FIFO underrun interrupts we'll start getting so
> many interrupts that at some point new PCH interrupts won't cause
> DEIIR to be set.
> 
> The initial implementation I tried was to turn the code that checks
> SDEIIR into a loop, but we can still get interrupts even after the
> loop is done (and before the irq handler finishes), so we have to
> either disable the interrupts or mask them. In the end I concluded
> that just disabling the PCH interrupts is enough, you don't even need
> the loop, so this is what this patch implements. I've tested it and it
> passes the 2 "PCH FIFO underrun interrupt storms" I can reproduce:
> the "ironlake_crtc_disable" case and the "wrong watermarks" case.
> 
> In other words, here's how to reproduce the problem fixed by this
> patch:
>   1 - Enable PCH FIFO underrun interrupts (SERR_INT on SNB+)
>   2 - Boot the machine
>   3 - While booting we'll get tons of PCH FIFO underrun interrupts
>   4 - Plug a new monitor
>   5 - Run xrandr, notice it won't detect the new monitor
>   6 - Read SDEIIR and notice it's not 0 while DEIIR is 0
> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

This smells fishy. Looking at the code, clearing the SDEIIR before the
DEIIR looks strange. If we immediately get another south interrupt, it
will propagate to the DEIIR. But since we currently processing south
interrupts the corresponding bit in DEIIR is set, so when we clear DEIIR
later on, we kill that south interrupt. And since it's then once lost,
we'll lose all subsequent ones, too.

So have you tried clearing DEIIR before clearing any of the subordinate
registers?

The patch itself is imo wrong, since afaik the hw will drop any interrupts
if we disable them. Hence any south interrupt happening while we are in
the irq handler will effectively be lost.

And for the underrun reporting irq storm, I guess we need to first have
infrastructure to block out interrupts while the pipe is going down before
we can enable them. Assuming that pipe underruns are expected in that
case.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_irq.c |   16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index f096ad9..500fd65 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -701,7 +701,7 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>  {
>  	struct drm_device *dev = (struct drm_device *) arg;
>  	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
> -	u32 de_iir, gt_iir, de_ier, pm_iir;
> +	u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
>  	irqreturn_t ret = IRQ_NONE;
>  	int i;
>  
> @@ -711,6 +711,10 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>  	de_ier = I915_READ(DEIER);
>  	I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
>  
> +	sde_ier = I915_READ(SDEIER);
> +	I915_WRITE(SDEIER, 0);
> +	POSTING_READ(SDEIER);
> +
>  	gt_iir = I915_READ(GTIIR);
>  	if (gt_iir) {
>  		snb_gt_irq_handler(dev, dev_priv, gt_iir);
> @@ -759,6 +763,8 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>  
>  	I915_WRITE(DEIER, de_ier);
>  	POSTING_READ(DEIER);
> +	I915_WRITE(SDEIER, sde_ier);
> +	POSTING_READ(SDEIER);
>  
>  	return ret;
>  }
> @@ -778,7 +784,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>  	struct drm_device *dev = (struct drm_device *) arg;
>  	drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>  	int ret = IRQ_NONE;
> -	u32 de_iir, gt_iir, de_ier, pm_iir;
> +	u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
>  
>  	atomic_inc(&dev_priv->irq_received);
>  
> @@ -787,6 +793,10 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>  	I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
>  	POSTING_READ(DEIER);
>  
> +	sde_ier = I915_READ(SDEIER);
> +	I915_WRITE(SDEIER, 0);
> +	POSTING_READ(SDEIER);
> +
>  	de_iir = I915_READ(DEIIR);
>  	gt_iir = I915_READ(GTIIR);
>  	pm_iir = I915_READ(GEN6_PMIIR);
> @@ -849,6 +859,8 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>  done:
>  	I915_WRITE(DEIER, de_ier);
>  	POSTING_READ(DEIER);
> +	I915_WRITE(SDEIER, sde_ier);
> +	POSTING_READ(SDEIER);
>  
>  	return ret;
>  }
> -- 
> 1.7.10.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts
  2013-02-08 19:35 ` [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts Paulo Zanoni
@ 2013-02-09 19:43   ` Daniel Vetter
  2013-02-14 20:53     ` Paulo Zanoni
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2013-02-09 19:43 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri, Feb 08, 2013 at 05:35:19PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> Also add an "ignore" bit that avoids printing the message in two
> cases:
>   - When the message is in fact expected.
>   - After we get the first message. In tihs case, we expect to get
>     hundreds of consecutive messages, so we just ignore all the
>     subsequent messages until the next crtc_enable.
> 
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

Ah, here's the "block underrun reporting stuff". You need to set up this
infrastructure first, then enable the error interrupts. Otherwise a poor
bloke will hit the irq strom in between these two patches in a bisect and
die from a heart attack when looking at dmesg.

Also, you update the ignore_bla from the irq handler, so this needs proper
locking. And I vote for the bikeshed to be moved into struct intel_crtc,
e.g.

struct intel_crtc {
	...
	
	struct spinlock underrun_lock;
	int underrun_reporting_enabled;
}

Then smash a little helper into every crtc_enable/disable function. Aside:
you need to use the spin_lock_irqsafe/restore functions in the irq handler
(and for safety also in the helpers).  Another curious question: Why do we
need separate ignore bits for the cpu and the pch?
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h      |    3 +++
>  drivers/gpu/drm/i915/i915_irq.c      |   37 +++++++++++++++++++++++++++++-----
>  drivers/gpu/drm/i915/i915_reg.h      |    7 +++++--
>  drivers/gpu/drm/i915/intel_display.c |   14 ++++++++++++-
>  4 files changed, 53 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 08c5def..e96d75e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -909,6 +909,9 @@ typedef struct drm_i915_private {
>  	struct work_struct hotplug_work;
>  	bool enable_hotplug_processing;
>  
> +	/* Bit 0: PCH transcoder A and so on. */
> +	u8 ignore_pch_fifo_underrun;
> +
>  	int num_pipe;
>  	int num_pch_pll;
>  
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 703a08a..7497589 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -659,10 +659,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
>  	if (pch_iir & (SDE_TRANSB_CRC_ERR | SDE_TRANSA_CRC_ERR))
>  		DRM_DEBUG_DRIVER("PCH transcoder CRC error interrupt\n");
>  
> -	if (pch_iir & SDE_TRANSB_FIFO_UNDER)
> -		DRM_DEBUG_DRIVER("PCH transcoder B underrun interrupt\n");
> -	if (pch_iir & SDE_TRANSA_FIFO_UNDER)
> -		DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
> +	if ((pch_iir & SDE_TRANSB_FIFO_UNDER) &&
> +	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
> +		DRM_DEBUG_DRIVER("PCH transcoder B underrun\n");
> +		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
> +	}
> +
> +	if ((pch_iir & SDE_TRANSA_FIFO_UNDER) &&
> +	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
> +		DRM_DEBUG_DRIVER("PCH transcoder A underrun\n");
> +		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
> +	}
>  }
>  
>  static void err_int_handler(struct drm_device *dev)
> @@ -684,6 +691,24 @@ static void serr_int_handler(struct drm_device *dev)
>  	if (serr_int & SERR_INT_POISON)
>  		DRM_ERROR("PCH poison interrupt\n");
>  
> +	if ((serr_int & SERR_INT_TRANS_A_FIFO_UNDERRUN) &&
> +	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
> +		DRM_ERROR("PCH transcoder A FIFO underrun\n");
> +		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
> +	}
> +
> +	if ((serr_int & SERR_INT_TRANS_B_FIFO_UNDERRUN) &&
> +	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
> +		DRM_ERROR("PCH transcoder B FIFO underrun\n");
> +		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
> +	}
> +
> +	if ((serr_int & SERR_INT_TRANS_C_FIFO_UNDERRUN) &&
> +	    !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_C))) {
> +		DRM_ERROR("PCH transcoder C FIFO underrun\n");
> +		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_C);
> +	}
> +
>  	I915_WRITE(SERR_INT, serr_int);
>  }
>  
> @@ -2000,7 +2025,9 @@ static void ibx_irq_postinstall(struct drm_device *dev)
>  		mask = SDE_HOTPLUG_MASK |
>  		       SDE_GMBUS |
>  		       SDE_AUX_MASK |
> -		       SDE_POISON;
> +		       SDE_POISON |
> +		       SDE_TRANSB_FIFO_UNDER |
> +		       SDE_TRANSA_FIFO_UNDER;
>  	} else {
>  		mask = SDE_HOTPLUG_MASK_CPT |
>  		       SDE_GMBUS_CPT |
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index f22e27d..d565bd7 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -3554,8 +3554,11 @@
>  #define SDEIIR  0xc4008
>  #define SDEIER  0xc400c
>  
> -#define SERR_INT		0xc4040
> -#define  SERR_INT_POISON	(1 << 31)
> +#define SERR_INT			0xc4040
> +#define  SERR_INT_POISON		(1 << 31)
> +#define  SERR_INT_TRANS_C_FIFO_UNDERRUN	(1 << 6)
> +#define  SERR_INT_TRANS_B_FIFO_UNDERRUN	(1 << 3)
> +#define  SERR_INT_TRANS_A_FIFO_UNDERRUN	(1 << 0)
>  
>  /* digital port hotplug */
>  #define PCH_PORT_HOTPLUG        0xc4030		/* SHOTPLUG_CTL */
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index d75c6a0..67bfb58 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3274,6 +3274,9 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
>  		return;
>  
>  	intel_crtc->active = true;
> +
> +	dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
> +
>  	intel_update_watermarks(dev);
>  
>  	if (intel_pipe_has_type(crtc, INTEL_OUTPUT_LVDS)) {
> @@ -3366,11 +3369,15 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
>  		return;
>  
>  	intel_crtc->active = true;
> -	intel_update_watermarks(dev);
>  
>  	is_pch_port = haswell_crtc_driving_pch(crtc);
>  
>  	if (is_pch_port)
> +		dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
> +
> +	intel_update_watermarks(dev);
> +
> +	if (is_pch_port)
>  		dev_priv->display.fdi_link_train(crtc);
>  
>  	for_each_encoder_on_crtc(dev, crtc, encoder)
> @@ -3453,6 +3460,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>  	if (dev_priv->cfb_plane == plane)
>  		intel_disable_fbc(dev);
>  
> +	dev_priv->ignore_pch_fifo_underrun |= (1 << pipe);
>  	intel_disable_pipe(dev_priv, pipe);
>  
>  	/* Disable PF */
> @@ -3466,6 +3474,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>  	ironlake_fdi_disable(crtc);
>  
>  	ironlake_disable_pch_transcoder(dev_priv, pipe);
> +	dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
>  
>  	if (HAS_PCH_CPT(dev)) {
>  		/* disable TRANS_DP_CTL */
> @@ -3535,6 +3544,8 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>  	if (dev_priv->cfb_plane == plane)
>  		intel_disable_fbc(dev);
>  
> +	if (is_pch_port)
> +		dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
>  	intel_disable_pipe(dev_priv, pipe);
>  
>  	intel_ddi_disable_transcoder_func(dev_priv, cpu_transcoder);
> @@ -3551,6 +3562,7 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>  
>  	if (is_pch_port) {
>  		lpt_disable_pch_transcoder(dev_priv);
> +		dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
>  		intel_ddi_fdi_disable(crtc);
>  	}
>  
> -- 
> 1.7.10.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init
  2013-02-09 17:19   ` Ben Widawsky
@ 2013-02-14 20:26     ` Paulo Zanoni
  2013-02-15  4:07       ` Ben Widawsky
  0 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-14 20:26 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Paulo Zanoni

Hi

2013/2/9 Ben Widawsky <ben@bwidawsk.net>:
> On Fri,  8 Feb 2013 17:35:14 -0200
> Paulo Zanoni <przanoni@gmail.com> wrote:
>
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> Otherwise, if the BIOS did anything wrong, our first I915_{WRITE,READ}
>> will give us "unclaimed register"  messages.
>>
>> V2: Even earlier.
>>
>> Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=58897
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> I really wish we were allowed to call Haswell something like gen7.x, so
> we can do INTEL_INFO(dev)->gen > 7

Like gen 70 and 75?

>
> Also, I would have cleared all the bits in the register, not just
> NOCLAIM.

I'm not so sure, the other bits have completely different purposes,
unrelated with the "unclaimed registers". I don't think it's a good
idea to zero bits that have nothing to do with the purpose of the
code.

>
> Either way it's
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>

Thanks for the review :)

> [snip]



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
  2013-02-09 17:30     ` Ben Widawsky
@ 2013-02-14 20:35       ` Paulo Zanoni
  2013-02-15  4:05         ` Ben Widawsky
  0 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-14 20:35 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: intel-gfx, Paulo Zanoni

Hi

2013/2/9 Ben Widawsky <ben@bwidawsk.net>:
> On Fri, 8 Feb 2013 11:42:39 -0800
> Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
>
>> On Fri,  8 Feb 2013 17:35:18 -0200
>> Paulo Zanoni <przanoni@gmail.com> wrote:
>>
>> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> >
>> > On ILK/SNB all we need to do is to enable the "poison" bit, but on
>> > IVB/HSW we need to enable the CPU error interrupt register, which is
>> > responsible not only for poison interrupts, but also other things.
>> > This includes the "unclaimed register" interrupt, so on the IVB irq
>> > handler we now need to: (i) check whether the interrupt was
>> > triggered by an unclaimed register and (ii) mask the error
>> > interrupt bit so we don't risk generating "unclaimed register"
>> > interrupts form inside the interrupt handler.
>> >
>> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> > ---
>>
>> OTOH there's nothing the user can do about it... so we might do a
>> WARN_ONCE or something here instead.  But even then, I'm not sure
>> there's much *we* can do about these, as they indicate a corruption in
>> the communication between the CPU and PCH.
>>
>
> I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE
> isn't helpful since the backtrace wouldn't really be meaningful.

Why isn't it helpful? Right now we don't even know whether this
problem happens or not, we're completely "blind" to a possible problem
that may be affecting us in some specific cases and we don't even
know. Knowing that it happens and how often it happens is IMHO
certainly better than closing our eyes and pretending it doesn't
exist.

>
> If OTOH, you wanted to save away this information into error state; I
> could get behind that.




-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts
  2013-02-09 19:43   ` Daniel Vetter
@ 2013-02-14 20:53     ` Paulo Zanoni
  2013-02-14 21:13       ` Daniel Vetter
  0 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-14 20:53 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Paulo Zanoni

Hi

2013/2/9 Daniel Vetter <daniel@ffwll.ch>:
> On Fri, Feb 08, 2013 at 05:35:19PM -0200, Paulo Zanoni wrote:
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> Also add an "ignore" bit that avoids printing the message in two
>> cases:
>>   - When the message is in fact expected.
>>   - After we get the first message. In tihs case, we expect to get
>>     hundreds of consecutive messages, so we just ignore all the
>>     subsequent messages until the next crtc_enable.
>>
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> Ah, here's the "block underrun reporting stuff". You need to set up this
> infrastructure first, then enable the error interrupts. Otherwise a poor
> bloke will hit the irq strom in between these two patches in a bisect and
> die from a heart attack when looking at dmesg.

Nope. This patch is also the patch that adds the dmesg message, so
without this patch there is no message to ignore. Without this patch
we still get the interrupts, but this shouldn't be a problem because
we don't print anything and because of patch 5.

>
> Also, you update the ignore_bla from the irq handler, so this needs proper
> locking.
Will fix.

> And I vote for the bikeshed to be moved into struct intel_crtc,
> e.g.

It may be a little slightly slower, since we'll have to look for the
crtc pointers on our structs. And on Haswell, checking which crtc is
using the PCH transcoder will be more complicated. I'll try to think
on a good solution.

My initial idea was to actually group all the "interrupt-related" bits
from dev_priv into an "interrupt struct", and put the ignore_fifo bits
there.

>
> struct intel_crtc {
>         ...
>
>         struct spinlock underrun_lock;
>         int underrun_reporting_enabled;
> }
>
> Then smash a little helper into every crtc_enable/disable function. Aside:
> you need to use the spin_lock_irqsafe/restore functions in the irq handler
> (and for safety also in the helpers).  Another curious question: Why do we
> need separate ignore bits for the cpu and the pch?

Because: (i) sometimes we get errors from one but not from the other;
(ii) during "pipe/transcoder disable" we only want to ignore the PCH
underruns, not the CPU underruns; (iii) on Haswell, there's no 1:1
mapping between CPU pipe and PCH transcoder.

> -Daniel
>
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h      |    3 +++
>>  drivers/gpu/drm/i915/i915_irq.c      |   37 +++++++++++++++++++++++++++++-----
>>  drivers/gpu/drm/i915/i915_reg.h      |    7 +++++--
>>  drivers/gpu/drm/i915/intel_display.c |   14 ++++++++++++-
>>  4 files changed, 53 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 08c5def..e96d75e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -909,6 +909,9 @@ typedef struct drm_i915_private {
>>       struct work_struct hotplug_work;
>>       bool enable_hotplug_processing;
>>
>> +     /* Bit 0: PCH transcoder A and so on. */
>> +     u8 ignore_pch_fifo_underrun;
>> +
>>       int num_pipe;
>>       int num_pch_pll;
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> index 703a08a..7497589 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -659,10 +659,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
>>       if (pch_iir & (SDE_TRANSB_CRC_ERR | SDE_TRANSA_CRC_ERR))
>>               DRM_DEBUG_DRIVER("PCH transcoder CRC error interrupt\n");
>>
>> -     if (pch_iir & SDE_TRANSB_FIFO_UNDER)
>> -             DRM_DEBUG_DRIVER("PCH transcoder B underrun interrupt\n");
>> -     if (pch_iir & SDE_TRANSA_FIFO_UNDER)
>> -             DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
>> +     if ((pch_iir & SDE_TRANSB_FIFO_UNDER) &&
>> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
>> +             DRM_DEBUG_DRIVER("PCH transcoder B underrun\n");
>> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
>> +     }
>> +
>> +     if ((pch_iir & SDE_TRANSA_FIFO_UNDER) &&
>> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
>> +             DRM_DEBUG_DRIVER("PCH transcoder A underrun\n");
>> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
>> +     }
>>  }
>>
>>  static void err_int_handler(struct drm_device *dev)
>> @@ -684,6 +691,24 @@ static void serr_int_handler(struct drm_device *dev)
>>       if (serr_int & SERR_INT_POISON)
>>               DRM_ERROR("PCH poison interrupt\n");
>>
>> +     if ((serr_int & SERR_INT_TRANS_A_FIFO_UNDERRUN) &&
>> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
>> +             DRM_ERROR("PCH transcoder A FIFO underrun\n");
>> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
>> +     }
>> +
>> +     if ((serr_int & SERR_INT_TRANS_B_FIFO_UNDERRUN) &&
>> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
>> +             DRM_ERROR("PCH transcoder B FIFO underrun\n");
>> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
>> +     }
>> +
>> +     if ((serr_int & SERR_INT_TRANS_C_FIFO_UNDERRUN) &&
>> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_C))) {
>> +             DRM_ERROR("PCH transcoder C FIFO underrun\n");
>> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_C);
>> +     }
>> +
>>       I915_WRITE(SERR_INT, serr_int);
>>  }
>>
>> @@ -2000,7 +2025,9 @@ static void ibx_irq_postinstall(struct drm_device *dev)
>>               mask = SDE_HOTPLUG_MASK |
>>                      SDE_GMBUS |
>>                      SDE_AUX_MASK |
>> -                    SDE_POISON;
>> +                    SDE_POISON |
>> +                    SDE_TRANSB_FIFO_UNDER |
>> +                    SDE_TRANSA_FIFO_UNDER;
>>       } else {
>>               mask = SDE_HOTPLUG_MASK_CPT |
>>                      SDE_GMBUS_CPT |
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index f22e27d..d565bd7 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -3554,8 +3554,11 @@
>>  #define SDEIIR  0xc4008
>>  #define SDEIER  0xc400c
>>
>> -#define SERR_INT             0xc4040
>> -#define  SERR_INT_POISON     (1 << 31)
>> +#define SERR_INT                     0xc4040
>> +#define  SERR_INT_POISON             (1 << 31)
>> +#define  SERR_INT_TRANS_C_FIFO_UNDERRUN      (1 << 6)
>> +#define  SERR_INT_TRANS_B_FIFO_UNDERRUN      (1 << 3)
>> +#define  SERR_INT_TRANS_A_FIFO_UNDERRUN      (1 << 0)
>>
>>  /* digital port hotplug */
>>  #define PCH_PORT_HOTPLUG        0xc4030              /* SHOTPLUG_CTL */
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index d75c6a0..67bfb58 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -3274,6 +3274,9 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
>>               return;
>>
>>       intel_crtc->active = true;
>> +
>> +     dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
>> +
>>       intel_update_watermarks(dev);
>>
>>       if (intel_pipe_has_type(crtc, INTEL_OUTPUT_LVDS)) {
>> @@ -3366,11 +3369,15 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
>>               return;
>>
>>       intel_crtc->active = true;
>> -     intel_update_watermarks(dev);
>>
>>       is_pch_port = haswell_crtc_driving_pch(crtc);
>>
>>       if (is_pch_port)
>> +             dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
>> +
>> +     intel_update_watermarks(dev);
>> +
>> +     if (is_pch_port)
>>               dev_priv->display.fdi_link_train(crtc);
>>
>>       for_each_encoder_on_crtc(dev, crtc, encoder)
>> @@ -3453,6 +3460,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>>       if (dev_priv->cfb_plane == plane)
>>               intel_disable_fbc(dev);
>>
>> +     dev_priv->ignore_pch_fifo_underrun |= (1 << pipe);
>>       intel_disable_pipe(dev_priv, pipe);
>>
>>       /* Disable PF */
>> @@ -3466,6 +3474,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>>       ironlake_fdi_disable(crtc);
>>
>>       ironlake_disable_pch_transcoder(dev_priv, pipe);
>> +     dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
>>
>>       if (HAS_PCH_CPT(dev)) {
>>               /* disable TRANS_DP_CTL */
>> @@ -3535,6 +3544,8 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>>       if (dev_priv->cfb_plane == plane)
>>               intel_disable_fbc(dev);
>>
>> +     if (is_pch_port)
>> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
>>       intel_disable_pipe(dev_priv, pipe);
>>
>>       intel_ddi_disable_transcoder_func(dev_priv, cpu_transcoder);
>> @@ -3551,6 +3562,7 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>>
>>       if (is_pch_port) {
>>               lpt_disable_pch_transcoder(dev_priv);
>> +             dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
>>               intel_ddi_fdi_disable(crtc);
>>       }
>>
>> --
>> 1.7.10.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts
  2013-02-14 20:53     ` Paulo Zanoni
@ 2013-02-14 21:13       ` Daniel Vetter
  2013-02-14 21:32         ` Paulo Zanoni
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Vetter @ 2013-02-14 21:13 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Thu, Feb 14, 2013 at 06:53:42PM -0200, Paulo Zanoni wrote:
> Hi
> 
> 2013/2/9 Daniel Vetter <daniel@ffwll.ch>:
> > On Fri, Feb 08, 2013 at 05:35:19PM -0200, Paulo Zanoni wrote:
> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>
> >> Also add an "ignore" bit that avoids printing the message in two
> >> cases:
> >>   - When the message is in fact expected.
> >>   - After we get the first message. In tihs case, we expect to get
> >>     hundreds of consecutive messages, so we just ignore all the
> >>     subsequent messages until the next crtc_enable.
> >>
> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >
> > Ah, here's the "block underrun reporting stuff". You need to set up this
> > infrastructure first, then enable the error interrupts. Otherwise a poor
> > bloke will hit the irq strom in between these two patches in a bisect and
> > die from a heart attack when looking at dmesg.
> 
> Nope. This patch is also the patch that adds the dmesg message, so
> without this patch there is no message to ignore. Without this patch
> we still get the interrupts, but this shouldn't be a problem because
> we don't print anything and because of patch 5.
> 
> >
> > Also, you update the ignore_bla from the irq handler, so this needs proper
> > locking.
> Will fix.
> 
> > And I vote for the bikeshed to be moved into struct intel_crtc,
> > e.g.
> 
> It may be a little slightly slower, since we'll have to look for the
> crtc pointers on our structs. And on Haswell, checking which crtc is
> using the PCH transcoder will be more complicated. I'll try to think
> on a good solution.

Imo speed doesn't matter at all - we shouldn't ever get these in the
interrupt handler under normal conditions. And if we get too many of those
while modesetting, so that we suck away cpu time, it's probably best to
just disable all underrun reporting while doing a modeset.

> My initial idea was to actually group all the "interrupt-related" bits
> from dev_priv into an "interrupt struct", and put the ignore_fifo bits
> there.
> 
> >
> > struct intel_crtc {
> >         ...
> >
> >         struct spinlock underrun_lock;
> >         int underrun_reporting_enabled;
> > }
> >
> > Then smash a little helper into every crtc_enable/disable function. Aside:
> > you need to use the spin_lock_irqsafe/restore functions in the irq handler
> > (and for safety also in the helpers).  Another curious question: Why do we
> > need separate ignore bits for the cpu and the pch?
> 
> Because: (i) sometimes we get errors from one but not from the other;
> (ii) during "pipe/transcoder disable" we only want to ignore the PCH
> underruns, not the CPU underruns; (iii) on Haswell, there's no 1:1
> mapping between CPU pipe and PCH transcoder.

Ok, I see your aim with separate underrun reporting disables for pch/cpu.
But I also fear that we'll have tons of fun with just enabling the
underrun interrupts once the pipe is up, so a fine-grained report
disabling scheme sounds a bit like overkill. At least until we've fixed
the first set of big bugs.

The other thing is that I think we should just disable the interrupts when
we expect them (modeset) to avoid blowing through too much cpu time. So
having just one knob for everything sounds simpler.

Wrt Haswell pch interrupts: Either loop through the crtc list and check
->pch_encoder or check the DDI E port. The former shouldn't be a locking
issue as long as you grab the right irqsafe spinlocks first and check
whether reporting is enabled, since locks serve as barriers, too.

The reason for all this "can we keep this generic" bikeshed is that I'd
like to enable underrun reporting for i9xx/vlv, too. So having a bit of
abstraction would help.
-Daniel

> 
> > -Daniel
> >
> >> ---
> >>  drivers/gpu/drm/i915/i915_drv.h      |    3 +++
> >>  drivers/gpu/drm/i915/i915_irq.c      |   37 +++++++++++++++++++++++++++++-----
> >>  drivers/gpu/drm/i915/i915_reg.h      |    7 +++++--
> >>  drivers/gpu/drm/i915/intel_display.c |   14 ++++++++++++-
> >>  4 files changed, 53 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >> index 08c5def..e96d75e 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -909,6 +909,9 @@ typedef struct drm_i915_private {
> >>       struct work_struct hotplug_work;
> >>       bool enable_hotplug_processing;
> >>
> >> +     /* Bit 0: PCH transcoder A and so on. */
> >> +     u8 ignore_pch_fifo_underrun;
> >> +
> >>       int num_pipe;
> >>       int num_pch_pll;
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >> index 703a08a..7497589 100644
> >> --- a/drivers/gpu/drm/i915/i915_irq.c
> >> +++ b/drivers/gpu/drm/i915/i915_irq.c
> >> @@ -659,10 +659,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
> >>       if (pch_iir & (SDE_TRANSB_CRC_ERR | SDE_TRANSA_CRC_ERR))
> >>               DRM_DEBUG_DRIVER("PCH transcoder CRC error interrupt\n");
> >>
> >> -     if (pch_iir & SDE_TRANSB_FIFO_UNDER)
> >> -             DRM_DEBUG_DRIVER("PCH transcoder B underrun interrupt\n");
> >> -     if (pch_iir & SDE_TRANSA_FIFO_UNDER)
> >> -             DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
> >> +     if ((pch_iir & SDE_TRANSB_FIFO_UNDER) &&
> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
> >> +             DRM_DEBUG_DRIVER("PCH transcoder B underrun\n");
> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
> >> +     }
> >> +
> >> +     if ((pch_iir & SDE_TRANSA_FIFO_UNDER) &&
> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
> >> +             DRM_DEBUG_DRIVER("PCH transcoder A underrun\n");
> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
> >> +     }
> >>  }
> >>
> >>  static void err_int_handler(struct drm_device *dev)
> >> @@ -684,6 +691,24 @@ static void serr_int_handler(struct drm_device *dev)
> >>       if (serr_int & SERR_INT_POISON)
> >>               DRM_ERROR("PCH poison interrupt\n");
> >>
> >> +     if ((serr_int & SERR_INT_TRANS_A_FIFO_UNDERRUN) &&
> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
> >> +             DRM_ERROR("PCH transcoder A FIFO underrun\n");
> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
> >> +     }
> >> +
> >> +     if ((serr_int & SERR_INT_TRANS_B_FIFO_UNDERRUN) &&
> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
> >> +             DRM_ERROR("PCH transcoder B FIFO underrun\n");
> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
> >> +     }
> >> +
> >> +     if ((serr_int & SERR_INT_TRANS_C_FIFO_UNDERRUN) &&
> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_C))) {
> >> +             DRM_ERROR("PCH transcoder C FIFO underrun\n");
> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_C);
> >> +     }
> >> +
> >>       I915_WRITE(SERR_INT, serr_int);
> >>  }
> >>
> >> @@ -2000,7 +2025,9 @@ static void ibx_irq_postinstall(struct drm_device *dev)
> >>               mask = SDE_HOTPLUG_MASK |
> >>                      SDE_GMBUS |
> >>                      SDE_AUX_MASK |
> >> -                    SDE_POISON;
> >> +                    SDE_POISON |
> >> +                    SDE_TRANSB_FIFO_UNDER |
> >> +                    SDE_TRANSA_FIFO_UNDER;
> >>       } else {
> >>               mask = SDE_HOTPLUG_MASK_CPT |
> >>                      SDE_GMBUS_CPT |
> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >> index f22e27d..d565bd7 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> @@ -3554,8 +3554,11 @@
> >>  #define SDEIIR  0xc4008
> >>  #define SDEIER  0xc400c
> >>
> >> -#define SERR_INT             0xc4040
> >> -#define  SERR_INT_POISON     (1 << 31)
> >> +#define SERR_INT                     0xc4040
> >> +#define  SERR_INT_POISON             (1 << 31)
> >> +#define  SERR_INT_TRANS_C_FIFO_UNDERRUN      (1 << 6)
> >> +#define  SERR_INT_TRANS_B_FIFO_UNDERRUN      (1 << 3)
> >> +#define  SERR_INT_TRANS_A_FIFO_UNDERRUN      (1 << 0)
> >>
> >>  /* digital port hotplug */
> >>  #define PCH_PORT_HOTPLUG        0xc4030              /* SHOTPLUG_CTL */
> >> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> >> index d75c6a0..67bfb58 100644
> >> --- a/drivers/gpu/drm/i915/intel_display.c
> >> +++ b/drivers/gpu/drm/i915/intel_display.c
> >> @@ -3274,6 +3274,9 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
> >>               return;
> >>
> >>       intel_crtc->active = true;
> >> +
> >> +     dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
> >> +
> >>       intel_update_watermarks(dev);
> >>
> >>       if (intel_pipe_has_type(crtc, INTEL_OUTPUT_LVDS)) {
> >> @@ -3366,11 +3369,15 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
> >>               return;
> >>
> >>       intel_crtc->active = true;
> >> -     intel_update_watermarks(dev);
> >>
> >>       is_pch_port = haswell_crtc_driving_pch(crtc);
> >>
> >>       if (is_pch_port)
> >> +             dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
> >> +
> >> +     intel_update_watermarks(dev);
> >> +
> >> +     if (is_pch_port)
> >>               dev_priv->display.fdi_link_train(crtc);
> >>
> >>       for_each_encoder_on_crtc(dev, crtc, encoder)
> >> @@ -3453,6 +3460,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
> >>       if (dev_priv->cfb_plane == plane)
> >>               intel_disable_fbc(dev);
> >>
> >> +     dev_priv->ignore_pch_fifo_underrun |= (1 << pipe);
> >>       intel_disable_pipe(dev_priv, pipe);
> >>
> >>       /* Disable PF */
> >> @@ -3466,6 +3474,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
> >>       ironlake_fdi_disable(crtc);
> >>
> >>       ironlake_disable_pch_transcoder(dev_priv, pipe);
> >> +     dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
> >>
> >>       if (HAS_PCH_CPT(dev)) {
> >>               /* disable TRANS_DP_CTL */
> >> @@ -3535,6 +3544,8 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
> >>       if (dev_priv->cfb_plane == plane)
> >>               intel_disable_fbc(dev);
> >>
> >> +     if (is_pch_port)
> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
> >>       intel_disable_pipe(dev_priv, pipe);
> >>
> >>       intel_ddi_disable_transcoder_func(dev_priv, cpu_transcoder);
> >> @@ -3551,6 +3562,7 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
> >>
> >>       if (is_pch_port) {
> >>               lpt_disable_pch_transcoder(dev_priv);
> >> +             dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
> >>               intel_ddi_fdi_disable(crtc);
> >>       }
> >>
> >> --
> >> 1.7.10.4
> >>
> >> _______________________________________________
> >> Intel-gfx mailing list
> >> Intel-gfx@lists.freedesktop.org
> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> 
> 
> 
> -- 
> Paulo Zanoni

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts
  2013-02-14 21:13       ` Daniel Vetter
@ 2013-02-14 21:32         ` Paulo Zanoni
  0 siblings, 0 replies; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-14 21:32 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Paulo Zanoni

Hi

2013/2/14 Daniel Vetter <daniel@ffwll.ch>:
> On Thu, Feb 14, 2013 at 06:53:42PM -0200, Paulo Zanoni wrote:
>> Hi
>>
>> 2013/2/9 Daniel Vetter <daniel@ffwll.ch>:
>> > On Fri, Feb 08, 2013 at 05:35:19PM -0200, Paulo Zanoni wrote:
>> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> >>
>> >> Also add an "ignore" bit that avoids printing the message in two
>> >> cases:
>> >>   - When the message is in fact expected.
>> >>   - After we get the first message. In tihs case, we expect to get
>> >>     hundreds of consecutive messages, so we just ignore all the
>> >>     subsequent messages until the next crtc_enable.
>> >>
>> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>> >
>> > Ah, here's the "block underrun reporting stuff". You need to set up this
>> > infrastructure first, then enable the error interrupts. Otherwise a poor
>> > bloke will hit the irq strom in between these two patches in a bisect and
>> > die from a heart attack when looking at dmesg.
>>
>> Nope. This patch is also the patch that adds the dmesg message, so
>> without this patch there is no message to ignore. Without this patch
>> we still get the interrupts, but this shouldn't be a problem because
>> we don't print anything and because of patch 5.
>>
>> >
>> > Also, you update the ignore_bla from the irq handler, so this needs proper
>> > locking.
>> Will fix.
>>
>> > And I vote for the bikeshed to be moved into struct intel_crtc,
>> > e.g.
>>
>> It may be a little slightly slower, since we'll have to look for the
>> crtc pointers on our structs. And on Haswell, checking which crtc is
>> using the PCH transcoder will be more complicated. I'll try to think
>> on a good solution.
>
> Imo speed doesn't matter at all - we shouldn't ever get these in the
> interrupt handler under normal conditions.

Nope. We get hundreds of interrupts at every crtc_disable. We are
*ignoring* them (by not printing anything), but we're still receiving
the interrupts (so we run the irq handler every time).

> And if we get too many of those
> while modesetting, so that we suck away cpu time, it's probably best to
> just disable all underrun reporting while doing a modeset.

With the current code I didn't notice and slowdowns, but didn't really
measure anything. Do you mean "disable all underrun reporting" (aka
don't print messages but still get the interrupts) or do you mean
"disable all the interrupts" (aka don't run the irq handler)?

The problem with disabling the interrupts is that you can't just
disable the "PCH FIFO underrun interrupts". You have to disable all
the SERR interrupts, so you may lose other possible error messages.
Same thing with the CPU on IVB/HSW.

>
>> My initial idea was to actually group all the "interrupt-related" bits
>> from dev_priv into an "interrupt struct", and put the ignore_fifo bits
>> there.
>>
>> >
>> > struct intel_crtc {
>> >         ...
>> >
>> >         struct spinlock underrun_lock;
>> >         int underrun_reporting_enabled;
>> > }
>> >
>> > Then smash a little helper into every crtc_enable/disable function. Aside:
>> > you need to use the spin_lock_irqsafe/restore functions in the irq handler
>> > (and for safety also in the helpers).  Another curious question: Why do we
>> > need separate ignore bits for the cpu and the pch?
>>
>> Because: (i) sometimes we get errors from one but not from the other;
>> (ii) during "pipe/transcoder disable" we only want to ignore the PCH
>> underruns, not the CPU underruns; (iii) on Haswell, there's no 1:1
>> mapping between CPU pipe and PCH transcoder.
>
> Ok, I see your aim with separate underrun reporting disables for pch/cpu.
> But I also fear that we'll have tons of fun with just enabling the
> underrun interrupts once the pipe is up, so a fine-grained report
> disabling scheme sounds a bit like overkill. At least until we've fixed
> the first set of big bugs.

What do you mean by "tons of fun"? As soon as we get the error
interrupt again we enable the "ignore" bit, so we'll see at most 1
message per mode set per pipe/transcoder.

>
> The other thing is that I think we should just disable the interrupts when
> we expect them (modeset) to avoid blowing through too much cpu time. So
> having just one knob for everything sounds simpler.

You mean definitely *disable* the interrupts? We can't just disable
the PCH FIFO underrun interrupts, we need to completely disable SERR,
which in turn disable all the other possible error messages (e.g.,
Poison, the other FIFO underruns from all the other pipes and also the
other error interrupts we're currently ignoring). Same thing with the
CPU FIFO underruns on IVB/Haswell: we need to disable GEN7_ERR_INT, so
we disable all the other interrupts it can generate.

>
> Wrt Haswell pch interrupts: Either loop through the crtc list and check
> ->pch_encoder or check the DDI E port. The former shouldn't be a locking
> issue as long as you grab the right irqsafe spinlocks first and check
> whether reporting is enabled, since locks serve as barriers, too.
>
> The reason for all this "can we keep this generic" bikeshed is that I'd
> like to enable underrun reporting for i9xx/vlv, too. So having a bit of
> abstraction would help.

I don't understand how the current code can't be used on i9xx/vlv too.


But based on this discussion, I think that instead of adding variables
that just ignore the interrupts (avoid printing the error messages
while still keeping the interrupts enabled) you want me to actually
completely disable the error interrupts as soon as we get the first
(this way we'll be forced to disable *all* the error interrupts on
*all* the pipes or transcoders). We could do this without adding
anything to dev_priv or intel_crtc, but the downside is that we'll be
blocking *all* the error messages at every crtc_disable. I can provide
a patch series for this, but it's not as "complete" as the current
series.

> -Daniel
>
>>
>> > -Daniel
>> >
>> >> ---
>> >>  drivers/gpu/drm/i915/i915_drv.h      |    3 +++
>> >>  drivers/gpu/drm/i915/i915_irq.c      |   37 +++++++++++++++++++++++++++++-----
>> >>  drivers/gpu/drm/i915/i915_reg.h      |    7 +++++--
>> >>  drivers/gpu/drm/i915/intel_display.c |   14 ++++++++++++-
>> >>  4 files changed, 53 insertions(+), 8 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> >> index 08c5def..e96d75e 100644
>> >> --- a/drivers/gpu/drm/i915/i915_drv.h
>> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> >> @@ -909,6 +909,9 @@ typedef struct drm_i915_private {
>> >>       struct work_struct hotplug_work;
>> >>       bool enable_hotplug_processing;
>> >>
>> >> +     /* Bit 0: PCH transcoder A and so on. */
>> >> +     u8 ignore_pch_fifo_underrun;
>> >> +
>> >>       int num_pipe;
>> >>       int num_pch_pll;
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> >> index 703a08a..7497589 100644
>> >> --- a/drivers/gpu/drm/i915/i915_irq.c
>> >> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> >> @@ -659,10 +659,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir)
>> >>       if (pch_iir & (SDE_TRANSB_CRC_ERR | SDE_TRANSA_CRC_ERR))
>> >>               DRM_DEBUG_DRIVER("PCH transcoder CRC error interrupt\n");
>> >>
>> >> -     if (pch_iir & SDE_TRANSB_FIFO_UNDER)
>> >> -             DRM_DEBUG_DRIVER("PCH transcoder B underrun interrupt\n");
>> >> -     if (pch_iir & SDE_TRANSA_FIFO_UNDER)
>> >> -             DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n");
>> >> +     if ((pch_iir & SDE_TRANSB_FIFO_UNDER) &&
>> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
>> >> +             DRM_DEBUG_DRIVER("PCH transcoder B underrun\n");
>> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
>> >> +     }
>> >> +
>> >> +     if ((pch_iir & SDE_TRANSA_FIFO_UNDER) &&
>> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
>> >> +             DRM_DEBUG_DRIVER("PCH transcoder A underrun\n");
>> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
>> >> +     }
>> >>  }
>> >>
>> >>  static void err_int_handler(struct drm_device *dev)
>> >> @@ -684,6 +691,24 @@ static void serr_int_handler(struct drm_device *dev)
>> >>       if (serr_int & SERR_INT_POISON)
>> >>               DRM_ERROR("PCH poison interrupt\n");
>> >>
>> >> +     if ((serr_int & SERR_INT_TRANS_A_FIFO_UNDERRUN) &&
>> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_A))) {
>> >> +             DRM_ERROR("PCH transcoder A FIFO underrun\n");
>> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
>> >> +     }
>> >> +
>> >> +     if ((serr_int & SERR_INT_TRANS_B_FIFO_UNDERRUN) &&
>> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_B))) {
>> >> +             DRM_ERROR("PCH transcoder B FIFO underrun\n");
>> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_B);
>> >> +     }
>> >> +
>> >> +     if ((serr_int & SERR_INT_TRANS_C_FIFO_UNDERRUN) &&
>> >> +         !(dev_priv->ignore_pch_fifo_underrun & (1 << TRANSCODER_C))) {
>> >> +             DRM_ERROR("PCH transcoder C FIFO underrun\n");
>> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_C);
>> >> +     }
>> >> +
>> >>       I915_WRITE(SERR_INT, serr_int);
>> >>  }
>> >>
>> >> @@ -2000,7 +2025,9 @@ static void ibx_irq_postinstall(struct drm_device *dev)
>> >>               mask = SDE_HOTPLUG_MASK |
>> >>                      SDE_GMBUS |
>> >>                      SDE_AUX_MASK |
>> >> -                    SDE_POISON;
>> >> +                    SDE_POISON |
>> >> +                    SDE_TRANSB_FIFO_UNDER |
>> >> +                    SDE_TRANSA_FIFO_UNDER;
>> >>       } else {
>> >>               mask = SDE_HOTPLUG_MASK_CPT |
>> >>                      SDE_GMBUS_CPT |
>> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> >> index f22e27d..d565bd7 100644
>> >> --- a/drivers/gpu/drm/i915/i915_reg.h
>> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> >> @@ -3554,8 +3554,11 @@
>> >>  #define SDEIIR  0xc4008
>> >>  #define SDEIER  0xc400c
>> >>
>> >> -#define SERR_INT             0xc4040
>> >> -#define  SERR_INT_POISON     (1 << 31)
>> >> +#define SERR_INT                     0xc4040
>> >> +#define  SERR_INT_POISON             (1 << 31)
>> >> +#define  SERR_INT_TRANS_C_FIFO_UNDERRUN      (1 << 6)
>> >> +#define  SERR_INT_TRANS_B_FIFO_UNDERRUN      (1 << 3)
>> >> +#define  SERR_INT_TRANS_A_FIFO_UNDERRUN      (1 << 0)
>> >>
>> >>  /* digital port hotplug */
>> >>  #define PCH_PORT_HOTPLUG        0xc4030              /* SHOTPLUG_CTL */
>> >> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> >> index d75c6a0..67bfb58 100644
>> >> --- a/drivers/gpu/drm/i915/intel_display.c
>> >> +++ b/drivers/gpu/drm/i915/intel_display.c
>> >> @@ -3274,6 +3274,9 @@ static void ironlake_crtc_enable(struct drm_crtc *crtc)
>> >>               return;
>> >>
>> >>       intel_crtc->active = true;
>> >> +
>> >> +     dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
>> >> +
>> >>       intel_update_watermarks(dev);
>> >>
>> >>       if (intel_pipe_has_type(crtc, INTEL_OUTPUT_LVDS)) {
>> >> @@ -3366,11 +3369,15 @@ static void haswell_crtc_enable(struct drm_crtc *crtc)
>> >>               return;
>> >>
>> >>       intel_crtc->active = true;
>> >> -     intel_update_watermarks(dev);
>> >>
>> >>       is_pch_port = haswell_crtc_driving_pch(crtc);
>> >>
>> >>       if (is_pch_port)
>> >> +             dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
>> >> +
>> >> +     intel_update_watermarks(dev);
>> >> +
>> >> +     if (is_pch_port)
>> >>               dev_priv->display.fdi_link_train(crtc);
>> >>
>> >>       for_each_encoder_on_crtc(dev, crtc, encoder)
>> >> @@ -3453,6 +3460,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>> >>       if (dev_priv->cfb_plane == plane)
>> >>               intel_disable_fbc(dev);
>> >>
>> >> +     dev_priv->ignore_pch_fifo_underrun |= (1 << pipe);
>> >>       intel_disable_pipe(dev_priv, pipe);
>> >>
>> >>       /* Disable PF */
>> >> @@ -3466,6 +3474,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>> >>       ironlake_fdi_disable(crtc);
>> >>
>> >>       ironlake_disable_pch_transcoder(dev_priv, pipe);
>> >> +     dev_priv->ignore_pch_fifo_underrun &= ~(1 << pipe);
>> >>
>> >>       if (HAS_PCH_CPT(dev)) {
>> >>               /* disable TRANS_DP_CTL */
>> >> @@ -3535,6 +3544,8 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>> >>       if (dev_priv->cfb_plane == plane)
>> >>               intel_disable_fbc(dev);
>> >>
>> >> +     if (is_pch_port)
>> >> +             dev_priv->ignore_pch_fifo_underrun |= (1 << TRANSCODER_A);
>> >>       intel_disable_pipe(dev_priv, pipe);
>> >>
>> >>       intel_ddi_disable_transcoder_func(dev_priv, cpu_transcoder);
>> >> @@ -3551,6 +3562,7 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>> >>
>> >>       if (is_pch_port) {
>> >>               lpt_disable_pch_transcoder(dev_priv);
>> >> +             dev_priv->ignore_pch_fifo_underrun &= ~(1 << TRANSCODER_A);
>> >>               intel_ddi_fdi_disable(crtc);
>> >>       }
>> >>
>> >> --
>> >> 1.7.10.4
>> >>
>> >> _______________________________________________
>> >> Intel-gfx mailing list
>> >> Intel-gfx@lists.freedesktop.org
>> >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>> >
>> > --
>> > Daniel Vetter
>> > Software Engineer, Intel Corporation
>> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>>
>>
>>
>> --
>> Paulo Zanoni
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
  2013-02-14 20:35       ` Paulo Zanoni
@ 2013-02-15  4:05         ` Ben Widawsky
  0 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-02-15  4:05 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Thu, 14 Feb 2013 18:35:32 -0200
Paulo Zanoni <przanoni@gmail.com> wrote:

> Hi
> 
> 2013/2/9 Ben Widawsky <ben@bwidawsk.net>:
> > On Fri, 8 Feb 2013 11:42:39 -0800
> > Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
> >
> >> On Fri,  8 Feb 2013 17:35:18 -0200
> >> Paulo Zanoni <przanoni@gmail.com> wrote:
> >>
> >> > From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >> >
> >> > On ILK/SNB all we need to do is to enable the "poison" bit, but
> >> > on IVB/HSW we need to enable the CPU error interrupt register,
> >> > which is responsible not only for poison interrupts, but also
> >> > other things. This includes the "unclaimed register" interrupt,
> >> > so on the IVB irq handler we now need to: (i) check whether the
> >> > interrupt was triggered by an unclaimed register and (ii) mask
> >> > the error interrupt bit so we don't risk generating "unclaimed
> >> > register" interrupts form inside the interrupt handler.
> >> >
> >> > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >> > ---
> >>
> >> OTOH there's nothing the user can do about it... so we might do a
> >> WARN_ONCE or something here instead.  But even then, I'm not sure
> >> there's much *we* can do about these, as they indicate a
> >> corruption in the communication between the CPU and PCH.
> >>
> >
> > I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE
> > isn't helpful since the backtrace wouldn't really be meaningful.
> 
> Why isn't it helpful? Right now we don't even know whether this
> problem happens or not, we're completely "blind" to a possible problem
> that may be affecting us in some specific cases and we don't even
> know. Knowing that it happens and how often it happens is IMHO
> certainly better than closing our eyes and pretending it doesn't
> exist.
> 

I suppose you're right. I'm strongly of the opinion that we won't
ever see this error because the system will crap out before we'd be
able to get that info - of course I cannot prove that, and I don't
know enough about what exactly poison means. I just think it sucks that
we have yet another gen specific thing which has TBD value. I certainly
won't nak it, and of course if it proves useful, I'll be most
apologetic.

As for the WARN being unhelpful, it's the same problem again. You're
getting the notifications via interrupt, so a backtrace is useless on
IVB/HSW. Perhaps it makes sense on ILK/SNB. Reinventing a "do this
once" macro isn't worthwhile either, so I guess WARN_ON with the
assumption that we ignore the backtrace is fine on IVB/HSW.

> >
> > If OTOH, you wanted to save away this information into error state;
> > I could get behind that.
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init
  2013-02-14 20:26     ` Paulo Zanoni
@ 2013-02-15  4:07       ` Ben Widawsky
  0 siblings, 0 replies; 31+ messages in thread
From: Ben Widawsky @ 2013-02-15  4:07 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Thu, 14 Feb 2013 18:26:59 -0200
Paulo Zanoni <przanoni@gmail.com> wrote:

> Hi
> 
> 2013/2/9 Ben Widawsky <ben@bwidawsk.net>:
> > On Fri,  8 Feb 2013 17:35:14 -0200
> > Paulo Zanoni <przanoni@gmail.com> wrote:
> >
> >> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >>
> >> Otherwise, if the BIOS did anything wrong, our first
> >> I915_{WRITE,READ} will give us "unclaimed register"  messages.
> >>
> >> V2: Even earlier.
> >>
> >> Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=58897
> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> >
> > I really wish we were allowed to call Haswell something like
> > gen7.x, so we can do INTEL_INFO(dev)->gen > 7
> 
> Like gen 70 and 75?

Yeah, but that's not your problem or fault.
> 
> >
> > Also, I would have cleared all the bits in the register, not just
> > NOCLAIM.
> 
> I'm not so sure, the other bits have completely different purposes,
> unrelated with the "unclaimed registers". I don't think it's a good
> idea to zero bits that have nothing to do with the purpose of the
> code.

I think doing that is a separate patch, and it goes along with the
"we don't care what errors BIOS induced" philosophy IMO.

> 
> >
> > Either way it's
> > Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Thanks for the review :)
> 
> > [snip]
> 
> 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/10] drm/i915: drm/i915: create macros for the "unclaimed register" checks
  2013-02-08 19:35 ` [PATCH 01/10] drm/i915: drm/i915: create macros for the "unclaimed register" checks Paulo Zanoni
@ 2013-02-18 18:11   ` Daniel Vetter
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Vetter @ 2013-02-18 18:11 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Fri, Feb 08, 2013 at 05:35:12PM -0200, Paulo Zanoni wrote:
> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
> 
> This avoids polluting i915_write##x and also allows us to reuse code
> on i915_read##x.
> 
> v2: Rebase
> 
> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

Checkpatch complained about the macro parameters not being enclosed in
(reg) params. I tend to agree, so maybe just extract the entire thing into
a static inline function while at it?
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.c |   24 ++++++++++++++++--------
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index c5b8c81..e24c337 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1131,6 +1131,20 @@ ilk_dummy_write(struct drm_i915_private *dev_priv)
>  	I915_WRITE_NOTRACE(MI_MODE, 0);
>  }
>  
> +#define UNCLAIMED_REG_CLEAR(dev_priv, reg) \
> +	if (IS_HASWELL(dev_priv->dev) && \
> +	    (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
> +		DRM_ERROR("Unknown unclaimed register before writing to %x\n", reg); \
> +		I915_WRITE_NOTRACE(GEN7_ERR_INT, ERR_INT_MMIO_UNCLAIMED); \
> +	}
> +
> +#define UNCLAIMED_REG_CHECK(dev_priv, reg) \
> +	if (IS_HASWELL(dev_priv->dev) && \
> +	    (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
> +		DRM_ERROR("Unclaimed write to %x\n", reg); \
> +		writel(ERR_INT_MMIO_UNCLAIMED, dev_priv->regs + GEN7_ERR_INT); \
> +	}
> +
>  #define __i915_read(x, y) \
>  u##x i915_read##x(struct drm_i915_private *dev_priv, u32 reg) { \
>  	u##x val = 0; \
> @@ -1167,18 +1181,12 @@ void i915_write##x(struct drm_i915_private *dev_priv, u32 reg, u##x val) { \
>  	} \
>  	if (IS_GEN5(dev_priv->dev)) \
>  		ilk_dummy_write(dev_priv); \
> -	if (IS_HASWELL(dev_priv->dev) && (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
> -		DRM_ERROR("Unknown unclaimed register before writing to %x\n", reg); \
> -		I915_WRITE_NOTRACE(GEN7_ERR_INT, ERR_INT_MMIO_UNCLAIMED); \
> -	} \
> +	UNCLAIMED_REG_CLEAR(dev_priv, reg); \
>  	write##y(val, dev_priv->regs + reg); \
>  	if (unlikely(__fifo_ret)) { \
>  		gen6_gt_check_fifodbg(dev_priv); \
>  	} \
> -	if (IS_HASWELL(dev_priv->dev) && (I915_READ_NOTRACE(GEN7_ERR_INT) & ERR_INT_MMIO_UNCLAIMED)) { \
> -		DRM_ERROR("Unclaimed write to %x\n", reg); \
> -		writel(ERR_INT_MMIO_UNCLAIMED, dev_priv->regs + GEN7_ERR_INT);	\
> -	} \
> +	UNCLAIMED_REG_CHECK(dev_priv, reg); \
>  }
>  __i915_write(8, b)
>  __i915_write(16, w)
> -- 
> 1.7.10.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 05/10] drm/i915: also disable south interrupts when handling them
  2013-02-09 19:29   ` Daniel Vetter
@ 2013-02-20 20:06     ` Paulo Zanoni
  2013-02-20 20:24       ` Daniel Vetter
  0 siblings, 1 reply; 31+ messages in thread
From: Paulo Zanoni @ 2013-02-20 20:06 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Paulo Zanoni

Hi

I finally had some time to look at this again.

2013/2/9 Daniel Vetter <daniel@ffwll.ch>:
> On Fri, Feb 08, 2013 at 05:35:16PM -0200, Paulo Zanoni wrote:
>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> From the docs:
>>   "Only the rising edge of the PCH Display interrupt will cause the
>>   North Display IIR (DEIIR) PCH Display Interrupt even bit to be set,
>>   so all PCH Display Interrupts, including back to back interrupts,
>>   must be cleared before a new PCH Display interrupt can cause DEIIR
>>   to be set".
>>
>> The current code works fine because we don't get many interrupts, but
>> if we enable the PCH FIFO underrun interrupts we'll start getting so
>> many interrupts that at some point new PCH interrupts won't cause
>> DEIIR to be set.
>>
>> The initial implementation I tried was to turn the code that checks
>> SDEIIR into a loop, but we can still get interrupts even after the
>> loop is done (and before the irq handler finishes), so we have to
>> either disable the interrupts or mask them. In the end I concluded
>> that just disabling the PCH interrupts is enough, you don't even need
>> the loop, so this is what this patch implements. I've tested it and it
>> passes the 2 "PCH FIFO underrun interrupt storms" I can reproduce:
>> the "ironlake_crtc_disable" case and the "wrong watermarks" case.
>>
>> In other words, here's how to reproduce the problem fixed by this
>> patch:
>>   1 - Enable PCH FIFO underrun interrupts (SERR_INT on SNB+)
>>   2 - Boot the machine
>>   3 - While booting we'll get tons of PCH FIFO underrun interrupts
>>   4 - Plug a new monitor
>>   5 - Run xrandr, notice it won't detect the new monitor
>>   6 - Read SDEIIR and notice it's not 0 while DEIIR is 0
>>
>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>
> This smells fishy. Looking at the code, clearing the SDEIIR before the
> DEIIR looks strange. If we immediately get another south interrupt, it
> will propagate to the DEIIR.

Unless south interrupts are disabled by SDEIER, which is what my patch does.

> But since we currently processing south
> interrupts the corresponding bit in DEIIR is set, so when we clear DEIIR
> later on, we kill that south interrupt. And since it's then once lost,
> we'll lose all subsequent ones, too.
>
> So have you tried clearing DEIIR before clearing any of the subordinate
> registers?

Yes and it doesn't work. I just re-re-tested it today to be sure.

>
> The patch itself is imo wrong, since afaik the hw will drop any interrupts
> if we disable them. Hence any south interrupt happening while we are in
> the irq handler will effectively be lost.

They won't be lost since I don't mask the interrupts (IMR), I just
disable them (IER), so they are still listed in the SDEIIR register.
What happens is that in the patch I sent, we'll just re-enable the
south interrupts after we re-enable the north interrupts (DEIER it
31), so exactly after this, if we have pending interrupts stored,
we'll just get another interrupt, then our interrupt handler will run
and process the "missed" interrupt (IIR can queue up to two interrupt
events; when the IIR is cleared, it will set itself again after one
clock if a second event was stored).

This can be easily verified with the following:
- Apply patch 6 ("drm/915: print PCH poison interrupts") without this
one  (it enables SERR_INT, which will trigger the bug)
- Boot SVB with LVDS only
- While booting, the mode sets will trigger a lot of interrupts, which
will trigger the bug solved by this patch
- Plug a new monitor
- Run xrandr, notice the monitor will be listed as "disconnected"
- Dump the interrupt registers, notice that SDEIIR has some bits set
but DEIIR is 0x0
- Disable south interrupts by writing 0 SDEIER (use intel_reg_write)
- Read SDEIIR and write the same value back (notice it won't become 0
due to the stored interrupt events)
- Re-enable SDEIER by writing the old value back to it
- Since SDEIIR is not 0 (due to second interrupts stored), this will
generate an interrupt, which will be processed by the Kernel and call
our irq handler (you could put some print messages on the interrupt
handler just to check if you want)
- Check all the interrupt registers, notice everything is fine (SDEIIR
is 0 and DEIIR is 0)
- Run xrandr and the plugged monitor will finally be detected (i.e.,
south interrupts are working again)

>
> And for the underrun reporting irq storm, I guess we need to first have
> infrastructure to block out interrupts while the pipe is going down before
> we can enable them. Assuming that pipe underruns are expected in that
> case.

And before all the things you've mentioned, we need this patch to
properly clear the interrupt registers so we can deal with them even
if they come too fast.

I plan to resend a new interrupt series maybe in the next days.

Thanks for the review,
Paulo

> -Daniel
>
>> ---
>>  drivers/gpu/drm/i915/i915_irq.c |   16 ++++++++++++++--
>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> index f096ad9..500fd65 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -701,7 +701,7 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>>  {
>>       struct drm_device *dev = (struct drm_device *) arg;
>>       drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>> -     u32 de_iir, gt_iir, de_ier, pm_iir;
>> +     u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
>>       irqreturn_t ret = IRQ_NONE;
>>       int i;
>>
>> @@ -711,6 +711,10 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>>       de_ier = I915_READ(DEIER);
>>       I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
>>
>> +     sde_ier = I915_READ(SDEIER);
>> +     I915_WRITE(SDEIER, 0);
>> +     POSTING_READ(SDEIER);
>> +
>>       gt_iir = I915_READ(GTIIR);
>>       if (gt_iir) {
>>               snb_gt_irq_handler(dev, dev_priv, gt_iir);
>> @@ -759,6 +763,8 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>>
>>       I915_WRITE(DEIER, de_ier);
>>       POSTING_READ(DEIER);
>> +     I915_WRITE(SDEIER, sde_ier);
>> +     POSTING_READ(SDEIER);
>>
>>       return ret;
>>  }
>> @@ -778,7 +784,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>>       struct drm_device *dev = (struct drm_device *) arg;
>>       drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>>       int ret = IRQ_NONE;
>> -     u32 de_iir, gt_iir, de_ier, pm_iir;
>> +     u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
>>
>>       atomic_inc(&dev_priv->irq_received);
>>
>> @@ -787,6 +793,10 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>>       I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
>>       POSTING_READ(DEIER);
>>
>> +     sde_ier = I915_READ(SDEIER);
>> +     I915_WRITE(SDEIER, 0);
>> +     POSTING_READ(SDEIER);
>> +
>>       de_iir = I915_READ(DEIIR);
>>       gt_iir = I915_READ(GTIIR);
>>       pm_iir = I915_READ(GEN6_PMIIR);
>> @@ -849,6 +859,8 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>>  done:
>>       I915_WRITE(DEIER, de_ier);
>>       POSTING_READ(DEIER);
>> +     I915_WRITE(SDEIER, sde_ier);
>> +     POSTING_READ(SDEIER);
>>
>>       return ret;
>>  }
>> --
>> 1.7.10.4
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Paulo Zanoni

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 05/10] drm/i915: also disable south interrupts when handling them
  2013-02-20 20:06     ` Paulo Zanoni
@ 2013-02-20 20:24       ` Daniel Vetter
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Vetter @ 2013-02-20 20:24 UTC (permalink / raw)
  To: Paulo Zanoni; +Cc: intel-gfx, Paulo Zanoni

On Wed, Feb 20, 2013 at 9:06 PM, Paulo Zanoni <przanoni@gmail.com> wrote:
> I finally had some time to look at this again.
>
> 2013/2/9 Daniel Vetter <daniel@ffwll.ch>:
>> On Fri, Feb 08, 2013 at 05:35:16PM -0200, Paulo Zanoni wrote:
>>> From: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>>
>>> From the docs:
>>>   "Only the rising edge of the PCH Display interrupt will cause the
>>>   North Display IIR (DEIIR) PCH Display Interrupt even bit to be set,
>>>   so all PCH Display Interrupts, including back to back interrupts,
>>>   must be cleared before a new PCH Display interrupt can cause DEIIR
>>>   to be set".
>>>
>>> The current code works fine because we don't get many interrupts, but
>>> if we enable the PCH FIFO underrun interrupts we'll start getting so
>>> many interrupts that at some point new PCH interrupts won't cause
>>> DEIIR to be set.
>>>
>>> The initial implementation I tried was to turn the code that checks
>>> SDEIIR into a loop, but we can still get interrupts even after the
>>> loop is done (and before the irq handler finishes), so we have to
>>> either disable the interrupts or mask them. In the end I concluded
>>> that just disabling the PCH interrupts is enough, you don't even need
>>> the loop, so this is what this patch implements. I've tested it and it
>>> passes the 2 "PCH FIFO underrun interrupt storms" I can reproduce:
>>> the "ironlake_crtc_disable" case and the "wrong watermarks" case.
>>>
>>> In other words, here's how to reproduce the problem fixed by this
>>> patch:
>>>   1 - Enable PCH FIFO underrun interrupts (SERR_INT on SNB+)
>>>   2 - Boot the machine
>>>   3 - While booting we'll get tons of PCH FIFO underrun interrupts
>>>   4 - Plug a new monitor
>>>   5 - Run xrandr, notice it won't detect the new monitor
>>>   6 - Read SDEIIR and notice it's not 0 while DEIIR is 0
>>>
>>> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
>>
>> This smells fishy. Looking at the code, clearing the SDEIIR before the
>> DEIIR looks strange. If we immediately get another south interrupt, it
>> will propagate to the DEIIR.
>
> Unless south interrupts are disabled by SDEIER, which is what my patch does.
>
>> But since we currently processing south
>> interrupts the corresponding bit in DEIIR is set, so when we clear DEIIR
>> later on, we kill that south interrupt. And since it's then once lost,
>> we'll lose all subsequent ones, too.
>>
>> So have you tried clearing DEIIR before clearing any of the subordinate
>> registers?
>
> Yes and it doesn't work. I just re-re-tested it today to be sure.

Ok, so I guess we just have really funny interrupt handling in our hw.
Hidden interrupt queues and fancy irq handling sequences ...

So count me convinced, one request for your patch though: Can you
please add a small comment to both places where we disable south
interrupts to explain what's going on and that this is the only way to
make it work?

Also please amend the commit message saying that proper ordering of
interrupt registers clearing doesn't do what we want (probably due to
the 2-deep irq getting in the way I suspect).

Thanks, Daniel

>> The patch itself is imo wrong, since afaik the hw will drop any interrupts
>> if we disable them. Hence any south interrupt happening while we are in
>> the irq handler will effectively be lost.
>
> They won't be lost since I don't mask the interrupts (IMR), I just
> disable them (IER), so they are still listed in the SDEIIR register.
> What happens is that in the patch I sent, we'll just re-enable the
> south interrupts after we re-enable the north interrupts (DEIER it
> 31), so exactly after this, if we have pending interrupts stored,
> we'll just get another interrupt, then our interrupt handler will run
> and process the "missed" interrupt (IIR can queue up to two interrupt
> events; when the IIR is cleared, it will set itself again after one
> clock if a second event was stored).
>
> This can be easily verified with the following:
> - Apply patch 6 ("drm/915: print PCH poison interrupts") without this
> one  (it enables SERR_INT, which will trigger the bug)
> - Boot SVB with LVDS only
> - While booting, the mode sets will trigger a lot of interrupts, which
> will trigger the bug solved by this patch
> - Plug a new monitor
> - Run xrandr, notice the monitor will be listed as "disconnected"
> - Dump the interrupt registers, notice that SDEIIR has some bits set
> but DEIIR is 0x0
> - Disable south interrupts by writing 0 SDEIER (use intel_reg_write)
> - Read SDEIIR and write the same value back (notice it won't become 0
> due to the stored interrupt events)
> - Re-enable SDEIER by writing the old value back to it
> - Since SDEIIR is not 0 (due to second interrupts stored), this will
> generate an interrupt, which will be processed by the Kernel and call
> our irq handler (you could put some print messages on the interrupt
> handler just to check if you want)
> - Check all the interrupt registers, notice everything is fine (SDEIIR
> is 0 and DEIIR is 0)
> - Run xrandr and the plugged monitor will finally be detected (i.e.,
> south interrupts are working again)
>
>>
>> And for the underrun reporting irq storm, I guess we need to first have
>> infrastructure to block out interrupts while the pipe is going down before
>> we can enable them. Assuming that pipe underruns are expected in that
>> case.
>
> And before all the things you've mentioned, we need this patch to
> properly clear the interrupt registers so we can deal with them even
> if they come too fast.
>
> I plan to resend a new interrupt series maybe in the next days.
>
> Thanks for the review,
> Paulo
>
>> -Daniel
>>
>>> ---
>>>  drivers/gpu/drm/i915/i915_irq.c |   16 ++++++++++++++--
>>>  1 file changed, 14 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>>> index f096ad9..500fd65 100644
>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>> @@ -701,7 +701,7 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>>>  {
>>>       struct drm_device *dev = (struct drm_device *) arg;
>>>       drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>>> -     u32 de_iir, gt_iir, de_ier, pm_iir;
>>> +     u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
>>>       irqreturn_t ret = IRQ_NONE;
>>>       int i;
>>>
>>> @@ -711,6 +711,10 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>>>       de_ier = I915_READ(DEIER);
>>>       I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
>>>
>>> +     sde_ier = I915_READ(SDEIER);
>>> +     I915_WRITE(SDEIER, 0);
>>> +     POSTING_READ(SDEIER);
>>> +
>>>       gt_iir = I915_READ(GTIIR);
>>>       if (gt_iir) {
>>>               snb_gt_irq_handler(dev, dev_priv, gt_iir);
>>> @@ -759,6 +763,8 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg)
>>>
>>>       I915_WRITE(DEIER, de_ier);
>>>       POSTING_READ(DEIER);
>>> +     I915_WRITE(SDEIER, sde_ier);
>>> +     POSTING_READ(SDEIER);
>>>
>>>       return ret;
>>>  }
>>> @@ -778,7 +784,7 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>>>       struct drm_device *dev = (struct drm_device *) arg;
>>>       drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private;
>>>       int ret = IRQ_NONE;
>>> -     u32 de_iir, gt_iir, de_ier, pm_iir;
>>> +     u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier;
>>>
>>>       atomic_inc(&dev_priv->irq_received);
>>>
>>> @@ -787,6 +793,10 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>>>       I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL);
>>>       POSTING_READ(DEIER);
>>>
>>> +     sde_ier = I915_READ(SDEIER);
>>> +     I915_WRITE(SDEIER, 0);
>>> +     POSTING_READ(SDEIER);
>>> +
>>>       de_iir = I915_READ(DEIIR);
>>>       gt_iir = I915_READ(GTIIR);
>>>       pm_iir = I915_READ(GEN6_PMIIR);
>>> @@ -849,6 +859,8 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg)
>>>  done:
>>>       I915_WRITE(DEIER, de_ier);
>>>       POSTING_READ(DEIER);
>>> +     I915_WRITE(SDEIER, sde_ier);
>>> +     POSTING_READ(SDEIER);
>>>
>>>       return ret;
>>>  }
>>> --
>>> 1.7.10.4
>>>
>>> _______________________________________________
>>> Intel-gfx mailing list
>>> Intel-gfx@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
>
>
> --
> Paulo Zanoni



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2013-02-20 20:24 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-08 19:35 [PATCH 00/10] Display error reporting Paulo Zanoni
2013-02-08 19:35 ` [PATCH 01/10] drm/i915: drm/i915: create macros for the "unclaimed register" checks Paulo Zanoni
2013-02-18 18:11   ` Daniel Vetter
2013-02-08 19:35 ` [PATCH 02/10] drm/i915: use FPGA_DBG " Paulo Zanoni
2013-02-08 19:35 ` [PATCH 03/10] drm/i915: clear the FPGA_DBG_RM_NOCLAIM bit at driver init Paulo Zanoni
2013-02-09 17:19   ` Ben Widawsky
2013-02-14 20:26     ` Paulo Zanoni
2013-02-15  4:07       ` Ben Widawsky
2013-02-08 19:35 ` [PATCH 04/10] drm/i915: add ibx_irq_postinstall Paulo Zanoni
2013-02-09 17:27   ` Ben Widawsky
2013-02-09 19:07     ` Daniel Vetter
2013-02-08 19:35 ` [PATCH 05/10] drm/i915: also disable south interrupts when handling them Paulo Zanoni
2013-02-09 19:29   ` Daniel Vetter
2013-02-20 20:06     ` Paulo Zanoni
2013-02-20 20:24       ` Daniel Vetter
2013-02-08 19:35 ` [PATCH 06/10] drm/i915: print PCH poison interrupts Paulo Zanoni
2013-02-08 19:35 ` [PATCH 07/10] drm/i915: print Gen5+ CPU " Paulo Zanoni
2013-02-08 19:42   ` Jesse Barnes
2013-02-08 19:54     ` Paulo Zanoni
2013-02-08 20:01       ` Jesse Barnes
2013-02-09 17:30     ` Ben Widawsky
2013-02-14 20:35       ` Paulo Zanoni
2013-02-15  4:05         ` Ben Widawsky
2013-02-08 19:35 ` [PATCH 08/10] drm/i915: print PCH FIFO underrun interrupts Paulo Zanoni
2013-02-09 19:43   ` Daniel Vetter
2013-02-14 20:53     ` Paulo Zanoni
2013-02-14 21:13       ` Daniel Vetter
2013-02-14 21:32         ` Paulo Zanoni
2013-02-08 19:35 ` [PATCH 09/10] drm/i915: print CPU FIFO underruns Paulo Zanoni
2013-02-08 19:35 ` [PATCH 10/10] drm/i915: also POSTING_READ(DEIER) on ivybridge_irq_handler Paulo Zanoni
2013-02-09 17:34   ` Ben Widawsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.