All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/1] fsldma: ignore end of segments interrupt
@ 2012-01-26 20:58 Ira W. Snyder
  2012-01-31 21:30 ` [PATCH v2] " Ira W. Snyder
  2012-02-16 17:50 ` [PATCH 1/1] " Tabi Timur-B04825
  0 siblings, 2 replies; 8+ messages in thread
From: Ira W. Snyder @ 2012-01-26 20:58 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Dan Williams

The mpc8349ea has been observed to generate spurious end of segments
interrupts despite the fact that they are not enabled by this driver.
Check for them and ignore them to avoid a kernel error message.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Cc: Dan Williams <dan.j.williams@intel.com>
---
 drivers/dma/fsldma.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 8a78154..7dc9689 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -1052,6 +1052,16 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
 		stat &= ~FSL_DMA_SR_EOLNI;
 	}
 
+	/*
+	 * This driver does not use this feature, therefore we shouldn't
+	 * ever see this bit set in the status register. However, it has
+	 * been observed on MPC8349EA parts.
+	 */
+	if (stat & FSL_DMA_SR_EOSI) {
+		chan_dbg(chan, "irq: End-of-Segments INT\n");
+		stat &= ~FSL_DMA_SR_EOSI;
+	}
+
 	/* check that the DMA controller is really idle */
 	if (!dma_is_idle(chan))
 		chan_err(chan, "irq: controller not idle!\n");
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2] fsldma: ignore end of segments interrupt
  2012-01-26 20:58 [PATCH 1/1] fsldma: ignore end of segments interrupt Ira W. Snyder
@ 2012-01-31 21:30 ` Ira W. Snyder
  2012-02-16 17:50 ` [PATCH 1/1] " Tabi Timur-B04825
  1 sibling, 0 replies; 8+ messages in thread
From: Ira W. Snyder @ 2012-01-31 21:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Dan Williams

The mpc8349ea has been observed to generate spurious end of segments
interrupts despite the fact that they are not enabled by this driver.
Check for them and ignore them to avoid a kernel error message.

Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Cc: Dan Williams <dan.j.williams@intel.com>
---

Changes v1 -> v2:
- skip the descriptor cleanup tasklet if the controller is not yet idle

 drivers/dma/fsldma.c |   27 ++++++++++++++++++++++++---
 1 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 8a78154..037631a 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -1052,20 +1052,41 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
 		stat &= ~FSL_DMA_SR_EOLNI;
 	}
 
-	/* check that the DMA controller is really idle */
-	if (!dma_is_idle(chan))
-		chan_err(chan, "irq: controller not idle!\n");
+	/*
+	 * This driver does not use this feature, therefore we shouldn't
+	 * ever see this bit set in the status register. However, it has
+	 * been observed on MPC8349EA parts.
+	 */
+	if (stat & FSL_DMA_SR_EOSI) {
+		chan_dbg(chan, "irq: End-of-Segments INT\n");
+		stat &= ~FSL_DMA_SR_EOSI;
+	}
 
 	/* check that we handled all of the bits */
 	if (stat)
 		chan_err(chan, "irq: unhandled sr 0x%08x\n", stat);
 
 	/*
+	 * Check that the DMA controller is really idle
+	 *
+	 * Occasionally on MPC8349EA parts, a spurious End-of-Segments
+	 * interrupt is generated. When this happens, the controller is
+	 * still busy. In this case, we shouldn't run the tasklet to
+	 * clean up idle descriptors, since the controller is not yet idle.
+	 */
+	if (!dma_is_idle(chan)) {
+		chan_err(chan, "irq: controller not idle!\n");
+		goto out_skip_tasklet;
+	}
+
+	/*
 	 * Schedule the tasklet to handle all cleanup of the current
 	 * transaction. It will start a new transaction if there is
 	 * one pending.
 	 */
 	tasklet_schedule(&chan->tasklet);
+
+out_skip_tasklet:
 	chan_dbg(chan, "irq: Exit\n");
 	return IRQ_HANDLED;
 }
-- 
1.7.3.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] fsldma: ignore end of segments interrupt
  2012-01-26 20:58 [PATCH 1/1] fsldma: ignore end of segments interrupt Ira W. Snyder
  2012-01-31 21:30 ` [PATCH v2] " Ira W. Snyder
@ 2012-02-16 17:50 ` Tabi Timur-B04825
  2012-02-16 19:00   ` Ira W. Snyder
  1 sibling, 1 reply; 8+ messages in thread
From: Tabi Timur-B04825 @ 2012-02-16 17:50 UTC (permalink / raw)
  To: Ira W. Snyder; +Cc: Dan Williams, linuxppc-dev

On Thu, Jan 26, 2012 at 2:58 PM, Ira W. Snyder <iws@ovro.caltech.edu> wrote=
:
> The mpc8349ea has been observed to generate spurious end of segments
> interrupts despite the fact that they are not enabled by this driver.
> Check for them and ignore them to avoid a kernel error message.

When this happens, are there any other status bits set?  It seems
weird that there are spurious interrupts from an internal block,
especially since it's the same block on all 83xx parts.

I wonder if the EOSI bit just happens to be set when the interrupt
occurs for some other reason.

--=20
Timur Tabi
Linux kernel developer at Freescale=

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] fsldma: ignore end of segments interrupt
  2012-02-16 17:50 ` [PATCH 1/1] " Tabi Timur-B04825
@ 2012-02-16 19:00   ` Ira W. Snyder
  2012-02-16 19:34     ` Timur Tabi
  0 siblings, 1 reply; 8+ messages in thread
From: Ira W. Snyder @ 2012-02-16 19:00 UTC (permalink / raw)
  To: Tabi Timur-B04825; +Cc: Dan Williams, linuxppc-dev

On Thu, Feb 16, 2012 at 05:50:47PM +0000, Tabi Timur-B04825 wrote:
> On Thu, Jan 26, 2012 at 2:58 PM, Ira W. Snyder <iws@ovro.caltech.edu> wrote:
> > The mpc8349ea has been observed to generate spurious end of segments
> > interrupts despite the fact that they are not enabled by this driver.
> > Check for them and ignore them to avoid a kernel error message.
> 
> When this happens, are there any other status bits set?  It seems
> weird that there are spurious interrupts from an internal block,
> especially since it's the same block on all 83xx parts.
> 
> I wonder if the EOSI bit just happens to be set when the interrupt
> occurs for some other reason.
> 

I'm not sure. The fsldma irq handler only prints bits it did not handle.
There are several other bits in the driver which should never be seen,
but they are handled by the irq handler anyway. This is just a remnant
from the original Freescale code.

I have a set of 15 test boards that I can use to figure out which other
bits are set when this happens, if it is important.

I put a variation of this patch (missing the "skip tasklet if not idle"
logic) into my production boards roughly a month ago. I've gotten the
"controller not idle" error message 748 times, as compared to the
"unhandled sr 0x00000002" message 3449 times.

This leads me to believe that this occurs mostly (but not always)
concurrent with the end-of-chain interrupt.

In the last month, the "unhandled sr" error has occurred on 92 out of
120 boards in production use. The statistics are included below. On some
boards, it is much more frequent than on others. All boards have roughly
the same workload.

Another interesting tidbit from my logs: this only occurs on DMA channel
2 (the are numbered starting at 0, it is the 3rd channel). Here is an
example log message:

[3484053.821689] of:fsl-elo-dma e00082a8.dma: chan2: irq: unhandled sr 0x00000002

Thanks,
Ira

     15 serial-number-5
      1 serial-number-16
      8 serial-number-18
     16 serial-number-19
      3 serial-number-20
     21 serial-number-21
      1 serial-number-24
      1 serial-number-26
      3 serial-number-27
      2 serial-number-28
     16 serial-number-29
      4 serial-number-30
      1 serial-number-31
      4 serial-number-32
      5 serial-number-33
      1 serial-number-34
      6 serial-number-35
     18 serial-number-36
      1 serial-number-39
      1 serial-number-40
      2 serial-number-41
     10 serial-number-42
     11 serial-number-43
     32 serial-number-45
      6 serial-number-46
      4 serial-number-47
      1 serial-number-49
      6 serial-number-50
      2 serial-number-51
      4 serial-number-53
      1 serial-number-55
      1 serial-number-57
     15 serial-number-58
      1 serial-number-60
      1 serial-number-62
      1 serial-number-66
      8 serial-number-67
      2 serial-number-75
      1 serial-number-76
     11 serial-number-79
      4 serial-number-80
      8 serial-number-81
      1 serial-number-82
     11 serial-number-84
      2 serial-number-92
     20 serial-number-93
     30 serial-number-94
     19 serial-number-95
     32 serial-number-96
     73 serial-number-97
     18 serial-number-99
     57 serial-number-100
     41 serial-number-101
     28 serial-number-102
      8 serial-number-103
    132 serial-number-107
     60 serial-number-108
     55 serial-number-109
     97 serial-number-110
     18 serial-number-111
     45 serial-number-113
      6 serial-number-114
    123 serial-number-115
     27 serial-number-117
     29 serial-number-118
     12 serial-number-119
     47 serial-number-120
     74 serial-number-121
      8 serial-number-124
    128 serial-number-125
    326 serial-number-128
     84 serial-number-129
     36 serial-number-130
      2 serial-number-131
     75 serial-number-133
     64 serial-number-135
    686 serial-number-137
     97 serial-number-139
     28 serial-number-140
     82 serial-number-141
     36 serial-number-144
     31 serial-number-145
     47 serial-number-147
     60 serial-number-150
     22 serial-number-152
     36 serial-number-154
     57 serial-number-156
     68 serial-number-158
     54 serial-number-159
     37 serial-number-160
     46 serial-number-161
     14 serial-number-162

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] fsldma: ignore end of segments interrupt
  2012-02-16 19:00   ` Ira W. Snyder
@ 2012-02-16 19:34     ` Timur Tabi
  2012-02-16 19:46       ` Ira W. Snyder
  0 siblings, 1 reply; 8+ messages in thread
From: Timur Tabi @ 2012-02-16 19:34 UTC (permalink / raw)
  To: Ira W. Snyder; +Cc: Dan Williams, linuxppc-dev

Ira W. Snyder wrote:
> This leads me to believe that this occurs mostly (but not always)
> concurrent with the end-of-chain interrupt.

Have you tested this on an 85xx platform?

I noticed something odd.  You're modifying fsldma_chan_irq(), which is for
DMA controllers that have per-channel IRQs.  83xx devices don't have
per-channel IRQs -- all channels on one controller have the same IRQ.
Looking at the device tree, I see that the IRQs are listed in the channel
nodes *and* in the controller node.  I don't see how we ever use the
per-controller ISR.

I wonder if the shared IRQ is the part of the cause of the interrupts
you're seeing.

> 
> In the last month, the "unhandled sr" error has occurred on 92 out of
> 120 boards in production use. The statistics are included below. On some
> boards, it is much more frequent than on others. All boards have roughly
> the same workload.
> 
> Another interesting tidbit from my logs: this only occurs on DMA channel
> 2 (the are numbered starting at 0, it is the 3rd channel). Here is an
> example log message:

What happens if you never register that channel?  That is, remove this
node from the device tree:

dma-channel@100 {
	compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel";
	reg = <0x100 0x80>;
	cell-index = <2>;
	interrupt-parent = <&ipic>;
	interrupts = <71 8>;
};

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] fsldma: ignore end of segments interrupt
  2012-02-16 19:34     ` Timur Tabi
@ 2012-02-16 19:46       ` Ira W. Snyder
  2012-02-16 19:48         ` Timur Tabi
  0 siblings, 1 reply; 8+ messages in thread
From: Ira W. Snyder @ 2012-02-16 19:46 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Dan Williams, linuxppc-dev

On Thu, Feb 16, 2012 at 01:34:00PM -0600, Timur Tabi wrote:
> Ira W. Snyder wrote:
> > This leads me to believe that this occurs mostly (but not always)
> > concurrent with the end-of-chain interrupt.
> 
> Have you tested this on an 85xx platform?
> 

No. I don't have the ability to connect my P2020 up to an FPGA to
recreate the DMA workload that causes this on my 8349EA. I can run the
dmatest module, if you'd like.

> I noticed something odd.  You're modifying fsldma_chan_irq(), which is for
> DMA controllers that have per-channel IRQs.  83xx devices don't have
> per-channel IRQs -- all channels on one controller have the same IRQ.
> Looking at the device tree, I see that the IRQs are listed in the channel
> nodes *and* in the controller node.  I don't see how we ever use the
> per-controller ISR.
> 

fsldma_ctrl_irq() (the per-controller irq handler) just calls through to
fsldma_chan_irq() (the per-channel irq handler).

> I wonder if the shared IRQ is the part of the cause of the interrupts
> you're seeing.
> 

My device tree is slightly modified to remove the per-controller
interrupts and interrupt-parent properties. Each individual channel has
identical interrupts and interrupt-parent properties specified.

Someone here suggested that I do that, several years ago. It has been
too long, and I do not remember who. I can reverse it, and use the
per-controller IRQ instead.

> > 
> > In the last month, the "unhandled sr" error has occurred on 92 out of
> > 120 boards in production use. The statistics are included below. On some
> > boards, it is much more frequent than on others. All boards have roughly
> > the same workload.
> > 
> > Another interesting tidbit from my logs: this only occurs on DMA channel
> > 2 (the are numbered starting at 0, it is the 3rd channel). Here is an
> > example log message:
> 
> What happens if you never register that channel?  That is, remove this
> node from the device tree:
> 
> dma-channel@100 {
> 	compatible = "fsl,mpc8349-dma-channel", "fsl,elo-dma-channel";
> 	reg = <0x100 0x80>;
> 	cell-index = <2>;
> 	interrupt-parent = <&ipic>;
> 	interrupts = <71 8>;
> };
> 

I can try that. I hunch the problem will move, as the carma-fpga driver
(see drivers/misc/carma/carma-fpga.c) will claim the 4th channel
instead.

Ira

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] fsldma: ignore end of segments interrupt
  2012-02-16 19:46       ` Ira W. Snyder
@ 2012-02-16 19:48         ` Timur Tabi
  2012-02-17  0:57           ` Ira W. Snyder
  0 siblings, 1 reply; 8+ messages in thread
From: Timur Tabi @ 2012-02-16 19:48 UTC (permalink / raw)
  To: Ira W. Snyder; +Cc: Dan Williams, linuxppc-dev

Ira W. Snyder wrote:

> No. I don't have the ability to connect my P2020 up to an FPGA to
> recreate the DMA workload that causes this on my 8349EA. I can run the
> dmatest module, if you'd like.

I just want to make sure your patch doesn't break 85xx.

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/1] fsldma: ignore end of segments interrupt
  2012-02-16 19:48         ` Timur Tabi
@ 2012-02-17  0:57           ` Ira W. Snyder
  0 siblings, 0 replies; 8+ messages in thread
From: Ira W. Snyder @ 2012-02-17  0:57 UTC (permalink / raw)
  To: Timur Tabi; +Cc: Dan Williams, linuxppc-dev

On Thu, Feb 16, 2012 at 01:48:20PM -0600, Timur Tabi wrote:
> Ira W. Snyder wrote:
> 
> > No. I don't have the ability to connect my P2020 up to an FPGA to
> > recreate the DMA workload that causes this on my 8349EA. I can run the
> > dmatest module, if you'd like.
> 
> I just want to make sure your patch doesn't break 85xx.
> 

I tried both with and without this patch on my P2020 COM Express board.
With both kernels, the board locks up after 20 minutes or so, no
messages to the serial console.

I wouldn't be surprised if there are some memory problems with this
board. In any case, I don't have any reason to believe that this patch
causes any trouble: the board dies without it.

However, the patch doesn't break DMA on 85xx. If I unload the dmatest
module after 10 minutes or so, it claims to have passed many thousands
of tests without problems.

My 8349EA test boards (15 of them) have been running their normal DMA
workload plus dmatest on the unused 4th channel, all without errors, for
several hours. ~2.5 million successful tests per board, as I write this.

Ira

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-02-17  0:57 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-26 20:58 [PATCH 1/1] fsldma: ignore end of segments interrupt Ira W. Snyder
2012-01-31 21:30 ` [PATCH v2] " Ira W. Snyder
2012-02-16 17:50 ` [PATCH 1/1] " Tabi Timur-B04825
2012-02-16 19:00   ` Ira W. Snyder
2012-02-16 19:34     ` Timur Tabi
2012-02-16 19:46       ` Ira W. Snyder
2012-02-16 19:48         ` Timur Tabi
2012-02-17  0:57           ` Ira W. Snyder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.