* mxcmmc driver hangs on sync
@ 2010-06-15  3:29 Morgan Howe
  2010-06-15  6:31 ` Daniel Mack
  0 siblings, 1 reply; 10+ messages in thread
From: Morgan Howe @ 2010-06-15  3:29 UTC
  To: linux-arm-kernel

Greetings,

I'm using a Freescale i.MX27 board with two Class 2 SDHC cards.
Originally we were working with the 2.6.22 kernel from Freescale, but
found that after an extended period of writing to the SD card (around
2k-3k writes of a 1 MB file), the driver would hang when we called the
sync command. It appeared to be an issue related to DMA interrupts.

Since we have been wanting to move to the newer kernel anyway, I
decided to try a similar test using the current mainline kernel from
Linus' tree.  Basically, my test is a simple bash script which does
this:

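# SD1 and SD2 are the directories where the two cards are mounted
while true; do
    cp 1mb_test_file SD1/          # write the test file to the first card
    cp SD1/1mb_test_file SD2/      # copy it from the first card to the second
    sync                           # flush everything out to the cards
done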

With the older kernel this would hang on sync after a few thousand
loops, and much sooner if you ran 2 or 3 of these processes at a time.
I tried last night with the newer kernel and kicking off 3 processes
and after ~100-150 loops per process I get this:

mxc-mmc mxc-mmc.0: mxcmci_finish_data: No CRC -ETIMEDOUT
mmcblk0: error -110 transferring data, sector 24765007, nr 512, card status 0x0
mmcblk0: error -110 sending stop command, response 0x0, card status 0x0
end_request: I/O error, dev mmcblk0, sector 24765021
end_request: I/O error, dev mmcblk0, sector 24765023
end_request: I/O error, dev mmcblk0, sector 24765031
end_request: I/O error, dev mmcblk0, sector 24765039
...
<snip>
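
The "error -110" in the mmcblk lines and the "-ETIMEDOUT" in the driver
message describe the same condition: kernel functions return negative
errno values, and on Linux for ARM (and x86) ETIMEDOUT is 110. A minimal
userland sketch of that mapping follows, assuming a C library whose errno
numbering matches the kernel's. The helper names are hypothetical and are
not part of any driver:

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: map a kernel-style negative return code, as
 * printed in dmesg, to the C library's text for that errno value.
 */
const char *kernel_err_text(int err)
{
	return strerror(abs(err));
}

/* Nonzero if the code is the timeout reported in the log above. */
int is_timeout(int err)
{
	return err == -ETIMEDOUT;
}
```

On such a system, kernel_err_text(-110) returns the library's description
of ETIMEDOUT, and is_timeout(-110) is nonzero.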

It then continues for ~20 more loops per process, and then I get that
same error again. It continues for a while longer, and then finally I
get this:

mxc-mmc mxc-mmc.1: mxcmci_finish_data: No CRC -ETIMEDOUT
mmcblk1: error -110 transferring data, sector 8859727, nr 144, card status 0xc00
FEC: MDIO read timeout

And I can see in the output from ps:

  535 0         2808 S    /bin/sh ./test.sh 
  673 0         2808 S    /bin/sh ./test.sh 
  747 0         2808 S    /bin/sh ./test.sh 
 2097 0         2672 D    sync 
 2098 0         2672 D    sync 
 2099 0         2672 D    sync 
 2261 0            0 SW   [flush-179:8]
 2262 0         3052 R    ps
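
The D in the state column marks a task in uninterruptible sleep,
typically blocked inside the kernel waiting for I/O, and such tasks
generally do not respond to signals. The same state letter can be read
from /proc/<pid>/stat. A small sketch of extracting it, with hypothetical
function names, assuming the documented "pid (comm) state ..." layout of
that file:

```c
#include <string.h>

/*
 * Return the state character from one line of /proc/<pid>/stat, or '?'
 * if the line is malformed. The command name may itself contain spaces
 * or parentheses, so look for the last ')' instead of splitting on
 * spaces.
 */
char proc_stat_state(const char *stat_line)
{
	const char *p = strrchr(stat_line, ')');

	if (p == NULL)
		return '?';
	p++;
	while (*p == ' ')
		p++;
	return *p != '\0' ? *p : '?';
}

/* Nonzero if the task is in uninterruptible sleep ("D" in ps). */
int is_uninterruptible(const char *stat_line)
{
	return proc_stat_state(stat_line) == 'D';
}
```

Feeding it the stat line of each hung sync process would be one way to
confirm from a script that they are all stuck in the D state.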

Has anyone seen this or have any suggestions how I may be able to go
about fixing it?

Regards,
Morgan

* mxcmmc driver hangs on sync
  2010-06-15  3:29 mxcmmc driver hangs on sync Morgan Howe
@ 2010-06-15  6:31 ` Daniel Mack
  2010-06-15  7:20   ` Morgan Howe
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Mack @ 2010-06-15  6:31 UTC
  To: linux-arm-kernel

Hi,

On Tue, Jun 15, 2010 at 11:29:36AM +0800, Morgan Howe wrote:
> With the older kernel this would hang on sync after a few thousand
> loops, and much sooner if you ran 2 or 3 of these processes at a time.
> I tried last night with the newer kernel and kicking off 3 processes
> and after ~100-150 loops per process I get this:

Which 'newer kernel' did you try?

Daniel

* mxcmmc driver hangs on sync
  2010-06-15  6:31 ` Daniel Mack
@ 2010-06-15  7:20   ` Morgan Howe
  2010-06-15  7:22     ` Daniel Mack
  0 siblings, 1 reply; 10+ messages in thread
From: Morgan Howe @ 2010-06-15  7:20 UTC
  To: linux-arm-kernel

On Tue, 15 Jun 2010 08:31:43 +0200
Daniel Mack <daniel@caiaq.de> wrote:

> Hi,
> 
> On Tue, Jun 15, 2010 at 11:29:36AM +0800, Morgan Howe wrote:
> > With the older kernel this would hang on sync after a few thousand
> > loops, and much sooner if you ran 2 or 3 of these processes at a
> > time. I tried last night with the newer kernel and kicking off 3
> > processes and after ~100-150 loops per process I get this:
> 
> Which 'newer kernel' did you try?

Hey Daniel,

Sorry, I said current mainline, but actually it's 2.6.35-rc1.

Regards,
Morgan

* mxcmmc driver hangs on sync
  2010-06-15  7:20   ` Morgan Howe
@ 2010-06-15  7:22     ` Daniel Mack
  2010-06-15 10:32       ` Morgan Howe
  2010-06-17  8:33       ` Morgan Howe
  0 siblings, 2 replies; 10+ messages in thread
From: Daniel Mack @ 2010-06-15  7:22 UTC
  To: linux-arm-kernel

On Tue, Jun 15, 2010 at 03:20:51PM +0800, Morgan Howe wrote:
> On Tue, 15 Jun 2010 08:31:43 +0200
> Daniel Mack <daniel@caiaq.de> wrote:
> > On Tue, Jun 15, 2010 at 11:29:36AM +0800, Morgan Howe wrote:
> > > With the older kernel this would hang on sync after a few thousand
> > > loops, and much sooner if you ran 2 or 3 of these processes at a
> > > time. I tried last night with the newer kernel and kicking off 3
> > > processes and after ~100-150 loops per process I get this:
> > 
> > Which 'newer kernel' did you try?
> 
> Hey Daniel,
> 
> Sorry, I said current mainline, but actually it's 2.6.35-rc1.

Could you try two things:

a) build a kernel without MX2 DMA support
b) try 2.6.34, as there were some updates to the mxcmmc driver after
   2.6.34 which could be related

Daniel

* mxcmmc driver hangs on sync
  2010-06-15  7:22     ` Daniel Mack
@ 2010-06-15 10:32       ` Morgan Howe
  2010-06-17  8:33       ` Morgan Howe
  1 sibling, 0 replies; 10+ messages in thread
From: Morgan Howe @ 2010-06-15 10:32 UTC
  To: linux-arm-kernel

Hey Daniel,


On Tue, 15 Jun 2010 09:22:37 +0200
Daniel Mack <daniel@caiaq.de> wrote:
> Could you try two things:
> 
> a) build a kernel without MX2 DMA support

I did this and ran the test again several times and the behavior seemed
basically the same.

It writes the file to each SD card and syncs around 100 times, and
then finally:

mxc-mmc mxc-mmc.0: mxcmci_finish_data: No CRC -ETIMEDOUT
mmcblk0: error -110 transferring data, sector 29692583, nr 512, card status 0xc00
end_request: I/O error, dev mmcblk0, sector 29692863
... <snip ~20 similar I/O errors for other sectors>

After this it continues to write without problems and eventually:

Loop: 159
Loop: 162
Loop: 164
Loop: 187
mxc-mmc mxc-mmc.0: mxcmci_finish_data: No CRC -ETIMEDOUT
mmcblk0: error -110 transferring data, sector 29517031, nr 96, card status 0xc00
end_request: I/O error, dev mmcblk0, sector 29517125

Loop # represents the number of times each of the four processes has
copied to SD1, copied from SD1 to SD2, and then called sync.  It
continues after this error again and then:

Loop: 228
Loop: 233
Loop: 231
Loop: 257
mxc-mmc mxc-mmc.1: mxcmci_finish_data: No CRC -ETIMEDOUT
mmcblk1: error -110 transferring data, sector 22419455, nr 56, card status 0xc00

At this point I can see that all four of the test scripts are stuck
at calling sync.

> b) try 2.6.34, as there were some updates to the mxcmmc driver after
>    2.6.34 which could be related

I will try this, but may need a bit more time.

Regards,
Morgan

* mxcmmc driver hangs on sync
  2010-06-15  7:22     ` Daniel Mack
  2010-06-15 10:32       ` Morgan Howe
@ 2010-06-17  8:33       ` Morgan Howe
  2010-06-17 18:51         ` Erik Oomen
  1 sibling, 1 reply; 10+ messages in thread
From: Morgan Howe @ 2010-06-17  8:33 UTC
  To: linux-arm-kernel

Daniel,

On Tue, 15 Jun 2010 09:22:37 +0200
Daniel Mack <daniel@caiaq.de> wrote:
> On Tue, Jun 15, 2010 at 03:20:51PM +0800, Morgan Howe wrote:
> > On Tue, 15 Jun 2010 08:31:43 +0200
> > Daniel Mack <daniel@caiaq.de> wrote:
> > > On Tue, Jun 15, 2010 at 11:29:36AM +0800, Morgan Howe wrote:
> > > > With the older kernel this would hang on sync after a few
> > > > thousand loops, and much sooner if you ran 2 or 3 of these
> > > > processes at a time. I tried last night with the newer kernel
> > > > and kicking off 3 processes and after ~100-150 loops per
> > > > process I get this:
> > > 
> > > Which 'newer kernel' did you try?
> > 
> > Hey Daniel,
> > 
> > Sorry, I said current mainline, but actually it's 2.6.35-rc1.
> 
> Could you try two things:
> 
> a) build a kernel without MX2 DMA support
> b) try 2.6.34, as there were some updates to the mxcmmc driver after
>    2.6.34 which could be related

I have just been able to confirm using the final release of the 2.6.34
kernel that the problem also exists, with the same behavior as
described for 2.6.35.

Regards,
Morgan

* mxcmmc driver hangs on sync
  2010-06-17  8:33       ` Morgan Howe
@ 2010-06-17 18:51         ` Erik Oomen
  2010-06-17 19:59           ` Daniel Mack
  2010-06-18  5:47           ` Morgan Howe
  0 siblings, 2 replies; 10+ messages in thread
From: Erik Oomen @ 2010-06-17 18:51 UTC
  To: linux-arm-kernel

Morgan,

> Daniel,
> 
> On Tue, 15 Jun 2010 09:22:37 +0200
> Daniel Mack <daniel@caiaq.de> wrote:
>> On Tue, Jun 15, 2010 at 03:20:51PM +0800, Morgan Howe wrote:
>>> On Tue, 15 Jun 2010 08:31:43 +0200
>>> Daniel Mack <daniel@caiaq.de> wrote:
>>>> On Tue, Jun 15, 2010 at 11:29:36AM +0800, Morgan Howe wrote:
>>>>> With the older kernel this would hang on sync after a few
>>>>> thousand loops, and much sooner if you ran 2 or 3 of these
>>>>> processes at a time. I tried last night with the newer kernel
>>>>> and kicking off 3 processes and after ~100-150 loops per
>>>>> process I get this:
>>>> 
>>>> Which 'newer kernel' did you try?
>>> 
>>> Hey Daniel,
>>> 
>>> Sorry, I said current mainline, but actually it's 2.6.35-rc1.
>> 
>> Could you try two things:
>> 
>> a) build a kernel without MX2 DMA support
>> b) try 2.6.34, as there were some updates to the mxcmmc driver after
>>   2.6.34 which could be related

We've had the same problems for various kernels and mxcmmc modifications. The following fixed it.  We applied it to the 2.6.28 kernel and have been writing and reading *many* Gigabytes without a problem. 

diff --git a/arch/arm/plat-mxc/dma-mx1-mx2.c b/arch/arm/plat-mxc/dma-mx1-mx2.c
index e16014b..f295d68 100644
--- a/arch/arm/plat-mxc/dma-mx1-mx2.c
+++ b/arch/arm/plat-mxc/dma-mx1-mx2.c
@@ -653,7 +653,9 @@ static void dma_irq_handle_channel(int chno)
 static irqreturn_t dma_irq_handler(int irq, void *dev_id)
 {
        int i, disr;
+       unsigned long flags;
 
+       local_irq_save(flags); 
 #ifdef CONFIG_ARCH_MX2
        if (cpu_is_mx21() || cpu_is_mx27())
                dma_err_handler(irq, dev_id);
@@ -669,7 +671,7 @@ static irqreturn_t dma_irq_handler(int irq, void *dev_id)
                if (disr & (1 << i))
                        dma_irq_handle_channel(i);
        }
-
+       local_irq_restore(flags);
        return IRQ_HANDLED;
 }
 
Regards,
  Erik

* mxcmmc driver hangs on sync
  2010-06-17 18:51         ` Erik Oomen
@ 2010-06-17 19:59           ` Daniel Mack
  2010-06-17 20:43             ` Erik Oomen
  2010-06-18  5:47           ` Morgan Howe
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Mack @ 2010-06-17 19:59 UTC
  To: linux-arm-kernel

On Thu, Jun 17, 2010 at 08:51:37PM +0200, Erik Oomen wrote:
> > On Tue, 15 Jun 2010 09:22:37 +0200
> > Daniel Mack <daniel@caiaq.de> wrote:
> >> On Tue, Jun 15, 2010 at 03:20:51PM +0800, Morgan Howe wrote:
> >>> On Tue, 15 Jun 2010 08:31:43 +0200
> >>> Daniel Mack <daniel@caiaq.de> wrote:
> >>>> On Tue, Jun 15, 2010 at 11:29:36AM +0800, Morgan Howe wrote:
> >>>>> With the older kernel this would hang on sync after a few
> >>>>> thousand loops, and much sooner if you ran 2 or 3 of these
> >>>>> processes at a time. I tried last night with the newer kernel
> >>>>> and kicking off 3 processes and after ~100-150 loops per
> >>>>> process I get this:
> >>>> 
> >>>> Which 'newer kernel' did you try?
> >>> 
> >>> Hey Daniel,
> >>> 
> >>> Sorry, I said current mainline, but actually it's 2.6.35-rc1.
> >> 
> >> Could you try two things:
> >> 
> >> a) build a kernel without MX2 DMA support
> >> b) try 2.6.34, as there were some updates to the mxcmmc driver after
> >>   2.6.34 which could be related
> 
> We've had the same problems for various kernels and mxcmmc modifications. The following fixed it.  We applied it to the 2.6.28 kernel and have been writing and reading *many* Gigabytes without a problem. 

Interesting. Did you try to push this back to mainline?

Daniel

* mxcmmc driver hangs on sync
  2010-06-17 19:59           ` Daniel Mack
@ 2010-06-17 20:43             ` Erik Oomen
  0 siblings, 0 replies; 10+ messages in thread
From: Erik Oomen @ 2010-06-17 20:43 UTC
  To: linux-arm-kernel


>> We've had the same problems for various kernels and mxcmmc modifications. The following fixed it.  We applied it to the 2.6.28 kernel and have been writing and reading *many* Gigabytes without a problem. 
> 
> Interesting. Did you try to push this back to mainline?
> 
> Daniel

No, at that time we were not sure this patch was a fix. Let's see if this solves Morgan's problems.

Erik.

* mxcmmc driver hangs on sync
  2010-06-17 18:51         ` Erik Oomen
  2010-06-17 19:59           ` Daniel Mack
@ 2010-06-18  5:47           ` Morgan Howe
  1 sibling, 0 replies; 10+ messages in thread
From: Morgan Howe @ 2010-06-18  5:47 UTC
  To: linux-arm-kernel

Hey Erik,

On Thu, 17 Jun 2010 20:51:37 +0200
Erik Oomen <erik.oomen@zepcam.com> wrote:
> We've had the same problems for various kernels and mxcmmc
> modifications. The following fixed it.  We applied it to the 2.6.28
> kernel and have been writing and reading *many* Gigabytes without a
> problem. 

We ran into this problem using 2 SD cards and assumed that it was
directly related to that.  This morning I decided to try and reproduce
this problem using only a single SD card, and I was indeed able to do
so.

Thanks for the patch - it does appear to fix the problem for a single
SD, but when using 2 SD cards I still ran into the issue.  However,
replacing your irq_save/restore calls with a spinlock seems to fix the
issue for multiple SD cards.  Is using a spinlock an appropriate way to
fix this or is there some other way that would be preferable?

Regards,
Morgan
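
For reference, the spinlock variant Morgan describes might look roughly
like the sketch below. Morgan does not say which locking calls were used
or whether the lock is also taken outside the interrupt handler, so this
assumes spin_lock_irqsave()/spin_unlock_irqrestore() around the handler
body only, and the lock name dma_irq_lock is hypothetical. It is a kernel
fragment for arch/arm/plat-mxc/dma-mx1-mx2.c that relies on that file's
existing helpers and does not build on its own:

```c
#include <linux/interrupt.h>	/* irqreturn_t, IRQ_HANDLED */
#include <linux/spinlock.h>	/* DEFINE_SPINLOCK, spin_lock_irqsave */

/* Hypothetical name: the thread does not show Morgan's actual lock. */
static DEFINE_SPINLOCK(dma_irq_lock);

static irqreturn_t dma_irq_handler(int irq, void *dev_id)
{
	unsigned long flags;

	/* Take the lock and disable local interrupts, saving their state. */
	spin_lock_irqsave(&dma_irq_lock, flags);

	/*
	 * Handler body unchanged from the existing driver: the MX2 error
	 * check, then dma_irq_handle_channel(i) for each bit set in disr.
	 */

	/* Release the lock and restore the saved interrupt state. */
	spin_unlock_irqrestore(&dma_irq_lock, flags);
	return IRQ_HANDLED;
}
```

On a uniprocessor build, which is what an i.MX27 runs,
spin_lock_irqsave() largely reduces to local_irq_save() plus disabling
preemption. If this variant behaves differently from Erik's patch, the
difference may lie in where else the lock is taken; the thread does not
settle that.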
