linux-kernel.vger.kernel.org archive mirror
* [BISECTED, REGRESSION] OMAP3 onenand/DMA broken
@ 2020-01-03  8:17 Aaro Koskinen
  2020-01-03  8:46 ` H. Nikolaus Schaller
  2020-01-04  7:38 ` Peter Ujfalusi
  0 siblings, 2 replies; 5+ messages in thread
From: Aaro Koskinen @ 2020-01-03  8:17 UTC
  To: Peter Ujfalusi, Tony Lindgren; +Cc: linux-omap, linux-kernel

Hi,

When booting v5.4 (or v5.5-rc4) on N900, the console gets flooded with:

[    8.335754] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.365753] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.395751] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.425750] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.455749] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.485748] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.515777] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.545776] omap2-onenand 1000000.onenand: timeout waiting for DMA
[    8.575775] omap2-onenand 1000000.onenand: timeout waiting for DMA

making the system unusable.

Bisected to:

4689d35c765c696bdf0535486a990038b242a26b is the first bad commit
commit 4689d35c765c696bdf0535486a990038b242a26b
Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
Date:   Tue Jul 16 11:24:59 2019 +0300

    dmaengine: ti: omap-dma: Improved memcpy polling support

The commit does not revert cleanly anymore. Any ideas how to fix this?

A.

* Re: [BISECTED, REGRESSION] OMAP3 onenand/DMA broken
  2020-01-03  8:17 [BISECTED, REGRESSION] OMAP3 onenand/DMA broken Aaro Koskinen
@ 2020-01-03  8:46 ` H. Nikolaus Schaller
  2020-01-03 17:23   ` Aaro Koskinen
  2020-01-04  7:38 ` Peter Ujfalusi
  1 sibling, 1 reply; 5+ messages in thread
From: H. Nikolaus Schaller @ 2020-01-03  8:46 UTC
  To: Aaro Koskinen, Peter Ujfalusi, Tony Lindgren
  Cc: Linux-OMAP, Linux Kernel Mailing List,
	Discussions about the Letux Kernel

Hi,

> On 03.01.2020 at 09:17, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> 
> Hi,
> 
> When booting v5.4 (or v5.5-rc4) on N900, the console gets flooded with:
> 
> [    8.335754] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.365753] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.395751] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.425750] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.455749] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.485748] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.515777] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.545776] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.575775] omap2-onenand 1000000.onenand: timeout waiting for DMA
> 
> making the system unusable.

I can confirm that this issue exists but so far we failed to bisect
and make a proper report.

Sometimes the system boots fine and sometimes it fails.

It happens on omap3-gta04a5one.dts only, but not with omap3-gta04a4.dts
(both dm3730 but different NAND).

> 
> Bisected to:
> 
> 4689d35c765c696bdf0535486a990038b242a26b is the first bad commit
> commit 4689d35c765c696bdf0535486a990038b242a26b
> Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
> Date:   Tue Jul 16 11:24:59 2019 +0300
> 
>    dmaengine: ti: omap-dma: Improved memcpy polling support
> 
> The commit does not revert cleanly anymore. Any ideas how to fix this?
> 
> A.

BR, Nikolaus

* Re: [BISECTED, REGRESSION] OMAP3 onenand/DMA broken
  2020-01-03  8:46 ` H. Nikolaus Schaller
@ 2020-01-03 17:23   ` Aaro Koskinen
  2020-01-03 18:29     ` H. Nikolaus Schaller
  0 siblings, 1 reply; 5+ messages in thread
From: Aaro Koskinen @ 2020-01-03 17:23 UTC
  To: H. Nikolaus Schaller
  Cc: Peter Ujfalusi, Tony Lindgren, Linux-OMAP,
	Linux Kernel Mailing List, Discussions about the Letux Kernel

Hi,

On Fri, Jan 03, 2020 at 09:46:58AM +0100, H. Nikolaus Schaller wrote:
> > On 03.01.2020 at 09:17, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> > When booting v5.4 (or v5.5-rc4) on N900, the console gets flooded with:
> > 
> > [    8.335754] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.365753] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.395751] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.425750] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.455749] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.485748] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.515777] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.545776] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > [    8.575775] omap2-onenand 1000000.onenand: timeout waiting for DMA
> > 
> > making the system unusable.
> 
> I can confirm that this issue exists but so far we failed to bisect
> and make a proper report.
> 
> Sometimes the system boots fine and sometimes it fails.
> 
> It happens on omap3-gta04a5one.dts only, but not with omap3-gta04a4.dts
> (both dm3730 but different NAND).

I tried three different boards (N810, N900 and N950) and it always
fails reliably.

A.

* Re: [BISECTED, REGRESSION] OMAP3 onenand/DMA broken
  2020-01-03 17:23   ` Aaro Koskinen
@ 2020-01-03 18:29     ` H. Nikolaus Schaller
  0 siblings, 0 replies; 5+ messages in thread
From: H. Nikolaus Schaller @ 2020-01-03 18:29 UTC
  To: Aaro Koskinen
  Cc: Peter Ujfalusi, Tony Lindgren, Linux-OMAP,
	Linux Kernel Mailing List, Discussions about the Letux Kernel

Hi Aaro,

> On 03.01.2020 at 18:23, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
> 
> Hi,
> 
> On Fri, Jan 03, 2020 at 09:46:58AM +0100, H. Nikolaus Schaller wrote:
>>> On 03.01.2020 at 09:17, Aaro Koskinen <aaro.koskinen@iki.fi> wrote:
>>> When booting v5.4 (or v5.5-rc4) on N900, the console gets flooded with:
>>> 
>>> [    8.335754] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.365753] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.395751] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.425750] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.455749] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.485748] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.515777] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.545776] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> [    8.575775] omap2-onenand 1000000.onenand: timeout waiting for DMA
>>> 
>>> making the system unusable.
>> 
>> I can confirm that this issue exists but so far we failed to bisect
>> and make a proper report.
>> 
>> Sometimes the system boots fine and sometimes it fails.

Well, we boot from µSD and the number of timeouts varies. So whether or
not we reach a login: prompt may be down to a race or to the driver load
sequence. But this is not the real bug.

>> 
>> It happens on omap3-gta04a5one.dts only, but not with omap3-gta04a4.dts
>> (both dm3730 but different NAND).
> 
> I tried three different boards (N810, N900 and N950) and it always
> fails reliably.

The big question is why the patch is harmful.

I tried to understand what the patch is doing (without any knowledge
of the DMA hardware or software architecture).

Basically it reorders error handling and some corner cases.
Maybe one of them, which occurs only with OneNAND, is now handled differently.

What jumped out at me is that before the patch there is an
unconditional call to omap_dma_chan_read(c, CCR) if (!c->paused && c->running).

After that, either DMA_COMPLETE is returned, or ret if txstate == 0.

With the new code, the check for DMA_COMPLETE comes first and
leads directly to a return, independently of txstate.

So if we have (!c->paused && c->running) and dma_cookie_status()
returns DMA_COMPLETE, there is no longer a call to omap_dma_chan_read().

Since I do not understand what omap_dma_chan_read() does, or
whether (!c->paused && c->running) is relevant here,
I cannot tell whether that is harmful.

But I can imagine that reading a register may have the side effect of
resetting some bits, as with interrupt status registers.
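
To illustrate the ordering change described above, here is a toy model in C.
It is not the omap-dma driver code: every name in it (toy_chan,
toy_read_ccr_enable, toy_status_old, toy_status_new) is hypothetical, and the
rule that a cleared enable bit means "complete" is an assumption made only for
this sketch. It shows nothing more than that an early return on a completed
cookie skips the register read.

```c
/* Toy model only. NOT the omap-dma driver; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_status { TOY_IN_PROGRESS, TOY_COMPLETE };

struct toy_chan {
    bool paused;
    bool running;
    bool hw_enabled;         /* stands in for an enable bit in CCR (assumed) */
    enum toy_status cookie;  /* stands in for the dma_cookie_status() result */
    unsigned int ccr_reads;  /* counts how often the "register" was read */
};

/* Stand-in for a register read; it only counts accesses. */
static bool toy_read_ccr_enable(struct toy_chan *c)
{
    assert(c != NULL);
    c->ccr_reads++;
    return c->hw_enabled;
}

/*
 * Ordering described for the code before the patch: the register is read
 * whenever the channel is running and not paused, whatever the cookie says.
 */
static enum toy_status toy_status_old(struct toy_chan *c)
{
    enum toy_status ret = c->cookie;

    if (!c->paused && c->running) {
        if (!toy_read_ccr_enable(c))
            ret = TOY_COMPLETE; /* assumption: disabled channel == done */
    }
    return ret;
}

/*
 * Ordering described for the code after the patch: a completed cookie
 * returns immediately, so the register read above is never reached.
 */
static enum toy_status toy_status_new(struct toy_chan *c)
{
    enum toy_status ret = c->cookie;

    if (ret == TOY_COMPLETE)
        return ret;

    if (!c->paused && c->running) {
        if (!toy_read_ccr_enable(c))
            ret = TOY_COMPLETE; /* same assumption as in toy_status_old() */
    }
    return ret;
}
```

For a channel that is running, not paused, and whose cookie already reports
completion, toy_status_old() performs one register read and toy_status_new()
performs none. Whether the skipped read matters on real hardware is exactly
the open question.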

I hope that Peter or Tony can respond soon.

BR and thanks,
Nikolaus

* Re: [BISECTED, REGRESSION] OMAP3 onenand/DMA broken
  2020-01-03  8:17 [BISECTED, REGRESSION] OMAP3 onenand/DMA broken Aaro Koskinen
  2020-01-03  8:46 ` H. Nikolaus Schaller
@ 2020-01-04  7:38 ` Peter Ujfalusi
  1 sibling, 0 replies; 5+ messages in thread
From: Peter Ujfalusi @ 2020-01-04  7:38 UTC
  To: Aaro Koskinen, Tony Lindgren; +Cc: linux-omap, linux-kernel

Hi Aaro,

On 1/3/20 10:17 AM, Aaro Koskinen wrote:
> Hi,
> 
> When booting v5.4 (or v5.5-rc4) on N900, the console gets flooded with:
> 
> [    8.335754] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.365753] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.395751] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.425750] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.455749] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.485748] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.515777] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.545776] omap2-onenand 1000000.onenand: timeout waiting for DMA
> [    8.575775] omap2-onenand 1000000.onenand: timeout waiting for DMA
> 
> making the system unusable.
> 
> Bisected to:
> 
> 4689d35c765c696bdf0535486a990038b242a26b is the first bad commit
> commit 4689d35c765c696bdf0535486a990038b242a26b
> Author: Peter Ujfalusi <peter.ujfalusi@ti.com>
> Date:   Tue Jul 16 11:24:59 2019 +0300
> 
>     dmaengine: ti: omap-dma: Improved memcpy polling support
> 
> The commit does not revert cleanly anymore. Any ideas how to fix this?

I certainly tested memcpy via dmatest in both polled and non-polled mode...

I can take a look on Tuesday at the earliest, but I have sent an (untested)
patch which should fix the issue:
https://lore.kernel.org/lkml/20200104073453.16077-1-peter.ujfalusi@ti.com/


> 
> A.
> 

- Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
