Linux-mtd Archive on lore.kernel.org
From: Miquel Raynal <miquel.raynal@bootlin.com>
To: Greg Ungerer <gerg@kernel.org>
Cc: s.hauer@pengutronix.de,
	Michael Nazzareno Trimarchi <michael@amarulasolutions.com>,
	linux-mtd@lists.infradead.org,
	Boris Brezillon <bbrezillon@kernel.org>
Subject: Re: GPMI iMX6ull timeout on DMA
Date: Mon, 29 Jul 2019 14:47:30 +0200
Message-ID: <20190729144730.4a58de32@xps13> (raw)
In-Reply-To: <18734a1d-17d9-d390-58ef-ad8ca1be925f@kernel.org>

Hi Greg,

+ Boris

Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:

> Hi Miquel,
> 
> On 29/7/19 6:36 pm, Miquel Raynal wrote:
> > Hi Greg,
> > 
> > One question below.
> > 
> > +Michael
> > +Sascha
> > 
> > Hello Michael, here is an issue similar to yours. I know you did not
> > have enough time to share your solution, but now someone else is
> > reproducing the issue. Would you mind sharing a branch or a patch, even
> > a WIP one, just to help debugging?
> > 
> > Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
> >   
> >> Hi Miquel,
> >>
> >> I am experiencing a problem with NAND flash DMA timeouts on
> >> iMX6ull based boards. The problem is very similar to that
> >> described in:
> >>
> >>     https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma
> >>
> >> That didn't come to any specific resolution that I could see
> >> in that thread.
> >>
> >> The boot trace on the console for me looks like this:
> >>
> >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> >> nand: Micron MT29F2G08ABAEAWP
> >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
> >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
> >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
> >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
> >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
> >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
> >> gpmi-nand 1806000.gpmi-nand: Show BCH registers :
> >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
> >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
> >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
> >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
> >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
> >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
> >> gpmi-nand 1806000.gpmi-nand: BCH Geometry :
> >> GF length              : 13
> >> ECC Strength           : 8
> >> Page Size in Bytes     : 2110
> >> Metadata Size in Bytes : 10
> >> ECC Chunk0 Size in Bytes: 512
> >> ECC Chunkn Size in Bytes: 512
> >> ECC Chunk Count        : 4
> >> Payload Size in Bytes  : 2048
> >> Auxiliary Size in Bytes: 16
> >> Auxiliary Status Offset: 12
> >> Block Mark Byte Offset : 1999
> >> Block Mark Bit Offset  : 0
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110
> >> nand: timing mode 5 not acknowledged by the NAND chip  
> > 
> > What is the final timing mode used? Most of us tested in mode 5, I
> > guess; maybe mode 4 is broken (I don't know if this is the one used
> > here, nor why mode 5 is refused). Can you please try limiting the mode
> > to 0, 1, 2... until, hopefully, we narrow it down to the failing mode?
> 
> Sure, how to do that?

This loop [1] tries to configure each timing mode in turn (5, 4, ...)
until one succeeds; mode 0 is the default and must always work. Please
try capping the mode at 0, then 1, and so on, until we find the one
that fails.

Mode 0 should work.

[1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933
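
To make the request above concrete, here is a minimal user-space sketch
of a descending fallback loop of that kind. It is not the kernel code
at [1]: pick_timing_mode(), try_mode_fn, accept_up_to() and the
max_mode cap are hypothetical names used only for illustration, and the
"supported modes" bitmask layout (bit N set means mode N is advertised)
is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns 0 if the given mode is accepted. */
typedef int (*try_mode_fn)(int mode, void *ctx);

/*
 * Walk from the highest allowed mode down to mode 1 and return the first
 * mode that is both advertised in supported_mask and accepted by try_mode.
 * Fall back to mode 0 when no faster mode is accepted.
 * max_mode is the experimental cap: set it to 0, 1, 2... to bisect.
 */
int pick_timing_mode(unsigned int supported_mask, int max_mode,
                     try_mode_fn try_mode, void *ctx)
{
    assert(try_mode != NULL);

    if (max_mode > 5)
        max_mode = 5;

    for (int mode = max_mode; mode > 0; mode--) {
        if (!(supported_mask & (1u << mode)))
            continue;           /* mode not advertised, skip it */
        if (try_mode(mode, ctx) == 0)
            return mode;        /* first accepted mode wins */
    }

    return 0;                   /* default mode */
}

/* Example callback: pretend every mode above *ctx is rejected. */
int accept_up_to(int mode, void *ctx)
{
    int limit = *(int *)ctx;

    return mode <= limit ? 0 : -1;
}
```

With a cap of 0 the loop body never runs and mode 0 is returned, which
is the baseline case; raising the cap one step at a time should show at
which mode the DMA timeout starts to appear.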

> 
> 
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> Scanning device for bad blocks
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> ....
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> >> 5 fixed-partitions partitions found on MTD device gpmi-nand
> >> Creating 5 MTD partitions on "gpmi-nand":
> >> 0x000000000000-0x000000500000 : "u-boot"
> >> 0x000000500000-0x000000600000 : "u-boot-env"
> >> 0x000000600000-0x000000800000 : "log"
> >> 0x000000800000-0x000010000000 : "flash"
> >> 0x000000000000-0x000010000000 : "all"
> >> gpmi-nand 1806000.gpmi-nand: driver registered.
> >>
> >>
> >> This is using a linux kernel v5.1.14. I have seen this happen on
> >> a number of boards I have here - but it is only occasional. It
> >> only happens once in a while on boot, maybe 1 in 40 or more times.
> >> So it can take quite a while to reproduce (using a boot loop setup).  
> > 
> > That's strange... I don't get what would produce such an unstable issue.
> 
> My initial guess is that the calculated timing is very marginal.

What do you mean by "marginal"?

> The problem seems more likely to happen if flash write activity
> had been occurring just before a soft reboot. It's not a guarantee,
> just more likely.

That's really disturbing. I doubt this is the real cause though.

> 
> Interesting observation is that Michael was using Micron flash,
> and boards that I have with the problem also have Micron flash.
> Both a form of Micron MT29F2G08.
> 
> I have similar boards, iMX6ull based, with different brands of
> NAND flash and I have not seen any problem on them.

That's a great help in narrowing down the root cause. Maybe these chips
have tighter timing constraints.

> 
> Regards
> Greg
> 
> 
> 
> >> As per the email thread I pointed to above I looked at reverting
> >> those patches, but that was not at all easy given how much the gpmi
> >> driver code had moved. So instead I modified the code with this:
> >>
> >> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> >> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> >> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
> >>  
> >>  void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> >>  {
> >> +#if 0
> >>  	struct gpmi_nfc_hardware_timing *hw = &this->hw;
> >>  	struct resources *r = &this->resources;
> >>  	void __iomem *gpmi_regs = r->gpmi_regs;
> >> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> >>  
> >>  	/* Wait for the DLL to settle. */
> >>  	udelay(dll_wait_time_us);
> >> +#endif
> >>  }
> >>  
> >>  int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,
> >>
> >> So far after a couple of days of testing with this I no longer
> >> see the DMA timeout.
> >>
> >> Any thoughts?
> >>
> >> Regards
> >> Greg
> >>  
> > 
> > Thanks,
> > Miquèl
> >   

Thanks,
Miquèl

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

