From: Greg Ungerer <gerg@kernel.org>
To: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: s.hauer@pengutronix.de,
Michael Nazzareno Trimarchi <michael@amarulasolutions.com>,
linux-mtd@lists.infradead.org,
Boris Brezillon <bbrezillon@kernel.org>
Subject: Re: GPMI iMX6ull timeout on DMA
Date: Tue, 30 Jul 2019 10:28:46 +1000
Message-ID: <17b49e7d-ff63-315f-cf12-3474f7228c6d@kernel.org>
In-Reply-To: <20190729144730.4a58de32@xps13>
Hi Miquel,
On 29/7/19 10:47 pm, Miquel Raynal wrote:
> Hi Greg,
>
> + Boris
>
> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
>
>> Hi Miquel,
>>
>> On 29/7/19 6:36 pm, Miquel Raynal wrote:
>>> Hi Greg,
>>>
>>> One question below.
>>>
>>> +Michael
>>> +Sascha
>>>
>>> Hello Michael, here is a similar issue to yours, I know you did not
>>> have enough time to share your solution but here we have someone else
>>> reproducing the issue, would you mind sharing a branch or a patch, even
>>> a WIP one, just to help debugging?
>>>
>>> Greg Ungerer <gerg@kernel.org> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
>>>
>>>> Hi Miquel,
>>>>
>>>> I am experiencing a problem with NAND flash DMA timeouts on
>>>> iMX6ull based boards. The problem is very similar to that
>>>> described in:
>>>>
>>>> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma
>>>>
>>>> That didn't come to any specific resolution that I could see
>>>> in that thread.
>>>>
>>>> The boot trace on the console for me looks like this:
>>>>
>>>> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
>>>> nand: Micron MT29F2G08ABAEAWP
>>>> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
>>>> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
>>>> gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
>>>> gpmi-nand 1806000.gpmi-nand: Show BCH registers :
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
>>>> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
>>>> gpmi-nand 1806000.gpmi-nand: BCH Geometry :
>>>> GF length : 13
>>>> ECC Strength : 8
>>>> Page Size in Bytes : 2110
>>>> Metadata Size in Bytes : 10
>>>> ECC Chunk0 Size in Bytes: 512
>>>> ECC Chunkn Size in Bytes: 512
>>>> ECC Chunk Count : 4
>>>> Payload Size in Bytes : 2048
>>>> Auxiliary Size in Bytes: 16
>>>> Auxiliary Status Offset: 12
>>>> Block Mark Byte Offset : 1999
>>>> Block Mark Bit Offset : 0
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110
>>>> nand: timing mode 5 not acknowledged by the NAND chip
>>>
>>> What is the final timing mode used? Most of us tested in mode 5 I
>>> guess, so maybe mode 4 is broken (I don't know if this is the one used
>>> here, nor why mode 5 is refused). Can you please try limiting the mode
>>> to 0, 1, 2... until, hopefully, we narrow it down to the failing mode.
>>
>> Sure, how to do that?
>
> This loop [1] tries to configure each mode (5, 4, ...) until one
> succeeds (default is 0: must always work). Please try to limit mode to
> 0, 1, etc.
>
> Mode 0 should work.
>
> [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933
In the normal case, which usually works, chip->onfi_timing_mode_default
is 5 here. In other words, the first pass through this loop checks mode 5,
the check succeeds, and mode 5 is set as the default.
I am running a reboot test loop now, waiting for a failure, to see
whether mode 5 is still selected in that case.
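The fallback behaviour described for the loop at [1] can be modelled roughly
as below. This is only a simplified stand-alone sketch, not the kernel code:
select_timing_mode(), try_mode_fn and accepts_up_to_3() are hypothetical
names made up for illustration, and mode 0 is assumed to always work, as
stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if the chip accepts the given mode. */
typedef bool (*try_mode_fn)(int mode);

/*
 * Walk from the highest allowed mode down to 1 and return the first mode
 * that is accepted. Mode 0 is the fallback and is assumed to always work,
 * so it is returned without being probed.
 */
static int select_timing_mode(int max_mode, try_mode_fn try_mode)
{
	assert(try_mode != NULL);

	for (int mode = max_mode; mode > 0; mode--) {
		if (try_mode(mode))
			return mode;
	}
	return 0;
}

/* Example chip model (assumption for the demo): rejects modes above 3. */
static bool accepts_up_to_3(int mode)
{
	return mode <= 3;
}
```

Capping max_mode at 0, 1, 2 and so on corresponds to the experiment
suggested above: the highest cap that still boots reliably points at the
first failing mode.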
Regards
Greg
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> Scanning device for bad blocks
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> ....
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
>>>> 5 fixed-partitions partitions found on MTD device gpmi-nand
>>>> Creating 5 MTD partitions on "gpmi-nand":
>>>> 0x000000000000-0x000000500000 : "u-boot"
>>>> 0x000000500000-0x000000600000 : "u-boot-env"
>>>> 0x000000600000-0x000000800000 : "log"
>>>> 0x000000800000-0x000010000000 : "flash"
>>>> 0x000000000000-0x000010000000 : "all"
>>>> gpmi-nand 1806000.gpmi-nand: driver registered.
>>>>
>>>>
>>>> This is using Linux kernel v5.1.14. I have seen this happen on
>>>> a number of boards I have here - but it is only occasional. It
>>>> only happens once in a while on boot, maybe 1 in 40 or more times.
>>>> So it can take quite a while to reproduce (using a boot loop setup).
>>>
>>> That's strange... I don't get what would produce such an unstable issue.
>>
>> My initial guess is that the calculated timing is very marginal.
>
> What do you mean by "marginal"?
>
>> The problem seems more likely to happen if flash write activity
>> had been occurring just before a soft reboot. It's not a guarantee,
>> just more likely.
>
> That's really disturbing. I doubt this is the real cause though.
>
>>
>> Interesting observation is that Michael was using Micron flash,
>> and boards that I have with the problem also have Micron flash.
>> Both a form of Micron MT29F2G08.
>>
>> I have similar boards, iMX6ull based, with different brands of
>> NAND flash and I have not seen any problem on them.
>
> That's great to narrow down the root cause. Maybe these chips have
> tighter timing constraints.
>
>>
>> Regards
>> Greg
>>
>>
>>
>>>> As per the email thread I pointed to above I looked at reverting
>>>> those patches, but that was not at all easy given how much the gpmi
>>>> driver code had moved. So instead I modified the code with this:
>>>>
>>>> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
>>>> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
>>>> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
>>>> void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
>>>> {
>>>> +#if 0
>>>> struct gpmi_nfc_hardware_timing *hw = &this->hw;
>>>> struct resources *r = &this->resources;
>>>> void __iomem *gpmi_regs = r->gpmi_regs;
>>>> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
>>>> /* Wait for the DLL to settle. */
>>>> udelay(dll_wait_time_us);
>>>> +#endif
>>>> }
>>>> int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,
>>>>
>>>> So far after a couple of days of testing with this I no longer
>>>> see the DMA timeout.
>>>>
>>>> Any thoughts?
>>>>
>>>> Regards
>>>> Greg
>>>>
>>>
>>> Thanks,
>>> Miquèl
>>>
>
> Thanks,
> Miquèl
>