linux-mtd.lists.infradead.org archive mirror
* mtd_nandbiterrs errors
From: JH @ 2020-02-04 12:32 UTC
  To: linux-mtd

Hi,

I am running kernel version 4.19.75 on an iMX6, and I have a problem running
mtd_nandbiterrs:

# modprobe mtd_nandbiterrs
[  695.090585]
[  695.092143] ==================================================
[  695.098317] mtd_nandbiterrs: MTD device: 0
[  695.114256] mtd_nandbiterrs: MTD device size 1048576, eraseblock=131072, pag4
[  695.122867] mtd_nandbiterrs: Device uses 1 subpages of 2048 bytes
[  695.129138] mtd_nandbiterrs: Using page=0, offset=0, eraseblock=0
[  695.144888] mtd_nandbiterrs: incremental biterrors test
[  695.150594] mtd_nandbiterrs: write_page
[  695.158629] mtd_nandbiterrs: rewrite page
[  695.163488] mtd_nandbiterrs: read_page
[  695.170790] mtd_nandbiterrs: verify_page
[  695.174887] mtd_nandbiterrs: Successfully corrected 0 bit errors per subpage
[  695.182279] mtd_nandbiterrs: Inserted biterror @ 0/5
[  695.187387] mtd_nandbiterrs: rewrite page
[  695.196243] mtd_nandbiterrs: read_page
[  695.202608] mtd_nandbiterrs: Read reported 1 corrected bit errors
[  695.209115] mtd_nandbiterrs: verify_page
[  695.213192] mtd_nandbiterrs: Successfully corrected 1 bit errors per subpage
[  695.220361] mtd_nandbiterrs: Inserted biterror @ 0/2
[  695.225361] mtd_nandbiterrs: rewrite page
[  695.235261] mtd_nandbiterrs: read_page
[  695.240237] mtd_nandbiterrs: Read reported 2 corrected bit errors
[  695.246384] mtd_nandbiterrs: verify_page
[  695.250771] mtd_nandbiterrs: Successfully corrected 2 bit errors per subpage
[  695.257984] mtd_nandbiterrs: Inserted biterror @ 0/0
[  695.262984] mtd_nandbiterrs: rewrite page
[  695.273646] mtd_nandbiterrs: read_page
[  695.280000] mtd_nandbiterrs: Read reported 2 corrected bit errors
[  695.286230] mtd_nandbiterrs: verify_page
[  695.290489] mtd_nandbiterrs: Error: page offset 0, expected 25, got 00
[  695.297155] mtd_nandbiterrs: Error: page offset 282, expected 29, got 28
[  695.303897] mtd_nandbiterrs: Error: page offset 359, expected a7, got 27
[  695.310834] mtd_nandbiterrs: ECC failure, read data is incorrect despite reas
modprobe: ERROR: could not insert 'mtd_nandbiterrs': Input/output error

What did I get wrong here?

Thank you.

Kind regards,

- jh

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/


* Re: mtd_nandbiterrs errors
From: JH @ 2020-02-05 11:28 UTC
  To: linux-mtd

Resolved. Using the kernel test was probably a bad idea; switching to the
mtd-utils nandbiterrs resolved the issue.

Thank you.

Kind regards,

- jh



* Re: mtd_nandbiterrs errors
From: Boris Brezillon @ 2020-02-05 20:23 UTC
  To: JH; +Cc: linux-mtd

On Wed, 5 Feb 2020 22:28:50 +1100
JH <jupiter.hce@gmail.com> wrote:

> Resolved. Using the kernel test was probably a bad idea; switching to the
> mtd-utils nandbiterrs resolved the issue.

I doubt it solved the real problem: ECC is not working properly.

> > [  695.257984] mtd_nandbiterrs: Inserted biterror @ 0/0
> > [  695.262984] mtd_nandbiterrs: rewrite page
> > [  695.273646] mtd_nandbiterrs: read_page
> > [  695.280000] mtd_nandbiterrs: Read reported 2 corrected bit errors

The ECC engine should report an uncorrectable error here, not 2
corrected bits. BTW, an ECC of 2bits/512bytes sounds weak for a 2k-page
NAND. What's the NAND part you're testing with?
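
One way to see this from the log is to XOR the expected and observed bytes
reported by verify_page. A minimal sketch in Python, assuming the "@ 0/5"
notation means byte 0, bit 5; the helper name flipped_bits is made up for
illustration, and the byte values are copied from the log quoted in this
thread:

```python
from typing import List

# (page offset, expected byte, observed byte), copied from the
# "Error: page offset ..." lines of the mtd_nandbiterrs log.
MISMATCHES = [
    (0, 0x25, 0x00),
    (282, 0x29, 0x28),
    (359, 0xA7, 0x27),
]


def flipped_bits(expected: int, got: int) -> List[int]:
    """Return the bit positions (0 = least significant) that differ."""
    diff = expected ^ got
    return [bit for bit in range(8) if diff & (1 << bit)]


for offset, expected, got in MISMATCHES:
    print(f"offset {offset}: differing bits {flipped_bits(expected, got)}")
```

Under that assumption, byte 0 still differs in bits 0, 2 and 5, which are
exactly the three inserted errors, while offsets 282 and 359 each differ in
one bit that the test never touched. That pattern would be consistent with
the ECC engine flipping two unrelated bits and reporting them as corrections
instead of flagging the page as uncorrectable.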


* Re: mtd_nandbiterrs errors
From: JH @ 2020-02-06  0:20 UTC
  To: Boris Brezillon; +Cc: linux-mtd

Hi Boris,

Thanks for the response.

On 2/6/20, Boris Brezillon <boris.brezillon@collabora.com> wrote:
> On Wed, 5 Feb 2020 22:28:50 +1100
> JH <jupiter.hce@gmail.com> wrote:
>
>> Resolved. Using the kernel test was probably a bad idea; switching to the
>> mtd-utils nandbiterrs resolved the issue.
>
> I doubt it solved the real problem: ECC is not working properly.

You are right. I was working and posting in the middle of the night, and my
brain was not functioning well. Let me try again to clarify.

# nandbiterrs -i /dev/mtd2
incremental biterrors test
Successfully corrected 0 bit errors per subpage
Inserted biterror @ 1/7
Read reported 1 corrected bit errors
Successfully corrected 1 bit errors per subpage
Inserted biterror @ 3/7
Read reported 2 corrected bit errors
Successfully corrected 2 bit errors per subpage
Inserted biterror @ 5/7
Failed to recover 1 bitflips
Read error after 3 bit errors per page

It did fail after 3 bit errors per page. Could it be that the ECC
strength is not set up correctly?

I did not set up the ECC strength myself. How can I check it? I ran
nandbiterrs --help, but it did not show an option for checking the
ECC strength.

Also, how do I set the ECC strength?

Sorry for all the rudimentary questions.

>> > [  695.257984] mtd_nandbiterrs: Inserted biterror @ 0/0
>> > [  695.262984] mtd_nandbiterrs: rewrite page
>> > [  695.273646] mtd_nandbiterrs: read_page
>> > [  695.280000] mtd_nandbiterrs: Read reported 2 corrected bit errors
>
> The ECC engine should report an uncorrectable error here, not 2
> corrected bits. BTW, an ECC of 2bits/512bytes sounds weak for a 2k-page
> NAND. What's the NAND part you're testing with?

I am currently testing a unit that uses a W29N02GVSIAA. It will
change to a Samsung K9F2G08U0D-SCB0 in the future; I have no idea why
the hardware contractor uses two different parts in development and in
production.

Sorry to repeat my questions from above: how do I run nandbiterrs to
read the ECC strength? And how do I run nandbiterrs, or some other
command, to set it? I thought the default should be 4 bits. I have
never set it up here, so I have no idea why it is 2 bits.

Thank you so much.

Kind regards,

- jh


* Re: mtd_nandbiterrs errors
From: Steve deRosier @ 2020-02-06  2:03 UTC
  To: JH; +Cc: Boris Brezillon, linux-mtd

Hi JH,

On Wed, Feb 5, 2020 at 4:20 PM JH <jupiter.hce@gmail.com> wrote:
>
> I am currently testing a unit that uses a W29N02GVSIAA. It will
> change to a Samsung K9F2G08U0D-SCB0 in the future; I have no idea why
> the hardware contractor uses two different parts in development and in
> production.
>

Probably because the other part was cheaper. You can't let them sub
parts without testing and approval.

> Sorry to repeat my questions from above: how do I run nandbiterrs to
> read the ECC strength? And how do I run nandbiterrs, or some other
> command, to set it? I thought the default should be 4 bits. I have
> never set it up here, so I have no idea why it is 2 bits.
>

ECC is dependent on the device, and it can't be mixed-and-matched.
Every device has a datasheet that will tell you the minimum required.
You can (and usually should) go more than the minimum required, up to
however much you can fit in the OOB area. There are several ways to
check it. One way is to dump a programmed page via the u-boot `nand dump`
command from each partition and see how much of the OOB is taken up by
ECC bits. Personally, I'd do that even if I thought I knew what the
setting is supposed to be, to validate that the data was actually
written correctly. Depending on your system, you can find the
configured strength in your DTS, and also in the u-boot config for your
platform (boot loader and kernel need to agree on ECC settings).
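
On the Linux side, the MTD layer also exposes the ECC parameters it actually
uses through sysfs. A minimal sketch in Python, assuming the attributes
ecc_strength, ecc_step_size, writesize and oobsize are present under
/sys/class/mtd/mtdN on the running kernel (older kernels may lack some of
them); the function name and the mtd2 device are only for illustration:

```python
from pathlib import Path


def read_ecc_config(mtd_name: str, sysfs_root: str = "/sys/class/mtd") -> dict:
    """Read ECC-related sysfs attributes for one MTD device, e.g. "mtd2".

    Attributes that are missing or unreadable are returned as None.
    """
    base = Path(sysfs_root) / mtd_name
    result = {}
    for attr in ("ecc_strength", "ecc_step_size", "writesize", "oobsize"):
        try:
            result[attr] = int((base / attr).read_text().strip())
        except (OSError, ValueError):
            result[attr] = None
    return result


if __name__ == "__main__":
    # "mtd2" is the partition used with nandbiterrs earlier in this thread.
    print(read_ecc_config("mtd2"))
```

Comparing the reported ecc_strength and ecc_step_size with the DTS and the
u-boot configuration is one way to check that all three agree.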

You need to find the datasheets for your devices; they will tell you
what you need to know.
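
As a rough way to relate a datasheet's OOB size to the strongest ECC that
could fit, here is a sketch in Python. It assumes a BCH code that costs about
13 parity bits per correctable bit for each 512-byte step, a 64-byte OOB
area, and 2 bytes reserved for the bad block marker. None of these numbers
come from this thread, and real controllers lay out the OOB differently, so
treat the result as an estimate only:

```python
import math


def bch_parity_bytes(strength: int, gf_order: int = 13) -> int:
    """Parity bytes per ECC step, assuming gf_order bits per correctable bit."""
    return math.ceil(strength * gf_order / 8)


def max_strength(oob_bytes: int, steps: int,
                 reserved_bytes: int = 2, gf_order: int = 13) -> int:
    """Largest per-step strength whose parity fits in the remaining OOB."""
    bytes_per_step = (oob_bytes - reserved_bytes) // steps
    return (bytes_per_step * 8) // gf_order


# Assumed geometry: 2048-byte page split into four 512-byte steps, 64-byte OOB.
print(bch_parity_bytes(4))
print(max_strength(oob_bytes=64, steps=4))
```

With these assumed numbers, 4-bit ECC would need 7 parity bytes per step and
the budget would top out at around 9 bits per step.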

- Steve


* Re: mtd_nandbiterrs errors
From: JH @ 2020-02-06  6:42 UTC

Hi Steve,

Glad you are helping me here as well :-)

On 2/6/20, Steve deRosier <derosier@gmail.com> wrote:
> ECC is dependent on the device, and it can't be mixed-and-matched.
> Every device has a datasheet that will tell you the minimum required.
> You can (and usually should) go more than the minimum required, up to
> however much you can fit in the OOB area. There are several ways to
> check it. One way is to dump a programmed page via the u-boot `nand dump`
> command from each partition and see how much of the OOB is taken up by
> ECC bits. Personally, I'd do that even if I thought I knew what the
> setting is supposed to be, to validate that the data was actually
> written correctly. Depending on your system, you can find the
> configured strength in your DTS, and also in the u-boot config for your
> platform (boot loader and kernel need to agree on ECC settings).

I just got the dts file. It uses fsl,use-minimum-ecc; I think that dts was
copied from the original imx6ull EVK, and it looks like it is 4 bits.

Sorry for the silly question: which commands can I run in u-boot and in
Linux to verify the ECC strength setting on each side?

> You need to find the datasheets for your devices; they will tell you
> what you need to know.

The datasheet says "The system has to use a minimum 1-bit ECC per 528
bytes of data to ensure data recovery". For a 2KB page size, I guess
4 bits should be adequate, right? I need to find a way to run commands
in u-boot and Linux to find the ECC bits.
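
The arithmetic behind that guess can be sketched as follows, in Python. The
assumptions here are that the datasheet's 528 bytes means 512 data bytes plus
16 spare bytes, that the page holds 2048 data bytes, and that ECC strength is
counted per step rather than per page; the function names are hypothetical:

```python
def steps_per_page(page_bytes: int, step_bytes: int) -> int:
    """Number of ECC steps that make up one page."""
    return page_bytes // step_bytes


def meets_requirement(page_bytes: int,
                      conf_strength: int, conf_step: int,
                      req_strength: int, req_step: int) -> bool:
    """Check a configured ECC (bits per step) against a datasheet minimum.

    The configured ECC must correct at least as many bits per step and at
    least as many bits per page as the datasheet asks for.
    """
    conf_per_page = conf_strength * steps_per_page(page_bytes, conf_step)
    req_per_page = req_strength * steps_per_page(page_bytes, req_step)
    return conf_strength >= req_strength and conf_per_page >= req_per_page


# 2048-byte page, datasheet minimum of 1 bit per 512 data bytes (assumed).
print(steps_per_page(2048, 512))                 # steps in one page
print(meets_requirement(2048, 4, 512, 1, 512))   # 4 bits per 512-byte step
```

Under these assumptions, 4 bits per 512-byte step comfortably exceeds the
quoted minimum of 1 bit per step. The requirement applies to each of the four
steps in the page, not to the page as a whole.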

Thank you so much Steve,

Kind regards,

- jh
