From: Miquel Raynal <miquel.raynal@bootlin.com>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: linux-mtd <linux-mtd@lists.infradead.org>
Subject: Re: nand: WARNING: a0000000.nand: the ECC used on your system (1b/256B) is too weak compared to the one required by the NAND chip (4b/512B)
Date: Sat, 19 Jun 2021 20:40:35 +0200
Message-ID: <20210618225032.69cdc30c@xps13>
In-Reply-To: <d37a8a7e-6181-9642-18fb-470d1d8cf006@csgroup.eu>

Hi Christophe,

> >> Now and then I'm using one of the latest kernels (Today is 5.13-rc6), and sometime in one of the 5.x releases, I started to get errors like:
> >>
> >> [    5.098265] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.103859] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 60 bytes from PEB 99:59824, read only 60 bytes, retry
> >> [    5.525843] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.531571] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.537490] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 3073 bytes from PEB 107:108976, read only 3073 bytes, retry
> >> [    5.691121] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.696709] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.702426] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.708141] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [    5.714103] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 3035 bytes from PEB 107:25144, read only 3035 bytes, retry
> >> [   20.523689] random: crng init done
> >> [   21.892130] ecc_sw_hamming_correct: uncorrectable ECC error
> >> [   21.897730] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 1394 bytes from PEB 116:75776, read only 1394 bytes, retry
> >>
> >> Most of the time, when the reading of the file fails, I just have to read it once more and it gets read without that error.  
> > 
> > It really looks like a regular bitflip happening "sometimes". Is this a
> > board which already had a life? What are the usage counters (UBI should
> > tell you this) compared to the official endurance of your chip (see the
> > datasheet)?  
> 
> The board had a peaceful life:
> 
> UBI reports "ubi0: max/mean erase counter: 49/20, WL threshold: 4096"

Mmmh. Indeed.

> 
> I have tried with half a dozen of boards and all have the issue.
> 
> >   
> >> What am I supposed to do to avoid the ECC weakness warning at startup and to fix that ECC error issue ?  
> > 
> > I honestly don't think the errors come from the 5.1x kernels given the
> > above logs. If you flash back your old 4.14 I am pretty sure you'll
> > have the same errors at some point.  
> 
> I don't have any problem like that with 4.14 with any of the board.
> 
> When booting a 4.14 kernel I don't get any problem on the same board.
> 

If you can reliably show that when returning to a 4.14 kernel the ECC
weakness disappears, then there is certainly something new. What driver
are you using? Maybe you can do a bisection?

> > 
> > NAND really is a fragile storage medium, not following in a production
> > environment the minimum ECC scheme (there is a real difference between
> > 1/256 vs 4/512) really leads to complicated solutions like this one,
> > unfortunately...  
> 
> I see kernel has "Software BCH ECC". Should I use that with that chip ?
> 
> If yes, how do I use it ? Seems like selecting the option at Kernel build is not enough, do I have to configure something somewhere, for instance in the device tree ? At the time being I have the following in the device tree:

Enabling software BCH in the configuration will just build the support
into the kernel. You then need to follow the NAND controller bindings,
see the example in [1].
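
For illustration, a NAND chip node requesting software BCH along the
lines of the example in [1] might look like the sketch below. The node
names, the chip-select value in "reg" and the strength/step-size values
(taken from the 4b/512B requirement printed in the warning) are
assumptions, not taken from your board, and they only take effect if
the controller driver parses the generic NAND properties.

```dts
/* Hypothetical sketch: names and values are placeholders. */
nand-controller {
	#address-cells = <1>;
	#size-cells = <0>;

	/* controller specific properties */

	nand@0 {
		reg = <0>;                    /* chip select, assumed to be 0 */
		nand-use-soft-ecc-engine;     /* use the software ECC engine */
		nand-ecc-algo = "bch";        /* BCH instead of Hamming */
		nand-ecc-strength = <4>;      /* 4 correctable bits ... */
		nand-ecc-step-size = <512>;   /* ... per 512-byte step */
	};
};
```

Note that switching the ECC algorithm changes the ECC bytes stored in
the OOB area, so data written with the Hamming scheme would most likely
have to be re-flashed before it can be read back.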

However, given all the data you provided, I now think that there is
something weird happening in the driver you use, and it might be
relevant to try to understand what.

[1] Documentation/devicetree/bindings/mtd/nand-controller.yaml

Thanks,
Miquèl



