Some questions on bit-flips and JFFS2

From: Thorsten Mühlfelder @ 2010-05-03 13:05 UTC
  To: linux-mtd

Hi there,

I'm experiencing some problems with bit-flips on devices using NAND and JFFS2:
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xdc (Micron NAND 512MiB 3,3V 
8-bit)
Creating 2 MTD partitions on "NAND 512MiB 3,3V 8-bit":
0x00000000-0x00a00000 : "Bootloader Area"
0x00a00000-0x20000000 : "User Area"
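
To make the layout concrete, here is a small sketch that computes the partition sizes from the address ranges in the log. It assumes the ranges are half-open (start inclusive, end exclusive); the helper name partition_sizes is made up for this example.

```python
MIB = 1024 * 1024

# Address ranges copied from the boot log: (start, end, name)
PARTITIONS = [
    (0x00000000, 0x00A00000, "Bootloader Area"),
    (0x00A00000, 0x20000000, "User Area"),
]


def partition_sizes(partitions):
    """Return {name: size in MiB}, treating each range as [start, end)."""
    return {name: (end - start) // MIB for start, end, name in partitions}


for name, size in partition_sizes(PARTITIONS).items():
    print(f"{name}: {size} MiB")
# Bootloader Area: 10 MiB
# User Area: 502 MiB
```

So the kernel lives in a 10 MiB region, with the remaining 502 MiB of the chip used as the user area.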

In rare cases 1 or 2 bits in the bootloader area (which holds the kernel) flip,
so that the system won't boot anymore (kernel checksum error).
As the bootloader area is never mounted, I wonder whether this may be caused
by the read disturbs I've heard of.
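
As an illustration of why a single flipped bit is enough to stop the boot, here is a minimal sketch of an image checksum test. It assumes the bootloader compares a CRC32 recorded at build time against the image it reads back; which checksum my bootloader really uses is not stated here, and image_is_intact is a hypothetical helper.

```python
import zlib


def image_is_intact(image: bytes, expected_crc: int) -> bool:
    """Compare the CRC32 of the image with the value recorded at build time."""
    return zlib.crc32(image) == expected_crc


# Stand-in for a kernel image; a real one would be a few MiB.
image = bytes(range(256)) * 64
recorded_crc = zlib.crc32(image)

damaged = bytearray(image)
damaged[5000] ^= 0x10  # flip a single bit, as a read disturb might

print(image_is_intact(image, recorded_crc))           # True
print(image_is_intact(bytes(damaged), recorded_crc))  # False
```

If the raw image is read without ECC correction, one flipped bit anywhere in it fails the check, which would match the symptom described.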

I've found some statements from different people about it here on the ML:

> We use JFFS2. As known JFFS2 detects and corrects single bit-flips
> (per 256 byte subpage) but it doesn't physically correct them
> on the NAND device itself.

and:

> AFAIK, jffs2 doesn't handle correctly bit flip on read: it won't try to
> copy the data on another block while the data can still be recovered
> by ecc.

To me this means that the data is still read correctly thanks to ECC, but it
won't get moved to a new block when a bit-flip happens? And what happens if
this occurs on the kernel partition?
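
To spell out my understanding of "corrected on read but not on the chip", here is a toy sketch. It is NOT the Hamming layout the kernel's NAND code uses. It is a simplified single-error-correcting code over one 256-byte chunk, it assumes the stored ECC bytes themselves are intact, and compute_ecc and read_with_ecc are names invented for this example.

```python
CHUNK_SIZE = 256  # bytes per ECC step, as in the quoted statement


def compute_ecc(chunk: bytes):
    """Return (position_xor, parity) for one chunk.

    position_xor is the XOR of (bit index + 1) over all set bits;
    parity is the number of set bits modulo 2.
    """
    pos_xor = 0
    parity = 0
    for byte_idx, byte in enumerate(chunk):
        for bit in range(8):
            if (byte >> bit) & 1:
                pos_xor ^= byte_idx * 8 + bit + 1
                parity ^= 1
    return pos_xor, parity


def read_with_ecc(raw: bytes, stored_ecc):
    """Return (data, status); status is "clean", "corrected" or "uncorrectable".

    Only the returned copy is repaired. The raw bytes, standing in for the
    NAND contents, are left untouched.
    """
    pos_xor, parity = compute_ecc(raw)
    diff = pos_xor ^ stored_ecc[0]
    parity_diff = parity ^ stored_ecc[1]
    if diff == 0 and parity_diff == 0:
        return raw, "clean"
    if parity_diff == 1 and 1 <= diff <= len(raw) * 8:
        # Odd number of flips and a valid position: treat as a single flip.
        fixed = bytearray(raw)
        idx = diff - 1
        fixed[idx // 8] ^= 1 << (idx % 8)
        return bytes(fixed), "corrected"
    return raw, "uncorrectable"


data = bytes(range(CHUNK_SIZE))
ecc = compute_ecc(data)

on_flash = bytearray(data)
on_flash[10] ^= 0x04                      # first bit-flip
print(read_with_ecc(bytes(on_flash), ecc)[1])   # corrected

on_flash[200] ^= 0x80                     # second flip in the same chunk
print(read_with_ecc(bytes(on_flash), ecc)[1])   # uncorrectable
```

The reader gets good data after the first flip, but nothing rewrites on_flash, so a second flip in the same chunk can no longer be recovered. A "corrected" status would be the moment to copy the data to a fresh block.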

Furthermore:
> > How about detection of ECC errors in read only partitions?
>
> ECC should be done on both rw and read-only partitions. Sometimes NAND gets
> read disturbs which would impact on read-only partitions. Also, write
> disturbs from writes to one partition can still corrupt a read-only
> partition on the same chip.

So writing to my root partition may harm my kernel partition, too?

PS: I could not reproduce the bit-flip problem. It only happens in rare cases.
Furthermore, some of my devices use Samsung NAND instead of the Micron
NAND and have not shown any problems yet. So perhaps my problem is just some
bad NAND chips? But I still have to find a solution for the problem.
