* PXA310: double bit errors on NAND
@ 2010-11-08 14:54 Bjørn Forsman
  2010-11-08 16:54 ` Bjørn Forsman
  2010-11-08 22:18 ` [Openpxa-users] " pieterg
  0 siblings, 2 replies; 8+ messages in thread
From: Bjørn Forsman @ 2010-11-08 14:54 UTC (permalink / raw)
  To: openpxa-users, linux-mtd

Hi all,

I get "double bit error @ page" messages while working with NAND on a
Colibri PXA310 module. Basically, when I write an UBI image to NAND,
either with U-Boot "nand write" or "nandwrite" from mtd-utils,
everything is OK on the *first* mount. But on the second mount I get
several "double bit error @ page" messages and mount fails. This
always happens (unless I'm mounting read-only, in which case the problem is
gone). But the funny thing is that if I write an UBIFS image using
these commands:

ubiformat /dev/mtd7
ubiattach -m 7
ubimkvol /dev/ubi0 -N rootfs -m
ubiupdatevol /dev/ubi0_0 /rootfs.ubifs

everything is OK! I can mount/umount many times and no errors.

Any ideas why writing the UBI image directly allows only *one* mount,
while using the UBI tools above works perfectly on all later mounts? I'd
really like to be able to flash UBI images directly and not have to
boot Linux over NFS just to be able to flash my rootfs...

Some notes:

1) I found a thread at
http://sourceforge.net/mailarchive/forum.php?thread_name=201009181607.38352.marek.vasut%40gmail.com&forum_name=openpxa-users
which seems to be about the very same problem, but I was not on the
list back then so I couldn't reply to that thread.

2) A console session where I provoke the "2nd UBIFS mount" failure:
# flash_eraseall /dev/mtd7
# nandwrite -p /dev/mtd7 /rootfs.ubi
# ubiattach -m7
[ 2011.367544] UBI: attaching mtd7 to ubi0
[ 2011.378022] UBI: physical eraseblock size:   131072 bytes (128 KiB)
[ 2011.385102] UBI: logical eraseblock size:    126976 bytes
[ 2011.395127] UBI: smallest flash I/O unit:    2048
[ 2011.404498] UBI: VID header offset:          2048 (aligned 2048)
[ 2011.415154] UBI: data offset:                4096
[ 2011.826179] UBI: max. sequence number:       0
[ 2011.849846] UBI: attached mtd7 to ubi0
[ 2011.853591] UBI: MTD device name:            "rootfs 64 MiB"
[ 2011.860003] UBI: MTD device size:            64 MiB
[ 2011.868983] UBI: number of good PEBs:        512
[ 2011.874397] UBI: number of bad PEBs:         0
[ 2011.881566] UBI: max. allowed volumes:       128
[ 2011.886170] UBI: wear-leveling threshold:    4096
[ 2011.891614] UBI: number of internal volumes: 1
[ 2011.899102] UBI: number of user volumes:     1
[ 2011.906238] UBI: available PEBs:             142
[ 2011.911610] UBI: total number of reserved PEBs: 370
[ 2011.919086] UBI: number of PEBs reserved for bad PEB handling: 5
[ 2011.925844] UBI: max/mean erase counter: 0/0
[ 2011.932870] UBI: image sequence number:  0
[ 2011.937009] UBI: background thread "ubi_bgt0d" started, PID 1019
UBI device number 0, total 512 LEBs (65011712 bytes, 62.0 MiB),
available 142 LEBs (18030592 bytes, 17)
# mount -t ubifs /dev/ubi0_0 /mnt
[ 2174.116000] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[ 2174.122117] UBIFS: file system size:   44441600 bytes (43400 KiB, 42 MiB, 350 LEBs)
[ 2174.129771] UBIFS: journal size:       9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
[ 2174.137036] UBIFS: media format:       w4/r0 (latest is w4/r0)
[ 2174.142857] UBIFS: default compressor: lzo
[ 2174.146924] UBIFS: reserved for root:  0 bytes (0 KiB)
# ls /mnt
bin      etc      init     linuxrc  opt      root     sys      usr      www
dev      home     lib      mnt      proc     sbin     tmp      var
# umount /mnt
[ 2187.011934] UBIFS: un-mount UBI device 0, volume 0
# mount -t ubifs /dev/ubi0_0 /mnt
[ 2189.827801] double bit error @ page 0000d2c5
[ 2189.859454] double bit error @ page 0000d2c5
[ 2189.889241] UBIFS error (pid 1028): ubifs_recover_master_node: failed to recover master node
[ 2189.901888] double bit error @ page 0000d2c5
mount: mounting /dev/ubi0_0 on /mnt failed: Invalid argument
# [ 2189.942810] UBI: scrubbed PEB 3 (LEB 0:1), data moved to PEB 511

3) I tried to flash and mount a jffs2 image instead of UBI/UBIFS. It
produced *lots* of errors and mount eventually failed:
# flash_eraseall /dev/mtd7      # the '-j' flag produces errors, so I'm skipping it...
# nandwrite -p /dev/mtd7 /rootfs.jffs2
# mount -t jffs2 /dev/mtdblock7 /mnt
[ 5320.153576] jffs2_scan_eraseblock(): Node at 0x00000000 {0x1985,
0xe001, 0x0000002b) has invalid CRC 0x1804b18f (calculated 0x7d266ee6)
[ 5320.168986] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x00000004: 0x002b instead
[ 5320.180415] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x00000008: 0xb18f instead
[ 5320.189258] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x0000000c: 0x0001 instead
[ 5320.198878] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x00000014: 0x0002 instead
[ 5320.212126] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x00000018: 0xc63b instead
[ 5320.221757] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x0000001c: 0x0403 instead
[ 5320.233390] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x00000020: 0x148e instead
[ 5320.243017] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x00000024: 0x5aed instead
[ 5320.254612] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x00000028: 0x6962 instead
[ 5320.263601] Further such events for this erase block will not be printed
<...snip lots of errors...>
[ 5349.320793] Empty flash at 0x0156c5c0 ends at 0x0156c5c4
[ 5349.336938] Empty flash at 0x0156c5d0 ends at 0x0156c5d4
[ 5349.355108] Empty flash at 0x0156c5e0 ends at 0x0156c5e4
<..snip lots of errors...>
[ 5374.079867] jffs2_scan_eraseblock(): Node at 0x02800000 {0x1985,
0xe002, 0x00000415) has invalid CRC 0x896de46f (calculated 0xec4f3b06)
[ 5374.092709] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x02800004: 0x0415 instead
[ 5374.104261] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x02800008: 0xe46f instead
[ 5374.113883] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x0280000c: 0x134b instead
[ 5374.125444] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x02800010: 0x0001 instead
[ 5374.135058] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x02800014: 0x81a4 instead
[ 5374.146676] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x0280001c: 0x165d instead
[ 5374.156287] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x02800020: 0xa68f instead
[ 5374.166811] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x02800024: 0xa68f instead
[ 5374.178375] jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found
at 0x02800028: 0xa68f instead
[ 5374.187367] Further such events for this erase block will not be printed
[ 5374.237539] Old JFFS2 bitmask found at 0x0280b86c
[ 5374.242332] You cannot use older JFFS2 filesystems with newer kernels
[ 5374.753907] Cowardly refusing to erase blocks on filesystem with no
valid JFFS2 nodes
[ 5374.761797] empty_blocks 191, bad_blocks 0, c->nr_blocks 512
mount: mounting /dev/mtdblock7 on /mnt failed: Input/output error
#
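
As a sanity check on the attach log in note 2, the sizes it reports are
consistent with each other. A minimal sketch of the arithmetic, using only
numbers copied from that log (the variable names are illustrative):

```shell
# Values copied from the UBI attach log in note 2.
peb_size=131072     # physical eraseblock size, in bytes
data_offset=4096    # data offset inside a PEB (after the EC and VID headers)
total_lebs=512      # "total 512 LEBs"
avail_lebs=142      # "available 142 LEBs"

# A logical eraseblock is what remains of a PEB after the header area.
leb_size=$((peb_size - data_offset))
total_bytes=$((total_lebs * leb_size))
avail_bytes=$((avail_lebs * leb_size))

echo "leb_size=$leb_size"
echo "total_bytes=$total_bytes"
echo "avail_bytes=$avail_bytes"
```

The three results match the "logical eraseblock size", "65011712 bytes" and
"18030592 bytes" figures printed by ubiattach, so the geometry UBI detected
looks self-consistent.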

Best regards,
Bjørn Forsman

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: PXA310: double bit errors on NAND
  2010-11-08 14:54 PXA310: double bit errors on NAND Bjørn Forsman
@ 2010-11-08 16:54 ` Bjørn Forsman
  2010-11-08 22:18 ` [Openpxa-users] " pieterg
  1 sibling, 0 replies; 8+ messages in thread
From: Bjørn Forsman @ 2010-11-08 16:54 UTC (permalink / raw)
  To: openpxa-users, linux-mtd

2010/11/8 Bjørn Forsman <bjorn.forsman@gmail.com>:
> Hi all,
>
> I get "double bit error @ page" messages while working with NAND on a
> Colibri PXA310 module. Basically, when I write an UBI image to NAND,
> either with U-Boot "nand write" or "nandwrite" from mtd-utils,
> everything is OK on the *first* mount. But on the second mount I get
> several "double bit error @ page" messages and mount fails. This
> always happens (unless I'm mounting read-only, in which case the problem is
> gone). But the funny thing is that if I write an UBIFS image using
> these commands:
>
> ubiformat /dev/mtd7
> ubiattach -m 7
> ubimkvol /dev/ubi0 -N rootfs -m
> ubiupdatevol /dev/ubi0_0 /rootfs.ubifs
>
> everything is OK! I can mount/umount many times and no errors.

After reading the MTD "how to send a bug report" section [1] (which I should
have read earlier!) I enabled "UBI debugging messages" and "Extra
self-checks". The console output is now much more useful :-) AFAICS,
it shows that UBI fails to write data to PEB 3 and switches to
read-only mode. UBIFS then reports a write failure at LEB 1 and then the
kernel tries to dereference a NULL pointer:

# flash_eraseall /dev/mtd7 && nandwrite -p /dev/mtd7 /rootfs.ubi
# ubiattach -m 7
# mount -t ubifs /dev/ubi0_0 /mnt
[  664.964427] UBI warning: ubi_eba_write_leb: failed to write data to PEB 3
[  664.971275] UBI warning: ubi_ro_mode: switch to read-only mode
[  664.977087] UBIFS error (pid 1014): ubifs_write_node: cannot write 2048 bytes to LEB 1:2048, error 5
[  665.008605] kernel BUG at fs/super.c:948!
[  665.012616] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  665.020789] pgd = c6888000
[  665.023484] [00000000] *pgd=a7ba0031, *pte=00000000, *ppte=00000000
[  665.029782] Internal error: Oops: 817 [#1] PREEMPT
[  665.034537] last sysfs file: /sys/devices/virtual/ubi/ubi0/min_io_size
[  665.041017] Modules linked in:
[  665.044055] CPU: 0    Not tainted  (2.6.36-00018-g14edd22-dirty #265)
[  665.050467] PC is at __bug+0x24/0x30
[  665.054031] LR is at release_console_sem+0x1e8/0x200
[  665.058968] pc : [<c002b9e0>]    lr : [<c0040454>]    psr: 60000013
[  665.058981] sp : c6873ec0  ip : c6873df0  fp : c6873ecc
[  665.070370] r10: c686b000  r9 : c689cbe0  r8 : c686b000
[  665.075557] r7 : 00008000  r6 : 00000005  r5 : c03e5cd8  r4 : c7b1f000
[  665.082044] r3 : 00000000  r2 : 00000001  r1 : c6873e18  r0 : 00000033
[  665.088526] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  665.095614] Control: 0000397f  Table: a6888018  DAC: 00000015
[  665.101317] Process mount (pid: 1014, stack limit = 0xc6872278)
[  665.107201] Stack: (0xc6873ec0 to 0xc6874000)
[  665.111540] 3ec0: c6873efc c6873ed0 c00a4d38 c002b9c8 c7b1f000
c6873ee0 c00bae28 c03e5cd8
[  665.119679] 3ee0: c689cbe0 c686b000 c689cbc0 00008000 c6873f24
c6873f00 c00a4e28 c00a4cc0
[  665.127811] 3f00: c686b000 00000020 c689cbc0 00000000 00000000
00008000 c6873f6c c6873f28
[  665.135944] 3f20: c00be030 c00a4df8 00000000 c6873f6c 00000000
c6873f48 c7805a00 c7452c00
[  665.144077] 3f40: 20000013 c7b9b000 be8f9eb8 00008000 00000000
c0028108 c6872000 00000000
[  665.152211] 3f60: c6873fa4 c6873f70 c00be140 c00bd9d4 c686b000
3a699d01 386d4604 c686b000
[  665.160342] 3f80: c689cbc0 c689cbe0 000b9018 be8f9c70 4018d380
00000015 00000000 c6873fa8
[  665.168476] 3fa0: c0027f60 c00be0c0 000b9018 be8f9c70 be8f9eb8
be8f9ec4 be8f9eb2 00008000
[  665.176607] 3fc0: 000b9018 be8f9c70 4018d380 00000015 00000000
0000d0b0 00008000 00000000
[  665.184742] 3fe0: 40140da4 be8f9b80 0004aefc 40140db8 20000010
be8f9eb8 ffffffff ffffffff
[  665.192860] Backtrace:
[  665.195305] [<c002b9bc>] (__bug+0x0/0x30) from [<c00a4d38>]
(vfs_kern_mount+0x84/0x118)
[  665.203266] [<c00a4cb4>] (vfs_kern_mount+0x0/0x118) from
[<c00a4e28>] (do_kern_mount+0x3c/0xe0)
[  665.211902]  r8:00008000 r7:c689cbc0 r6:c686b000 r5:c689cbe0 r4:c03e5cd8
[  665.218619] [<c00a4dec>] (do_kern_mount+0x0/0xe0) from [<c00be030>]
(do_mount+0x668/0x6ec)
[  665.226822]  r8:00008000 r7:00000000 r6:00000000 r5:c689cbc0 r4:00000020
[  665.233335] r3:c686b000
[  665.235941] [<c00bd9c8>] (do_mount+0x0/0x6ec) from [<c00be140>]
(sys_mount+0x8c/0xcc)
[  665.243751] [<c00be0b4>] (sys_mount+0x0/0xcc) from [<c0027f60>]
(ret_fast_syscall+0x0/0x2c)
[  665.252039]  r7:00000015 r6:4018d380 r5:be8f9c70 r4:000b9018
[  665.257693] Code: e59f0010 e1a01003 eb0b42cf e3a03000 (e5833000)
[  665.264041] ---[ end trace f4c146898e7fbb35 ]---
Segmentation fault
#

If I try to mount one more time:

# mount -t ubifs /dev/ubi0_0 /mnt
[  726.910538] UBI DBG (pid 1018): ubi_open_volume_path: open volume
/dev/ubi0_0, mode 1
[  726.918530] UBI DBG (pid 1018): ubi_open_volume: open device 0,
volume 0, mode 1
[  726.925986] UBI DBG (pid 1018): ubi_open_volume: open device 0,
volume 0, mode 2
[  726.952020] UBIFS: read-only UBI device
[  726.955874] UBI DBG (pid 1018): ubi_is_mapped: test LEB 0:0
[  726.961568] UBIFS error (pid 1018): mount_ubifs: cannot mount
read-write - read-only media
[  726.988356] UBI DBG (pid 1018): ubi_close_volume: close device 0,
volume 0, mode 2
[  726.995913] UBI DBG (pid 1018): ubi_close_volume: close device 0,
volume 0, mode 1
[  727.007180] UBI DBG (pid 1018): ubi_open_volume_path: open volume
/dev/ubi0_0, mode 1
[  727.015195] UBI DBG (pid 1018): ubi_open_volume: open device 0,
volume 0, mode 1
[  727.022720] UBI DBG (pid 1018): ubi_open_volume: open device 0,
volume 0, mode 2
[  727.035420] UBIFS: read-only UBI device
[  727.039428] UBI DBG (pid 1018): ubi_is_mapped: test LEB 0:0
[  727.045149] UBI DBG (pid 1018): ubi_leb_read: read 4096 bytes from LEB 0:0:0
[  727.057599] UBI DBG (pid 1018): ubi_leb_read: read 126976 bytes
from LEB 0:1:0
[  727.089040] UBI DBG (pid 1018): ubi_io_read: fixable bit-flip
detected at PEB 3
[  727.096321] UBI DBG (pid 1018): ubi_wl_scrub_peb: schedule PEB 3
for scrubbing
[  727.104738] UBI DBG (pid 1018): ubi_leb_read: read 126976 bytes
from LEB 0:1:0
[  727.138638] UBI DBG (pid 1018): ubi_io_read: fixable bit-flip
detected at PEB 3
[  727.145919] UBI DBG (pid 1018): ubi_wl_scrub_peb: schedule PEB 3
for scrubbing
[  727.155845] UBI DBG (pid 1018): ubi_leb_read: read 126976 bytes
from LEB 0:2:0
[  727.192788] UBIFS: recovered master node from LEB 1
[  727.198194] UBI DBG (pid 1018): ubi_leb_read: read 11 bytes from LEB 0:8:1868
[  727.206483] UBI DBG (pid 1018): ubi_leb_read: read 12 bytes from LEB 0:8:1856
[  727.213789] UBI DBG (pid 1018): ubi_leb_read: read 12 bytes from LEB 0:8:1844
[  727.221136] UBI DBG (pid 1018): ubi_leb_read: read 12 bytes from LEB 0:8:1820
[  727.228335] UBI DBG (pid 1018): ubi_leb_read: read 12 bytes from LEB 0:8:1748
[  727.235442] UBI DBG (pid 1018): ubi_leb_read: read 17 bytes from LEB 0:8:1479
[  727.242813] UBI DBG (pid 1018): ubi_leb_read: read 126976 bytes
from LEB 0:3:0
[  727.285249] UBI DBG (pid 1018): ubi_leb_read: read 126976 bytes
from LEB 0:4:0
[  727.318372] UBI DBG (pid 1018): ubi_leb_read: read 17 bytes from LEB 0:8:1445
[  727.325575] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[  727.331666] UBIFS: mounted read-only
[  727.335222] UBIFS: file system size:   44441600 bytes (43400 KiB,
42 MiB, 350 LEBs)
[  727.342863] UBIFS: journal size:       9023488 bytes (8812 KiB, 8
MiB, 72 LEBs)
[  727.350154] UBIFS: media format:       w4/r0 (latest is w4/r0)
[  727.355945] UBIFS: default compressor: lzo
[  727.360043] UBIFS: reserved for root:  0 bytes (0 KiB)
[  727.365179] UBI DBG (pid 1018): ubi_leb_read: read 188 bytes from
LEB 0:360:66688
[  727.373887] UBI DBG (pid 1018): ubi_leb_read: read 188 bytes from
LEB 0:360:65256
[  727.386062] UBI DBG (pid 1018): ubi_leb_read: read 188 bytes from
LEB 0:360:53928
[  727.398812] UBI DBG (pid 1018): ubi_leb_read: read 188 bytes from
LEB 0:359:90216
[  727.408316] UBI DBG (pid 1018): ubi_leb_read: read 188 bytes from LEB 0:354:0
[  727.416968] UBI DBG (pid 1018): ubi_leb_read: read 160 bytes from
LEB 0:352:12248
[  727.439783] UBI DBG (pid 1018): ubi_close_volume: close device 0,
volume 0, mode 1
#

This time the mount succeeds, but only in read-only mode. Verification:

# mount | grep ubi
/dev/ubi0_0 on /mnt type ubifs (ro,relatime)


And this is how it looks when /dev/ubi0_0 is created with ubimkvol +
ubiupdatevol (mount succeeds, even in rw mode):

# mount -t ubifs /dev/ubi0_0 /mnt/
[ 1977.951135] UBI DBG (pid 1046): ubi_open_volume_path: open volume
/dev/ubi0_0, mode 1
[ 1977.959138] UBI DBG (pid 1046): ubi_open_volume: open device 0,
volume 0, mode 1
[ 1977.966629] UBI DBG (pid 1046): ubi_open_volume: open device 0,
volume 0, mode 2
[ 1977.980317] UBI DBG (pid 1046): ubi_is_mapped: test LEB 0:0
[ 1977.986369] UBI DBG (pid 1046): ubi_leb_read: read 4096 bytes from LEB 0:0:0
[ 1977.996481] UBI DBG (pid 1046): ubi_leb_change: atomically write
4096 bytes to LEB 0:0
[ 1978.166771] UBI DBG (pid 1046): ubi_leb_read: read 126976 bytes
from LEB 0:1:0
[ 1978.276756] UBI DBG (pid 1046): ubi_leb_read: read 126976 bytes
from LEB 0:2:0
[ 1978.386391] UBI DBG (pid 1046): ubi_leb_write: write 2048 bytes to
LEB 0:1:2048
[ 1978.458923] UBI DBG (pid 1046): ubi_leb_write: write 2048 bytes to
LEB 0:2:2048
[ 1978.520369] UBI DBG (pid 1046): ubi_leb_read: read 11 bytes from LEB 0:8:1868
[ 1978.529013] UBI DBG (pid 1046): ubi_leb_unmap: unmap LEB 0:9
[ 1978.534805] UBI DBG (pid 1046): ubi_leb_read: read 12 bytes from LEB 0:8:1856
[ 1978.541955] UBI DBG (pid 1046): ubi_leb_read: read 12 bytes from LEB 0:8:1844
[ 1978.549146] UBI DBG (pid 1046): ubi_leb_read: read 12 bytes from LEB 0:8:1820
[ 1978.556332] UBI DBG (pid 1046): ubi_leb_read: read 12 bytes from LEB 0:8:1748
[ 1978.563441] UBI DBG (pid 1046): ubi_leb_read: read 17 bytes from LEB 0:8:1479
[ 1978.570830] UBI DBG (pid 1046): ubi_leb_read: read 126976 bytes
from LEB 0:3:0
[ 1978.639280] UBI DBG (pid 1046): ubi_leb_read: read 126976 bytes
from LEB 0:4:0
[ 1978.649533] UBI DBG (pid 1046): ubi_leb_unmap: unmap LEB 0:10
[ 1978.655423] UBI DBG (pid 1046): ubi_leb_read: read 17 bytes from LEB 0:8:1445
[ 1978.662589] UBI DBG (pid 1046): ubi_leb_unmap: unmap LEB 0:353
[ 1978.668471] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[ 1978.674487] UBIFS: file system size:   62472192 bytes (61008 KiB,
59 MiB, 492 LEBs)
[ 1978.682095] UBIFS: journal size:       9023488 bytes (8812 KiB, 8
MiB, 72 LEBs)
[ 1978.689388] UBIFS: media format:       w4/r0 (latest is w4/r0)
[ 1978.695208] UBIFS: default compressor: lzo
[ 1978.699279] UBIFS: reserved for root:  0 bytes (0 KiB)
[ 1978.704447] UBI DBG (pid 1046): ubi_leb_read: read 188 bytes from
LEB 0:360:66688
[ 1978.715500] UBI DBG (pid 1046): ubi_leb_read: read 188 bytes from
LEB 0:360:65256
[ 1978.727822] UBI DBG (pid 1046): ubi_leb_read: read 188 bytes from
LEB 0:360:53928
[ 1978.739949] UBI DBG (pid 1046): ubi_leb_read: read 188 bytes from
LEB 0:359:90216
[ 1978.752414] UBI DBG (pid 1046): ubi_leb_read: read 188 bytes from LEB 0:354:0
[ 1978.764848] UBI DBG (pid 1046): ubi_leb_read: read 160 bytes from
LEB 0:352:12248
[ 1978.774388] UBI DBG (pid 1046): ubi_close_volume: close device 0,
volume 0, mode 1
# mount | grep ubi
/dev/ubi0_0 on /mnt type ubifs (rw,relatime)
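
The two mounts also report different file system sizes. A small sketch of the
arithmetic, assuming the logical eraseblock size of 126976 bytes from the
attach log in my first message (the variable names are illustrative):

```shell
leb_size=126976       # logical eraseblock size, from the UBI attach log
lebs_image=350        # LEBs reported when the UBI image was written with nandwrite
lebs_ubimkvol=492     # LEBs reported when the volume was made with "ubimkvol -m"

size_image=$((lebs_image * leb_size))
size_ubimkvol=$((lebs_ubimkvol * leb_size))
lebs_diff=$((lebs_ubimkvol - lebs_image))

echo "size_image=$size_image"
echo "size_ubimkvol=$size_ubimkvol"
echo "lebs_diff=$lebs_diff"
```

Both byte counts agree with the "file system size" lines above. The
difference of 142 LEBs equals the "available PEBs: 142" figure in the earlier
attach log, which suggests (but does not prove) that the nandwrite-flashed
volume simply leaves that space unallocated, while "ubimkvol -m" claims it.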

I'll get back when I find out more. But of course, any help is appreciated :-)

Best regards,
Bjørn Forsman

[1]: http://www.linux-mtd.infradead.org/doc/ubifs.html#L_how_send_bugreport


* Re: [Openpxa-users] PXA310: double bit errors on NAND
  2010-11-08 14:54 PXA310: double bit errors on NAND Bjørn Forsman
  2010-11-08 16:54 ` Bjørn Forsman
@ 2010-11-08 22:18 ` pieterg
  2010-11-09  9:15   ` Bjørn Forsman
  1 sibling, 1 reply; 8+ messages in thread
From: pieterg @ 2010-11-08 22:18 UTC (permalink / raw)
  To: openpxa-users; +Cc: Bjørn Forsman, linux-mtd

On Monday, November 08, 2010 15:54:43 Bjørn Forsman wrote:
> I get "double bit error @ page" messages while working with NAND on a
> Colibri PXA310 module. Basically, when I write an UBI image to NAND,
> either with U-Boot "nand write" or "nandwrite" from mtd-utils,
> everything is OK on the *first* mount. But on the second mount I get
> several "double bit error @ page" messages and mount fails.

You are not erase-block padding your image, when you write it?
That's the mistake I made (with jffs2 in my case), in that openpxa-users
thread you mention.
As soon as I found out my image was eraseblock padded, and fixed that,
the number of single/double bit errors was reduced dramatically, 
but some errors remained.

I haven't yet had the time to investigate this further. My preliminary
conclusion is that the Samsung K9K's, though advertised as SLC,
are of very poor quality, and they probably need a better ECC than
Hamming (which is what the pxa nand controller implements).
Also, jffs2 seems to care less than ubi(fs) when double bit errors
do happen (ubifs often refused to even mount when bit errors occurred;
jffs2 so far always continued with what remained).
That's the reason why I'm using jffs2 for the time being: a non-booting
device is worse than a device with a corrupted file.
(From a theoretical point of view, however, both are equally bad of course;
this issue does need to be resolved.)

Rgds, Pieter


* Re: [Openpxa-users] PXA310: double bit errors on NAND
  2010-11-08 22:18 ` [Openpxa-users] " pieterg
@ 2010-11-09  9:15   ` Bjørn Forsman
  2010-11-09  9:58     ` pieterg
  0 siblings, 1 reply; 8+ messages in thread
From: Bjørn Forsman @ 2010-11-09  9:15 UTC (permalink / raw)
  To: pieterg; +Cc: linux-mtd, openpxa-users

Hi Pieter,

Thanks for your reply. It's nice to know someone is out there :-)

2010/11/8 pieterg <pieterg@gmx.com>:
> On Monday, November 08, 2010 15:54:43 Bjørn Forsman wrote:
>> I get "double bit error @ page" messages while working with NAND on a
>> Colibri PXA310 module. Basically, when I write an UBI image to NAND,
>> either with U-Boot "nand write" or "nandwrite" from mtd-utils,
>> everything is OK on the *first* mount. But on the second mount I get
>> several "double bit error @ page" messages and mount fails.
>
> You are not erase-block padding your image, when you write it?
> That's the mistake I made (with jffs2 in my case), in that openpxa-users
> thread you mention.
> As soon as I found out my image was eraseblock padded, and fixed that,
> the number of single/double bit errors was reduced dramatically,
> but some errors remained.

I've tried nandwrite with or without the --pad option. Is that the
padding you are referring to? Or is it the --blockalign option? How
exactly did you flash your rootfs?

> I haven't yet had the time to investigate this further. My preliminary
> conclusion is that the Samsung K9K's, though advertised as SLC,
> are of very poor quality, and they probably need a better ECC than
> Hamming (which is what the pxa nand controller implements).
> Also, jffs2 seems to care less than ubi(fs) when double bit errors
> do happen (ubifs often refused to even mount when bit errors occurred;
> jffs2 so far always continued with what remained).
> That's the reason why I'm using jffs2 for the time being: a non-booting
> device is worse than a device with a corrupted file.
> (From a theoretical point of view, however, both are equally bad of course;
> this issue does need to be resolved.)

Thanks for the info. Note that I've had *zero* trouble with ubifs when
it's flashed with ubiformat and ubiupdatevol. I did some more googling
and found some other threads about the "ubiformat + ubiupdatevol works
but not direct UBI image flashing" issue. I'll take a closer look.

Best regards,
Bjørn Forsman


* Re: [Openpxa-users] PXA310: double bit errors on NAND
  2010-11-09  9:15   ` Bjørn Forsman
@ 2010-11-09  9:58     ` pieterg
  2011-01-15 17:01       ` Bjørn Forsman
  0 siblings, 1 reply; 8+ messages in thread
From: pieterg @ 2010-11-09  9:58 UTC (permalink / raw)
  To: Bjørn Forsman; +Cc: linux-mtd, openpxa-users

On Tuesday, November 09, 2010 10:15:40 Bjørn Forsman wrote:
> I've tried nandwrite with or without the --pad option. Is that the
> padding you are referring to? Or is it the --blockalign option? How
> exactly did you flash your rootfs?

nandwrite -p pads to pagesize, which is ok.
But I was using a --pad=0x800 (=pagesize) option for mkfs.jffs2 instead, 
which somehow resulted in eraseblock padding.
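
To make the difference between the two kinds of padding concrete, here is a
rough sketch. The page size (0x800 = 2048 bytes) is the one mentioned above
and the 128 KiB eraseblock size is from the UBI attach log earlier in the
thread; the image size and the round_up helper are hypothetical, chosen only
for illustration:

```shell
# Round a size up to the next multiple of an alignment (both in bytes).
round_up() {
    size=$1
    align=$2
    echo $(( (size + align - 1) / align * align ))
}

page_size=2048        # 0x800
block_size=131072     # 128 KiB eraseblock
image_size=5000000    # hypothetical image size, for illustration only

page_padded=$(round_up "$image_size" "$page_size")
block_padded=$(round_up "$image_size" "$block_size")

echo "page padded:       $page_padded"
echo "eraseblock padded: $block_padded"
```

With these numbers, page padding adds 1216 bytes while eraseblock padding adds
more than 100 KiB of filler, so the two are far from equivalent on flash.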

> Thanks for the info. Note that I've had *zero* trouble with ubifs when
> it's flashed with ubiformat and ubiupdatevol. I did some more googling
> and found some other threads about the "ubiformat + ubiupdatevol works
> but not direct UBI image flashing" issue. I'll take a closer look.

Interesting. In that case I might give that a try as well, as jffs2 is still
far from perfect, even with correct pagesize padding.

I've added a single bit debug message in the nand driver by the way,
which shows a lot of lucky escapes, before the first double bit error
eventually occurs. Which seems inevitable, as the driver takes
no action (such as refreshing a page) when single bit errors occur.

Rgds, Pieter


* Re: [Openpxa-users] PXA310: double bit errors on NAND
  2010-11-09  9:58     ` pieterg
@ 2011-01-15 17:01       ` Bjørn Forsman
  2011-01-16 21:53         ` Ricard Wanderlof
  0 siblings, 1 reply; 8+ messages in thread
From: Bjørn Forsman @ 2011-01-15 17:01 UTC (permalink / raw)
  To: pieterg; +Cc: linux-mtd, openpxa-users

Hi,

Seems I forgot to send this mail. Better late than never?

2010/11/9 pieterg <pieterg@gmx.com>:
> On Tuesday, November 09, 2010 10:15:40 Bjørn Forsman wrote:
>> I've tried nandwrite with or without the --pad option. Is that the
>> padding you are referring to? Or is it the --blockalign option? How
>> exactly did you flash your rootfs?
>
> nandwrite -p pads to pagesize, which is ok.
> But I was using a --pad=0x800 (=pagesize) option for mkfs.jffs2 instead,
> which somehow resulted in eraseblock padding.
>
>> Thanks for the info. Note that I've had *zero* trouble with ubifs when
>> it's flashed with ubiformat and ubiupdatevol. I did some more googling
>> and found some other threads about the "ubiformat + ubiupdatevol works
>> but not direct UBI image flashing" issue. I'll take a closer look.
>
> Interesting. In that case I might give that a try as well, as jffs2 is still
> far from perfect, even with correct pagesize padding.
>
> I've added a single bit debug message in the nand driver by the way,
> which shows a lot of lucky escapes, before the first double bit error
> eventually occurs. Which seems inevitable, as the driver takes
> no action (such as refreshing a page) when single bit errors occur.

Turns out that the errors I got when using jffs2 were caused by
a bug in mkfs.jffs2. I was testing jffs2 with an image generated
by Buildroot, and Buildroot was using a mkfs.jffs2 affected by this bug
until mid-October (I hadn't updated my tree in
a while). mkfs.jffs2 on my build host (Ubuntu 10.04) is not affected.

So now jffs2 is working great:

flash_eraseall /dev/mtd7
nandwrite -p /dev/mtd7 /rootfs.jffs2
mount -t jffs2 /dev/mtdblock7 /mnt

UBIFS with ubiupdatevol is working great:

ubiformat /dev/mtd8
ubiattach -m 8
ubimkvol /dev/ubi0 -N rootfs -m
ubiupdatevol /dev/ubi0_0 /rootfs.ubifs

However, UBI images are broken after mounting them once in
read-write mode if they were flashed with either 'nandwrite' or
U-Boot 'nand write'. Mounting read-only always works.

I'd like to know how UBI images are supposed to be flashed.
By searching the internet I've seen some people say that it is
impossible to write UBI images with nandwrite, while others say
it *is* possible. Looking at
http://www.linux-mtd.infradead.org/doc/ubi.html#L_flasher_algo
it appears that flashing UBI images requires a UBI-aware flasher?
But isn't the whole point of having ubinize that you can take UBIFS
images and wrap them into something that non-UBIFS-aware flashers can handle?
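
For context, a ubinize-generated image is typically built along these lines.
This is only a sketch: the file names are hypothetical, the volume name
"rootfs" and the geometry (2048-byte pages, 128 KiB eraseblocks) are taken
from the logs earlier in the thread, and the ubinize call itself is left as a
comment because it needs mtd-utils on the build host.

```shell
# Write a ubinize configuration describing a single UBIFS volume.
# rootfs.ubifs and ubinize.cfg are hypothetical file names.
cat > ubinize.cfg <<'EOF'
[rootfs]
mode=ubi
image=rootfs.ubifs
vol_id=0
vol_type=dynamic
vol_name=rootfs
EOF

# Geometry from the UBI attach log:
#   -m 2048    minimum I/O unit (page size)
#   -p 128KiB  physical eraseblock size
# On the build host one would then run something like:
#   ubinize -o rootfs.ubi -m 2048 -p 128KiB ubinize.cfg
echo "wrote ubinize.cfg"
```

The resulting rootfs.ubi already contains the UBI headers for each
eraseblock, which is why it looks like it should be writable with a plain
NAND writer.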

What am I missing?

Best regards,
Bjørn Forsman


* Re: [Openpxa-users] PXA310: double bit errors on NAND
  2011-01-15 17:01       ` Bjørn Forsman
@ 2011-01-16 21:53         ` Ricard Wanderlof
  2011-01-17 11:27           ` Bjørn Forsman
  0 siblings, 1 reply; 8+ messages in thread
From: Ricard Wanderlof @ 2011-01-16 21:53 UTC (permalink / raw)
  To: Bjørn Forsman; +Cc: linux-mtd, pieterg, openpxa-users


On Sat, 15 Jan 2011, Bjørn Forsman wrote:

> I'd like to know how UBI images are supposed to be flashed.
> By searching the internet I've seen people saying that it is
> impossible to write UBI images with nandwrite and others say
> it *is* possible. By looking at
> http://www.linux-mtd.infradead.org/doc/ubi.html#L_flasher_algo
> it may appear that flashing UBI images requires an UBI aware flasher?
> But isn't the whole point of having ubinize so that you can take UBIFS
> images and wrap them into something that non UBIFS aware flashers can handle?

I think there are two separate issues here:

1. When reflashing a flash with a new image you'll want to preserve the 
UBI erase counters, hence a UBI-aware flasher (e.g. ubiformat) is 
required. You can use nandwrite, but then the erase counter information is 
lost.

2. When flashing a virgin flash there are no erase counters to be 
preserved, so nandwrite will suffice. However, there appears to be an old 
'bug' in nandwrite whereby all-ff pages in the source image are 
nevertheless written to the flash. All-ff pages are assumed erased by UBI, 
and so are never re-erased before subsequent usage. The problem is that 
some flash chips do not take kindly to being rewritten multiple times 
without an intermediate erase. So it's really an old bug in nandwrite, but 
it might not have manifested itself until heavy testing with UBI had been 
performed. There's more information on this in the FAQ I believe.
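
One way to sidestep the nandwrite problem, when Linux is already running on
the target, is to flash the ubinize-generated image with ubiformat's -f option
instead. A sketch, assuming the device and image names used earlier in the
thread; flash_ubi_image and DRY_RUN are hypothetical helpers added here so the
command can be inspected before anything is erased:

```shell
# Flash a ubinize-generated image with the UBI-aware flasher.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
flash_ubi_image() {
    mtd_dev=$1
    image=$2
    cmd="ubiformat $mtd_dev -f $image"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

flash_ubi_image /dev/mtd7 /rootfs.ubi
```

As far as I know ubiformat also avoids writing the trailing all-ff pages of
each eraseblock, which is what the flasher algorithm in the UBI documentation
asks for, but that is worth verifying against the mtd-utils version in use.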

/Ricard
-- 
Ricard Wolf Wanderlöf                           ricardw(at)axis.com
Axis Communications AB, Lund, Sweden            www.axis.com
Phone +46 46 272 2016                           Fax +46 46 13 61 30


* Re: [Openpxa-users] PXA310: double bit errors on NAND
  2011-01-16 21:53         ` Ricard Wanderlof
@ 2011-01-17 11:27           ` Bjørn Forsman
  0 siblings, 0 replies; 8+ messages in thread
From: Bjørn Forsman @ 2011-01-17 11:27 UTC (permalink / raw)
  To: Ricard Wanderlof; +Cc: linux-mtd, pieterg, openpxa-users

2011/1/16 Ricard Wanderlof <ricard.wanderlof@axis.com>:
>
> On Sat, 15 Jan 2011, Bjørn Forsman wrote:
>
>> I'd like to know how UBI images are supposed to be flashed.
>> By searching the internet I've seen people saying that it is
>> impossible to write UBI images with nandwrite and others say
>> it *is* possible. By looking at
>> http://www.linux-mtd.infradead.org/doc/ubi.html#L_flasher_algo
>> it may appear that flashing UBI images requires an UBI aware flasher?
>> But isn't the whole point of having ubinize so that you can take UBIFS
>> images and wrap them into something that non UBIFS aware flashers can
>> handle?
>
> I think there are two separate issues here:
>
> 1. When reflashing a flash with a new image you'll want to preserve the UBI
> erase counters, hence a UBI-aware flasher (e.g. ubiformat) is required. You
> can use nandwrite, but then the erase counter information is lost.

Yes, this is for production so I don't care that much about erase counters.

> 2. When flashing a virgin flash there are no erase counters to be preserved,
> so nandwrite will suffice. However, there appears to be an old 'bug' in
> nandwrite whereby all-ff pages in the source image are nevertheless written
> to the flash. All-ff pages are assumed erased by UBI, and so are never
> re-erased before subsequent usage. The problem is that some flash chips do
> not take kindly to being rewritten multiple times without an intermediate
> erase. So it's really an old bug in nandwrite, but it might not have
> manifested itself until heavy testing with UBI had been performed. There's
> more information on this in the FAQ I believe.

Interesting. Thanks!

It would be nice if someone could take a look at this 'bug' in nandwrite :-)

Best regards,
Bjørn Forsman
