* UBIFS errors are randomly seen after reboots
@ 2017-01-16  5:28 chaitanya vinnakota
  2017-01-16  7:52 ` Richard Weinberger
  0 siblings, 1 reply; 12+ messages in thread
From: chaitanya vinnakota @ 2017-01-16  5:28 UTC (permalink / raw)
  To: linux-mtd

Hi All,

We are observing UBIFS and ECC errors randomly, even after safe
reboots. We are using a 256 MB Spansion NAND flash, and the kernel
version is 3.2.54. The errors disappear on the next reboot, then
reappear at random.

In our test scenario, we have a background process that continuously
writes data to all five available NAND partitions, creating files of
random sizes in each partition. Of these partitions, the rootfs is
UBIFS-based and the rest are yaffs2-based. While the file writes are
in progress, a reboot is issued. We expect the file system to sync
properly before the reboot, but at times we observe UBIFS errors on
the subsequent boot.

We ensured that the reboots are safe via the umount rc script, which
is invoked when a reboot is issued. Below is the content of the umount
rc script:

#!/bin/sh /etc/rc.common
# Copyright (C) 2006 OpenWrt.org

STOP=99
stop() {
        echo "Filesystem sync initiated" > /media/USB/USB1/bootlog
        sync
        echo "Filesystem sync completed" >> /media/USB/USB1/bootlog
        umount -a -d -r
}
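As a sanity check that writeback has actually drained before the reboot proceeds, the stop() hook could be extended along these lines. This is only a sketch, assuming /proc/meminfo is available; note also that `umount -a` cannot unmount the busy rootfs itself, so remounting it read-only is the usual last step for UBIFS:

```shell
#!/bin/sh
# Sketch: after sync, report how much dirty data is still pending
# writeback; a non-zero value right before reboot is a red flag.
sync
dirty_kb=$(awk '/^Dirty:/ {print $2}' /proc/meminfo)
echo "Dirty pages after sync: ${dirty_kb} kB"
# The rootfs cannot be unmounted while in use; remounting it
# read-only makes UBIFS commit and stop writing (not executed here):
#   mount -o remount,ro /
```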
When the UBIFS error is reported, mounting the root filesystem fails,
resulting in a kernel panic. Strangely, the very next system bootup is
normal and successful.

Below is the UBIFS error log during bootup

[    6.185055] UBI: default fastmap pool size: 35
[    6.189518] UBI: default fastmap WL pool size: 25
[    6.194238] UBI: attaching mtd4 to ubi0
[    9.453670] UBI: scanning is finished
[    9.508323] UBI: attached mtd4 (name "rootfs1", size 92 MiB) to ubi0
[    9.514704] UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[    9.521533] UBI: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
[    9.528267] UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
[    9.535172] UBI: good PEBs: 736, bad PEBs: 0, corrupted PEBs: 0
[    9.541112] UBI: user volume: 1, internal volumes: 1, max. volumes count: 128
[    9.548284] UBI: max/mean erase counter: 21/4, WL threshold: 4096,
image sequence number: 1902060199
[    9.557454] UBI: available PEBs: 0, total reserved PEBs: 736, PEBs
reserved for bad PEB handling: 50
[    9.566652] UBI: background thread "ubi_bgt0d" started, PID 512
[    9.573961] c2k-rtc c2k-rtc: setting system clock to 2012-07-13
12:40:34 UTC (1342183234)
[    9.582193] Registering CPUFreq(comcerto)
[    9.586981] md: Skipping autodetection of RAID arrays.
(raid=autodetect will force)
[   10.428162] UBIFS error (pid 1): ubifs_check_node: bad CRC:
calculated 0xa86e925a, read 0xd9a150bb
[   10.437178] UBIFS error (pid 1): ubifs_check_node: bad node at LEB 526:12240
[   10.444253] UBIFS error (pid 1): ubifs_scanned_corruption:
corruption at LEB 526:12240
[   10.457304] UBIFS error (pid 1): ubifs_scan: LEB 526 scanning failed
[   10.504836] VFS: Cannot open root device "ubi0:rootfs" or unknown-block(0,0)
[   10.511914] Please append a correct "root=" boot option; here are
the available partitions:
[   10.520315] 1f00             512 mtdblock0  (driver?)
[   10.525411] 1f01             512 mtdblock1  (driver?)
[   10.530495] 1f02             128 mtdblock2  (driver?)
[   10.535586] 1f03            8192 mtdblock3  (driver?)
[   10.540669] 1f04           94208 mtdblock4  (driver?)
[   10.545766] 1f05             128 mtdblock5  (driver?)
[   10.550852] 1f06            8192 mtdblock6  (driver?)
[   10.555945] 1f07           94208 mtdblock7  (driver?)
[   10.561027] 1f08             128 mtdblock8  (driver?)
[   10.566117] 1f09            2048 mtdblock9  (driver?)
[   10.571202] 1f0a           12288 mtdblock10  (driver?)
[   10.576379] 1f0b           32768 mtdblock11  (driver?)
[   10.581551] 1f0c            2048 mtdblock12  (driver?)
[   10.586730] 1f0d             128 mtdblock13  (driver?)
[   10.591900] 1f0e             512 mtdblock14  (driver?)
[   10.597079] 1f0f             128 mtdblock15  (driver?)
[   10.602248] 1f10             128 mtdblock16  (driver?)
[   10.607426] 1f11              64 mtdblock17  (driver?)
[   10.612598] 1f12              64 mtdblock18  (driver?)
[   10.617777] 1f13              64 mtdblock19  (driver?)
[   10.622948] 1f14           84320 mtdblock20  (driver?)
[   10.628126] Kernel panic - not syncing: VFS: Unable to mount root
fs on unknown-block(0,0)
[   10.636423] Backtrace:
[   10.638920] [<c40113a0>] (dump_backtrace+0x0/0x110) from
[<c43fff28>] (dump_stack+0x18/0x1c)
[   10.647393]  r6:c457feb0 r5:e616a000 r4:c45c74f0 r3:c45a2dbc
[   10.653128] [<c43fff10>] (dump_stack+0x0/0x1c) from [<c43fff88>]
(panic+0x5c/0x18c)
[   10.660830] [<c43fff2c>] (panic+0x0/0x18c) from [<c455fd04>]
(mount_block_root+0x23c/0x28c)
[   10.669215]  r3:00000004 r2:00000000 r1:e6035f78 r0:c44ea033
[   10.674948]  r7:c4e25f0d
[   10.677510] [<c455fac8>] (mount_block_root+0x0/0x28c) from
[<c455fe50>] (prepare_namespace+0x98/0x17c)
[   10.686860] [<c455fdb8>] (prepare_namespace+0x0/0x17c) from
[<c455f930>] (kernel_init+0x178/0x1b8)
[   10.695855]  r5:c4587514 r4:c4587514
[   10.699477] [<c455f7b8>] (kernel_init+0x0/0x1b8) from [<c4036bd4>]
(do_exit+0x0/0x6f8)
[   10.707431]  r5:c455f7b8 r4:00000000
[   10.711047] CPU1: stopping
[   10.713763] Backtrace:
[   10.716243] [<c40113a0>] (dump_backtrace+0x0/0x110) from
[<c43fff28>] (dump_stack+0x18/0x1c)
[   10.724707]  r6:fff00100 r5:c45c6ac4 r4:00000001 r3:c45a2dbc
[   10.730442] [<c43fff10>] (dump_stack+0x0/0x1c) from [<c4012e44>]
(handle_IPI+0x100/0x17c)
[   10.738652] [<c4012d44>] (handle_IPI+0x0/0x17c) from [<c4008344>]
(do_IPI+0x10/0x14)
[   10.746420]  r5:60000013 r4:c400e72c
[   10.750029] [<c4008334>] (do_IPI+0x0/0x14) from [<c400d738>]
(__irq_svc+0x38/0x90)
[   10.757624] Exception stack(0xe6067f68 to 0xe6067fb0)
[   10.762696] 7f60:                   00000001 00000000 e6067fb0
00000000 e6066000 c44058a8
[   10.770904] 7f80: c45c6984 c45c6ab4 0400406a 412fc091 00000000
e6067fbc e6067fc0 e6067fb0
[   10.779109] 7fa0: c400e728 c400e72c 60000013 ffffffff
[   10.784185] [<c400e700>] (default_idle+0x0/0x30) from [<c400e90c>]
(cpu_idle+0x84/0xc8)
[   10.792232] [<c400e888>] (cpu_idle+0x0/0xc8) from [<c43fcc0c>]
(secondary_start_kernel+0x140/0x158)
[   10.801306]  r6:10c03c7d r5:c45a8b14 r4:00000002 r3:c45988ac
[   10.807041] [<c43fcacc>] (secondary_start_kernel+0x0/0x158) from
[<043fc4b4>] (0x43fc4b4)
[   10.815244]  r5:00000015 r4:2606806a
[   10.818860] Rebooting in 10 seconds..
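(As an aside, the UBI geometry in this log is internally consistent: with a 2048-byte sub-page, the EC header sits at offset 0, the VID header at 2048 and data at 4096, so the LEB size is the PEB size minus the data offset. Quick shell arithmetic:)

```shell
#!/bin/sh
# The log reports: PEB 131072 bytes, data offset 4096, LEB 126976.
PEB=131072
DATA_OFFSET=4096
echo "computed LEB size: $((PEB - DATA_OFFSET))"   # -> 126976
```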

Can you please help us identify the root cause of this random behavior?

Is the issue with UBIFS or with the NAND driver?

How can we ensure that the file system sync and reboot are successful?

Thanks
Chaitanya

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS errors are randomly seen after reboots
  2017-01-16  5:28 UBIFS errors are randomly seen after reboots chaitanya vinnakota
@ 2017-01-16  7:52 ` Richard Weinberger
  2017-01-16 15:21   ` chaitanya vinnakota
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Weinberger @ 2017-01-16  7:52 UTC (permalink / raw)
  To: chaitanya vinnakota; +Cc: linux-mtd

Chaitanya,

On Mon, Jan 16, 2017 at 6:28 AM, chaitanya vinnakota
<chaitanya.sai.v@gmail.com> wrote:
> Hi All,
>
> We are observing  UBIFS and ECC errors randomly even after safe
> reboots . We are using a 256M  spansion nand flash and the kernel
> version is 3.2.54. These errors disappear upon next reboots and appear
> again randomly.

Your log does not show ECC errors.

> [   10.428162] UBIFS error (pid 1): ubifs_check_node: bad CRC:
> calculated 0xa86e925a, read 0xd9a150bb

Hmm, a bad CRC without ECC error? Odd.

> Is the issue with UBIFS or with the nand driver ?

I'd check the driver first. Does it pass mtd and ubi tests?

-- 
Thanks,
//richard


* Re: UBIFS errors are randomly seen after reboots
  2017-01-16  7:52 ` Richard Weinberger
@ 2017-01-16 15:21   ` chaitanya vinnakota
  2017-01-16 16:30     ` Richard Weinberger
  0 siblings, 1 reply; 12+ messages in thread
From: chaitanya vinnakota @ 2017-01-16 15:21 UTC (permalink / raw)
  To: linux-mtd; +Cc: richard.weinberger

Hi Richard,

Thanks for your response.

Based on your response, I enabled the chk_io, chk_gen,
tst_emulate_bitflips and tst_emulate_power_failure options in the
/sys/kernel/debug/ubi/ubi0 directory. I carried on with my test, which
keeps creating/writing data to files in all the available NAND
partitions, during which a reboot is issued.

Below are the various UBI messages I'm seeing quite consistently at
different times after enabling the debug options.

1. UBI error: do_sync_erase: cannot erase PEB 200 (emulated)
2. UBI error: erase_worker: failed to erase PEB 200, error -5
3. UBI: mark PEB 200 as bad
4. UBI error: ubi_io_write: cannot write 2048 bytes to PEB 452:0 (emulated)
5. UBI error: erase_worker: failed to erase PEB 452, error -5
     UBI: mark PEB 452 as bad
     Bad block table written to 0x00000ffe0000, version 0x0B
     Bad block table written to 0x00000ffc0000, version 0x0B
     UBI: 40 PEBs left in the reserve

Are all the above errors forecasting doom, or are they expected?


Below is the complete NAND flash and UBI log since boot. Sorry
for the huge log:

[    5.112214] nand_probe: comcertonand base: 0xe8800000
[    5.117330] ONFI flash detected
[    5.120771] ONFI param page 0 valid
[    5.124278] NAND device: Manufacturer ID: 0x01, Chip ID: 0xda (AMD S34ML02G2)
[    5.131475] hw_syndrome correction 2048.
[    5.135429] Using default values for hw ecc
[    5.141895] Bad block table found at page 131008, version 0x07
[    5.149973] Bad block table found at page 130944, version 0x07
[    5.158054] nand_read_bbt: bad block at 0x000002220000
[    5.163210] nand_read_bbt: bad block at 0x0000026a0000
[    5.168376] nand_read_bbt: bad block at 0x0000026c0000
[    5.173535] nand_read_bbt: bad block at 0x000003b60000
[    5.178699] nand_read_bbt: bad block at 0x000004cc0000
[    5.183856] nand_read_bbt: bad block at 0x000005960000
[    5.191340] 13 cmdlinepart partitions found on MTD device comcertonand
[    5.197903] Creating 13 MTD partitions on "comcertonand":
[    5.203324] 0x000000000000-0x000000080000 : "barebox"
[    5.209529] 0x000000080000-0x000000100000 : "bareboxfact"
[    5.216029] 0x000000100000-0x000000120000 : "env"
[    5.221767] 0x000000120000-0x000000920000 : "kernel1"
[    5.227889] 0x000000920000-0x000006520000 : "rootfs1"
[    5.234019] 0x000006520000-0x000006540000 : "reserved_dtb1"
[    5.240697] 0x000006540000-0x000006d40000 : "kernel2"
[    5.246855] 0x000006d40000-0x00000c940000 : "rootfs2"
[    5.253048] 0x00000c940000-0x00000c960000 : "reserved_dtb2"
[    5.259726] 0x00000c960000-0x00000cb60000 : "configcert"
[    5.266117] 0x00000cb60000-0x00000d760000 : "avcsign"
[    5.272211] 0x00000d760000-0x00000f760000 : "webrootdb"
[    5.278564] 0x00000f760000-0x00000f960000 : "license"
[    5.285177] designware_spi_probe:Initializing SPI Controller :
Using dma=1 CLK(DUS)=250000000 Hz
[    5.294003] dw_spi_dma_init:343: initializing spi dma....done
[    5.299991] spi_dma_init:105: dma_ineted set to 1.
[    5.305283] jedec 66069 ext_jedec 19712
[    5.309231] m25p80 spi1.0: found s25sl032a, expected m25p80
[    5.314876] m25p80 spi1.0: s25sl032a (4096 Kbytes)
[    5.319691] 7 cmdlinepart partitions found on MTD device spi1.0
[    5.325672] Creating 7 MTD partitions on "spi1.0":
[    5.330484] 0x000000000000-0x000000020000 : "uloader"
[    5.336801] 0x000000020000-0x0000000a0000 : "barebox"
[    5.342923] 0x0000000a0000-0x0000000c0000 : "env"
[    5.348726] 0x0000000c0000-0x0000000e0000 : "boardinfo"
[    5.355096] 0x0000000e0000-0x0000000f0000 : "md5sum1"
[    5.361227] 0x0000000f0000-0x000000100000 : "md5sum2"
[    5.367402] 0x000000100000-0x000000110000 : "boot_check"
[    5.373787] comcerto_spi comcerto_spi.1: at 0x96500000 mapped to
0xE8814000, irq=60
[    5.381539] designware_spi_probe:Initializing SPI Controller :
Using dma=0 CLK(spi_i2c)=250000000 Hz
[    5.391342] comcerto_spi comcerto_spi.0: at 0x90498000 mapped to
0xE8816000, irq=61
-----------------------------------------------
[    9.481547] UBI: scanning is finished
[    9.536221] UBI: attached mtd4 (name "rootfs1", size 92 MiB) to ubi0
[    9.542602] UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[    9.549452] UBI: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
[    9.556190] UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
[    9.563089] UBI: good PEBs: 730, bad PEBs: 6, corrupted PEBs: 0
[    9.569039] UBI: user volume: 1, internal volumes: 1, max. volumes count: 128
[    9.576209] UBI: max/mean erase counter: 11/1, WL threshold: 4096,
image sequence number: 938621452
[    9.585298] UBI: available PEBs: 0, total reserved PEBs: 730, PEBs
reserved for bad PEB handling: 44
[    9.594502] UBI: background thread "ubi_bgt0d" started, PID 527
[    9.601627] c2k-rtc c2k-rtc: setting system clock to 2012-07-16
17:49:19 UTC (1342460959)
[    9.609854] Registering CPUFreq(comcerto)
[    9.614671] md: Skipping autodetection of RAID arrays.
(raid=autodetect will force)
[   10.109537] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[   10.115593] UBIFS: file system size:   84946944 bytes (82956 KiB,
81 MiB, 669 LEBs)
[   10.123277] UBIFS: journal size:       9023488 bytes (8812 KiB, 8
MiB, 72 LEBs)
[   10.130622] UBIFS: media format:       w4/r0 (latest is w4/r0)
[   10.136483] UBIFS: default compressor: zlib
[   10.140681] UBIFS: reserved for root:  0 bytes (0 KiB)
[   10.161459] VFS: Mounted root (ubifs filesystem) on device 0:13.
--------------------------------------------------------------------------
[   34.491872] UBI: schedule PEB 720 for scrubbing
[   35.496551] UBI: scrubbed PEB 720 (LEB 0:178), data moved to PEB 685
[   38.338551] UBI: schedule PEB 144 for scrubbing
[   39.861853] UBI: scrubbed PEB 144 (LEB 0:142), data moved to PEB 112
[   40.762302] UBI: schedule PEB 170 for scrubbing
[   41.785315] UBI: scrubbed PEB 170 (LEB 0:168), data moved to PEB 522
[   44.344547] UBI: schedule PEB 603 for scrubbing
[   45.340803] UBI: scrubbed PEB 603 (LEB 0:374), data moved to PEB 596
[   47.956012] UBI: schedule PEB 652 for scrubbing
[   49.248152] UBI: scrubbed PEB 652 (LEB 0:312), data moved to PEB 53
[   49.612931] UBI: schedule PEB 15 for scrubbing
[   50.567397] UBI: scrubbed PEB 15 (LEB 0:141), data moved to PEB 725
[   63.262926] UBI: schedule PEB 245 for scrubbing
[   64.358367] UBI: scrubbed PEB 245 (LEB 0:450), data moved to PEB 709
[   65.383584] UBI: schedule PEB 27 for scrubbing
[   66.675284] UBI: scrubbed PEB 27 (LEB 0:177), data moved to PEB 523
[   67.086551] UBI: schedule PEB 644 for scrubbing
[   67.950969] yaffs: dev is 8388609 name is "sda1" rw
[   67.956027] yaffs: passed flags ""
[   67.959551] yaffs: mtd is read only, setting superblock read only
[   68.362497] UBI: scrubbed PEB 644 (LEB 0:123), data moved to PEB 33
[   68.391761] yaffs: dev is 8388609 name is "sda1" rw
[   68.396803] yaffs: passed flags ""
[   68.400271] yaffs: mtd is read only, setting superblock read only
[   69.250548] UBI: schedule PEB 33 for scrubbing
[   70.138746] UBI: scrubbed PEB 33 (LEB 0:123), data moved to PEB 25
[   72.689685] UBI: schedule PEB 685 for scrubbing
[   73.946288] UBI: scrubbed PEB 685 (LEB 0:178), data moved to PEB 726
[   78.505748] UBI: schedule PEB 602 for scrubbing
[   79.495422] UBI: scrubbed PEB 602 (LEB 0:126), data moved to PEB 550
[   80.653549] UBI: schedule PEB 39 for scrubbing
[   81.798931] UBI: scrubbed PEB 39 (LEB 0:122), data moved to PEB 229
[   88.502851] UBI: schedule PEB 256 for scrubbing
[   89.405373] UBI: scrubbed PEB 256 (LEB 0:447), data moved to PEB 675
[   92.521612] UBI: schedule PEB 38 for scrubbing
[   93.448157] UBI: scrubbed PEB 38 (LEB 0:118), data moved to PEB 451
[   94.272328] UBI: schedule PEB 451 for scrubbing
[   95.526283] UBI: scrubbed PEB 451 (LEB 0:118), data moved to PEB 222
[   99.864548] UBI: schedule PEB 547 for scrubbing
[  101.467978] UBI: scrubbed PEB 547 (LEB 0:259), data moved to PEB 260
[  103.768054] UBI: schedule PEB 244 for scrubbing
[  108.655102] UBI: schedule PEB 250 for scrubbing
[  109.955079] UBI: scrubbed PEB 244 (LEB 0:242), data moved to PEB 685
[  111.398264] UBI: scrubbed PEB 250 (LEB 0:208), data moved to PEB 26
[  115.776754] UBI: schedule PEB 646 for scrubbing
[  116.692774] UBI: scrubbed PEB 646 (LEB 0:214), data moved to PEB 59
[  116.867968] UBI error: do_sync_erase: cannot erase PEB 646 (emulated)
[  116.874490] UBI error: erase_worker: failed to erase PEB 646, error -5
[  116.881067] UBI: mark PEB 646 as bad
------------------------------------------------------------------
[  284.244846] UBI: schedule PEB 205 for scrubbing
[  285.176521] UBI: scrubbed PEB 205 (LEB 0:450), data moved to PEB 525
[  288.379804] UBI: schedule PEB 171 for scrubbing
[  289.894932] UBI: scrubbed PEB 171 (LEB 0:169), data moved to PEB 678
[  291.096216] UBI: schedule PEB 608 for scrubbing
[  292.153477] UBI: scrubbed PEB 608 (LEB 0:268), data moved to PEB 107
[  292.344435] UBI: schedule PEB 705 for scrubbing
[  293.311608] UBI: schedule PEB 625 for scrubbing
[  293.739674] UBI: scrubbed PEB 705 (LEB 0:324), data moved to PEB 526
[  294.549757] UBI: schedule PEB 194 for scrubbing
[  295.201757] UBI: scrubbed PEB 625 (LEB 0:173), data moved to PEB 300
[  295.343045] UBI: schedule PEB 300 for scrubbing
[  296.597891] UBI: scrubbed PEB 194 (LEB 0:449), data moved to PEB 636
[  298.533679] UBI: scrubbed PEB 300 (LEB 0:173), data moved to PEB 591
[  310.821339] UBI: schedule PEB 540 for scrubbing
[  311.342791] UBI error: ubi_io_write: cannot write 2048 bytes to PEB
452:0 (emulated)
[  311.350613] Backtrace:
[  311.353199] [<c40113a0>] (dump_backtrace+0x0/0x110) from
[<c43fff28>] (dump_stack+0x18/0x1c)
[  311.361720]  r6:000001c4 r5:fffffffb r4:e6196000 r3:c45a2dbc
[  311.367632] [<c43fff10>] (dump_stack+0x0/0x1c) from [<c426854c>]
(ubi_io_write+0x2d0/0x58c)
[  311.376098] [<c426827c>] (ubi_io_write+0x0/0x58c) from [<c4268a58>]
(ubi_io_write_ec_hdr+0x148/0x160)
[  311.385432] [<c4268910>] (ubi_io_write_ec_hdr+0x0/0x160) from
[<c42698d4>] (erase_worker+0x270/0x670)
[  311.394759]  r6:e4ce4000 r5:e6196000 r4:00000001
[  311.399562] [<c4269664>] (erase_worker+0x0/0x670) from [<c4269d2c>]
(do_sync_erase+0x58/0x68)
[  311.408172] [<c4269cd4>] (do_sync_erase+0x0/0x68) from [<c426a35c>]
(wear_leveling_worker+0x5d0/0x76c)
[  311.417537]  r7:e61ded28 r6:00000000 r5:e6196000 r4:e4ce4000
[  311.423370] [<c4269d8c>] (wear_leveling_worker+0x0/0x76c) from
[<c4269568>] (do_work+0xd4/0x110)
[  311.432250] [<c4269494>] (do_work+0x0/0x110) from [<c426b2c0>]
(ubi_thread+0x110/0x1a0)
[  311.440330]  r7:00000000 r6:00000000 r5:e61f2000 r4:e6196000
[  311.446252] [<c426b1b0>] (ubi_thread+0x0/0x1a0) from [<c405234c>]
(kthread+0x90/0x98)
[  311.454167] [<c40522bc>] (kthread+0x0/0x98) from [<c4036bd4>]
(do_exit+0x0/0x6f8)
[  311.461745]  r6:c4036bd4 r5:c40522bc r4:e6035ec4
[  311.466600] UBI error: erase_worker: failed to erase PEB 452, error -5
[  311.473199] UBI: mark PEB 452 as bad
[  311.501575] Bad block table written to 0x00000ffe0000, version 0x0B
[  311.528197] Bad block table written to 0x00000ffc0000, version 0x0B
[  311.534519] UBI: 40 PEBs left in the reserve
[  313.355032] UBI: schedule PEB 195 for scrubbing
[  314.580035] UBI: scrubbed PEB 540 (LEB 0:385), data moved to PEB 281
[  316.154272] UBI: scrubbed PEB 195 (LEB 0:111), data moved to PEB 262


Thanks
Chaitanya



On Mon, Jan 16, 2017 at 1:22 PM, Richard Weinberger
<richard.weinberger@gmail.com> wrote:
> Chaitanya,
>
> On Mon, Jan 16, 2017 at 6:28 AM, chaitanya vinnakota
> <chaitanya.sai.v@gmail.com> wrote:
>> Hi All,
>>
>> We are observing  UBIFS and ECC errors randomly even after safe
>> reboots . We are using a 256M  spansion nand flash and the kernel
>> version is 3.2.54. These errors disappear upon next reboots and appear
>> again randomly.
>
> Your log does not show ECC errors.
>
>> [   10.428162] UBIFS error (pid 1): ubifs_check_node: bad CRC:
>> calculated 0xa86e925a, read 0xd9a150bb
>
> Hmm, a bad CRC without ECC error? Odd.
>
>> Is the issue with UBIFS or with the nand driver ?
>
> I'd check the driver first. Does it pass mtd and ubi tests?
>
> --
> Thanks,
> //richard


* Re: UBIFS errors are randomly seen after reboots
  2017-01-16 15:21   ` chaitanya vinnakota
@ 2017-01-16 16:30     ` Richard Weinberger
  2017-01-17 11:47       ` chaitanya vinnakota
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Weinberger @ 2017-01-16 16:30 UTC (permalink / raw)
  To: chaitanya vinnakota, linux-mtd; +Cc: David Oberhollenzer

Chaitanya,

Am 16.01.2017 um 16:21 schrieb chaitanya vinnakota:
> Hi Richard,
> 
> Thanks for your response.
> 
> Based on your response I enabled chk_io , chk_gen,
> tst_emulate_bitflips and tst_emulate_power_failure  options in
> /sys/kernel/debug/ubi/ubi0 directory. I carried on with my test which
> keeps on creating/writing data to files in all the available nand
> partitions and during which a reboot is issued .
> 
>  Below are the various UBI messages I'm seeing quite consistently at
> different times after enabling the debug options.
> 
> 1. UBI error: do_sync_erase: cannot erase PEB 200 (emulated)
> 2. UBI error: erase_worker: failed to erase PEB 200, error -5
> 3. UBI: mark PEB 200 as bad
> 4. UBI error: ubi_io_write: cannot write 2048 bytes to PEB 452:0 (emulated)
> 5. UBI error: erase_worker: failed to erase PEB 452, error -5
>      UBI: mark PEB 452 as bad
>      Bad block table written to 0x00000ffe0000, version 0x0B
>      Bad block table written to 0x00000ffc0000, version 0x0B
>      UBI: 40 PEBs left in the reserve
> 
> Are all the above errors forecasting any doom or are they expected ?

You've enabled UBI self-tests and error *emulation*. This is not what I meant.
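For completeness: those debugfs knobs inject faults on purpose, so they should read 0 during normal testing. A dry-run sketch of turning them back off (knob names taken from your report; it only prints the writes, so it is safe to run anywhere):

```shell
#!/bin/sh
# Dry run: print the writes that disable UBI fault emulation.
for knob in tst_emulate_bitflips tst_emulate_power_failure; do
    echo "echo 0 > /sys/kernel/debug/ubi/ubi0/$knob"
done
```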

I had in mind:
1. MTD tests, these are available in the kernel source:
   http://linux-mtd.infradead.org/doc/general.html#L_mtd_tests
   And now also as part of mtd-utils.
2. UBI tests, you can find them in the tests/ubi-tests/ directory
   of the mtd-utils source package.
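A rough order in which the in-kernel MTD test modules could be exercised is sketched below as a dry run that only prints the commands. The module names come from drivers/mtd/tests/ in the kernel source (availability varies by kernel version), and dev=6 is a hypothetical scratch partition whose contents may be destroyed:

```shell
#!/bin/sh
# Dry run: print the load/unload sequence for the MTD test modules.
DEV=6   # hypothetical scratch mtd partition -- adjust before use
for mod in mtd_readtest mtd_speedtest mtd_stresstest mtd_nandbiterrs; do
    echo "modprobe $mod dev=$DEV && dmesg | tail -n 20 && modprobe -r $mod"
done
```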

David, I think we should add a page to mtd-www on testing. :-)

Thanks,
//richard


* Re: UBIFS errors are randomly seen after reboots
  2017-01-16 16:30     ` Richard Weinberger
@ 2017-01-17 11:47       ` chaitanya vinnakota
  2017-01-17 12:33         ` Richard Weinberger
  0 siblings, 1 reply; 12+ messages in thread
From: chaitanya vinnakota @ 2017-01-17 11:47 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

Hi Richard,

It seems the ubi-tests are failing to run on my board. I'm getting the
output below when I initiate the test.

# ./runtests.sh /dev/ubi1

Running mkvol_basic /dev/ubi1
[mkvol_basic] __initial_check():74: insufficient available eraseblocks
0 on UBI device, required 5

# ./runtests.sh /dev/ubi1_0
Running mkvol_basic /dev/ubi1_0
[mkvol_basic] __initial_check():69: function ubi_get_dev_info() failed
with error 19 (No such device)

Which device should I pass as the argument: ubi1, or ubi1_0
[the volume used to mount rootfs]?

Added ubinfo of /dev/ubi1_0 and /dev/ubi1 for further reference

root@router6654A1:~/arm-openwrt-linux# ubinfo /dev/ubi1_0
Volume ID:   0 (on ubi1)
Type:        dynamic
Alignment:   1
Size:        680 LEBs (86343680 bytes, 82.3 MiB)
State:       OK
Name:        rootfs
Character device major/minor: 248:1

root@router6654A1:~/arm-openwrt-linux# ubinfo /dev/ubi1
ubi1
Volumes count:                           1
Logical eraseblock size:                 126976 bytes, 124.0 KiB
Total amount of logical eraseblocks:     736 (93454336 bytes, 89.1 MiB)
Amount of available logical eraseblocks: 0 (0 bytes)
Maximum count of volumes                 128
Count of bad physical eraseblocks:       0
Count of reserved physical eraseblocks:  50
Current maximum erase counter value:     1
Minimum input/output unit size:          2048 bytes
Character device major/minor:            248:0
Present volumes:                         0

Can you please point me to any article/tutorial on understanding the
output of the ubi-tests?

Thanks
Chaitanya

On Mon, Jan 16, 2017 at 10:00 PM, Richard Weinberger <richard@nod.at> wrote:
> Chaitanya,
>
> Am 16.01.2017 um 16:21 schrieb chaitanya vinnakota:
>> Hi Richard,
>>
>> Thanks for your response.
>>
>> Based on your response I enabled chk_io , chk_gen,
>> tst_emulate_bitflips and tst_emulate_power_failure  options in
>> /sys/kernel/debug/ubi/ubi0 directory. I carried on with my test which
>> keeps on creating/writing data to files in all the available nand
>> partitions and during which a reboot is issued .
>>
>>  Below are the various UBI messages I'm seeing quite consistently at
>> different times after enabling the debug options.
>>
>> 1. UBI error: do_sync_erase: cannot erase PEB 200 (emulated)
>> 2. UBI error: erase_worker: failed to erase PEB 200, error -5
>> 3. UBI: mark PEB 200 as bad
>> 4. UBI error: ubi_io_write: cannot write 2048 bytes to PEB 452:0 (emulated)
>> 5. UBI error: erase_worker: failed to erase PEB 452, error -5
>>      UBI: mark PEB 452 as bad
>>      Bad block table written to 0x00000ffe0000, version 0x0B
>>      Bad block table written to 0x00000ffc0000, version 0x0B
>>      UBI: 40 PEBs left in the reserve
>>
>> Are all the above errors forecasting any doom or are they expected ?
>
> You've enabled UBI self-tests and error *emulation*. This is not what I meant.
>
> I had in mind:
> 1. MTD tests, these are available in the kernel source:
>    http://linux-mtd.infradead.org/doc/general.html#L_mtd_tests
>    And now also as part of mtd-utils.
> 2. UBI tests, you can find them in the tests/ubi-tests/ directory
>    of the mtd-utils source package.
>
> David, I think we should add a page to mtd-www on testing. :-)
>
> Thanks,
> //richard


* Re: UBIFS errors are randomly seen after reboots
  2017-01-17 11:47       ` chaitanya vinnakota
@ 2017-01-17 12:33         ` Richard Weinberger
       [not found]           ` <CAMBC=j6t2h_yB3wQbNUexrJto+O3ruTXo992JWFLH293OyH0Bg@mail.gmail.com>
  2017-01-17 14:46           ` chaitanya vinnakota
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Weinberger @ 2017-01-17 12:33 UTC (permalink / raw)
  To: chaitanya vinnakota; +Cc: linux-mtd

Chaitanya,


Am 17.01.2017 um 12:47 schrieb chaitanya vinnakota:
> Hi Richard,
> 
> Seems that the ubi-tests are failing work on my board. I'm getting the
> below output when I initiated the test.
> 
> # ./runtests.sh /dev/ubi1
> 
> Running mkvol_basic /dev/ubi1
> [mkvol_basic] __initial_check():74: insufficient available eraseblocks
> 0 on UBI device, required 5
> 
> # ./runtests.sh /dev/ubi1_0
> Running mkvol_basic /dev/ubi1_0
> [mkvol_basic] __initial_check():69: function ubi_get_dev_info() failed
> with error 19 (No such device)
> 
> Which device should I pass as the argument , is that ubi1 or ubi1_0
> [the partition to mount rootfs ] ?
> 
> Added ubinfo of /dev/ubi1_0 and /dev/ubi1 for further reference

ubi1 needs to be empty for testing.
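Concretely, something like the following (a dry-run sketch that only prints the steps; volume 0 is the only volume per the ubinfo output above, and removing it destroys the rootfs copy on ubi1):

```shell
#!/bin/sh
# Dry run: print the steps that leave ubi1 with no volumes.
cat <<'EOF'
ubirmvol /dev/ubi1 -n 0    # delete volume 0 ("rootfs")
ubinfo /dev/ubi1           # available LEBs should now be non-zero
./runtests.sh /dev/ubi1    # tests create and delete their own volumes
EOF
```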

Thanks,
//richard


* Re: UBIFS errors are randomly seen after reboots
       [not found]           ` <CAMBC=j6t2h_yB3wQbNUexrJto+O3ruTXo992JWFLH293OyH0Bg@mail.gmail.com>
@ 2017-01-17 14:39             ` Richard Weinberger
  2017-01-18 11:35               ` chaitanya vinnakota
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Weinberger @ 2017-01-17 14:39 UTC (permalink / raw)
  To: chaitanya vinnakota; +Cc: linux-mtd

Chaitanya,

Am 17.01.2017 um 15:32 schrieb chaitanya vinnakota:
> Now I'm seeing an altogether different form of errors.
> 
> ./runtests.sh /dev/ubi1
> Running mkvol_basic /dev/ubi1
> Running mkvol_bad /dev/ubi1

_bad is supposed to return errors.

> [77361.468670] UBI error: verify_mkvol_req: bad volume creation request
> [77361.475116] Volume creation request dump:
> [77361.479139]  vol_id    -2
> [77361.481768]  alignment 1
> [77361.484322]  bytes     86343680
> [77361.487473]  vol_type  3
> [77361.490013]  name_len  22
> [77361.492641]  1st 16 characters of name: mkvol_bad:test_m
> [77361.498235] UBI error: verify_mkvol_req: bad volume creation request
> [77361.504652] Volume creation request dump:
> [77361.508676]  vol_id    128
> [77361.511390]  alignment 1
> [77361.513930]  bytes     86343680
> [77361.517095]  vol_type  3
> [77361.519636]  name_len  22
> [77361.522265]  1st 16 characters of name: mkvol_bad:test_m
> [77361.527634] UBI error: verify_mkvol_req: bad volume creation request
> [77361.534009] Volume creation request dump:
> [77361.538064]  vol_id    0
> [77361.540606]  alignment 0
> [77361.543145]  bytes     86343680
> [77361.546316]  vol_type  3
> [77361.548857]  name_len  22
> [77361.551485]  1st 16 characters of name: mkvol_bad:test_m
> [77361.558437] UBI error: verify_mkvol_req: bad volume creation request
> [77361.564862] Volume creation request dump:
> [77361.568902]  vol_id    0
> [77361.571457]  alignment -1
> [77361.574101]  bytes     86343680
> [77361.577313]  vol_type  3
> [77361.579866]  name_len  22
> [77361.582504]  1st 16 characters of name: mkvol_bad:test_m
> [77361.587943] UBI error: verify_mkvol_req: bad volume creation request
> [77361.594355] Volume creation request dump:
> [77361.598394]  vol_id    0
> [77361.600945]  alignment 126977
> [77361.603936]  bytes     86343680
> [77361.607132]  vol_type  3
> [77361.609688]  name_len  22
> [77361.612329]  1st 16 characters of name: mkvol_bad:test_m
> [77361.617765] UBI error: verify_mkvol_req: bad volume creation request
> [77361.624199] Volume creation request dump:
> [77361.628241]  vol_id    0
> [77361.630800]  alignment 2049
> [77361.633617]  bytes     86343680
> [77361.636814]  vol_type  3
> [77361.639368]  name_len  22
> [77361.642010]  1st 16 characters of name: mkvol_bad:test_m
> [77361.647478] UBI error: verify_mkvol_req: bad volume creation request
> [77361.653868] Volume creation request dump:
> [77361.657940]  vol_id    0
> [77361.660510]  alignment 1
> [77361.663065]  bytes     -1
> [77361.665739]  vol_type  3
> [77361.668290]  name_len  22
> [77361.670930]  1st 16 characters of name: mkvol_bad:test_m
> [77361.676360] UBI error: verify_mkvol_req: bad volume creation request
> [77361.682748] Volume creation request dump:
> [77361.686826]  vol_id    0
> [77361.689383]  alignment 1
> [77361.691936]  bytes     0
> [77361.694513]  vol_type  3
> [77361.697068]  name_len  22
> [77361.699708]  1st 16 characters of name: mkvol_bad:test_m
> [77361.705119] UBI error: ubi_create_volume: not enough PEBs, only 680 available
> [77361.712293] UBI error: ubi_create_volume: cannot create volume 0, error -28
> [77361.719415] UBI error: ubi_create_volume: not enough PEBs, only 680 available
> [77361.726655] UBI error: ubi_create_volume: cannot create volume 0, error -28
> [77361.733804] UBI error: verify_mkvol_req: bad volume creation request

These are fine.

> [77361.740243] Volume creation request dump:
> [77361.744301]  vol_id    0
> [77361.746854]  alignment 1
> [77361.749407]  bytes     126976
> [77361.752397]  vol_type  7
> [77361.754981]  name_len  22
> [77361.757622]  1st 16 characters of name: mkvol_bad:test_m
> [77361.926468] UBI error: ubi_create_volume: volume 0 already exists
> [77361.932606] UBI error: ubi_create_volume: cannot create volume 0, error -17
> [77361.939839] UBI error: ubi_create_volume: volume "mkvol_bad:test_mkvol()" exists (ID 0)
> [77361.947946] UBI error: ubi_create_volume: cannot create volume 1, error -17
> [77362.290898] UBI error: ubi_create_volume: volume "mkvol_bad:test_mkvol()" exists (ID 0)
> [77362.299021] UBI error: ubi_create_volume: cannot create volume 1, error -17
> [77405.406937] UBI error: ubi_open_volume: cannot open device 1, volume 128, error -22
> [77405.420226] UBI error: ubi_open_volume: cannot open device 1, volume -1, error -22
> [77405.427910] UBI error: ubi_open_volume: cannot open device 1, volume 128, error -22
> [77405.435657] UBI error: ubi_open_volume: cannot open device 1, volume 0, error -19
> [77405.774800] UBI error: ubi_open_volume: cannot open device 1, volume 0, error -19

These are fine too.

> Running mkvol_paral /dev/ubi1
> Running rsvol /dev/ubi1
> [rsvol] test_rsvol1():178: function open() failed with error 2 (No such file or directory)
> [rsvol] test_rsvol1():179: cannot open "/dev/ubi1_0"

That one is odd.
Do all other tests fail too?
io_basic and io_parallel are interesting. You can also start them by hand.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS errors are randomly seen after reboots
  2017-01-17 12:33         ` Richard Weinberger
       [not found]           ` <CAMBC=j6t2h_yB3wQbNUexrJto+O3ruTXo992JWFLH293OyH0Bg@mail.gmail.com>
@ 2017-01-17 14:46           ` chaitanya vinnakota
  1 sibling, 0 replies; 12+ messages in thread
From: chaitanya vinnakota @ 2017-01-17 14:46 UTC (permalink / raw)
  To: Richard Weinberger, linux-mtd

Hi Richard,
I formatted the MTD partition with ubiformat and then attached it with
ubiattach.

root@router6654A1:~/arm-openwrt-linux# ubiformat /dev/mtd7

ubiformat: mtd7 (nand), size 96468992 bytes (92.0 MiB), 736
eraseblocks of 131072 bytes (128.0 KiB), min. I/O size 2048 bytes
libscan: scanning eraseblock 735 -- 100 % complete
ubiformat: 736 eraseblocks have valid erase counter, mean value is 0
ubiformat: formatting eraseblock 735 -- 100 % complete

root@router6654A1:~/arm-openwrt-linux# ubiattach -p /dev/mtd7

[77286.016761] UBI: default fastmap pool size: 35
[77286.021226] UBI: default fastmap WL pool size: 25
[77286.026003] UBI: attaching mtd7 to ubi1

[77289.297700] UBI: scanning is finished
[77289.352234] UBI: attached mtd7 (name "rootfs2", size 92 MiB) to ubi1
[77289.358649] UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[77289.365476] UBI: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
[77289.372201] UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
[77289.379112] UBI: good PEBs: 736, bad PEBs: 0, corrupted PEBs: 0
[77289.385060] UBI: user volume: 0, internal volumes: 1, max. volumes count: 128
[77289.392220] UBI: max/mean erase counter: 2/1, WL threshold: 4096,
image sequence number: 1177844582
[77289.401303] UBI: available PEBs: 680, total reserved PEBs: 56, PEBs
reserved for bad PEB handling: 50
[77289.410574] UBI: background thread "ubi_bgt1d" started, PID 20224
UBI device number 1, total 736 LEBs (93454336 bytes, 89.1 MiB),
available 680 LEBs (86343680 bytes, 82.3 MiB), LEB size 126976 bytes
(124.0 KiB)
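
As a quick sanity check (illustrative arithmetic only, using the geometry from the attach log above), the sizes are self-consistent:

```python
# Illustrative check of the UBI attach log numbers: 126976-byte LEBs,
# 736 total LEBs, 680 available LEBs. These values come from the log.
LEB_SIZE = 126976
TOTAL_LEBS = 736
AVAILABLE_LEBS = 680

def mib(n):
    """MiB truncated to one decimal place, matching how the log prints it."""
    return int(n / 2**20 * 10) / 10

total_bytes = TOTAL_LEBS * LEB_SIZE        # 93454336 -> 89.1 MiB
avail_bytes = AVAILABLE_LEBS * LEB_SIZE    # 86343680 -> 82.3 MiB
print(total_bytes, mib(total_bytes), avail_bytes, mib(avail_bytes))
```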

Now I'm seeing an altogether different form of errors.

./runtests.sh /dev/ubi1
Running mkvol_basic /dev/ubi1
Running mkvol_bad /dev/ubi1

[77361.468670] UBI error: verify_mkvol_req: bad volume creation request
[77361.475116] Volume creation request dump:
[77361.479139]  vol_id    -2
[77361.481768]  alignment 1
[77361.484322]  bytes     86343680
[77361.487473]  vol_type  3
[77361.490013]  name_len  22
[77361.492641]  1st 16 characters of name: mkvol_bad:test_m
[77361.498235] UBI error: verify_mkvol_req: bad volume creation request
[77361.504652] Volume creation request dump:
[77361.508676]  vol_id    128
[77361.511390]  alignment 1
[77361.513930]  bytes     86343680
[77361.517095]  vol_type  3
[77361.519636]  name_len  22
[77361.522265]  1st 16 characters of name: mkvol_bad:test_m
[77361.527634] UBI error: verify_mkvol_req: bad volume creation request
[77361.534009] Volume creation request dump:
[77361.538064]  vol_id    0
[77361.540606]  alignment 0
[77361.543145]  bytes     86343680
[77361.546316]  vol_type  3
[77361.548857]  name_len  22
[77361.551485]  1st 16 characters of name: mkvol_bad:test_m
[77361.558437] UBI error: verify_mkvol_req: bad volume creation request
[77361.564862] Volume creation request dump:
[77361.568902]  vol_id    0
[77361.571457]  alignment -1
[77361.574101]  bytes     86343680
[77361.577313]  vol_type  3
[77361.579866]  name_len  22
[77361.582504]  1st 16 characters of name: mkvol_bad:test_m
[77361.587943] UBI error: verify_mkvol_req: bad volume creation request
[77361.594355] Volume creation request dump:
[77361.598394]  vol_id    0
[77361.600945]  alignment 126977
[77361.603936]  bytes     86343680
[77361.607132]  vol_type  3
[77361.609688]  name_len  22
[77361.612329]  1st 16 characters of name: mkvol_bad:test_m
[77361.617765] UBI error: verify_mkvol_req: bad volume creation request
[77361.624199] Volume creation request dump:
[77361.628241]  vol_id    0
[77361.630800]  alignment 2049
[77361.633617]  bytes     86343680
[77361.636814]  vol_type  3
[77361.639368]  name_len  22
[77361.642010]  1st 16 characters of name: mkvol_bad:test_m
[77361.647478] UBI error: verify_mkvol_req: bad volume creation request
[77361.653868] Volume creation request dump:
[77361.657940]  vol_id    0
[77361.660510]  alignment 1
[77361.663065]  bytes     -1
[77361.665739]  vol_type  3
[77361.668290]  name_len  22
[77361.670930]  1st 16 characters of name: mkvol_bad:test_m
[77361.676360] UBI error: verify_mkvol_req: bad volume creation request
[77361.682748] Volume creation request dump:
[77361.686826]  vol_id    0
[77361.689383]  alignment 1
[77361.691936]  bytes     0
[77361.694513]  vol_type  3
[77361.697068]  name_len  22
[77361.699708]  1st 16 characters of name: mkvol_bad:test_m
[77361.705119] UBI error: ubi_create_volume: not enough PEBs, only 680 available
[77361.712293] UBI error: ubi_create_volume: cannot create volume 0, error -28
[77361.719415] UBI error: ubi_create_volume: not enough PEBs, only 680 available
[77361.726655] UBI error: ubi_create_volume: cannot create volume 0, error -28
[77361.733804] UBI error: verify_mkvol_req: bad volume creation request
[77361.740243] Volume creation request dump:
[77361.744301]  vol_id    0
[77361.746854]  alignment 1
[77361.749407]  bytes     126976
[77361.752397]  vol_type  7
[77361.754981]  name_len  22
[77361.757622]  1st 16 characters of name: mkvol_bad:test_m
[77361.926468] UBI error: ubi_create_volume: volume 0 already exists
[77361.932606] UBI error: ubi_create_volume: cannot create volume 0, error -17
[77361.939839] UBI error: ubi_create_volume: volume
"mkvol_bad:test_mkvol()" exists (ID 0)
[77361.947946] UBI error: ubi_create_volume: cannot create volume 1, error -17
[77362.290898] UBI error: ubi_create_volume: volume
"mkvol_bad:test_mkvol()" exists (ID 0)
[77362.299021] UBI error: ubi_create_volume: cannot create volume 1, error -17
[77405.406937] UBI error: ubi_open_volume: cannot open device 1,
volume 128, error -22
[77405.420226] UBI error: ubi_open_volume: cannot open device 1,
volume -1, error -22
[77405.427910] UBI error: ubi_open_volume: cannot open device 1,
volume 128, error -22
[77405.435657] UBI error: ubi_open_volume: cannot open device 1,
volume 0, error -19
[77405.774800] UBI error: ubi_open_volume: cannot open device 1,
volume 0, error -19
Running mkvol_paral /dev/ubi1
Running rsvol /dev/ubi1
[rsvol] test_rsvol1():178: function open() failed with error 2 (No
such file or directory)
[rsvol] test_rsvol1():179: cannot open "/dev/ubi1_0"
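
For context, here is a simplified, hypothetical Python mirror of the checks that reject the requests dumped above (this is a sketch using the geometry from the attach log, not the kernel's exact verify_mkvol_req logic):

```python
# Simplified, hypothetical mirror of UBI's volume-request validation.
# Geometry taken from the attach log above; not the kernel's exact code.
LEB_SIZE = 126976     # bytes per LEB
MIN_IO = 2048         # min. I/O unit
MAX_VOLUMES = 128     # max. volumes count

def mkvol_req_ok(vol_id, alignment, nbytes):
    if not (0 <= vol_id < MAX_VOLUMES):
        return False              # rejects vol_id -2 and 128
    if not (1 <= alignment <= LEB_SIZE):
        return False              # rejects alignment 0, -1 and 126977
    if alignment != 1 and alignment % MIN_IO:
        return False              # rejects alignment 2049 (not I/O-aligned)
    if nbytes <= 0:
        return False              # rejects bytes -1 and 0
    return True
```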


Thanks
Chaitanya

On Tue, Jan 17, 2017 at 6:03 PM, Richard Weinberger <richard@nod.at> wrote:
> Chaitanya,
>
>
> On 17.01.2017 12:47, chaitanya vinnakota wrote:
>> Hi Richard,
>>
>> Seems that the ubi-tests are failing to work on my board. I'm getting the
>> below output when I initiated the test.
>>
>> # ./runtests.sh /dev/ubi1
>>
>> Running mkvol_basic /dev/ubi1
>> [mkvol_basic] __initial_check():74: insufficient available eraseblocks
>> 0 on UBI device, required 5
>>
>> # ./runtests.sh /dev/ubi1_0
>> Running mkvol_basic /dev/ubi1_0
>> [mkvol_basic] __initial_check():69: function ubi_get_dev_info() failed
>> with error 19 (No such device)
>>
>> Which device should I pass as the argument: ubi1 or ubi1_0
>> [the partition where rootfs is mounted]?
>>
>> Added ubinfo of /dev/ubi1_0 and /dev/ubi1 for further reference
>
> ubi1 needs to be empty for testing.
>
> Thanks,
> //richard


* Re: UBIFS errors are randomly seen after reboots
  2017-01-17 14:39             ` Richard Weinberger
@ 2017-01-18 11:35               ` chaitanya vinnakota
  2017-01-23 10:48                 ` chaitanya vinnakota
  0 siblings, 1 reply; 12+ messages in thread
From: chaitanya vinnakota @ 2017-01-18 11:35 UTC (permalink / raw)
  To: Richard Weinberger, linux-mtd

Hi Richard,
The io_basic test failed, whereas the rest of the tests, i.e. io_read,
io_update, io_paral and volrefcnt, passed. I also ran the mtd-tests and
all of them passed.

To sum it up, rsvol and io_basic failed and all the rest passed. Does
this give any lead on the root cause of the UBIFS errors, or are there
any further tests we need to perform?


Below is the output of the io_basic test:

root@router6654A1:~/arm-openwrt-linux# ./io_basic /dev/ubi1
[137176.494161] INFO: rcu_sched detected stall on CPU 0 (t=6000 jiffies)
[137176.500628] Backtrace:
[137176.503212] [<c40113a0>] (dump_backtrace+0x0/0x110) from
[<c43fff28>] (dump_stack+0x18/0x1c)
[137176.511765]  r6:c458e080
[137176.514150] INFO: rcu_sched detected stalls on CPUs/tasks: { 0}
(detected by 1, t=6002 jiffies)
[137176.514166] Backtrace:
[137176.514186] [<c40113a0>] (dump_backtrace+0x0/0x110) from
[<c43fff28>] (dump_stack+0x18/0x1c)
[137176.514196]  r6:c45a6680 r5:c45a6620 r4:c4e31674 r3:00000000
[137176.514228] [<c43fff10>] (dump_stack+0x0/0x1c) from [<c4080368>]
(__rcu_pending+0x210/0x38c)
[137176.514248] [<c4080158>] (__rcu_pending+0x0/0x38c) from
[<c4081270>] (rcu_check_callbacks+0xe8/0x17c)
[137176.514269] [<c4081188>] (rcu_check_callbacks+0x0/0x17c) from
[<c4041f70>] (update_process_times+0x40/0x64)
[137176.514279]  r8:e5267595 r7:00000001 r6:e6049780 r5:00000000 r4:e6066000
[137176.514294] r3:00010000
[137176.514315] [<c4041f30>] (update_process_times+0x0/0x64) from
[<c4063174>] (tick_sched_timer+0x9c/0xdc)
[137176.514325]  r7:c4e314a0 r6:e6067f68 r5:e6066000 r4:c4e31598
[137176.514351] [<c40630d8>] (tick_sched_timer+0x0/0xdc) from
[<c40564f0>] (__run_hrtimer+0xf4/0x1c8)
[137176.514360]  r9:e6067eb0 r8:e5266b80 r6:c4e314d8 r5:c4e314a0 r4:c4e31598
[137176.514385] [<c40563fc>] (__run_hrtimer+0x0/0x1c8) from
[<c4057248>] (hrtimer_interrupt+0x124/0x288)
[137176.514404] [<c4057124>] (hrtimer_interrupt+0x0/0x288) from
[<c4013640>] (twd_handler+0x28/0x30)
[137176.514422] [<c4013618>] (twd_handler+0x0/0x30) from [<c407d8d8>]
(handle_percpu_devid_irq+0xd0/0x150)
[137176.514431]  r4:0000001d r3:c4013618
[137176.514448] [<c407d808>] (handle_percpu_devid_irq+0x0/0x150) from
[<c4079e88>] (generic_handle_irq+0x34/0x48)
[137176.514466] [<c4079e54>] (generic_handle_irq+0x0/0x48) from
[<c400e494>] (handle_IRQ+0x80/0xc0)
[137176.514483] [<c400e414>] (handle_IRQ+0x0/0xc0) from [<c40081d0>]
(asm_do_IRQ+0x10/0x14)
[137176.514492]  r5:60000013 r4:c400e72c
[137176.514507] [<c40081c0>] (asm_do_IRQ+0x0/0x14) from [<c400d738>]
(__irq_svc+0x38/0x90)
[137176.514518] Exception stack(0xe6067f68 to 0xe6067fb0)
[137176.514529] 7f60:                   00000001 00000000 e6067fb0
00000000 e6066000 c44058a8
[137176.514544] 7f80: c45c6984 c45c6ab4 0400406a 412fc091 00000000
e6067fbc e6067fc0 e6067fb0
[137176.514555] 7fa0: c400e728 c400e72c 60000013 ffffffff
[137176.514571] [<c400e700>] (default_idle+0x0/0x30) from [<c400e90c>]
(cpu_idle+0x84/0xc8)
[137176.514596] [<c400e888>] (cpu_idle+0x0/0xc8) from [<c43fcc0c>]
(secondary_start_kernel+0x140/0x158)
[137176.514606]  r6:10c03c7d r5:c45a8b14 r4:00000002 r3:c45988ac
[137176.514693] [<c43fcacc>] (secondary_start_kernel+0x0/0x158) from
[<043fc4b4>] (0x43fc4b4)
[137176.514703]  r5:00000015 r4:2606806a
[137176.739774]  r5:c45a6620 r4:c4e29674 r3:c45a2dbc
[137176.744558] [<c43fff10>] (dump_stack+0x0/0x1c) from [<c40801e0>]
(__rcu_pending+0x88/0x38c)
[137176.753035] [<c4080158>] (__rcu_pending+0x0/0x38c) from
[<c4081270>] (rcu_check_callbacks+0xe8/0x17c)
[137176.762382] [<c4081188>] (rcu_check_callbacks+0x0/0x17c) from
[<c4041f70>] (update_process_times+0x40/0x64)
[137176.772240]  r8:e3f551f5 r7:00000000 r6:e4a71540 r5:00000000 r4:e2012000
[137176.778925] r3:00010000
[137176.781677] [<c4041f30>] (update_process_times+0x0/0x64) from
[<c4063174>] (tick_sched_timer+0x9c/0xdc)
[137176.791186]  r7:c4e294a0 r6:e2013c70 r5:e2012000 r4:c4e29598
[137176.797011] [<c40630d8>] (tick_sched_timer+0x0/0xdc) from
[<c40564f0>] (__run_hrtimer+0xf4/0x1c8)
[137176.805998]  r9:e2013bb8 r8:e3f53e80 r6:c4e294d8 r5:c4e294a0 r4:c4e29598
[137176.812875] [<c40563fc>] (__run_hrtimer+0x0/0x1c8) from
[<c4057248>] (hrtimer_interrupt+0x124/0x288)
[137176.822134] [<c4057124>] (hrtimer_interrupt+0x0/0x288) from
[<c4013640>] (twd_handler+0x28/0x30)
[137176.831042] [<c4013618>] (twd_handler+0x0/0x30) from [<c407d8d8>]
(handle_percpu_devid_irq+0xd0/0x150)
[137176.840465]  r4:0000001d r3:c4013618
[137176.844170] [<c407d808>] (handle_percpu_devid_irq+0x0/0x150) from
[<c4079e88>] (generic_handle_irq+0x34/0x48)
[137176.854213] [<c4079e54>] (generic_handle_irq+0x0/0x48) from
[<c400e494>] (handle_IRQ+0x80/0xc0)
[137176.863032] [<c400e414>] (handle_IRQ+0x0/0xc0) from [<c40081d0>]
(asm_do_IRQ+0x10/0x14)
[137176.871148]  r5:20000013 r4:c41c330c
[137176.874852] [<c40081c0>] (asm_do_IRQ+0x0/0x14) from [<c400d738>]
(__irq_svc+0x38/0x90)
[137176.882883] Exception stack(0xe2013c70 to 0xe2013cb8)
[137176.888041] 3c60:                                     e8800000
ec3cf580 00000270 a5a5a5a5
[137176.896335] 3c80: a5a5a5a5 0000a5a5 000000a5 ec3cf400 00000001
c4619aa8 00000400 e2013cd4
[137176.904629] 3ca0: 000000a5 e2013cb8 000000a5 c41c330c 20000013 ffffffff
[137176.911371] [<c42594c8>] (nand_read_buf+0x0/0x1c) from
[<c425d8a4>] (comcerto_nand_read_page_hwecc+0x70/0x140)
[137176.921498] [<c425d834>] (comcerto_nand_read_page_hwecc+0x0/0x140)
from [<c4258e68>] (nand_do_read_ops+0x200/0x438)
[137176.932062] [<c4258c68>] (nand_do_read_ops+0x0/0x438) from
[<c425a950>] (nand_read+0x80/0xa4)
[137176.940708] [<c425a8d0>] (nand_read+0x0/0xa4) from [<c42442ac>]
(part_read+0x78/0xcc)
[137176.948649]  r9:00000000 r8:04140000 r7:00000000 r6:06d40000 r5:00000003
[137176.955334] r4:e6295c00
[137176.958081] [<c4244234>] (part_read+0x0/0xcc) from [<c42675b0>]
(ubi_io_read+0x120/0x2b0)
[137176.966379] [<c4267490>] (ubi_io_read+0x0/0x2b0) from [<c426522c>]
(ubi_eba_read_leb+0x2b8/0x3b0)
[137176.975375] [<c4264f74>] (ubi_eba_read_leb+0x0/0x3b0) from
[<c426ddd4>] (ubi_check_volume+0x78/0xbc)
[137176.984630] [<c426dd5c>] (ubi_check_volume+0x0/0xbc) from
[<c42639ec>] (vol_cdev_write+0x2d4/0x354)
[137176.993806] [<c4263718>] (vol_cdev_write+0x0/0x354) from
[<c40d7c00>] (vfs_write+0xbc/0x148)
[137177.002364] [<c40d7b44>] (vfs_write+0x0/0x148) from [<c40d7e8c>]
(sys_write+0x48/0x74)
[137177.010392]  r8:c400dd44 r7:00000004 r6:00000200 r5:bebc3a18 r4:e4996e40
[137177.017270] [<c40d7e44>] (sys_write+0x0/0x74) from [<c400dbc0>]
(ret_fast_syscall+0x0/0x30)
[137177.025734]  r6:00010ba0 r5:00000000 r4:bebc3ca8


Thanks
Chaitanya

On Tue, Jan 17, 2017 at 8:09 PM, Richard Weinberger <richard@nod.at> wrote:
> Chaitanya,
>
> On 17.01.2017 15:32, chaitanya vinnakota wrote:
>> Now I'm seeing an altogether different form of errors.
>>
>> ./runtests.sh /dev/ubi1
>> Running mkvol_basic /dev/ubi1
>> Running mkvol_bad /dev/ubi1
>
> _bad is supposed to return errors.
>
>> [77361.468670] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.475116] Volume creation request dump:
>> [77361.479139]  vol_id    -2
>> [77361.481768]  alignment 1
>> [77361.484322]  bytes     86343680
>> [77361.487473]  vol_type  3
>> [77361.490013]  name_len  22
>> [77361.492641]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.498235] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.504652] Volume creation request dump:
>> [77361.508676]  vol_id    128
>> [77361.511390]  alignment 1
>> [77361.513930]  bytes     86343680
>> [77361.517095]  vol_type  3
>> [77361.519636]  name_len  22
>> [77361.522265]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.527634] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.534009] Volume creation request dump:
>> [77361.538064]  vol_id    0
>> [77361.540606]  alignment 0
>> [77361.543145]  bytes     86343680
>> [77361.546316]  vol_type  3
>> [77361.548857]  name_len  22
>> [77361.551485]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.558437] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.564862] Volume creation request dump:
>> [77361.568902]  vol_id    0
>> [77361.571457]  alignment -1
>> [77361.574101]  bytes     86343680
>> [77361.577313]  vol_type  3
>> [77361.579866]  name_len  22
>> [77361.582504]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.587943] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.594355] Volume creation request dump:
>> [77361.598394]  vol_id    0
>> [77361.600945]  alignment 126977
>> [77361.603936]  bytes     86343680
>> [77361.607132]  vol_type  3
>> [77361.609688]  name_len  22
>> [77361.612329]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.617765] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.624199] Volume creation request dump:
>> [77361.628241]  vol_id    0
>> [77361.630800]  alignment 2049
>> [77361.633617]  bytes     86343680
>> [77361.636814]  vol_type  3
>> [77361.639368]  name_len  22
>> [77361.642010]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.647478] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.653868] Volume creation request dump:
>> [77361.657940]  vol_id    0
>> [77361.660510]  alignment 1
>> [77361.663065]  bytes     -1
>> [77361.665739]  vol_type  3
>> [77361.668290]  name_len  22
>> [77361.670930]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.676360] UBI error: verify_mkvol_req: bad volume creation request
>> [77361.682748] Volume creation request dump:
>> [77361.686826]  vol_id    0
>> [77361.689383]  alignment 1
>> [77361.691936]  bytes     0
>> [77361.694513]  vol_type  3
>> [77361.697068]  name_len  22
>> [77361.699708]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.705119] UBI error: ubi_create_volume: not enough PEBs, only 680 available
>> [77361.712293] UBI error: ubi_create_volume: cannot create volume 0, error -28
>> [77361.719415] UBI error: ubi_create_volume: not enough PEBs, only 680 available
>> [77361.726655] UBI error: ubi_create_volume: cannot create volume 0, error -28
>> [77361.733804] UBI error: verify_mkvol_req: bad volume creation request
>
> These are fine.
>
>> [77361.740243] Volume creation request dump:
>> [77361.744301]  vol_id    0
>> [77361.746854]  alignment 1
>> [77361.749407]  bytes     126976
>> [77361.752397]  vol_type  7
>> [77361.754981]  name_len  22
>> [77361.757622]  1st 16 characters of name: mkvol_bad:test_m
>> [77361.926468] UBI error: ubi_create_volume: volume 0 already exists
>> [77361.932606] UBI error: ubi_create_volume: cannot create volume 0, error -17
>> [77361.939839] UBI error: ubi_create_volume: volume "mkvol_bad:test_mkvol()" exists (ID 0)
>> [77361.947946] UBI error: ubi_create_volume: cannot create volume 1, error -17
>> [77362.290898] UBI error: ubi_create_volume: volume "mkvol_bad:test_mkvol()" exists (ID 0)
>> [77362.299021] UBI error: ubi_create_volume: cannot create volume 1, error -17
>> [77405.406937] UBI error: ubi_open_volume: cannot open device 1, volume 128, error -22
>> [77405.420226] UBI error: ubi_open_volume: cannot open device 1, volume -1, error -22
>> [77405.427910] UBI error: ubi_open_volume: cannot open device 1, volume 128, error -22
>> [77405.435657] UBI error: ubi_open_volume: cannot open device 1, volume 0, error -19
>> [77405.774800] UBI error: ubi_open_volume: cannot open device 1, volume 0, error -19
>
> These are fine too.
>
>> Running mkvol_paral /dev/ubi1
>> Running rsvol /dev/ubi1
>> [rsvol] test_rsvol1():178: function open() failed with error 2 (No such file or directory)
>> [rsvol] test_rsvol1():179: cannot open "/dev/ubi1_0"
>
> That one is odd.
> Do all other tests fail too?
> io_basic and io_parallel are interesting. You can also start them by hand.
>
> Thanks,
> //richard


* Re: UBIFS errors are randomly seen after reboots
  2017-01-18 11:35               ` chaitanya vinnakota
@ 2017-01-23 10:48                 ` chaitanya vinnakota
  2017-01-26  8:42                   ` Richard Weinberger
  0 siblings, 1 reply; 12+ messages in thread
From: chaitanya vinnakota @ 2017-01-23 10:48 UTC (permalink / raw)
  To: Richard Weinberger, linux-mtd

Hi Richard,

We are seeing UBIFS errors even when the root filesystem is mounted
read-only, but the error is reported only once. Our test scenario: we
reboot the device by calling "reboot" from a script while another
script performs data writes to the MTD partitions other than the root
filesystem.

What's baffling is why UBIFS errors are seen on the root filesystem
when the data writes go to the other partitions and, most importantly,
when rootfs is mounted read-only:

[  155.121005] UBIFS error (pid 5040): ubifs_decompress: cannot
decompress 2434 bytes, compressor zlib, error -22
[  155.121017] UBIFS error (pid 5040): read_block: bad data node
(block 60, inode 3484)
[  155.121026] UBIFS error (pid 5040): do_readpage: cannot read page
60 of inode 3484, error -22
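
For illustration only (a userspace sketch, not UBIFS code): zlib validates its stream, so even a single corrupted byte coming back from flash makes decompression fail outright. That is why a media-level read problem surfaces as a "cannot decompress" error rather than as silently wrong data:

```python
import zlib

# Compress a buffer the same size as the failing data node (2434 bytes),
# then flip one byte to mimic data corrupted on its way from flash.
good = zlib.compress(b"A" * 2434)
corrupt = bytes([good[0] ^ 0xFF]) + good[1:]

assert zlib.decompress(good) == b"A" * 2434   # intact stream round-trips
try:
    zlib.decompress(corrupt)
except zlib.error as e:
    print("decompression failed:", e)         # analogous to UBIFS's error
```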

ECC errors are also sometimes observed when rootfs is mounted read-only:

[  154.824361] ECC: uncorrectable error 2 !!!
[  154.824368] ECC correction failed for page 0x00014b58
[  154.825474] ECC: uncorrectable error 2 !!!
[  154.825479] ECC correction failed for page 0x00014b58
[  154.825604] UBI warning: ubi_io_read: error -74 (ECC error) while
reading 188 bytes from PEB 451:50368, read only 188 bytes, retry

The page 0x00014b58 falls in the rootfs partition, but the nanddump
utility does not report any bad blocks in that partition.

 ~# nanddump /dev/mtd4
nanddump: warning!: you did not specify a default bad-block handling
  method. In future versions, the default will change to
  --bb=skipbad. Use "nanddump --help" for more information.
nanddump: warning!: in next release, nanddump will not dump OOB
  by default. Use `nanddump --oob' explicitly to ensure
  it is dumped.
ECC failed: 0
ECC corrected: 0
Number of bad blocks: 0
Number of bbt blocks: 0
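
A rough way to place the failing page, assuming the geometry from the attach log (2048-byte pages, 128 KiB eraseblocks). Note this is only a sketch: whether the driver reports the page number relative to the chip or to the partition depends on the NAND driver.

```python
# Hypothetical arithmetic: locate page 0x00014b58 assuming 2048-byte
# pages and 131072-byte (128 KiB) eraseblocks, as in the attach log.
PAGE_SIZE = 2048
BLOCK_SIZE = 131072

page = 0x00014b58
offset = page * PAGE_SIZE                        # byte offset of the page
block = offset // BLOCK_SIZE                     # eraseblock containing it
page_in_block = page % (BLOCK_SIZE // PAGE_SIZE)

print(hex(offset), block, page_in_block)         # 0xa5ac000 1325 24
```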

We ran the mtd and ubi tests. All the mtd tests passed, but one UBI
test, io_basic, failed.

Can you please help us in this regard? Any inputs or suggestions?


Thanks
Chaitanya

On Wed, Jan 18, 2017 at 5:05 PM, chaitanya vinnakota
<chaitanya.sai.v@gmail.com> wrote:
> Hi Richard,
> The io_basic test failed, whereas the rest of the tests, i.e. io_read,
> io_update, io_paral and volrefcnt, passed. I also ran the mtd-tests and
> all of them passed.
>
> To sum it up, rsvol and io_basic failed and all the rest passed. Does
> this give any lead on the root cause of the UBIFS errors, or are there
> any further tests we need to perform?
>
>
> Below is the output of the io_basic test:
>
> root@router6654A1:~/arm-openwrt-linux# ./io_basic /dev/ubi1
> [137176.494161] INFO: rcu_sched detected stall on CPU 0 (t=6000 jiffies)
> [137176.500628] Backtrace:
> [137176.503212] [<c40113a0>] (dump_backtrace+0x0/0x110) from
> [<c43fff28>] (dump_stack+0x18/0x1c)
> [137176.511765]  r6:c458e080
> [137176.514150] INFO: rcu_sched detected stalls on CPUs/tasks: { 0}
> (detected by 1, t=6002 jiffies)
> [137176.514166] Backtrace:
> [137176.514186] [<c40113a0>] (dump_backtrace+0x0/0x110) from
> [<c43fff28>] (dump_stack+0x18/0x1c)
> [137176.514196]  r6:c45a6680 r5:c45a6620 r4:c4e31674 r3:00000000
> [137176.514228] [<c43fff10>] (dump_stack+0x0/0x1c) from [<c4080368>]
> (__rcu_pending+0x210/0x38c)
> [137176.514248] [<c4080158>] (__rcu_pending+0x0/0x38c) from
> [<c4081270>] (rcu_check_callbacks+0xe8/0x17c)
> [137176.514269] [<c4081188>] (rcu_check_callbacks+0x0/0x17c) from
> [<c4041f70>] (update_process_times+0x40/0x64)
> [137176.514279]  r8:e5267595 r7:00000001 r6:e6049780 r5:00000000 r4:e6066000
> [137176.514294] r3:00010000
> [137176.514315] [<c4041f30>] (update_process_times+0x0/0x64) from
> [<c4063174>] (tick_sched_timer+0x9c/0xdc)
> [137176.514325]  r7:c4e314a0 r6:e6067f68 r5:e6066000 r4:c4e31598
> [137176.514351] [<c40630d8>] (tick_sched_timer+0x0/0xdc) from
> [<c40564f0>] (__run_hrtimer+0xf4/0x1c8)
> [137176.514360]  r9:e6067eb0 r8:e5266b80 r6:c4e314d8 r5:c4e314a0 r4:c4e31598
> [137176.514385] [<c40563fc>] (__run_hrtimer+0x0/0x1c8) from
> [<c4057248>] (hrtimer_interrupt+0x124/0x288)
> [137176.514404] [<c4057124>] (hrtimer_interrupt+0x0/0x288) from
> [<c4013640>] (twd_handler+0x28/0x30)
> [137176.514422] [<c4013618>] (twd_handler+0x0/0x30) from [<c407d8d8>]
> (handle_percpu_devid_irq+0xd0/0x150)
> [137176.514431]  r4:0000001d r3:c4013618
> [137176.514448] [<c407d808>] (handle_percpu_devid_irq+0x0/0x150) from
> [<c4079e88>] (generic_handle_irq+0x34/0x48)
> [137176.514466] [<c4079e54>] (generic_handle_irq+0x0/0x48) from
> [<c400e494>] (handle_IRQ+0x80/0xc0)
> [137176.514483] [<c400e414>] (handle_IRQ+0x0/0xc0) from [<c40081d0>]
> (asm_do_IRQ+0x10/0x14)
> [137176.514492]  r5:60000013 r4:c400e72c
> [137176.514507] [<c40081c0>] (asm_do_IRQ+0x0/0x14) from [<c400d738>]
> (__irq_svc+0x38/0x90)
> [137176.514518] Exception stack(0xe6067f68 to 0xe6067fb0)
> [137176.514529] 7f60:                   00000001 00000000 e6067fb0
> 00000000 e6066000 c44058a8
> [137176.514544] 7f80: c45c6984 c45c6ab4 0400406a 412fc091 00000000
> e6067fbc e6067fc0 e6067fb0
> [137176.514555] 7fa0: c400e728 c400e72c 60000013 ffffffff
> [137176.514571] [<c400e700>] (default_idle+0x0/0x30) from [<c400e90c>]
> (cpu_idle+0x84/0xc8)
> [137176.514596] [<c400e888>] (cpu_idle+0x0/0xc8) from [<c43fcc0c>]
> (secondary_start_kernel+0x140/0x158)
> [137176.514606]  r6:10c03c7d r5:c45a8b14 r4:00000002 r3:c45988ac
> [137176.514693] [<c43fcacc>] (secondary_start_kernel+0x0/0x158) from
> [<043fc4b4>] (0x43fc4b4)
> [137176.514703]  r5:00000015 r4:2606806a
> [137176.739774]  r5:c45a6620 r4:c4e29674 r3:c45a2dbc
> [137176.744558] [<c43fff10>] (dump_stack+0x0/0x1c) from [<c40801e0>]
> (__rcu_pending+0x88/0x38c)
> [137176.753035] [<c4080158>] (__rcu_pending+0x0/0x38c) from
> [<c4081270>] (rcu_check_callbacks+0xe8/0x17c)
> [137176.762382] [<c4081188>] (rcu_check_callbacks+0x0/0x17c) from
> [<c4041f70>] (update_process_times+0x40/0x64)
> [137176.772240]  r8:e3f551f5 r7:00000000 r6:e4a71540 r5:00000000 r4:e2012000
> [137176.778925] r3:00010000
> [137176.781677] [<c4041f30>] (update_process_times+0x0/0x64) from
> [<c4063174>] (tick_sched_timer+0x9c/0xdc)
> [137176.791186]  r7:c4e294a0 r6:e2013c70 r5:e2012000 r4:c4e29598
> [137176.797011] [<c40630d8>] (tick_sched_timer+0x0/0xdc) from
> [<c40564f0>] (__run_hrtimer+0xf4/0x1c8)
> [137176.805998]  r9:e2013bb8 r8:e3f53e80 r6:c4e294d8 r5:c4e294a0 r4:c4e29598
> [137176.812875] [<c40563fc>] (__run_hrtimer+0x0/0x1c8) from
> [<c4057248>] (hrtimer_interrupt+0x124/0x288)
> [137176.822134] [<c4057124>] (hrtimer_interrupt+0x0/0x288) from
> [<c4013640>] (twd_handler+0x28/0x30)
> [137176.831042] [<c4013618>] (twd_handler+0x0/0x30) from [<c407d8d8>]
> (handle_percpu_devid_irq+0xd0/0x150)
> [137176.840465]  r4:0000001d r3:c4013618
> [137176.844170] [<c407d808>] (handle_percpu_devid_irq+0x0/0x150) from
> [<c4079e88>] (generic_handle_irq+0x34/0x48)
> [137176.854213] [<c4079e54>] (generic_handle_irq+0x0/0x48) from
> [<c400e494>] (handle_IRQ+0x80/0xc0)
> [137176.863032] [<c400e414>] (handle_IRQ+0x0/0xc0) from [<c40081d0>]
> (asm_do_IRQ+0x10/0x14)
> [137176.871148]  r5:20000013 r4:c41c330c
> [137176.874852] [<c40081c0>] (asm_do_IRQ+0x0/0x14) from [<c400d738>]
> (__irq_svc+0x38/0x90)
> [137176.882883] Exception stack(0xe2013c70 to 0xe2013cb8)
> [137176.888041] 3c60:                                     e8800000
> ec3cf580 00000270 a5a5a5a5
> [137176.896335] 3c80: a5a5a5a5 0000a5a5 000000a5 ec3cf400 00000001
> c4619aa8 00000400 e2013cd4
> [137176.904629] 3ca0: 000000a5 e2013cb8 000000a5 c41c330c 20000013 ffffffff
> [137176.911371] [<c42594c8>] (nand_read_buf+0x0/0x1c) from
> [<c425d8a4>] (comcerto_nand_read_page_hwecc+0x70/0x140)
> [137176.921498] [<c425d834>] (comcerto_nand_read_page_hwecc+0x0/0x140)
> from [<c4258e68>] (nand_do_read_ops+0x200/0x438)
> [137176.932062] [<c4258c68>] (nand_do_read_ops+0x0/0x438) from
> [<c425a950>] (nand_read+0x80/0xa4)
> [137176.940708] [<c425a8d0>] (nand_read+0x0/0xa4) from [<c42442ac>]
> (part_read+0x78/0xcc)
> [137176.948649]  r9:00000000 r8:04140000 r7:00000000 r6:06d40000 r5:00000003
> [137176.955334] r4:e6295c00
> [137176.958081] [<c4244234>] (part_read+0x0/0xcc) from [<c42675b0>]
> (ubi_io_read+0x120/0x2b0)
> [137176.966379] [<c4267490>] (ubi_io_read+0x0/0x2b0) from [<c426522c>]
> (ubi_eba_read_leb+0x2b8/0x3b0)
> [137176.975375] [<c4264f74>] (ubi_eba_read_leb+0x0/0x3b0) from
> [<c426ddd4>] (ubi_check_volume+0x78/0xbc)
> [137176.984630] [<c426dd5c>] (ubi_check_volume+0x0/0xbc) from
> [<c42639ec>] (vol_cdev_write+0x2d4/0x354)
> [137176.993806] [<c4263718>] (vol_cdev_write+0x0/0x354) from
> [<c40d7c00>] (vfs_write+0xbc/0x148)
> [137177.002364] [<c40d7b44>] (vfs_write+0x0/0x148) from [<c40d7e8c>]
> (sys_write+0x48/0x74)
> [137177.010392]  r8:c400dd44 r7:00000004 r6:00000200 r5:bebc3a18 r4:e4996e40
> [137177.017270] [<c40d7e44>] (sys_write+0x0/0x74) from [<c400dbc0>]
> (ret_fast_syscall+0x0/0x30)
> [137177.025734]  r6:00010ba0 r5:00000000 r4:bebc3ca8
>
>
> Thanks
> Chaitanya
>
> On Tue, Jan 17, 2017 at 8:09 PM, Richard Weinberger <richard@nod.at> wrote:
>> Chaitanya,
>>
>> On 17.01.2017 15:32, chaitanya vinnakota wrote:
>>> Now I'm seeing an altogether different form of errors.
>>>
>>> ./runtests.sh /dev/ubi1
>>> Running mkvol_basic /dev/ubi1
>>> Running mkvol_bad /dev/ubi1
>>
>> _bad is supposed to return errors.
>>
>>> [77361.468670] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.475116] Volume creation request dump:
>>> [77361.479139]  vol_id    -2
>>> [77361.481768]  alignment 1
>>> [77361.484322]  bytes     86343680
>>> [77361.487473]  vol_type  3
>>> [77361.490013]  name_len  22
>>> [77361.492641]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.498235] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.504652] Volume creation request dump:
>>> [77361.508676]  vol_id    128
>>> [77361.511390]  alignment 1
>>> [77361.513930]  bytes     86343680
>>> [77361.517095]  vol_type  3
>>> [77361.519636]  name_len  22
>>> [77361.522265]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.527634] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.534009] Volume creation request dump:
>>> [77361.538064]  vol_id    0
>>> [77361.540606]  alignment 0
>>> [77361.543145]  bytes     86343680
>>> [77361.546316]  vol_type  3
>>> [77361.548857]  name_len  22
>>> [77361.551485]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.558437] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.564862] Volume creation request dump:
>>> [77361.568902]  vol_id    0
>>> [77361.571457]  alignment -1
>>> [77361.574101]  bytes     86343680
>>> [77361.577313]  vol_type  3
>>> [77361.579866]  name_len  22
>>> [77361.582504]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.587943] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.594355] Volume creation request dump:
>>> [77361.598394]  vol_id    0
>>> [77361.600945]  alignment 126977
>>> [77361.603936]  bytes     86343680
>>> [77361.607132]  vol_type  3
>>> [77361.609688]  name_len  22
>>> [77361.612329]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.617765] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.624199] Volume creation request dump:
>>> [77361.628241]  vol_id    0
>>> [77361.630800]  alignment 2049
>>> [77361.633617]  bytes     86343680
>>> [77361.636814]  vol_type  3
>>> [77361.639368]  name_len  22
>>> [77361.642010]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.647478] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.653868] Volume creation request dump:
>>> [77361.657940]  vol_id    0
>>> [77361.660510]  alignment 1
>>> [77361.663065]  bytes     -1
>>> [77361.665739]  vol_type  3
>>> [77361.668290]  name_len  22
>>> [77361.670930]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.676360] UBI error: verify_mkvol_req: bad volume creation request
>>> [77361.682748] Volume creation request dump:
>>> [77361.686826]  vol_id    0
>>> [77361.689383]  alignment 1
>>> [77361.691936]  bytes     0
>>> [77361.694513]  vol_type  3
>>> [77361.697068]  name_len  22
>>> [77361.699708]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.705119] UBI error: ubi_create_volume: not enough PEBs, only 680 available
>>> [77361.712293] UBI error: ubi_create_volume: cannot create volume 0, error -28
>>> [77361.719415] UBI error: ubi_create_volume: not enough PEBs, only 680 available
>>> [77361.726655] UBI error: ubi_create_volume: cannot create volume 0, error -28
>>> [77361.733804] UBI error: verify_mkvol_req: bad volume creation request
>>
>> These are fine.
>>
>>> [77361.740243] Volume creation request dump:
>>> [77361.744301]  vol_id    0
>>> [77361.746854]  alignment 1
>>> [77361.749407]  bytes     126976
>>> [77361.752397]  vol_type  7
>>> [77361.754981]  name_len  22
>>> [77361.757622]  1st 16 characters of name: mkvol_bad:test_m
>>> [77361.926468] UBI error: ubi_create_volume: volume 0 already exists
>>> [77361.932606] UBI error: ubi_create_volume: cannot create volume 0, error -17
>>> [77361.939839] UBI error: ubi_create_volume: volume "mkvol_bad:test_mkvol()" exists (ID 0)
>>> [77361.947946] UBI error: ubi_create_volume: cannot create volume 1, error -17
>>> [77362.290898] UBI error: ubi_create_volume: volume "mkvol_bad:test_mkvol()" exists (ID 0)
>>> [77362.299021] UBI error: ubi_create_volume: cannot create volume 1, error -17
>>> [77405.406937] UBI error: ubi_open_volume: cannot open device 1, volume 128, error -22
>>> [77405.420226] UBI error: ubi_open_volume: cannot open device 1, volume -1, error -22
>>> [77405.427910] UBI error: ubi_open_volume: cannot open device 1, volume 128, error -22
>>> [77405.435657] UBI error: ubi_open_volume: cannot open device 1, volume 0, error -19
>>> [77405.774800] UBI error: ubi_open_volume: cannot open device 1, volume 0, error -19
>>
>> These are fine too.
>>
>>> Running mkvol_paral /dev/ubi1
>>> Running rsvol /dev/ubi1
>>> [rsvol] test_rsvol1():178: function open() failed with error 2 (No such file or directory)
>>> [rsvol] test_rsvol1():179: cannot open "/dev/ubi1_0"
>>
>> That one is odd.
>> Do all other tests fail too?
>> io_basic and io_parallel are interesting. You can also start them by hand.
>>
>> Thanks,
>> //richard

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS errors are randomly seen after reboots
  2017-01-23 10:48                 ` chaitanya vinnakota
@ 2017-01-26  8:42                   ` Richard Weinberger
  2017-02-03  9:30                     ` chaitanya vinnakota
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Weinberger @ 2017-01-26  8:42 UTC (permalink / raw)
  To: chaitanya vinnakota, linux-mtd

Chaitanya,

Am 23.01.2017 um 11:48 schrieb chaitanya vinnakota:
> Hi Richard,
> 
> We are seeing UBIFS errors even when the root filesystem is mounted
> read-only, but the error is reported only once.
> Our test scenario: we reboot the device by calling "reboot" from a
> script while data-write operations to the MTD partitions other than
> the root filesystem are being performed by another script.
> 
> What is more baffling is why root-filesystem UBIFS errors are seen
> when the data writes are performed on the other partitions, and, most
> importantly, when rootfs is mounted read-only:
> 
> [  155.121005] UBIFS error (pid 5040): ubifs_decompress: cannot
> decompress 2434 bytes, compressor zlib, error -22

The compressor fails to uncompress because the payload is corrupted.
This can be due to a driver bug, insufficient ECC strength, etc.
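
As an aside, this failure mode is easy to reproduce in userspace: one
flipped byte in a zlib stream is enough to make decompression fail,
which is essentially what ubifs_decompress reports as -22. (A sketch
only: UBIFS uses the kernel's zlib, and the exact error path differs;
Python's zlib is used here purely for illustration.)

```python
import zlib

# Round-trip an intact stream first, then emulate a corrupted
# on-flash data node by flipping a single byte of the stream.
payload = b"UBIFS data node payload " * 100
good = zlib.compress(payload)
assert zlib.decompress(good) == payload  # intact stream round-trips

bad = bytearray(good)
bad[10] ^= 0xFF  # one byte error that ECC failed to catch/correct
try:
    zlib.decompress(bytes(bad))
except zlib.error as exc:
    print("decompress failed:", exc)  # corrupted stream is rejected
```

The point being: a decompression error is a symptom, not a cause; the
corruption happened earlier, on the way to or from the flash.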

> [  155.121017] UBIFS error (pid 5040): read_block: bad data node
> (block 60, inode 3484)
> [  155.121026] UBIFS error (pid 5040): do_readpage: cannot read page
> 60 of inode 3484, error -22
> 
> ECC errors are also observed sometimes when the rootfs is mounted read-only
> 
> [  154.824361] ECC: uncorrectable error 2 !!!
> [  154.824368] ECC correction failed for page 0x00014b58
> [  154.825474] ECC: uncorrectable error 2 !!!
> [  154.825479] ECC correction failed for page 0x00014b58
> [  154.825604] UBI warning: ubi_io_read: error -74 (ECC error) while

Here we have a classic ECC error. This should not happen on fresh
NANDs.

> reading 188 bytes from PEB 451:50368, read only 188 bytes, retry
> 
> The page 0x00014b58 falls in the rootfs partition, but the nanddump
> utility does not report any bad blocks in the rootfs partition.
> 
>  ~# nanddump /dev/mtd4
> nanddump: warning!: you did not specify a default bad-block handling
>   method. In future versions, the default will change to
>   --bb=skipbad. Use "nanddump --help" for more information.
> nanddump: warning!: in next release, nanddump will not dump OOB
>   by default. Use `nanddump --oob' explicitly to ensure
>   it is dumped.
> ECC failed: 0
> ECC corrected: 0
> Number of bad blocks: 0
> Number of bbt blocks: 0
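
(Editorial aside.) The claim that page 0x00014b58 falls in the rootfs
partition can be sanity-checked with the geometry the boot log reports
(2048-byte pages, 128 KiB PEBs), assuming the driver prints a raw page
index. Note the computed eraseblock index is relative to the whole
device, while the PEB number in the UBI warning (PEB 451) is relative
to the attached partition, so comparing the two also needs the
partition's start offset (not shown here):

```python
# Map a raw NAND page index to a byte offset and eraseblock index,
# using the geometry from the boot log: 2048-byte pages, 128 KiB PEBs.
PAGE_SIZE = 2048
PEB_SIZE = 128 * 1024
PAGES_PER_PEB = PEB_SIZE // PAGE_SIZE  # 64 pages per eraseblock

def locate(page: int) -> tuple[int, int, int]:
    """Return (byte offset, eraseblock index, page within that block)."""
    return page * PAGE_SIZE, page // PAGES_PER_PEB, page % PAGES_PER_PEB

offset, block, page_in_block = locate(0x00014B58)
print(f"offset={offset:#x} eraseblock={block} page-in-block={page_in_block}")
# → offset=0xa5ac000 eraseblock=1325 page-in-block=24
```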
> 
> We ran the MTD and UBI test suites; all MTD tests passed, but one UBI
> test, io_basic, failed.
> 
> Can you please help us in this regard? Any inputs or suggestions?

This is not easy. I suggest double-checking all NAND/MTD settings from
the ground up: timings, ECC strength, basic driver testing...

Thanks,
//richard


* Re: UBIFS errors are randomly seen after reboots
  2017-01-26  8:42                   ` Richard Weinberger
@ 2017-02-03  9:30                     ` chaitanya vinnakota
  0 siblings, 0 replies; 12+ messages in thread
From: chaitanya vinnakota @ 2017-02-03  9:30 UTC (permalink / raw)
  To: Richard Weinberger, linux-mtd

Hi Richard,
I've taken the UBIFS changes backported to the 3.2 kernel from
git://git.infradead.org/users/dedekind/ubifs-v3.2.git. With these
changes, the frequency of the root-filesystem mount failure is reduced,
but it is still seen sometimes. One thing I noticed is that the
root-filesystem mount failures occur when the prior reboot encountered
errors while unmounting the root filesystem. I modified the UBIFS error
debug print to include the process name along with the pid.

Below is an excerpt of the errors seen during reboots:

UBIFS error (pid 9515, process umount): dbg_check_space_info: free
space changed from 14128638 to 14743030
UBIFS assert failed in reserve_space at 125 (pid 9529)
UBIFS assert failed in ubifs_write_begin at 436 (pid 9529)
[  234.725204] Backtrace:
[  234.725217] [<c40113a0>] (dump_backtrace+0x0/0x110) from
[<c44120e8>] (dump_stack+0x18/0x1c)
[  234.725224]  r6:e5470120 r5:e5470060 r4:002bb000 r3:00000000
[  234.725239] [<c44120d0>] (dump_stack+0x0/0x1c) from [<c417d024>]
(ubifs_write_begin+0x9c/0x498)
[  234.725251] [<c417cf88>] (ubifs_write_begin+0x0/0x498) from
[<c40a5bc4>] (generic_file_buffered_write+0xe0/0x234)
[  234.725264] [<c40a5ae4>] (generic_file_buffered_write+0x0/0x234)
from [<c40a7794>] (__generic_file_aio_write+0x3fc/0x440)
[  234.725276] [<c40a7398>] (__generic_file_aio_write+0x0/0x440) from
[<c40a7844>] (generic_file_aio_write+0x6c/0xd0)
[  234.725288] [<c40a77d8>] (generic_file_aio_write+0x0/0xd0) from
[<c417c7a8>] (ubifs_aio_write+0x16c/0x180)
[  234.725296]  r8:e653b000 r7:e4cc9f78 r6:e4cc9ea8 r5:e5470060 r4:e609f800
[  234.725313] [<c417c63c>] (ubifs_aio_write+0x0/0x180) from
[<c40d72ac>] (do_sync_write+0xa0/0xe0)
[  234.725325] [<c40d720c>] (do_sync_write+0x0/0xe0) from [<c40d7c00>]
(vfs_write+0xbc/0x148)
[  234.725331]  r5:00300000 r4:e609f800
[  234.725342] [<c40d7b44>] (vfs_write+0x0/0x148) from [<c40d7e8c>]
(sys_write+0x48/0x74)
[  234.725349]  r8:c400dd44 r7:00000004 r6:00300000 r5:4039b008 r4:e609f800
[  234.725366] [<c40d7e44>] (sys_write+0x0/0x74) from [<c400dbc0>]
(ret_fast_syscall+0x0/0x30)
[  234.725372]  r6:4039b008 r5:00000001 r4:0008426c

Subsequently, during boot, the root filesystem mount fails; below is an
excerpt from the logs:

[   10.090852] UBIFS error (pid 1, process swapper/0):
ubifs_check_node: bad CRC: calculated 0xb4b7338e, read 0xe5385648
[   10.101515] UBIFS error (pid 1, process swapper/0):
ubifs_check_node: bad node at LEB 499:98208
[   10.372656] UBIFS error (pid 1, process swapper/0): ubifs_scan: bad node
[   10.379386] UBIFS error (pid 1, process swapper/0):
ubifs_scanned_corruption: corruption at LEB 499:98208
[   10.388988] UBIFS error (pid 1, process swapper/0):
ubifs_scanned_corruption: first 8192 bytes from LEB 499:98208
[   10.403113] UBIFS error (pid 1, process swapper/0): ubifs_scan: LEB
499 scanning failed
[   10.411213] UBIFS: background thread "ubifs_bgt0_0" stops
[   10.467214] VFS: Cannot open root device "ubi0:rootfs" or unknown-block(0,0)
[   10.474287] Please append a correct "root=" boot option; here are
the available partitions:
[   10.482683] 1f00             512 mtdblock0  (driver?)
[   10.487774] 1f01             512 mtdblock1  (driver?)
[   10.492855] 1f02             128 mtdblock2  (driver?)
[   10.497942] 1f03            8192 mtdblock3  (driver?)
[   10.503022] 1f04           94208 mtdblock4  (driver?)
[   10.508110] 1f05             128 mtdblock5  (driver?)
[   10.513190] 1f06            8192 mtdblock6  (driver?)
[   10.518280] 1f07           94208 mtdblock7  (driver?)
[   10.523360] 1f08             128 mtdblock8  (driver?)
[   10.528449] 1f09            2048 mtdblock9  (driver?)
[   10.533529] 1f0a           12288 mtdblock10  (driver?)
[   10.538702] 1f0b           32768 mtdblock11  (driver?)
[   10.543868] 1f0c            2048 mtdblock12  (driver?)
[   10.549048] 1f0d             128 mtdblock13  (driver?)
[   10.554215] 1f0e             512 mtdblock14  (driver?)
[   10.559391] 1f0f             128 mtdblock15  (driver?)
[   10.564559] 1f10             128 mtdblock16  (driver?)
[   10.569735] 1f11              64 mtdblock17  (driver?)
[   10.574901] 1f12              64 mtdblock18  (driver?)
[   10.580074] 1f13              64 mtdblock19  (driver?)
[   10.585240] 0800         3915776 sda  driver: sd
[   10.589895]   0801         3914752 sda1 00000000-0000-0000-0000-000000000000
[   10.596979] 1f14           84320 mtdblock20  (driver?)
[   10.602155] Kernel panic - not syncing: VFS: Unable to mount root
fs on unknown-block(0,0)
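
For readers unfamiliar with the "bad CRC: calculated ..., read ..."
message above: each UBIFS node stores a CRC over its contents, and
ubifs_check_node recomputes it on read; a mismatch means the node was
damaged between being written and being read back. A generic sketch of
such a check follows, with a hypothetical node layout — not UBIFS's
actual on-flash format or CRC parameters:

```python
import struct
import zlib

def make_node(payload: bytes) -> bytes:
    # Hypothetical layout: 4-byte little-endian CRC32 header + payload.
    return struct.pack("<I", zlib.crc32(payload)) + payload

def check_node(node: bytes) -> bool:
    """True when the stored CRC matches the one recomputed from the data."""
    (stored,) = struct.unpack_from("<I", node)
    return stored == zlib.crc32(node[4:])

node = bytearray(make_node(b"directory entry data"))
assert check_node(bytes(node))       # intact node passes the check

node[8] ^= 0x01                      # one flipped bit in the payload...
assert not check_node(bytes(node))   # ...and the node is rejected
```

Any single-bit change to either the payload or the stored CRC triggers
the mismatch, which is why a CRC failure alone cannot say whether the
write was interrupted or the medium decayed afterwards.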

Does the correlation between an error-laden shutdown and the subsequent
failed boot give us any insight?

Thanks
Chaitanya


end of thread, other threads:[~2017-02-03  9:31 UTC | newest]

Thread overview: 12+ messages
2017-01-16  5:28 UBIFS errors are randomly seen after reboots chaitanya vinnakota
2017-01-16  7:52 ` Richard Weinberger
2017-01-16 15:21   ` chaitanya vinnakota
2017-01-16 16:30     ` Richard Weinberger
2017-01-17 11:47       ` chaitanya vinnakota
2017-01-17 12:33         ` Richard Weinberger
     [not found]           ` <CAMBC=j6t2h_yB3wQbNUexrJto+O3ruTXo992JWFLH293OyH0Bg@mail.gmail.com>
2017-01-17 14:39             ` Richard Weinberger
2017-01-18 11:35               ` chaitanya vinnakota
2017-01-23 10:48                 ` chaitanya vinnakota
2017-01-26  8:42                   ` Richard Weinberger
2017-02-03  9:30                     ` chaitanya vinnakota
2017-01-17 14:46           ` chaitanya vinnakota
