All of lore.kernel.org
 help / color / mirror / Atom feed
* UBIFS fails to mount on second boot
@ 2012-06-25  5:23 Iwo
  2012-06-27 14:10 ` Artem Bityutskiy
  0 siblings, 1 reply; 11+ messages in thread
From: Iwo @ 2012-06-25  5:23 UTC (permalink / raw)
  To: linux-mtd

Hi all,


I'm trying to use a UBIFS partition as the linux root filesystem.
After flashing the
UBI image, the system boots successfully and is functional.

On the second boot, however, mounting of the UBIFS fails and the kernel panics.
It is possible to avoid the failure by performing a large number of filesystem
operations (i.e. file system benchmark) during the first session.

Has anyone seen this before? Suggestions?

The hardware is a TI am335x processor with a 256MB, 2K pages, 128K erase block
NAND flash.

As is all too common, the flash supports sub-pages, the driver does
not. Thus, I'm
forcing the VID header offset to 2048.

The kernel is a 3.1.0 with TI patches and UBI+UBIFS patches from
git://git.infradead.org/~dedekind/ubifs-v3.1.git today. I had to change
IS_ENABLED(DEBUG_FS) to IS_ENABLED(CONFIG_DEBUG_FS)
to make that compile. The original v3.1.0 UBI/UBIFS code exhibited
the same behaviour though.

The image is generated like so (using the latest MTD tools GIT):

 mkfs.ubifs -d staging -F -o root.ubifs -m 0x800 -e 126976 -c 511
 Using UBI volume size of 16760832 (16777216)
 ubinize -o root.ubi -m 0x800 -p 0x20000 ubi.cfg

ubi.cfg:

 [ubifs]
 mode=ubi
 image=root.ubifs
 vol_id=0
 vol_size=16777216
 vol_type=dynamic
 vol_name=rootfs
 vol_flags=autoresize

The UBI image is written by U-Boot, skipping write on any pages that
contain 0xFF.

Here are the interesting bits from the kernel log. First (successful) boot:

[    0.000000] Kernel command line: console=ttyO0,115200n8 noinitrd
rootfstype=ubifs rootwait=1 ip=none ubifs.debug_msgs=0xffff
root=ubi0:rootfs rw ubi.mtd=roota,2048
...
[    1.370993] Creating 8 MTD partitions on "omap2-nand.0":
[    1.376607] 0x000000000000-0x000000300000 : "S1S2EN"
[    1.384479] 0x000000300000-0x000000800000 : "kernela"
[    1.392911] 0x000000800000-0x000002800000 : "roota"
[    1.412292] 0x000002800000-0x000003800000 : "ovla"
[    1.424904] 0x000003800000-0x000003d00000 : "kernelb"
[    1.433216] 0x000003d00000-0x000005d00000 : "rootb"
[    1.452552] 0x000005d00000-0x000006d00000 : "ovlb"
[    1.465172] 0x000006d00000-0x000020000000 : "NAND"
[    1.638785] UBI: attaching mtd2 to ubi0
[    1.643081] UBI: physical eraseblock size:   131072 bytes (128 KiB)
[    1.649669] UBI: logical eraseblock size:    126976 bytes
[    1.655362] UBI: smallest flash I/O unit:    2048
[    1.660305] UBI: sub-page size:              512
[    1.665174] UBI: VID header offset:          2048 (aligned 2048)
[    1.671487] UBI: data offset:                4096
[    1.878540] UBI: max. sequence number:       0
[    1.912626] UBI: volume 0 ("rootfs") re-sized from 133 to 250 LEBs
[    1.920864] UBI: attached mtd2 to ubi0
[    1.924870] UBI: MTD device name:            "roota"
[    1.930090] UBI: MTD device size:            32 MiB
[    1.935234] UBI: number of good PEBs:        256
[    1.940085] UBI: number of bad PEBs:         0
[    1.944769] UBI: number of corrupted PEBs:   0
[    1.949438] UBI: max. allowed volumes:       128
[    1.954304] UBI: wear-leveling threshold:    2048
[    1.959248] UBI: number of internal volumes: 1
[    1.963930] UBI: number of user volumes:     1
[    1.968599] UBI: available PEBs:             0
[    1.973282] UBI: total number of reserved PEBs: 256
[    1.978409] UBI: number of PEBs reserved for bad PEB handling: 2
[    1.984738] UBI: max/mean erase counter: 1/0
[    1.989225] UBI: image sequence number:  1111822412
[    1.995070] UBI: background thread "ubi_bgt0d" started, PID 38
...
[    2.500482] UBIFS: start fixing up free space
[    3.876113] UBIFS: free space fixup complete
[    3.882701] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[    3.889027] UBIFS: file system size:   30347264 bytes (29636 KiB,
28 MiB, 239 LEBs)
[    3.897125] UBIFS: journal size:       8507392 bytes (8308 KiB, 8
MiB, 67 LEBs)
[    3.904833] UBIFS: media format:       w4/r0 (latest is w4/r0)
[    3.910965] UBIFS: default compressor: lzo
[    3.915285] UBIFS: reserved for root:  0 bytes (0 KiB)
[    3.924923] VFS: Mounted root (ubifs filesystem) on device 0:14.

Second (failed) boot:

[    0.000000] Kernel command line: console=ttyO0,115200n8 noinitrd
rootfstype=ubifs rootwait=1 ip=none ubifs.debug_msgs=0xffff
root=ubi0:rootfs rw ubi.mtd=roota,2048
...
[    1.370779] Creating 8 MTD partitions on "omap2-nand.0":
[    1.376394] 0x000000000000-0x000000300000 : "S1S2EN"
[    1.384281] 0x000000300000-0x000000800000 : "kernela"
[    1.392713] 0x000000800000-0x000002800000 : "roota"
[    1.412102] 0x000002800000-0x000003800000 : "ovla"
[    1.424706] 0x000003800000-0x000003d00000 : "kernelb"
[    1.433017] 0x000003d00000-0x000005d00000 : "rootb"
[    1.452362] 0x000005d00000-0x000006d00000 : "ovlb"
[    1.464988] 0x000006d00000-0x000020000000 : "NAND"
[    1.638603] UBI: attaching mtd2 to ubi0
[    1.642893] UBI: physical eraseblock size:   131072 bytes (128 KiB)
[    1.649480] UBI: logical eraseblock size:    126976 bytes
[    1.655171] UBI: smallest flash I/O unit:    2048
[    1.660115] UBI: sub-page size:              512
[    1.664984] UBI: VID header offset:          2048 (aligned 2048)
[    1.671297] UBI: data offset:                4096
[    1.937185] UBI: max. sequence number:       30
[    1.956309] UBI: attached mtd2 to ubi0
[    1.960262] UBI: MTD device name:            "roota"
[    1.965548] UBI: MTD device size:            32 MiB
[    1.970675] UBI: number of good PEBs:        256
[    1.975544] UBI: number of bad PEBs:         0
[    1.980214] UBI: number of corrupted PEBs:   0
[    1.984898] UBI: max. allowed volumes:       128
[    1.989750] UBI: wear-leveling threshold:    2048
[    1.994707] UBI: number of internal volumes: 1
[    1.999376] UBI: number of user volumes:     1
[    2.004062] UBI: available PEBs:             0
[    2.008731] UBI: total number of reserved PEBs: 256
[    2.013871] UBI: number of PEBs reserved for bad PEB handling: 2
[    2.020186] UBI: max/mean erase counter: 2/0
[    2.024687] UBI: image sequence number:  1111822412
[    2.030525] UBI: background thread "ubi_bgt0d" started, PID 38
...
[    2.400325] UBIFS: recovery needed
[    2.437280] UBIFS error (pid 1): replay_log_leb: first log node at
LEB 3:0 is not CS node
[    2.445923] UBIFS error (pid 1): replay_log_leb: log error detected
while replaying the log at LEB 3:0
[    2.455729]  magic          0x6101831
[    2.459575]  crc            0xb972ae2d
[    2.463532]  node_type      8 (reference node)
[    2.468202]  group_type     0 (no node group)
[    2.472795]  sqnum          11443
[    2.476276]  len            64
[    2.479484]  lnum           128
[    2.482797]  offs           71680
[    2.486278]  jhead          1
[    2.491687] List of all partitions:
[    2.495377] No filesystem could mount root, tried:  ubifs
[    2.501071] Kernel panic - not syncing: VFS: Unable to mount root
fs on unknown-block(0,0)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: UBIFS fails to mount on second boot
  2012-06-25  5:23 UBIFS fails to mount on second boot Iwo
@ 2012-06-27 14:10 ` Artem Bityutskiy
  2012-06-29  6:05   ` Iwo Mergler
  0 siblings, 1 reply; 11+ messages in thread
From: Artem Bityutskiy @ 2012-06-27 14:10 UTC (permalink / raw)
  To: Iwo; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 1764 bytes --]

Hi,

On Mon, 2012-06-25 at 15:23 +1000, Iwo wrote:
> I'm trying to use a UBIFS partition as the linux root filesystem.
> After flashing the
> UBI image, the system boots successfully and is functional.
> 
> On the second boot, however, mounting of the UBIFS fails and the kernel panics.

So this is reproducible? Can you provide me the ubifs image?

> It is possible to avoid the failure by performing a large number of filesystem
> operations (i.e. file system benchmark) during the first session.

Hmm, sounds strange.

> Has anyone seen this before? Suggestions?

Similar issue was reported once and the reported disappeared:
http://lists.infradead.org/pipermail/linux-mtd/2012-June/042046.html

> The hardware is a TI am335x processor with a 256MB, 2K pages, 128K erase block
> NAND flash.

SLC flash?

Did you validate your flash with MTD tests?

> As is all too common, the flash supports sub-pages, the driver does
> not. Thus, I'm
> forcing the VID header offset to 2048.

This means the driver is buggy: it does not support sub-pages but still
reports that it does. Just fix it instead.

> The kernel is a 3.1.0 with TI patches and UBI+UBIFS patches from
> git://git.infradead.org/~dedekind/ubifs-v3.1.git today. I had to change
> IS_ENABLED(DEBUG_FS) to IS_ENABLED(CONFIG_DEBUG_FS)
> to make that compile. The original v3.1.0 UBI/UBIFS code exhibited
> the same behaviour though.

I'll fix the IS_ENABLED stuff, sorry for that.

> The image is generated like so (using the latest MTD tools GIT):
> 
>  mkfs.ubifs -d staging -F -o root.ubifs -m 0x800 -e 126976 -c 511

Did you try to mount an empty volume and let UBIFS auto-format it, and
then reproduce the issue?

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-06-27 14:10 ` Artem Bityutskiy
@ 2012-06-29  6:05   ` Iwo Mergler
  2012-06-29 13:46     ` Artem Bityutskiy
  0 siblings, 1 reply; 11+ messages in thread
From: Iwo Mergler @ 2012-06-29  6:05 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

Hi Artem,

thanks for looking into this.

On Thu, 28 Jun 2012 00:10:19 +1000
Artem Bityutskiy <dedekind1@gmail.com> wrote:
> So this is reproducible? Can you provide me the ubifs image?

Yes, I'll send you the image and a copy of the whole rootfs
partition with the breakage separately.

>
> > It is possible to avoid the failure by performing a large number of
> > filesystem operations (i.e. file system benchmark) during the first
> > session.
>
> Hmm, sounds strange.

While trying to reproduce the problem, I have come across another
way to avoid it. If the boot scripts in the rootfs perform an
ubiformat, attach, mkvol & mount on an unrelated empty mtd
partition, the problem goes away.

Is there any global state shared between separate UBI/UBIFS
partitions?

>
> > Has anyone seen this before? Suggestions?
>
> Similar issue was reported once and the reported disappeared:
> http://lists.infradead.org/pipermail/linux-mtd/2012-June/042046.html

That sounds exactly like this thing.

>
> > The hardware is a TI am335x processor with a 256MB, 2K pages, 128K
> > erase block NAND flash.
>
> SLC flash?

Yes. A unusually resilient one for that, re-writing the same random
pattern into a page takes 2K-10K writes before the first bit error
appears. I'm also running an apparently unnecessary BCH8 ECC on it.

>
> Did you validate your flash with MTD tests?

Yes, see the end of this mail. All tests seem to pass, apart from
the subpage one.

>
> > As is all too common, the flash supports sub-pages, the driver does
> > not. Thus, I'm
> > forcing the VID header offset to 2048.
>
> This means the driver is buggy: it does not support sub-pages but
> still reports that it does. Just fix it instead.

I was under the impression that the subpage capability is extracted
from the ONFI information. So I take it there is a flag for the
driver to override that?

>
> Did you try to mount an empty volume and let UBIFS auto-format it, and
> then reproduce the issue?

No, UBIFS created from an empty partition work OK. In fact, doing that
also stops the rootfs mount failure on the second boot.


MTD tests:

root:~# mtd_debug info /dev/mtd2
mtd.type = MTD_NANDFLASH
mtd.flags = MTD_CAP_NANDFLASH
mtd.size = 33554432 (32M)
mtd.erasesize = 131072 (128K)
mtd.writesize = 2048 (2K)
mtd.oobsize = 64
regions = 0

root:/lib/modules/3.1.0+/kernel/drivers/mtd/tests# insmod mtd_nandecctest.ko
[  182.206980] mtd_nandecctest: ok - nand-ecc-256
[  182.207470] mtd_nandecctest: ok - nand-ecc-512

root:/lib/modules/3.1.0+/kernel/drivers/mtd/tests# insmod mtd_oobtest.ko
[  380.639390]
[  380.640974] =================================================
[  380.647159] mtd_oobtest: MTD device: 0
[  380.651116] mtd_oobtest: MTD device size 3145728, eraseblock size 131072, page size 2048, count of eraseblocks 24, pages per eraseblock 64, OOB size 64
[  380.665434] mtd_oobtest: scanning for bad eraseblocks
[  380.673032] mtd_oobtest: scanned 24 eraseblocks, 0 are bad
[  380.678806] mtd_oobtest: test 1 of 5
[  380.682613] mtd_oobtest: erasing whole device
[  380.698184] mtd_oobtest: erased 24 eraseblocks
[  380.702891] mtd_oobtest: writing OOBs of whole device
[  380.721776] mtd_oobtest: written up to eraseblock 0
[  381.043079] mtd_oobtest: written 24 eraseblocks
[  381.047846] mtd_oobtest: verifying all eraseblocks
[  381.057118] mtd_oobtest: verified up to eraseblock 0
[  381.162073] mtd_oobtest: verified 24 eraseblocks
[  381.166949] mtd_oobtest: test 2 of 5
[  381.170706] mtd_oobtest: erasing whole device
[  381.186723] mtd_oobtest: erased 24 eraseblocks
[  381.191397] mtd_oobtest: writing OOBs of whole device
[  381.210406] mtd_oobtest: written up to eraseblock 0
[  381.532281] mtd_oobtest: written 24 eraseblocks
[  381.537043] mtd_oobtest: verifying all eraseblocks
[  381.546248] mtd_oobtest: verified up to eraseblock 0
[  381.647383] mtd_oobtest: verified 24 eraseblocks
[  381.652265] mtd_oobtest: test 3 of 5
[  381.656021] mtd_oobtest: erasing whole device
[  381.672033] mtd_oobtest: erased 24 eraseblocks
[  381.676704] mtd_oobtest: writing OOBs of whole device
[  381.695205] mtd_oobtest: written up to eraseblock 0
[  382.007184] mtd_oobtest: written 24 eraseblocks
[  382.011992] mtd_oobtest: verifying all eraseblocks
[  382.025181] mtd_oobtest: verified up to eraseblock 0
[  382.221002] mtd_oobtest: verified 24 eraseblocks
[  382.225904] mtd_oobtest: test 4 of 5
[  382.229661] mtd_oobtest: erasing whole device
[  382.245284] mtd_oobtest: erased 24 eraseblocks
[  382.249957] mtd_oobtest: attempting to start write past end of OOB
[  382.256486] mtd_oobtest: an error is expected...
[  382.261344] mtd_oobtest: error occurred as expected
[  382.266488] mtd_oobtest: attempting to start read past end of OOB
[  382.272906] mtd_oobtest: an error is expected...
[  382.277760] mtd_oobtest: error occurred as expected
[  382.282901] mtd_oobtest: attempting to write past end of device
[  382.289122] mtd_oobtest: an error is expected...
[  382.293988] mtd_oobtest: error occurred as expected
[  382.299114] mtd_oobtest: attempting to read past end of device
[  382.305258] mtd_oobtest: an error is expected...
[  382.310110] mtd_oobtest: error occurred as expected
[  382.316509] mtd_oobtest: attempting to write past end of device
[  382.322774] mtd_oobtest: an error is expected...
[  382.327640] mtd_oobtest: error occurred as expected
[  382.332784] mtd_oobtest: attempting to read past end of device
[  382.338914] mtd_oobtest: an error is expected...
[  382.343780] mtd_oobtest: error occurred as expected
[  382.348905] mtd_oobtest: test 5 of 5
[  382.352673] mtd_oobtest: erasing whole device
[  382.368233] mtd_oobtest: erased 24 eraseblocks
[  382.372927] mtd_oobtest: writing OOBs of whole device
[  382.378483] mtd_oobtest: written up to eraseblock 0
[  382.383839] mtd_oobtest: written up to eraseblock 0
[  382.398650] mtd_oobtest: written 23 eraseblocks
[  382.403440] mtd_oobtest: verifying all eraseblocks
[  382.408613] mtd_oobtest: verified up to eraseblock 0
[  382.416748] mtd_oobtest: verified 23 eraseblocks
[  382.421625] mtd_oobtest: finished with 0 errors
[  382.426402] =================================================

root:/lib/modules/3.1.0+/kernel/drivers/mtd/tests# insmod mtd_pagetest.ko
[  412.379751]
[  412.381334] =================================================
[  412.387509] mtd_pagetest: MTD device: 0
[  412.391558] mtd_pagetest: MTD device size 3145728, eraseblock size 131072, page size 2048, count of eraseblocks 24, pages per eraseblock 64, OOB size 64
[  412.405962] mtd_pagetest: scanning for bad eraseblocks
[  412.413604] mtd_pagetest: scanned 24 eraseblocks, 0 are bad
[  412.419468] mtd_pagetest: erasing whole device
[  412.435239] mtd_pagetest: erased 24 eraseblocks
[  412.440004] mtd_pagetest: writing whole device
[  412.484655] mtd_pagetest: written up to eraseblock 0
[  413.405323] mtd_pagetest: written 24 eraseblocks
[  413.410192] mtd_pagetest: verifying all eraseblocks
[  413.607643] mtd_pagetest: verified up to eraseblock 0
[  418.045976] mtd_pagetest: verified 24 eraseblocks
[  418.050937] mtd_pagetest: crosstest
[  418.055640] mtd_pagetest: reading page at 0x0
[  418.060719] mtd_pagetest: reading page at 0x2ff800
[  418.066259] mtd_pagetest: reading page at 0x0
[  418.071319] mtd_pagetest: verifying pages read at 0x0 match
[  418.077249] mtd_pagetest: crosstest ok
[  418.081189] mtd_pagetest: erasecrosstest
[  418.085326] mtd_pagetest: erasing block 0
[  418.090717] mtd_pagetest: writing 1st page of block 0
[  418.096767] mtd_pagetest: reading 1st page of block 0
[  418.102616] mtd_pagetest: verifying 1st page of block 0
[  418.108152] mtd_pagetest: erasing block 0
[  418.112852] mtd_pagetest: writing 1st page of block 0
[  418.118789] mtd_pagetest: erasing block 23
[  418.123988] mtd_pagetest: reading 1st page of block 0
[  418.129789] mtd_pagetest: verifying 1st page of block 0
[  418.135358] mtd_pagetest: erasecrosstest ok
[  418.139755] mtd_pagetest: erasetest
[  418.143437] mtd_pagetest: erasing block 0
[  418.148211] mtd_pagetest: writing 1st page of block 0
[  418.154184] mtd_pagetest: erasing block 0
[  418.158884] mtd_pagetest: reading 1st page of block 0
[  418.164738] mtd_pagetest: verifying 1st page of block 0 is all 0xff
[  418.171365] mtd_pagetest: erasetest ok
[  418.175326] mtd_pagetest: finished with 0 errors
[  418.180184] =================================================

root:/lib/modules/3.1.0+/kernel/drivers/mtd/tests# insmod mtd_readtest.ko
[  474.413743]
[  474.415326] =================================================
[  474.421419] mtd_readtest: MTD device: 0
[  474.425546] mtd_readtest: MTD device size 3145728, eraseblock size 131072, page size 2048, count of eraseblocks 24, pages per eraseblock 64, OOB size 64
[  474.439952] mtd_readtest: scanning for bad eraseblocks
[  474.447493] mtd_readtest: scanned 24 eraseblocks, 0 are bad
[  474.453417] mtd_readtest: testing page read
[  475.300355] mtd_readtest: finished
[  475.303986] =================================================

root:/lib/modules/3.1.0+/kernel/drivers/mtd/tests# insmod mtd_speedtest.ko
[  518.657209]
[  518.658792] =================================================
[  518.664968] mtd_speedtest: MTD device: 0
[  518.669107] mtd_speedtest: MTD device size 3145728, eraseblock size 131072, page size 2048, count of eraseblocks 24, pages per eraseblock 64, OOB size 64
[  518.686489] mtd_speedtest: scanning for bad eraseblocks
[  518.694125] mtd_speedtest: scanned 24 eraseblocks, 0 are bad
[  518.719583] mtd_speedtest: testing eraseblock write speed
[  519.609520] mtd_speedtest: eraseblock write speed is 3471 KiB/s
[  519.615799] mtd_speedtest: testing eraseblock read speed
[  520.361942] mtd_speedtest: eraseblock read speed is 4145 KiB/s
[  520.388366] mtd_speedtest: testing page write speed
[  521.280191] mtd_speedtest: page write speed is 3463 KiB/s
[  521.285923] mtd_speedtest: testing page read speed
[  522.033009] mtd_speedtest: page read speed is 4134 KiB/s
[  522.058862] mtd_speedtest: testing 2 page write speed
[  522.949521] mtd_speedtest: 2 page write speed is 3467 KiB/s
[  522.955441] mtd_speedtest: testing 2 page read speed
[  523.699606] mtd_speedtest: 2 page read speed is 4162 KiB/s
[  523.705409] mtd_speedtest: Testing erase speed
[  523.730343] mtd_speedtest: erase speed is 153600 KiB/s
[  523.735778] mtd_speedtest: Testing 2x multi-block erase speed
[  523.752796] mtd_speedtest: 2x multi-block erase speed is 307200 KiB/s
[  523.759569] mtd_speedtest: Testing 4x multi-block erase speed
[  523.776609] mtd_speedtest: 4x multi-block erase speed is 307200 KiB/s
[  523.783411] mtd_speedtest: Testing 8x multi-block erase speed
[  523.800412] mtd_speedtest: 8x multi-block erase speed is 307200 KiB/s
[  523.807214] mtd_speedtest: Testing 16x multi-block erase speed
[  523.825476] mtd_speedtest: 16x multi-block erase speed is 256000 KiB/s
[  523.832394] mtd_speedtest: Testing 32x multi-block erase speed
[  523.849498] mtd_speedtest: 32x multi-block erase speed is 307200 KiB/s
[  523.856396] mtd_speedtest: Testing 64x multi-block erase speed
[  523.873493] mtd_speedtest: 64x multi-block erase speed is 307200 KiB/s
[  523.880356] mtd_speedtest: finished
[  523.884054] =================================================

root:/lib/modules/3.1.0+/kernel/drivers/mtd/tests# insmod mtd_subpagetest.ko
[ 1669.603950]
[ 1669.605533] =================================================
[ 1669.611679] mtd_subpagetest: MTD device: 0
[ 1669.616002] mtd_subpagetest: MTD device size 3145728, eraseblock size 131072, page size 2048, subpage size 512, count of eraseblocks 24, pages per eraseblock 64, OOB size 64
[ 1669.632327] mtd_subpagetest: scanning for bad eraseblocks
[ 1669.640123] mtd_subpagetest: scanned 24 eraseblocks, 0 are bad
[ 1669.646327] mtd_subpagetest: erasing whole device
[ 1669.662277] mtd_subpagetest: erased 24 eraseblocks
[ 1669.667315] mtd_subpagetest: writing whole device
[ 1669.673635] mtd_subpagetest: written up to eraseblock 0
[ 1669.706083] mtd_subpagetest: written 24 eraseblocks
[ 1669.711211] mtd_subpagetest: verifying all eraseblocks
[ 1669.717551] mtd_subpagetest: error: read failed at 0x0
[ 1669.722998] mtd_subpagetest: error -74 occurred
[ 1669.727770] =================================================

______________________________________________________________________
This communication contains information which may be confidential or privileged. The information is intended solely for the use of the individual or entity named above.  If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited.  If you have received this communication in error, please notify me by telephone immediately.
______________________________________________________________________

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-06-29  6:05   ` Iwo Mergler
@ 2012-06-29 13:46     ` Artem Bityutskiy
  2012-07-02  7:49       ` Iwo Mergler
  0 siblings, 1 reply; 11+ messages in thread
From: Artem Bityutskiy @ 2012-06-29 13:46 UTC (permalink / raw)
  To: Iwo Mergler; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 3872 bytes --]

On Fri, 2012-06-29 at 16:05 +1000, Iwo Mergler wrote:
> > > It is possible to avoid the failure by performing a large number of
> > > filesystem operations (i.e. file system benchmark) during the first
> > > session.
> >
> > Hmm, sounds strange.
> 
> While trying to reproduce the problem, I have come across another
> way to avoid it. If the boot scripts in the rootfs perform an
> ubiformat, attach, mkvol & mount on an unrelated empty mtd
> partition, the problem goes away.
> 
> Is there any global state shared between separate UBI/UBIFS
> partitions?

No. Do you MTD partitions overlap? What is in /proc/mtd ?

> > This means the driver is buggy: it does not support sub-pages but
> > still reports that it does. Just fix it instead.
> 
> I was under the impression that the subpage capability is extracted
> from the ONFI information. So I take it there is a flag for the
> driver to override that?

I do not know your system, but if your flash chip supports subpages, but
the ECC you use does not allow them, the driver should report that
sub-pages are not supported..

> > Did you try to mount an empty volume and let UBIFS auto-format it, and
> > then reproduce the issue?
> 
> No, UBIFS created from an empty partition work OK. In fact, doing that
> also stops the rootfs mount failure on the second boot.

Sounds like this is not UBIFS fault but rather like a side-effect of
something strange happening elsewhere. Probably it is related to how you
flash it.

We had the following issue in the past.

1. You have some UBI on your flash. Then you want to flash an new image.
2. The flasher for some reason did not erase some PEBs of the partition.
Probably because Linux view of the partition and flashers did not 100%
match. Anyway, on or few PEBs were not erased in the end of the
partition. Lets call them "ghost PEBs".
3. We flashed new image.
4. UBI attached the partition, the ghost PEBs were scanned and treated
as valid PEBs and their data appeared in one of the volumes, because
their generation numbers were higher than in PEBs from the new image
(the generation number is in the UBI headers). The ghost data, instead
of valid data, was read by UBIFS. And we had strange corruptions.

We introduced so-called "image sequence number" to catch such issues. It
is stored in the EC header. All EC headers on the MTD device have to
have the same. Every time we generate an image - we pick random one. So
if there are ghost PEBs, we notice this because they have a different
image sequence number.

See 'image_seq' in drivers/mtd/ubi/ubi-media.h.

Can this problem affect you as well?

If you use 'ubiformat' for flashing your images, it will generate a
random image sequence number every time it flashes. So it won't use the
one in the image.

Do you use ubiformat for flashing? If not, try to re-generate your image
- ubinize will put a different number there, and flash it and see what
happens. You'd get an error like this:

UBI error: process_eb: bad image sequence number 3726164569 in PEB 47,
expected 642536469

Additional thoughts...

I think what could be more interesting if you could enable debugging for
real. The docs on the web-site are out of date and we switched to
dynamic debugging, so you need to enable the debugging messages
differently. I need to write a howto, and I do not know how to do this
via kernel cmdline so far, need to find out. I know how to do this via
debugfs. But check Documentation/dynamic-debug-howto.txt.

The image is not very helpful. UBI or UBIFS messages would probably
allow to track what UBI/UBIFS is doing to the "faulty" LEB and
corresponding PEB and verify that it is ok. But I really have a strong
feeling it is not UBI/UBIFS fault, so may be we'd spend time to just
prove this.

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-06-29 13:46     ` Artem Bityutskiy
@ 2012-07-02  7:49       ` Iwo Mergler
  2012-07-03  1:16         ` Iwo Mergler
  2012-07-15  8:25         ` Artem Bityutskiy
  0 siblings, 2 replies; 11+ messages in thread
From: Iwo Mergler @ 2012-07-02  7:49 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 6269 bytes --]

On Fri, 29 Jun 2012 23:46:17 +1000
Artem Bityutskiy <dedekind1@gmail.com> wrote:

> On Fri, 2012-06-29 at 16:05 +1000, Iwo Mergler wrote:
> > > > It is possible to avoid the failure by performing a large
> > > > number of filesystem operations (i.e. file system benchmark)
> > > > during the first session.
> > >
> > > Hmm, sounds strange.
> > 
> > While trying to reproduce the problem, I have come across another
> > way to avoid it. If the boot scripts in the rootfs perform an
> > ubiformat, attach, mkvol & mount on an unrelated empty mtd
> > partition, the problem goes away.
> > 
> > Is there any global state shared between separate UBI/UBIFS
> > partitions?
> 
> No. Do you MTD partitions overlap? What is in /proc/mtd ?

root:~# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00300000 00020000 "S1S2EN"
mtd1: 00500000 00020000 "kernela"
mtd2: 02000000 00020000 "roota"
mtd3: 01000000 00020000 "ovla"
mtd4: 00500000 00020000 "kernelb"
mtd5: 02000000 00020000 "rootb"
mtd6: 01000000 00020000 "ovlb"
mtd7: 19300000 00020000 "NAND"

The partitions are passed from U-Boot on the command line.

> 
> > > This means the driver is buggy: it does not support sub-pages but
> > > still reports that it does. Just fix it instead.
> > 
> > I was under the impression that the subpage capability is extracted
> > from the ONFI information. So I take it there is a flag for the
> > driver to override that?
> 
> I do not know your system, but if your flash chip supports subpages,
> but the ECC you use does not allow them, the driver should report that
> sub-pages are not supported..

It's similar to a BeagleBone, but with NAND FLASH, I'm using the
omap2 ECC driver.

So, if I set NAND_NO_SUBPAGE_WRITE in chip.options, UBI will
leave the subpages alone?

> 
> > > Did you try to mount an empty volume and let UBIFS auto-format
> > > it, and then reproduce the issue?
> > 
> > No, UBIFS created from an empty partition work OK. In fact, doing
> > that also stops the rootfs mount failure on the second boot.
> 
> Sounds like this is not UBIFS fault but rather like a side-effect of
> something strange happening elsewhere. Probably it is related to how
> you flash it.

Well, I'm just copying the UBI image into the NAND partition in U-Boot,
while skipping pages containing all 0xFF. It would also skip bad blocks,
but this NAND hasn't got any.

> 
> We had the following issue in the past.
> 
> 1. You have some UBI on your flash. Then you want to flash an new
> image. 2. The flasher for some reason did not erase some PEBs of the
> partition. Probably because Linux view of the partition and flashers
> did not 100% match. Anyway, on or few PEBs were not erased in the end
> of the partition. Lets call them "ghost PEBs".
> 3. We flashed new image.
> 4. UBI attached the partition, the ghost PEBs were scanned and treated
> as valid PEBs and their data appeared in one of the volumes, because
> their generation numbers were higher than in PEBs from the new image
> (the generation number is in the UBI headers). The ghost data, instead
> of valid data, was read by UBIFS. And we had strange corruptions.
> 
> We introduced so-called "image sequence number" to catch such issues.
> It is stored in the EC header. All EC headers on the MTD device have
> to have the same. Every time we generate an image - we pick random
> one. So if there are ghost PEBs, we notice this because they have a
> different image sequence number.
> 
> See 'image_seq' in drivers/mtd/ubi/ubi-media.h.
> 
> Can this problem affect you as well?

The image is re-created with ubinize on every build. It could affect
me, if I was re-flashing the same image a second time - the sequence
numbers would match then.

But the problem happens with a freshly generated image as well.

> 
> If you use 'ubiformat' for flashing your images, it will generate a
> random image sequence number every time it flashes. So it won't use
> the one in the image.
> 
> Do you use ubiformat for flashing? If not, try to re-generate your
> image
> - ubinize will put a different number there, and flash it and see what
> happens. You'd get an error like this:
> 
> UBI error: process_eb: bad image sequence number 3726164569 in PEB 47,
> expected 642536469

This doesn't seem to happen. Also, the behaviour is no different
between erasing the whole flash and only erasing the rootfs
partition.

> 
> Additional thoughts...
> 
> I think what could be more interesting if you could enable debugging
> for real. The docs on the web-site are out of date and we switched to
> dynamic debugging, so you need to enable the debugging messages
> differently. I need to write a howto, and I do not know how to do this
> via kernel cmdline so far, need to find out. I know how to do this via
> debugfs. But check Documentation/dynamic-debug-howto.txt.

Thanks for the hint, I have. Turns out it's fairly simple, just pass
ddebug_query="module ubifs +p" on the command line. Unfortunately,
the way ddebug parses the line, you can only have one rule there.

I think there may be a change in newer kernels that allows to use
a semicolon as a separator between rules, but my kernel doesn't
allow that.

So I had to make two debug runs, one each for ubi and ubifs debugging
enabled.

The attached archive contains 4 log files:

konsole_ubi_1.txt   = first (successful) boot with UBI debug enabled
konsole_ubi_2.txt   = second (failed) boot with UBI debug enabled
konsole_ubifs_1.txt = first (successful) boot with UBIFS debug enabled
konsole_ubifs_2.txt = second (failed) boot with UBIFS debug enabled


Best regards,

Iwo

______________________________________________________________________
This communication contains information which may be confidential or privileged. The information is intended solely for the use of the individual or entity named above.  If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited.  If you have received this communication in error, please notify me by telephone immediately.
______________________________________________________________________

[-- Attachment #2: ubi_ubifs_debug.tar.bz2 --]
[-- Type: application/octet-stream, Size: 129537 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-07-02  7:49       ` Iwo Mergler
@ 2012-07-03  1:16         ` Iwo Mergler
  2012-07-14  4:39           ` Artem Bityutskiy
  2012-07-15  8:25         ` Artem Bityutskiy
  1 sibling, 1 reply; 11+ messages in thread
From: Iwo Mergler @ 2012-07-03  1:16 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd


Hi Artem,


I can confirm now that a sync after the first boot also fixes the problem, just like
James reported in

http://lists.infradead.org/pipermail/linux-mtd/2012-June/042046.html

If I power-cycle before sync, the next boot fails. Power cycle after sync,
and the next (and all subsequent) boots are fine.

Below is what happens during sync, with both UBIFS & UBI debug enabled.


Best regards,

Iwo


root:~# sync
[  656.066204] UBIFS DBG gen: 'sync' in dir ino 157
[  656.071086] UBIFS DBG tnc: search key (157, direntry, 0x1c79da4)
[  656.077483] UBIFS DBG tnc: found 1, lvl 0, n 1
[  656.082186] UBIFS DBG io: LEB 21:48136, direntry node, length 61
[  656.088510] UBI DBG gen: read 61 bytes from LEB 0:21:48136
[  656.094305] UBI DBG eba: read 61 bytes from offset 48136 of LEB 0:21, PEB 23
[  656.101737] UBI DBG io: read 61 bytes from PEB 23:52232
[  656.107791] UBIFS DBG gen: inode 255
[  656.111626] UBIFS DBG tnc: search key (255, inode)
[  656.116670] UBIFS DBG tnc: found 1, lvl 0, n 5
[  656.121348] UBIFS DBG tnc: LEB 21:47968, key (255, inode)
[  656.127049] UBIFS DBG io: LEB 21:47968, inode node, length 167
[  656.133207] UBI DBG gen: read 167 bytes from LEB 0:21:47968
[  656.139073] UBI DBG eba: read 167 bytes from offset 47968 of LEB 0:21, PEB 23
[  656.146593] UBI DBG io: read 167 bytes from PEB 23:52064
[  656.155255] UBIFS DBG gen: xattr 'security.capability', ino 233 ('busybox'), buf size 20
[  656.163902] UBIFS DBG tnc: search key (233, xentry, 0x10888ae6)
[  656.170136] UBIFS DBG tnc: found 0, lvl 0, n 0
[  656.179385] UBIFS DBG cmt: start
[  656.182952] UBIFS DBG log: add ref to LEB 249:6144 for jhead 2 (data)
[  656.189767] UBI DBG gen: unmap LEB 0:4
[  656.193746] UBIFS DBG log: writing commit start at LEB 4:0, len 2048
[  656.200432] UBI DBG gen: write 2048 bytes to LEB 0:4:0
[  656.205872] UBI DBG wl: PEB 198 EC 1
[  656.209634] UBI DBG wl: added PEB 198 EC 1 to the protection queue
[  656.216154] UBI DBG eba: write VID hdr and 2048 bytes at offset 0 of LEB 0:4, PEB 198
[  656.224405] UBI DBG io: write VID header to PEB 198
[  656.229543] UBI DBG io: write 512 bytes to PEB 198:2048
[  656.235689] UBI DBG io: write 2048 bytes to PEB 198:4096
[  656.241950] UBIFS DBG log: preserve 63:124928, jhead 1 (base), bud bytes 2048, cmt_bud_bytes 2048
[  656.251306] UBIFS DBG log: remove 127:118784, jhead 1 (base), bud bytes 8192, cmt_bud_bytes 10240
[  656.260661] UBIFS DBG log: preserve 249:0, jhead 2 (data), bud bytes 6144, cmt_bud_bytes 16384
[  656.269773] UBIFS DBG cmt: committing 20 znodes
[  656.274555] UBIFS DBG cmt: need about 0 empty LEBS for TNC commit
[  656.280982] UBIFS DBG lp: LEB 130, free 6144, dirty 3712, flags 48
[  656.287500] UBIFS DBG lp: LEB 130, free 2048, dirty 5136, flags 48
[  656.294026] UBIFS DBG lp: LEB 130, free 6144, dirty 3712, flags 48
[  656.300537] UBIFS DBG find: found 1 dirty index LEBs
[  656.305777] UBIFS DBG find: dirtiest index LEB is 129 with dirty 640 and free 0
[  656.313484] UBIFS DBG cmt: number of index LEBs 2
[  656.318428] UBIFS DBG cmt: size of index 246128
[  656.323208] UBIFS DBG lp: 
[  656.326077] UBIFS DBG cmt: committing 13 cnodes
[  656.330839] UBIFS DBG lp: committing 13 cnodes
[  656.335539] UBIFS DBG lp: LEB 8 free 124928 dirty 1529 to 122880 +1861
[  656.342422] UBIFS DBG cmt: 0 orphans to commit
[  656.347212] UBI DBG gen: write 2048 bytes to LEB 0:130:120832
[  656.353292] UBI DBG eba: write 2048 bytes at offset 120832 of LEB 0:130, PEB 194
[  656.361088] UBI DBG io: write 2048 bytes to PEB 194:124928
[  656.367475] UBI DBG gen: write 2048 bytes to LEB 0:130:122880
[  656.373545] UBI DBG eba: write 2048 bytes at offset 122880 of LEB 0:130, PEB 194
[  656.381323] UBI DBG io: write 2048 bytes to PEB 194:126976
[  656.390913] UBIFS DBG cmt: TNC height is 5
[  656.395293] UBIFS DBG lp: 
[  656.398193] UBI DBG gen: write 2048 bytes to LEB 0:8:2048
[  656.403897] UBI DBG eba: write 2048 bytes at offset 2048 of LEB 0:8, PEB 169
[  656.411310] UBI DBG io: write 2048 bytes to PEB 169:6144
[  656.417913] UBIFS DBG lp: LPT root is at 8:2223
[  656.422727] UBIFS DBG lp: LPT head is at 8:4096
[  656.427491] UBIFS DBG lp: LPT ltab is at 8:2048
[  656.432278] UBIFS DBG log: old tail was LEB 3:0, new tail is LEB 4:0
[  656.438981] UBIFS DBG io: LEB 1:4096, master node, length 512 (aligned 2048)
[  656.446451] UBI DBG gen: write 2048 bytes to LEB 0:1:4096
[  656.452153] UBI DBG eba: write 2048 bytes at offset 4096 of LEB 0:1, PEB 167
[  656.459567] UBI DBG io: write 2048 bytes to PEB 167:8192
[  656.467262] UBIFS DBG io: LEB 2:4096, master node, length 512 (aligned 2048)
[  656.474743] UBI DBG gen: write 2048 bytes to LEB 0:2:4096
[  656.480426] UBI DBG eba: write 2048 bytes at offset 4096 of LEB 0:2, PEB 168
[  656.487858] UBI DBG io: write 2048 bytes to PEB 168:8192
[  656.494335] UBIFS DBG lp: LEB 8 add 12 to 3390
[  656.499012] UBIFS DBG lp: LEB 8 add 11 to 3402
[  656.503725] UBIFS DBG lp: LEB 8 add 12 to 3413
[  656.508398] UBIFS DBG lp: LEB 8 add 12 to 3425
[  656.513088] UBIFS DBG lp: LEB 8 add 12 to 3437
[  656.517760] UBIFS DBG lp: LEB 8 add 17 to 3449
[  656.522450] UBIFS DBG lp: LEB 127, free 0, dirty 9128, flags 16
[  656.528678] UBIFS DBG lp: LEB 127, free -2147483647, dirty -2147483647, flags 0
[  656.536383] UBIFS DBG lp: LEB 127, free 0, dirty 9128, flags 16
[  656.542636] UBIFS DBG log: unmap log LEB 3
[  656.546942] UBI DBG gen: unmap LEB 0:3
[  656.550887] UBI DBG eba: erase LEB 0:3, PEB 196
[  656.555666] UBI DBG wl: PEB 196
[  656.558977] UBI DBG wl: deleted PEB 196 from the protection queue
[  656.565403] UBI DBG wl: schedule erasure of PEB 196, EC 1, torture 0
[  656.572110] UBIFS DBG cmt: commit end
[  656.577344] UBI DBG wl: erase PEB 196 EC 1 LEB 0:3
[  656.582431] UBI DBG wl: erase PEB 196, old EC 1
[  656.587200] UBI DBG io: erase PEB 196
root:~# [  656.593077] UBI DBG wl: erased PEB 196, new EC 2
[  656.598441] UBI DBG io: write EC header to PEB 196
[  656.603570] UBI DBG io: write 512 bytes to PEB 196:0

________________________________________
From: Iwo Mergler
Sent: Monday, 2 July 2012 5:49 PM
To: dedekind1@gmail.com
Cc: linux-mtd@lists.infradead.org
Subject: RE: UBIFS fails to mount on second boot

On Fri, 29 Jun 2012 23:46:17 +1000
Artem Bityutskiy <dedekind1@gmail.com> wrote:

> On Fri, 2012-06-29 at 16:05 +1000, Iwo Mergler wrote:
> > > > It is possible to avoid the failure by performing a large
> > > > number of filesystem operations (i.e. file system benchmark)
> > > > during the first session.
> > >
> > > Hmm, sounds strange.
> >
> > While trying to reproduce the problem, I have come across another
> > way to avoid it. If the boot scripts in the rootfs perform an
> > ubiformat, attach, mkvol & mount on an unrelated empty mtd
> > partition, the problem goes away.
> >
> > Is there any global state shared between separate UBI/UBIFS
> > partitions?
>
> No. Do you MTD partitions overlap? What is in /proc/mtd ?

root:~# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00300000 00020000 "S1S2EN"
mtd1: 00500000 00020000 "kernela"
mtd2: 02000000 00020000 "roota"
mtd3: 01000000 00020000 "ovla"
mtd4: 00500000 00020000 "kernelb"
mtd5: 02000000 00020000 "rootb"
mtd6: 01000000 00020000 "ovlb"
mtd7: 19300000 00020000 "NAND"

The partitions are passed from U-Boot on the command line.

>
> > > This means the driver is buggy: it does not support sub-pages but
> > > still reports that it does. Just fix it instead.
> >
> > I was under the impression that the subpage capability is extracted
> > from the ONFI information. So I take it there is a flag for the
> > driver to override that?
>
> I do not know your system, but if your flash chip supports subpages,
> but the ECC you use does not allow them, the driver should report that
> sub-pages are not supported..

It's similar to a BeagleBone, but with NAND FLASH, I'm using the
omap2 ECC driver.

So, if I set NAND_NO_SUBPAGE_WRITE in chip.options, UBI will
leave the subpages alone?

>
> > > Did you try to mount an empty volume and let UBIFS auto-format
> > > it, and then reproduce the issue?
> >
> > No, UBIFS created from an empty partition work OK. In fact, doing
> > that also stops the rootfs mount failure on the second boot.
>
> Sounds like this is not UBIFS fault but rather like a side-effect of
> something strange happening elsewhere. Probably it is related to how
> you flash it.

Well, I'm just copying the UBI image into the NAND partition in U-Boot,
while skipping pages containing all 0xFF. It would also skip bad blocks,
but this NAND hasn't got any.

>
> We had the following issue in the past.
>
> 1. You have some UBI on your flash. Then you want to flash an new
> image. 2. The flasher for some reason did not erase some PEBs of the
> partition. Probably because Linux view of the partition and flashers
> did not 100% match. Anyway, on or few PEBs were not erased in the end
> of the partition. Lets call them "ghost PEBs".
> 3. We flashed new image.
> 4. UBI attached the partition, the ghost PEBs were scanned and treated
> as valid PEBs and their data appeared in one of the volumes, because
> their generation numbers were higher than in PEBs from the new image
> (the generation number is in the UBI headers). The ghost data, instead
> of valid data, was read by UBIFS. And we had strange corruptions.
>
> We introduced so-called "image sequence number" to catch such issues.
> It is stored in the EC header. All EC headers on the MTD device have
> to have the same. Every time we generate an image - we pick random
> one. So if there are ghost PEBs, we notice this because they have a
> different image sequence number.
>
> See 'image_seq' in drivers/mtd/ubi/ubi-media.h.
>
> Can this problem affect you as well?

The image is re-created with ubinize on every build. It could affect
me, if I was re-flashing the same image a second time - the sequence
numbers would match then.

But the problem happens with a freshly generated image as well.

>
> If you use 'ubiformat' for flashing your images, it will generate a
> random image sequence number every time it flashes. So it won't use
> the one in the image.
>
> Do you use ubiformat for flashing? If not, try to re-generate your
> image
> - ubinize will put a different number there, and flash it and see what
> happens. You'd get an error like this:
>
> UBI error: process_eb: bad image sequence number 3726164569 in PEB 47,
> expected 642536469

This doesn't seem to happen. Also, the behaviour is no different
between erasing the whole flash and only erasing the rootfs
partition.

>
> Additional thoughts...
>
> I think what could be more interesting if you could enable debugging
> for real. The docs on the web-site are out of date and we switched to
> dynamic debugging, so you need to enable the debugging messages
> differently. I need to write a howto, and I do not know how to do this
> via kernel cmdline so far, need to find out. I know how to do this via
> debugfs. But check Documentation/dynamic-debug-howto.txt.

Thanks for the hint, I have. Turns out it's fairly simple, just pass
ddebug_query="module ubifs +p" on the command line. Unfortunately,
the way ddebug parses the line, you can only have one rule there.

I think there may be a change in newer kernels that allows to use
a semicolon as a separator between rules, but my kernel doesn't
allow that.

So I had to make two debug runs, one each for ubi and ubifs debugging
enabled.

The attached archive contains 4 log files:

konsole_ubi_1.txt   = first (successful) boot with UBI debug enabled
konsole_ubi_2.txt   = second (failed) boot with UBI debug enabled
konsole_ubifs_1.txt = first (successful) boot with UBIFS debug enabled
konsole_ubifs_2.txt = second (failed) boot with UBIFS debug enabled


Best regards,

Iwo

______________________________________________________________________
This communication contains information which may be confidential or privileged. The information is intended solely for the use of the individual or entity named above.  If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited.  If you have received this communication in error, please notify me by telephone immediately.
______________________________________________________________________

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-07-03  1:16         ` Iwo Mergler
@ 2012-07-14  4:39           ` Artem Bityutskiy
  2012-07-18  7:05             ` Iwo Mergler
  0 siblings, 1 reply; 11+ messages in thread
From: Artem Bityutskiy @ 2012-07-14  4:39 UTC (permalink / raw)
  To: Iwo Mergler; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 1639 bytes --]

On Tue, 2012-07-03 at 11:16 +1000, Iwo Mergler wrote:
> Thanks for the hint, I have. Turns out it's fairly simple, just pass
> ddebug_query="module ubifs +p" on the command line. Unfortunately,
> the way ddebug parses the line, you can only have one rule there.

Iwo, if you send more logs, please, include function names because
otherwise the logs are very difficult to follow. I think it wold be +pf.
> 
> I think there may be a change in newer kernels that allows to use
> a semicolon as a separator between rules, but my kernel doesn't
> allow that.
> 
> So I had to make two debug runs, one each for ubi and ubifs debugging
> enabled.
> 
> The attached archive contains 4 log files:
> 
> konsole_ubi_1.txt   = first (successful) boot with UBI debug enabled
> konsole_ubi_2.txt   = second (failed) boot with UBI debug enabled
> konsole_ubifs_1.txt = first (successful) boot with UBIFS debug enabled
> konsole_ubifs_2.txt = second (failed) boot with UBIFS debug enabled 

I suspect this is a bug in the free space fixup code - my level of
confidence on this is 80%. Could you please verify this by removing the
-F mkfs.ubifs parameter or using the patch below. Thanks!

diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 5862dd9..3398114 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -1295,11 +1295,13 @@ static int mount_ubifs(struct ubifs_info *c)
 	if (err)
 		goto out_master;
 
+#if 0
 	if (!c->ro_mount && c->space_fixup) {
 		err = ubifs_fixup_free_space(c);
 		if (err)
 			goto out_lpt;
 	}
+#endif
 
 	if (!c->ro_mount) {
 		/*

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-07-02  7:49       ` Iwo Mergler
  2012-07-03  1:16         ` Iwo Mergler
@ 2012-07-15  8:25         ` Artem Bityutskiy
  1 sibling, 0 replies; 11+ messages in thread
From: Artem Bityutskiy @ 2012-07-15  8:25 UTC (permalink / raw)
  To: Iwo Mergler; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 782 bytes --]

On Mon, 2012-07-02 at 17:49 +1000, Iwo Mergler wrote:
> It's similar to a BeagleBone, but with NAND FLASH, I'm using the
> omap2 ECC driver.
> 
> So, if I set NAND_NO_SUBPAGE_WRITE in chip.options, UBI will
> leave the subpages alone?

I think so, but it is better to dig and find out why sub-pages to not
work - this sounds like a bug. With sub-pages support you'll waste less
flash space because UBI overhead will be smaller.

> So I had to make two debug runs, one each for ubi and ubifs debugging
> enabled.

I've reproduced this bug using nandsim and fixed it. I've sent you the
fix few minutes ago. Could you please test it? After I've got a
"Tested-by:", I'll send it to Linus and it will also go to -stable.
Thanks!

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-07-14  4:39           ` Artem Bityutskiy
@ 2012-07-18  7:05             ` Iwo Mergler
  2012-07-18  7:21               ` Artem Bityutskiy
  0 siblings, 1 reply; 11+ messages in thread
From: Iwo Mergler @ 2012-07-18  7:05 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd


Artem Bityutskiy <dedekind1@gmail.com> wrote:
> I suspect this is a bug in the free space fixup code - my level of
> confidence on this is 80%. Could you please verify this by removing
> the -F mkfs.ubifs parameter or using the patch below. Thanks!

You're right, the problem doesn't happen without fixup.

I thought I was not flashing empty pages in U-Boot, but that turned
out not to be quite true. Without fixup, the boot died in a hailstorm
of asserts. I fixed that, but now the flashing takes noticeably longer.

In other words, I quite like the fixup feature and would prefer if it doesn't
go away permanently. Otherwise:

Tested-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>

> 
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index 5862dd9..3398114 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -1295,11 +1295,13 @@ static int mount_ubifs(struct ubifs_info *c)
>  	if (err)
>  		goto out_master;
>  
> +#if 0
>  	if (!c->ro_mount && c->space_fixup) {
>  		err = ubifs_fixup_free_space(c);
>  		if (err)
>  			goto out_lpt;
>  	}
> +#endif
>  
>  	if (!c->ro_mount) {
>  		/*
> 

Best regards,

Iwo

______________________________________________________________________
This communication contains information which may be confidential or privileged. The information is intended solely for the use of the individual or entity named above.  If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited.  If you have received this communication in error, please notify me by telephone immediately.
______________________________________________________________________

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-07-18  7:05             ` Iwo Mergler
@ 2012-07-18  7:21               ` Artem Bityutskiy
  2012-07-19  1:06                 ` Iwo Mergler
  0 siblings, 1 reply; 11+ messages in thread
From: Artem Bityutskiy @ 2012-07-18  7:21 UTC (permalink / raw)
  To: Iwo Mergler; +Cc: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 551 bytes --]

On Wed, 2012-07-18 at 17:05 +1000, Iwo Mergler wrote:
> In other words, I quite like the fixup feature and would prefer if it doesn't
> go away permanently. Otherwise:
> 
> Tested-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>

You should have the fix in your mailbox, and for that fix I'd be
greatful to have your "Tested-by:". BTW, I've updated the UBI/UBIFS
documentation WRT debugging - if you could review it - would also be
great!

http://linux-mtd.infradead.org/faq/ubifs.html#L_how_debug

-- 
Best Regards,
Artem Bityutskiy

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: UBIFS fails to mount on second boot
  2012-07-18  7:21               ` Artem Bityutskiy
@ 2012-07-19  1:06                 ` Iwo Mergler
  0 siblings, 0 replies; 11+ messages in thread
From: Iwo Mergler @ 2012-07-19  1:06 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

On Wed, 18 Jul 2012 17:21:10 +1000
Artem Bityutskiy <dedekind1@gmail.com> wrote:
> You should have the fix in your mailbox, and for that fix I'd be
> greatful to have your "Tested-by:". 

Thanks, got it. Applied and tested - the bug seems to be
gone.

> From 5978233c1e065c3d01bd88f4bf3598cc293ea611 Mon Sep 17 00:00:00 2001
> From: Artem Bityutskiy <Artem.Bityutskiy@linux.intel.com>
> Date: Sat, 14 Jul 2012 14:33:09 +0300
> Subject: [PATCH] UBIFS: fix a bug in empty space fix-up
...
> 
> -	/* Fixup the current log head */
> -	err = fixup_leb(c, c->lhead_lnum, c->lhead_offs);
> +	/* Fixup the log head contains the only a CS node at the beginning */
> +	err = fixup_leb(c, c->lhead_lnum,
> +			ALIGN(UBIFS_CS_NODE_SZ, c->min_io_size));
>  	if (err)

Tested-by: Iwo Mergler <Iwo.Mergler@netcommwireless.com>

Thanks again for your efforts.

Best regards,

Iwo

______________________________________________________________________
This communication contains information which may be confidential or privileged. The information is intended solely for the use of the individual or entity named above.  If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited.  If you have received this communication in error, please notify me by telephone immediately.
______________________________________________________________________

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-07-19  1:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-25  5:23 UBIFS fails to mount on second boot Iwo
2012-06-27 14:10 ` Artem Bityutskiy
2012-06-29  6:05   ` Iwo Mergler
2012-06-29 13:46     ` Artem Bityutskiy
2012-07-02  7:49       ` Iwo Mergler
2012-07-03  1:16         ` Iwo Mergler
2012-07-14  4:39           ` Artem Bityutskiy
2012-07-18  7:05             ` Iwo Mergler
2012-07-18  7:21               ` Artem Bityutskiy
2012-07-19  1:06                 ` Iwo Mergler
2012-07-15  8:25         ` Artem Bityutskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.