* ubifs : corruption after power cut test
From: Matthieu CASTET @ 2010-07-07 12:04 UTC
  To: linux-mtd

Hello,

We are testing the robustness of UBIFS on our boards. We are using a 2.6.27
kernel with the UBI/UBIFS backport from the 2.6.27 branch plus some patches
from 2.6.28 (since the 2.6.27 branch is no longer supported) [1].
We use SLC NAND (ST and Micron parts).

We have a test program that creates, deletes and modifies files randomly
(with a checksum to verify file integrity).

During the test we do random power cuts (1-10 min after booting).
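
A test of this kind could be sketched as follows. This is an illustrative sketch only, not the program used for these tests: the class name `FileStress`, its methods and the choice of SHA-256 are assumptions, and a real power-cut test would also have to store the expected checksums somewhere that survives the cut.

```python
import hashlib
import os
import random


class FileStress:
    """Randomly create/modify/delete files and track a checksum per file."""

    def __init__(self, root, seed=0):
        self.root = root
        self.rng = random.Random(seed)
        self.expected = {}  # file name -> SHA-256 hex digest of its content
        self.counter = 0

    def _write(self, name):
        size = self.rng.randint(1, 4096)
        data = bytes(self.rng.getrandbits(8) for _ in range(size))
        with open(os.path.join(self.root, name), "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the medium
        self.expected[name] = hashlib.sha256(data).hexdigest()

    def step(self):
        """Perform one random operation."""
        op = self.rng.choice(["create", "modify", "delete"])
        if op == "create" or not self.expected:
            name = "f%06d" % self.counter
            self.counter += 1
            self._write(name)
        elif op == "modify":
            self._write(self.rng.choice(sorted(self.expected)))
        else:
            name = self.rng.choice(sorted(self.expected))
            os.remove(os.path.join(self.root, name))
            del self.expected[name]

    def verify(self):
        """Return names of files that are missing or whose checksum differs."""
        bad = []
        for name, digest in sorted(self.expected.items()):
            try:
                with open(os.path.join(self.root, name), "rb") as f:
                    actual = hashlib.sha256(f.read()).hexdigest()
            except OSError:
                bad.append(name)
                continue
            if actual != digest:
                bad.append(name)
        return bad
```

Calling `step()` in a loop on the mounted volume and `verify()` after each reboot gives the create/modify/delete-with-checksum behaviour described above.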

After some reboots we got an uncorrectable ECC error and a failure to
mount UBIFS [2].

In one of our tests, the uncorrectable ECC error became correctable after
some reboots [3].
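
For reading the logs below: in the Linux MTD layer, a read that returns -EBADMSG (-74) reports an ECC error that could not be corrected, while -EUCLEAN (-117) reports bit-flips that were detected and corrected. A small sketch of that convention follows; the helper name `classify_mtd_read` is hypothetical, and the numeric values assume the generic Linux errno numbering.

```python
# Linux errno values (asm-generic numbering), hard-coded so the sketch
# does not depend on the host platform.
EBADMSG = 74    # MTD: ECC error that could not be corrected
EUCLEAN = 117   # MTD: bit-flips were detected and corrected


def classify_mtd_read(ret):
    """Map an MTD read return code to a short description."""
    if ret == 0:
        return "ok"
    if ret == -EUCLEAN:
        return "corrected bit-flip"
    if ret == -EBADMSG:
        return "uncorrectable ECC error"
    return "other error %d" % ret


print(classify_mtd_read(-74))
```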

The MTD tests run without error. The torture test ran for more than 100000
cycles (~60 hours).

With the UBI and UBIFS self-tests enabled, we did not manage to reproduce
the corruption.

We have a trace of the failure with UBIFS debug enabled [4]; it seems there
is some data after the corrupted zone (I can post the full log if needed).


Do you have any idea how to investigate this?

Matthieu


PS: On another OS using the same flash (with a proprietary FS), we saw
that an interrupted erase can do weird things. An eraseblock whose erase
was interrupted can become unstable. For example, it behaves like an
erased block and can be written with data (which can be read back), but
after some time uncorrectable errors appear.
From what I understand, UBI should be safe: after an interrupted erase,
the block is added to the erase or corr list and erased again before the
EC header is written.

BTW, what is the difference between the erase and corr lists in scan? We
seem to do the same thing for both lists (schedule_erase).


[1]
UBIFS: mark VFS SB RO too
UBI: init even if MTD device cannot be attached, if built into kernel
UBI: remove reboot notifier
random: Remove unused inode variable
random: drop weird m_time/a_time manipulation
UBI: add write checking
UBI: simplify debugging return codes
UBI: fix attaching error path
UBI: support attaching by MTD character device name
UBI: mark few variables as __initdata
UBI: fix volume creation input checking
UBI: fix memory leak in update path
UBI: add more checks to chdev open
UBI: initialise update marker
UBIFS: support mounting of UBI volume character devices
UBI: Add ubi_open_volume_path

[2]
UBIFS: recovery needed
ba315 : BA315_STATUS_DEC_FAIL
read error -74 retry 0 PEB 133:10240
UBIFS error (pid 284): ubifs_check_node: bad CRC: calculated 0x2a87ef17,
read 0x395cbef4
UBIFS error (pid 284): ubifs_check_node: bad node at LEB 198:6144
UBIFS error (pid 284): ubifs_scanned_corruption: corruption at LEB 198:6144
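
The "bad CRC" messages above come from comparing the CRC stored in a node header with one recomputed over the node. A minimal sketch of that comparison follows, assuming the common header layout of mainline UBIFS: a little-endian magic 0x06101831, then a 32-bit CRC, with the CRC covering everything after those first 8 bytes and seeded with 0xFFFFFFFF without a final inversion. The helper names are hypothetical, the node built here is a toy rather than a complete UBIFS node, and the backported code may differ in detail.

```python
import struct
import zlib

UBIFS_NODE_MAGIC = 0x06101831


def ubifs_crc32(data):
    # The kernel's crc32(0xFFFFFFFF, data) applies no final XOR,
    # whereas zlib.crc32() does, so undo it here.
    return zlib.crc32(data) ^ 0xFFFFFFFF


def make_node(payload):
    """Build a toy node: magic, CRC over the payload, then the payload."""
    crc = ubifs_crc32(payload)
    return struct.pack("<II", UBIFS_NODE_MAGIC, crc) + payload


def check_node(node):
    """Return (ok, calculated, stored) for a raw node buffer."""
    magic, stored = struct.unpack_from("<II", node, 0)
    if magic != UBIFS_NODE_MAGIC:
        raise ValueError("bad magic 0x%08x" % magic)
    calculated = ubifs_crc32(node[8:])
    return calculated == stored, calculated, stored


node = bytearray(make_node(b"some node payload"))
node[12] ^= 0x01  # flip one bit, as a partial page write might
ok, calculated, stored = check_node(bytes(node))
print("bad CRC: calculated 0x%08x, read 0x%08x" % (calculated, stored))
```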


[3]
ba : BA315_STATUS_DEC_FAIL
read error -74 retry 0
UBIFS error (pid 274): ubifs_check_node: bad CRC: calculated 0x2b0f6371, 
read 0x7f94ebe7
UBIFS error (pid 274): ubifs_check_node: bad node at LEB 85:0
UBIFS error (pid 274): ubifs_scanned_corruption: corruption at LEB 85:0

[...]
2 reboot with same error
[...]
ba : BA315_STATUS_DEC_ERR
detected ecc error num=1, ret=0
error : -74
fixable bit-flip detected at PEB 244
ba : BA315_STATUS_DEC_ERR
detected ecc error num=1, ret=0
error : -74
fixable bit-flip detected at PEB 244
UBI: scrubbed PEB 244 (LEB 0:85), data moved to PEB 181
UBIFS: recovery completed

[4]
read error -74 retry 0 PEB 204:4096
UBIFS DBG (pid 278): ubifs_recover_leb: look at LEB 219:0 (126976 bytes 
left)
UBIFS DBG (pid 278): ubifs_scan_a_node: scanning data node
UBIFS DBG (pid 278): no_more_nodes: unexpected data at 219:6144
UBIFS DBG (pid 278): ubifs_recover_leb: look at LEB 219:0 (126976 bytes 
left)
UBIFS DBG (pid 278): ubifs_scan_a_node: scanning data node
UBIFS error (pid 278): ubifs_check_node: bad CRC: calculated 0xe468570a, 
read 0x846858e8
UBIFS error (pid 278): ubifs_check_node: bad node at LEB 219:0


* Re: ubifs : corruption after power cut test
From: Matthieu CASTET @ 2010-07-13  7:27 UTC
  To: linux-mtd

Hi,

We found a bug in our driver. Now there are no more UBIFS errors when
an uncorrectable ECC error occurs (such errors should only happen in the
last, interrupted, written page).

But now we get "validate_master: bad master node at offset 69632 error
7" [1].

The error is not very clear to me.

What does it mean?

What could cause it?


Thanks

Matthieu



[1]
[...]
UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
69632
UBIFS DBG (pid 288): ubifs_scan: look at LEB 1:69632 (57344 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 1:70144 (56832 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
71680
UBIFS DBG (pid 288): ubifs_scan: look at LEB 1:71680 (55296 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: hit empty space 

UBIFS DBG (pid 288): ubifs_end_scan: stop scanning LEB 1 at offset 71680 

UBIFS DBG (pid 288): ubifs_start_scan: scan LEB 2:0 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:0 (126976 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:512 (126464 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
2048
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:2048 (124928 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:2560 (124416 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
4096
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:4096 (122880 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:4608 (122368 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
6144
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:6144 (120832 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:6656 (120320 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
8192
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:8192 (118784 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:8704 (118272 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
10240
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:10240 (116736 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:10752 (116224 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
12288
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:12288 (114688 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:12800 (114176 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
14336
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:14336 (112640 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:14848 (112128 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
16384
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:16384 (110592 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:16896 (110080 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
18432
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:18432 (108544 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:18944 (108032 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
20480
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:20480 (106496 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:20992 (105984 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
22528
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:22528 (104448 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:23040 (103936 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
24576
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:24576 (102400 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:25088 (101888 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
26624
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:26624 (100352 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:27136 (99840 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
28672
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:28672 (98304 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:29184 (97792 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
30720
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:30720 (96256 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:31232 (95744 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
32768
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:32768 (94208 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:33280 (93696 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
34816
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:34816 (92160 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:35328 (91648 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
36864
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:36864 (90112 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:37376 (89600 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
38912
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:38912 (88064 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:39424 (87552 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
40960
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:40960 (86016 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:41472 (85504 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
43008
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:43008 (83968 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:43520 (83456 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
45056
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:45056 (81920 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:45568 (81408 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
47104
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:47104 (79872 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:47616 (79360 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
49152
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:49152 (77824 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:49664 (77312 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
51200
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:51200 (75776 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:51712 (75264 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
53248
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:53248 (73728 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:53760 (73216 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
55296
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:55296 (71680 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:55808 (71168 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
57344
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:57344 (69632 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:57856 (69120 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
59392
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:59392 (67584 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:59904 (67072 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
61440
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:61440 (65536 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:61952 (65024 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
63488
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:63488 (63488 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:64000 (62976 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
65536
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:65536 (61440 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:66048 (60928 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
67584
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:67584 (59392 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:68096 (58880 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
69632
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:69632 (57344 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node 

UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:70144 (56832 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node 

UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now 
71680
UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:71680 (55296 bytes left) 

UBIFS DBG (pid 288): ubifs_scan_a_node: hit empty space 

UBIFS DBG (pid 288): ubifs_end_scan: stop scanning LEB 2 at offset 71680 

UBIFS error (pid 288): validate_master: bad master node at offset 69632 error 7
         magic          0x6101831
         crc            0xf1cb595d
         node_type      7 (master node)
         group_type     0 (no node group)
         sqnum          53559169
         len            512
         highest_inum   50055
         commit number  208832
         flags          0x2
         log_lnum       3
         root_lnum      42
         root_offs      78864
         root_len       68
         gc_lnum        4294967295
         ihead_lnum     42
         ihead_offs     79872
         index_size     81072
         lpt_lnum       6
         lpt_offs       27032
         nhead_lnum     6
         nhead_offs     28672
         ltab_lnum      6
         ltab_offs      26624
         lsave_lnum     0
         lsave_offs     0
         lscan_lnum     58
         leb_cnt        122
         empty_lebs     1
         idx_lebs       3
         total_free     327680
         total_dirty    4500616
         total_used     9438920
         total_dead     0
         total_dark     620816
UBIFS DBG (pid 292): ubifs_bg_thread: background thread "ubifs_bgt3_0" stops
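
The offsets in the scan log above follow a fixed stride, which can be checked with a little arithmetic. The sketch below assumes a 28-byte padding node header (as in mainline UBIFS) and infers the write unit from the log; both are assumptions, not values stated in this thread.

```python
MASTER_NODE_LEN = 512   # "len 512" in the master node dump
PAD_NODE_HDR = 28       # assumed: 24-byte common header + 4-byte pad_len
PAD_BYTES = 1508        # "1508 bytes padded" in the scan log
LEB_SIZE = 126976       # "look at LEB 2:0 (126976 bytes left)"
FIRST_EMPTY = 71680     # "stop scanning LEB 2 at offset 71680"

# One master node plus its padding node fills one write unit.
stride = MASTER_NODE_LEN + PAD_NODE_HDR + PAD_BYTES


def master_offsets(first_empty, stride):
    """Offsets of the master node copies written before the empty space."""
    return list(range(0, first_empty, stride))


offsets = master_offsets(FIRST_EMPTY, stride)
print(stride)        # 2048
print(len(offsets))  # 35
print(offsets[-1])   # 69632, the node reported by validate_master
```

Under these assumptions, the node at offset 69632 is simply the last of 35 master node copies in the LEB, that is, the most recent one.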


* Re: ubifs : corruption after power cut test
From: Matthieu CASTET @ 2010-07-13  8:43 UTC
  To: linux-mtd

Matthieu CASTET wrote:
> Hi,
> 
> we found some bug in our driver. Now there no more ubifs error when
> there is uncorrectable ecc error (they should happen in the last
> (interrupted) written page).
> 
> But now we got "validate_master: bad master node at offset 69632 error
> 7" [1].
Notice that gc_lnum == -1 in this case.
Also, this did not happen on a power cut.
The scenario was:
- power cut
- mount fs [1]
- do some fs operations
- umount fs quickly (9 seconds after mount in this case) [2]
- mount fs [3]

The problem seems to be that gc_lnum == -1 is either not handled at mount
or should not happen at umount.
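
The master node dump in the previous message prints gc_lnum as 4294967295, which is the unsigned 32-bit representation of -1. A short sketch of that reinterpretation (the helper name `as_signed32` is only for illustration):

```python
import struct


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value))[0]


print(as_signed32(4294967295))  # -1
print(as_signed32(42))          # 42
```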

[1]
UBIFS: recovery needed
UBIFS: recovery completed
UBIFS: mounted UBI device 3, volume 0, name "test"
UBIFS: file system size:   14348288 bytes (14012 KiB, 13 MiB, 113 LEBs)
UBIFS: journal size:       1015809 bytes (992 KiB, 0 MiB, 6 LEBs)
UBIFS: media format:       w4/r0 (latest is w4/r0)
UBIFS: default compressor: none
UBIFS: reserved for root:  677704 bytes (661 KiB)
[2]
### round 0 : 9 seconds
UBIFS: un-mount UBI device 3, volume 0
[3]
UBIFS error (pid 287): validate_master: bad master node at offset 69632 
error 7

> 
> The error is not very clear to me.
> 
> What does it means ?
> 
> What could cause it.
> 
> 
> Thanks
> 
> Matthieu
> 
> 
> 
> [1]
> [...]
> 
> UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node
> 
> UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now
> 69632
> UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:69632 (57344 bytes left)
> 
> UBIFS DBG (pid 288): ubifs_scan_a_node: scanning master node
> 
> UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:70144 (56832 bytes left)
> 
> UBIFS DBG (pid 288): ubifs_scan_a_node: scanning padding node
> 
> UBIFS DBG (pid 288): ubifs_scan_a_node: 1508 bytes padded, offset now
> 71680
> UBIFS DBG (pid 288): ubifs_scan: look at LEB 2:71680 (55296 bytes left)
> 
> UBIFS DBG (pid 288): ubifs_scan_a_node: hit empty space
> 
> UBIFS DBG (pid 288): ubifs_end_scan: stop scanning LEB 2 at offset 71680
> 
> UBIFS error (pid 288): validate_master: bad master node at offset 69632
> error 7
>          magic          0x6101831
> 
>          crc            0xf1cb595d
> 
>          node_type      7 (master node)
> 
>          group_type     0 (no node group)
> 
>          sqnum          53559169
> 
>          len            512
> 
>          highest_inum   50055
> 
>          commit number  208832
> 
>          flags          0x2
> 
>          log_lnum       3
> 
>          root_lnum      42
> 
>          root_offs      78864
> 
>          root_len       68
> 
>          gc_lnum        4294967295
> 
>          ihead_lnum     42
> 
>          ihead_offs     79872
> 
>          index_size     81072
> 
>          lpt_lnum       6
> 
>          lpt_offs       27032
> 
>          nhead_lnum     6
> 
>          nhead_offs     28672
> 
>          ltab_lnum      6
> 
>          ltab_offs      26624
> 
>          lsave_lnum     0
> 
>          lsave_offs     0
> 
>          lscan_lnum     58
> 
>          leb_cnt        122
> 
>          empty_lebs     1
> 
>          idx_lebs       3
> 
>          total_free     327680
> 
>          total_dirty    4500616
> 
>          total_used     9438920
> 
>          total_dead     0
> 
>          total_dark     620816
> 
> UBIFS DBG (pid 292): ubifs_bg_thread: background thread "ubifs_bgt3_0"
> stops
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-13  8:43   ` Matthieu CASTET
@ 2010-07-13  9:24     ` Matthieu CASTET
  2010-07-13 14:24       ` Artem Bityutskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Matthieu CASTET @ 2010-07-13  9:24 UTC (permalink / raw)
  To: linux-mtd

[-- Attachment #1: Type: text/plain, Size: 754 bytes --]

Matthieu CASTET wrote:
> Matthieu CASTET wrote:
>> Hi,
>>
>> we found some bug in our driver. Now there is no more ubifs error when
>> there is uncorrectable ecc error (they should happen in the last
>> (interrupted) written page).
>>
>> But now we got "validate_master: bad master node at offset 69632 error
>> 7" [1].
> notice that gc_lnum==-1 in this case.
> Also this didn't happen on power cut.
> The scenario was:
> - power cut
> - mount fs [1]
> - do some fs operation
> - umount fs quickly (9 seconds after mount in this case) [2]
> - mount fs [3]
> 
> The problem seems to be that gc_lnum==-1 is not handled in mount or
> shouldn't happen in umount.
> 
The attached patch tries to support mounting with gc_lnum == -1.

Does it look sane?
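
For readers following the dump: the master node dump quoted earlier in the thread prints gc_lnum as 4294967295, which is the same 32-bit pattern as -1 when read as a signed integer. A minimal sketch of that reinterpretation in plain userspace C follows; lnum_from_raw is a hypothetical helper for illustration, not a UBIFS function, and byte-order conversion is left out.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of UBIFS): reinterpret the raw 32-bit
 * value stored in the master node as a signed LEB number.
 * memcpy keeps the bit pattern, so 0xFFFFFFFF becomes -1.
 */
static int32_t lnum_from_raw(uint32_t raw)
{
	int32_t lnum;

	memcpy(&lnum, &raw, sizeof(lnum));
	return lnum;
}
```

With this, lnum_from_raw(4294967295u) gives -1, while an ordinary LEB number such as 58 is unchanged.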


Matthieu

[-- Attachment #2: ubifs.diff --]
[-- Type: text/x-diff, Size: 1369 bytes --]

diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index eb532e0..56de77f 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -1320,18 +1322,27 @@ static int mount_ubifs(struct ubifs_info *c)
 			if (err)
 				goto out_orphans;
 			err = ubifs_rcvry_gc_commit(c);
-		} else {
-			err = take_gc_lnum(c);
 			if (err)
 				goto out_orphans;
+		} else {
+			if (c->gc_lnum == -1) {
+				err = ubifs_rcvry_gc_commit(c);
+				if (err)
+					goto out_orphans;
+			}
+			else {
+				err = take_gc_lnum(c);
+				if (err)
+					goto out_orphans;
 
-			/*
-			 * GC LEB may contain garbage if there was an unclean
-			 * reboot, and it should be un-mapped.
-			 */
-			err = ubifs_leb_unmap(c, c->gc_lnum);
-			if (err)
-				return err;
+				/*
+				 * GC LEB may contain garbage if there was an unclean
+				 * reboot, and it should be un-mapped.
+				 */
+				err = ubifs_leb_unmap(c, c->gc_lnum);
+				if (err)
+					return err;
+			}
 		}
 
 		err = dbg_check_lprops(c);
diff --git a/fs/ubifs/master.c b/fs/ubifs/master.c
index 28beaee..8d5b080 100644
--- a/fs/ubifs/master.c
+++ b/fs/ubifs/master.c
@@ -135,7 +135,7 @@ static int validate_master(const struct ubifs_info *c)
 		goto out;
 	}
 
-	if (c->gc_lnum >= c->leb_cnt || c->gc_lnum < c->main_first) {
+	if (c->gc_lnum != -1 && (c->gc_lnum >= c->leb_cnt || c->gc_lnum < c->main_first)) {
 		err = 7;
 		goto out;
 	}
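
The relaxed check in the master.c hunk above can be read as "accept -1 as a special value, otherwise require a LEB inside the main area". A minimal standalone sketch of that predicate, assuming a cut-down stand-in for struct ubifs_info (struct fs_geometry and gc_lnum_ok are hypothetical names, and the main_first value used in any call is an assumption):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two struct ubifs_info fields used here. */
struct fs_geometry {
	int leb_cnt;    /* total number of LEBs in the volume */
	int main_first; /* first LEB of the main area */
};

/* Returns true if gc_lnum would pass the relaxed validation. */
static bool gc_lnum_ok(const struct fs_geometry *g, int gc_lnum)
{
	if (gc_lnum == -1)
		return true; /* no GC LEB recorded; mount must cope with it */

	return gc_lnum >= g->main_first && gc_lnum < g->leb_cnt;
}
```

Note that accepting -1 here only moves the problem: the mount path still has to obtain a GC LEB, which is what the super.c hunk attempts.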

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-07 12:04 ubifs : corruption after power cut test Matthieu CASTET
  2010-07-13  7:27 ` Matthieu CASTET
@ 2010-07-13 11:07 ` Artem Bityutskiy
  2010-07-13 12:06   ` Matthieu CASTET
  1 sibling, 1 reply; 19+ messages in thread
From: Artem Bityutskiy @ 2010-07-13 11:07 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

Hi,

On Wed, 2010-07-07 at 14:04 +0200, Matthieu CASTET wrote:
> PS : On another OS using the same flash (with a proprietary fs), we saw 
> that interrupted erase can do weird stuff. The eraseblock with 
> interrupted erase can become unstable. For example it acts like erased 
> block, can be written with data (and can be read again) but after some 
> time an uncorrectable error happens.
>  From what I understood, ubi should be safe because in case of 
> interrupted erase, we will add it to erase or corr list, erase the block 
> again before writing EC.

Yes.

> BTW what's the difference between erase and corr list in scan ? We seem 
> to do the same thing for these lists (schedule_erase).

We probably wanted to do something special with corrupted eraseblocks,
so we introduced a separate list for them. In the current implementation
it is not really needed, but it is convenient because we can then walk
the list and print the corrupted block numbers (we do this in ubi_scan()).

So, let's keep it this way.

> [1]
> UBIFS: mark VFS SB RO too
> UBI: init even if MTD device cannot be attached, if built into kernel
> UBI: remove reboot notifier
> random: Remove unused inode variable
> random: drop weird m_time/a_time manipulation
> UBI: add write checking
> UBI: simplify debugging return codes
> UBI: fix attaching error path
> UBI: support attaching by MTD character device name
> UBI: mark few variables as __initdata
> UBI: fix volume creation input checking
> UBI: fix memory leak in update path
> UBI: add more checks to chdev open
> UBI: initialise update marker
> UBIFS: support mounting of UBI volume character devices
> UBI: Add ubi_open_volume_path

You also want this (new) patch, I think:
http://git.infradead.org/ubifs-2.6.git/commit/6fb4374f6b1b3932f3acfe9d353568d3d8599cad

I'll add it to the back-port trees at some point as well.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-13 11:07 ` Artem Bityutskiy
@ 2010-07-13 12:06   ` Matthieu CASTET
  2010-07-13 14:13     ` Artem Bityutskiy
  2010-07-13 14:33     ` Artem Bityutskiy
  0 siblings, 2 replies; 19+ messages in thread
From: Matthieu CASTET @ 2010-07-13 12:06 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

Hi,

Artem Bityutskiy wrote:
> Hi,
>> [1]
>> UBIFS: mark VFS SB RO too
>> UBI: init even if MTD device cannot be attached, if built into kernel
>> UBI: remove reboot notifier
>> random: Remove unused inode variable
>> random: drop weird m_time/a_time manipulation
>> UBI: add write checking
>> UBI: simplify debugging return codes
>> UBI: fix attaching error path
>> UBI: support attaching by MTD character device name
>> UBI: mark few variables as __initdata
>> UBI: fix volume creation input checking
>> UBI: fix memory leak in update path
>> UBI: add more checks to chdev open
>> UBI: initialise update marker
>> UBIFS: support mounting of UBI volume character devices
>> UBI: Add ubi_open_volume_path
> 
> You also want this (new) patch, I think:
> http://git.infradead.org/ubifs-2.6.git/commit/6fb4374f6b1b3932f3acfe9d353568d3d8599cad
> 
Yes, I managed to trigger this case :)
I have also taken
http://git.infradead.org/ubifs-2.6.git/commitdiff/276de5d2a18bcdc69e6d48a4d96afc14cfef9dcb


Thanks,

Matthieu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-13 12:06   ` Matthieu CASTET
@ 2010-07-13 14:13     ` Artem Bityutskiy
  2010-07-13 14:33     ` Artem Bityutskiy
  1 sibling, 0 replies; 19+ messages in thread
From: Artem Bityutskiy @ 2010-07-13 14:13 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

On Tue, 2010-07-13 at 14:06 +0200, Matthieu CASTET wrote:
> > You also want this (new) patch, I think:
> > http://git.infradead.org/ubifs-2.6.git/commit/6fb4374f6b1b3932f3acfe9d353568d3d8599cad
> > 
> Yes I managed to trigger this case :)
> I have also taken
> http://git.infradead.org/ubifs-2.6.git/commitdiff/276de5d2a18bcdc69e6d48a4d96afc14cfef9dcb

Right.

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-13  9:24     ` Matthieu CASTET
@ 2010-07-13 14:24       ` Artem Bityutskiy
  2010-07-13 15:10         ` Matthieu CASTET
  0 siblings, 1 reply; 19+ messages in thread
From: Artem Bityutskiy @ 2010-07-13 14:24 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

On Tue, 2010-07-13 at 11:24 +0200, Matthieu CASTET wrote:
> Matthieu CASTET wrote:
> > Matthieu CASTET wrote:
> >> Hi,
> >>
> >> we found some bug in our driver. Now there is no more ubifs error when
> >> there is uncorrectable ecc error (they should happen in the last
> >> (interrupted) written page).
> >>
> >> But now we got "validate_master: bad master node at offset 69632 error
> >> 7" [1].
> > notice that gc_lnum==-1 in this case.
> > Also this didn't happen on power cut.
> > The scenario was:
> > - power cut
> > - mount fs [1]
> > - do some fs operation
> > - umount fs quickly (9 seconds after mount in this case) [2]
> > - mount fs [3]
> > 
> > The problem seems to be that gc_lnum==-1 is not handled in mount or
> > shouldn't happen in umount.
> > 
> The attached patch tries to support mounting with gc_lnum == -1.
> 
> Does it look sane?

I did not give it much thought, but I do not see how master node can end
up with gc_lnum = -1 in it, and it seems we assumed this cannot happen.
Could you please add this hack to your kernel? It should catch the
situations when we write gc_lnum == -1 to the master node and print the
stack dump, which should give some idea about the code-path which causes
it.

diff --git a/fs/ubifs/master.c b/fs/ubifs/master.c
index 28beaee..8277f64 100644
--- a/fs/ubifs/master.c
+++ b/fs/ubifs/master.c
@@ -378,6 +378,15 @@ int ubifs_write_master(struct ubifs_info *c)
 	c->mst_offs = offs;
 	c->mst_node->highest_inum = cpu_to_le64(c->highest_inum);
 
+	{
+		/* Temporary hack for Matthieu */
+		int gc_lnum = le32_to_cpu(c->mst_node->gc_lnum);
+		if (gc_lnum < 0) {
+			printk(KERN_CRIT "%s: gc_lnum is %d!\n", __func__, gc_lnum);
+			dump_stack();
+		}
+	}
+
 	err = ubifs_write_node(c, c->mst_node, len, lnum, offs, UBI_SHORTTERM);
 	if (err)
 		return err;

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-13 12:06   ` Matthieu CASTET
  2010-07-13 14:13     ` Artem Bityutskiy
@ 2010-07-13 14:33     ` Artem Bityutskiy
  1 sibling, 0 replies; 19+ messages in thread
From: Artem Bityutskiy @ 2010-07-13 14:33 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

On Tue, 2010-07-13 at 14:06 +0200, Matthieu CASTET wrote:
> > You also want this (new) patch, I think:
> > http://git.infradead.org/ubifs-2.6.git/commit/6fb4374f6b1b3932f3acfe9d353568d3d8599cad
> > 
> Yes I managed to trigger this case :)
> I have also taken
> http://git.infradead.org/ubifs-2.6.git/commitdiff/276de5d2a18bcdc69e6d48a4d96afc14cfef9dcb

But this is a good lesson for me: no matter how much you test, other
people will find bugs. I mean, recovery is something we really tested a
lot. There is even a debugging mode which emulates various failures; we
used that as well as random power cuts while running stress tests.

Never mind, just a thought that crossed my mind :-)

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-13 14:24       ` Artem Bityutskiy
@ 2010-07-13 15:10         ` Matthieu CASTET
  2010-07-28  7:40           ` Matthieu CASTET
  0 siblings, 1 reply; 19+ messages in thread
From: Matthieu CASTET @ 2010-07-13 15:10 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

Artem Bityutskiy wrote:
> On Tue, 2010-07-13 at 11:24 +0200, Matthieu CASTET wrote:
>> Matthieu CASTET wrote:
>>> Matthieu CASTET wrote:
>>>> Hi,
>>>>
>>>> we found some bug in our driver. Now there is no more ubifs error when
>>>> there is uncorrectable ecc error (they should happen in the last
>>>> (interrupted) written page).
>>>>
>>>> But now we got "validate_master: bad master node at offset 69632 error
>>>> 7" [1].
>>> notice that gc_lnum==-1 in this case.
>>> Also this didn't happen on power cut.
>>> The scenario was:
>>> - power cut
>>> - mount fs [1]
>>> - do some fs operation
>>> - umount fs quickly (9 seconds after mount in this case) [2]
>>> - mount fs [3]
>>>
>>> The problem seems to be that gc_lnum==-1 is not handled in mount or
>>> shouldn't happen in umount.
>>>
>> The attached patch tries to support mounting with gc_lnum == -1.
>>
>> Does it look sane?
> 
> I did not give it much thought, but I do not see how master node can end
> up with gc_lnum = -1 in it, and it seems we assumed this cannot happen.
> Could you please add this hack to your kernel? It should catch the
> situations when we write gc_lnum == -1 to the master node and print the
> stack dump, which should give some idea about the code-path which causes
> it.
OK, thanks, I will run it.

When checking the code, I saw that switch_gc_head can set c->gc_lnum to -1.

In ubifs_put_super, we set c->mst_node->gc_lnum to c->gc_lnum and write
the master node.
Can't ubifs_put_super run while switch_gc_head has gc_lnum set to -1?

Matthieu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-13 15:10         ` Matthieu CASTET
@ 2010-07-28  7:40           ` Matthieu CASTET
  2010-08-02  9:32             ` Matthieu CASTET
  2010-08-22  7:44             ` Artem Bityutskiy
  0 siblings, 2 replies; 19+ messages in thread
From: Matthieu CASTET @ 2010-07-28  7:40 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

Hi,

Matthieu CASTET wrote:
> Artem Bityutskiy wrote:
>> On Tue, 2010-07-13 at 11:24 +0200, Matthieu CASTET wrote:
>>> Matthieu CASTET wrote:
>>>> Matthieu CASTET wrote:
>>>>> Hi,
>>>>>
>>>>> we found some bug in our driver. Now there is no more ubifs error when
>>>>> there is uncorrectable ecc error (they should happen in the last
>>>>> (interrupted) written page).
>>>>>
>>>>> But now we got "validate_master: bad master node at offset 69632 error
>>>>> 7" [1].
>>>> notice that gc_lnum==-1 in this case.
>>>> Also this didn't happen on power cut.
>>>> The scenario was:
>>>> - power cut
>>>> - mount fs [1]
>>>> - do some fs operation
>>>> - umount fs quickly (9 seconds after mount in this case) [2]
>>>> - mount fs [3]
>>>>
>>>> The problem seems to be that gc_lnum==-1 is not handled in mount or
>>>> shouldn't happen in umount.
>>>>
>>> The attached patch tries to support mounting with gc_lnum == -1.
>>>
>>> Does it look sane?
>> I did not give it much thought, but I do not see how master node can end
>> up with gc_lnum = -1 in it, and it seems we assumed this cannot happen.
>> Could you please add this hack to your kernel? It should catch the
>> situations when we write gc_lnum == -1 to the master node and print the
>> stack dump, which should give some idea about the code-path which causes
>> it.
> OK, thanks, I will run it.
> 
> When checking the code, I saw that switch_gc_head can set c->gc_lnum to -1.
> 
> In ubifs_put_super, we set c->mst_node->gc_lnum to c->gc_lnum and write
> the master node.
> Can't ubifs_put_super run while switch_gc_head has gc_lnum set to -1?
> 
I managed to reproduce it with the backtrace [1].

Matthieu

[1]
# UBIFS: recovery completed
UBIFS: mounted UBI device 3, volume 0, name "test"
UBIFS: file system size:   30474240 bytes (29760 KiB, 29 MiB, 240 LEBs)
UBIFS: journal size:       1523712 bytes (1488 KiB, 1 MiB, 12 LEBs)
UBIFS: media format:       w4/r0 (latest is w4/r0)
UBIFS: default compressor: lzo
UBIFS: reserved for root:  1439373 bytes (1405 KiB)
checking all files...
++++++ power failure detected, cleaning up tmpfile (262415 bytes)
### round 0 : 16 seconds
UBIFS: un-mount UBI device 3, volume 0
ubifs_write_master: gc_lnum is -1!
[<c00279f0>] (dump_stack+0x0/0x14) from [<c00d64c4>] (ubifs_write_master+0x170/0x1b0)
[<c00d6354>] (ubifs_write_master+0x0/0x1b0) from [<c00ce264>] (ubifs_put_super+0x1a0/0x1d8)
  r7:c7a7e000 r6:00000003 r5:c795c124 r4:c795c100
[<c00ce0c4>] (ubifs_put_super+0x0/0x1d8) from [<c007ed20>] (generic_shutdown_super+0x78/0xfc)
  r8:00000000 r7:c780cf38 r6:c780cf20 r5:c01b08bc r4:c7a9d400
[<c007eca8>] (generic_shutdown_super+0x0/0xfc) from [<c007ede8>] (kill_anon_super+0x18/0x34)
  r5:c022739c r4:0000000b
[<c007edd0>] (kill_anon_super+0x0/0x34) from [<c007ee7c>] (deactivate_super+0x48/0x60)
  r4:c7a9d400
[<c007ee34>] (deactivate_super+0x0/0x60) from [<c0093998>] (mntput_no_expire+0x64/0xc8)
  r5:c7a9d400 r4:c780cf20
[<c0093934>] (mntput_no_expire+0x0/0xc8) from [<c009456c>] (sys_umount+0x58/0x31c)
  r5:c780cf38 r4:c780cf18
[<c0094514>] (sys_umount+0x0/0x31c) from [<c0023c00>] (ret_fast_syscall+0x0/0x2c)
UBIFS error (pid 285): validate_master: bad master node at offset 104448 error 7

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-28  7:40           ` Matthieu CASTET
@ 2010-08-02  9:32             ` Matthieu CASTET
  2010-08-04 16:14               ` Artem Bityutskiy
  2010-08-22  7:44             ` Artem Bityutskiy
  1 sibling, 1 reply; 19+ messages in thread
From: Matthieu CASTET @ 2010-08-02  9:32 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd, dedekind1

[-- Attachment #1: Type: text/plain, Size: 500 bytes --]

Matthieu CASTET wrote:
> Hi,
> 
> Matthieu CASTET wrote:
>> Artem Bityutskiy wrote:
>> OK, thanks, I will run it.
>>
>> When checking the code, I saw that switch_gc_head can set c->gc_lnum to -1.
>>
>> In ubifs_put_super, we set c->mst_node->gc_lnum to c->gc_lnum and write
>> the master node.
>> Can't ubifs_put_super run while switch_gc_head has gc_lnum set to -1?
>>
> I managed to reproduce it with the backtrace [1].
> 
While waiting for a proper fix, I force recovery if gc_lnum is -1.


Matthieu

[-- Attachment #2: ubifs --]
[-- Type: text/plain, Size: 865 bytes --]

diff --git a/fs/ubifs/master.c b/fs/ubifs/master.c
index 28beaee..2b668cc 100644
--- a/fs/ubifs/master.c
+++ b/fs/ubifs/master.c
@@ -135,7 +135,7 @@ static int validate_master(const struct ubifs_info *c)
 		goto out;
 	}
 
-	if (c->gc_lnum >= c->leb_cnt || c->gc_lnum < c->main_first) {
+	if (c->gc_lnum != -1 && (c->gc_lnum >= c->leb_cnt || c->gc_lnum < c->main_first)) {
 		err = 7;
 		goto out;
 	}
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 8cdcdc5..0207620 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -1260,7 +1260,7 @@ static int mount_ubifs(struct ubifs_info *c)
 
 	init_constants_master(c);
 
-	if ((c->mst_node->flags & cpu_to_le32(UBIFS_MST_DIRTY)) != 0) {
+	if ((c->mst_node->flags & cpu_to_le32(UBIFS_MST_DIRTY)) != 0 || c->gc_lnum == -1) {
 		ubifs_msg("recovery needed");
 		c->need_recovery = 1;
 		if (!mounted_read_only) {
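
The changed condition in the super.c hunk above boils down to "recover if the master node is dirty, or if no GC LEB is recorded". A minimal userspace sketch of that decision follows. The names needs_recovery and MST_DIRTY_SKETCH are hypothetical, the bit value 0x1 for the dirty flag is an assumption made for illustration, and the little-endian conversion done by the real code is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit for the "dirty" master flag; illustration only. */
#define MST_DIRTY_SKETCH 0x1u

/*
 * Hypothetical predicate mirroring the workaround: request recovery
 * when the master node is marked dirty or when gc_lnum is -1.
 */
static bool needs_recovery(uint32_t mst_flags, int gc_lnum)
{
	return (mst_flags & MST_DIRTY_SKETCH) != 0 || gc_lnum == -1;
}
```

Under that assumption, the dumped master node (flags 0x2, gc_lnum -1) would not have requested recovery with the original test, but does with the workaround.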

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-08-02  9:32             ` Matthieu CASTET
@ 2010-08-04 16:14               ` Artem Bityutskiy
  0 siblings, 0 replies; 19+ messages in thread
From: Artem Bityutskiy @ 2010-08-04 16:14 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

On Mon, 2010-08-02 at 11:32 +0200, Matthieu CASTET wrote:
> Matthieu CASTET wrote:
> > Hi,
> > 
> > Matthieu CASTET wrote:
> >> Artem Bityutskiy wrote:
> >> OK, thanks, I will run it.
> >>
> >> When checking the code, I saw that switch_gc_head can set c->gc_lnum to -1.
> >>
> >> In ubifs_put_super, we set c->mst_node->gc_lnum to c->gc_lnum and write
> >> the master node.
> >> Can't ubifs_put_super run while switch_gc_head has gc_lnum set to -1?
> >>
> > I managed to reproduce it with the backtrace [1].
> > 
> While waiting for a proper fix, I force recovery if gc_lnum is -1.

The workaround looks OK, but I still do not understand how we end up
writing -1. The only place where c->gc_lnum is set to -1 is in GC, but it
is then initialized properly, and only an error can cause GC to return
with c->gc_lnum == -1, in which case we switch to R/O mode immediately.

Is your UBIFS identical to what I have in my 2.6.27 back-port tree?

Also, I will have very little time until the beginning of September, so
this will probably have to wait...

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-07-28  7:40           ` Matthieu CASTET
  2010-08-02  9:32             ` Matthieu CASTET
@ 2010-08-22  7:44             ` Artem Bityutskiy
  2010-09-06  8:55               ` Artem Bityutskiy
  2010-09-24 15:31               ` Matthieu CASTET
  1 sibling, 2 replies; 19+ messages in thread
From: Artem Bityutskiy @ 2010-08-22  7:44 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
> I managed to reproduce it with the backtrace [1].

Matthieu, your work-around patch or something very close should
certainly be applied to the UBIFS tree, but I still would like to find
out what exactly happened in your setup.

I see 2 possibilities:

1. An error happened and 'ubifs_garbage_collect()' returned while
c->gc_lnum was -1. But in this case we should have switched to R/O mode,
and the master node would not be written. But maybe for some reason we
did not switch to R/O mode, dunno.

2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
'ubifs_garbage_collect_leb()' directly, which can return while
c->gc_lnum is -1. And we do not handle this.

Would you please be patient enough to reproduce the issue once again with
the following patch? It was created against the latest ubifs-2.6.git, but
you should easily be able to apply it to your tree.

Artem.

diff --git a/fs/ubifs/budget.c b/fs/ubifs/budget.c
index c8ff0d1..aa433cd 100644
--- a/fs/ubifs/budget.c
+++ b/fs/ubifs/budget.c
@@ -83,6 +83,10 @@ static int run_gc(struct ubifs_info *c)
 	down_read(&c->commit_sem);
 	lnum = ubifs_garbage_collect(c, 1);
 	up_read(&c->commit_sem);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() returned %d", lnum);
+		dump_stack();
+	}
 	if (lnum < 0)
 		return lnum;
 
diff --git a/fs/ubifs/gc.c b/fs/ubifs/gc.c
index 396f24a..0e78832 100644
--- a/fs/ubifs/gc.c
+++ b/fs/ubifs/gc.c
@@ -807,12 +807,20 @@ int ubifs_garbage_collect(struct ubifs_info *c, int anyway)
 		goto out;
 	}
 out_unlock:
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() is returning %d", ret);
+		dump_stack();
+	}
 	mutex_unlock(&wbuf->io_mutex);
 	return ret;
 
 out:
 	ubifs_assert(ret < 0);
 	ubifs_assert(ret != -ENOSPC && ret != -EAGAIN);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() is returning %d", ret);
+		dump_stack();
+	}
 	ubifs_wbuf_sync_nolock(wbuf);
 	ubifs_ro_mode(c, ret);
 	mutex_unlock(&wbuf->io_mutex);
diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index d321bae..44df514 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -162,6 +162,10 @@ again:
 	mutex_unlock(&wbuf->io_mutex);
 
 	lnum = ubifs_garbage_collect(c, 0);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() returned %d", lnum);
+		dump_stack();
+	}
 	if (lnum < 0) {
 		err = lnum;
 		if (err != -ENOSPC)
diff --git a/fs/ubifs/recovery.c b/fs/ubifs/recovery.c
index daae9e1..3058256 100644
--- a/fs/ubifs/recovery.c
+++ b/fs/ubifs/recovery.c
@@ -1126,6 +1126,10 @@ int ubifs_rcvry_gc_commit(struct ubifs_info *c)
 	dbg_rcvry("GC'ing LEB %d", lnum);
 	mutex_lock_nested(&wbuf->io_mutex, wbuf->jhead);
 	err = ubifs_garbage_collect_leb(c, &lp);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect_leb() returned %d", err);
+		dump_stack();
+	}
 	if (err >= 0) {
 		int err2 = ubifs_wbuf_sync_nolock(wbuf);
 

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-08-22  7:44             ` Artem Bityutskiy
@ 2010-09-06  8:55               ` Artem Bityutskiy
  2010-09-09  9:22                 ` Matthieu CASTET
  2010-09-24 15:31               ` Matthieu CASTET
  1 sibling, 1 reply; 19+ messages in thread
From: Artem Bityutskiy @ 2010-09-06  8:55 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

On Sun, 2010-08-22 at 10:44 +0300, Artem Bityutskiy wrote:
> On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
> > I managed to reproduce it with the backtrace [1].
> 
> Matthieu, your work-around patch or something very close should
> certainly be applied to the UBIFS tree, but I still would like to find
> out what exactly happened in your setup.
> 
> I see 2 possibilities:
> 
> 1. An error happened and 'ubifs_garbage_collect()' returned while
> c->gc_lnum was -1. But in this case we should have switched to R/O mode,
> and the master node would not be written. But maybe for some reason we
> did not switch to R/O mode, dunno.
> 
> 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
> 'ubifs_garbage_collect_leb()' directly, which can return while
> c->gc_lnum is -1. And we do not handle this.
> 
> Would you please be patient enough to reproduce the issue once again with
> the following patch? It was created against the latest ubifs-2.6.git, but
> you should easily be able to apply it to your tree.

Hi, any news?

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-09-06  8:55               ` Artem Bityutskiy
@ 2010-09-09  9:22                 ` Matthieu CASTET
  2010-09-09  9:51                   ` Artem Bityutskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Matthieu CASTET @ 2010-09-09  9:22 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

Artem Bityutskiy wrote:
> On Sun, 2010-08-22 at 10:44 +0300, Artem Bityutskiy wrote:
>> On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
>>> I managed to reproduce it with the backtrace [1].
>> Matthieu, your work-around patch or something very close should
>> certainly be applied to the UBIFS tree, but I still would like to find
>> out what exactly happened in your setup.
>>
>> I see 2 possibilities:
>>
>> 1. An error happened and 'ubifs_garbage_collect()' returned while
>> c->gc_lnum was -1. But in this case we should have switched to R/O mode,
>> and the master node would not be written. But maybe for some reason we
>> did not switch to R/O mode, dunno.
>>
>> 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
>> 'ubifs_garbage_collect_leb()' directly, which can return while
>> c->gc_lnum is -1. And we do not handle this.
>>
>> Would you please be patient enough to reproduce the issue once again with
>> the following patch? It was created against the latest ubifs-2.6.git, but
>> you should easily be able to apply it to your tree.
> 
> Hi, any news?
> 
Not much; I was busy with another subject, but I will try ASAP.


Matthieu

PS: any ideas/comments on how UBI/UBIFS handles an interrupted page
write?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-09-09  9:22                 ` Matthieu CASTET
@ 2010-09-09  9:51                   ` Artem Bityutskiy
  0 siblings, 0 replies; 19+ messages in thread
From: Artem Bityutskiy @ 2010-09-09  9:51 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

> PS: any ideas/comments on how UBI/UBIFS handles an interrupted page
> write?

Err, I think these are handled perfectly well. I read your e-mails; they
were a little messy, but I did not find anything UBIFS does not handle. I
sent you a fix for your oops.

Would you please re-formulate your questions clearly in a separate
e-mail, if you still have them?

-- 
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ubifs : corruption after power cut test
  2010-08-22  7:44             ` Artem Bityutskiy
  2010-09-06  8:55               ` Artem Bityutskiy
@ 2010-09-24 15:31               ` Matthieu CASTET
  2010-09-24 16:50                 ` Artem Bityutskiy
  1 sibling, 1 reply; 19+ messages in thread
From: Matthieu CASTET @ 2010-09-24 15:31 UTC (permalink / raw)
  To: dedekind1; +Cc: linux-mtd

Artem Bityutskiy wrote:
> On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
>> I managed to reproduce it with the backtrace [1].
> 
> Matthieu, your work-around patch or something very close should
> certainly be applied to the UBIFS tree, but I still would like to find
> out what exactly happened in your setup.
> 
> I see 2 possibilities:
> 
> 1. An error happened and 'ubifs_garbage_collect()' returned while
> c->gc_lnum was -1. But in this case we should have switched to R/O mode,
> and the master node would not be written. But may be for some reasons we
> did not switch to R/O mode, dunno.
> 
> 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
> 'ubifs_garbage_collect_leb()' directly, which can return while
> c->gc_lnum is -1. And we do not handle this.
> 
> Would you please be patient enough to reproduce the issue once again with
> the following patch, which was created against the latest ubifs-2.6.git, but
> you should be easily able to apply it to your tree.
None of these checks triggered.

Only the dump in ubifs_write_master was printed.


Matthieu

* Re: ubifs : corruption after power cut test
  2010-09-24 15:31               ` Matthieu CASTET
@ 2010-09-24 16:50                 ` Artem Bityutskiy
  0 siblings, 0 replies; 19+ messages in thread
From: Artem Bityutskiy @ 2010-09-24 16:50 UTC (permalink / raw)
  To: Matthieu CASTET; +Cc: linux-mtd

On Fri, 2010-09-24 at 17:31 +0200, Matthieu CASTET wrote:
> Artem Bityutskiy wrote:
> > On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
> >> I manage to reproduce it with the backtrace [1].
> > 
> > Matthieu, your work-around patch or something very close should
> > certainly be applied to the UBIFS tree, but I still would like to find
> > out what exactly happened in your setup.
> > 
> > I see 2 possibilities:
> > 
> > 1. An error happened and 'ubifs_garbage_collect()' returned while
> > c->gc_lnum was -1. But in this case we should have switched to R/O mode,
> > and the master node would not be written. But may be for some reasons we
> > did not switch to R/O mode, dunno.
> > 
> > 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
> > 'ubifs_garbage_collect_leb()' directly, which can return while
> > c->gc_lnum is -1. And we do not handle this.
> > 
> > Would you please be patient enough to reproduce the issue once again with
> > the following patch, which was created against the latest ubifs-2.6.git, but
> > you should be easily able to apply it to your tree.
> None of these checks triggered.
> 
> Only the dump in ubifs_write_master was printed.

Hmm... This is weird. I think I need your UBIFS code. Is it possible to
share it? You can take vanilla 2.6.27 and put all your UBIFS stuff there,
or send patches against ubifs-v2.6.27.git.

-- 
Best Regards,
Artem Bityutskiy (Битюцкий Артём)
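
For readers following the two hypotheses quoted above, here is a minimal
toy model in C of how each one could leave c->gc_lnum at -1 by the time the
master node is written. It is only a sketch under stated assumptions: the
struct and every toy_* function are hypothetical stand-ins, not the real
UBIFS code, and the return values are invented for illustration. Note that
the thread reports that neither added check triggered, so this illustrates
the reasoning and does not identify the actual cause.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two relevant pieces of UBIFS state.
 * This is NOT the real struct ubifs_info. */
struct toy_fs {
    int gc_lnum;   /* LEB reserved for garbage collection, -1 if none */
    bool ro_mode;  /* true once an error has forced read-only mode */
};

/* Hypothesis 1: GC hits an error after consuming the reserved LEB.
 * The expectation is that the file system then switches to R/O mode. */
static int toy_gc_fail_ro(struct toy_fs *c)
{
    c->gc_lnum = -1;
    c->ro_mode = true;
    return -1;
}

/* Hypothesis 2: the recovery path calls the per-LEB collector directly.
 * It returns with gc_lnum still -1 and the caller does not handle that. */
static int toy_gc_leb_direct(struct toy_fs *c)
{
    c->gc_lnum = -1;
    return 0;
}

/* Toy master write (return values are made up for this sketch):
 *   -1  refused because the file system is read-only
 *    1  written, but with an invalid gc_lnum (the case that gets dumped)
 *    0  written normally */
static int toy_write_master(const struct toy_fs *c)
{
    if (c->ro_mode)
        return -1;
    if (c->gc_lnum == -1) {
        fprintf(stderr, "toy: master written with gc_lnum == -1\n");
        return 1;
    }
    return 0;
}
```

In this model, hypothesis 1 never reaches the bad master write as long as
the switch to R/O mode really happens, which is why the second hypothesis
was described as the more likely one.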

end of thread, other threads:[~2010-09-24 16:50 UTC | newest]

Thread overview: 19+ messages
2010-07-07 12:04 ubifs : corruption after power cut test Matthieu CASTET
2010-07-13  7:27 ` Matthieu CASTET
2010-07-13  8:43   ` Matthieu CASTET
2010-07-13  9:24     ` Matthieu CASTET
2010-07-13 14:24       ` Artem Bityutskiy
2010-07-13 15:10         ` Matthieu CASTET
2010-07-28  7:40           ` Matthieu CASTET
2010-08-02  9:32             ` Matthieu CASTET
2010-08-04 16:14               ` Artem Bityutskiy
2010-08-22  7:44             ` Artem Bityutskiy
2010-09-06  8:55               ` Artem Bityutskiy
2010-09-09  9:22                 ` Matthieu CASTET
2010-09-09  9:51                   ` Artem Bityutskiy
2010-09-24 15:31               ` Matthieu CASTET
2010-09-24 16:50                 ` Artem Bityutskiy
2010-07-13 11:07 ` Artem Bityutskiy
2010-07-13 12:06   ` Matthieu CASTET
2010-07-13 14:13     ` Artem Bityutskiy
2010-07-13 14:33     ` Artem Bityutskiy