* UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 @ 2019-03-21 10:02 Lars Persson 2019-03-25 10:21 ` Lars Persson 2019-03-26 9:21 ` Lars Persson 0 siblings, 2 replies; 12+ messages in thread From: Lars Persson @ 2019-03-21 10:02 UTC (permalink / raw) To: linux-mtd Hi We have recently encountered multiple cases of corrupted UBIFS volumes that were triggered by a power-cut during startup. It may be a regression in the 4.14 stable branch. The symptom is seen when accessing a file in the corrupted FS: UBIFS error (ubi0:20 pid 513): ubifs_iget: failed to read inode 348, error -2 UBIFS error (ubi0:20 pid 513): ubifs_lookup: dead directory entry 'tampering.conf', error -2 UBIFS warning (ubi0:20 pid 513): ubifs_ro_mode.part.0: switched to read-only mode, error -2 [<80506a51>] (dump_stack) from [<80297e2f>] (ubifs_lookup+0x29b/0x300) [<80297e2f>] (ubifs_lookup) from [<80226169>] (lookup_slow+0x69/0xe4) We enabled chk_fs and reproduced it: UBIFS error (ubi0:20 pid 120): read_add_inode: inode 352 not found in index UBIFS error (ubi0:20 pid 120): check_leaf: error -2 while processing entry node and trying to find inode node 352 UBIFS (ubi0:20): dump of node at LEB 29:94656 magic 0x6101831 crc 0xda5d6bee node_type 2 (direntry node) group_type 1 (in node group) sqnum 4079 len 70 key (66, direntry, 0x4974b0f) inum 352 type 0 nlen 13 name emotiond.conf UBIFS error (ubi0:20 pid 120): dbg_walk_index: leaf checking function returned error -2, for leaf at LEB 29:94656 UBIFS (ubi0:20): dump of znode at LEB 37:57680 znode be0bf000, LEB 37:57680 len 128 parent be0af800 iip 1 level 0 child_cnt 5 flags 0 zbranches: 0: LNC (null) LEB 32:53248 len 81 key (66, direntry, 0x363e97f) 1: LNC (null) LEB 29:94208 len 63 key (66, direntry, 0x3f88c71) 2: LNC (null) LEB 12:92160 len 74 key (66, direntry, 0x477308e) 3: LNC (null) LEB 29:94656 len 70 key (66, direntry, 0x4974b0f) 4: LNC be084600 LEB 814:49152 len 63 key (66, direntry, 0x49bb6f7) UBIFS error (ubi0:20 pid 120): dbg_check_filesystem: file-system check failed with error -2 The setup of our system is: An overlayfs stack for /etc with: - The lower file-system is a read-only squashfs on ubi - The upper file-system is an ubifs The ubi partition resides on an SLC NAND from Toshiba (TH58NVG2S3HBAI4). The missing inode is always triggered on those two particular files emotiond.conf and tampering.conf that share the same write pattern at startup: cp file file.tmp echo some data > file.tmp mv file.tmp file fsync file Do not ask me about the logic of this script :> It overwrites the result of cp and does not implement a proper atomic move.. Anyway the end result must not be a file-system that is corrupt and mounts as RO. BR, Lars ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-03-21 10:02 UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 Lars Persson @ 2019-03-25 10:21 ` Lars Persson 2019-03-26 9:21 ` Lars Persson 1 sibling, 0 replies; 12+ messages in thread From: Lars Persson @ 2019-03-25 10:21 UTC (permalink / raw) To: linux-mtd On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote: > > Hi > > We have recently encountered multiple cases of corrupted UBIFS volumes that > were triggered by a power-cut during startup. It may be a regression in the > 4.14 stable branch. > > The symptom is seen when accessing a file in the corrupted FS: > UBIFS error (ubi0:20 pid 513): ubifs_iget: failed to read inode 348, error -2 > UBIFS error (ubi0:20 pid 513): ubifs_lookup: dead directory entry > 'tampering.conf', error -2 > UBIFS warning (ubi0:20 pid 513): ubifs_ro_mode.part.0: switched to > read-only mode, error -2 > [<80506a51>] (dump_stack) from [<80297e2f>] (ubifs_lookup+0x29b/0x300) > [<80297e2f>] (ubifs_lookup) from [<80226169>] (lookup_slow+0x69/0xe4) > > We enabled chk_fs and reproduced it: > UBIFS error (ubi0:20 pid 120): read_add_inode: inode 352 not found in index > UBIFS error (ubi0:20 pid 120): check_leaf: error -2 while processing > entry node and trying to find inode node 352 > UBIFS (ubi0:20): dump of node at LEB 29:94656 > magic 0x6101831 > crc 0xda5d6bee > node_type 2 (direntry node) > group_type 1 (in node group) > sqnum 4079 > len 70 > key (66, direntry, 0x4974b0f) > inum 352 > type 0 > nlen 13 > name emotiond.conf > UBIFS error (ubi0:20 pid 120): dbg_walk_index: leaf checking > function returned error -2, for leaf at LEB 29:94656 > UBIFS (ubi0:20): dump of znode at LEB 37:57680 > znode be0bf000, LEB 37:57680 len 128 parent be0af800 iip 1 level 0 > child_cnt 5 flags 0 > zbranches: > 0: LNC (null) LEB 32:53248 len 81 key (66, direntry, 0x363e97f) > 1: LNC (null) LEB 29:94208 len 63 key (66, direntry, 0x3f88c71) > 2: LNC (null) LEB 12:92160 len 74 key (66, direntry, 0x477308e) > 3: LNC (null) LEB 29:94656 len 70 key (66, direntry, 0x4974b0f) > 4: LNC be084600 LEB 814:49152 len 63 key (66, direntry, 0x49bb6f7) > UBIFS error (ubi0:20 pid 120): dbg_check_filesystem: file-system > check failed with error -2 > > The setup of our system is: > An overlayfs stack for /etc with: > - The lower file-system is a read-only squashfs on ubi > - The upper file-system is an ubifs > > The ubi partition resides on an SLC NAND from Toshiba (TH58NVG2S3HBAI4). > > The missing inode is always triggered on those two particular files > emotiond.conf and tampering.conf that share the same write pattern at startup: > > cp file file.tmp > echo some data > file.tmp > mv file.tmp file > fsync file > > Do not ask me about the logic of this script :> It overwrites the result of cp > and does not implement a proper atomic move.. Anyway the end result must not be > a file-system that is corrupt and mounts as RO. > Further debug information with dbg_rcvry, dbg_mnt, dbg_jnl and dbg_jnlk prints enabled. Let me know if additional information is needed. The recovery code is dropping the inode, claiming it is orphaned. UBIFS DBG rcvry (pid 103): checking index head at 37:92160 UBIFS DBG rcvry (pid 103): checking LPT head at 7:18432 UBIFS DBG mnt (pid 103): start replaying the journal UBIFS DBG mnt (pid 103): replay log LEB 3:0 UBIFS DBG mnt (pid 103): commit start sqnum 7367 UBIFS DBG mnt (pid 103): add replay bud LEB 831:110592, head 1 UBIFS DBG mnt (pid 103): add replay bud LEB 823:49152, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 828:124928, head 1 UBIFS DBG mnt (pid 103): add replay bud LEB 822:0, head 1 UBIFS DBG mnt (pid 103): add replay bud LEB 821:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 820:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 819:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 818:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 817:0, head 1 UBIFS DBG mnt (pid 103): add replay bud LEB 816:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 815:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 814:0, head 1 UBIFS DBG mnt (pid 103): add replay bud LEB 813:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 812:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 811:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 810:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 809:0, head 2 UBIFS DBG mnt (pid 103): add replay bud LEB 808:0, head 1 UBIFS DBG mnt (pid 103): replay log LEB 4:0 UBIFS DBG mnt (pid 103): replay bud LEB 831, head 1, offs 110592, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 831 replied: dirty 12608, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 823, head 2, offs 49152, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 823 replied: dirty 20376, free 2048 UBIFS DBG mnt (pid 103): replay bud LEB 828, head 1, offs 124928, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 828 replied: dirty 1488, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 822, head 1, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 822 replied: dirty 93248, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 821, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 821 replied: dirty 608, free 2048 UBIFS DBG mnt (pid 103): replay bud LEB 820, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 820 replied: dirty 608, free 2048 UBIFS DBG mnt (pid 103): replay bud LEB 819, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 819 replied: dirty 10736, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 818, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 818 replied: dirty 27608, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 817, head 1, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 817 replied: dirty 90896, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 816, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 816 replied: dirty 20456, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 815, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 815 replied: dirty 8352, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 814, head 1, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 814 replied: dirty 87888, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 813, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 813 replied: dirty 1864, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 812, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 812 replied: dirty 632, free 2048 UBIFS DBG mnt (pid 103): replay bud LEB 811, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 811 replied: dirty 800, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 810, head 2, offs 0, is_last 0 UBIFS DBG mnt (pid 103): bud LEB 810 replied: dirty 24776, free 0 UBIFS DBG mnt (pid 103): replay bud LEB 809, head 2, offs 0, is_last 1 UBIFS DBG rcvry (pid 103): 809:0, jhead 2, grouped 1 UBIFS DBG rcvry (pid 103): found corruption (-1) at 809:26624 UBIFS DBG rcvry (pid 103): cleaning corruption at 809:26624 UBIFS DBG rcvry (pid 103): fixing LEB 809 start 0 endpt 26624 UBIFS DBG mnt (pid 103): bud LEB 809 replied: dirty 9592, free 100352 UBIFS DBG mnt (pid 103): replay bud LEB 808, head 1, offs 0, is_last 1 UBIFS DBG rcvry (pid 103): 808:0, jhead 1, grouped 1 UBIFS DBG rcvry (pid 103): found corruption (-1) at 808:53248 UBIFS DBG rcvry (pid 103): cleaning corruption at 808:53248 UBIFS DBG rcvry (pid 103): fixing LEB 808 start 0 endpt 53248 UBIFS DBG mnt (pid 103): bud LEB 808 replied: dirty 19608, free 73728 UBIFS DBG mnt (pid 103): bud LEB 822 was GC'd (126976 free, 30920 dirty) UBIFS DBG mnt (pid 103): LEB 822 lp: 126976 free 30920 dirty replay: 0 free 93248 dirty UBIFS DBG mnt (pid 103): bud LEB 818 was GC'd (126976 free, 336 dirty) UBIFS DBG mnt (pid 103): LEB 818 lp: 126976 free 336 dirty replay: 0 free 27608 dirty UBIFS DBG mnt (pid 103): bud LEB 817 was GC'd (126976 free, 30344 dirty) UBIFS DBG mnt (pid 103): LEB 817 lp: 126976 free 30344 dirty replay: 0 free 90896 dirty UBIFS DBG mnt (pid 103): bud LEB 815 was GC'd (126976 free, 760 dirty) UBIFS DBG mnt (pid 103): LEB 815 lp: 126976 free 760 dirty replay: 0 free 8352 dirty UBIFS DBG mnt (pid 103): bud LEB 814 was GC'd (126976 free, 31152 dirty) UBIFS DBG mnt (pid 103): LEB 814 lp: 126976 free 31152 dirty replay: 0 free 87888 dirty UBIFS DBG mnt (pid 103): bud LEB 811 was GC'd (126976 free, 400 dirty) UBIFS DBG mnt (pid 103): LEB 811 lp: 126976 free 400 dirty replay: 0 free 800 dirty UBIFS DBG mnt (pid 103): bud LEB 810 was GC'd (126976 free, 4336 dirty) UBIFS DBG mnt (pid 103): LEB 810 lp: 126976 free 4336 dirty replay: 0 free 24776 dirty UBIFS DBG mnt (pid 103): bud LEB 808 was GC'd (126976 free, 5000 dirty) UBIFS DBG mnt (pid 103): LEB 808 lp: 126976 free 5000 dirty replay: 73728 free 19608 dirty UBIFS DBG mnt (pid 103): finished, log head LEB 3:34816, max_sqnum 9265, highest_inum 721 UBIFS DBG rcvry (pid 103): LEB 9 UBIFS DBG rcvry (pid 103): deleting orphaned inode 218 UBIFS DBG mnt (pid 103): ino 218, new 0, tot 1 UBIFS DBG rcvry (pid 103): last orph node for commit 15 at 9:0 UBIFS DBG rcvry (pid 103): deleting orphaned inode 237 UBIFS DBG mnt (pid 103): ino 237, new 0, tot 2 UBIFS DBG rcvry (pid 103): last orph node for commit 17 at 9:2048 UBIFS DBG rcvry (pid 103): deleting orphaned inode 247 UBIFS DBG mnt (pid 103): ino 247, new 0, tot 3 UBIFS DBG rcvry (pid 103): last orph node for commit 19 at 9:4096 UBIFS DBG rcvry (pid 103): deleting orphaned inode 250 UBIFS DBG mnt (pid 103): ino 250, new 0, tot 4 UBIFS DBG rcvry (pid 103): last orph node for commit 21 at 9:6144 UBIFS DBG rcvry (pid 103): deleting orphaned inode 253 UBIFS DBG mnt (pid 103): ino 253, new 0, tot 5 UBIFS DBG rcvry (pid 103): last orph node for commit 23 at 9:8192 UBIFS DBG rcvry (pid 103): deleting orphaned inode 256 UBIFS DBG mnt (pid 103): ino 256, new 0, tot 6 UBIFS DBG rcvry (pid 103): last orph node for commit 25 at 9:10240 UBIFS DBG rcvry (pid 103): deleting orphaned inode 259 UBIFS DBG mnt (pid 103): ino 259, new 0, tot 7 UBIFS DBG rcvry (pid 103): last orph node for commit 27 at 9:12288 UBIFS DBG rcvry (pid 103): deleting orphaned inode 262 UBIFS DBG mnt (pid 103): ino 262, new 0, tot 8 UBIFS DBG rcvry (pid 103): last orph node for commit 29 at 9:14336 UBIFS DBG rcvry (pid 103): deleting orphaned inode 265 UBIFS DBG mnt (pid 103): ino 265, new 0, tot 9 UBIFS DBG rcvry (pid 103): last orph node for commit 31 at 9:16384 UBIFS DBG rcvry (pid 103): deleting orphaned inode 268 UBIFS DBG mnt (pid 103): ino 268, new 0, tot 10 UBIFS DBG rcvry (pid 103): last orph node for commit 33 at 9:18432 UBIFS DBG rcvry (pid 103): deleting orphaned inode 272 UBIFS DBG mnt (pid 103): ino 272, new 0, tot 11 UBIFS DBG rcvry (pid 103): last orph node for commit 35 at 9:20480 UBIFS DBG rcvry (pid 103): deleting orphaned inode 275 UBIFS DBG mnt (pid 103): ino 275, new 0, tot 12 UBIFS DBG rcvry (pid 103): last orph node for commit 37 at 9:22528 UBIFS DBG rcvry (pid 103): deleting orphaned inode 278 UBIFS DBG mnt (pid 103): ino 278, new 0, tot 13 UBIFS DBG rcvry (pid 103): last orph node for commit 39 at 9:24576 UBIFS DBG rcvry (pid 103): deleting orphaned inode 281 UBIFS DBG mnt (pid 103): ino 281, new 0, tot 14 UBIFS DBG rcvry (pid 103): last orph node for commit 41 at 9:26624 UBIFS DBG rcvry (pid 103): deleting orphaned inode 284 UBIFS DBG mnt (pid 103): ino 284, new 0, tot 15 UBIFS DBG rcvry (pid 103): last orph node for commit 43 at 9:28672 UBIFS DBG rcvry (pid 103): deleting orphaned inode 287 UBIFS DBG mnt (pid 103): ino 287, new 0, tot 16 UBIFS DBG rcvry (pid 103): last orph node for commit 45 at 9:30720 UBIFS DBG rcvry (pid 103): deleting orphaned inode 290 UBIFS DBG mnt (pid 103): ino 290, new 0, tot 17 UBIFS DBG rcvry (pid 103): last orph node for commit 47 at 9:32768 UBIFS DBG rcvry (pid 103): deleting orphaned inode 293 UBIFS DBG mnt (pid 103): ino 293, new 0, tot 18 UBIFS DBG rcvry (pid 103): last orph node for commit 49 at 9:34816 UBIFS DBG rcvry (pid 103): deleting orphaned inode 296 UBIFS DBG mnt (pid 103): ino 296, new 0, tot 19 UBIFS DBG rcvry (pid 103): last orph node for commit 51 at 9:36864 UBIFS DBG rcvry (pid 103): deleting orphaned inode 299 UBIFS DBG mnt (pid 103): ino 299, new 0, tot 20 UBIFS DBG rcvry (pid 103): last orph node for commit 53 at 9:38912 UBIFS DBG rcvry (pid 103): deleting orphaned inode 302 UBIFS DBG mnt (pid 103): ino 302, new 0, tot 21 UBIFS DBG rcvry (pid 103): last orph node for commit 55 at 9:40960 UBIFS DBG rcvry (pid 103): deleting orphaned inode 305 UBIFS DBG mnt (pid 103): ino 305, new 0, tot 22 UBIFS DBG rcvry (pid 103): last orph node for commit 57 at 9:43008 UBIFS DBG rcvry (pid 103): deleting orphaned inode 308 UBIFS DBG mnt (pid 103): ino 308, new 0, tot 23 UBIFS DBG rcvry (pid 103): last orph node for commit 59 at 9:45056 UBIFS DBG rcvry (pid 103): deleting orphaned inode 311 UBIFS DBG mnt (pid 103): ino 311, new 0, tot 24 UBIFS DBG rcvry (pid 103): last orph node for commit 61 at 9:47104 UBIFS DBG rcvry (pid 103): deleting orphaned inode 314 UBIFS DBG mnt (pid 103): ino 314, new 0, tot 25 UBIFS DBG rcvry (pid 103): last orph node for commit 63 at 9:49152 UBIFS DBG rcvry (pid 103): deleting orphaned inode 317 UBIFS DBG mnt (pid 103): ino 317, new 0, tot 26 UBIFS DBG rcvry (pid 103): last orph node for commit 65 at 9:51200 UBIFS DBG rcvry (pid 103): deleting orphaned inode 320 UBIFS DBG mnt (pid 103): ino 320, new 0, tot 27 UBIFS DBG rcvry (pid 103): last orph node for commit 68 at 9:53248 UBIFS DBG rcvry (pid 103): deleting orphaned inode 349 UBIFS DBG mnt (pid 103): ino 349, new 0, tot 28 UBIFS DBG rcvry (pid 103): last orph node for commit 72 at 9:55296 UBIFS DBG rcvry (pid 103): deleting orphaned inode 466 UBIFS DBG mnt (pid 103): ino 466, new 0, tot 29 UBIFS DBG rcvry (pid 103): last orph node for commit 127 at 9:57344 UBIFS DBG rcvry (pid 103): LEB 10 UBIFS DBG rcvry (pid 103): GC head LEB -1, offs -1 UBIFS DBG rcvry (pid 103): found empty LEB 807, run commit UBIFS error (ubi0:20 pid 103): read_add_inode: inode 349 not found in index UBIFS error (ubi0:20 pid 103): check_leaf: error -2 while processing entry node and trying to find inode node 349 magic 0x6101831 crc 0xba9856a7 node_type 2 (direntry node) group_type 1 (in node group) sqnum 4115 len 70 key (66, direntry, 0x4974b0f) inum 349 type 0 nlen 13 name emotiond.conf UBIFS error (ubi0:20 pid 103): dbg_walk_index: leaf checking function returned error -2, for leaf at LEB 29:90112 znode be091400, LEB 37:92496 len 128 parent be089600 iip 1 level 0 child_cnt 5 flags 0 zbranches: 0: LNC (null) LEB 32:24576 len 81 key (66, direntry, 0x363e97f) 1: LNC (null) LEB 29:100352 len 63 key (66, direntry, 0x3f88c71) 2: LNC (null) LEB 12:94208 len 74 key (66, direntry, 0x477308e) 3: LNC (null) LEB 29:90112 len 70 key (66, direntry, 0x4974b0f) 4: LNC be0590c0 LEB 814:73728 len 63 key (66, direntry, 0x49bb6f7) UBIFS error (ubi0:20 pid 103): dbg_check_filesystem: file-system check failed with error -2 ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-03-21 10:02 UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 Lars Persson 2019-03-25 10:21 ` Lars Persson @ 2019-03-26 9:21 ` Lars Persson 2019-03-28 11:54 ` Lars Persson 1 sibling, 1 reply; 12+ messages in thread From: Lars Persson @ 2019-03-26 9:21 UTC (permalink / raw) To: linux-mtd On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote: > > The missing inode is always triggered on those two particular files > emotiond.conf and tampering.conf that share the same write pattern at startup: > > cp file file.tmp > echo some data > file.tmp > mv file.tmp file > fsync file > A closer look at that script shows that the atomic file update part was bypassed. It only performs a chown operation on that file. That simple chown will trigger an overlayfs copyup operation. With a proper unmount after the system has finished writing to /etc I cannot reproduce any corruption. An unclean shutdown triggers corruption in about 1/3 of the attempts, also when I cut power tens of seconds after the problematic script has executed. ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-03-26 9:21 ` Lars Persson @ 2019-03-28 11:54 ` Lars Persson 2019-03-28 17:31 ` Richard Weinberger 0 siblings, 1 reply; 12+ messages in thread From: Lars Persson @ 2019-03-28 11:54 UTC (permalink / raw) To: linux-mtd On Tue, Mar 26, 2019 at 10:21 AM Lars Persson <lists@bofh.nu> wrote: > > On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote: > > > > The missing inode is always triggered on those two particular files > > emotiond.conf and tampering.conf that share the same write pattern at startup: > > > > cp file file.tmp > > echo some data > file.tmp > > mv file.tmp file > > fsync file > > > > A closer look at that script shows that the atomic file update part > was bypassed. It only performs a chown operation on that file. > That simple chown will trigger an overlayfs copyup operation. > > With a proper unmount after the system has finished writing to /etc I > cannot reproduce any corruption. An unclean shutdown triggers > corruption in about 1/3 of the attempts, also when I cut power tens of > seconds after the problematic script has executed. OK I think I have found the bug. Commit 32fe905c17f0 ("Fix O_TMPFILE corner case in ubifs_link()") tried to fix the use of O_TMPFILE by removing an inode from the orphan list when it is relinked. Unfortunately the removal never gets committed to flash because the removal of an orphan does not trigger a commit. The condition for committing is that new_orphans > 0, but in our case it is zero after the call to ubifs_delete_orphan(). ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-03-28 11:54 ` Lars Persson @ 2019-03-28 17:31 ` Richard Weinberger 2019-03-29 9:43 ` Lars Persson 0 siblings, 1 reply; 12+ messages in thread From: Richard Weinberger @ 2019-03-28 17:31 UTC (permalink / raw) To: Lars Persson; +Cc: linux-mtd On Thu, Mar 28, 2019 at 12:54 PM Lars Persson <lists@bofh.nu> wrote: > > On Tue, Mar 26, 2019 at 10:21 AM Lars Persson <lists@bofh.nu> wrote: > > > > On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote: > > > > > > The missing inode is always triggered on those two particular files > > > emotiond.conf and tampering.conf that share the same write pattern at startup: > > > > > > cp file file.tmp > > > echo some data > file.tmp > > > mv file.tmp file > > > fsync file > > > > > > > A closer look at that script shows that the atomic file update part > > was bypassed. It only performs a chown operation on that file. > > That simple chown will trigger an overlayfs copyup operation. > > > > With a proper unmount after the system has finished writing to /etc I > > cannot reproduce any corruption. An unclean shutdown triggers > > corruption in about 1/3 of the attempts, also when I cut power tens of > > seconds after the problematic script has executed. > > OK I think I have found the bug. Thanks for hunting this down! > Commit 32fe905c17f0 ("Fix O_TMPFILE corner case in ubifs_link()") > tried to fix the use of O_TMPFILE by removing an inode from the orphan > list when it is relinked. > > Unfortunately the removal never gets committed to flash because the > removal of an orphan does not trigger a commit. The condition for > committing is that new_orphans > 0, but in our case it is zero after > the call to ubifs_delete_orphan(). Can you please share me your corrupted fs? I'm not so sure what is really the root cause. The thing is, ubifs_delete_orphan() does not need to tigger a commit. Ff something depends on a commit, we have another problem, I fear. -- Thanks, //richard ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-03-28 17:31 ` Richard Weinberger @ 2019-03-29 9:43 ` Lars Persson 2019-04-04 21:31 ` Richard Weinberger 2019-04-04 21:53 ` Richard Weinberger 0 siblings, 2 replies; 12+ messages in thread From: Lars Persson @ 2019-03-29 9:43 UTC (permalink / raw) To: Richard Weinberger; +Cc: linux-mtd On Thu, Mar 28, 2019 at 6:31 PM Richard Weinberger <richard.weinberger@gmail.com> wrote: > > Can you please share me your corrupted fs? > I'm not so sure what is really the root cause. > The thing is, ubifs_delete_orphan() does not need to tigger a commit. > Ff something depends on a commit, we have another problem, I fear. > Thanks. The download link for the file-system was shared to your gmail address. BR, Lars ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-03-29 9:43 ` Lars Persson @ 2019-04-04 21:31 ` Richard Weinberger 2019-04-04 21:53 ` Richard Weinberger 1 sibling, 0 replies; 12+ messages in thread From: Richard Weinberger @ 2019-04-04 21:31 UTC (permalink / raw) To: Lars Persson, linux-mtd Am Freitag, 29. März 2019, 10:43:33 CEST schrieb Lars Persson: > On Thu, Mar 28, 2019 at 6:31 PM Richard Weinberger > <richard.weinberger@gmail.com> wrote: > > > > Can you please share me your corrupted fs? > > I'm not so sure what is really the root cause. > > The thing is, ubifs_delete_orphan() does not need to tigger a commit. > > Ff something depends on a commit, we have another problem, I fear. > > > > Thanks. The download link for the file-system was shared to your gmail address. Okay, I see the problem. :-\ But so far I don't see a good solution. /me thinks harder. O_TMPFILE turns out more and more to be a nightmare for UBIFS, it all boils down to the fact that UBIFS does not have transactions, just many small journal updates which interfere with each other. In this case the orphan list has an old entry although the journal did the right thing. Thanks, //richard ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-03-29 9:43 ` Lars Persson 2019-04-04 21:31 ` Richard Weinberger @ 2019-04-04 21:53 ` Richard Weinberger 2019-04-05 13:19 ` Lars Persson 1 sibling, 1 reply; 12+ messages in thread From: Richard Weinberger @ 2019-04-04 21:53 UTC (permalink / raw) To: Lars Persson, linux-mtd Am Freitag, 29. März 2019, 10:43:33 CEST schrieb Lars Persson: > On Thu, Mar 28, 2019 at 6:31 PM Richard Weinberger > <richard.weinberger@gmail.com> wrote: > > > > Can you please share me your corrupted fs? > > I'm not so sure what is really the root cause. > > The thing is, ubifs_delete_orphan() does not need to tigger a commit. > > Ff something depends on a commit, we have another problem, I fear. > > > > Thanks. The download link for the file-system was shared to your gmail address. Can you please run your tests again with this patch applied? It needs further optimization, but we are on the right track. diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c index 2f1618f300fb..c15e484f73a4 100644 --- a/fs/ubifs/orphan.c +++ b/fs/ubifs/orphan.c @@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb, n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3; for (i = 0; i < n; i++) { union ubifs_key key1, key2; + struct ubifs_ino_node *ino; + + ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS); + if (!ino) + return -ENOMEM; inum = le64_to_cpu(orph->inos[i]); + ino_key_init(c, &key1, inum); + err = ubifs_tnc_lookup(c, &key1, ino); + if (!err && ino->nlink) { + kfree(ino); + continue; + } + kfree(ino); + dbg_rcvry("deleting orphaned inode %lu", (unsigned long)inum); Thanks, //richard ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-04-04 21:53 ` Richard Weinberger @ 2019-04-05 13:19 ` Lars Persson 2019-04-05 13:23 ` Richard Weinberger 0 siblings, 1 reply; 12+ messages in thread From: Lars Persson @ 2019-04-05 13:19 UTC (permalink / raw) To: Richard Weinberger; +Cc: linux-mtd On Thu, Apr 4, 2019 at 11:53 PM Richard Weinberger <richard@sigma-star.at> wrote: > Can you please run your tests again with this patch applied? > It needs further optimization, but we are on the right track. > > diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c > index 2f1618f300fb..c15e484f73a4 100644 > --- a/fs/ubifs/orphan.c > +++ b/fs/ubifs/orphan.c > @@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb, > n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3; > for (i = 0; i < n; i++) { > union ubifs_key key1, key2; > + struct ubifs_ino_node *ino; > + > + ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS); > + if (!ino) > + return -ENOMEM; > > inum = le64_to_cpu(orph->inos[i]); > + ino_key_init(c, &key1, inum); > + err = ubifs_tnc_lookup(c, &key1, ino); > + if (!err && ino->nlink) { > + kfree(ino); > + continue; > + } > + kfree(ino); > + > dbg_rcvry("deleting orphaned inode %lu", > (unsigned long)inum); > Thanks Richard. The power-cut test is OK now with this change. I run it with chk_fs=1 and it passes every time. ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-04-05 13:19 ` Lars Persson @ 2019-04-05 13:23 ` Richard Weinberger 2019-07-10 7:14 ` Lars Persson 0 siblings, 1 reply; 12+ messages in thread From: Richard Weinberger @ 2019-04-05 13:23 UTC (permalink / raw) To: Lars Persson; +Cc: linux-mtd Am Freitag, 5. April 2019, 15:19:31 CEST schrieb Lars Persson: > On Thu, Apr 4, 2019 at 11:53 PM Richard Weinberger > <richard@sigma-star.at> wrote: > > Can you please run your tests again with this patch applied? > > It needs further optimization, but we are on the right track. > > > > diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c > > index 2f1618f300fb..c15e484f73a4 100644 > > --- a/fs/ubifs/orphan.c > > +++ b/fs/ubifs/orphan.c > > @@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb, > > n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3; > > for (i = 0; i < n; i++) { > > union ubifs_key key1, key2; > > + struct ubifs_ino_node *ino; > > + > > + ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS); > > + if (!ino) > > + return -ENOMEM; > > > > inum = le64_to_cpu(orph->inos[i]); > > + ino_key_init(c, &key1, inum); > > + err = ubifs_tnc_lookup(c, &key1, ino); > > + if (!err && ino->nlink) { > > + kfree(ino); > > + continue; > > + } > > + kfree(ino); > > + > > dbg_rcvry("deleting orphaned inode %lu", > > (unsigned long)inum); > > > > Thanks Richard. The power-cut test is OK now with this change. I run > it with chk_fs=1 and it passes every time. Cool, please let it run for some more time, just to be sure. I will also do more tests on my side and optimize the patch. Thanks, //richard ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-04-05 13:23 ` Richard Weinberger @ 2019-07-10 7:14 ` Lars Persson 2019-07-10 9:02 ` Richard Weinberger 0 siblings, 1 reply; 12+ messages in thread From: Lars Persson @ 2019-07-10 7:14 UTC (permalink / raw) To: Richard Weinberger; +Cc: linux-mtd On Fri, Apr 5, 2019 at 3:23 PM Richard Weinberger <richard@nod.at> wrote: > > Am Freitag, 5. April 2019, 15:19:31 CEST schrieb Lars Persson: > > On Thu, Apr 4, 2019 at 11:53 PM Richard Weinberger > > <richard@sigma-star.at> wrote: > > > Can you please run your tests again with this patch applied? > > > It needs further optimization, but we are on the right track. > > > > > > diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c > > > index 2f1618f300fb..c15e484f73a4 100644 > > > --- a/fs/ubifs/orphan.c > > > +++ b/fs/ubifs/orphan.c > > > @@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb, > > > n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3; > > > for (i = 0; i < n; i++) { > > > union ubifs_key key1, key2; > > > + struct ubifs_ino_node *ino; > > > + > > > + ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS); > > > + if (!ino) > > > + return -ENOMEM; > > > > > > inum = le64_to_cpu(orph->inos[i]); > > > + ino_key_init(c, &key1, inum); > > > + err = ubifs_tnc_lookup(c, &key1, ino); > > > + if (!err && ino->nlink) { > > > + kfree(ino); > > > + continue; > > > + } > > > + kfree(ino); > > > + > > > dbg_rcvry("deleting orphaned inode %lu", > > > (unsigned long)inum); > > > > > > > Thanks Richard. The power-cut test is OK now with this change. I run > > it with chk_fs=1 and it passes every time. > > Cool, please let it run for some more time, just to be sure. > I will also do more tests on my side and optimize the patch. > > Thanks, > //richard > Richard, we tested this patch for two weeks and saw no further issues. We can assist again with testing if you make the optimized patch. Regards, Lars ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 2019-07-10 7:14 ` Lars Persson @ 2019-07-10 9:02 ` Richard Weinberger 0 siblings, 0 replies; 12+ messages in thread From: Richard Weinberger @ 2019-07-10 9:02 UTC (permalink / raw) To: Lars Persson; +Cc: linux-mtd Lars, ----- Ursprüngliche Mail ----- > Richard, we tested this patch for two weeks and saw no further issues. > We can assist again with testing if you make the optimized patch. Cool! The optimized patch is in linux-next. If you could test linux-next it would be awesome. Thanks, //richard ______________________________________________________ Linux MTD discussion mailing list http://lists.infradead.org/mailman/listinfo/linux-mtd/ ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2019-07-10 9:10 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-03-21 10:02 UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 Lars Persson 2019-03-25 10:21 ` Lars Persson 2019-03-26 9:21 ` Lars Persson 2019-03-28 11:54 ` Lars Persson 2019-03-28 17:31 ` Richard Weinberger 2019-03-29 9:43 ` Lars Persson 2019-04-04 21:31 ` Richard Weinberger 2019-04-04 21:53 ` Richard Weinberger 2019-04-05 13:19 ` Lars Persson 2019-04-05 13:23 ` Richard Weinberger 2019-07-10 7:14 ` Lars Persson 2019-07-10 9:02 ` Richard Weinberger
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.