All of lore.kernel.org
 help / color / mirror / Atom feed
* UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
@ 2019-03-21 10:02 Lars Persson
  2019-03-25 10:21 ` Lars Persson
  2019-03-26  9:21 ` Lars Persson
  0 siblings, 2 replies; 12+ messages in thread
From: Lars Persson @ 2019-03-21 10:02 UTC (permalink / raw)
  To: linux-mtd

Hi

We have recently encountered multiple cases of corrupted UBIFS volumes that
were triggered by a power-cut during startup. It may be a regression in the
4.14 stable branch.

The symptom is seen when accessing a file in the corrupted FS:
  UBIFS error (ubi0:20 pid 513): ubifs_iget: failed to read inode 348, error -2
  UBIFS error (ubi0:20 pid 513): ubifs_lookup: dead directory entry
'tampering.conf', error -2
  UBIFS warning (ubi0:20 pid 513): ubifs_ro_mode.part.0: switched to
read-only mode, error -2
  [<80506a51>] (dump_stack) from [<80297e2f>] (ubifs_lookup+0x29b/0x300)
  [<80297e2f>] (ubifs_lookup) from [<80226169>] (lookup_slow+0x69/0xe4)

We enabled chk_fs and reproduced it:
  UBIFS error (ubi0:20 pid 120): read_add_inode: inode 352 not found in index
  UBIFS error (ubi0:20 pid 120): check_leaf: error -2 while processing
entry node and trying to find inode node 352
  UBIFS (ubi0:20): dump of node at LEB 29:94656
        magic          0x6101831
        crc            0xda5d6bee
        node_type      2 (direntry node)
        group_type     1 (in node group)
        sqnum          4079
        len            70
        key            (66, direntry, 0x4974b0f)
        inum           352
        type           0
        nlen           13
        name           emotiond.conf
  UBIFS error (ubi0:20 pid 120): dbg_walk_index: leaf checking
function returned error -2, for leaf at LEB 29:94656
  UBIFS (ubi0:20): dump of znode at LEB 37:57680
  znode be0bf000, LEB 37:57680 len 128 parent be0af800 iip 1 level 0
child_cnt 5 flags 0
  zbranches:
        0: LNC   (null) LEB 32:53248 len 81 key (66, direntry, 0x363e97f)
        1: LNC   (null) LEB 29:94208 len 63 key (66, direntry, 0x3f88c71)
        2: LNC   (null) LEB 12:92160 len 74 key (66, direntry, 0x477308e)
        3: LNC   (null) LEB 29:94656 len 70 key (66, direntry, 0x4974b0f)
        4: LNC be084600 LEB 814:49152 len 63 key (66, direntry, 0x49bb6f7)
  UBIFS error (ubi0:20 pid 120): dbg_check_filesystem: file-system
check failed with error -2

The setup of our system is:
An overlayfs stack for /etc with:
- The lower file-system is a read-only squashfs on ubi
- The upper file-system is an ubifs

The ubi partition resides on an SLC NAND from Toshiba (TH58NVG2S3HBAI4).

The missing inode is always triggered on those two particular files
emotiond.conf and tampering.conf that share the same write pattern at startup:

 cp file file.tmp
 echo some data > file.tmp
 mv file.tmp file
 fsync file

Do not ask me about the logic of this script :> It overwrites the result of cp
and does not implement a proper atomic move.. Anyway the end result must not be
a file-system that is corrupt and mounts as RO.

BR,
 Lars

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-03-21 10:02 UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 Lars Persson
@ 2019-03-25 10:21 ` Lars Persson
  2019-03-26  9:21 ` Lars Persson
  1 sibling, 0 replies; 12+ messages in thread
From: Lars Persson @ 2019-03-25 10:21 UTC (permalink / raw)
  To: linux-mtd

On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote:
>
> Hi
>
> We have recently encountered multiple cases of corrupted UBIFS volumes that
> were triggered by a power-cut during startup. It may be a regression in the
> 4.14 stable branch.
>
> The symptom is seen when accessing a file in the corrupted FS:
>   UBIFS error (ubi0:20 pid 513): ubifs_iget: failed to read inode 348, error -2
>   UBIFS error (ubi0:20 pid 513): ubifs_lookup: dead directory entry
> 'tampering.conf', error -2
>   UBIFS warning (ubi0:20 pid 513): ubifs_ro_mode.part.0: switched to
> read-only mode, error -2
>   [<80506a51>] (dump_stack) from [<80297e2f>] (ubifs_lookup+0x29b/0x300)
>   [<80297e2f>] (ubifs_lookup) from [<80226169>] (lookup_slow+0x69/0xe4)
>
> We enabled chk_fs and reproduced it:
>   UBIFS error (ubi0:20 pid 120): read_add_inode: inode 352 not found in index
>   UBIFS error (ubi0:20 pid 120): check_leaf: error -2 while processing
> entry node and trying to find inode node 352
>   UBIFS (ubi0:20): dump of node at LEB 29:94656
>         magic          0x6101831
>         crc            0xda5d6bee
>         node_type      2 (direntry node)
>         group_type     1 (in node group)
>         sqnum          4079
>         len            70
>         key            (66, direntry, 0x4974b0f)
>         inum           352
>         type           0
>         nlen           13
>         name           emotiond.conf
>   UBIFS error (ubi0:20 pid 120): dbg_walk_index: leaf checking
> function returned error -2, for leaf at LEB 29:94656
>   UBIFS (ubi0:20): dump of znode at LEB 37:57680
>   znode be0bf000, LEB 37:57680 len 128 parent be0af800 iip 1 level 0
> child_cnt 5 flags 0
>   zbranches:
>         0: LNC   (null) LEB 32:53248 len 81 key (66, direntry, 0x363e97f)
>         1: LNC   (null) LEB 29:94208 len 63 key (66, direntry, 0x3f88c71)
>         2: LNC   (null) LEB 12:92160 len 74 key (66, direntry, 0x477308e)
>         3: LNC   (null) LEB 29:94656 len 70 key (66, direntry, 0x4974b0f)
>         4: LNC be084600 LEB 814:49152 len 63 key (66, direntry, 0x49bb6f7)
>   UBIFS error (ubi0:20 pid 120): dbg_check_filesystem: file-system
> check failed with error -2
>
> The setup of our system is:
> An overlayfs stack for /etc with:
> - The lower file-system is a read-only squashfs on ubi
> - The upper file-system is an ubifs
>
> The ubi partition resides on an SLC NAND from Toshiba (TH58NVG2S3HBAI4).
>
> The missing inode is always triggered on those two particular files
> emotiond.conf and tampering.conf that share the same write pattern at startup:
>
>  cp file file.tmp
>  echo some data > file.tmp
>  mv file.tmp file
>  fsync file
>
> Do not ask me about the logic of this script :> It overwrites the result of cp
> and does not implement a proper atomic move.. Anyway the end result must not be
> a file-system that is corrupt and mounts as RO.
>

Further debug information with dbg_rcvry, dbg_mnt, dbg_jnl and
dbg_jnlk prints enabled.
Let me know if additional information is needed.

The recovery code is dropping the inode, claiming it is orphaned.


UBIFS DBG rcvry (pid 103): checking index head at 37:92160
UBIFS DBG rcvry (pid 103): checking LPT head at 7:18432
UBIFS DBG mnt (pid 103): start replaying the journal
UBIFS DBG mnt (pid 103): replay log LEB 3:0
UBIFS DBG mnt (pid 103): commit start sqnum 7367
UBIFS DBG mnt (pid 103): add replay bud LEB 831:110592, head 1
UBIFS DBG mnt (pid 103): add replay bud LEB 823:49152, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 828:124928, head 1
UBIFS DBG mnt (pid 103): add replay bud LEB 822:0, head 1
UBIFS DBG mnt (pid 103): add replay bud LEB 821:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 820:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 819:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 818:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 817:0, head 1
UBIFS DBG mnt (pid 103): add replay bud LEB 816:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 815:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 814:0, head 1
UBIFS DBG mnt (pid 103): add replay bud LEB 813:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 812:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 811:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 810:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 809:0, head 2
UBIFS DBG mnt (pid 103): add replay bud LEB 808:0, head 1
UBIFS DBG mnt (pid 103): replay log LEB 4:0
UBIFS DBG mnt (pid 103): replay bud LEB 831, head 1, offs 110592, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 831 replied: dirty 12608, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 823, head 2, offs 49152, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 823 replied: dirty 20376, free 2048
UBIFS DBG mnt (pid 103): replay bud LEB 828, head 1, offs 124928, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 828 replied: dirty 1488, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 822, head 1, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 822 replied: dirty 93248, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 821, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 821 replied: dirty 608, free 2048
UBIFS DBG mnt (pid 103): replay bud LEB 820, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 820 replied: dirty 608, free 2048
UBIFS DBG mnt (pid 103): replay bud LEB 819, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 819 replied: dirty 10736, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 818, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 818 replied: dirty 27608, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 817, head 1, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 817 replied: dirty 90896, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 816, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 816 replied: dirty 20456, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 815, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 815 replied: dirty 8352, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 814, head 1, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 814 replied: dirty 87888, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 813, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 813 replied: dirty 1864, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 812, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 812 replied: dirty 632, free 2048
UBIFS DBG mnt (pid 103): replay bud LEB 811, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 811 replied: dirty 800, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 810, head 2, offs 0, is_last 0
UBIFS DBG mnt (pid 103): bud LEB 810 replied: dirty 24776, free 0
UBIFS DBG mnt (pid 103): replay bud LEB 809, head 2, offs 0, is_last 1
UBIFS DBG rcvry (pid 103): 809:0, jhead 2, grouped 1
UBIFS DBG rcvry (pid 103): found corruption (-1) at 809:26624
UBIFS DBG rcvry (pid 103): cleaning corruption at 809:26624
UBIFS DBG rcvry (pid 103): fixing LEB 809 start 0 endpt 26624
UBIFS DBG mnt (pid 103): bud LEB 809 replied: dirty 9592, free 100352
UBIFS DBG mnt (pid 103): replay bud LEB 808, head 1, offs 0, is_last 1
UBIFS DBG rcvry (pid 103): 808:0, jhead 1, grouped 1
UBIFS DBG rcvry (pid 103): found corruption (-1) at 808:53248
UBIFS DBG rcvry (pid 103): cleaning corruption at 808:53248
UBIFS DBG rcvry (pid 103): fixing LEB 808 start 0 endpt 53248
UBIFS DBG mnt (pid 103): bud LEB 808 replied: dirty 19608, free 73728
UBIFS DBG mnt (pid 103): bud LEB 822 was GC'd (126976 free, 30920 dirty)
UBIFS DBG mnt (pid 103): LEB 822 lp: 126976 free 30920 dirty replay: 0
free 93248 dirty
UBIFS DBG mnt (pid 103): bud LEB 818 was GC'd (126976 free, 336 dirty)
UBIFS DBG mnt (pid 103): LEB 818 lp: 126976 free 336 dirty replay: 0
free 27608 dirty
UBIFS DBG mnt (pid 103): bud LEB 817 was GC'd (126976 free, 30344 dirty)
UBIFS DBG mnt (pid 103): LEB 817 lp: 126976 free 30344 dirty replay: 0
free 90896 dirty
UBIFS DBG mnt (pid 103): bud LEB 815 was GC'd (126976 free, 760 dirty)
UBIFS DBG mnt (pid 103): LEB 815 lp: 126976 free 760 dirty replay: 0
free 8352 dirty
UBIFS DBG mnt (pid 103): bud LEB 814 was GC'd (126976 free, 31152 dirty)
UBIFS DBG mnt (pid 103): LEB 814 lp: 126976 free 31152 dirty replay: 0
free 87888 dirty
UBIFS DBG mnt (pid 103): bud LEB 811 was GC'd (126976 free, 400 dirty)
UBIFS DBG mnt (pid 103): LEB 811 lp: 126976 free 400 dirty replay: 0
free 800 dirty
UBIFS DBG mnt (pid 103): bud LEB 810 was GC'd (126976 free, 4336 dirty)
UBIFS DBG mnt (pid 103): LEB 810 lp: 126976 free 4336 dirty replay: 0
free 24776 dirty
UBIFS DBG mnt (pid 103): bud LEB 808 was GC'd (126976 free, 5000 dirty)
UBIFS DBG mnt (pid 103): LEB 808 lp: 126976 free 5000 dirty replay:
73728 free 19608 dirty
UBIFS DBG mnt (pid 103): finished, log head LEB 3:34816, max_sqnum
9265, highest_inum 721
UBIFS DBG rcvry (pid 103): LEB 9
UBIFS DBG rcvry (pid 103): deleting orphaned inode 218
UBIFS DBG mnt (pid 103): ino 218, new 0, tot 1
UBIFS DBG rcvry (pid 103): last orph node for commit 15 at 9:0
UBIFS DBG rcvry (pid 103): deleting orphaned inode 237
UBIFS DBG mnt (pid 103): ino 237, new 0, tot 2
UBIFS DBG rcvry (pid 103): last orph node for commit 17 at 9:2048
UBIFS DBG rcvry (pid 103): deleting orphaned inode 247
UBIFS DBG mnt (pid 103): ino 247, new 0, tot 3
UBIFS DBG rcvry (pid 103): last orph node for commit 19 at 9:4096
UBIFS DBG rcvry (pid 103): deleting orphaned inode 250
UBIFS DBG mnt (pid 103): ino 250, new 0, tot 4
UBIFS DBG rcvry (pid 103): last orph node for commit 21 at 9:6144
UBIFS DBG rcvry (pid 103): deleting orphaned inode 253
UBIFS DBG mnt (pid 103): ino 253, new 0, tot 5
UBIFS DBG rcvry (pid 103): last orph node for commit 23 at 9:8192
UBIFS DBG rcvry (pid 103): deleting orphaned inode 256
UBIFS DBG mnt (pid 103): ino 256, new 0, tot 6
UBIFS DBG rcvry (pid 103): last orph node for commit 25 at 9:10240
UBIFS DBG rcvry (pid 103): deleting orphaned inode 259
UBIFS DBG mnt (pid 103): ino 259, new 0, tot 7
UBIFS DBG rcvry (pid 103): last orph node for commit 27 at 9:12288
UBIFS DBG rcvry (pid 103): deleting orphaned inode 262
UBIFS DBG mnt (pid 103): ino 262, new 0, tot 8
UBIFS DBG rcvry (pid 103): last orph node for commit 29 at 9:14336
UBIFS DBG rcvry (pid 103): deleting orphaned inode 265
UBIFS DBG mnt (pid 103): ino 265, new 0, tot 9
UBIFS DBG rcvry (pid 103): last orph node for commit 31 at 9:16384
UBIFS DBG rcvry (pid 103): deleting orphaned inode 268
UBIFS DBG mnt (pid 103): ino 268, new 0, tot 10
UBIFS DBG rcvry (pid 103): last orph node for commit 33 at 9:18432
UBIFS DBG rcvry (pid 103): deleting orphaned inode 272
UBIFS DBG mnt (pid 103): ino 272, new 0, tot 11
UBIFS DBG rcvry (pid 103): last orph node for commit 35 at 9:20480
UBIFS DBG rcvry (pid 103): deleting orphaned inode 275
UBIFS DBG mnt (pid 103): ino 275, new 0, tot 12
UBIFS DBG rcvry (pid 103): last orph node for commit 37 at 9:22528
UBIFS DBG rcvry (pid 103): deleting orphaned inode 278
UBIFS DBG mnt (pid 103): ino 278, new 0, tot 13
UBIFS DBG rcvry (pid 103): last orph node for commit 39 at 9:24576
UBIFS DBG rcvry (pid 103): deleting orphaned inode 281
UBIFS DBG mnt (pid 103): ino 281, new 0, tot 14
UBIFS DBG rcvry (pid 103): last orph node for commit 41 at 9:26624
UBIFS DBG rcvry (pid 103): deleting orphaned inode 284
UBIFS DBG mnt (pid 103): ino 284, new 0, tot 15
UBIFS DBG rcvry (pid 103): last orph node for commit 43 at 9:28672
UBIFS DBG rcvry (pid 103): deleting orphaned inode 287
UBIFS DBG mnt (pid 103): ino 287, new 0, tot 16
UBIFS DBG rcvry (pid 103): last orph node for commit 45 at 9:30720
UBIFS DBG rcvry (pid 103): deleting orphaned inode 290
UBIFS DBG mnt (pid 103): ino 290, new 0, tot 17
UBIFS DBG rcvry (pid 103): last orph node for commit 47 at 9:32768
UBIFS DBG rcvry (pid 103): deleting orphaned inode 293
UBIFS DBG mnt (pid 103): ino 293, new 0, tot 18
UBIFS DBG rcvry (pid 103): last orph node for commit 49 at 9:34816
UBIFS DBG rcvry (pid 103): deleting orphaned inode 296
UBIFS DBG mnt (pid 103): ino 296, new 0, tot 19
UBIFS DBG rcvry (pid 103): last orph node for commit 51 at 9:36864
UBIFS DBG rcvry (pid 103): deleting orphaned inode 299
UBIFS DBG mnt (pid 103): ino 299, new 0, tot 20
UBIFS DBG rcvry (pid 103): last orph node for commit 53 at 9:38912
UBIFS DBG rcvry (pid 103): deleting orphaned inode 302
UBIFS DBG mnt (pid 103): ino 302, new 0, tot 21
UBIFS DBG rcvry (pid 103): last orph node for commit 55 at 9:40960
UBIFS DBG rcvry (pid 103): deleting orphaned inode 305
UBIFS DBG mnt (pid 103): ino 305, new 0, tot 22
UBIFS DBG rcvry (pid 103): last orph node for commit 57 at 9:43008
UBIFS DBG rcvry (pid 103): deleting orphaned inode 308
UBIFS DBG mnt (pid 103): ino 308, new 0, tot 23
UBIFS DBG rcvry (pid 103): last orph node for commit 59 at 9:45056
UBIFS DBG rcvry (pid 103): deleting orphaned inode 311
UBIFS DBG mnt (pid 103): ino 311, new 0, tot 24
UBIFS DBG rcvry (pid 103): last orph node for commit 61 at 9:47104
UBIFS DBG rcvry (pid 103): deleting orphaned inode 314
UBIFS DBG mnt (pid 103): ino 314, new 0, tot 25
UBIFS DBG rcvry (pid 103): last orph node for commit 63 at 9:49152
UBIFS DBG rcvry (pid 103): deleting orphaned inode 317
UBIFS DBG mnt (pid 103): ino 317, new 0, tot 26
UBIFS DBG rcvry (pid 103): last orph node for commit 65 at 9:51200
UBIFS DBG rcvry (pid 103): deleting orphaned inode 320
UBIFS DBG mnt (pid 103): ino 320, new 0, tot 27
UBIFS DBG rcvry (pid 103): last orph node for commit 68 at 9:53248
UBIFS DBG rcvry (pid 103): deleting orphaned inode 349
UBIFS DBG mnt (pid 103): ino 349, new 0, tot 28
UBIFS DBG rcvry (pid 103): last orph node for commit 72 at 9:55296
UBIFS DBG rcvry (pid 103): deleting orphaned inode 466
UBIFS DBG mnt (pid 103): ino 466, new 0, tot 29
UBIFS DBG rcvry (pid 103): last orph node for commit 127 at 9:57344
UBIFS DBG rcvry (pid 103): LEB 10
UBIFS DBG rcvry (pid 103): GC head LEB -1, offs -1
UBIFS DBG rcvry (pid 103): found empty LEB 807, run commit
UBIFS error (ubi0:20 pid 103): read_add_inode: inode 349 not found in index
UBIFS error (ubi0:20 pid 103): check_leaf: error -2 while processing
entry node and trying to find inode node 349
 magic          0x6101831
 crc            0xba9856a7
 node_type      2 (direntry node)
 group_type     1 (in node group)
 sqnum          4115
 len            70
 key            (66, direntry, 0x4974b0f)
 inum           349
 type           0
 nlen           13
 name           emotiond.conf
UBIFS error (ubi0:20 pid 103): dbg_walk_index: leaf checking function
returned error -2, for leaf at LEB 29:90112
znode be091400, LEB 37:92496 len 128 parent be089600 iip 1 level 0
child_cnt 5 flags 0
zbranches:
 0: LNC   (null) LEB 32:24576 len 81 key (66, direntry, 0x363e97f)
 1: LNC   (null) LEB 29:100352 len 63 key (66, direntry, 0x3f88c71)
 2: LNC   (null) LEB 12:94208 len 74 key (66, direntry, 0x477308e)
 3: LNC   (null) LEB 29:90112 len 70 key (66, direntry, 0x4974b0f)
 4: LNC be0590c0 LEB 814:73728 len 63 key (66, direntry, 0x49bb6f7)
UBIFS error (ubi0:20 pid 103): dbg_check_filesystem: file-system check
failed with error -2

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-03-21 10:02 UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 Lars Persson
  2019-03-25 10:21 ` Lars Persson
@ 2019-03-26  9:21 ` Lars Persson
  2019-03-28 11:54   ` Lars Persson
  1 sibling, 1 reply; 12+ messages in thread
From: Lars Persson @ 2019-03-26  9:21 UTC (permalink / raw)
  To: linux-mtd

On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote:
>
> The missing inode is always triggered on those two particular files
> emotiond.conf and tampering.conf that share the same write pattern at startup:
>
>  cp file file.tmp
>  echo some data > file.tmp
>  mv file.tmp file
>  fsync file
>

A closer look at that script shows that the atomic file update part
was bypassed. It only performs a chown operation on that file.
That simple chown will trigger an overlayfs copyup operation.

With a proper unmount after the system has finished writing to /etc I
cannot reproduce any corruption. An unclean shutdown triggers
corruption in about 1/3 of the attempts, also when I cut power tens of
seconds after the problematic script has executed.

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-03-26  9:21 ` Lars Persson
@ 2019-03-28 11:54   ` Lars Persson
  2019-03-28 17:31     ` Richard Weinberger
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Persson @ 2019-03-28 11:54 UTC (permalink / raw)
  To: linux-mtd

On Tue, Mar 26, 2019 at 10:21 AM Lars Persson <lists@bofh.nu> wrote:
>
> On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote:
> >
> > The missing inode is always triggered on those two particular files
> > emotiond.conf and tampering.conf that share the same write pattern at startup:
> >
> >  cp file file.tmp
> >  echo some data > file.tmp
> >  mv file.tmp file
> >  fsync file
> >
>
> A closer look at that script shows that the atomic file update part
> was bypassed. It only performs a chown operation on that file.
> That simple chown will trigger an overlayfs copyup operation.
>
> With a proper unmount after the system has finished writing to /etc I
> cannot reproduce any corruption. An unclean shutdown triggers
> corruption in about 1/3 of the attempts, also when I cut power tens of
> seconds after the problematic script has executed.

OK I think I have found the bug.

Commit 32fe905c17f0 ("Fix O_TMPFILE corner case in ubifs_link()")
tried to fix the use of O_TMPFILE by removing an inode from the orphan
list when it is relinked.

Unfortunately the removal never gets committed to flash because the
removal of an orphan does not trigger a commit. The condition for
committing is that new_orphans > 0, but in our case it is zero after
the call to ubifs_delete_orphan().

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-03-28 11:54   ` Lars Persson
@ 2019-03-28 17:31     ` Richard Weinberger
  2019-03-29  9:43       ` Lars Persson
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Weinberger @ 2019-03-28 17:31 UTC (permalink / raw)
  To: Lars Persson; +Cc: linux-mtd

On Thu, Mar 28, 2019 at 12:54 PM Lars Persson <lists@bofh.nu> wrote:
>
> On Tue, Mar 26, 2019 at 10:21 AM Lars Persson <lists@bofh.nu> wrote:
> >
> > On Thu, Mar 21, 2019 at 11:02 AM Lars Persson <lists@bofh.nu> wrote:
> > >
> > > The missing inode is always triggered on those two particular files
> > > emotiond.conf and tampering.conf that share the same write pattern at startup:
> > >
> > >  cp file file.tmp
> > >  echo some data > file.tmp
> > >  mv file.tmp file
> > >  fsync file
> > >
> >
> > A closer look at that script shows that the atomic file update part
> > was bypassed. It only performs a chown operation on that file.
> > That simple chown will trigger an overlayfs copyup operation.
> >
> > With a proper unmount after the system has finished writing to /etc I
> > cannot reproduce any corruption. An unclean shutdown triggers
> > corruption in about 1/3 of the attempts, also when I cut power tens of
> > seconds after the problematic script has executed.
>
> OK I think I have found the bug.

Thanks for hunting this down!

> Commit 32fe905c17f0 ("Fix O_TMPFILE corner case in ubifs_link()")
> tried to fix the use of O_TMPFILE by removing an inode from the orphan
> list when it is relinked.
>
> Unfortunately the removal never gets committed to flash because the
> removal of an orphan does not trigger a commit. The condition for
> committing is that new_orphans > 0, but in our case it is zero after
> the call to ubifs_delete_orphan().

Can you please share me your corrupted fs?
I'm not so sure what is really the root cause.
The thing is, ubifs_delete_orphan() does not need to tigger a commit.
Ff something depends on a commit, we have another problem, I fear.

-- 
Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-03-28 17:31     ` Richard Weinberger
@ 2019-03-29  9:43       ` Lars Persson
  2019-04-04 21:31         ` Richard Weinberger
  2019-04-04 21:53         ` Richard Weinberger
  0 siblings, 2 replies; 12+ messages in thread
From: Lars Persson @ 2019-03-29  9:43 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On Thu, Mar 28, 2019 at 6:31 PM Richard Weinberger
<richard.weinberger@gmail.com> wrote:
>
> Can you please share me your corrupted fs?
> I'm not so sure what is really the root cause.
> The thing is, ubifs_delete_orphan() does not need to tigger a commit.
> Ff something depends on a commit, we have another problem, I fear.
>

Thanks. The download link for the file-system was shared to your gmail address.

BR,
 Lars

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-03-29  9:43       ` Lars Persson
@ 2019-04-04 21:31         ` Richard Weinberger
  2019-04-04 21:53         ` Richard Weinberger
  1 sibling, 0 replies; 12+ messages in thread
From: Richard Weinberger @ 2019-04-04 21:31 UTC (permalink / raw)
  To: Lars Persson, linux-mtd

Am Freitag, 29. März 2019, 10:43:33 CEST schrieb Lars Persson:
> On Thu, Mar 28, 2019 at 6:31 PM Richard Weinberger
> <richard.weinberger@gmail.com> wrote:
> >
> > Can you please share me your corrupted fs?
> > I'm not so sure what is really the root cause.
> > The thing is, ubifs_delete_orphan() does not need to tigger a commit.
> > Ff something depends on a commit, we have another problem, I fear.
> >
> 
> Thanks. The download link for the file-system was shared to your gmail address.

Okay, I see the problem. :-\
But so far I don't see a good solution. /me thinks harder.

O_TMPFILE turns out more and more to be a nightmare for UBIFS,
it all boils down to the fact that UBIFS does not have transactions,
just many small journal updates which interfere with each other.
In this case the orphan list has an old entry
although the journal did the right thing.

Thanks,
//richard



______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-03-29  9:43       ` Lars Persson
  2019-04-04 21:31         ` Richard Weinberger
@ 2019-04-04 21:53         ` Richard Weinberger
  2019-04-05 13:19           ` Lars Persson
  1 sibling, 1 reply; 12+ messages in thread
From: Richard Weinberger @ 2019-04-04 21:53 UTC (permalink / raw)
  To: Lars Persson, linux-mtd

Am Freitag, 29. März 2019, 10:43:33 CEST schrieb Lars Persson:
> On Thu, Mar 28, 2019 at 6:31 PM Richard Weinberger
> <richard.weinberger@gmail.com> wrote:
> >
> > Can you please share me your corrupted fs?
> > I'm not so sure what is really the root cause.
> > The thing is, ubifs_delete_orphan() does not need to tigger a commit.
> > Ff something depends on a commit, we have another problem, I fear.
> >
> 
> Thanks. The download link for the file-system was shared to your gmail address.

Can you please run your tests again with this patch applied?
It needs further optimization, but we are on the right track.

diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c
index 2f1618f300fb..c15e484f73a4 100644
--- a/fs/ubifs/orphan.c
+++ b/fs/ubifs/orphan.c
@@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
 		n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3;
 		for (i = 0; i < n; i++) {
 			union ubifs_key key1, key2;
+			struct ubifs_ino_node *ino;
+
+			ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS);
+			if (!ino)
+				return -ENOMEM;
 
 			inum = le64_to_cpu(orph->inos[i]);
+			ino_key_init(c, &key1, inum);
+			err = ubifs_tnc_lookup(c, &key1, ino);
+			if (!err && ino->nlink) {
+				kfree(ino);
+				continue;
+			}
+			kfree(ino);
+
 			dbg_rcvry("deleting orphaned inode %lu",
 				  (unsigned long)inum);
 
Thanks,
//richard



______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-04-04 21:53         ` Richard Weinberger
@ 2019-04-05 13:19           ` Lars Persson
  2019-04-05 13:23             ` Richard Weinberger
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Persson @ 2019-04-05 13:19 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On Thu, Apr 4, 2019 at 11:53 PM Richard Weinberger
<richard@sigma-star.at> wrote:
> Can you please run your tests again with this patch applied?
> It needs further optimization, but we are on the right track.
>
> diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c
> index 2f1618f300fb..c15e484f73a4 100644
> --- a/fs/ubifs/orphan.c
> +++ b/fs/ubifs/orphan.c
> @@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
>                 n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3;
>                 for (i = 0; i < n; i++) {
>                         union ubifs_key key1, key2;
> +                       struct ubifs_ino_node *ino;
> +
> +                       ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS);
> +                       if (!ino)
> +                               return -ENOMEM;
>
>                         inum = le64_to_cpu(orph->inos[i]);
> +                       ino_key_init(c, &key1, inum);
> +                       err = ubifs_tnc_lookup(c, &key1, ino);
> +                       if (!err && ino->nlink) {
> +                               kfree(ino);
> +                               continue;
> +                       }
> +                       kfree(ino);
> +
>                         dbg_rcvry("deleting orphaned inode %lu",
>                                   (unsigned long)inum);
>

Thanks Richard. The power-cut test is OK now with this change. I run
it with chk_fs=1 and it passes every time.

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-04-05 13:19           ` Lars Persson
@ 2019-04-05 13:23             ` Richard Weinberger
  2019-07-10  7:14               ` Lars Persson
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Weinberger @ 2019-04-05 13:23 UTC (permalink / raw)
  To: Lars Persson; +Cc: linux-mtd

Am Freitag, 5. April 2019, 15:19:31 CEST schrieb Lars Persson:
> On Thu, Apr 4, 2019 at 11:53 PM Richard Weinberger
> <richard@sigma-star.at> wrote:
> > Can you please run your tests again with this patch applied?
> > It needs further optimization, but we are on the right track.
> >
> > diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c
> > index 2f1618f300fb..c15e484f73a4 100644
> > --- a/fs/ubifs/orphan.c
> > +++ b/fs/ubifs/orphan.c
> > @@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
> >                 n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3;
> >                 for (i = 0; i < n; i++) {
> >                         union ubifs_key key1, key2;
> > +                       struct ubifs_ino_node *ino;
> > +
> > +                       ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS);
> > +                       if (!ino)
> > +                               return -ENOMEM;
> >
> >                         inum = le64_to_cpu(orph->inos[i]);
> > +                       ino_key_init(c, &key1, inum);
> > +                       err = ubifs_tnc_lookup(c, &key1, ino);
> > +                       if (!err && ino->nlink) {
> > +                               kfree(ino);
> > +                               continue;
> > +                       }
> > +                       kfree(ino);
> > +
> >                         dbg_rcvry("deleting orphaned inode %lu",
> >                                   (unsigned long)inum);
> >
> 
> Thanks Richard. The power-cut test is OK now with this change. I run
> it with chk_fs=1 and it passes every time.

Cool, please let it run for some more time, just to be sure.
I will also do more tests on my side and optimize the patch.

Thanks,
//richard



______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-04-05 13:23             ` Richard Weinberger
@ 2019-07-10  7:14               ` Lars Persson
  2019-07-10  9:02                 ` Richard Weinberger
  0 siblings, 1 reply; 12+ messages in thread
From: Lars Persson @ 2019-07-10  7:14 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-mtd

On Fri, Apr 5, 2019 at 3:23 PM Richard Weinberger <richard@nod.at> wrote:
>
> Am Freitag, 5. April 2019, 15:19:31 CEST schrieb Lars Persson:
> > On Thu, Apr 4, 2019 at 11:53 PM Richard Weinberger
> > <richard@sigma-star.at> wrote:
> > > Can you please run your tests again with this patch applied?
> > > It needs further optimization, but we are on the right track.
> > >
> > > diff --git a/fs/ubifs/orphan.c b/fs/ubifs/orphan.c
> > > index 2f1618f300fb..c15e484f73a4 100644
> > > --- a/fs/ubifs/orphan.c
> > > +++ b/fs/ubifs/orphan.c
> > > @@ -691,8 +691,21 @@ static int do_kill_orphans(struct ubifs_info *c, struct ubifs_scan_leb *sleb,
> > >                 n = (le32_to_cpu(orph->ch.len) - UBIFS_ORPH_NODE_SZ) >> 3;
> > >                 for (i = 0; i < n; i++) {
> > >                         union ubifs_key key1, key2;
> > > +                       struct ubifs_ino_node *ino;
> > > +
> > > +                       ino = kmalloc(UBIFS_MAX_INO_NODE_SZ, GFP_NOFS);
> > > +                       if (!ino)
> > > +                               return -ENOMEM;
> > >
> > >                         inum = le64_to_cpu(orph->inos[i]);
> > > +                       ino_key_init(c, &key1, inum);
> > > +                       err = ubifs_tnc_lookup(c, &key1, ino);
> > > +                       if (!err && ino->nlink) {
> > > +                               kfree(ino);
> > > +                               continue;
> > > +                       }
> > > +                       kfree(ino);
> > > +
> > >                         dbg_rcvry("deleting orphaned inode %lu",
> > >                                   (unsigned long)inum);
> > >
> >
> > Thanks Richard. The power-cut test is OK now with this change. I run
> > it with chk_fs=1 and it passes every time.
>
> Cool, please let it run for some more time, just to be sure.
> I will also do more tests on my side and optimize the patch.
>
> Thanks,
> //richard
>

Richard, we tested this patch for two weeks and saw no further issues.
We can assist again with testing if you make the optimized patch.

Regards,
 Lars

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: UBIFS file-system corruption (missing inode) after power-cut on 4.14.96
  2019-07-10  7:14               ` Lars Persson
@ 2019-07-10  9:02                 ` Richard Weinberger
  0 siblings, 0 replies; 12+ messages in thread
From: Richard Weinberger @ 2019-07-10  9:02 UTC (permalink / raw)
  To: Lars Persson; +Cc: linux-mtd

Lars,

----- Ursprüngliche Mail -----
> Richard, we tested this patch for two weeks and saw no further issues.
> We can assist again with testing if you make the optimized patch.

Cool!
The optimized patch is in linux-next. If you could test linux-next it would be awesome.

Thanks,
//richard

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-07-10  9:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-21 10:02 UBIFS file-system corruption (missing inode) after power-cut on 4.14.96 Lars Persson
2019-03-25 10:21 ` Lars Persson
2019-03-26  9:21 ` Lars Persson
2019-03-28 11:54   ` Lars Persson
2019-03-28 17:31     ` Richard Weinberger
2019-03-29  9:43       ` Lars Persson
2019-04-04 21:31         ` Richard Weinberger
2019-04-04 21:53         ` Richard Weinberger
2019-04-05 13:19           ` Lars Persson
2019-04-05 13:23             ` Richard Weinberger
2019-07-10  7:14               ` Lars Persson
2019-07-10  9:02                 ` Richard Weinberger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.