* f2fs Crash Consistency Problem
@ 2017-04-13 15:28 Raouf Rokhjavan
  2017-04-13 21:19 ` Jaegeuk Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-04-13 15:28 UTC (permalink / raw)
  To: linux-f2fs-devel

Hi

The flash-friendly features of f2fs have motivated me to use it as the 
rootfs in my project. Since one of my main considerations is resilience 
in the face of power failure, I've been looking for techniques to verify 
this property. I eventually found a wonderful device-mapper target, 
developed by the btrfs developer Josef Bacik, for exactly this purpose.

As you know, the log-writes target logs all bios passed to the block 
layer and preserves their ordering, which makes it possible to replay 
the I/O stream and simulate a crash at any point. To take advantage of 
this helpful tool for verifying the consistency of the f2fs file system 
after a power failure, I combined the xfstests test suite with 
log-writes. Given the LFS-based nature of f2fs, I expected never to 
encounter an inconsistency, but the test results show otherwise.
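
Concretely, my setup looks roughly like this (a simplified sketch; the 
device paths are placeholders):

    REPLAYDEV=/dev/sdc    # device that receives the replayed writes
    LOGDEV=/dev/sdb       # device that stores the write log
    BLKSIZE=$(blockdev --getsz $REPLAYDEV)

    # stack a log-writes target on top of the replay device
    echo "0 $BLKSIZE log-writes $REPLAYDEV $LOGDEV" | dmsetup create log

    # mark interesting points in the log so they can be found at replay time
    dmsetup message log 0 mark mkfs_start
    mkfs.f2fs /dev/mapper/log
    dmsetup message log 0 mark mkfs_end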

To clarify further, this is my test environment:

     - Fedora 24

     - kernel 4.9.8 - f2fs compiled as a module with all features enabled

         - mount options: default + noatime

     - f2fs-tools 1.8.0

     - xfstests  1.1.1 - from https://github.com/jaegeuk/xfstests-f2fs

     - device-mapper 1.02.122

In my test environment, I run each generic test of xfstests on a 
log-writes device with a newly created f2fs. After that, I replay the 
log entries following the mkfs mark one by one and check the consistency 
of the file system with fsck.f2fs. In test generic/009, a fallocate test 
using the FALLOC_FL_ZERO_RANGE mode, fsck.f2fs eventually complains:

Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 2097152 (1024 MB)
Info: MKFS version
   "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
Info: FSCK version
   from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
     to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
Info: superblock features = 0 :
Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
Info: total FS sectors = 2097152 (1024 MB)
Info: CKPT version = 2a4679e0
Info: checkpoint state = 45 :  compacted_summary unmount

NID[0x4c] is unreachable
NID[0x4d] is unreachable
[FSCK] Unreachable nat entries                        [Fail] [0x2]
[FSCK] SIT valid block bitmap checking                [Ok..]
[FSCK] Hard link checking for regular file            [Ok..] [0x0]
[FSCK] valid_block_count matching with CP             [Ok..] [0x2]
[FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0x1]
[FSCK] valid_node_count matcing with CP (nat lookup)  [Fail] [0x3]
[FSCK] valid_inode_count matched with CP              [Ok..] [0x1]
[FSCK] free segment_count matched with CP             [Ok..] [0x1f0]
[FSCK] next block offset is free                      [Ok..]
[FSCK] fixing SIT types
[FSCK] other corrupted bugs                           [Fail]

Do you want to restore lost files into ./lost_found/? [Y/N] Y
  - File name         : 009.48244.2
  - File size         : 20,480 (bytes)
Do you want to fix this partition? [Y/N] Y

The interesting part is that when I run fsck.f2fs with the -p option, it 
doesn't complain at all!

Would you please tell me why fsck.f2fs reports an inconsistency that 
needs to be fixed? Does this violate the crash-consistency promise of f2fs?

Moreover, why is fsck.f2fs silent with the -p option? Does it mean that 
the f2fs kernel module does not consider the problem serious?

I really appreciate your help.

Thanks





* Re: f2fs Crash Consistency Problem
  2017-04-13 15:28 f2fs Crash Consistency Problem Raouf Rokhjavan
@ 2017-04-13 21:19 ` Jaegeuk Kim
       [not found]   ` <60e7c703-13f1-0f7e-24cc-2c5fae3bc958@gmail.com>
  2017-05-10 17:51   ` Raouf Rokhjavan
  0 siblings, 2 replies; 14+ messages in thread
From: Jaegeuk Kim @ 2017-04-13 21:19 UTC (permalink / raw)
  To: Raouf Rokhjavan; +Cc: linux-f2fs-devel

Hello,

On 04/13, Raouf Rokhjavan wrote:
> Hi
> 
> The flash-friendly features of f2fs have motivated me to use it as the 
> rootfs in my project. Since one of my main considerations is resilience 
> in the face of power failure, I've been looking for techniques to verify 
> this property. I eventually found a wonderful device-mapper target, 
> developed by the btrfs developer Josef Bacik, for exactly this purpose.
> 
> As you know, the log-writes target logs all bios passed to the block 
> layer and preserves their ordering, which makes it possible to replay 
> the I/O stream and simulate a crash at any point. To take advantage of 
> this helpful tool for verifying the consistency of the f2fs file system 
> after a power failure, I combined the xfstests test suite with 
> log-writes. Given the LFS-based nature of f2fs, I expected never to 
> encounter an inconsistency, but the test results show otherwise.
> 
> To clarify further, this is my test environment:
> 
>      - Fedora 24
> 
>      - kernel 4.9.8 - f2fs compiled as a module with all features enabled
> 
>          - mount options: default + noatime
> 
>      - f2fs-tools 1.8.0
> 
>      - xfstests  1.1.1 - from https://github.com/jaegeuk/xfstests-f2fs
> 
>      - device-mapper 1.02.122
> 
> In my test environment, I run each generic test of xfstests on a 
> log-writes device with a newly created f2fs. After that, I replay the 
> log entries following the mkfs mark one by one and check the consistency 
> of the file system with fsck.f2fs. In test generic/009, a fallocate test 
> using the FALLOC_FL_ZERO_RANGE mode, fsck.f2fs eventually complains:

I just ran log-writes with fsstress and found one issue when replaying IOs.
If you replay starting after mkfs.f2fs, you pick up a stale valid checkpoint
that was written by a previous run. IOWs, at the beginning of the replay,
there is no *correct* checkpoint representing that initial moment. So, I
think you need to replay the log including mkfs.
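
In shell terms, that means something like this (a rough sketch; the mark
names and device variables are placeholders):

    # first bring the replay device to the freshly-formatted state
    ./replay-log --log $LOGDEV --replay $REPLAYDEV --end-mark mkfs_end

    # then replay the test's entries one by one, checking after each one
    ./replay-log -v --log $LOGDEV --replay $REPLAYDEV \
        --start-mark 009_start --fsck "fsck.f2fs $REPLAYDEV" --check 1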

You can verify the below CKPT version info.

> Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
> Info: Segments per section = 1
> Info: Sections per zone = 1
> Info: sector size = 512
> Info: total sectors = 2097152 (1024 MB)
> Info: MKFS version
>    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
> 2017"
> Info: FSCK version
>    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
> 2017"
>      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
> 2017"
> Info: superblock features = 0 :
> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
> Info: total FS sectors = 2097152 (1024 MB)
> Info: CKPT version = 2a4679e0
> Info: checkpoint state = 45 :  compacted_summary unmount
> 
> NID[0x4c] is unreachable
> NID[0x4d] is unreachable
> [FSCK] Unreachable nat entries                        [Fail] [0x2]
> [FSCK] SIT valid block bitmap checking                [Ok..]
> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
> [FSCK] valid_block_count matching with CP             [Ok..] [0x2]
> [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0x1]
> [FSCK] valid_node_count matcing with CP (nat lookup)  [Fail] [0x3]
> [FSCK] valid_inode_count matched with CP              [Ok..] [0x1]
> [FSCK] free segment_count matched with CP             [Ok..] [0x1f0]
> [FSCK] next block offset is free                      [Ok..]
> [FSCK] fixing SIT types
> [FSCK] other corrupted bugs                           [Fail]
> 
> Do you want to restore lost files into ./lost_found/? [Y/N] Y
>   - File name         : 009.48244.2
>   - File size         : 20,480 (bytes)
> Do you want to fix this partition? [Y/N] Y
> 
> The interesting part is that when I run fsck.f2fs with the -p option, it 
> doesn't complain at all!
> 
> Would you please tell me why fsck.f2fs reports an inconsistency that 
> needs to be fixed? Does this violate the crash-consistency promise of f2fs?

As I mentioned above, I guess you replayed with "--start-mark mkfs", which
loses the initial checkpoint.

> Moreover, why is fsck.f2fs silent with the -p option? Does it mean that 
> the f2fs kernel module does not consider the problem serious?

The -p option takes a [level], and the default level is zero, which checks
the image only if runtime f2fs reported a bug case before. Otherwise, it
simply returns. If you set level 1, fsck.f2fs will check the basic FS
metadata parts.
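
For example (a sketch of the two modes; I'm assuming the usual invocation):

    # level 0 (default): a full check runs only if the kernel flagged the
    # image as needing fsck at runtime; otherwise it returns immediately
    fsck.f2fs -p /dev/sdc

    # level 1: always check the basic FS metadata
    fsck.f2fs -p 1 /dev/sdc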

Thanks,

> 
> I really appreciate your help.
> 
> Thanks
> 
> 
> 
> 


* Re: f2fs Crash Consistency Problem
       [not found]     ` <20170414184520.GA6827@jaegeuk.local>
@ 2017-04-15  4:33       ` Raouf Rokhjavan
  0 siblings, 0 replies; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-04-15  4:33 UTC (permalink / raw)
  To: linux-f2fs-devel; +Cc: Jaegeuk Kim



On 04/14/17 23:15, Jaegeuk Kim wrote:
> On 04/14, Raouf Rokhjavan wrote:
>> First of all, thank you so much for your help.
>>> Hello,
>>>
>>> I just ran log-writes with fsstress and found one issue when replaying IOs.
>>> If you replay starting after mkfs.f2fs, you pick up a stale valid checkpoint
>>> that was written by a previous run. IOWs, at the beginning of the replay,
>>> there is no *correct* checkpoint representing that initial moment. So, I
>>> think you need to replay the log including mkfs.
>> I checked my scripts to make sure the CKPT is correct at the mkfs boundary.
>> To do that, I wrote something which produces this output:
>>
>> SECTION       -- f2fs_consistency
>> FSTYP         -- f2fs
>> PLATFORM      -- Linux/x86_64 localhost 4.9.8
>>
>> generic/009 1s ... 1s
>> Ran: generic/009
>> Passed all 1 tests
>>
>> SECTION       -- f2fs_consistency
>> =========================
>> Ran: generic/009
>> Passed all 1 tests
>>
>> ***** Replaying mkfs *****
>> mkfs_end entry is 1552.
>> CKPT after mkfs_end
>> Info: CKPT version = 6bd74d5f
>>
>> Replaying test #009 ...
>>
>> START_ENTRY is 1553
>> END_ENTRY is 1825
>>
>> ***** Entry #1553 *****
>> CKPT after replay
>> Info: CKPT version = 6bd74d5f
>> CKPT after mount/umount
> What does this mean? Did you mount/umount after replaying each entry?
> I suspect it will break the original behavior, resulting in fs inconsistency.
My test logic is this:
1. set up a log-writes dm-device backed by REPLAYDEV and LOGDEV.
2. run mkfs.f2fs on the log-writes dm-device
3. run an xfstests test on the log-writes dm-device, which is mounted and 
configured as xfstests' TEST_DEV (no SCRATCH_DEV is configured, in order 
to control the mkfs logging and the replay process)
4. remove the log-writes dm-device
5. replay the log entries of mkfs
6. replay the remaining entries from the mkfs_end mark up to the end of 
the log and check the consistency of the file system
     - replay each entry -> fsck.f2fs -> mount, umount -> fsck.f2fs

The gist of my scripts is something like this:
# ./replay-log --log $LOGDEV --replay $REPLAYDEV --end-mark mkfs_end
# ./replay-log -v --log $LOGDEV --replay $REPLAYDEV --start-mark 009_start \
      --fsck "./fsck_script.sh" --check 1

*** ./fsck_script.sh ***
# check the image at the crash point, then again after a mount/umount cycle
fsck.f2fs $REPLAYDEV
mount -t f2fs $REPLAYDEV $CCTESTS_MNT
umount $CCTESTS_MNT
fsck.f2fs $REPLAYDEV

I do this since it's quite similar to what occurs during a power 
failure: at any given point (any entry) the system crashes, and after 
that, the file system is mounted and unmounted.
>
> BTW, when I tested below, I couldn't find any issue.
>
> # dmsetup create log
> # mkfs.f2fs
> # run xfstests/009
> # dmsetup remove log
> # ./replay-log --log log --replay dev --fsck "fsck.f2fs dev" --check fua
You run fsck after each FUA command to the block layer; is there any 
specific reason for that? I thought REQ_FUA guarantees that write 
requests reach the device, which leads to a consistent state. That is 
not quite what we're looking for, which is testing the file system's 
response in the face of inconsistency. Am I right?

Regards, Raouf Rokhjavan
>
> It'd be helpful if I could reproduce this, though. ;)
>
> Thanks,
>
>> Info: CKPT version = 6bd74d60
>>
>> ***** Entry #1554 *****
>> CKPT after replay
>> Info: CKPT version = 6bd74d60
>> CKPT after mount/umount
>> Info: CKPT version = 6bd74d61
>>
>> (Continues until)
>>
>> ***** Entry #1703 *****
>> CKPT after replay
>> Info: CKPT version = 6bd74df1
>> CKPT after mount/umount
>> Info: CKPT version = 6bd74df2
>>
>> ***** Entry #1704 *****
>> CKPT after replay
>> (fsck.f2fs with no option asks question at this point)
>>> You can verify the below CKPT version info.
>>> As I mentioned above, I guess you replayed with "--start-mark mkfs", which
>>> loses the initial checkpoint.
>> As you can see in the result, the CKPT after the mkfs replay matches the
>> CKPT after the first entry replay, so we can conclude the CKPT is valid at
>> this point. Accordingly, I believe everything goes well up to here. Am I
>> right?
>>> The -p option takes a [level], and the default level is zero, which checks
>>> the image only if runtime f2fs reported a bug case before. Otherwise, it
>>> simply returns. If you set level 1, fsck.f2fs will check the basic FS
>>> metadata parts.
>> Does it mean that if the f2fs module reports nothing, as it does in my case
>> (entry 1704), the fs is consistent?
>> If so, why does fsck.f2fs complain about unreachable NAT entries, which
>> indicates an inconsistency, at least as I understand it?
>>
>> I do appreciate it.
>>
>> Regards,  Raouf Rokhjavan
>>> Thanks,
>>>



* Re: f2fs Crash Consistency Problem
       [not found]     ` <20170414235834.GA8933@jaegeuk.local>
@ 2017-04-15  5:13       ` Raouf Rokhjavan
  2017-04-17 22:34         ` Jaegeuk Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-04-15  5:13 UTC (permalink / raw)
  Cc: linux-f2fs-devel



On 04/15/17 04:28, Jaegeuk Kim wrote:
> On 04/14, Raouf Rokhjavan wrote:
>>
>> On 04/14/17 01:49, Jaegeuk Kim wrote:
>> Hi
>>
>> First of all, thank you so much for your help.
>>> Hello,
>>>
>>> I just ran log-writes with fsstress and found one issue when replaying IOs.
>>> If you replay starting after mkfs.f2fs, you pick up a stale valid checkpoint
>>> that was written by a previous run. IOWs, at the beginning of the replay,
>>> there is no *correct* checkpoint representing that initial moment. So, I
>>> think you need to replay the log including mkfs.
>> I checked my scripts to make sure the CKPT is correct at the mkfs boundary.
>> To do that, I wrote something which produces this output:
>>
>> SECTION       -- f2fs_consistency
>> FSTYP         -- f2fs
>> PLATFORM      -- Linux/x86_64 localhost 4.9.8
>>
>> generic/009 1s ... 1s
>> Ran: generic/009
>> Passed all 1 tests
>>
>> SECTION       -- f2fs_consistency
>> =========================
>> Ran: generic/009
>> Passed all 1 tests
>>
>> ***** Replaying mkfs *****
>> mkfs_end entry is 1552.
>> CKPT after mkfs_end
>> Info: CKPT version = 6bd74d5f
>>
>> Replaying test #009 ...
>>
>> START_ENTRY is 1553
>> END_ENTRY is 1825
>>
>> ***** Entry #1553 *****
>> CKPT after replay
>> Info: CKPT version = 6bd74d5f
>> CKPT after mount/umount
>> Info: CKPT version = 6bd74d60
>>
>> ***** Entry #1554 *****
>> CKPT after replay
>> Info: CKPT version = 6bd74d60
>> CKPT after mount/umount
>> Info: CKPT version = 6bd74d61
>>
>> (Continues until)
>>
>> ***** Entry #1703 *****
>> CKPT after replay
>> Info: CKPT version = 6bd74df1
>> CKPT after mount/umount
>> Info: CKPT version = 6bd74df2
>>
>> ***** Entry #1704 *****
>> CKPT after replay
>> (fsck.f2fs with no option asks question at this point)
>>> You can verify the below CKPT version info.
>>> As I mentioned above, I guess you replayed with "--start-mark mkfs", which
>>> loses the initial checkpoint.
>> As you can see in the result, the CKPT after the mkfs replay matches the
>> CKPT after the first entry replay, so we can conclude the CKPT is valid at
>> this point. Accordingly, I believe everything goes well up to here. Am I
>> right?
> Hmm, it seems you didn't set up the snapshot-base and snapshot-cow devices
> which are actually used by replay-individual.sh, since your CKPT versions
> just increase.
Yes, you're completely right. I didn't set up any snapshot target. The 
kernel documentation for log-writes makes no reference to snapshots; 
honestly speaking, I didn't quite grasp the point of the snapshot in 
"replay-individual.sh". I thought the only purpose of that target was to 
save the state of the whole disk (a snapshot) after each log entry, and 
that it had nothing to do with f2fs consistency, specifically the CKPT.

> It makes sense that the versions should look like the following, which is
> what I got when I ran replay-individual.sh after executing generic/009.
>
> creating snapshot base
> 0 2097152 snapshot-origin /dev/nvme0n1p1
> setting up COW TABLE
> 100+0 records in
> 100+0 records out
> 104857600 bytes (105 MB) copied, 0.330227 s, 318 MB/s
> replayin to mark
> replaying entry 3116          <- starting entry number of 009
> Info: CKPT version = 724f931a
> Info: CKPT version = 724f931b
> replaying entry 3117
> Info: CKPT version = 724f931a
> Info: CKPT version = 724f931c
> replaying entry 3118
> Info: CKPT version = 724f931a
> Info: CKPT version = 724f931c
> replaying entry 3119
> Info: CKPT version = 724f931a
> Info: CKPT version = 724f931c
> replaying entry 3120
> Info: CKPT version = 724f931a
> Info: CKPT version = 724f931c
> replaying entry 3121
> Info: CKPT version = 724f931a
> Info: CKPT version = 724f931c
> replaying entry 3122
> Info: CKPT version = 724f931a
> ...
> replaying entry 3294
> Info: CKPT version = 724f931b
> Info: CKPT version = 724f931c
> replaying entry 3295          <- ending entry number
> Info: CKPT version = 724f931c
> Info: CKPT version = 724f931d
>
> Again, I couldn't find any issue here.
>
> Thanks,
I wonder how a dm target like snapshot can affect the behavior of an 
upper layer like f2fs, and specifically its CKPT versions! Would you 
please explain it a little?

Regards, Raouf Rokhjavan
>>> The -p option takes a [level], and the default level is zero, which checks
>>> the image only if runtime f2fs reported a bug case before. Otherwise, it
>>> simply returns. If you set level 1, fsck.f2fs will check the basic FS
>>> metadata parts.
>> Does it mean that if the f2fs module reports nothing, as it does in my case
>> (entry 1704), the fs is consistent?
>> If so, why does fsck.f2fs complain about unreachable NAT entries, which
>> indicates an inconsistency, at least as I understand it?
>>
>> I do appreciate it.
>>
>> Regards,  Raouf Rokhjavan
>>> Thanks,
>>>



* Re: f2fs Crash Consistency Problem
  2017-04-15  5:13       ` Raouf Rokhjavan
@ 2017-04-17 22:34         ` Jaegeuk Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Jaegeuk Kim @ 2017-04-17 22:34 UTC (permalink / raw)
  To: linux-f2fs-devel

On 04/15, Raouf Rokhjavan wrote:
> 
> 
> On 04/15/17 04:28, Jaegeuk Kim wrote:
> > On 04/14, Raouf Rokhjavan wrote:
> >>
> >> On 04/14/17 01:49, Jaegeuk Kim wrote:
> >> Hi
> >>
> >> First of all, thank you so much for your help.
> >>> Hello,
> >>>
> >>> I just ran log-writes with fsstress and found one issue when replaying IOs.
> >>> If you replay starting after mkfs.f2fs, you pick up a stale valid checkpoint
> >>> that was written by a previous run. IOWs, at the beginning of the replay,
> >>> there is no *correct* checkpoint representing that initial moment. So, I
> >>> think you need to replay the log including mkfs.
> >> I checked my scripts to make sure the CKPT is correct at the mkfs boundary.
> >> To do that, I wrote something which produces this output:
> >>
> >> SECTION       -- f2fs_consistency
> >> FSTYP         -- f2fs
> >> PLATFORM      -- Linux/x86_64 localhost 4.9.8
> >>
> >> generic/009 1s ... 1s
> >> Ran: generic/009
> >> Passed all 1 tests
> >>
> >> SECTION       -- f2fs_consistency
> >> =========================
> >> Ran: generic/009
> >> Passed all 1 tests
> >>
> >> ***** Replaying mkfs *****
> >> mkfs_end entry is 1552.
> >> CKPT after mkfs_end
> >> Info: CKPT version = 6bd74d5f
> >>
> >> Replaying test #009 ...
> >>
> >> START_ENTRY is 1553
> >> END_ENTRY is 1825
> >>
> >> ***** Entry #1553 *****
> >> CKPT after replay
> >> Info: CKPT version = 6bd74d5f
> >> CKPT after mount/umount
> >> Info: CKPT version = 6bd74d60
> >>
> >> ***** Entry #1554 *****
> >> CKPT after replay
> >> Info: CKPT version = 6bd74d60
> >> CKPT after mount/umount
> >> Info: CKPT version = 6bd74d61
> >>
> >> (Continues until)
> >>
> >> ***** Entry #1703 *****
> >> CKPT after replay
> >> Info: CKPT version = 6bd74df1
> >> CKPT after mount/umount
> >> Info: CKPT version = 6bd74df2
> >>
> >> ***** Entry #1704 *****
> >> CKPT after replay
> >> (fsck.f2fs with no option asks question at this point)
> >>> You can verify the below CKPT version info.
> >>> As I mentioned above, I guess you replayed with "--start-mark mkfs", which
> >>> loses the initial checkpoint.
> >> As you can see in the result, the CKPT after the mkfs replay matches the
> >> CKPT after the first entry replay, so we can conclude the CKPT is valid at
> >> this point. Accordingly, I believe everything goes well up to here. Am I
> >> right?
> > Hmm, it seems you didn't set up the snapshot-base and snapshot-cow devices
> > which are actually used by replay-individual.sh, since your CKPT versions
> > just increase.
> Yes, you're completely right. I didn't set up any snapshot target. The 
> kernel documentation for log-writes makes no reference to snapshots; 
> honestly speaking, I didn't quite grasp the point of the snapshot in 
> "replay-individual.sh". I thought the only purpose of that target was to 
> save the state of the whole disk (a snapshot) after each log entry, and 
> that it had nothing to do with f2fs consistency, specifically the CKPT.

At mount time, f2fs occasionally writes checkpoint blocks, if there is a data
block to recover, which is like injecting additional IOs into the original
sequence. That breaks the FS consistency accordingly, and that's why you see
the increasing CKPT versions above, which were actually written by mount, not
by the IO log. In order to avoid such IO injection by mount/umount, the
original script used a snapshot device.
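
Roughly, what replay-individual.sh does is the following (a sketch only;
the device paths, target names, and chunk size are placeholders):

    REPLAYDEV=/dev/sdc
    COWDEV=/dev/sdd      # scratch space for the copy-on-write exceptions
    SIZE=$(blockdev --getsz $REPLAYDEV)

    # replayed writes keep going to $REPLAYDEV through snapshot-origin
    echo "0 $SIZE snapshot-origin $REPLAYDEV" | dmsetup create replay-base

    # per entry: create a throwaway (non-persistent) snapshot, so blocks
    # written by mount/fsck land in $COWDEV and never touch the image
    echo "0 $SIZE snapshot $REPLAYDEV $COWDEV N 8" | dmsetup create replay-snap
    mount -t f2fs /dev/mapper/replay-snap /mnt && umount /mnt
    fsck.f2fs /dev/mapper/replay-snap
    dmsetup remove replay-snap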

Thanks,

> 
> > It make sense that versions should look like what I've run replay-individual.sh
> > after executing generic/009.
> >
> > creating snapshot base
> > 0 2097152 snapshot-origin /dev/nvme0n1p1
> > setting up COW TABLE
> > 100+0 records in
> > 100+0 records out
> > 104857600 bytes (105 MB) copied, 0.330227 s, 318 MB/s
> > replayin to mark
> > replaying entry 3116          <- starting entry number of 009
> > Info: CKPT version = 724f931a
> > Info: CKPT version = 724f931b
> > replaying entry 3117
> > Info: CKPT version = 724f931a
> > Info: CKPT version = 724f931c
> > replaying entry 3118
> > Info: CKPT version = 724f931a
> > Info: CKPT version = 724f931c
> > replaying entry 3119
> > Info: CKPT version = 724f931a
> > Info: CKPT version = 724f931c
> > replaying entry 3120
> > Info: CKPT version = 724f931a
> > Info: CKPT version = 724f931c
> > replaying entry 3121
> > Info: CKPT version = 724f931a
> > Info: CKPT version = 724f931c
> > replaying entry 3122
> > Info: CKPT version = 724f931a
> > ...
> > replaying entry 3294
> > Info: CKPT version = 724f931b
> > Info: CKPT version = 724f931c
> > replaying entry 3295          <- ending entry number
> > Info: CKPT version = 724f931c
> > Info: CKPT version = 724f931d
> >
> > Again, I couldn't find any issue here.
> >
> > Thanks,
> I wonder how a dm target like snapshot can affect the behavior of an 
> upper layer like f2fs, and specifically its CKPT versions! Would you 
> please explain it a little?
> 
> Regards, Raouf Rokhjavan
> >>> The -p option takes a [level], and the default level is zero, which checks
> >>> the image only if runtime f2fs reported a bug case before. Otherwise, it
> >>> simply returns. If you set level 1, fsck.f2fs will check the basic FS
> >>> metadata parts.
> >> Does it mean that if the f2fs module reports nothing, as it does in my case
> >> (entry 1704), the fs is consistent?
> >> If so, why does fsck.f2fs complain about unreachable NAT entries, which
> >> indicates an inconsistency, at least as I understand it?
> >>
> >> I do appreciate it.
> >>
> >> Regards,  Raouf Rokhjavan
> >>> Thanks,
> >>>
> 
> 


* Re: f2fs Crash Consistency Problem
  2017-04-13 21:19 ` Jaegeuk Kim
       [not found]   ` <60e7c703-13f1-0f7e-24cc-2c5fae3bc958@gmail.com>
@ 2017-05-10 17:51   ` Raouf Rokhjavan
  2017-05-12  0:14     ` Jaegeuk Kim
  1 sibling, 1 reply; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-05-10 17:51 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel



On 04/14/17 01:49, Jaegeuk Kim wrote:
> Hello,
>
> On 04/13, Raouf Rokhjavan wrote:
>> Hi
>>
>> The flash-friendly features of f2fs have motivated me to use it as the
>> rootfs in my project. Since one of my main considerations is resilience
>> in the face of power failure, I've been looking for techniques to verify
>> this property. I eventually found a wonderful device-mapper target,
>> developed by the btrfs developer Josef Bacik, for exactly this purpose.
>>
>> As you know, the log-writes target logs all bios passed to the block
>> layer and preserves their ordering, which makes it possible to replay
>> the I/O stream and simulate a crash at any point. To take advantage of
>> this helpful tool for verifying the consistency of the f2fs file system
>> after a power failure, I combined the xfstests test suite with
>> log-writes. Given the LFS-based nature of f2fs, I expected never to
>> encounter an inconsistency, but the test results show otherwise.
>>
>> To clarify further, this is my test environment:
>>
>>       - Fedora 24
>>
>>       - kernel 4.9.8 - f2fs compiled as a module with all features enabled
>>
>>           - mount options: default + noatime
>>
>>       - f2fs-tools 1.8.0
>>
>>       - xfstests  1.1.1 - from https://github.com/jaegeuk/xfstests-f2fs
>>
>>       - device-mapper 1.02.122
>>
>> In my test environment, I run each generic test of xfstests on a
>> log-writes device with a newly created f2fs. After that, I replay the
>> log entries following the mkfs mark one by one and check the consistency
>> of the file system with fsck.f2fs. In test generic/009, a fallocate test
>> using the FALLOC_FL_ZERO_RANGE mode, fsck.f2fs eventually complains:
> I just ran log-writes with fsstress and found one issue when replaying IOs.
> If you replay starting after mkfs.f2fs, you pick up a stale valid checkpoint
> that was written by a previous run. IOWs, at the beginning of the replay,
> there is no *correct* checkpoint representing that initial moment. So, I
> think you need to replay the log including mkfs.
>
> You can verify the below CKPT version info.
>
>> Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
>> Info: Segments per section = 1
>> Info: Sections per zone = 1
>> Info: sector size = 512
>> Info: total sectors = 2097152 (1024 MB)
>> Info: MKFS version
>>     "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>> Info: FSCK version
>>     from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>>       to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>> Info: superblock features = 0 :
>> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
>> Info: total FS sectors = 2097152 (1024 MB)
>> Info: CKPT version = 2a4679e0
>> Info: checkpoint state = 45 :  compacted_summary unmount
>>
>> NID[0x4c] is unreachable
>> NID[0x4d] is unreachable
>> [FSCK] Unreachable nat entries                        [Fail] [0x2]
>> [FSCK] SIT valid block bitmap checking                [Ok..]
>> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
>> [FSCK] valid_block_count matching with CP             [Ok..] [0x2]
>> [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0x1]
>> [FSCK] valid_node_count matcing with CP (nat lookup)  [Fail] [0x3]
>> [FSCK] valid_inode_count matched with CP              [Ok..] [0x1]
>> [FSCK] free segment_count matched with CP             [Ok..] [0x1f0]
>> [FSCK] next block offset is free                      [Ok..]
>> [FSCK] fixing SIT types
>> [FSCK] other corrupted bugs                           [Fail]
>>
>> Do you want to restore lost files into ./lost_found/? [Y/N] Y
>>    - File name         : 009.48244.2
>>    - File size         : 20,480 (bytes)
>> Do you want to fix this partition? [Y/N] Y
>>
>> The interesting part is that when I run fsck.f2fs with the -p option, it
>> doesn't complain at all!
>>
>> Would you please tell me why fsck.f2fs reports an inconsistency that
>> needs to be fixed? Does this violate the crash-consistency promise of f2fs?
> As I mentioned above, I guess you replayed with "--start-mark mkfs", which
> loses the initial checkpoint.
>
>> Moreover, why is fsck.f2fs silent with the -p option? Does it mean that
>> the f2fs kernel module does not consider the problem serious?
> The -p option takes a [level], and the default level is zero, which checks
> the image only if runtime f2fs reported a bug case before. Otherwise, it
> simply returns. If you set level 1, fsck.f2fs will check the basic FS
> metadata parts.
>
> Thanks,
>
>> I really appreciate your help.
>>
>> Thanks
>>
Hello,

As you advised, I used the snapshot mechanism to prevent the CKPT 
version from changing after each mount, and ran the generic tests of the 
xfstests framework again on top of the log-writes target with an f2fs 
file system. In order to automate the reporting of an inconsistency, I 
added a parameter to fsck.f2fs that makes it return -1 when the c.bug_on 
condition is met. To evaluate how f2fs reacts to a crash, I replay each 
log entry and check the consistency of f2fs with my own modified version 
of fsck.f2fs. Accordingly, all tests passed smoothly except these:

[FAIL] Running generic/013 failed. (consistency_single)
[FAIL] Running generic/070 failed. (consistency_single)
[FAIL] Running generic/113 failed. (consistency_single)
[FAIL] Running generic/241 failed. (consistency_single)

In other words, in these tests, c.bug_on() was true. Would you please 
describe why they become inconsistent?

Besides, I ran a sysbench database benchmark with 1 thread, 1000 
records, and 100 transactions on top of the log-writes target with f2fs. 
Interestingly, I encountered a weird inconsistency: after replaying 
about 100 log entries, fsck.f2fs complains about an inconsistency with 
the following messages:

Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 2097152 (1024 MB)
Info: MKFS version
   "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
Info: FSCK version
   from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
     to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
Info: superblock features = 0 :
Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
Info: total FS sectors = 2097152 (1024 MB)
Info: CKPT version = 2b59c128
Info: checkpoint state = 44 :  compacted_summary sudden-power-off
[ASSERT] (sanity_check_nid: 388)  --> nid[0x6] nat_entry->ino[0x6] 
footer.ino[0x0]

NID[0x6] is unreachable
NID[0x7] is unreachable
[FSCK] Unreachable nat entries                        [Fail] [0x2]
[FSCK] SIT valid block bitmap checking                [Fail]
[FSCK] Hard link checking for regular file            [Ok..] [0x0]
[FSCK] valid_block_count matching with CP             [Fail] [0x6dc9]
[FSCK] valid_node_count matcing with CP (de lookup)   [Fail] [0xe3]
[FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
[FSCK] valid_inode_count matched with CP              [Fail] [0x63]
[FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
[FSCK] next block offset is free                      [Ok..]
[FSCK] fixing SIT types
[FSCK] other corrupted bugs                           [Fail]

After canceling the test with Ctrl-C, without answering any of the Y/N 
questions, I ran fsck.f2fs again in another terminal, but the output is 
completely different:
[root@localhost CrashConsistencyTest]# ./locals/usr/local/sbin/fsck.f2fs 
/dev/sdc
Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 2097152 (1024 MB)
Info: MKFS version
   "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
Info: FSCK version
   from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
     to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 
4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 
2017"
Info: superblock features = 0 :
Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
Info: total FS sectors = 2097152 (1024 MB)
Info: CKPT version = 2b59c128
Info: checkpoint state = 44 :  compacted_summary sudden-power-off

[FSCK] Unreachable nat entries                        [Ok..] [0x0]
[FSCK] SIT valid block bitmap checking                [Ok..]
[FSCK] Hard link checking for regular file            [Ok..] [0x0]
[FSCK] valid_block_count matching with CP             [Ok..] [0x6dcf]
[FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0xe5]
[FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
[FSCK] valid_inode_count matched with CP              [Ok..] [0x64]
[FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
[FSCK] next block offset is free                      [Ok..]
[FSCK] fixing SIT types
[FSCK] other corrupted bugs                           [Ok..]

This situation raises a couple of questions:
1. How does an inconsistent file system turn into a consistent one in 
this case?
2. Why does the inconsistency occur at different log entry numbers; in 
other words, why is it unpredictable? Does the ordering of the logs 
depend on the disk controller and the I/O scheduler?

I appreciate your help.
Regards


* Re: f2fs Crash Consistency Problem
  2017-05-10 17:51   ` Raouf Rokhjavan
@ 2017-05-12  0:14     ` Jaegeuk Kim
  2017-05-12 18:30       ` Raouf Rokhjavan
  2017-05-17 17:43       ` Raouf Rokhjavan
  0 siblings, 2 replies; 14+ messages in thread
From: Jaegeuk Kim @ 2017-05-12  0:14 UTC (permalink / raw)
  To: Raouf Rokhjavan; +Cc: linux-f2fs-devel

Hi,

On 05/10, Raouf Rokhjavan wrote:
...

> As you advised, I used the snapshot mechanism to prevent the CKPT version
> from changing after each mount, and ran the generic tests of the xfstests
> framework again on top of the log-writes target with an f2fs file system.
> In order to automate the reporting of an inconsistency, I added a parameter
> to fsck.f2fs that makes it return -1 when the c.bug_on condition is met.
> To evaluate how f2fs reacts to a crash, I replay each log entry and check
> the consistency of f2fs with my own modified version of fsck.f2fs.
> Accordingly, all tests passed smoothly except these:
> 
> [FAIL] Running generic/013 failed. (consistency_single)

Could you check whether any IO made by mkfs was added to the replay log?
If so, fsck.f2fs would fail when replaying those entries.
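
For example, you can compare the mark positions with the same --find flags
your script already uses (a sketch; I'm assuming your mark naming here):

    # entry numbers of the marks in the log
    replay-log --log $LOGDEV --find --end-mark mkfs_end
    replay-log --log $LOGDEV --find --end-mark 013_start

    # if 013_start is not strictly after mkfs_end, mkfs IO leaked into
    # the range replayed for the test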

> [FAIL] Running generic/070 failed. (consistency_single)
> [FAIL] Running generic/113 failed. (consistency_single)

I added a replay mark at the beginning of generic/113 and ran the test.
But I couldn't find any error with test_dev set up as a log_dev. (I tested
this on the latest f2fs/dev-test branch.)

> [FAIL] Running generic/241 failed. (consistency_single)
> 
> In other words, in these tests, c.bug_on() was true. Would you please
> describe why they become inconsistent?
> 
> Besides, I ran a sysbench database benchmark with 1 thread, 1000 records,
> and 100 transactions on top of the log-writes target with f2fs.
> Interestingly, I encountered a weird inconsistency: after replaying about
> 100 log entries, fsck.f2fs complains about an inconsistency with the
> following messages:

Can you share the parameter for sysbench?

Thanks,

> 
> Info: Segments per section = 1
> Info: Sections per zone = 1
> Info: sector size = 512
> Info: total sectors = 2097152 (1024 MB)
> Info: MKFS version
>   "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
> 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
> Info: FSCK version
>   from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> 2017"
>     to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> 2017"
> Info: superblock features = 0 :
> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
> Info: total FS sectors = 2097152 (1024 MB)
> Info: CKPT version = 2b59c128
> Info: checkpoint state = 44 :  compacted_summary sudden-power-off
> [ASSERT] (sanity_check_nid: 388)  --> nid[0x6] nat_entry->ino[0x6]
> footer.ino[0x0]
> 
> NID[0x6] is unreachable
> NID[0x7] is unreachable
> [FSCK] Unreachable nat entries                        [Fail] [0x2]
> [FSCK] SIT valid block bitmap checking                [Fail]
> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
> [FSCK] valid_block_count matching with CP             [Fail] [0x6dc9]
> [FSCK] valid_node_count matcing with CP (de lookup)   [Fail] [0xe3]
> [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
> [FSCK] valid_inode_count matched with CP              [Fail] [0x63]
> [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
> [FSCK] next block offset is free                      [Ok..]
> [FSCK] fixing SIT types
> [FSCK] other corrupted bugs                           [Fail]
> 
> After canceling the test with Ctrl-C, without answering any of the Y/N
> questions, I ran fsck.f2fs again in another terminal, but the output is
> completely different:
> [root@localhost CrashConsistencyTest]# ./locals/usr/local/sbin/fsck.f2fs
> /dev/sdc
> Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
> Info: Segments per section = 1
> Info: Sections per zone = 1
> Info: sector size = 512
> Info: total sectors = 2097152 (1024 MB)
> Info: MKFS version
>   "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
> 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
> Info: FSCK version
>   from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> 2017"
>     to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> 2017"
> Info: superblock features = 0 :
> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
> Info: total FS sectors = 2097152 (1024 MB)
> Info: CKPT version = 2b59c128
> Info: checkpoint state = 44 :  compacted_summary sudden-power-off
> 
> [FSCK] Unreachable nat entries                        [Ok..] [0x0]
> [FSCK] SIT valid block bitmap checking                [Ok..]
> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
> [FSCK] valid_block_count matching with CP             [Ok..] [0x6dcf]
> [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0xe5]
> [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
> [FSCK] valid_inode_count matched with CP              [Ok..] [0x64]
> [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
> [FSCK] next block offset is free                      [Ok..]
> [FSCK] fixing SIT types
> [FSCK] other corrupted bugs                           [Ok..]
> 
> This situation raises a couple of questions:
> 1. How does an inconsistent file system turn into a consistent one in
> this case?
> 2. Why does the inconsistency occur at different log entry numbers; in
> other words, why is it unpredictable? Does the ordering of the logs
> depend on the disk controller and the I/O scheduler?
> 
> I appreciate your help.
> Regards


* Re: f2fs Crash Consistency Problem
  2017-05-12  0:14     ` Jaegeuk Kim
@ 2017-05-12 18:30       ` Raouf Rokhjavan
  2017-05-15 17:46         ` Jaegeuk Kim
  2017-05-17 17:43       ` Raouf Rokhjavan
  1 sibling, 1 reply; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-05-12 18:30 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel

Hi,

On 05/12/17 04:44, Jaegeuk Kim wrote:
> Hi,
>
> On 05/10, Raouf Rokhjavan wrote:
> ...
>
>> As you advised, I used the snapshot mechanism to prevent the CKPT version
>> from changing after each mount, and ran the generic tests of the xfstests
>> framework again on top of the log-writes target with an f2fs file system.
>> In order to automate the reporting of an inconsistency, I added a parameter
>> to fsck.f2fs that makes it return -1 when the c.bug_on condition is met.
>> To evaluate how f2fs reacts to a crash, I replay each log entry and check
>> the consistency of f2fs with my own modified version of fsck.f2fs.
>> Accordingly, all tests passed smoothly except these:
>>
>> [FAIL] Running generic/013 failed. (consistency_single)
> Could you check whether any IO made by mkfs was added to the replay log?
> If so, fsck.f2fs would fail when replaying those entries.
These two functions implement the whole logic that runs the different 
tests, including the sysbench db tests.
apply_test()
{
    local aTest
    aTest=$1

    # Create log-writes
    TABLE="0 ${BLKSIZE} log-writes ${REPLAYDEV} ${LOGDEV}"
    dmsetup create ${DEVMAP_LOG_WRITES} --table "${TABLE}"

    # Mark mkfs
    dmsetup message ${DEVMAP_LOG_WRITES} 0 mark mkfs_start
    ${TOOLS_DIR}/mkfs.f2fs ${DEVMAP_PART}
    dmsetup message ${DEVMAP_LOG_WRITES} 0 mark mkfs_end

    # Apply the test and mark it
    dmsetup message ${DEVMAP_LOG_WRITES} 0 mark ${aTest}_start
    ./check -E ./${CCTESTS_EXCLUDE} -s ${FSTYPE}_consistency ${aTest}
    dmsetup message ${DEVMAP_LOG_WRITES} 0 mark ${aTest}_end

    # Remove log-writes
    dmsetup remove ${DEVMAP_LOG_WRITES}
}

apply_consistency_test()
{
    local aTest
    aTest=$1

    echo "Replaying mkfs ..."
    ENTRY=$(${TOOLS_DIR}/replay-log --log $LOGDEV --find --end-mark mkfs_end)
    ${TOOLS_DIR}/replay-log --log $LOGDEV --replay $REPLAYDEV --end-mark mkfs_end

    START_ENTRY=$(${TOOLS_DIR}/replay-log --log $LOGDEV --find --end-mark ${aTest}_start)
    END_ENTRY=$(${TOOLS_DIR}/replay-log --log $LOGDEV --find --end-mark ${aTest}_end)

    echo "Replaying test $aTest ..."
    echo "START_ENTRY is $START_ENTRY"
    echo "END_ENTRY is $END_ENTRY"

    ${TOOLS_DIR}/replay-log -v --log $LOGDEV --replay $REPLAYDEV --start-mark ${aTest}_start \
        --fsck "$FSCK_SCRIPT" --check 1
}

First I call apply_test() and then apply_consistency_test().
I ran generic/013 a couple of times, but it failed haphazardly; what I 
mean is that it fails one time and passes another time! The same thing 
happened with the sysbench tests. I wonder whether the order in which 
the IOs are replayed can cause this behavior.
>> [FAIL] Running generic/070 failed. (consistency_single)
>> [FAIL] Running generic/113 failed. (consistency_single)
> I added a mark to replay in the beginning of generic/113, and ran the test.
> But, I couldn't find any error given test_dev as a log_dev. (I tested this
> in the latest f2fs/dev-test branch.)
>
>> [FAIL] Running generic/241 failed. (consistency_single)
>>
>> In other words, in these tests, c.bug_on() was true. Would you please
>> describe why they become inconsistent?
>>
>> Besides, I ran sysbench for database benchmark with 1 thread, 1000 records,
>> and 100 transactions on top of log-writes target with f2fs. Interestingly, I
>> encountered a weird inconsistency. After replaying about 100 logs, fsck.f2fs
>> complains about inconsistency with the following messages:
> Can you share the parameter for sysbench?
After starting mariadb and copying the /var/lib/mysql contents onto the 
f2fs partition, these commands are run:

#sysbench \
    --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
    --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
    --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock \
    --mysql-user=sysbench --mysql-password=password prepare

#sysbench \
    --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
    --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
    --mysql-user=sysbench --mysql-password=password --max-requests=100 \
    --num-threads=1 \
    --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock run

#sysbench \
    --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
    --db-driver=mysql --mysql-db=sysbench --mysql-user=sysbench \
    --mysql-password=password \
    --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock cleanup

These steps are done in a script file.
Honestly speaking, I couldn't find any pattern that reproduces these 
errors.

The most intriguing part, common to all of these tests, is that after 
fsck.f2fs prints a lot of messages on the screen and I cancel its 
execution after accepting the partition fix, it doesn't generate the 
same output again and reports that everything is fine with [Ok..]! Why 
does it behave this way?

Regards
>
> Thanks,
>
>> Info: Segments per section = 1
>> Info: Sections per zone = 1
>> Info: sector size = 512
>> Info: total sectors = 2097152 (1024 MB)
>> Info: MKFS version
>>    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
>> 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
>> Info: FSCK version
>>    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>>      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>> Info: superblock features = 0 :
>> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
>> Info: total FS sectors = 2097152 (1024 MB)
>> Info: CKPT version = 2b59c128
>> Info: checkpoint state = 44 :  compacted_summary sudden-power-off
>> [ASSERT] (sanity_check_nid: 388)  --> nid[0x6] nat_entry->ino[0x6]
>> footer.ino[0x0]
>>
>> NID[0x6] is unreachable
>> NID[0x7] is unreachable
>> [FSCK] Unreachable nat entries                        [Fail] [0x2]
>> [FSCK] SIT valid block bitmap checking                [Fail]
>> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
>> [FSCK] valid_block_count matching with CP             [Fail] [0x6dc9]
>> [FSCK] valid_node_count matcing with CP (de lookup)   [Fail] [0xe3]
>> [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
>> [FSCK] valid_inode_count matched with CP              [Fail] [0x63]
>> [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
>> [FSCK] next block offset is free                      [Ok..]
>> [FSCK] fixing SIT types
>> [FSCK] other corrupted bugs                           [Fail]
>>
>> After canceling the test with Ctrl-C, without answering any of the Y/N
>> questions, I ran fsck.f2fs again in another terminal, but the output is
>> completely different:
>> [root@localhost CrashConsistencyTest]# ./locals/usr/local/sbin/fsck.f2fs
>> /dev/sdc
>> Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
>> Info: Segments per section = 1
>> Info: Sections per zone = 1
>> Info: sector size = 512
>> Info: total sectors = 2097152 (1024 MB)
>> Info: MKFS version
>>    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
>> 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
>> Info: FSCK version
>>    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>>      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>> Info: superblock features = 0 :
>> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
>> Info: total FS sectors = 2097152 (1024 MB)
>> Info: CKPT version = 2b59c128
>> Info: checkpoint state = 44 :  compacted_summary sudden-power-off
>>
>> [FSCK] Unreachable nat entries                        [Ok..] [0x0]
>> [FSCK] SIT valid block bitmap checking                [Ok..]
>> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
>> [FSCK] valid_block_count matching with CP             [Ok..] [0x6dcf]
>> [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0xe5]
>> [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
>> [FSCK] valid_inode_count matched with CP              [Ok..] [0x64]
>> [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
>> [FSCK] next block offset is free                      [Ok..]
>> [FSCK] fixing SIT types
>> [FSCK] other corrupted bugs                           [Ok..]
>>
>> This situation raises a couple of questions:
>> 1. How does an inconsistent file system turn into a consistent one in
>> this case?
>> 2. Why does the inconsistency occur at different log entry numbers; in
>> other words, why is it unpredictable? Does the ordering of the logs
>> depend on the disk controller and the I/O scheduler?
>>
>> I appreciate your help.
>> Regards



* Re: f2fs Crash Consistency Problem
  2017-05-12 18:30       ` Raouf Rokhjavan
@ 2017-05-15 17:46         ` Jaegeuk Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Jaegeuk Kim @ 2017-05-15 17:46 UTC (permalink / raw)
  To: Raouf Rokhjavan; +Cc: linux-f2fs-devel

Hi,

On 05/12, Raouf Rokhjavan wrote:
> Hi,
> 
> On 05/12/17 04:44, Jaegeuk Kim wrote:
> > Hi,
> > 
> > On 05/10, Raouf Rokhjavan wrote:
> > ...
> > 
> > > As you advised, I used the snapshot mechanism to prevent the CKPT version
> > > from changing after each mount, and ran the generic tests of the xfstests
> > > framework again on top of the log-writes target with an f2fs file system.
> > > In order to automate the reporting of an inconsistency, I added a parameter
> > > to fsck.f2fs that makes it return -1 when the c.bug_on condition is met.
> > > To evaluate how f2fs reacts to a crash, I replay each log entry and check
> > > the consistency of f2fs with my own modified version of fsck.f2fs.
> > > Accordingly, all tests passed smoothly except these:
> > > 
> > > [FAIL] Running generic/013 failed. (consistency_single)
> > Could you check whether any IO made by mkfs was added to the replay log?
> > If so, fsck.f2fs would fail when replaying those entries.
> These two functions implement the whole logic that runs the different
> tests, including the sysbench db tests.
> 
> apply_test()
> {
>     local aTest
>     aTest=$1
> 
>     # Create log-writes
>     TABLE="0 ${BLKSIZE} log-writes ${REPLAYDEV} ${LOGDEV}"
>     dmsetup create ${DEVMAP_LOG_WRITES} --table "${TABLE}"
> 
>     # Mark mkfs
>     dmsetup message ${DEVMAP_LOG_WRITES} 0 mark mkfs_start
>     ${TOOLS_DIR}/mkfs.f2fs ${DEVMAP_PART}
>     dmsetup message ${DEVMAP_LOG_WRITES} 0 mark mkfs_end
> 
>     # Apply the test and mark it
>     dmsetup message ${DEVMAP_LOG_WRITES} 0 mark ${aTest}_start
>     ./check -E ./${CCTESTS_EXCLUDE} -s ${FSTYPE}_consistency ${aTest}

Here, what I did was add the above start mark in the test script directly.
For example, I added it just before running fsstress in tests/generic/070,
since each test may call mkfs itself, which wipes the device while adding
log entries. In that case, fsck would not find the superblock, resulting in
a failure.

Thanks,

>     dmsetup message ${DEVMAP_LOG_WRITES} 0 mark ${aTest}_end
> 
>     # Remove log-writes
>     dmsetup remove ${DEVMAP_LOG_WRITES}
> }
> 
> apply_consistency_test()
> {
>     local aTest
>     aTest=$1
> 
>     echo "Replaying mkfs ..."
>     ENTRY=$(${TOOLS_DIR}/replay-log --log $LOGDEV --find --end-mark mkfs_end)
>     ${TOOLS_DIR}/replay-log --log $LOGDEV --replay $REPLAYDEV --end-mark mkfs_end
> 
>     START_ENTRY=$(${TOOLS_DIR}/replay-log --log $LOGDEV --find --end-mark ${aTest}_start)
>     END_ENTRY=$(${TOOLS_DIR}/replay-log --log $LOGDEV --find --end-mark ${aTest}_end)
> 
>     echo "Replaying test $aTest ..."
>     echo "START_ENTRY is $START_ENTRY"
>     echo "END_ENTRY is $END_ENTRY"
> 
>     ${TOOLS_DIR}/replay-log -v --log $LOGDEV --replay $REPLAYDEV --start-mark ${aTest}_start \
>         --fsck "$FSCK_SCRIPT" --check 1
> }
> 
> First I call apply_test() and then apply_consistency_test().
> I ran generic test #013 a couple of times, but it failed haphazardly; what I
> mean is that it fails one time and passes another! The same thing happened
> with the sysbench tests. I wonder whether the ordering of the replayed IOs
> can cause this behavior.
> > > [FAIL] Running generic/070 failed. (consistency_single)
> > > [FAIL] Running generic/113 failed. (consistency_single)
> > I added a mark to replay in the beginning of generic/113, and ran the test.
> > But, I couldn't find any error given test_dev as a log_dev. (I tested this
> > in the latest f2fs/dev-test branch.)
> > 
> > > [FAIL] Running generic/241 failed. (consistency_single)
> > > 
> > > In other words, in these tests, c.bug_on() was true. Would you please
> > > describe why they become inconsistent?
> > > 
> > > Besides, I ran sysbench for database benchmark with 1 thread, 1000 records,
> > > and 100 transactions on top of log-writes target with f2fs. Interestingly, I
> > > encountered a weird inconsistency. After replaying about 100 logs, fsck.f2fs
> > > complains about inconsistency with the following messages:
> > Can you share the parameter for sysbench?
> After starting mariadb and copying the /var/lib/mysql contents into the f2fs
> partition, these commands are run:
> 
> #sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
>     --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
>     --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock \
>     --mysql-user=sysbench --mysql-password=password prepare
> 
> #sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
>     --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
>     --mysql-user=sysbench --mysql-password=password --max-requests=100 \
>     --num-threads=1 \
>     --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock run
> 
> #sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
>     --db-driver=mysql --mysql-db=sysbench --mysql-user=sysbench \
>     --mysql-password=password \
>     --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock cleanup
> 
> These steps are done in a script file.
> Honestly speaking, I couldn't find any pattern that reproduces these errors.
> 
> The most intriguing part, common to all of these tests, is that after
> fsck.f2fs prints a lot of messages on the screen and I cancel its execution
> while accepting the partition fix, it doesn't generate the same output again
> and says everything is fine with [Ok..]!!! Why does it behave this way?
> 
> Regards
> > 
> > Thanks,
> > 
> > > Info: Segments per section = 1
> > > Info: Sections per zone = 1
> > > Info: sector size = 512
> > > Info: total sectors = 2097152 (1024 MB)
> > > Info: MKFS version
> > >    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
> > > 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
> > > Info: FSCK version
> > >    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > >      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > > Info: superblock features = 0 :
> > > Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
> > > Info: total FS sectors = 2097152 (1024 MB)
> > > Info: CKPT version = 2b59c128
> > > Info: checkpoint state = 44 :  compacted_summary sudden-power-off
> > > [ASSERT] (sanity_check_nid: 388)  --> nid[0x6] nat_entry->ino[0x6]
> > > footer.ino[0x0]
> > > 
> > > NID[0x6] is unreachable
> > > NID[0x7] is unreachable
> > > [FSCK] Unreachable nat entries                        [Fail] [0x2]
> > > [FSCK] SIT valid block bitmap checking                [Fail]
> > > [FSCK] Hard link checking for regular file            [Ok..] [0x0]
> > > [FSCK] valid_block_count matching with CP             [Fail] [0x6dc9]
> > > [FSCK] valid_node_count matcing with CP (de lookup)   [Fail] [0xe3]
> > > [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
> > > [FSCK] valid_inode_count matched with CP              [Fail] [0x63]
> > > [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
> > > [FSCK] next block offset is free                      [Ok..]
> > > [FSCK] fixing SIT types
> > > [FSCK] other corrupted bugs                           [Fail]
> > > 
> > > After canceling the test by using Ctrl-C without answering any YES/NO
> > > questions, on another terminal I run fsck.f2fs again, but the output is
> > > completely different:
> > > [root@localhost CrashConsistencyTest]# ./locals/usr/local/sbin/fsck.f2fs
> > > /dev/sdc
> > > Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
> > > Info: Segments per section = 1
> > > Info: Sections per zone = 1
> > > Info: sector size = 512
> > > Info: total sectors = 2097152 (1024 MB)
> > > Info: MKFS version
> > >    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
> > > 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
> > > Info: FSCK version
> > >    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > >      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > > Info: superblock features = 0 :
> > > Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
> > > Info: total FS sectors = 2097152 (1024 MB)
> > > Info: CKPT version = 2b59c128
> > > Info: checkpoint state = 44 :  compacted_summary sudden-power-off
> > > 
> > > [FSCK] Unreachable nat entries                        [Ok..] [0x0]
> > > [FSCK] SIT valid block bitmap checking                [Ok..]
> > > [FSCK] Hard link checking for regular file            [Ok..] [0x0]
> > > [FSCK] valid_block_count matching with CP             [Ok..] [0x6dcf]
> > > [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0xe5]
> > > [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
> > > [FSCK] valid_inode_count matched with CP              [Ok..] [0x64]
> > > [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
> > > [FSCK] next block offset is free                      [Ok..]
> > > [FSCK] fixing SIT types
> > > [FSCK] other corrupted bugs                           [Ok..]
> > > 
> > > This situation raises a couple of questions:
> > > 1. How  does an inconsistent file system turn into a consistent one in this
> > > case?
> > > 2. Why does an inconsistency occur in different log numbers; in other words,
> > > why is it unpredictable?  Does ordering of logs have to do with disk
> > > controller and I/O scheduler?
> > > 
> > > I do appreciate your help.
> > > Regards


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: f2fs Crash Consistency Problem
  2017-05-12  0:14     ` Jaegeuk Kim
  2017-05-12 18:30       ` Raouf Rokhjavan
@ 2017-05-17 17:43       ` Raouf Rokhjavan
  2017-05-17 18:01         ` Jaegeuk Kim
  1 sibling, 1 reply; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-05-17 17:43 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel

On 05/12/17 04:44, Jaegeuk Kim wrote:
> Hi,
>
> On 05/10, Raouf Rokhjavan wrote:
> ...
>
>> As you suggested, to keep the ckpt number from changing after each mount I
>> used the snapshot mechanism and ran the generic tests of the xfstests
>> framework again on top of the log-writes target with an f2fs file system.
>> To automate the reporting of an inconsistency, I added a parameter to
>> fsck.f2fs to return -1 when the c.bug_on condition is met. To evaluate how
>> f2fs reacts to a crash, I replay each log and check the consistency of f2fs
>> with my own modified version of fsck.f2fs. Accordingly, all tests passed
>> smoothly except these:
>>
>> [FAIL] Running generic/013 failed. (consistency_single)
> Could you check whether any IO made by mkfs was added in the replay log?
> If so, fsck.f2fs should be failed when replaying them.
>
>> [FAIL] Running generic/070 failed. (consistency_single)
>> [FAIL] Running generic/113 failed. (consistency_single)
> I added a mark to replay in the beginning of generic/113, and ran the test.
> But, I couldn't find any error given test_dev as a log_dev. (I tested this
> in the latest f2fs/dev-test branch.)
>
>> [FAIL] Running generic/241 failed. (consistency_single)
>>
>> In other words, in these tests, c.bug_on() was true. Would you please
>> describe why they become inconsistent?
>>
>> Besides, I ran sysbench for database benchmark with 1 thread, 1000 records,
>> and 100 transactions on top of log-writes target with f2fs. Interestingly, I
>> encountered a weird inconsistency. After replaying about 100 logs, fsck.f2fs
>> complains about inconsistency with the following messages:
> Can you share the parameter for sysbench?
Hi,

Since I want to make sure that my system, which hosts a database application,
stays operational after a power failure, I am testing a database workload on
top of f2fs. I use sysbench together with dm-log-writes for this purpose, and
I took advantage of sysbench's Lua scripting facility to implement write-only
database operations:

#sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
    --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
    --mysql-user=sysbench --mysql-password=password --max-requests=100 \
    --num-threads=1 \
    --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock run

I ran this test on 3 configurations:
1- ext4 (ordered, noatime) - success 15/15
2- ext4 (norecovery, noatime) - success 0/15
3- f2fs (noatime) - success 3/15

Success here means that the file system is operational after each replay
without having to run fsck and repair it.
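(For concreteness, the per-replay check is essentially this sketch, relying on
the modified fsck.f2fs described earlier that returns non-zero when c.bug_on
fires; the snapshot device path is an assumption:)

# invoked via replay-log --fsck after every replayed entry;
# a non-zero exit from the modified fsck.f2fs counts the replay as a failure
if ! ${TOOLS_DIR}/fsck.f2fs /dev/mapper/replay-snap; then
    echo "inconsistency detected at this replay entry"
    exit 1
fi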
As the results show, ext4 with ordered journaling survives this test, while,
as expected, ext4 without journal recovery behaves like ext2 and needs fsck
to recover the file system after the simulated power loss.
The surprising part of this test is f2fs. Since f2fs always maintains a
stable checkpoint of the file system and, according to its FAST paper, always
rolls back to that stable checkpoint after a power loss, I didn't expect to
see f2fs in an inconsistent state after replaying the logs, as fsck.f2fs
reports. (It's worth mentioning that we check the consistency of f2fs right
after mkfs.f2fs; the ext4 results confirm this methodology.)

Unfortunately, the results are not reproducible: the inconsistency occurs at
different log entries, and fsck.f2fs occasionally passes the test altogether.
To give more accurate information, I uploaded the output of fsck.f2fs to
Google Drive:

https://drive.google.com/open?id=0BxdqCs3G6wd3UWtDTmRGbFBiYmc

Regards,
>
> Thanks,
>
>> Info: Segments per section = 1
>> Info: Sections per zone = 1
>> Info: sector size = 512
>> Info: total sectors = 2097152 (1024 MB)
>> Info: MKFS version
>>    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
>> 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
>> Info: FSCK version
>>    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>>      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>> Info: superblock features = 0 :
>> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
>> Info: total FS sectors = 2097152 (1024 MB)
>> Info: CKPT version = 2b59c128
>> Info: checkpoint state = 44 :  compacted_summary sudden-power-off
>> [ASSERT] (sanity_check_nid: 388)  --> nid[0x6] nat_entry->ino[0x6]
>> footer.ino[0x0]
>>
>> NID[0x6] is unreachable
>> NID[0x7] is unreachable
>> [FSCK] Unreachable nat entries                        [Fail] [0x2]
>> [FSCK] SIT valid block bitmap checking                [Fail]
>> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
>> [FSCK] valid_block_count matching with CP             [Fail] [0x6dc9]
>> [FSCK] valid_node_count matcing with CP (de lookup)   [Fail] [0xe3]
>> [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
>> [FSCK] valid_inode_count matched with CP              [Fail] [0x63]
>> [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
>> [FSCK] next block offset is free                      [Ok..]
>> [FSCK] fixing SIT types
>> [FSCK] other corrupted bugs                           [Fail]
>>
>> After canceling the test by using Ctrl-C without answering any YES/NO
>> questions, on another terminal I run fsck.f2fs again, but the output is
>> completely different:
>> [root@localhost CrashConsistencyTest]# ./locals/usr/local/sbin/fsck.f2fs
>> /dev/sdc
>> Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
>> Info: Segments per section = 1
>> Info: Sections per zone = 1
>> Info: sector size = 512
>> Info: total sectors = 2097152 (1024 MB)
>> Info: MKFS version
>>    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
>> 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
>> Info: FSCK version
>>    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>>      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
>> 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
>> 2017"
>> Info: superblock features = 0 :
>> Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
>> Info: total FS sectors = 2097152 (1024 MB)
>> Info: CKPT version = 2b59c128
>> Info: checkpoint state = 44 :  compacted_summary sudden-power-off
>>
>> [FSCK] Unreachable nat entries                        [Ok..] [0x0]
>> [FSCK] SIT valid block bitmap checking                [Ok..]
>> [FSCK] Hard link checking for regular file            [Ok..] [0x0]
>> [FSCK] valid_block_count matching with CP             [Ok..] [0x6dcf]
>> [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0xe5]
>> [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
>> [FSCK] valid_inode_count matched with CP              [Ok..] [0x64]
>> [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
>> [FSCK] next block offset is free                      [Ok..]
>> [FSCK] fixing SIT types
>> [FSCK] other corrupted bugs                           [Ok..]
>>
>> This situation raises a couple of questions:
>> 1. How  does an inconsistent file system turn into a consistent one in this
>> case?
>> 2. Why does an inconsistency occur in different log numbers; in other words,
>> why is it unpredictable?  Does ordering of logs have to do with disk
>> controller and I/O scheduler?
>>
>> I do appreciate your help.
>> Regards



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: f2fs Crash Consistency Problem
  2017-05-17 17:43       ` Raouf Rokhjavan
@ 2017-05-17 18:01         ` Jaegeuk Kim
  2017-05-24 19:58           ` Raouf Rokhjavan
  0 siblings, 1 reply; 14+ messages in thread
From: Jaegeuk Kim @ 2017-05-17 18:01 UTC (permalink / raw)
  To: Raouf Rokhjavan; +Cc: linux-f2fs-devel

On 05/17, Raouf Rokhjavan wrote:
> On 05/12/17 04:44, Jaegeuk Kim wrote:
> > Hi,
> > 
> > On 05/10, Raouf Rokhjavan wrote:
> > ...
> > 
> > > As you suggested, to keep the ckpt number from changing after each mount I
> > > used the snapshot mechanism and ran the generic tests of the xfstests
> > > framework again on top of the log-writes target with an f2fs file system.
> > > To automate the reporting of an inconsistency, I added a parameter to
> > > fsck.f2fs to return -1 when the c.bug_on condition is met. To evaluate how
> > > f2fs reacts to a crash, I replay each log and check the consistency of f2fs
> > > with my own modified version of fsck.f2fs. Accordingly, all tests passed
> > > smoothly except these:
> > > 
> > > [FAIL] Running generic/013 failed. (consistency_single)
> > Could you check whether any IO made by mkfs was added in the replay log?
> > If so, fsck.f2fs should be failed when replaying them.
> > 
> > > [FAIL] Running generic/070 failed. (consistency_single)
> > > [FAIL] Running generic/113 failed. (consistency_single)
> > I added a mark to replay in the beginning of generic/113, and ran the test.
> > But, I couldn't find any error given test_dev as a log_dev. (I tested this
> > in the latest f2fs/dev-test branch.)
> > 
> > > [FAIL] Running generic/241 failed. (consistency_single)
> > > 
> > > In other words, in these tests, c.bug_on() was true. Would you please
> > > describe why they become inconsistent?
> > > 
> > > Besides, I ran sysbench for database benchmark with 1 thread, 1000 records,
> > > and 100 transactions on top of log-writes target with f2fs. Interestingly, I
> > > encountered a weird inconsistency. After replaying about 100 logs, fsck.f2fs
> > > complains about inconsistency with the following messages:
> > Can you share the parameter for sysbench?
> Hi,
> 
> Since I want to make sure that my system, which hosts a database application,
> stays operational after a power failure, I am testing a database workload on
> top of f2fs. I use sysbench together with dm-log-writes for this purpose, and
> I took advantage of sysbench's Lua scripting facility to implement write-only
> database operations:
> 
> #sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
>     --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
>     --mysql-user=sysbench --mysql-password=password --max-requests=100 \
>     --num-threads=1 \
>     --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock run
> 
> I ran this test on 3 configurations:
> 1- ext4 (ordered, noatime) - success 15/15
> 2- ext4 (norecovery, noatime) - success 0/15
> 3- f2fs (noatime) - success 3/15
> 
> Success here means that the file system is operational after each replay
> without having to run fsck and repair it.
> As the results show, ext4 with ordered journaling survives this test, while,
> as expected, ext4 without journal recovery behaves like ext2 and needs fsck
> to recover the file system after the simulated power loss.
> The surprising part of this test is f2fs. Since f2fs always maintains a
> stable checkpoint of the file system and, according to its FAST paper, always
> rolls back to that stable checkpoint after a power loss, I didn't expect to
> see f2fs in an inconsistent state after replaying the logs, as fsck.f2fs
> reports. (It's worth mentioning that we check the consistency of f2fs right
> after mkfs.f2fs; the ext4 results confirm this methodology.)
> 
> Unfortunately, the results are not reproducible: the inconsistency occurs at
> different log entries, and fsck.f2fs occasionally passes the test altogether.
> To give more accurate information, I uploaded the output of fsck.f2fs to
> Google Drive:
> 
> https://drive.google.com/open?id=0BxdqCs3G6wd3UWtDTmRGbFBiYmc

Hi,

Could you please check:
- did you use a snapshot device?
- what command was issued at #1687?
- how's the result of fsck.f2fs -d 3?
- can you share your log-dev image?

Thanks,

> 
> Regards,
> > 
> > Thanks,
> > 
> > > Info: Segments per section = 1
> > > Info: Sections per zone = 1
> > > Info: sector size = 512
> > > Info: total sectors = 2097152 (1024 MB)
> > > Info: MKFS version
> > >    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
> > > 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
> > > Info: FSCK version
> > >    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > >      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > > Info: superblock features = 0 :
> > > Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
> > > Info: total FS sectors = 2097152 (1024 MB)
> > > Info: CKPT version = 2b59c128
> > > Info: checkpoint state = 44 :  compacted_summary sudden-power-off
> > > [ASSERT] (sanity_check_nid: 388)  --> nid[0x6] nat_entry->ino[0x6]
> > > footer.ino[0x0]
> > > 
> > > NID[0x6] is unreachable
> > > NID[0x7] is unreachable
> > > [FSCK] Unreachable nat entries                        [Fail] [0x2]
> > > [FSCK] SIT valid block bitmap checking                [Fail]
> > > [FSCK] Hard link checking for regular file            [Ok..] [0x0]
> > > [FSCK] valid_block_count matching with CP             [Fail] [0x6dc9]
> > > [FSCK] valid_node_count matcing with CP (de lookup)   [Fail] [0xe3]
> > > [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
> > > [FSCK] valid_inode_count matched with CP              [Fail] [0x63]
> > > [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
> > > [FSCK] next block offset is free                      [Ok..]
> > > [FSCK] fixing SIT types
> > > [FSCK] other corrupted bugs                           [Fail]
> > > 
> > > After canceling the test by using Ctrl-C without answering any YES/NO
> > > questions, on another terminal I run fsck.f2fs again, but the output is
> > > completely different:
> > > [root@localhost CrashConsistencyTest]# ./locals/usr/local/sbin/fsck.f2fs
> > > /dev/sdc
> > > Info: [/dev/sdc] Disk Model: VMware Virtual S1.0
> > > Info: Segments per section = 1
> > > Info: Sections per zone = 1
> > > Info: sector size = 512
> > > Info: total sectors = 2097152 (1024 MB)
> > > Info: MKFS version
> > >    "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version 4.8.5
> > > 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST 2017"
> > > Info: FSCK version
> > >    from "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > >      to "Linux version 4.9.8 (roraoof@desktopr.example.com) (gcc version
> > > 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Tue Feb 7 08:24:57 IRST
> > > 2017"
> > > Info: superblock features = 0 :
> > > Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
> > > Info: total FS sectors = 2097152 (1024 MB)
> > > Info: CKPT version = 2b59c128
> > > Info: checkpoint state = 44 :  compacted_summary sudden-power-off
> > > 
> > > [FSCK] Unreachable nat entries                        [Ok..] [0x0]
> > > [FSCK] SIT valid block bitmap checking                [Ok..]
> > > [FSCK] Hard link checking for regular file            [Ok..] [0x0]
> > > [FSCK] valid_block_count matching with CP             [Ok..] [0x6dcf]
> > > [FSCK] valid_node_count matcing with CP (de lookup)   [Ok..] [0xe5]
> > > [FSCK] valid_node_count matcing with CP (nat lookup)  [Ok..] [0xe5]
> > > [FSCK] valid_inode_count matched with CP              [Ok..] [0x64]
> > > [FSCK] free segment_count matched with CP             [Ok..] [0x1c6]
> > > [FSCK] next block offset is free                      [Ok..]
> > > [FSCK] fixing SIT types
> > > [FSCK] other corrupted bugs                           [Ok..]
> > > 
> > > This situation raises a couple of questions:
> > > 1. How  does an inconsistent file system turn into a consistent one in this
> > > case?
> > > 2. Why does an inconsistency occur in different log numbers; in other words,
> > > why is it unpredictable?  Does ordering of logs have to do with disk
> > > controller and I/O scheduler?
> > > 
> > > I do appreciate your help.
> > > Regards


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: f2fs Crash Consistency Problem
  2017-05-17 18:01         ` Jaegeuk Kim
@ 2017-05-24 19:58           ` Raouf Rokhjavan
  2017-05-25 23:44             ` Jaegeuk Kim
  0 siblings, 1 reply; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-05-24 19:58 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel


Hi

First of all, I'm really sorry for my absence and for replying so late.

On 05/17/17 22:31, Jaegeuk Kim wrote:
> On 05/17, Raouf Rokhjavan wrote:
>> Hi,
>>
>> Since I want to make sure that my system, which hosts a database application,
>> stays operational after a power failure, I am testing a database workload on
>> top of f2fs. I use sysbench together with dm-log-writes for this purpose, and
>> I took advantage of sysbench's Lua scripting facility to implement write-only
>> database operations:
>>
>> #sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
>>     --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
>>     --mysql-user=sysbench --mysql-password=password --max-requests=100 \
>>     --num-threads=1 \
>>     --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock run
>>
>> I ran this test on 3 configurations:
>> 1- ext4 (ordered, noatime) - success 15/15
>> 2- ext4 (norecovery, noatime) - success 0/15
>> 3- f2fs (noatime) - success 3/15
>>
>> Success here means that the file system is operational after each replay
>> without having to run fsck and repair it.
>> As the results show, ext4 with ordered journaling survives this test, while,
>> as expected, ext4 without journal recovery behaves like ext2 and needs fsck
>> to recover the file system after the simulated power loss.
>> The surprising part of this test is f2fs. Since f2fs always maintains a
>> stable checkpoint of the file system and, according to its FAST paper, always
>> rolls back to that stable checkpoint after a power loss, I didn't expect to
>> see f2fs in an inconsistent state after replaying the logs, as fsck.f2fs
>> reports. (It's worth mentioning that we check the consistency of f2fs right
>> after mkfs.f2fs; the ext4 results confirm this methodology.)
>>
>> Unfortunately, the results are not reproducible: the inconsistency occurs at
>> different log entries, and fsck.f2fs occasionally passes the test altogether.
>> To give more accurate information, I uploaded the output of fsck.f2fs to
>> Google Drive.
>>
>> https://drive.google.com/open?id=0BxdqCs3G6wd3UWtDTmRGbFBiYmc
> Hi,
Honestly speaking, I didn't expect to encounter such a confusing situation
when I decided to verify the resilience of f2fs after a power failure!!! :)

The main thing that baffles me is that I haven't seen consistent behavior
between ext4 and f2fs.
As I said before, ext4 passes every sysbench run in which I replay the
log-writes entries one by one, each followed by fsck; it shows no
inconsistency at all.
Moreover, ext4 with the norecovery option fails, as we expect, in all tests
and needs the file system fixed after the simulated power failure.
On the contrary, f2fs shows peculiar behavior: it haphazardly passes or fails
a test on different runs!
> Could you please check:
> - did you use a snapshot device?
To show that I use dm-snap correctly in my scripts, I developed
fsck_snap_f2fs_only.sh, which logs the CKPT versions of f2fs at different
stages: before, during, and after the snapshot. You can see the results here:
CKPT version output, passed test - 
https://drive.google.com/file/d/0BxdqCs3G6wd3aTNPS1pfRWlIWk0/view?usp=sharing
fsck output, passed test - 
https://drive.google.com/file/d/0BxdqCs3G6wd3Nm5DSk9DX0tLUDg/view?usp=sharing
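(The logging itself is roughly this sketch - it just wraps the "dump.f2fs and
grep CKPT" step mentioned below; the device paths and log file name are
placeholders:)

# record the checkpoint version at each stage; if the snapshot isolates
# writes correctly, the CKPT on the origin device must never change
for dev in ${REPLAYDEV} /dev/mapper/replay-snap; do
    echo "=== $dev ===" >> ckpt_versions.log
    dump.f2fs $dev | grep CKPT >> ckpt_versions.log
done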

> - what command was issued at #1687?
An important point is that the failures don't occur at fixed positions;
consequently, they aren't reproducible. As for the command issued at #1687, I
don't know exactly, since I call the sysbench program from my bash script to
run a write-only database benchmark while capturing the disk logs via
log-writes; sysbench, in turn, calls a Lua script to accomplish this task.
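(If it helps to pin that down - and assuming your replay-log build supports
--limit - one option is to stop the replay right at that entry and inspect the
device directly:)

# replay only the first 1687 entries, then examine the result in place
${TOOLS_DIR}/replay-log --log $LOGDEV --replay $REPLAYDEV --limit 1687
${TOOLS_DIR}/fsck.f2fs -d 3 ${REPLAYDEV}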

> - how's the result of fsck.f2fs -d 3?
I ran another test (with FSCK_SCRIPT=./fsck_script/fsck_snap.sh in the config)
to capture the inconsistent condition. The outputs are available here:
fsck outputs, failed test -
https://drive.google.com/file/d/0BxdqCs3G6wd3cy04TXd6QTBsbzA/view?usp=sharing
fsck -d3 output, failed test -
https://drive.google.com/file/d/0BxdqCs3G6wd3MXVzUHBGZEhlSFk/view?usp=sharing

> - can you share your log-dev image?
After you asked me to share my log-dev, I was intrigued to replay the log-dev
that had shown the inconsistency, but, surprisingly, fsck.f2fs no longer
complains at that point, and it again behaves unpredictably!!! What I mean is
that, while replaying the very log-dev in which fsck.f2fs had reported an
inconsistency, fsck_snap.sh passed one time and failed another time, at a
different log number!!! A couple of theories come to mind:
1) A bug in log-writes causes this behavior.
2) The virtualized block device in VMware causes this behavior - because it's
not an SSD.
3) Something is wrong with fsck.f2fs.

Another important thing is the continuous errors in the kernel log while
replaying and checking the consistency of the file system:
- Buffer I/O error on dm-2, logical block X, async page read (replay-base;
snapshot origin device)
- Buffer I/O error on dm-3, logical block Y, async page read (replay-cow;
COW-based snapshot device)
- ...
- buffer_io_error: Z calls suppressed
However, these kernel log errors are generated in all conditions, f2fs
{success, fail} and ext4.

*** IMPORTANT ***
The most interesting part of my tests happened when I added
fsck_snap_f2fs_only.sh to check that dm-snapshot is used correctly in my
scripts. As I said, it only gets the CKPT by calling dump.f2fs, grepping for
CKPT, and logging it - nothing more; yet the results are absolutely
surprising: 15/15 tests passed. I don't know why, because nothing changed in
the tests' logic. The main difference is that the tests take longer to finish,
since I call more programs to grep the CKPT.

I put the code of my tests on GitHub, so you can run it and get the results:
http://github.com/raoufro/CrashConsistencyTest.git

What causes this weird behavior of f2fs in these tests?

Regards,

>
> Thanks,
>
>> Regards,



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: f2fs Crash Consistency Problem
  2017-05-24 19:58           ` Raouf Rokhjavan
@ 2017-05-25 23:44             ` Jaegeuk Kim
       [not found]               ` <20170526022213.GA54408@jaegeuk-macbookpro.roam.corp.google.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Jaegeuk Kim @ 2017-05-25 23:44 UTC (permalink / raw)
  To: Raouf Rokhjavan; +Cc: linux-f2fs-devel

On 05/25, Raouf Rokhjavan wrote:
> 
> Hi
> 
> First of all, I'm really sorry for my absence and for replying so late.
> 
> On 05/17/17 22:31, Jaegeuk Kim wrote:
> > On 05/17, Raouf Rokhjavan wrote:
> > > Hi,
> > > 
> > > Since I want to make sure that my system, which hosts a database
> > > application, stays operational after a power failure, I am testing a
> > > database workload on top of f2fs. I use sysbench together with
> > > dm-log-writes for this purpose, and I took advantage of sysbench's Lua
> > > scripting facility to implement write-only database operations:
> > > 
> > > #sysbench --test=/home/roraouf/Projects/CrashConsistencyTest/locals/var/lib/dbtests/sysbench-lua/tests/db/oltp_write_only.lua \
> > >     --db-driver=mysql --oltp-table-size=1000 --mysql-db=sysbench \
> > >     --mysql-user=sysbench --mysql-password=password --max-requests=100 \
> > >     --num-threads=1 \
> > >     --mysql-socket=/mnt/crash_consistency/f2fs/mysql/mysql.sock run
> > > 
> > > I ran this test on 3 configurations:
> > > 1- ext4 (ordered, noatime) - success 15/15
> > > 2- ext4 (norecovery, noatime) - success 0/15
> > > 3- f2fs (noatime) - success 3/15
> > > 
> > > Success here means that the file system is operational after each replay
> > > without having to run fsck and repair it.
> > > As the results show, ext4 with ordered journaling survives this test,
> > > while, as expected, ext4 without journal recovery behaves like ext2 and
> > > needs fsck to recover the file system after the simulated power loss.
> > > The surprising part of this test is f2fs. Since f2fs always maintains a
> > > stable checkpoint of the file system and, according to its FAST paper,
> > > always rolls back to that stable checkpoint after a power loss, I didn't
> > > expect to see f2fs in an inconsistent state after replaying the logs, as
> > > fsck.f2fs reports. (It's worth mentioning that we check the consistency of
> > > f2fs right after mkfs.f2fs; the ext4 results confirm this methodology.)
> > > 
> > > Unfortunately, the results are not reproducible: the inconsistency occurs
> > > at different log entries, and fsck.f2fs occasionally passes the test
> > > altogether. To give more accurate information, I uploaded the output of
> > > fsck.f2fs to Google Drive.
> > > 
> > > https://drive.google.com/open?id=0BxdqCs3G6wd3UWtDTmRGbFBiYmc
> > Hi,
> Honestly speaking, I didn't expect to encounter such a confusing situation
> when I decided to verify the resilience of f2fs after a power failure!!! :)
> 
> The main thing that baffles me is that I haven't seen consistent behavior
> between ext4 and f2fs.
> As I said before, ext4 passes every sysbench run in which I replay the
> log-writes entries one by one, each followed by fsck; it shows no
> inconsistency at all.
> Moreover, ext4 with the norecovery option fails, as we expect, in all tests
> and needs the file system fixed after the simulated power failure.
> On the contrary, f2fs shows peculiar behavior: it haphazardly passes or
> fails a test on different runs!
> > Could you please check:
> > - did you use a snapshot device?
> To show that I use dm-snap correctly in my scripts, I developed
> fsck_snap_f2fs_only.sh, which logs the CKPT versions of f2fs at different
> stages: before, during, and after the snapshot. You can see the results here:
> CKPT version output, passed test -
> https://drive.google.com/file/d/0BxdqCs3G6wd3aTNPS1pfRWlIWk0/view?usp=sharing
> fsck output, passed test -
> https://drive.google.com/file/d/0BxdqCs3G6wd3Nm5DSk9DX0tLUDg/view?usp=sharing
> 
> > - what command was issued at #1687?
> An important point is that the failures don't occur at fixed positions;
> consequently, they aren't reproducible. As for the command issued at #1687, I
> don't know exactly, since I call the sysbench program from my bash script to
> run a write-only database benchmark while capturing the disk logs via
> log-writes; sysbench, in turn, calls a Lua script to accomplish this task.
> 
> > - how's the result of fsck.f2fs -d 3?
> I ran another test (with FSCK_SCRIPT=./fsck_script/fsck_snap.sh in the config)
> to capture the inconsistent condition. The outputs are available here:
> fsck outputs, failed test -
> https://drive.google.com/file/d/0BxdqCs3G6wd3cy04TXd6QTBsbzA/view?usp=sharing
> fsck -d3 output, failed test -
> https://drive.google.com/file/d/0BxdqCs3G6wd3MXVzUHBGZEhlSFk/view?usp=sharing
> 
> > - can you share your log-dev image?
> After you asked me to share my log-dev, I was intrigued to replay the log-dev
> that had shown the inconsistency, but, surprisingly, fsck.f2fs no longer
> complains at that point, and it again behaves unpredictably!!! What I mean is
> that, while replaying the very log-dev in which fsck.f2fs had reported an
> inconsistency, fsck_snap.sh passed one time and failed another time, at a
> different log number!!! A couple of theories come to mind:
> 1) A bug in log-writes causes this behavior.
> 2) The virtualized block device in VMware causes this behavior - because it's
> not an SSD.
> 3) Something is wrong with fsck.f2fs.
> 
> Another important thing is the continuous errors in the kernel log while
> replaying and checking the consistency of the file system:
> - Buffer I/O error on dm-2, logical block X, async page read (replay-base;
> snapshot origin device)
> - Buffer I/O error on dm-3, logical block Y, async page read (replay-cow;
> COW-based snapshot device)
> - ...
> - buffer_io_error: Z calls suppressed
> However, these kernel log errors are generated in all conditions, f2fs
> {success, fail} and ext4.
> 
> *** IMPORTANT ***
> The most interesting part of my tests happened when I added
> fsck_snap_f2fs_only.sh to check that dm-snapshot is used correctly in my
> scripts. As I said, it only gets the CKPT by calling dump.f2fs, grepping for
> CKPT, and logging it - nothing more; yet the results are absolutely
> surprising: 15/15 tests passed. I don't know why, because nothing changed in
> the tests' logic. The main difference is that the tests take longer to
> finish, since I call more programs to grep the CKPT.
> 
> I put the code of my tests on GitHub, so you can run it and get the results:
> http://github.com/raoufro/CrashConsistencyTest.git
> 
> What causes this weird behavior of f2fs in these tests?

Taking a look at your script, I think you need to wipe the cow-dev; otherwise
it serves stale data.
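Something along these lines before recreating the snapshot should do (a
sketch; the ${COWDEV} variable, the persistent flag, and the chunk size are
assumptions, not values from your script):

# zero the start of the cow device so no stale exception store survives
dd if=/dev/zero of=${COWDEV} bs=1M count=4 oflag=direct
# then rebuild the snapshot on the clean cow device
dmsetup create replay-snap --table "0 ${BLKSIZE} snapshot ${REPLAYDEV} ${COWDEV} P 8"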

Thanks,

> 
> Regards,
> 
> > 
> > Thanks,
> > 
> > > Regards,


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: f2fs Crash Consistency Problem
       [not found]               ` <20170526022213.GA54408@jaegeuk-macbookpro.roam.corp.google.com>
@ 2017-06-01  4:44                 ` Raouf Rokhjavan
  0 siblings, 0 replies; 14+ messages in thread
From: Raouf Rokhjavan @ 2017-06-01  4:44 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel

Hi,

On 05/26/17 06:52, Jaegeuk Kim wrote:
> On 05/25, Jaegeuk Kim wrote:
>> On 05/25, Raouf Rokhjavan wrote:
>>> Hi
>>>
>>> First of all, I'm really sorry for my absence and for replying so late.
>>>
>>> On 05/17/17 22:31, Jaegeuk Kim wrote:
>>>
>>>> Hi,
>>> Honestly speaking, I didn't expect to encounter such a confusing situation
>>> when I decided to verify the resilience of f2fs after a power failure!!! :)
>>>
>>> The main thing that baffles me is that I haven't seen consistent behavior
>>> between ext4 and f2fs.
>>> As I said before, ext4 passes every sysbench run in which I replay the
>>> log-writes entries one by one, each followed by fsck; it shows no
>>> inconsistency at all.
>>> Moreover, ext4 with the norecovery option fails, as we expect, in all tests
>>> and needs the file system fixed after the simulated power failure.
>>> On the contrary, f2fs shows peculiar behavior: it haphazardly passes or
>>> fails a test on different runs!
>>>> Could you please check:
>>>> - did you use a snapshot device?
>>> To show that I use dm-snap correctly in my scripts, I developed
>>> fsck_snap_f2fs_only.sh, which logs the CKPT versions of f2fs at different
>>> stages: before, during, and after the snapshot. You can see the results here:
>>> CKPT version output, passed test -
>>> https://drive.google.com/file/d/0BxdqCs3G6wd3aTNPS1pfRWlIWk0/view?usp=sharing
>>> fsck output, passed test -
>>> https://drive.google.com/file/d/0BxdqCs3G6wd3Nm5DSk9DX0tLUDg/view?usp=sharing
>>>
>>>> - what command was issued at #1687?
>>> An important point is that the failures don't occur at fixed positions;
>>> consequently, they aren't reproducible. As for the command issued at #1687,
>>> I don't know exactly, since I call the sysbench program from my bash script
>>> to run a write-only database benchmark while capturing the disk logs via
>>> log-writes; sysbench, in turn, calls a Lua script to accomplish this task.
>>>
>>>> - how's the result of fsck.f2fs -d 3?
>>> I ran another test (with FSCK_SCRIPT=./fsck_script/fsck_snap.sh in the
>>> config) to capture the inconsistent condition. The outputs are available here:
>>> fsck outputs, failed test -
>>> https://drive.google.com/file/d/0BxdqCs3G6wd3cy04TXd6QTBsbzA/view?usp=sharing
>>> fsck -d3 output, failed test -
>>> https://drive.google.com/file/d/0BxdqCs3G6wd3MXVzUHBGZEhlSFk/view?usp=sharing
>>>
>>>> - can you share your log-dev image?
>>> After you asked me to share my log-dev, I was intrigued to replay the
>>> log-dev that had shown the inconsistency, but, surprisingly, fsck.f2fs no
>>> longer complains at that point, and it again behaves unpredictably!!! What
>>> I mean is that, while replaying the very log-dev in which fsck.f2fs had
>>> reported an inconsistency, fsck_snap.sh passed one time and failed another
>>> time, at a different log number!!! A couple of theories come to mind:
>>> 1) A bug in log-writes causes this behavior.
>>> 2) The virtualized block device in VMware causes this behavior - because
>>> it's not an SSD.
>>> 3) Something is wrong with fsck.f2fs.
>>>
>>> Another important thing is the continuous errors in the kernel log while
>>> replaying and checking the consistency of the file system:
>>> - Buffer I/O error on dm-2, logical block X, async page read (replay-base;
>>> snapshot origin device)
>>> - Buffer I/O error on dm-3, logical block Y, async page read (replay-cow;
>>> COW-based snapshot device)
>>> - ...
>>> - buffer_io_error: Z calls suppressed
>>> However, these kernel log errors are generated in all conditions,
>>> f2fs {success, fail} and ext4.
>>>
>>> *** IMPORTANT ***
>>> The most interesting part of my tests happened when I added
>>> fsck_snap_f2fs_only.sh to check that dm-snapshot is used correctly in my
>>> scripts. As I said, it only gets the CKPT by calling dump.f2fs, grepping for
>>> CKPT, and logging it - nothing more; yet the results are absolutely
>>> surprising: 15/15 tests passed. I don't know why, because nothing changed in
>>> the tests' logic. The main difference is that the tests take longer to
>>> finish, since I call more programs to grep the CKPT.
>>>
>>> I put the code of my tests on GitHub, so you can run it and get the results:
>>> http://github.com/raoufro/CrashConsistencyTest.git
>>>
>>> What causes this weird behavior of f2fs in these tests?
>> Taking a look at your script, I think you need to wipe the cow-dev; otherwise
>> it serves stale data.
> I've found another potential problem which can cause random corruption of the
> root inode. Could you test the attached patch for mkfs.f2fs?
Following your instructions, I applied a couple of changes to my scripts:
1. delete and recreate the cow-dev after each log replay
2. apply 0001-mkfs.f2fs-avoid-wrong-discard-of-dnode.patch to mkfs.f2fs
3. apply "f2fs: remove false-positive bug_on" to the kernel, which seems
related to the bug_on that causes the inconsistency report from fsck.f2fs

Despite these changes, I ran into the same fsck.f2fs error in my tests.

Have you run my tests? Could you reproduce the same errors I reported?

Thanks,
R.Rokhjavan

> Thanks
>
>> Thanks,
>>
>>> Regards,
>>>
>>>> Thanks,
>>>>



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-06-01  4:44 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-13 15:28 f2fs Crash Consistency Problem Raouf Rokhjavan
2017-04-13 21:19 ` Jaegeuk Kim
     [not found]   ` <60e7c703-13f1-0f7e-24cc-2c5fae3bc958@gmail.com>
     [not found]     ` <20170414184520.GA6827@jaegeuk.local>
2017-04-15  4:33       ` Raouf Rokhjavan
     [not found]     ` <20170414235834.GA8933@jaegeuk.local>
2017-04-15  5:13       ` Raouf Rokhjavan
2017-04-17 22:34         ` Jaegeuk Kim
2017-05-10 17:51   ` Raouf Rokhjavan
2017-05-12  0:14     ` Jaegeuk Kim
2017-05-12 18:30       ` Raouf Rokhjavan
2017-05-15 17:46         ` Jaegeuk Kim
2017-05-17 17:43       ` Raouf Rokhjavan
2017-05-17 18:01         ` Jaegeuk Kim
2017-05-24 19:58           ` Raouf Rokhjavan
2017-05-25 23:44             ` Jaegeuk Kim
     [not found]               ` <20170526022213.GA54408@jaegeuk-macbookpro.roam.corp.google.com>
2017-06-01  4:44                 ` Raouf Rokhjavan
