From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail-yw0-f182.google.com ([209.85.161.182]:51494 "EHLO
	mail-yw0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S934823AbdIYKxY (ORCPT ); Mon, 25 Sep 2017 06:53:24 -0400
MIME-Version: 1.0
In-Reply-To: <59C8D147.1060608@cn.fujitsu.com>
References: <1503830683-21455-1-git-send-email-amir73il@gmail.com>
	<59C8D147.1060608@cn.fujitsu.com>
From: Amir Goldstein
Date: Mon, 25 Sep 2017 13:53:21 +0300
Message-ID:
Subject: Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
Content-Type: text/plain; charset="UTF-8"
Sender: fstests-owner@vger.kernel.org
To: Xiao Yang
Cc: Theodore Ts'o, Eryu Guan, Josef Bacik, fstests, Ext4
List-ID:

On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang wrote:
> On 2017/08/27 18:44, Amir Goldstein wrote:
>> This test is motivated by a bug found in ext4 during random crash
>> consistency tests.
>>
>> This test uses device mapper flakey target to demonstrate the bug
>> found using device mapper log-writes target.
>>
>> Signed-off-by: Amir Goldstein
>> ---
>>
>> Ted,
>>
>> While working on crash consistency xfstests [1], I stumbled on what
>> appeared to be an ext4 crash consistency bug.
>>
>> The tests I used rely on the log-writes dm target code written
>> by Josef Bacik, which had little exposure to the wide community
>> as far as I know. I wanted to prove to myself that the found
>> inconsistency was not due to a test bug, so I bisected the failed
>> test to the minimal operations that trigger the failure and wrote
>> a small independent test to reproduce the issue using dm flakey target.
>>
>> The following fsck error is reliably reproduced by replaying some fsx ops
>> on overlapping file regions, then emulating a crash, followed by mount,
>> umount and fsck -nf:
>>
>> ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>> 1 write 0x137dd thru 0x21445 (0xdc69 bytes)
>> 2 falloc from 0xb531 to 0x16ade (0xb5ad bytes)
>> 3 collapse from 0x1c000 to 0x20000, (0x4000 bytes)
>> 4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes)
>> 5 zero from 0x20fac to 0x27d48, (0x6d9c bytes)
>> 6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes)
>> All 7 operations completed A-OK!
>> _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>> *** fsck.ext4 output ***
>> fsck from util-linux 2.27.1
>> e2fsck 1.42.13 (17-May-2015)
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 12, end of extent exceeds allowed value
>>         (logical block 33, physical block 33441, len 7)
>> Clear? no
>> Inode 12, i_blocks is 184, should be 128. Fix? no
> Hi Amir,
>
> I always get the following output when running your xfstests test case 501.

Now merged as test generic/456

> ---------------------------------------------------------------------------
> e2fsck 1.42.9 (28-Dec-2013)
> Pass 1: Checking inodes, blocks, and sizes
> Inode 12, i_size is 147456, should be 163840. Fix? no
> ---------------------------------------------------------------------------
>
> Could you tell me how to get the expected output as you reported?

I can't say I am doing anything special, but I can say that I get the
same output as you did when running the test inside kvm-xfstests.

Actually, I could not reproduce ANY of the crash consistency bugs
inside kvm-xfstests. Must be something to do with different timing of
IO with KVM+virtio disks??

When running on my laptop (Ubuntu 16.04 with latest kernel) on a 10G
SSD volume, I always get the error reported above.

I just re-verified with latest stable e2fsprogs (1.43.6).

Amir.
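
As a sanity check on the numbers quoted in this thread, here is a small
Python sketch that parses the six fsx ops from the report and converts the
two fsck reports to block units. It is only an illustration: the 4 KiB
block size is an assumption (the thread does not state it), i_blocks is
taken to be counted in 512-byte units, and the helper names (parse_ops,
extent_end_bytes, sectors_to_blocks) are made up for this example.

```python
import re

# fsx operations exactly as quoted in the report.
FSX_OPS = """\
1 write 0x137dd thru 0x21445 (0xdc69 bytes)
2 falloc from 0xb531 to 0x16ade (0xb5ad bytes)
3 collapse from 0x1c000 to 0x20000, (0x4000 bytes)
4 write 0x3e5ec thru 0x3ffff (0x1a14 bytes)
5 zero from 0x20fac to 0x27d48, (0x6d9c bytes)
6 mapwrite 0x216ad thru 0x23dfb (0x274f bytes)
"""

OP_RE = re.compile(
    r"^\d+ (?P<op>\w+) (?:from )?(?P<start>0x[0-9a-f]+) "
    r"(?P<kind>thru|to) (?P<end>0x[0-9a-f]+),? "
    r"\((?P<length>0x[0-9a-f]+) bytes\)$"
)

BLOCK_SIZE = 4096   # ASSUMPTION: 4 KiB ext4 blocks, not stated in the thread
SECTOR_SIZE = 512   # ASSUMPTION: i_blocks counted in 512-byte units


def parse_ops(text):
    """Return (op, start, end_exclusive, length) for each fsx log line."""
    ops = []
    for line in text.splitlines():
        m = OP_RE.match(line.strip())
        if m is None:
            raise ValueError(f"unrecognised fsx line: {line!r}")
        start = int(m["start"], 16)
        end = int(m["end"], 16)
        # "thru" names the last byte touched; "to" names the first byte
        # past the range, so only "thru" needs the +1.
        end_exclusive = end + 1 if m["kind"] == "thru" else end
        ops.append((m["op"], start, end_exclusive, int(m["length"], 16)))
    return ops


def extent_end_bytes(logical_block, length_blocks):
    """Byte offset just past an extent, under the assumed block size."""
    return (logical_block + length_blocks) * BLOCK_SIZE


def sectors_to_blocks(i_blocks):
    """Convert an i_blocks count to filesystem blocks (assumed sizes)."""
    return i_blocks * SECTOR_SIZE // BLOCK_SIZE


if __name__ == "__main__":
    for op, start, end, length in parse_ops(FSX_OPS):
        status = "ok" if end - start == length else "MISMATCH"
        print(f"{op:9s} [{start:#x}, {end:#x}) len {length:#x} {status}")
    print("extent (33, len 7) ends at byte", extent_end_bytes(33, 7))
    print("i_size values in blocks:", 147456 // BLOCK_SIZE, 163840 // BLOCK_SIZE)
    print("i_blocks values in blocks:", sectors_to_blocks(184), sectors_to_blocks(128))
```

Under those assumptions, the extent in the first report (logical block 33,
len 7) ends at byte 163840, which is the same value the second report
gives as the expected i_size. That is only an arithmetic observation and
does not by itself say the two outputs have the same cause.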