[RFC][PATCH] fstest: regression test for ext4 crash consistency bug

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
@ 2017-08-31  1:28 Ashlie Martinez
  2017-08-31  4:05 ` Amir Goldstein
  0 siblings, 1 reply; 16+ messages in thread
From: Ashlie Martinez @ 2017-08-31  1:28 UTC (permalink / raw)
  To: amir73il; +Cc: tytso, eguan, jbacik, vvijay03, fstests, linux-ext4

Amir,

I have been working on CrashMonkey more and I have jerry-rigged together 
a test in CrashMonkey that calls into `fsx` with the minimal test case 
you made. I am able to reproduce the ext4 error that you found along 
with a few other potential errors.

A quick point, I run fsck with `-yf` instead of `-nf` that xfstests runs 
with. The reason for this is that CrashMonkey would like to report on 
fixable and unfixable errors in the future.

Running the ported test case, I find that CrashMonkey encounters the 
following errors:
1. Incorrect inode size and incorrect free data block and inode counts 
(fixable)
2. incorrect free data block and inode counts (fixable)
3. `Superblock needs_recovery flag is clear, but journal has data` 
notice along with errors present in case 1
4. `Superblock needs_recovery flag is clear, but journal has data` 
notice with no other errors

For the incorrect i_size errors, I get the output `Inode 12, i_size is 
147456, should be 163840.` which I can also reproduce with your 501 
xfstests test case.

When free data blocks and inode errors occur, the message is `Free 
blocks count wrong (8795, counted=8714).` and `Free inodes count wrong 
(2549, counted=2546).`

I have not had a chance to look into the above errors to find their root 
causes.

In total, CrashMonkey ran 1000 different tests. Of those, 344 passed 
without fsck complaining. The remaining 656 tests saw fsck complain 
about something. All of these tests consisted of unique sequences of 
bios, but may contain equivalent crash states.

The larger range of test results is due to the fact that CrashMonkey 
runs many tests from just the single workload you made. These tests 
consist of replaying some number of bio write operations, so it tests 
states different than you 500 xfstest which I believe only replays to 
sync operations (i.e. it never stops replay before a recorded fsync).

If you're interested, you can find the CrashMonkey code (and branch) at 
https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug. If you 
would like to run it, you should clone and build you xfstest in your 
home directory so that the jerry-rigged CrashMonkey test case can find 
it. Directions for running this test case in CrashMonkey should be at 
the top of the README.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-08-31  1:28 [RFC][PATCH] fstest: regression test for ext4 crash consistency bug Ashlie Martinez
@ 2017-08-31  4:05 ` Amir Goldstein
  2017-08-31  4:06   ` Amir Goldstein
  0 siblings, 1 reply; 16+ messages in thread
From: Amir Goldstein @ 2017-08-31  4:05 UTC (permalink / raw)
  To: Ashlie Martinez
  Cc: tytso, Eryu Guan, Josef Bacik, Vijay Chidambaram, fstests, Ext4

On Thu, Aug 31, 2017 at 4:28 AM, Ashlie Martinez <ashmrtn@utexas.edu> wrote:
> Amir,
>
> I have been working on CrashMonkey more and I have jerry-rigged together a
> test in CrashMonkey that calls into `fsx` with the minimal test case you
> made. I am able to reproduce the ext4 error that you found along with a few
> other potential errors.
>
> A quick point, I run fsck with `-yf` instead of `-nf` that xfstests runs
> with. The reason for this is that CrashMonkey would like to report on
> fixable and unfixable errors in the future.
>

That makes sense, but keep in mind that 'fixable' error may still loose data
when fixing them with -y. Perhaps you should consider running fsck is auto
fixing mode (i.e. e2fsck -p) when available, to classify errors as
'safely fixable'
I believe the error these test encountered are 'safely fixable', but
didn't check.

> Running the ported test case, I find that CrashMonkey encounters the
> following errors:
> 1. Incorrect inode size and incorrect free data block and inode counts
> (fixable)
> 2. incorrect free data block and inode counts (fixable)
> 3. `Superblock needs_recovery flag is clear, but journal has data` notice
> along with errors present in case 1
> 4. `Superblock needs_recovery flag is clear, but journal has data` notice
> with no other errors
>
> For the incorrect i_size errors, I get the output `Inode 12, i_size is
> 147456, should be 163840.` which I can also reproduce with your 501 xfstests
> test case.
>
> When free data blocks and inode errors occur, the message is `Free blocks
> count wrong (8795, counted=8714).` and `Free inodes count wrong (2549,
> counted=2546).`
>
> I have not had a chance to look into the above errors to find their root
> causes.
>

I believe this is what you get when you fsck -yf before trying to mount when
the orphan list is not empty. You should avoid doing that.

See what the greatest ext4 crash test experiment of them all is doing
and read the comment to understand why:
https://android.googlesource.com/platform/system/core/+/marshmallow-mr1-dev/fs_mgr/fs_mgr.c#96
1. mount -o  errors=remount-ro; umount
2. e2fsck -y

So upstream Android never runs e2fsck -f. It will only check fs if kernel marked
that fs has errors.
Although Cyanogenmod did add -f and I imagine that many vendors do as well.

As one who hacked and crashed a lot of Android devices, I can attest that I have
observed both data loss and corrupted (non booting) fs, but the rest
of the 2 billion
crash test monkeys don't seem to be bothered ;-)

> In total, CrashMonkey ran 1000 different tests. Of those, 344 passed without
> fsck complaining. The remaining 656 tests saw fsck complain about something.
> All of these tests consisted of unique sequences of bios, but may contain
> equivalent crash states.
>
> The larger range of test results is due to the fact that CrashMonkey runs
> many tests from just the single workload you made. These tests consist of
> replaying some number of bio write operations, so it tests states different
> than you 500 xfstest which I believe only replays to sync operations (i.e.
> it never stops replay before a recorded fsync).

That is correct. test 500 (temporary name) is mostly focused on checking
data consistency of files after fsync. detecting metadata consistency errors
is a by product. I do intend to add more tests focused on metadata consistency.
Josef already wrote an fsstress script that should be converted to an xfstest
which replays the log to every FUA and fsck.

>
> If you're interested, you can find the CrashMonkey code (and branch) at
> https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug. If you
> would like to run it, you should clone and build you xfstest in your home
> directory so that the jerry-rigged CrashMonkey test case can find it.
> Directions for running this test case in CrashMonkey should be at the top of
> the README.

You seem to have misspelled 'fsx' in README and in the code as 'xfs'.
Funny, I always mistype it as 'sfx' :)

Cheers,
Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-08-31  4:05 ` Amir Goldstein
@ 2017-08-31  4:06   ` Amir Goldstein
  2017-09-01 12:21     ` Ashlie Martinez
  0 siblings, 1 reply; 16+ messages in thread
From: Amir Goldstein @ 2017-08-31  4:06 UTC (permalink / raw)
  To: Ashlie Martinez
  Cc: Eryu Guan, Josef Bacik, Vijay Chidambaram, fstests, Ext4, Theodore Tso

[now really CC Ted]

On Thu, Aug 31, 2017 at 7:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Aug 31, 2017 at 4:28 AM, Ashlie Martinez <ashmrtn@utexas.edu> wrote:
>> Amir,
>>
>> I have been working on CrashMonkey more and I have jerry-rigged together a
>> test in CrashMonkey that calls into `fsx` with the minimal test case you
>> made. I am able to reproduce the ext4 error that you found along with a few
>> other potential errors.
>>
>> A quick point, I run fsck with `-yf` instead of `-nf` that xfstests runs
>> with. The reason for this is that CrashMonkey would like to report on
>> fixable and unfixable errors in the future.
>>
>
> That makes sense, but keep in mind that 'fixable' error may still loose data
> when fixing them with -y. Perhaps you should consider running fsck is auto
> fixing mode (i.e. e2fsck -p) when available, to classify errors as
> 'safely fixable'
> I believe the error these test encountered are 'safely fixable', but
> didn't check.
>
>> Running the ported test case, I find that CrashMonkey encounters the
>> following errors:
>> 1. Incorrect inode size and incorrect free data block and inode counts
>> (fixable)
>> 2. incorrect free data block and inode counts (fixable)
>> 3. `Superblock needs_recovery flag is clear, but journal has data` notice
>> along with errors present in case 1
>> 4. `Superblock needs_recovery flag is clear, but journal has data` notice
>> with no other errors
>>
>> For the incorrect i_size errors, I get the output `Inode 12, i_size is
>> 147456, should be 163840.` which I can also reproduce with your 501 xfstests
>> test case.
>>
>> When free data blocks and inode errors occur, the message is `Free blocks
>> count wrong (8795, counted=8714).` and `Free inodes count wrong (2549,
>> counted=2546).`
>>
>> I have not had a chance to look into the above errors to find their root
>> causes.
>>
>
> I believe this is what you get when you fsck -yf before trying to mount when
> the orphan list is not empty. You should avoid doing that.
>
> See what the greatest ext4 crash test experiment of them all is doing
> and read the comment to understand why:
> https://android.googlesource.com/platform/system/core/+/marshmallow-mr1-dev/fs_mgr/fs_mgr.c#96
> 1. mount -o  errors=remount-ro; umount
> 2. e2fsck -y
>
> So upstream Android never runs e2fsck -f. It will only check fs if kernel marked
> that fs has errors.
> Although Cyanogenmod did add -f and I imagine that many vendors do as well.
>
> As one who hacked and crashed a lot of Android devices, I can attest that I have
> observed both data loss and corrupted (non booting) fs, but the rest
> of the 2 billion
> crash test monkeys don't seem to be bothered ;-)
>
>> In total, CrashMonkey ran 1000 different tests. Of those, 344 passed without
>> fsck complaining. The remaining 656 tests saw fsck complain about something.
>> All of these tests consisted of unique sequences of bios, but may contain
>> equivalent crash states.
>>
>> The larger range of test results is due to the fact that CrashMonkey runs
>> many tests from just the single workload you made. These tests consist of
>> replaying some number of bio write operations, so it tests states different
>> than you 500 xfstest which I believe only replays to sync operations (i.e.
>> it never stops replay before a recorded fsync).
>
> That is correct. test 500 (temporary name) is mostly focused on checking
> data consistency of files after fsync. detecting metadata consistency errors
> is a by product. I do intend to add more tests focused on metadata consistency.
> Josef already wrote an fsstress script that should be converted to an xfstest
> which replays the log to every FUA and fsck.
>
>>
>> If you're interested, you can find the CrashMonkey code (and branch) at
>> https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug. If you
>> would like to run it, you should clone and build you xfstest in your home
>> directory so that the jerry-rigged CrashMonkey test case can find it.
>> Directions for running this test case in CrashMonkey should be at the top of
>> the README.
>
> You seem to have misspelled 'fsx' in README and in the code as 'xfs'.
> Funny, I always mistype it as 'sfx' :)
>
> Cheers,
> Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-08-31  4:06   ` Amir Goldstein
@ 2017-09-01 12:21     ` Ashlie Martinez
  2017-09-01 14:59       ` Amir Goldstein
  0 siblings, 1 reply; 16+ messages in thread
From: Ashlie Martinez @ 2017-09-01 12:21 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Eryu Guan, Josef Bacik, Vijay Chidambaram, fstests, Ext4, Theodore Tso

Apologies for spam, resending plain text so it appears in the archives.


On Wed, Aug 30, 2017 at 11:06 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> [now really CC Ted]
>
> On Thu, Aug 31, 2017 at 7:05 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Thu, Aug 31, 2017 at 4:28 AM, Ashlie Martinez <ashmrtn@utexas.edu> wrote:
>>> Amir,
>>>
>>> I have been working on CrashMonkey more and I have jerry-rigged together a
>>> test in CrashMonkey that calls into `fsx` with the minimal test case you
>>> made. I am able to reproduce the ext4 error that you found along with a few
>>> other potential errors.
>>>
>>> A quick point, I run fsck with `-yf` instead of `-nf` that xfstests runs
>>> with. The reason for this is that CrashMonkey would like to report on
>>> fixable and unfixable errors in the future.
>>>
>>
>> That makes sense, but keep in mind that 'fixable' error may still loose data
>> when fixing them with -y. Perhaps you should consider running fsck is auto
>> fixing mode (i.e. e2fsck -p) when available, to classify errors as
>> 'safely fixable'
>> I believe the error these test encountered are 'safely fixable', but
>> didn't check.
>>
>>> Running the ported test case, I find that CrashMonkey encounters the
>>> following errors:
>>> 1. Incorrect inode size and incorrect free data block and inode counts
>>> (fixable)
>>> 2. incorrect free data block and inode counts (fixable)
>>> 3. `Superblock needs_recovery flag is clear, but journal has data` notice
>>> along with errors present in case 1
>>> 4. `Superblock needs_recovery flag is clear, but journal has data` notice
>>> with no other errors
>>>
>>> For the incorrect i_size errors, I get the output `Inode 12, i_size is
>>> 147456, should be 163840.` which I can also reproduce with your 501 xfstests
>>> test case.
>>>
>>> When free data blocks and inode errors occur, the message is `Free blocks
>>> count wrong (8795, counted=8714).` and `Free inodes count wrong (2549,
>>> counted=2546).`
>>>
>>> I have not had a chance to look into the above errors to find their root
>>> causes.
>>>
>>
>> I believe this is what you get when you fsck -yf before trying to mount when
>> the orphan list is not empty. You should avoid doing that.

Do you know what the kernel does when a file system like that is mounted? The
link you posted below has a comment about the kernel checking the file system
faster than fsck can. Do you know how it accomplishes this? Should the
mount -o errors=remount-ro; unmount sequence be done for all file systems or
just a select few for which the kernel knows how to handle? Does following the
mount/unmount, check sequence just mean that the kernel will silently (from the
perspective of normal users) fix some issues with the file system much like
fsck does when it is run (though perhaps less thoroughly than fsck)?

>>
>> See what the greatest ext4 crash test experiment of them all is doing
>> and read the comment to understand why:
>> https://android.googlesource.com/platform/system/core/+/marshmallow-mr1-dev/fs_mgr/fs_mgr.c#96
>> 1. mount -o  errors=remount-ro; umount
>> 2. e2fsck -y
>>
>> So upstream Android never runs e2fsck -f. It will only check fs if kernel marked
>> that fs has errors.
>> Although Cyanogenmod did add -f and I imagine that many vendors do as well.
>>
>> As one who hacked and crashed a lot of Android devices, I can attest that I have
>> observed both data loss and corrupted (non booting) fs, but the rest
>> of the 2 billion
>> crash test monkeys don't seem to be bothered ;-)
>>
>>> In total, CrashMonkey ran 1000 different tests. Of those, 344 passed without
>>> fsck complaining. The remaining 656 tests saw fsck complain about something.
>>> All of these tests consisted of unique sequences of bios, but may contain
>>> equivalent crash states.
>>>
>>> The larger range of test results is due to the fact that CrashMonkey runs
>>> many tests from just the single workload you made. These tests consist of
>>> replaying some number of bio write operations, so it tests states different
>>> than you 500 xfstest which I believe only replays to sync operations (i.e.
>>> it never stops replay before a recorded fsync).
>>
>> That is correct. test 500 (temporary name) is mostly focused on checking
>> data consistency of files after fsync. detecting metadata consistency errors
>> is a by product. I do intend to add more tests focused on metadata consistency.
>> Josef already wrote an fsstress script that should be converted to an xfstest
>> which replays the log to every FUA and fsck.
>>
>>>
>>> If you're interested, you can find the CrashMonkey code (and branch) at
>>> https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug. If you
>>> would like to run it, you should clone and build you xfstest in your home
>>> directory so that the jerry-rigged CrashMonkey test case can find it.
>>> Directions for running this test case in CrashMonkey should be at the top of
>>> the README.
>>
>> You seem to have misspelled 'fsx' in README and in the code as 'xfs'.
>> Funny, I always mistype it as 'sfx' :)

:) thanks for pointing that out, it's hard enough to spell it properly
half the time since
the test suite it's in is xfstests. Thank you for all the other
feedback and links too.

>>
>> Cheers,
>> Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-09-01 12:21     ` Ashlie Martinez
@ 2017-09-01 14:59       ` Amir Goldstein
  0 siblings, 0 replies; 16+ messages in thread
From: Amir Goldstein @ 2017-09-01 14:59 UTC (permalink / raw)
  To: Ashlie Martinez
  Cc: Eryu Guan, Josef Bacik, Vijay Chidambaram, fstests, Ext4, Theodore Tso

On Fri, Sep 1, 2017 at 3:21 PM, Ashlie Martinez <ashmrtn@utexas.edu> wrote:
> Apologies for spam, resending plain text so it appears in the archives.
>
>
...
>>>> When free data blocks and inode errors occur, the message is `Free blocks
>>>> count wrong (8795, counted=8714).` and `Free inodes count wrong (2549,
>>>> counted=2546).`
>>>>
>>>> I have not had a chance to look into the above errors to find their root
>>>> causes.
>>>>
>>>
>>> I believe this is what you get when you fsck -yf before trying to mount when
>>> the orphan list is not empty. You should avoid doing that.
>
> Do you know what the kernel does when a file system like that is mounted? The
> link you posted below has a comment about the kernel checking the file system
> faster than fsck can. Do you know how it accomplishes this? Should the
> mount -o errors=remount-ro; unmount sequence be done for all file systems or
> just a select few for which the kernel knows how to handle? Does following the
> mount/unmount, check sequence just mean that the kernel will silently (from the
> perspective of normal users) fix some issues with the file system much like
> fsck does when it is run (though perhaps less thoroughly than fsck)?
>

mount/umount does several things.
among other things, its starts with replaying the journal.
with journalling file systems this is essential to the point that it is really
not safe to fsck before replaying the journal. So e2fsck -y will in fact
replay the journal, but it will do a much worse job (much slower) the
the kernel does when replaying the journal (too hard to explain why).
BTW e2fsck -n will not replay the journal so it may report phantom
errors or refuse to run?

HOWEVER, the comment I referred you to in Android code refers
to the efficiency of the kernel with cleaning up the "orphan inode list".
If you open N files, unlink them sync and crash the kernel will cleanup
the disk space used by those N files on next mount.
e2fsck will "zap" those files and then fix all the inode and block counter
and report they were wrong, like you reported above.

Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-10-05 19:10           ` Amir Goldstein
@ 2017-10-06  0:34             ` Ashlie Martinez
  0 siblings, 0 replies; 16+ messages in thread
From: Ashlie Martinez @ 2017-10-06  0:34 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Xiao Yang, Theodore Ts'o, Eryu Guan, Josef Bacik, fstests,
	Ext4, Vijay Chidambaram

On Thu, Oct 5, 2017 at 2:10 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Oct 5, 2017 at 6:04 PM, Ashlie Martinez <ashmrtn@utexas.edu> wrote:
>> On Thu, Oct 5, 2017 at 2:27 AM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote:
>>> On 2017/09/30 22:15, Ashlie Martinez wrote:
>>>>
>>>> Hi Xiao,
>>>>
>>>> I am a student at the University of Texas at Austin. Some researchers
>>>> in the computer science department at UT, myself included, have
>>>> recently been working to develop a file system crash consistency test
>>>> harness called CrashMonkey [1][2]. I have been working on the
>>>> CrashMonkey project since it was started late last year. With
>>>> CrashMonkey we have also been able to reproduce the incorrect i_size
>>>> error you noted but we have not been able to reproduce the other
>>>> output that Amir found. CrashMonkey works by logging and replaying
>>>> operations for a workload, so it should not be sensitive to
>>>> differences in timing that could be caused by things like KVM+virtio.
>>>> I also did a few experiments with Amir's new xfstests test 456 (both
>>>> with and without KVM and virtio) and I was unable to reproduce the
>>>> output noted in the xfstest. I have not spent a lot of time looking
>>>> into the cause of the bug that Amir found and it is rather unfortunate
>>>> that I was unable to reproduce it with either xfstests or CrashMonkey.
>>>
>>> Hi Ashlie,
>>>
>>> Thanks for your detailed comments.
>>>
>>> 1) Do you think the output that Amir noted in xfstests is a false positive?
>>
>> It almost seems that way, though to be honest, I don't think I know
>> enough about 1. the setup used by Amir and 2. all the internal working
>> of KVM+virtio to say for sure.
>
> I believe you misread my email.
> The problem was NOT reproduced on KVM+virtio.
> The problem *is* reproduced on a 10G LVM volume over SSD on
> Ubuntu 16.04 with latest kernel and latest e2fsprogs.

I have also tried running generic/456 on non-KVM, non-virtio machines
and I was unable to reproduce it. Without information on test setups
individuals are using, it is very hard to tell where I, or other
people, are going wrong in testing other than maybe using a virtual
machine.

As an aside, it does not appear to be simply a timing issue due to
KVM+virtio. I would hope that CrashMonkey would still be able to find
the extent error you saw, even on a KVM VM since it is not dependent
on timing.

>
> There is no use in speculating why the problem doesn't happen on your
> systems. If the issue reproduces on A system (2 systems actually that I tested)
> that it is a problem.
>
> Attached is an e2image dump of the inconsistent file system following test
> generic/456 on my system.  I would have attached the image sooner,
> but since on my system problem reproduces 100% on the time, I did not think
> that was a need. Without an ext4 developer looking into this image, I don't
> really see the point in further discussion.
>
> I would be interesting to get more test samples from people running the test
> on other systems. If only some people see the error
> "end of extent exceeds allowed value"
> it would be interesting to find the commonality between those setups.
>

I agree it would be interesting to see why this error appears only for
some people.


>> One thing I have identified as being a
>> potential source of false positives is code in the kernel to remap the
>> block number sent to device drives to something relative to the start
>> of a block device, not the start of a partition. I'm thinking this
>> could cause trouble if 1. a partition higher than the first partition
>> is monitored by dm-log-writes, 2. the block numbers are recorded
>> verbatim in dm-log-writes, and 3. the dm-log-writes log is replayed on
>> a different device with different partitions (or possibly the same
>> device, but a different partition).
>
> You do realize that the test generic/456 is not using dm-log-writes at all.
> I intentionally took it out of the equation and used the even dumber dm-flakey
> target to demonstrate the crash inconsistency.

Oh, yes, I recall that now. The past few weeks have been very busy for
me with all of my other school work, so getting all of my i's dotted
and t's crossed has been a challenge. At any rate, I don't think it is
a real minus to mention the potential remapping of blocks in the cited
function. If people are going to use any sort of block device module
(be dm-log-writes or otherwise) then I think it would be good for
developers to be aware of.

>
>> I know some other undergrad
>> students at UT on the CrashMonkey team are looking into this right
>> now, but I have not had time to test this myself. The offending code
>> in the kernel is in the completely misnamed
>> `generic_make_request_checks` function which has given the CrashMonkey
>> team trouble in the past. Links to that function and the remapping
>> function it calls are below.
>>
>> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L2041
>> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L1902
>>
>>>
>>> 2) About the output that both i and you reproduced,  did you look into it
>>> and find its root cause?
>>
>> I have not looked to find the root cause of the issue that we both
>> reliably reproduce.
>>
>
> Now you have a broken file system image and the exact set of operations
> that led to it. That's should be a pretty big lead for investigation.
>
> Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-10-05 15:04         ` Ashlie Martinez
@ 2017-10-05 19:10           ` Amir Goldstein
  2017-10-06  0:34             ` Ashlie Martinez
  0 siblings, 1 reply; 16+ messages in thread
From: Amir Goldstein @ 2017-10-05 19:10 UTC (permalink / raw)
  To: Ashlie Martinez
  Cc: Xiao Yang, Theodore Ts'o, Eryu Guan, Josef Bacik, fstests,
	Ext4, Vijay Chidambaram

[-- Attachment #1: Type: text/plain, Size: 4228 bytes --]

On Thu, Oct 5, 2017 at 6:04 PM, Ashlie Martinez <ashmrtn@utexas.edu> wrote:
> On Thu, Oct 5, 2017 at 2:27 AM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote:
>> On 2017/09/30 22:15, Ashlie Martinez wrote:
>>>
>>> Hi Xiao,
>>>
>>> I am a student at the University of Texas at Austin. Some researchers
>>> in the computer science department at UT, myself included, have
>>> recently been working to develop a file system crash consistency test
>>> harness called CrashMonkey [1][2]. I have been working on the
>>> CrashMonkey project since it was started late last year. With
>>> CrashMonkey we have also been able to reproduce the incorrect i_size
>>> error you noted but we have not been able to reproduce the other
>>> output that Amir found. CrashMonkey works by logging and replaying
>>> operations for a workload, so it should not be sensitive to
>>> differences in timing that could be caused by things like KVM+virtio.
>>> I also did a few experiments with Amir's new xfstests test 456 (both
>>> with and without KVM and virtio) and I was unable to reproduce the
>>> output noted in the xfstest. I have not spent a lot of time looking
>>> into the cause of the bug that Amir found and it is rather unfortunate
>>> that I was unable to reproduce it with either xfstests or CrashMonkey.
>>
>> Hi Ashlie,
>>
>> Thanks for your detailed comments.
>>
>> 1) Do you think the output that Amir noted in xfstests is a false positive?
>
> It almost seems that way, though to be honest, I don't think I know
> enough about 1. the setup used by Amir and 2. all the internal working
> of KVM+virtio to say for sure.

I believe you misread my email.
The problem was NOT reproduced on KVM+virtio.
The problem *is* reproduced on a 10G LVM volume over SSD on
Ubuntu 16.04 with latest kernel and latest e2fsprogs.

There is no use in speculating why the problem doesn't happen on your
systems. If the issue reproduces on A system (2 systems actually that I tested)
that it is a problem.

Attached is an e2image dump of the inconsistent file system following test
generic/456 on my system.  I would have attached the image sooner,
but since on my system problem reproduces 100% on the time, I did not think
that was a need. Without an ext4 developer looking into this image, I don't
really see the point in further discussion.

I would be interesting to get more test samples from people running the test
on other systems. If only some people see the error
"end of extent exceeds allowed value"
it would be interesting to find the commonality between those setups.

> One thing I have identified as being a
> potential source of false positives is code in the kernel to remap the
> block number sent to device drives to something relative to the start
> of a block device, not the start of a partition. I'm thinking this
> could cause trouble if 1. a partition higher than the first partition
> is monitored by dm-log-writes, 2. the block numbers are recorded
> verbatim in dm-log-writes, and 3. the dm-log-writes log is replayed on
> a different device with different partitions (or possibly the same
> device, but a different partition).

You do realize that the test generic/456 is not using dm-log-writes at all.
I intentionally took it out of the equation and used the even dumber dm-flakey
target to demonstrate the crash inconsistency.

> I know some other undergrad
> students at UT on the CrashMonkey team are looking into this right
> now, but I have not had time to test this myself. The offending code
> in the kernel is in the completely misnamed
> `generic_make_request_checks` function which has given the CrashMonkey
> team trouble in the past. Links to that function and the remapping
> function it calls are below.
>
> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L2041
> http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L1902
>
>>
>> 2) About the output that both i and you reproduced,  did you look into it
>> and find its root cause?
>
> I have not looked to find the root cause of the issue that we both
> reliably reproduce.
>

Now you have a broken file system image and the exact set of operations
that led to it. That's should be a pretty big lead for investigation.

Amir.

[-- Attachment #2: ext4-crash.qcow2.bz2 --]
[-- Type: application/x-bzip2, Size: 12094 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-10-05  7:27       ` Xiao Yang
@ 2017-10-05 15:04         ` Ashlie Martinez
  2017-10-05 19:10           ` Amir Goldstein
  0 siblings, 1 reply; 16+ messages in thread
From: Ashlie Martinez @ 2017-10-05 15:04 UTC (permalink / raw)
  To: Xiao Yang
  Cc: Amir Goldstein, Theodore Ts'o, Eryu Guan, Josef Bacik,
	fstests, Ext4, Vijay Chidambaram

On Thu, Oct 5, 2017 at 2:27 AM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote:
> On 2017/09/30 22:15, Ashlie Martinez wrote:
>>
>> Hi Xiao,
>>
>> I am a student at the University of Texas at Austin. Some researchers
>> in the computer science department at UT, myself included, have
>> recently been working to develop a file system crash consistency test
>> harness called CrashMonkey [1][2]. I have been working on the
>> CrashMonkey project since it was started late last year. With
>> CrashMonkey we have also been able to reproduce the incorrect i_size
>> error you noted but we have not been able to reproduce the other
>> output that Amir found. CrashMonkey works by logging and replaying
>> operations for a workload, so it should not be sensitive to
>> differences in timing that could be caused by things like KVM+virtio.
>> I also did a few experiments with Amir's new xfstests test 456 (both
>> with and without KVM and virtio) and I was unable to reproduce the
>> output noted in the xfstest. I have not spent a lot of time looking
>> into the cause of the bug that Amir found and it is rather unfortunate
>> that I was unable to reproduce it with either xfstests or CrashMonkey.
>
> Hi Ashlie,
>
> Thanks for your detailed comments.
>
> 1) Do you think the output that Amir noted in xfstests is a false positive?

It almost seems that way, though to be honest, I don't think I know
enough about 1. the setup used by Amir and 2. all the internal working
of KVM+virtio to say for sure. One thing I have identified as being a
potential source of false positives is code in the kernel to remap the
block number sent to device drives to something relative to the start
of a block device, not the start of a partition. I'm thinking this
could cause trouble if 1. a partition higher than the first partition
is monitored by dm-log-writes, 2. the block numbers are recorded
verbatim in dm-log-writes, and 3. the dm-log-writes log is replayed on
a different device with different partitions (or possibly the same
device, but a different partition). I know some other undergrad
students at UT on the CrashMonkey team are looking into this right
now, but I have not had time to test this myself. The offending code
in the kernel is in the completely misnamed
`generic_make_request_checks` function which has given the CrashMonkey
team trouble in the past. Links to that function and the remapping
function it calls are below.

http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L2041
http://elixir.free-electrons.com/linux/latest/source/block/blk-core.c#L1902

>
> 2) About the output that both i and you reproduced,  did you look into it
> and find its root cause?

I have not looked to find the root cause of the issue that we both
reliably reproduce.

>
> Thanks,
> Xiao Yang
>>
>> At any rate, CrashMonkey is still under development, so it does have
>> some caveats. First, we are running with a fixed random seed in our
>> default RandomPermuter (used to generate crash states) to aid
>> development. Second, the branch with the reproduction of this ext4
>> regression bug in CrashMonkey [3] will yield a few false positives due
>> to the way CrashMonkey works and how fsx runs. These false positives
>> are due to CrashMonkey generating crash states where the directories
>> for files used for the test have not be fsync-ed in the file system.
>> The top of the README in the CrashMonkey branch with this bug
>> reproduction outlines how we determined these were false positives
>>
>> [1] https://github.com/utsaslab/crashmonkey
>> [2]
>> https://www.usenix.org/conference/hotstorage17/program/presentation/martinez
>> [3] https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug
>>
>>
>> On Mon, Sep 25, 2017 at 5:53 AM, Amir Goldstein<amir73il@gmail.com>
>> wrote:
>>>
>>> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com>
>>> wrote:
>>>>
>>>> On 2017/08/27 18:44, Amir Goldstein wrote:
>>>>>
>>>>> This test is motivated by a bug found in ext4 during random crash
>>>>> consistency tests.
>>>>>
>>>>> This test uses device mapper flakey target to demonstrate the bug
>>>>> found using device mapper log-writes target.
>>>>>
>>>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com>
>>>>> ---
>>>>>
>>>>> Ted,
>>>>>
>>>>> While working on crash consistency xfstests [1], I stubmled on what
>>>>> appeared to be an ext4 crash consistency bug.
>>>>>
>>>>> The tests I used rely on the log-writes dm target code written
>>>>> by Josef Bacik, which had little exposure to the wide community
>>>>> as far as I know.  I wanted to prove to myself that the found
>>>>> inconsistency was not due to a test bug, so I bisected the failed
>>>>> test to the minimal operations that trigger the failure and wrote
>>>>> a small independent test to reproduce the issue using dm flakey target.
>>>>>
>>>>> The following fsck error is reliably reproduced by replaying some fsx
>>>>> ops
>>>>> on overlapping file regions, then emulating a crash, followed by mount,
>>>>> umount and fsck -nf:
>>>>>
>>>>>    ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>>>>>    1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>>>>>    2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>>>>>    3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>>>>>    4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>>>>>    5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>>>>>    6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>>>>>    All 7 operations completed A-OK!
>>>>>    _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is
>>>>> inconsistent
>>>>>    *** fsck.ext4 output ***
>>>>>    fsck from util-linux 2.27.1
>>>>>    e2fsck 1.42.13 (17-May-2015)
>>>>>    Pass 1: Checking inodes, blocks, and sizes
>>>>>    Inode 12, end of extent exceeds allowed value
>>>>>            (logical block 33, physical block 33441, len 7)
>>>>>    Clear? no
>>>>>    Inode 12, i_blocks is 184, should be 128.  Fix? no
>>>>
>>>> Hi Amir,
>>>>
>>>> I always get the following output when running your xfstests test case
>>>> 501.
>>>
>>> Now merged as test generic/456
>>>
>>>>
>>>> ---------------------------------------------------------------------------
>>>> e2fsck 1.42.9 (28-Dec-2013)
>>>> Pass 1: Checking inodes, blocks, and sizes
>>>> Inode 12, i_size is 147456, should be 163840. Fix? no
>>>>
>>>> ---------------------------------------------------------------------------
>>>>
>>>> Could you tell me how to get the expected output as you reported?
>>>
>>> I can't say I am doing anything special, but I can say that I get the
>>> same output as you did when running the test inside kvm-xfstests.
>>> Actually, I could not reproduce ANY of the the crash consistency bugs
>>> inside kvm-xfstests. Must be something to do with different timing of
>>> IO with KVM+virtio disks??
>>>
>>> When running on my laptop (Ubuntu 16.04 with latest kernel)
>>> on a 10G SSD volume, I always get the error reported above.
>>> I just re-verified with latest stable e2fsprogs (1.43.6).
>>>
>>> Amir.
>>
>>
>> .
>>
>
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-09-30 14:15     ` Ashlie Martinez
@ 2017-10-05  7:27       ` Xiao Yang
  2017-10-05 15:04         ` Ashlie Martinez
  0 siblings, 1 reply; 16+ messages in thread
From: Xiao Yang @ 2017-10-05  7:27 UTC (permalink / raw)
  To: Ashlie Martinez
  Cc: Amir Goldstein, Theodore Ts'o, Eryu Guan, Josef Bacik,
	fstests, Ext4, Vijay Chidambaram

On 2017/09/30 22:15, Ashlie Martinez wrote:
> Hi Xiao,
>
> I am a student at the University of Texas at Austin. Some researchers
> in the computer science department at UT, myself included, have
> recently been working to develop a file system crash consistency test
> harness called CrashMonkey [1][2]. I have been working on the
> CrashMonkey project since it was started late last year. With
> CrashMonkey we have also been able to reproduce the incorrect i_size
> error you noted but we have not been able to reproduce the other
> output that Amir found. CrashMonkey works by logging and replaying
> operations for a workload, so it should not be sensitive to
> differences in timing that could be caused by things like KVM+virtio.
> I also did a few experiments with Amir's new xfstests test 456 (both
> with and without KVM and virtio) and I was unable to reproduce the
> output noted in the xfstest. I have not spent a lot of time looking
> into the cause of the bug that Amir found and it is rather unfortunate
> that I was unable to reproduce it with either xfstests or CrashMonkey.
Hi Ashlie,

Thanks for your detailed comments.

1) Do you think the output that Amir noted in xfstests is a false positive?

2) About the output that both i and you reproduced,  did you look into 
it and find its root cause?

Thanks,
Xiao Yang
> At any rate, CrashMonkey is still under development, so it does have
> some caveats. First, we are running with a fixed random seed in our
> default RandomPermuter (used to generate crash states) to aid
> development. Second, the branch with the reproduction of this ext4
> regression bug in CrashMonkey [3] will yield a few false positives due
> to the way CrashMonkey works and how fsx runs. These false positives
> are due to CrashMonkey generating crash states where the directories
> for files used for the test have not be fsync-ed in the file system.
> The top of the README in the CrashMonkey branch with this bug
> reproduction outlines how we determined these were false positives
>
> [1] https://github.com/utsaslab/crashmonkey
> [2] https://www.usenix.org/conference/hotstorage17/program/presentation/martinez
> [3] https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug
>
>
> On Mon, Sep 25, 2017 at 5:53 AM, Amir Goldstein<amir73il@gmail.com>  wrote:
>> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com>  wrote:
>>> On 2017/08/27 18:44, Amir Goldstein wrote:
>>>> This test is motivated by a bug found in ext4 during random crash
>>>> consistency tests.
>>>>
>>>> This test uses device mapper flakey target to demonstrate the bug
>>>> found using device mapper log-writes target.
>>>>
>>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com>
>>>> ---
>>>>
>>>> Ted,
>>>>
>>>> While working on crash consistency xfstests [1], I stubmled on what
>>>> appeared to be an ext4 crash consistency bug.
>>>>
>>>> The tests I used rely on the log-writes dm target code written
>>>> by Josef Bacik, which had little exposure to the wide community
>>>> as far as I know.  I wanted to prove to myself that the found
>>>> inconsistency was not due to a test bug, so I bisected the failed
>>>> test to the minimal operations that trigger the failure and wrote
>>>> a small independent test to reproduce the issue using dm flakey target.
>>>>
>>>> The following fsck error is reliably reproduced by replaying some fsx ops
>>>> on overlapping file regions, then emulating a crash, followed by mount,
>>>> umount and fsck -nf:
>>>>
>>>>    ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>>>>    1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>>>>    2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>>>>    3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>>>>    4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>>>>    5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>>>>    6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>>>>    All 7 operations completed A-OK!
>>>>    _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>>>>    *** fsck.ext4 output ***
>>>>    fsck from util-linux 2.27.1
>>>>    e2fsck 1.42.13 (17-May-2015)
>>>>    Pass 1: Checking inodes, blocks, and sizes
>>>>    Inode 12, end of extent exceeds allowed value
>>>>            (logical block 33, physical block 33441, len 7)
>>>>    Clear? no
>>>>    Inode 12, i_blocks is 184, should be 128.  Fix? no
>>> Hi Amir,
>>>
>>> I always get the following output when running your xfstests test case 501.
>> Now merged as test generic/456
>>
>>> ---------------------------------------------------------------------------
>>> e2fsck 1.42.9 (28-Dec-2013)
>>> Pass 1: Checking inodes, blocks, and sizes
>>> Inode 12, i_size is 147456, should be 163840. Fix? no
>>> ---------------------------------------------------------------------------
>>>
>>> Could you tell me how to get the expected output as you reported?
>> I can't say I am doing anything special, but I can say that I get the
>> same output as you did when running the test inside kvm-xfstests.
>> Actually, I could not reproduce ANY of the the crash consistency bugs
>> inside kvm-xfstests. Must be something to do with different timing of
>> IO with KVM+virtio disks??
>>
>> When running on my laptop (Ubuntu 16.04 with latest kernel)
>> on a 10G SSD volume, I always get the error reported above.
>> I just re-verified with latest stable e2fsprogs (1.43.6).
>>
>> Amir.
>
> .
>




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-09-25 10:53   ` Amir Goldstein
  2017-09-26 10:45     ` Xiao Yang
@ 2017-09-30 14:15     ` Ashlie Martinez
  2017-10-05  7:27       ` Xiao Yang
  1 sibling, 1 reply; 16+ messages in thread
From: Ashlie Martinez @ 2017-09-30 14:15 UTC (permalink / raw)
  To: Xiao Yang
  Cc: Amir Goldstein, Theodore Ts'o, Eryu Guan, Josef Bacik,
	fstests, Ext4, Vijay Chidambaram

Hi Xiao,

I am a student at the University of Texas at Austin. Some researchers
in the computer science department at UT, myself included, have
recently been working to develop a file system crash consistency test
harness called CrashMonkey [1][2]. I have been working on the
CrashMonkey project since it was started late last year. With
CrashMonkey we have also been able to reproduce the incorrect i_size
error you noted but we have not been able to reproduce the other
output that Amir found. CrashMonkey works by logging and replaying
operations for a workload, so it should not be sensitive to
differences in timing that could be caused by things like KVM+virtio.
I also did a few experiments with Amir's new xfstests test 456 (both
with and without KVM and virtio) and I was unable to reproduce the
output noted in the xfstest. I have not spent a lot of time looking
into the cause of the bug that Amir found and it is rather unfortunate
that I was unable to reproduce it with either xfstests or CrashMonkey.

At any rate, CrashMonkey is still under development, so it does have
some caveats. First, we are running with a fixed random seed in our
default RandomPermuter (used to generate crash states) to aid
development. Second, the branch with the reproduction of this ext4
regression bug in CrashMonkey [3] will yield a few false positives due
to the way CrashMonkey works and how fsx runs. These false positives
are due to CrashMonkey generating crash states where the directories
for files used for the test have not be fsync-ed in the file system.
The top of the README in the CrashMonkey branch with this bug
reproduction outlines how we determined these were false positives

[1] https://github.com/utsaslab/crashmonkey
[2] https://www.usenix.org/conference/hotstorage17/program/presentation/martinez
[3] https://github.com/utsaslab/crashmonkey/tree/ext4_regression_bug

On Mon, Sep 25, 2017 at 5:53 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote:
>> On 2017/08/27 18:44, Amir Goldstein wrote:
>>> This test is motivated by a bug found in ext4 during random crash
>>> consistency tests.
>>>
>>> This test uses device mapper flakey target to demonstrate the bug
>>> found using device mapper log-writes target.
>>>
>>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>> ---
>>>
>>> Ted,
>>>
>>> While working on crash consistency xfstests [1], I stubmled on what
>>> appeared to be an ext4 crash consistency bug.
>>>
>>> The tests I used rely on the log-writes dm target code written
>>> by Josef Bacik, which had little exposure to the wide community
>>> as far as I know.  I wanted to prove to myself that the found
>>> inconsistency was not due to a test bug, so I bisected the failed
>>> test to the minimal operations that trigger the failure and wrote
>>> a small independent test to reproduce the issue using dm flakey target.
>>>
>>> The following fsck error is reliably reproduced by replaying some fsx ops
>>> on overlapping file regions, then emulating a crash, followed by mount,
>>> umount and fsck -nf:
>>>
>>>   ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>>>   1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>>>   2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>>>   3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>>>   4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>>>   5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>>>   6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>>>   All 7 operations completed A-OK!
>>>   _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>>>   *** fsck.ext4 output ***
>>>   fsck from util-linux 2.27.1
>>>   e2fsck 1.42.13 (17-May-2015)
>>>   Pass 1: Checking inodes, blocks, and sizes
>>>   Inode 12, end of extent exceeds allowed value
>>>           (logical block 33, physical block 33441, len 7)
>>>   Clear? no
>>>   Inode 12, i_blocks is 184, should be 128.  Fix? no
>> Hi Amir,
>>
>> I always get the following output when running your xfstests test case 501.
>
> Now merged as test generic/456
>
>> ---------------------------------------------------------------------------
>> e2fsck 1.42.9 (28-Dec-2013)
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 12, i_size is 147456, should be 163840. Fix? no
>> ---------------------------------------------------------------------------
>>
>> Could you tell me how to get the expected output as you reported?
>
> I can't say I am doing anything special, but I can say that I get the
> same output as you did when running the test inside kvm-xfstests.
> Actually, I could not reproduce ANY of the the crash consistency bugs
> inside kvm-xfstests. Must be something to do with different timing of
> IO with KVM+virtio disks??
>
> When running on my laptop (Ubuntu 16.04 with latest kernel)
> on a 10G SSD volume, I always get the error reported above.
> I just re-verified with latest stable e2fsprogs (1.43.6).
>
> Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-09-26 10:45     ` Xiao Yang
@ 2017-09-26 11:48       ` Amir Goldstein
  0 siblings, 0 replies; 16+ messages in thread
From: Amir Goldstein @ 2017-09-26 11:48 UTC (permalink / raw)
  To: Xiao Yang; +Cc: Theodore Ts'o, Eryu Guan, Josef Bacik, fstests, Ext4

On Tue, Sep 26, 2017 at 1:45 PM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote:
> On 2017/09/25 18:53, Amir Goldstein wrote:
>>
>> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com>
>> wrote:
>>>
>>> On 2017/08/27 18:44, Amir Goldstein wrote:
>>>>
>>>> This test is motivated by a bug found in ext4 during random crash
>>>> consistency tests.
>>>>
>>>> This test uses device mapper flakey target to demonstrate the bug
>>>> found using device mapper log-writes target.
>>>>
>>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com>
>>>> ---
>>>>
>>>> Ted,
>>>>
>>>> While working on crash consistency xfstests [1], I stubmled on what
>>>> appeared to be an ext4 crash consistency bug.
>>>>
>>>> The tests I used rely on the log-writes dm target code written
>>>> by Josef Bacik, which had little exposure to the wide community
>>>> as far as I know.  I wanted to prove to myself that the found
>>>> inconsistency was not due to a test bug, so I bisected the failed
>>>> test to the minimal operations that trigger the failure and wrote
>>>> a small independent test to reproduce the issue using dm flakey target.
>>>>
>>>> The following fsck error is reliably reproduced by replaying some fsx
>>>> ops
>>>> on overlapping file regions, then emulating a crash, followed by mount,
>>>> umount and fsck -nf:
>>>>
>>>>    ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>>>>    1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>>>>    2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>>>>    3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>>>>    4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>>>>    5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>>>>    6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>>>>    All 7 operations completed A-OK!
>>>>    _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is
>>>> inconsistent
>>>>    *** fsck.ext4 output ***
>>>>    fsck from util-linux 2.27.1
>>>>    e2fsck 1.42.13 (17-May-2015)
>>>>    Pass 1: Checking inodes, blocks, and sizes
>>>>    Inode 12, end of extent exceeds allowed value
>>>>            (logical block 33, physical block 33441, len 7)
>>>>    Clear? no
>>>>    Inode 12, i_blocks is 184, should be 128.  Fix? no
>>>
>>> Hi Amir,
>>>
>>> I always get the following output when running your xfstests test case
>>> 501.
>>
>> Now merged as test generic/456
>>
>>>
>>> ---------------------------------------------------------------------------
>>> e2fsck 1.42.9 (28-Dec-2013)
>>> Pass 1: Checking inodes, blocks, and sizes
>>> Inode 12, i_size is 147456, should be 163840. Fix? no
>>>
>>> ---------------------------------------------------------------------------
>>>
>>> Could you tell me how to get the expected output as you reported?
>>
>> I can't say I am doing anything special, but I can say that I get the
>> same output as you did when running the test inside kvm-xfstests.
>> Actually, I could not reproduce ANY of the the crash consistency bugs
>> inside kvm-xfstests. Must be something to do with different timing of
>> IO with KVM+virtio disks??
>>
>> When running on my laptop (Ubuntu 16.04 with latest kernel)
>> on a 10G SSD volume, I always get the error reported above.
>> I just re-verified with latest stable e2fsprogs (1.43.6).
>
> Hi Amir,
>
> I tested generic/456 with KVM+virtio disks and SATA volumes on some kernels

I don't understand. Did you also test without KVM?
Otherwise I suggest that you test without KVM/virtio.

> (including
> v3.10.0, the latest kernel), but i still got the same output as i reported.
>
> Could you determine whether the two different outputs are caused by the same
> bug
> or not ?

No idea if those are 2 symptoms of the same bug or 2 different bugs
I did not investigate the root cause.

Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-09-25 10:53   ` Amir Goldstein
@ 2017-09-26 10:45     ` Xiao Yang
  2017-09-26 11:48       ` Amir Goldstein
  2017-09-30 14:15     ` Ashlie Martinez
  1 sibling, 1 reply; 16+ messages in thread
From: Xiao Yang @ 2017-09-26 10:45 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Theodore Ts'o, Eryu Guan, Josef Bacik, fstests, Ext4

On 2017/09/25 18:53, Amir Goldstein wrote:
> On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang<yangx.jy@cn.fujitsu.com>  wrote:
>> On 2017/08/27 18:44, Amir Goldstein wrote:
>>> This test is motivated by a bug found in ext4 during random crash
>>> consistency tests.
>>>
>>> This test uses device mapper flakey target to demonstrate the bug
>>> found using device mapper log-writes target.
>>>
>>> Signed-off-by: Amir Goldstein<amir73il@gmail.com>
>>> ---
>>>
>>> Ted,
>>>
>>> While working on crash consistency xfstests [1], I stubmled on what
>>> appeared to be an ext4 crash consistency bug.
>>>
>>> The tests I used rely on the log-writes dm target code written
>>> by Josef Bacik, which had little exposure to the wide community
>>> as far as I know.  I wanted to prove to myself that the found
>>> inconsistency was not due to a test bug, so I bisected the failed
>>> test to the minimal operations that trigger the failure and wrote
>>> a small independent test to reproduce the issue using dm flakey target.
>>>
>>> The following fsck error is reliably reproduced by replaying some fsx ops
>>> on overlapping file regions, then emulating a crash, followed by mount,
>>> umount and fsck -nf:
>>>
>>>    ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>>>    1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>>>    2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>>>    3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>>>    4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>>>    5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>>>    6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>>>    All 7 operations completed A-OK!
>>>    _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>>>    *** fsck.ext4 output ***
>>>    fsck from util-linux 2.27.1
>>>    e2fsck 1.42.13 (17-May-2015)
>>>    Pass 1: Checking inodes, blocks, and sizes
>>>    Inode 12, end of extent exceeds allowed value
>>>            (logical block 33, physical block 33441, len 7)
>>>    Clear? no
>>>    Inode 12, i_blocks is 184, should be 128.  Fix? no
>> Hi Amir,
>>
>> I always get the following output when running your xfstests test case 501.
> Now merged as test generic/456
>
>> ---------------------------------------------------------------------------
>> e2fsck 1.42.9 (28-Dec-2013)
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 12, i_size is 147456, should be 163840. Fix? no
>> ---------------------------------------------------------------------------
>>
>> Could you tell me how to get the expected output as you reported?
> I can't say I am doing anything special, but I can say that I get the
> same output as you did when running the test inside kvm-xfstests.
> Actually, I could not reproduce ANY of the the crash consistency bugs
> inside kvm-xfstests. Must be something to do with different timing of
> IO with KVM+virtio disks??
>
> When running on my laptop (Ubuntu 16.04 with latest kernel)
> on a 10G SSD volume, I always get the error reported above.
> I just re-verified with latest stable e2fsprogs (1.43.6).
Hi Amir,

I tested generic/456 with KVM+virtio disks and SATA volumes on some 
kernels (including
v3.10.0, the latest kernel), but i still got the same output as i reported.

Could you determine whether the two different outputs are caused by the 
same bug
or not ?

Thanks,
Xiao Yang.
> Amir.
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> .
>




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-09-25  9:49   ` Xiao Yang
  (?)
@ 2017-09-25 10:53   ` Amir Goldstein
  2017-09-26 10:45     ` Xiao Yang
  2017-09-30 14:15     ` Ashlie Martinez
  -1 siblings, 2 replies; 16+ messages in thread
From: Amir Goldstein @ 2017-09-25 10:53 UTC (permalink / raw)
  To: Xiao Yang; +Cc: Theodore Ts'o, Eryu Guan, Josef Bacik, fstests, Ext4

On Mon, Sep 25, 2017 at 12:49 PM, Xiao Yang <yangx.jy@cn.fujitsu.com> wrote:
> On 2017/08/27 18:44, Amir Goldstein wrote:
>> This test is motivated by a bug found in ext4 during random crash
>> consistency tests.
>>
>> This test uses device mapper flakey target to demonstrate the bug
>> found using device mapper log-writes target.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>
>> Ted,
>>
>> While working on crash consistency xfstests [1], I stubmled on what
>> appeared to be an ext4 crash consistency bug.
>>
>> The tests I used rely on the log-writes dm target code written
>> by Josef Bacik, which had little exposure to the wide community
>> as far as I know.  I wanted to prove to myself that the found
>> inconsistency was not due to a test bug, so I bisected the failed
>> test to the minimal operations that trigger the failure and wrote
>> a small independent test to reproduce the issue using dm flakey target.
>>
>> The following fsck error is reliably reproduced by replaying some fsx ops
>> on overlapping file regions, then emulating a crash, followed by mount,
>> umount and fsck -nf:
>>
>>   ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>>   1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>>   2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>>   3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>>   4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>>   5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>>   6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>>   All 7 operations completed A-OK!
>>   _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>>   *** fsck.ext4 output ***
>>   fsck from util-linux 2.27.1
>>   e2fsck 1.42.13 (17-May-2015)
>>   Pass 1: Checking inodes, blocks, and sizes
>>   Inode 12, end of extent exceeds allowed value
>>           (logical block 33, physical block 33441, len 7)
>>   Clear? no
>>   Inode 12, i_blocks is 184, should be 128.  Fix? no
> Hi Amir,
>
> I always get the following output when running your xfstests test case 501.

Now merged as test generic/456

> ---------------------------------------------------------------------------
> e2fsck 1.42.9 (28-Dec-2013)
> Pass 1: Checking inodes, blocks, and sizes
> Inode 12, i_size is 147456, should be 163840. Fix? no
> ---------------------------------------------------------------------------
>
> Could you tell me how to get the expected output as you reported?

I can't say I am doing anything special, but I can say that I get the
same output as you did when running the test inside kvm-xfstests.
Actually, I could not reproduce ANY of the the crash consistency bugs
inside kvm-xfstests. Must be something to do with different timing of
IO with KVM+virtio disks??

When running on my laptop (Ubuntu 16.04 with latest kernel)
on a 10G SSD volume, I always get the error reported above.
I just re-verified with latest stable e2fsprogs (1.43.6).

Amir.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
  2017-08-27 10:44 Amir Goldstein
@ 2017-09-25  9:49   ` Xiao Yang
  0 siblings, 0 replies; 16+ messages in thread
From: Xiao Yang @ 2017-09-25  9:49 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Theodore Ts'o, Eryu Guan, Josef Bacik, fstests, linux-ext4

On 2017/08/27 18:44, Amir Goldstein wrote:
> This test is motivated by a bug found in ext4 during random crash
> consistency tests.
>
> This test uses device mapper flakey target to demonstrate the bug
> found using device mapper log-writes target.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>
> Ted,
>
> While working on crash consistency xfstests [1], I stubmled on what
> appeared to be an ext4 crash consistency bug.
>
> The tests I used rely on the log-writes dm target code written
> by Josef Bacik, which had little exposure to the wide community
> as far as I know.  I wanted to prove to myself that the found
> inconsistency was not due to a test bug, so I bisected the failed
> test to the minimal operations that trigger the failure and wrote
> a small independent test to reproduce the issue using dm flakey target.
>
> The following fsck error is reliably reproduced by replaying some fsx ops
> on overlapping file regions, then emulating a crash, followed by mount,
> umount and fsck -nf:
>
>   ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>   1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>   2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>   3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>   4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>   5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>   6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>   All 7 operations completed A-OK!
>   _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>   *** fsck.ext4 output ***
>   fsck from util-linux 2.27.1
>   e2fsck 1.42.13 (17-May-2015)
>   Pass 1: Checking inodes, blocks, and sizes
>   Inode 12, end of extent exceeds allowed value
>           (logical block 33, physical block 33441, len 7)
>   Clear? no
>   Inode 12, i_blocks is 184, should be 128.  Fix? no
Hi Amir,

I always get the following output when running your xfstests test case 501.
---------------------------------------------------------------------------
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Inode 12, i_size is 147456, should be 163840. Fix? no
---------------------------------------------------------------------------

Could you tell me how to get the expected output as you reported?

Thanks,
Xiao Yang
> Note that the inconsistency is "applied" by journal replay during mount.
> fsck -nf before mount does not report any errors.
>
> I did not intend for this test to be merged as is, but rather to be used
> by ext4 developers to analyze the problem and then re-write the test with
> more comments and less arbitrary offset/length values.
>
> P.S.: crash consistency tests also reliably reproduce a btrfs fsck error.
>       a detailed report with I/O recording was sent to Josef.
> P.S.2: crash consistency tests report file data checksum errors on xfs
>        after fsync+crash, but I still need to prove the reliability of
>        these reports.
>  
> [1] https://github.com/amir73il/xfstests/commits/dm-log-writes
>
>  tests/generic/501     | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/501.out |  2 ++
>  tests/generic/group   |  1 +
>  3 files changed, 83 insertions(+)
>  create mode 100755 tests/generic/501
>  create mode 100644 tests/generic/501.out
>
> diff --git a/tests/generic/501 b/tests/generic/501
> new file mode 100755
> index 0000000..ccb513d
> --- /dev/null
> +++ b/tests/generic/501
> @@ -0,0 +1,80 @@
> +#! /bin/bash
> +# FS QA Test No. 501
> +#
> +# This test is motivated by a bug found in ext4 during random crash
> +# consistency tests.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (C) 2017 CTERA Networks. All Rights Reserved.
> +# Author: Amir Goldstein <amir73il@gmail.com>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +
> +_cleanup()
> +{
> +	_cleanup_flakey
> +	cd /
> +	rm -f $tmp.*
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmflakey
> +
> +# real QA test starts here
> +_supported_fs generic
> +_supported_os Linux
> +_require_scratch
> +_require_dm_target flakey
> +_require_metadata_journaling $SCRATCH_DEV
> +
> +rm -f $seqres.full
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +
> +_init_flakey
> +_mount_flakey
> +
> +fsxops=$tmp.fsxops
> +cat <<EOF > $fsxops
> +write 0x137dd 0xdc69 0x0
> +fallocate 0xb531 0xb5ad 0x21446
> +collapse_range 0x1c000 0x4000 0x21446
> +write 0x3e5ec 0x1a14 0x21446
> +zero_range 0x20fac 0x6d9c 0x40000 keep_size
> +mapwrite 0x216ad 0x274f 0x40000
> +EOF
> +run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile
> +
> +_flakey_drop_and_remount
> +_unmount_flakey
> +_cleanup_flakey
> +_check_scratch_fs
> +
> +echo "Silence is golden"
> +
> +status=0
> +exit
> diff --git a/tests/generic/501.out b/tests/generic/501.out
> new file mode 100644
> index 0000000..00133b6
> --- /dev/null
> +++ b/tests/generic/501.out
> @@ -0,0 +1,2 @@
> +QA output created by 501
> +Silence is golden
> diff --git a/tests/generic/group b/tests/generic/group
> index 2396b72..bb870f2 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -454,3 +454,4 @@
>  449 auto quick acl enospc
>  450 auto quick rw
>  500 auto log replay
> +501 auto quick metadata




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
@ 2017-09-25  9:49   ` Xiao Yang
  0 siblings, 0 replies; 16+ messages in thread
From: Xiao Yang @ 2017-09-25  9:49 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Theodore Ts'o, Eryu Guan, Josef Bacik, fstests, linux-ext4

On 2017/08/27 18:44, Amir Goldstein wrote:
> This test is motivated by a bug found in ext4 during random crash
> consistency tests.
>
> This test uses device mapper flakey target to demonstrate the bug
> found using device mapper log-writes target.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>
> Ted,
>
> While working on crash consistency xfstests [1], I stubmled on what
> appeared to be an ext4 crash consistency bug.
>
> The tests I used rely on the log-writes dm target code written
> by Josef Bacik, which had little exposure to the wide community
> as far as I know.  I wanted to prove to myself that the found
> inconsistency was not due to a test bug, so I bisected the failed
> test to the minimal operations that trigger the failure and wrote
> a small independent test to reproduce the issue using dm flakey target.
>
> The following fsck error is reliably reproduced by replaying some fsx ops
> on overlapping file regions, then emulating a crash, followed by mount,
> umount and fsck -nf:
>
>   ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
>   1 write 0x137dd thru    0x21445 (0xdc69 bytes)
>   2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
>   3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
>   4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
>   5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
>   6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
>   All 7 operations completed A-OK!
>   _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
>   *** fsck.ext4 output ***
>   fsck from util-linux 2.27.1
>   e2fsck 1.42.13 (17-May-2015)
>   Pass 1: Checking inodes, blocks, and sizes
>   Inode 12, end of extent exceeds allowed value
>           (logical block 33, physical block 33441, len 7)
>   Clear? no
>   Inode 12, i_blocks is 184, should be 128.  Fix? no
Hi Amir,

I always get the following output when running your xfstests test case 501.
---------------------------------------------------------------------------
e2fsck 1.42.9 (28-Dec-2013)
Pass 1: Checking inodes, blocks, and sizes
Inode 12, i_size is 147456, should be 163840. Fix? no
---------------------------------------------------------------------------

Could you tell me how to get the expected output as you reported?

Thanks,
Xiao Yang
> Note that the inconsistency is "applied" by journal replay during mount.
> fsck -nf before mount does not report any errors.
>
> I did not intend for this test to be merged as is, but rather to be used
> by ext4 developers to analyze the problem and then re-write the test with
> more comments and less arbitrary offset/length values.
>
> P.S.: crash consistency tests also reliably reproduce a btrfs fsck error.
>       a detailed report with I/O recording was sent to Josef.
> P.S.2: crash consistency tests report file data checksum errors on xfs
>        after fsync+crash, but I still need to prove the reliability of
>        these reports.
>  
> [1] https://github.com/amir73il/xfstests/commits/dm-log-writes
>
>  tests/generic/501     | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/501.out |  2 ++
>  tests/generic/group   |  1 +
>  3 files changed, 83 insertions(+)
>  create mode 100755 tests/generic/501
>  create mode 100644 tests/generic/501.out
>
> diff --git a/tests/generic/501 b/tests/generic/501
> new file mode 100755
> index 0000000..ccb513d
> --- /dev/null
> +++ b/tests/generic/501
> @@ -0,0 +1,80 @@
> +#! /bin/bash
> +# FS QA Test No. 501
> +#
> +# This test is motivated by a bug found in ext4 during random crash
> +# consistency tests.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (C) 2017 CTERA Networks. All Rights Reserved.
> +# Author: Amir Goldstein <amir73il@gmail.com>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +
> +_cleanup()
> +{
> +	_cleanup_flakey
> +	cd /
> +	rm -f $tmp.*
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmflakey
> +
> +# real QA test starts here
> +_supported_fs generic
> +_supported_os Linux
> +_require_scratch
> +_require_dm_target flakey
> +_require_metadata_journaling $SCRATCH_DEV
> +
> +rm -f $seqres.full
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +
> +_init_flakey
> +_mount_flakey
> +
> +fsxops=$tmp.fsxops
> +cat <<EOF > $fsxops
> +write 0x137dd 0xdc69 0x0
> +fallocate 0xb531 0xb5ad 0x21446
> +collapse_range 0x1c000 0x4000 0x21446
> +write 0x3e5ec 0x1a14 0x21446
> +zero_range 0x20fac 0x6d9c 0x40000 keep_size
> +mapwrite 0x216ad 0x274f 0x40000
> +EOF
> +run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile
> +
> +_flakey_drop_and_remount
> +_unmount_flakey
> +_cleanup_flakey
> +_check_scratch_fs
> +
> +echo "Silence is golden"
> +
> +status=0
> +exit
> diff --git a/tests/generic/501.out b/tests/generic/501.out
> new file mode 100644
> index 0000000..00133b6
> --- /dev/null
> +++ b/tests/generic/501.out
> @@ -0,0 +1,2 @@
> +QA output created by 501
> +Silence is golden
> diff --git a/tests/generic/group b/tests/generic/group
> index 2396b72..bb870f2 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -454,3 +454,4 @@
>  449 auto quick acl enospc
>  450 auto quick rw
>  500 auto log replay
> +501 auto quick metadata

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC][PATCH] fstest: regression test for ext4 crash consistency bug
@ 2017-08-27 10:44 Amir Goldstein
  2017-09-25  9:49   ` Xiao Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Amir Goldstein @ 2017-08-27 10:44 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Eryu Guan, Josef Bacik, fstests, linux-ext4

This test is motivated by a bug found in ext4 during random crash
consistency tests.

This test uses device mapper flakey target to demonstrate the bug
found using device mapper log-writes target.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Ted,

While working on crash consistency xfstests [1], I stubmled on what
appeared to be an ext4 crash consistency bug.

The tests I used rely on the log-writes dm target code written
by Josef Bacik, which had little exposure to the wide community
as far as I know.  I wanted to prove to myself that the found
inconsistency was not due to a test bug, so I bisected the failed
test to the minimal operations that trigger the failure and wrote
a small independent test to reproduce the issue using dm flakey target.

The following fsck error is reliably reproduced by replaying some fsx ops
on overlapping file regions, then emulating a crash, followed by mount,
umount and fsck -nf:

  ./ltp/fsx -d --replay-ops /tmp/8995.fsxops /mnt/scratch/testfile
  1 write 0x137dd thru    0x21445 (0xdc69 bytes)
  2 falloc        from 0xb531 to 0x16ade (0xb5ad bytes)
  3 collapse      from 0x1c000 to 0x20000, (0x4000 bytes)
  4 write 0x3e5ec thru    0x3ffff (0x1a14 bytes)
  5 zero  from 0x20fac to 0x27d48, (0x6d9c bytes)
  6 mapwrite      0x216ad thru    0x23dfb (0x274f bytes)
  All 7 operations completed A-OK!
  _check_generic_filesystem: filesystem on /dev/mapper/ssd-scratch is inconsistent
  *** fsck.ext4 output ***
  fsck from util-linux 2.27.1
  e2fsck 1.42.13 (17-May-2015)
  Pass 1: Checking inodes, blocks, and sizes
  Inode 12, end of extent exceeds allowed value
          (logical block 33, physical block 33441, len 7)
  Clear? no
  Inode 12, i_blocks is 184, should be 128.  Fix? no

Note that the inconsistency is "applied" by journal replay during mount.
fsck -nf before mount does not report any errors.

I did not intend for this test to be merged as is, but rather to be used
by ext4 developers to analyze the problem and then re-write the test with
more comments and less arbitrary offset/length values.

P.S.: crash consistency tests also reliably reproduce a btrfs fsck error.
      a detailed report with I/O recording was sent to Josef.
P.S.2: crash consistency tests report file data checksum errors on xfs
       after fsync+crash, but I still need to prove the reliability of
       these reports.
 
[1] https://github.com/amir73il/xfstests/commits/dm-log-writes

 tests/generic/501     | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/501.out |  2 ++
 tests/generic/group   |  1 +
 3 files changed, 83 insertions(+)
 create mode 100755 tests/generic/501
 create mode 100644 tests/generic/501.out

diff --git a/tests/generic/501 b/tests/generic/501
new file mode 100755
index 0000000..ccb513d
--- /dev/null
+++ b/tests/generic/501
@@ -0,0 +1,80 @@
+#! /bin/bash
+# FS QA Test No. 501
+#
+# This test is motivated by a bug found in ext4 during random crash
+# consistency tests.
+#
+#-----------------------------------------------------------------------
+# Copyright (C) 2017 CTERA Networks. All Rights Reserved.
+# Author: Amir Goldstein <amir73il@gmail.com>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+
+_cleanup()
+{
+	_cleanup_flakey
+	cd /
+	rm -f $tmp.*
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+_require_metadata_journaling $SCRATCH_DEV
+
+rm -f $seqres.full
+
+_scratch_mkfs >> $seqres.full 2>&1
+
+_init_flakey
+_mount_flakey
+
+fsxops=$tmp.fsxops
+cat <<EOF > $fsxops
+write 0x137dd 0xdc69 0x0
+fallocate 0xb531 0xb5ad 0x21446
+collapse_range 0x1c000 0x4000 0x21446
+write 0x3e5ec 0x1a14 0x21446
+zero_range 0x20fac 0x6d9c 0x40000 keep_size
+mapwrite 0x216ad 0x274f 0x40000
+EOF
+run_check $here/ltp/fsx -d --replay-ops $fsxops $SCRATCH_MNT/testfile
+
+_flakey_drop_and_remount
+_unmount_flakey
+_cleanup_flakey
+_check_scratch_fs
+
+echo "Silence is golden"
+
+status=0
+exit
diff --git a/tests/generic/501.out b/tests/generic/501.out
new file mode 100644
index 0000000..00133b6
--- /dev/null
+++ b/tests/generic/501.out
@@ -0,0 +1,2 @@
+QA output created by 501
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index 2396b72..bb870f2 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -454,3 +454,4 @@
 449 auto quick acl enospc
 450 auto quick rw
 500 auto log replay
+501 auto quick metadata
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-10-06  0:34 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-31  1:28 [RFC][PATCH] fstest: regression test for ext4 crash consistency bug Ashlie Martinez
2017-08-31  4:05 ` Amir Goldstein
2017-08-31  4:06   ` Amir Goldstein
2017-09-01 12:21     ` Ashlie Martinez
2017-09-01 14:59       ` Amir Goldstein
  -- strict thread matches above, loose matches on Subject: below --
2017-08-27 10:44 Amir Goldstein
2017-09-25  9:49 ` Xiao Yang
2017-09-25  9:49   ` Xiao Yang
2017-09-25 10:53   ` Amir Goldstein
2017-09-26 10:45     ` Xiao Yang
2017-09-26 11:48       ` Amir Goldstein
2017-09-30 14:15     ` Ashlie Martinez
2017-10-05  7:27       ` Xiao Yang
2017-10-05 15:04         ` Ashlie Martinez
2017-10-05 19:10           ` Amir Goldstein
2017-10-06  0:34             ` Ashlie Martinez

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.