All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ashlie Martinez <ashmrtn@utexas.edu>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Josef Bacik <jbacik@fb.com>, fstests <fstests@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Eryu Guan <eguan@redhat.com>
Subject: Re: [PATCH v3 10/13] fstests: crash consistency fsx test using dm-log-writes
Date: Mon, 27 Nov 2017 08:23:49 -0600	[thread overview]
Message-ID: <CAFk8rvZX3YWoShy2i_Jr+-cGt2bhjDxVRaFHgXQNzbmWAkdcXQ@mail.gmail.com> (raw)
In-Reply-To: <CAOQ4uxi_SjKCJ97Knvh4S6UiD3jskHEqDja7wwc+-mXp4CNbEw@mail.gmail.com>

Amir,

Just to throw in what I believe I've found about dm-log-writes (though
Josef would know more about this as it's just what I've concluded
after looking at  the code): dm-log-writes logs at IO completion,
meaning disks that ignore flush operations could exhibit bugs where
there are none, and dm-log-writes does synchronous marks though this
may not line up with the stream of IO operations (due to buffering)
unless you just did a sync. The CrashMonkey team wanted to make
something similar to the "mark" operation dm-log-writes has and we
concluded that the only place we could guarantee a mark would line up
with the stream of IO operations the user had performed was right
after a call to sync as that would force cached updates to disk. If
you call mark without a sync, then you could insert a mark after a
write(2) returns, but before the delayed allocation for that write(2)
actually allocates blocks and changes the extent tree on disk.

On Mon, Nov 27, 2017 at 3:56 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Tue, Sep 5, 2017 at 10:11 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Cherry-picked the test from commit 70d41e17164b
>> in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
>> Quoting from Josef's commit message:
>>
>>   The test just runs some ops and exits, then finds all of the good buffers
>>   in the directory we provided and:
>>   - replays up to the mark given
>>   - mounts the file system and compares the md5sum
>>   - unmounts and fsck's to check for metadata integrity
>>
>>   dm-log-writes will pretend to do discard and the replay-log tool will
>>   replay it properly depending on the underlying device, either by writing
>>   0's or actually calling the discard ioctl, so I've enabled discard in the
>>   test for maximum fun.
>>
>> [Amir:]
>> - Removed unneeded _test_falloc_support dynamic FSX_OPTS
>> - Fold repetitions into for loops
>> - Added place holders for using constant random seeds
>> - Add pre umount checkpint
>> - Add test to new 'replay' group
>> - Address review comments by Eryu Guan
>>
>> Cc: Josef Bacik <jbacik@fb.com>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>
>
> Josef,
>
> As you know, this test is now merged to xfstest as generic/455.
> I have been running the test for a while on xfs and it occasionally
> reports inconsistencies which I try to investigate.
>
> In some of the reports, it appears that dm-log-writes may be exhibiting
> a reliability issue (see below).
>
>> ---
>>  tests/generic/500     | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  tests/generic/500.out |   2 +
>>  tests/generic/group   |   1 +
>>  3 files changed, 138 insertions(+)
>>  create mode 100755 tests/generic/500
>>  create mode 100644 tests/generic/500.out
>>
>> diff --git a/tests/generic/500 b/tests/generic/500
>> new file mode 100755
>> index 0000000..82f7a93
>> --- /dev/null
>> +++ b/tests/generic/500
>> @@ -0,0 +1,135 @@
>> +#! /bin/bash
>> +# FS QA Test No. 500
>> +#
>> +# Run fsx with log writes to verify power fail safeness.
>> +#
>> +#-----------------------------------------------------------------------
>> +# Copyright (c) 2015 Facebook. All Rights Reserved.
>> +#
>> +# This program is free software; you can redistribute it and/or
>> +# modify it under the terms of the GNU General Public License as
>> +# published by the Free Software Foundation.
>> +#
>> +# This program is distributed in the hope that it would be useful,
>> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +# GNU General Public License for more details.
>> +#
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write the Free Software Foundation,
>> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
>> +#-----------------------------------------------------------------------
>> +#
>> +
>> +seq=`basename $0`
>> +seqres=$RESULT_DIR/$seq
>> +echo "QA output created by $seq"
>> +
>> +here=`pwd`
>> +status=1       # failure is the default!
>> +
>> +_cleanup()
>> +{
>> +       _log_writes_cleanup
>> +}
>> +trap "_cleanup; exit \$status" 0 1 2 3 15
>> +
>> +# get standard environment, filters and checks
>> +. ./common/rc
>> +. ./common/filter
>> +. ./common/dmlogwrites
>> +
>> +# real QA test starts here
>> +_supported_fs generic
>> +_supported_os Linux
>> +_require_test
>> +_require_scratch_nocheck
>> +_require_log_writes
>> +
>> +rm -f $seqres.full
>> +
>> +check_files()
>> +{
>> +       local name=$1
>> +
>> +       # Now look for our files
>> +       for i in $(find $SANITY_DIR -type f | grep $name | grep mark)
>> +       do
>> +               local filename=$(basename $i)
>> +               local mark="${filename##*.}"
>> +               echo "checking $filename" >> $seqres.full
>> +               _log_writes_replay_log $filename
>> +               _scratch_mount
>> +               local expected_md5=$(_md5_checksum $i)
>> +               local md5=$(_md5_checksum $SCRATCH_MNT/$name)
>> +               [ "${md5}" != "${expected_md5}" ] && _fail "$filename md5sum mismatched"
>
> One time, the test reported md5 mismatch on a file, but when I replayed
> the log to the same mark I found that md5 of the file is correct compared to the
> 'snapshot' file in test partition.
>
>> +               _scratch_unmount
>> +               _check_scratch_fs
>> +       done
>> +}
>> +
>> +SANITY_DIR=$TEST_DIR/fsxtests
>> +rm -rf $SANITY_DIR
>> +mkdir $SANITY_DIR
>> +
>> +# Create the log
>> +_log_writes_init
>> +
>> +_log_writes_mkfs >> $seqres.full 2>&1
>> +
>> +# Log writes emulates discard support, turn it on for maximum crying.
>> +_log_writes_mount -o discard
>> +
>> +NUM_FILES=4
>> +NUM_OPS=200
>> +FSX_OPTS="-N $NUM_OPS -d -P $SANITY_DIR -i $LOGWRITES_DMDEV"
>> +# Set random seeds for fsx runs (0 for timestamp + pid)
>> +seeds=(0 0 0 0)
>> +# Run fsx for a while
>> +for j in `seq 0 $((NUM_FILES-1))`
>> +do
>> +       run_check $here/ltp/fsx $FSX_OPTS -S ${seeds[$j]} -j $j $SCRATCH_MNT/testfile$j &
>> +done
>> +wait
>> +
>> +test_md5=()
>> +for j in `seq 0 $((NUM_FILES-1))`
>> +do
>> +       test_md5[$j]=$(_md5_checksum $SCRATCH_MNT/testfile$j)
>> +done
>> +
>> +# Unmount the scratch dir and tear down the log writes target
>> +_log_writes_mark last
>> +_log_writes_unmount
>> +_log_writes_mark end
>> +_log_writes_remove
>> +_check_scratch_fs
>> +
>
> Another time xfs_repair complained about dirty log right after
> _log_writes_remove
> and I wasn't able to seek to the "end" mark. Log entries were valid a
> few entries
> after the "last" mark.
>
> This leads me to believe that perhaps dm-log-writes doesn't flush all
> its pending
> log io before remove? Anyway, I added a "sync $LOGWRITES_DEV" call inside
> _log_writes_remove. Not sure if that helps or if that is required
> before removing the
> target?
>
> I will report if I see that the problem persists.
> Thought? suggestions?
>
> Thanks,
> Amir.
> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2017-11-27 14:23 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-05 19:11 [PATCH v3 00/13] Crash consistency xfstest using dm-log-writes Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 01/13] fsx: add support for integrity check with dm-log-writes target Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 02/13] fsx: add optional logid prefix to log messages Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 03/13] fsx: add support for recording operations to a file Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 04/13] fsx: add support for writing constant instead of random data Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 05/13] fsx: add support for keeping existing file Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 06/13] log-writes: add replay-log program to replay dm-log-writes target Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 07/13] replay-log: add validations for corrupt log entries Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 08/13] replay-log: add support for replaying ops in target device sector range Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 09/13] fstests: add support for working with dm-log-writes target Amir Goldstein
2017-09-07  7:45   ` Eryu Guan
2017-09-07  7:47   ` Eryu Guan
2017-09-05 19:11 ` [PATCH v3 10/13] fstests: crash consistency fsx test using dm-log-writes Amir Goldstein
2017-09-07  7:50   ` Eryu Guan
2017-09-07  8:50     ` Amir Goldstein
2017-09-07  8:55       ` Eryu Guan
2017-09-07 10:10         ` Amir Goldstein
2017-11-27  9:56   ` Amir Goldstein
2017-11-27 14:23     ` Ashlie Martinez [this message]
2017-11-27 15:07       ` Josef Bacik
2017-11-27 15:04     ` Josef Bacik
2017-11-28 16:48       ` Amir Goldstein
2017-11-28 17:21         ` Josef Bacik
2017-11-28 19:32           ` Amir Goldstein
2017-11-28 20:00             ` Josef Bacik
2017-11-28 20:26               ` Amir Goldstein
     [not found]                 ` <CAOQ4uxhQu-1AK71zg4Ce0cJd+xRt3Gf9zMMVb=Rs00zFuWA3hQ@mail.gmail.com>
2017-11-28 22:33                   ` Darrick J. Wong
2017-11-29  3:33                     ` Amir Goldstein
     [not found]                       ` <CAOQ4uxhXWxkre7L7RDvpH8E4cwsHGZzVHKmCpBESfTUZhmQpUg@mail.gmail.com>
     [not found]                         ` <CAOQ4uxjBQ9ZzPe9GKCRYCjNFv3jP8NMAVQDb=LiNqNcEeRp47w@mail.gmail.com>
2017-12-04 20:53                           ` Darrick J. Wong
2017-09-05 19:11 ` [PATCH v3 11/13] fstests: regression test for ext4 crash consistency bug Amir Goldstein
2017-09-07  7:52   ` Eryu Guan
2017-09-05 19:11 ` [PATCH v3 12/13] fstests: crash consistency fsx test for cloned files Amir Goldstein
2017-09-05 19:11 ` [PATCH v3 13/13] fstests: regression test for xfs leftover CoW extent error Amir Goldstein
2017-09-07  7:55   ` Eryu Guan
2017-09-07  9:34     ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFk8rvZX3YWoShy2i_Jr+-cGt2bhjDxVRaFHgXQNzbmWAkdcXQ@mail.gmail.com \
    --to=ashmrtn@utexas.edu \
    --cc=amir73il@gmail.com \
    --cc=eguan@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=jbacik@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.