All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jaegeuk Kim <jaegeuk@kernel.org>
To: Brian Foster <bfoster@redhat.com>
Cc: Eryu Guan <eguan@redhat.com>,
	fstests@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH] generic/391: check inode metadata on f{data}sync after power-cut
Date: Thu, 17 Nov 2016 11:17:05 -0800	[thread overview]
Message-ID: <20161117191705.GD74211@jaegeuk> (raw)
In-Reply-To: <20161117163203.GB49658@bfoster.bfoster>

Hello,

On Thu, Nov 17, 2016 at 11:32:03AM -0500, Brian Foster wrote:
> On Thu, Nov 17, 2016 at 10:00:19PM +0800, Eryu Guan wrote:
> > On Thu, Nov 17, 2016 at 07:56:58AM -0500, Brian Foster wrote:
> > > On Thu, Nov 17, 2016 at 04:35:27PM +0800, Eryu Guan wrote:
> > > > [cc'ed linux-xfs for inputs on a flush log issue]
> > > > 
> > > > On Wed, Nov 16, 2016 at 07:27:53PM -0800, Jaegeuk Kim wrote:
> > > > > This patch adds tests/generic/391 to test fsync and fdatasync with power-cuts.
> > > > > 
> > > > > The rule to check is:
> > > > > 1) fsync should guarantee all the inode metadata after power-cut,
> > > > > 2) fdatasync should guarantee i_size and i_blocks at least after power-cut.
> > > > > 
> > > > > Suggested-by: Chao Yu <yuchao0@huawei.com>
> > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
> > > > > ---
> > > > >  tests/generic/391     | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >  tests/generic/391.out |  11 +++++
> > > > >  tests/generic/group   |   1 +
> > > > >  3 files changed, 137 insertions(+)
> > > > >  create mode 100644 tests/generic/391
> > > > >  create mode 100644 tests/generic/391.out
> > > > > 
...
> > > > > +
> > > > > +seq=`basename $0`
> > > > > +seqres=$RESULT_DIR/$seq
> > > > > +echo "QA output created by $seq"
> > > > > +
> > > > > +here=`pwd`
> > > > > +tmp=/tmp/$$
> > > > > +status=1	# failure is the default!
> > > > > +trap "rm -f $tmp.*; exit \$status" 0 1 2 3 15
> > > > 
> > > > Better to follow the template and other tests on how to do cleanup, i.e.
> > > > trap _cleanup on exit and do real cleanups in it. e.g.
> > > > 
> > > > trap "_cleanup; exit \$status" 0 1 2 3 15
> > > > 
> > > > _cleanup()
> > > > {
> > > > 	cd /
> > > > 	rm -f $tmp.*
> > > > }
> > > > 
> > > > Not a big deal, but it's better to maintain consistency :)
> > > > 
> > > > > +
> > > > > +# get standard environment, filters and checks
> > > > > +. ./common/rc
> > > > > +. ./common/filter
> > > > > +. ./common/punch
> > > > > +
> > > > > +# real QA test starts here
> > > > > +_supported_fs generic
> > > > > +_supported_os Linux
> > > > > +
> > > > > +rm -f $seqres.full
> > > > > +_require_scratch
> > > > > +_require_scratch_shutdown
> > > > > +_require_xfs_io_command "fpunch"
> > > > > +
> > > > > +_scratch_mkfs >/dev/null 2>&1
> > > > > +_scratch_mount
> > > > > +
> > > > > +testfile=$SCRATCH_MNT/testfile
> > > > > +
> > > > > +# check inode metadata after shutdown
> > > > > +_check_inode_metadata()
> > > > 
> > > > No need to name local functions with the leading "_", that's for global
> > > > helper functions.
> > > > 
> > > > > +{
> > > > > +	src/godown $SCRATCH_MNT >> $seqres.full
> > > > > +	$XFS_IO_PROG -r -c "stat -v" $1 >$tmp.before
> > > > 
> > > > Shouldn't we call godown *after* xfs_io -c stat? I saw EIO on this
> > > > xfs_io command and all sub-tests reported stat diff.
> > > > 
> > > 
> > > Yeah.. I haven't run the test, but I would expect pretty much anything
> > > to return an error after an fs shutdown.
> > > 
> > > > And perhaps we need to flush the log on godown for XFS? i.e.
> > > > 
> > > > src/godown -f $SCRATCH_MNT >> $seqres.full
> > > > 
> > > 
> > > I don't think this is necessary. The semantics of fsync() dictate that
> > > the fs do what is necessary to make the file persistent on disk. This
> > > means it is the fs responsibility to ensure the changes are logged on
> > > disk. Indeed, xfs_file_fsync() calls _xfs_log_force_lsn() to flush the
> > > log up to the most recent LSN that covered the inode in question.
> > > 
> > > > Otherwise XFS fails all the "1024" & fsync tests (after I fixed the
> > > > godown & xfs_io order locally), fdatasync tests are fine.
> > > > 
> > > > @@ -1,8 +1,16 @@
> > > >  QA output created by 391
> > > >  ==== i_size 1024 test with fsync ====
> > > > +6c6
> > > > +< stat.blocks = 8200
> > > > +---
> > > > +> stat.blocks = 16256
> > > >  ==== i_size 4096 test with fsync ====
> > > >  ==== i_time test with fsync ====
> > > >  ==== fpunch 1024 test with fsync ====
> > > > +6c6
> > > > +< stat.blocks = 8208
> > > > +---
> > > > +> stat.blocks = 24576
> > > >  ==== fpunch 4096 test with fsync ====
> > > > 
> > > > Not sure if this is the expected behavior on XFS. cc'ed xfs list for
> > > > some inputs.
> > > > 
> > > 
> > > Am I reading this correctly that you're seeing more blocks than
> > > expected? If so, preallocation perhaps?
> > 
> > Yes, you're correct, I see more blocks after godown than before godown.
> > 
> > I tried adding "-o allocsize=4k" to MOUNT_OPTIONS, it works but not
> > always. e.g. on a host with sunit/swidth reported from underlying block
> > device, test still fails.
> > 
> 
> I'm not quite sure where the preallocation is coming from in that case.
> It looks like it should honor allocsize, so that might be worth looking
> into.
> 
> > # xfs_info /mnt/xfs
> > meta-data=/dev/mapper/systemvg-testlv2 isize=512    agcount=16, agsize=245696 blks
> >          =                       sectsz=512   attr=2, projid32bit=1
> >          =                       crc=1        finobt=1 spinodes=0 rmapbt=0
> >          =                       reflink=0
> > data     =                       bsize=4096   blocks=3931136, imaxpct=25
> >          =                       sunit=64     swidth=192 blks
> > naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> > log      =internal               bsize=4096   blocks=2560, version=2
> >          =                       sectsz=512   sunit=64 blks, lazy-count=1
> > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > 
> > Part of the test diff:
> >  ==== i_size 1024 test with fsync ====
> > +6c6
> > +< stat.blocks = 8200
> > +---
> > +> stat.blocks = 8704
> > 
> > On the other hand, adding "-f" to godown always works for me.
> >
> 
> I'm guessing the difference here is that fsync flushes the inode with
> preallocation, but preallocation is typically cleaned up on file close
> (when xfs_io exits). So a subsequent log flush at shutdown may flush
> the transaction that clears out post-eof blocks. Note that it may also
> hit the disk without the log forcing shutdown, it's just not guaranteed
> in that case.
> 
> The right thing to do is probably deal with preallocation explicitly in
> the test. E.g., a truncate of the file to the current size after a
> potentially preallocated write, but before the fsync, should always
> result in an equivalent blocks count post-recovery.
> 
> (Another angle here might be to consider whether the block count is the
> right metric for this test, as opposed to a hexdump that validates the
> written data actually made it to disk.).

So, it seems it'd be better to avoid checking i_blocks of XFS because of such
the preallocation, but for F2FS, I'd prefer to check i_blocks mismatch to detect
any regression.

Let me submit v2.

Thanks,

  parent reply	other threads:[~2016-11-17 19:17 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-17  3:27 [PATCH] generic/391: check inode metadata on f{data}sync after power-cut Jaegeuk Kim
2016-11-17  8:35 ` Eryu Guan
2016-11-17 12:56   ` Brian Foster
2016-11-17 12:56     ` Brian Foster
2016-11-17 14:00     ` Eryu Guan
2016-11-17 14:00       ` Eryu Guan
2016-11-17 16:32       ` Brian Foster
2016-11-17 16:32         ` Brian Foster
2016-11-17 16:51         ` Eryu Guan
2016-11-17 16:51           ` Eryu Guan
2016-11-17 19:17         ` Jaegeuk Kim [this message]
2016-11-17 18:31   ` Jaegeuk Kim
2016-11-17 19:20 ` [PATCH v2] " Jaegeuk Kim
2016-11-17 19:20   ` Jaegeuk Kim
2016-11-18  6:39   ` Eryu Guan
2016-11-18 19:44     ` Jaegeuk Kim
2016-11-19  0:42       ` Brian Foster
2016-11-19  1:56         ` Jaegeuk Kim
2016-11-18 19:45   ` [f2fs-dev] [PATCH v3] " Jaegeuk Kim
2016-11-19  1:57     ` [f2fs-dev] [PATCH v4] " Jaegeuk Kim
2016-11-20 21:19       ` Dave Chinner
2016-11-21 20:00         ` Jaegeuk Kim
2016-11-21 20:02       ` [f2fs-dev] [PATCH v5] " Jaegeuk Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161117191705.GD74211@jaegeuk \
    --to=jaegeuk@kernel.org \
    --cc=bfoster@redhat.com \
    --cc=eguan@redhat.com \
    --cc=fstests@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.