All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] generic: add test for fsync after shrinking truncate and rename
@ 2019-03-04 14:06 fdmanana
  2019-03-04 15:04 ` Amir Goldstein
  2019-03-05  9:26 ` [PATCH v2] " fdmanana
  0 siblings, 2 replies; 22+ messages in thread
From: fdmanana @ 2019-03-04 14:06 UTC (permalink / raw)
  To: fstests; +Cc: linux-btrfs, Filipe Manana

From: Filipe Manana <fdmanana@suse.com>

Test that if we truncate a file to reduce its size, rename it and then
fsync it, after a power failure the file has a correct size and name.

This test is motivated by a bug found in btrfs, which is fixed by a
patch for the linux kernel titled:

  "Btrfs: fix incorrect file size after shrinking truncate and fsync"

This test currently passes on ext4, xfs, f2fs and patched btrfs.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 tests/generic/532     | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/532.out |  8 +++++++
 tests/generic/group   |  1 +
 3 files changed, 75 insertions(+)
 create mode 100755 tests/generic/532
 create mode 100644 tests/generic/532.out

diff --git a/tests/generic/532 b/tests/generic/532
new file mode 100755
index 00000000..64992d85
--- /dev/null
+++ b/tests/generic/532
@@ -0,0 +1,66 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test No. 532
+#
+# Test that if we truncate a file to reduce its size, rename it and then fsync
+# it, after a power failure the file has a correct size and name.
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	_cleanup_flakey
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+
+rm -f $seqres.full
+
+_scratch_mkfs >>$seqres.full 2>&1
+_require_metadata_journaling $SCRATCH_DEV
+_init_flakey
+_mount_flakey
+
+# Create our test file with an initial size of 8000 bytes, then fsync it,
+# followed by a truncate that reduces its size down to 3000 bytes.
+$XFS_IO_PROG -f -c "pwrite -S 0xab 0 8000" \
+	     -c "fsync" \
+	     -c "truncate 3000" \
+	     $SCRATCH_MNT/foo | _filter_xfs_io
+
+# Now rename the file and fsync it again.
+mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
+
+# Simulate a power failure and mount the filesystem to check that the file was
+# persisted with the new name and has a size of 3000 bytes.
+_flakey_drop_and_remount
+
+[ -f $SCRATCH_MNT/bar ] || echo "file name 'bar' is missing"
+[ -f $SCRATCH_MNT/foo ] && echo "file name 'foo' still exists"
+
+echo "File content after power failure:"
+od -A d -t x1 $SCRATCH_MNT/bar
+
+_unmount_flakey
+
+status=0
+exit
diff --git a/tests/generic/532.out b/tests/generic/532.out
new file mode 100644
index 00000000..554fbe2a
--- /dev/null
+++ b/tests/generic/532.out
@@ -0,0 +1,8 @@
+QA output created by 532
+wrote 8000/8000 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+File content after power failure:
+0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
+*
+0002992 ab ab ab ab ab ab ab ab
+0003000
diff --git a/tests/generic/group b/tests/generic/group
index 31011ac8..7d63f303 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -534,3 +534,4 @@
 529 auto quick attr
 530 auto quick unlink
 531 auto quick unlink
+532 auto quick log
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-04 14:06 [PATCH] generic: add test for fsync after shrinking truncate and rename fdmanana
@ 2019-03-04 15:04 ` Amir Goldstein
  2019-03-04 15:23   ` Filipe Manana
  2019-03-05  0:50   ` Dave Chinner
  2019-03-05  9:26 ` [PATCH v2] " fdmanana
  1 sibling, 2 replies; 22+ messages in thread
From: Amir Goldstein @ 2019-03-04 15:04 UTC (permalink / raw)
  To: fdmanana; +Cc: fstests, Linux Btrfs, Filipe Manana

On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
>
> From: Filipe Manana <fdmanana@suse.com>
>
> Test that if we truncate a file to reduce its size, rename it and then
> fsync it, after a power failure the file has a correct size and name.
>

I am not sure that ext4/xfs semantics guaranty anything about
persisting file name after fsync of file?...

> This test is motivated by a bug found in btrfs, which is fixed by a
> patch for the linux kernel titled:
>
>   "Btrfs: fix incorrect file size after shrinking truncate and fsync"

At least the title of this patch says nothing about persisting the
name.

>
> This test currently passes on ext4, xfs, f2fs and patched btrfs.
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  tests/generic/532     | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/532.out |  8 +++++++
>  tests/generic/group   |  1 +
>  3 files changed, 75 insertions(+)
>  create mode 100755 tests/generic/532
>  create mode 100644 tests/generic/532.out
>
> diff --git a/tests/generic/532 b/tests/generic/532
> new file mode 100755
> index 00000000..64992d85
> --- /dev/null
> +++ b/tests/generic/532
> @@ -0,0 +1,66 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved.
> +#
> +# FS QA Test No. 532
> +#
> +# Test that if we truncate a file to reduce its size, rename it and then fsync
> +# it, after a power failure the file has a correct size and name.
> +#
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +tmp=/tmp/$$
> +status=1       # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +       _cleanup_flakey
> +       cd /
> +       rm -f $tmp.*
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmflakey
> +
> +# real QA test starts here
> +_supported_fs generic
> +_supported_os Linux
> +_require_scratch
> +_require_dm_target flakey
> +
> +rm -f $seqres.full
> +
> +_scratch_mkfs >>$seqres.full 2>&1
> +_require_metadata_journaling $SCRATCH_DEV
> +_init_flakey
> +_mount_flakey
> +
> +# Create our test file with an initial size of 8000 bytes, then fsync it,
> +# followed by a truncate that reduces its size down to 3000 bytes.
> +$XFS_IO_PROG -f -c "pwrite -S 0xab 0 8000" \
> +            -c "fsync" \
> +            -c "truncate 3000" \
> +            $SCRATCH_MNT/foo | _filter_xfs_io
> +
> +# Now rename the file and fsync it again.
> +mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
> +$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
> +
> +# Simulate a power failure and mount the filesystem to check that the file was
> +# persisted with the new name and has a size of 3000 bytes.
> +_flakey_drop_and_remount
> +
> +[ -f $SCRATCH_MNT/bar ] || echo "file name 'bar' is missing"
> +[ -f $SCRATCH_MNT/foo ] && echo "file name 'foo' still exists"
> +
> +echo "File content after power failure:"
> +od -A d -t x1 $SCRATCH_MNT/bar
> +
> +_unmount_flakey
> +
> +status=0
> +exit
> diff --git a/tests/generic/532.out b/tests/generic/532.out
> new file mode 100644
> index 00000000..554fbe2a
> --- /dev/null
> +++ b/tests/generic/532.out
> @@ -0,0 +1,8 @@
> +QA output created by 532
> +wrote 8000/8000 bytes at offset 0
> +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> +File content after power failure:
> +0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
> +*
> +0002992 ab ab ab ab ab ab ab ab
> +0003000
> diff --git a/tests/generic/group b/tests/generic/group
> index 31011ac8..7d63f303 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -534,3 +534,4 @@
>  529 auto quick attr
>  530 auto quick unlink
>  531 auto quick unlink
> +532 auto quick log
> --
> 2.11.0
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-04 15:04 ` Amir Goldstein
@ 2019-03-04 15:23   ` Filipe Manana
  2019-03-04 17:59     ` Amir Goldstein
  2019-03-05  0:50   ` Dave Chinner
  1 sibling, 1 reply; 22+ messages in thread
From: Filipe Manana @ 2019-03-04 15:23 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: fstests, Linux Btrfs, Filipe Manana

On Mon, Mar 4, 2019 at 3:04 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> >
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > Test that if we truncate a file to reduce its size, rename it and then
> > fsync it, after a power failure the file has a correct size and name.
> >
>
> I am not sure that ext4/xfs semantics guaranty anything about
> persisting file name after fsync of file?...
>
> > This test is motivated by a bug found in btrfs, which is fixed by a
> > patch for the linux kernel titled:
> >
> >   "Btrfs: fix incorrect file size after shrinking truncate and fsync"
>
> At least the title of this patch says nothing about persisting the
> name.

Because the bug in btrfs is not about persisting the name, it's about
persisting the correct inode size.

>
> >
> > This test currently passes on ext4, xfs, f2fs and patched btrfs.
> >
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >  tests/generic/532     | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/532.out |  8 +++++++
> >  tests/generic/group   |  1 +
> >  3 files changed, 75 insertions(+)
> >  create mode 100755 tests/generic/532
> >  create mode 100644 tests/generic/532.out
> >
> > diff --git a/tests/generic/532 b/tests/generic/532
> > new file mode 100755
> > index 00000000..64992d85
> > --- /dev/null
> > +++ b/tests/generic/532
> > @@ -0,0 +1,66 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved.
> > +#
> > +# FS QA Test No. 532
> > +#
> > +# Test that if we truncate a file to reduce its size, rename it and then fsync
> > +# it, after a power failure the file has a correct size and name.
> > +#
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +tmp=/tmp/$$
> > +status=1       # failure is the default!
> > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > +       _cleanup_flakey
> > +       cd /
> > +       rm -f $tmp.*
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/filter
> > +. ./common/dmflakey
> > +
> > +# real QA test starts here
> > +_supported_fs generic
> > +_supported_os Linux
> > +_require_scratch
> > +_require_dm_target flakey
> > +
> > +rm -f $seqres.full
> > +
> > +_scratch_mkfs >>$seqres.full 2>&1
> > +_require_metadata_journaling $SCRATCH_DEV
> > +_init_flakey
> > +_mount_flakey
> > +
> > +# Create our test file with an initial size of 8000 bytes, then fsync it,
> > +# followed by a truncate that reduces its size down to 3000 bytes.
> > +$XFS_IO_PROG -f -c "pwrite -S 0xab 0 8000" \
> > +            -c "fsync" \
> > +            -c "truncate 3000" \
> > +            $SCRATCH_MNT/foo | _filter_xfs_io
> > +
> > +# Now rename the file and fsync it again.
> > +mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
> > +$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
> > +
> > +# Simulate a power failure and mount the filesystem to check that the file was
> > +# persisted with the new name and has a size of 3000 bytes.
> > +_flakey_drop_and_remount
> > +
> > +[ -f $SCRATCH_MNT/bar ] || echo "file name 'bar' is missing"
> > +[ -f $SCRATCH_MNT/foo ] && echo "file name 'foo' still exists"
> > +
> > +echo "File content after power failure:"
> > +od -A d -t x1 $SCRATCH_MNT/bar
> > +
> > +_unmount_flakey
> > +
> > +status=0
> > +exit
> > diff --git a/tests/generic/532.out b/tests/generic/532.out
> > new file mode 100644
> > index 00000000..554fbe2a
> > --- /dev/null
> > +++ b/tests/generic/532.out
> > @@ -0,0 +1,8 @@
> > +QA output created by 532
> > +wrote 8000/8000 bytes at offset 0
> > +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > +File content after power failure:
> > +0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
> > +*
> > +0002992 ab ab ab ab ab ab ab ab
> > +0003000
> > diff --git a/tests/generic/group b/tests/generic/group
> > index 31011ac8..7d63f303 100644
> > --- a/tests/generic/group
> > +++ b/tests/generic/group
> > @@ -534,3 +534,4 @@
> >  529 auto quick attr
> >  530 auto quick unlink
> >  531 auto quick unlink
> > +532 auto quick log
> > --
> > 2.11.0
> >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-04 15:23   ` Filipe Manana
@ 2019-03-04 17:59     ` Amir Goldstein
  2019-03-04 22:30       ` Filipe Manana
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2019-03-04 17:59 UTC (permalink / raw)
  To: Filipe Manana; +Cc: fstests, Linux Btrfs, Filipe Manana

On Mon, Mar 4, 2019 at 5:23 PM Filipe Manana <fdmanana@kernel.org> wrote:
>
> On Mon, Mar 4, 2019 at 3:04 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > >
> > > From: Filipe Manana <fdmanana@suse.com>
> > >
> > > Test that if we truncate a file to reduce its size, rename it and then
> > > fsync it, after a power failure the file has a correct size and name.
> > >
> >
> > I am not sure that ext4/xfs semantics guaranty anything about
> > persisting file name after fsync of file?...
> >
> > > This test is motivated by a bug found in btrfs, which is fixed by a
> > > patch for the linux kernel titled:
> > >
> > >   "Btrfs: fix incorrect file size after shrinking truncate and fsync"
> >
> > At least the title of this patch says nothing about persisting the
> > name.
>
> Because the bug in btrfs is not about persisting the name, it's about
> persisting the correct inode size.
>

OK. Still AFAIK, ext4/xfs don't guaranty that the name bar will exist
after power failure unless the parent dir was fsynced.
If you fsync the parent dir instead of the file after rename, it will
make the test correct for ext4/xfs (and POSIX for that matter).
Will that test modification still catch the btrfs bug (pre fix)?

Thanks,
Amir.

> >
> > >
> > > This test currently passes on ext4, xfs, f2fs and patched btrfs.
> > >
> > > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > > ---
> > >  tests/generic/532     | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/532.out |  8 +++++++
> > >  tests/generic/group   |  1 +
> > >  3 files changed, 75 insertions(+)
> > >  create mode 100755 tests/generic/532
> > >  create mode 100644 tests/generic/532.out
> > >
> > > diff --git a/tests/generic/532 b/tests/generic/532
> > > new file mode 100755
> > > index 00000000..64992d85
> > > --- /dev/null
> > > +++ b/tests/generic/532
> > > @@ -0,0 +1,66 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved.
> > > +#
> > > +# FS QA Test No. 532
> > > +#
> > > +# Test that if we truncate a file to reduce its size, rename it and then fsync
> > > +# it, after a power failure the file has a correct size and name.
> > > +#
> > > +seq=`basename $0`
> > > +seqres=$RESULT_DIR/$seq
> > > +echo "QA output created by $seq"
> > > +tmp=/tmp/$$
> > > +status=1       # failure is the default!
> > > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > > +
> > > +_cleanup()
> > > +{
> > > +       _cleanup_flakey
> > > +       cd /
> > > +       rm -f $tmp.*
> > > +}
> > > +
> > > +# get standard environment, filters and checks
> > > +. ./common/rc
> > > +. ./common/filter
> > > +. ./common/dmflakey
> > > +
> > > +# real QA test starts here
> > > +_supported_fs generic
> > > +_supported_os Linux
> > > +_require_scratch
> > > +_require_dm_target flakey
> > > +
> > > +rm -f $seqres.full
> > > +
> > > +_scratch_mkfs >>$seqres.full 2>&1
> > > +_require_metadata_journaling $SCRATCH_DEV
> > > +_init_flakey
> > > +_mount_flakey
> > > +
> > > +# Create our test file with an initial size of 8000 bytes, then fsync it,
> > > +# followed by a truncate that reduces its size down to 3000 bytes.
> > > +$XFS_IO_PROG -f -c "pwrite -S 0xab 0 8000" \
> > > +            -c "fsync" \
> > > +            -c "truncate 3000" \
> > > +            $SCRATCH_MNT/foo | _filter_xfs_io
> > > +
> > > +# Now rename the file and fsync it again.
> > > +mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
> > > +$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
> > > +
> > > +# Simulate a power failure and mount the filesystem to check that the file was
> > > +# persisted with the new name and has a size of 3000 bytes.
> > > +_flakey_drop_and_remount
> > > +
> > > +[ -f $SCRATCH_MNT/bar ] || echo "file name 'bar' is missing"
> > > +[ -f $SCRATCH_MNT/foo ] && echo "file name 'foo' still exists"
> > > +
> > > +echo "File content after power failure:"
> > > +od -A d -t x1 $SCRATCH_MNT/bar
> > > +
> > > +_unmount_flakey
> > > +
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/532.out b/tests/generic/532.out
> > > new file mode 100644
> > > index 00000000..554fbe2a
> > > --- /dev/null
> > > +++ b/tests/generic/532.out
> > > @@ -0,0 +1,8 @@
> > > +QA output created by 532
> > > +wrote 8000/8000 bytes at offset 0
> > > +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > > +File content after power failure:
> > > +0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
> > > +*
> > > +0002992 ab ab ab ab ab ab ab ab
> > > +0003000
> > > diff --git a/tests/generic/group b/tests/generic/group
> > > index 31011ac8..7d63f303 100644
> > > --- a/tests/generic/group
> > > +++ b/tests/generic/group
> > > @@ -534,3 +534,4 @@
> > >  529 auto quick attr
> > >  530 auto quick unlink
> > >  531 auto quick unlink
> > > +532 auto quick log
> > > --
> > > 2.11.0
> > >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-04 17:59     ` Amir Goldstein
@ 2019-03-04 22:30       ` Filipe Manana
  2019-03-05  5:59         ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Filipe Manana @ 2019-03-04 22:30 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: fstests, Linux Btrfs, Filipe Manana, Dave Chinner

On Mon, Mar 4, 2019 at 5:59 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Mar 4, 2019 at 5:23 PM Filipe Manana <fdmanana@kernel.org> wrote:
> >
> > On Mon, Mar 4, 2019 at 3:04 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > > >
> > > > From: Filipe Manana <fdmanana@suse.com>
> > > >
> > > > Test that if we truncate a file to reduce its size, rename it and then
> > > > fsync it, after a power failure the file has a correct size and name.
> > > >
> > >
> > > I am not sure that ext4/xfs semantics guaranty anything about
> > > persisting file name after fsync of file?...
> > >
> > > > This test is motivated by a bug found in btrfs, which is fixed by a
> > > > patch for the linux kernel titled:
> > > >
> > > >   "Btrfs: fix incorrect file size after shrinking truncate and fsync"
> > >
> > > At least the title of this patch says nothing about persisting the
> > > name.
> >
> > Because the bug in btrfs is not about persisting the name, it's about
> > persisting the correct inode size.
> >
>
> OK. Still AFAIK, ext4/xfs don't guaranty that the name bar will exist
> after power failure unless the parent dir was fsynced.
> If you fsync the parent dir instead of the file after rename, it will
> make the test correct for ext4/xfs (and POSIX for that matter).

I'm not so sure about that. Couldn't find anything explicit anywhere about that.
There are other tests that do similar assumptions, either after
renames or after adding a new
hard link (fsyncing just the file being enough to guarantee the new
hard link exits after a power
failure, without the need to fsync the parent directory). I don't
recall ever having had such
comment before, even for much more complex cases involving renames
such as [1] for example.

[1] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/commit/?id=e2d866668a98af4cd47cd6913418e410cd80aa93

In this case both old and new names are in the same parent directory,
but if they were in different directories, and
requiring the user/application to fsync both directories, it could be
dangerous, if the old parent dir is fsync'ed and
a power failure happened before fsync'ing the new parent (requiring
the user/application to know about this
and fsync the new parent before the old parent dir also seems very
demanding), leading to a potential file loss
after replaying the log/journal (there were such cases in btrfs before).

> Will that test modification still catch the btrfs bug (pre fix)?

Yes, it will still trigger the btrfs bug.

Well, I don't mind doing such change, but I would like to hear other
opinions too, perhaps Dave could shed some light on this?
Even old reiserfs and f2fs (both on posix and strict fsync modes)
currently pass this test.

Thanks.

>
> Thanks,
> Amir.
>
> > >
> > > >
> > > > This test currently passes on ext4, xfs, f2fs and patched btrfs.
> > > >
> > > > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > > > ---
> > > >  tests/generic/532     | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  tests/generic/532.out |  8 +++++++
> > > >  tests/generic/group   |  1 +
> > > >  3 files changed, 75 insertions(+)
> > > >  create mode 100755 tests/generic/532
> > > >  create mode 100644 tests/generic/532.out
> > > >
> > > > diff --git a/tests/generic/532 b/tests/generic/532
> > > > new file mode 100755
> > > > index 00000000..64992d85
> > > > --- /dev/null
> > > > +++ b/tests/generic/532
> > > > @@ -0,0 +1,66 @@
> > > > +#! /bin/bash
> > > > +# SPDX-License-Identifier: GPL-2.0
> > > > +# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved.
> > > > +#
> > > > +# FS QA Test No. 532
> > > > +#
> > > > +# Test that if we truncate a file to reduce its size, rename it and then fsync
> > > > +# it, after a power failure the file has a correct size and name.
> > > > +#
> > > > +seq=`basename $0`
> > > > +seqres=$RESULT_DIR/$seq
> > > > +echo "QA output created by $seq"
> > > > +tmp=/tmp/$$
> > > > +status=1       # failure is the default!
> > > > +trap "_cleanup; exit \$status" 0 1 2 3 15
> > > > +
> > > > +_cleanup()
> > > > +{
> > > > +       _cleanup_flakey
> > > > +       cd /
> > > > +       rm -f $tmp.*
> > > > +}
> > > > +
> > > > +# get standard environment, filters and checks
> > > > +. ./common/rc
> > > > +. ./common/filter
> > > > +. ./common/dmflakey
> > > > +
> > > > +# real QA test starts here
> > > > +_supported_fs generic
> > > > +_supported_os Linux
> > > > +_require_scratch
> > > > +_require_dm_target flakey
> > > > +
> > > > +rm -f $seqres.full
> > > > +
> > > > +_scratch_mkfs >>$seqres.full 2>&1
> > > > +_require_metadata_journaling $SCRATCH_DEV
> > > > +_init_flakey
> > > > +_mount_flakey
> > > > +
> > > > +# Create our test file with an initial size of 8000 bytes, then fsync it,
> > > > +# followed by a truncate that reduces its size down to 3000 bytes.
> > > > +$XFS_IO_PROG -f -c "pwrite -S 0xab 0 8000" \
> > > > +            -c "fsync" \
> > > > +            -c "truncate 3000" \
> > > > +            $SCRATCH_MNT/foo | _filter_xfs_io
> > > > +
> > > > +# Now rename the file and fsync it again.
> > > > +mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
> > > > +$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
> > > > +
> > > > +# Simulate a power failure and mount the filesystem to check that the file was
> > > > +# persisted with the new name and has a size of 3000 bytes.
> > > > +_flakey_drop_and_remount
> > > > +
> > > > +[ -f $SCRATCH_MNT/bar ] || echo "file name 'bar' is missing"
> > > > +[ -f $SCRATCH_MNT/foo ] && echo "file name 'foo' still exists"
> > > > +
> > > > +echo "File content after power failure:"
> > > > +od -A d -t x1 $SCRATCH_MNT/bar
> > > > +
> > > > +_unmount_flakey
> > > > +
> > > > +status=0
> > > > +exit
> > > > diff --git a/tests/generic/532.out b/tests/generic/532.out
> > > > new file mode 100644
> > > > index 00000000..554fbe2a
> > > > --- /dev/null
> > > > +++ b/tests/generic/532.out
> > > > @@ -0,0 +1,8 @@
> > > > +QA output created by 532
> > > > +wrote 8000/8000 bytes at offset 0
> > > > +XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > > > +File content after power failure:
> > > > +0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
> > > > +*
> > > > +0002992 ab ab ab ab ab ab ab ab
> > > > +0003000
> > > > diff --git a/tests/generic/group b/tests/generic/group
> > > > index 31011ac8..7d63f303 100644
> > > > --- a/tests/generic/group
> > > > +++ b/tests/generic/group
> > > > @@ -534,3 +534,4 @@
> > > >  529 auto quick attr
> > > >  530 auto quick unlink
> > > >  531 auto quick unlink
> > > > +532 auto quick log
> > > > --
> > > > 2.11.0
> > > >

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-04 15:04 ` Amir Goldstein
  2019-03-04 15:23   ` Filipe Manana
@ 2019-03-05  0:50   ` Dave Chinner
  2019-03-05  1:00     ` Dave Chinner
  2019-03-05  5:39     ` Amir Goldstein
  1 sibling, 2 replies; 22+ messages in thread
From: Dave Chinner @ 2019-03-05  0:50 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: fdmanana, fstests, Linux Btrfs, Filipe Manana

On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote:
> On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> >
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > Test that if we truncate a file to reduce its size, rename it and then
> > fsync it, after a power failure the file has a correct size and name.
> >
> 
> I am not sure that ext4/xfs semantics guaranty anything about
> persisting file name after fsync of file?...

They do.  It's that pesky "strictly ordered metadata" thing I keep
having to explain to people...

i.e. if you fsync an inode, then you are persisting all the changes
needed to reference that file and it's data. And so if there was a
rename in the history of that file, then that is persisted, too.
Which means that both the original and the new directory
modifications are persisted, too.

*POSIX* doesn't require this - it says that if you O_DSYNC data,
then it also includes all the metadata needed to reference that
data. So even if the data is there, POSIX doesn't define whether the
rename is there or noti, just that you can get to the fsync'd data
via either the old or new name. IOWs, POSIX allows the behaviour to
be implementation specific.

In this case, file systems with strictly ordered metadata will end
up making the rename visible because the rename occurred before the
truncate that the fsync() is persisting...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-05  0:50   ` Dave Chinner
@ 2019-03-05  1:00     ` Dave Chinner
  2019-03-05  1:08       ` Vijay Chidambaram
  2019-03-05  5:39     ` Amir Goldstein
  1 sibling, 1 reply; 22+ messages in thread
From: Dave Chinner @ 2019-03-05  1:00 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: fdmanana, fstests, Linux Btrfs, Filipe Manana

On Tue, Mar 05, 2019 at 11:50:20AM +1100, Dave Chinner wrote:
> On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote:
> > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > >
> > > From: Filipe Manana <fdmanana@suse.com>
> > >
> > > Test that if we truncate a file to reduce its size, rename it and then
> > > fsync it, after a power failure the file has a correct size and name.
> > >
> > 
> > I am not sure that ext4/xfs semantics guaranty anything about
> > persisting file name after fsync of file?...
> 
> They do.  It's that pesky "strictly ordered metadata" thing I keep
> having to explain to people...

https://marc.info/?l=fstests&m=155010885626284&w=2

-Dave.

> 
> i.e. if you fsync an inode, then you are persisting all the changes
> needed to reference that file and it's data. And so if there was a
> rename in the history of that file, then that is persisted, too.
> Which means that both the original and the new directory
> modifications are persisted, too.
> 
> *POSIX* doesn't require this - it says that if you O_DSYNC data,
> then it also includes all the metadata needed to reference that
> data. So even if the data is there, POSIX doesn't define whether the
> rename is there or noti, just that you can get to the fsync'd data
> via either the old or new name. IOWs, POSIX allows the behaviour to
> be implementation specific.
> 
> In this case, file systems with strictly ordered metadata will end
> up making the rename visible because the rename occurred before the
> truncate that the fsync() is persisting...
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 

-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-05  1:00     ` Dave Chinner
@ 2019-03-05  1:08       ` Vijay Chidambaram
  0 siblings, 0 replies; 22+ messages in thread
From: Vijay Chidambaram @ 2019-03-05  1:08 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Amir Goldstein, Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Mon, Mar 4, 2019 at 7:00 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Tue, Mar 05, 2019 at 11:50:20AM +1100, Dave Chinner wrote:
> > On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote:
> > > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > > >
> > > > From: Filipe Manana <fdmanana@suse.com>
> > > >
> > > > Test that if we truncate a file to reduce its size, rename it and then
> > > > fsync it, after a power failure the file has a correct size and name.
> > > >
> > >
> > > I am not sure that ext4/xfs semantics guaranty anything about
> > > persisting file name after fsync of file?...
> >
> > They do.  It's that pesky "strictly ordered metadata" thing I keep
> > having to explain to people...
>
> https://marc.info/?l=fstests&m=155010885626284&w=2

With reference to that previous conversation, Jayashree and I are
planning to submit a patch adding something like
Documentation/filesystems/crash-recovery.txt to document crash
guarantees and strictly ordered metadata semantics soon! So hopefully
there will be in-kernel documentation we can point folks to soon.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-05  0:50   ` Dave Chinner
  2019-03-05  1:00     ` Dave Chinner
@ 2019-03-05  5:39     ` Amir Goldstein
  2019-03-05 22:33       ` Dave Chinner
  1 sibling, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2019-03-05  5:39 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Tue, Mar 5, 2019 at 2:50 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote:
> > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > >
> > > From: Filipe Manana <fdmanana@suse.com>
> > >
> > > Test that if we truncate a file to reduce its size, rename it and then
> > > fsync it, after a power failure the file has a correct size and name.
> > >
> >
> > I am not sure that ext4/xfs semantics guaranty anything about
> > persisting file name after fsync of file?...
>
> They do.  It's that pesky "strictly ordered metadata" thing I keep
> having to explain to people...
>
> i.e. if you fsync an inode, then you are persisting all the changes
> needed to reference that file and it's data. And so if there was a
> rename in the history of that file, then that is persisted, too.
> Which means that both the original and the new directory
> modifications are persisted, too.
>
> *POSIX* doesn't require this - it says that if you O_DSYNC data,
> then it also includes all the metadata needed to reference that
> data. So even if the data is there, POSIX doesn't define whether the
> rename is there or noti, just that you can get to the fsync'd data
> via either the old or new name. IOWs, POSIX allows the behaviour to
> be implementation specific.
>
> In this case, file systems with strictly ordered metadata will end
> up making the rename visible because the rename occurred before the
> truncate that the fsync() is persisting...
>

That is not what is happening in Filipe's test. Test has:
- ftruncate A
- fsync A
- rename A B
- fsync B

So the reason this is working is because 2nd fsync needs to
persist ctime of B and not because it needs to persist the
truncate.

XFS does it, but it doesn't seem like something that any
filesystem is guaranteed to do the same:
        /*
         * We always want to hit the ctime on the source inode.
         *
         * This isn't strictly required by the standards since the source
         * inode isn't really being changed, but old unix file systems did
         * it and some incremental backup programs won't work without it.
         */
        xfs_trans_ichgtime(tp, src_ip, XFS_ICHGTIME_CHG);

So for the purpose of the test itself, which needs to guaranty that
btrfs persists the size, fsync of parent would be more robust for
any filesystem.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-04 22:30       ` Filipe Manana
@ 2019-03-05  5:59         ` Amir Goldstein
  2019-03-05  9:26           ` Filipe Manana
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2019-03-05  5:59 UTC (permalink / raw)
  To: Filipe Manana; +Cc: fstests, Linux Btrfs, Filipe Manana, Dave Chinner

On Tue, Mar 5, 2019 at 12:31 AM Filipe Manana <fdmanana@kernel.org> wrote:
>
> On Mon, Mar 4, 2019 at 5:59 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Mar 4, 2019 at 5:23 PM Filipe Manana <fdmanana@kernel.org> wrote:
> > >
> > > On Mon, Mar 4, 2019 at 3:04 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > > > >
> > > > > From: Filipe Manana <fdmanana@suse.com>
> > > > >
> > > > > Test that if we truncate a file to reduce its size, rename it and then
> > > > > fsync it, after a power failure the file has a correct size and name.
> > > > >
> > > >
> > > > I am not sure that ext4/xfs semantics guaranty anything about
> > > > persisting file name after fsync of file?...
> > > >
> > > > > This test is motivated by a bug found in btrfs, which is fixed by a
> > > > > patch for the linux kernel titled:
> > > > >
> > > > >   "Btrfs: fix incorrect file size after shrinking truncate and fsync"
> > > >
> > > > At least the title of this patch says nothing about persisting the
> > > > name.
> > >
> > > Because the bug in btrfs is not about persisting the name, it's about
> > > persisting the correct inode size.
> > >
> >
> > OK. Still AFAIK, ext4/xfs don't guaranty that the name bar will exist
> > after power failure unless the parent dir was fsynced.
> > If you fsync the parent dir instead of the file after rename, it will
> > make the test correct for ext4/xfs (and POSIX for that matter).
>
> I'm not so sure about that. Couldn't find anything explicit anywhere about that.

man fsync(2):
"       Calling  fsync() does not necessarily ensure that the entry in
the directory
        containing the file has also reached disk.  For that an
explicit fsync() on a file
        descriptor for the directory is also needed."

> There are other tests that do similar assumptions, either after
> renames or after adding a new
> hard link (fsyncing just the file being enough to guarantee the new
> hard link exits after a power
> failure, without the need to fsync the parent directory). I don't
> recall ever having had such comment before,

Because I missed it in review...

> even for much more complex cases involving renames
> such as [1] for example.
>
> [1] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/commit/?id=e2d866668a98af4cd47cd6913418e410cd80aa93
>
> In this case both old and new names are in the same parent directory,
> but if they were in different directories, and
> requiring the user/application to fsync both directories, it could be
> dangerous, if the old parent dir is fsync'ed and
> a power failure happened before fsync'ing the new parent (requiring
> the user/application to know about this
> and fsync the new parent before the old parent dir also seems very
> demanding), leading to a potential file loss
> after replaying the log/journal (there were such cases in btrfs before).
>

This is not a problem for ext4/xfs.
fsync(newdir) guarantees persisting new filename
metadata ordering implies that olddir changes (and file nlink)
also persist.

To meet requirement of both btrfs and ext4/xfs in that case
your test may fsync both file AND newdir, as long as this
still allows your test to catch the bug?

> > Will that test modification still catch the btrfs bug (pre fix)?
>
> Yes, it will still trigger the btrfs bug.
>
> Well, I don't mind doing such change, but I would like to hear other
> opinions too, perhaps Dave could shed some light on this?
> Even old reiserfs and f2fs (both on posix and strict fsync modes)
> currently pass this test.
>

As I wrote to Dave, if file is not "metadata dirty" before rename,
then whether or not rename dirties to file for fsync is a filesystem
specific implementation detail that is not in any standard.

Since filesystems tend to try to optimize out unneeded journal
commits, so the observation about what filesystems do today
empirically is not a sufficiently strong argument.

Again, if your test can afford to do fsync of file and parent dir
it is best to be on the safe side (please audit other similar tests
that you know about). If your test cannot afford to fsync parent
dir, then at least a fat comment about this issue is worth about
the expected (but not guaranteed) effects of fsync of file.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-05  5:59         ` Amir Goldstein
@ 2019-03-05  9:26           ` Filipe Manana
  2019-03-05 10:51             ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Filipe Manana @ 2019-03-05  9:26 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: fstests, Linux Btrfs, Filipe Manana, Dave Chinner

On Tue, Mar 5, 2019 at 5:59 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Tue, Mar 5, 2019 at 12:31 AM Filipe Manana <fdmanana@kernel.org> wrote:
> >
> > On Mon, Mar 4, 2019 at 5:59 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Mon, Mar 4, 2019 at 5:23 PM Filipe Manana <fdmanana@kernel.org> wrote:
> > > >
> > > > On Mon, Mar 4, 2019 at 3:04 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > > > > >
> > > > > > From: Filipe Manana <fdmanana@suse.com>
> > > > > >
> > > > > > Test that if we truncate a file to reduce its size, rename it and then
> > > > > > fsync it, after a power failure the file has a correct size and name.
> > > > > >
> > > > >
> > > > > I am not sure that ext4/xfs semantics guaranty anything about
> > > > > persisting file name after fsync of file?...
> > > > >
> > > > > > This test is motivated by a bug found in btrfs, which is fixed by a
> > > > > > patch for the linux kernel titled:
> > > > > >
> > > > > >   "Btrfs: fix incorrect file size after shrinking truncate and fsync"
> > > > >
> > > > > At least the title of this patch says nothing about persisting the
> > > > > name.
> > > >
> > > > Because the bug in btrfs is not about persisting the name, it's about
> > > > persisting the correct inode size.
> > > >
> > >
> > > OK. Still AFAIK, ext4/xfs don't guaranty that the name bar will exist
> > > after power failure unless the parent dir was fsynced.
> > > If you fsync the parent dir instead of the file after rename, it will
> > > make the test correct for ext4/xfs (and POSIX for that matter).
> >
> > I'm not so sure about that. Couldn't find anything explicit anywhere about that.
>
> man fsync(2):
> "       Calling  fsync() does not necessarily ensure that the entry in
> the directory
>         containing the file has also reached disk.  For that an
> explicit fsync() on a file
>         descriptor for the directory is also needed."
>
> > There are other tests that do similar assumptions, either after
> > renames or after adding a new
> > hard link (fsyncing just the file being enough to guarantee the new
> > hard link exits after a power
> > failure, without the need to fsync the parent directory). I don't
> > recall ever having had such comment before,
>
> Because I missed it in review...
>
> > even for much more complex cases involving renames
> > such as [1] for example.
> >
> > [1] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/commit/?id=e2d866668a98af4cd47cd6913418e410cd80aa93
> >
> > In this case both old and new names are in the same parent directory,
> > but if they were in different directories, and
> > requiring the user/application to fsync both directories, it could be
> > dangerous, if the old parent dir is fsync'ed and
> > a power failure happened before fsync'ing the new parent (requiring
> > the user/application to know about this
> > and fsync the new parent before the old parent dir also seems very
> > demanding), leading to a potential file loss
> > after replaying the log/journal (there were such cases in btrfs before).
> >
>
> This is not a problem for ext4/xfs.
> fsync(newdir) guarantees persisting new filename
> metadata ordering implies that olddir changes (and file nlink)
> also persist.
>
> To meet requirement of both btrfs and ext4/xfs in that case
> your test may fsync both file AND newdir, as long as this
> still allows your test to catch the bug?
>
> > > Will that test modification still catch the btrfs bug (pre fix)?
> >
> > Yes, it will still trigger the btrfs bug.
> >
> > Well, I don't mind doing such change, but I would like to hear other
> > opinions too, perhaps Dave could shed some light on this?
> > Even old reiserfs and f2fs (both on posix and strict fsync modes)
> > currently pass this test.
> >
>
> As I wrote to Dave, if file is not "metadata dirty" before rename,
> then whether or not rename dirties to file for fsync is a filesystem
> specific implementation detail that is not in any standard.
>
> Since filesystems tend to try to optimize out unneeded journal
> commits, so the observation about what filesystems do today
> empirically is not a sufficiently strong argument.

Even if the strictly ordered metadata guarantees Dave explained so
many time before is not formally defined anywhere,
I think it's a behavior that shouldn't change, as not only it makes
sense, but applications might be relying on it and therefore
useful to make sure no regressions happen. If such tests should be
shared (and run only filesystems known to support
the strictly ordered metadata) or require something special, is
another question.

>
> Again, if your test can afford to do fsync of file and parent dir
> it is best to be on the safe side (please audit other similar tests
> that you know about). If your test cannot afford to fsync parent
> dir, then at least a fat comment about this issue is worth about
> the expected (but not guaranteed) effects of fsync of file.

I'm sending a v2 with the fsync against the parent directory, despite
not being needed for any fs I could
run the tests against, because it's a generic test.

Thanks.

>
> Thanks,
> Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2] generic: add test for fsync after shrinking truncate and rename
  2019-03-04 14:06 [PATCH] generic: add test for fsync after shrinking truncate and rename fdmanana
  2019-03-04 15:04 ` Amir Goldstein
@ 2019-03-05  9:26 ` fdmanana
  1 sibling, 0 replies; 22+ messages in thread
From: fdmanana @ 2019-03-05  9:26 UTC (permalink / raw)
  To: fstests; +Cc: linux-btrfs, Filipe Manana

From: Filipe Manana <fdmanana@suse.com>

Test that if we truncate a file to reduce its size, rename it and then
fsync it, after a power failure the file has a correct size and name.

This test is motivated by a bug found in btrfs, which is fixed by a
patch for the linux kernel titled:

  "Btrfs: fix incorrect file size after shrinking truncate and fsync"

This test currently passes on ext4, xfs, f2fs and patched btrfs.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---

V2: Added fsync on directory after the rename, to ensure the rename is
    persisted for filesystem that don't guarantee strictly ordered
    metadata (ext4, xfs and f2fs provide that, even old reiserfs).

 tests/generic/532     | 65 +++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/532.out |  8 +++++++
 tests/generic/group   |  1 +
 3 files changed, 74 insertions(+)
 create mode 100755 tests/generic/532
 create mode 100644 tests/generic/532.out

diff --git a/tests/generic/532 b/tests/generic/532
new file mode 100755
index 00000000..9a9f9deb
--- /dev/null
+++ b/tests/generic/532
@@ -0,0 +1,65 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (C) 2019 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# FS QA Test No. 532
+#
+# Test that if we truncate a file to reduce its size, rename it and then fsync
+# it, after a power failure the file has a correct size and name.
+#
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	_cleanup_flakey
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+
+rm -f $seqres.full
+
+_scratch_mkfs >>$seqres.full 2>&1
+_require_metadata_journaling $SCRATCH_DEV
+_init_flakey
+_mount_flakey
+
+# Create our test file with an initial size of 8000 bytes, then fsync it,
+# followed by a truncate that reduces its size down to 3000 bytes.
+$XFS_IO_PROG -f -c "pwrite -S 0xab 0 8000" \
+	     -c "fsync" \
+	     -c "truncate 3000" \
+	     $SCRATCH_MNT/foo | _filter_xfs_io
+
+# Now rename the file and fsync it again.
+mv $SCRATCH_MNT/foo $SCRATCH_MNT/bar
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
+# Fsync the parent directory to ensure the rename is persisted.
+$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/
+
+# Simulate a power failure and mount the filesystem to check that the file was
+# persisted with the new name and has a size of 3000 bytes.
+_flakey_drop_and_remount
+
+echo "File content after power failure:"
+od -A d -t x1 $SCRATCH_MNT/bar
+
+_unmount_flakey
+
+status=0
+exit
diff --git a/tests/generic/532.out b/tests/generic/532.out
new file mode 100644
index 00000000..554fbe2a
--- /dev/null
+++ b/tests/generic/532.out
@@ -0,0 +1,8 @@
+QA output created by 532
+wrote 8000/8000 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+File content after power failure:
+0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
+*
+0002992 ab ab ab ab ab ab ab ab
+0003000
diff --git a/tests/generic/group b/tests/generic/group
index 31011ac8..7d63f303 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -534,3 +534,4 @@
 529 auto quick attr
 530 auto quick unlink
 531 auto quick unlink
+532 auto quick log
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-05  9:26           ` Filipe Manana
@ 2019-03-05 10:51             ` Amir Goldstein
  0 siblings, 0 replies; 22+ messages in thread
From: Amir Goldstein @ 2019-03-05 10:51 UTC (permalink / raw)
  To: Filipe Manana; +Cc: fstests, Linux Btrfs, Filipe Manana, Dave Chinner

> > As I wrote to Dave, if file is not "metadata dirty" before rename,
> > then whether or not rename dirties to file for fsync is a filesystem
> > specific implementation detail that is not in any standard.
> >
> > Since filesystems tend to try to optimize out unneeded journal
> > commits, so the observation about what filesystems do today
> > empirically is not a sufficiently strong argument.
>
> Even if the strictly ordered metadata guarantees Dave explained so
> many time before is not formally defined anywhere,
> I think it's a behavior that shouldn't change, as not only it makes
> sense, but applications might be relying on it and therefore
> useful to make sure no regressions happen. If such tests should be
> shared (and run only filesystems known to support
> the strictly ordered metadata) or require something special, is
> another question.
>

I agree with you that it is good to test and verify those semantics.
I was not arguing against assuming strictly ordered metadata.
I was arguing against assuming that rename marks the inode dirty.
If rename does not mark the inode dirty, then fsync(file) may be optimized
away by filesystem and will not require persisting anything, without
breaking strictly ordered metadata semantics.


> >
> > Again, if your test can afford to do fsync of file and parent dir
> > it is best to be on the safe side (please audit other similar tests
> > that you know about). If your test cannot afford to fsync parent
> > dir, then at least a fat comment about this issue is worth about
> > the expected (but not guaranteed) effects of fsync of file.
>
> I'm sending a v2 with the fsync against the parent directory, despite
> not being needed for any fs I could
> run the tests against, because it's a generic test.

ACK.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-05  5:39     ` Amir Goldstein
@ 2019-03-05 22:33       ` Dave Chinner
  2019-03-06  7:51         ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Chinner @ 2019-03-05 22:33 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Tue, Mar 05, 2019 at 07:39:28AM +0200, Amir Goldstein wrote:
> On Tue, Mar 5, 2019 at 2:50 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote:
> > > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > > >
> > > > From: Filipe Manana <fdmanana@suse.com>
> > > >
> > > > Test that if we truncate a file to reduce its size, rename it and then
> > > > fsync it, after a power failure the file has a correct size and name.

This says:

- ftruncate A
- rename A B
- fsync B

> > > I am not sure that ext4/xfs semantics guaranty anything about
> > > persisting file name after fsync of file?...
> >
> > They do.  It's that pesky "strictly ordered metadata" thing I keep
> > having to explain to people...
> >
> > i.e. if you fsync an inode, then you are persisting all the changes
> > needed to reference that file and it's data. And so if there was a
> > rename in the history of that file, then that is persisted, too.
> > Which means that both the original and the new directory
> > modifications are persisted, too.
> >
> > *POSIX* doesn't require this - it says that if you O_DSYNC data,
> > then it also includes all the metadata needed to reference that
> > data. So even if the data is there, POSIX doesn't define whether the
> > rename is there or noti, just that you can get to the fsync'd data
> > via either the old or new name. IOWs, POSIX allows the behaviour to
> > be implementation specific.
> >
> > In this case, file systems with strictly ordered metadata will end
> > up making the rename visible because the rename occurred before the
> > truncate that the fsync() is persisting...
> >
> 
> That is not what is happening in Filipe's test. Test has:
> - ftruncate A
> - fsync A
> - rename A B
> - fsync B

And this does not match the test description.

/me goes and looks at the test again to check.

Ok, the test is as Filipe describes:

- pwrite 0 0x8000 A
- fsync A
- truncate 3000 A
- rename A B
- fsync B

There is no fsync between truncate and rename.

> So the reason this is working is because 2nd fsync needs to
> persist ctime of B and not because it needs to persist the
> truncate.

ctime modifications during rename are irrelevent because there's no
fsync between the truncate and the rename so the file inode is
already dirty due to the truncate. I think you've got the wrong end
of the stick here, Amir. :)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-05 22:33       ` Dave Chinner
@ 2019-03-06  7:51         ` Amir Goldstein
  2019-03-06 21:48           ` Dave Chinner
  0 siblings, 1 reply; 22+ messages in thread
From: Amir Goldstein @ 2019-03-06  7:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Wed, Mar 6, 2019 at 12:33 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Tue, Mar 05, 2019 at 07:39:28AM +0200, Amir Goldstein wrote:
> > On Tue, Mar 5, 2019 at 2:50 AM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Mon, Mar 04, 2019 at 05:04:23PM +0200, Amir Goldstein wrote:
> > > > On Mon, Mar 4, 2019 at 4:44 PM <fdmanana@kernel.org> wrote:
> > > > >
> > > > > From: Filipe Manana <fdmanana@suse.com>
> > > > >
> > > > > Test that if we truncate a file to reduce its size, rename it and then
> > > > > fsync it, after a power failure the file has a correct size and name.
>
> This says:
>
> - ftruncate A
> - rename A B
> - fsync B
>
> > > > I am not sure that ext4/xfs semantics guaranty anything about
> > > > persisting file name after fsync of file?...
> > >
> > > They do.  It's that pesky "strictly ordered metadata" thing I keep
> > > having to explain to people...
> > >
> > > i.e. if you fsync an inode, then you are persisting all the changes
> > > needed to reference that file and it's data. And so if there was a
> > > rename in the history of that file, then that is persisted, too.
> > > Which means that both the original and the new directory
> > > modifications are persisted, too.
> > >
> > > *POSIX* doesn't require this - it says that if you O_DSYNC data,
> > > then it also includes all the metadata needed to reference that
> > > data. So even if the data is there, POSIX doesn't define whether the
> > > rename is there or noti, just that you can get to the fsync'd data
> > > via either the old or new name. IOWs, POSIX allows the behaviour to
> > > be implementation specific.
> > >
> > > In this case, file systems with strictly ordered metadata will end
> > > up making the rename visible because the rename occurred before the
> > > truncate that the fsync() is persisting...
> > >
> >
> > That is not what is happening in Filipe's test. Test has:
> > - ftruncate A
> > - fsync A
> > - rename A B
> > - fsync B
>
> And this does not match the test description.
>
> /me goes and looks at the test again to check.
>
> Ok, the test is as Filipe describes:
>
> - pwrite 0 0x8000 A
> - fsync A
> - truncate 3000 A
> - rename A B
> - fsync B
>
> There is no fsync between truncate and rename.
>
> > So the reason this is working is because 2nd fsync needs to
> > persist ctime of B and not because it needs to persist the
> > truncate.
>
> ctime modifications during rename are irrelevent because there's no
> fsync between the truncate and the rename so the file inode is
> already dirty due to the truncate. I think you've got the wrong end
> of the stick here, Amir. :)
>

Doh! The discussion is still interesting because people have
hard time to understand that those hidden details like ctime
update on rename may have different behavior on different fs
regardless if they obay ordered metadata or not.
Btrfs is different in the respect of metadata dependencies from
xfs/ext4 in many ways as seen in the different rename/link
crash consistency discussions.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-06  7:51         ` Amir Goldstein
@ 2019-03-06 21:48           ` Dave Chinner
  2019-03-07  7:52             ` Amir Goldstein
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Chinner @ 2019-03-06 21:48 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Wed, Mar 06, 2019 at 09:51:23AM +0200, Amir Goldstein wrote:
> On Wed, Mar 6, 2019 at 12:33 AM Dave Chinner <david@fromorbit.com> wrote:
> > > So the reason this is working is because 2nd fsync needs to
> > > persist ctime of B and not because it needs to persist the
> > > truncate.
> >
> > ctime modifications during rename are irrelevent because there's no
> > fsync between the truncate and the rename so the file inode is
> > already dirty due to the truncate. I think you've got the wrong end
> > of the stick here, Amir. :)
> 
> Doh! The discussion is still interesting because people have
> hard time to understand that those hidden details like ctime
> update on rename may have different behavior on different fs
> regardless if they obay ordered metadata or not.
> Btrfs is different in the respect of metadata dependencies from
> xfs/ext4 in many ways as seen in the different rename/link
> crash consistency discussions.

Yes, little things like can result in different behaviour, but what
we are trying to do is get to the point where there is minimal
difference between all crash-recovery-capable linux filesystems.

e.g. what we see here is that by always including the inode being
moved in the rename transaction (regardless of how a filesystem
acheives that), we provide consistent, reliable, predictable
behaviour in all cases of "fsync after rename". IOWs, the SOMC model
that _require_metadata_journaling tests are supposed to conform to
is far more strict that POSIX requires and our tests need to reflect
this stricter consistency model.

IOWs, we should be encoding the behaviour we want in these tests
rather than implementing yet another "test POSIX compatible
behaviour" - POSIX is a complete crapshoot when it comes to
persistence requirements. And if a filesystem fails a SOMC-model
test, then the filesystem needs to be fixed, not have the test
"relaxed" to only exercise POSIX-defined behaviour.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-06 21:48           ` Dave Chinner
@ 2019-03-07  7:52             ` Amir Goldstein
  2019-03-07 23:19               ` Jayashree Mohan
  2019-03-08  3:46               ` Dave Chinner
  0 siblings, 2 replies; 22+ messages in thread
From: Amir Goldstein @ 2019-03-07  7:52 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Wed, Mar 6, 2019 at 11:48 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Wed, Mar 06, 2019 at 09:51:23AM +0200, Amir Goldstein wrote:
> > On Wed, Mar 6, 2019 at 12:33 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > So the reason this is working is because 2nd fsync needs to
> > > > persist ctime of B and not because it needs to persist the
> > > > truncate.
> > >
> > > ctime modifications during rename are irrelevent because there's no
> > > fsync between the truncate and the rename so the file inode is
> > > already dirty due to the truncate. I think you've got the wrong end
> > > of the stick here, Amir. :)
> >
> > Doh! The discussion is still interesting because people have
> > hard time to understand that those hidden details like ctime
> > update on rename may have different behavior on different fs
> > regardless if they obay ordered metadata or not.
> > Btrfs is different in the respect of metadata dependencies from
> > xfs/ext4 in many ways as seen in the different rename/link
> > crash consistency discussions.
>
> Yes, little things like can result in different behaviour, but what
> we are trying to do is get to the point where there is minimal
> difference between all crash-recovery-capable linux filesystems.
>
> e.g. what we see here is that by always including the inode being
> moved in the rename transaction (regardless of how a filesystem
> acheives that), we provide consistent, reliable, predictable
> behaviour in all cases of "fsync after rename". IOWs, the SOMC model
> that _require_metadata_journaling tests are supposed to conform to
> is far more strict that POSIX requires and our tests need to reflect
> this stricter consistency model.
>
> IOWs, we should be encoding the behaviour we want in these tests
> rather than implementing yet another "test POSIX compatible
> behaviour" - POSIX is a complete crapshoot when it comes to
> persistence requirements. And if a filesystem fails a SOMC-model
> test, then the filesystem needs to be fixed, not have the test
> "relaxed" to only exercise POSIX-defined behaviour.
>

Agreed! v1 is better than v2. Sorry for my mistake in v1 review.

I went back to look at similar fsync tests by Filipe:
generic/{106,107,335,336,341,342,343,348,498,501,502,509,510,512}

I found some alleged subtle mistakes about SOMC assumptions.

generic/336 does:
touch $SCRATCH_MNT/a/foo
ln $SCRATCH_MNT/a/foo $SCRATCH_MNT/b/foo_link
touch $SCRATCH_MNT/b/bar
sync
unlink $SCRATCH_MNT/b/foo_link
mv $SCRATCH_MNT/b/bar $SCRATCH_MNT/c/
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo

And expects both unlink and rename to persist.
However, this is only true in the *very likely* case that there is no
journal commit in between unlink and rename, because fsync foo
is only guaranteed to persist metadata changes that depend on the
unlink and happened BEFORE it, which is not the case for the rename
of bar.

generic/343 and generic/510 do something similar.

At first glance, generic/498 is actually broken (for xfs) or at least
I don't understand why it works.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-07  7:52             ` Amir Goldstein
@ 2019-03-07 23:19               ` Jayashree Mohan
  2019-03-08  4:35                 ` Dave Chinner
  2019-03-08  3:46               ` Dave Chinner
  1 sibling, 1 reply; 22+ messages in thread
From: Jayashree Mohan @ 2019-03-07 23:19 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Dave Chinner, Filipe Manana, fstests, Linux Btrfs, Filipe Manana

Hi Amir,

> I went back to look at similar fsync tests by Filipe:
> generic/{106,107,335,336,341,342,343,348,498,501,502,509,510,512}
>
> I found some alleged subtle mistakes about SOMC assumptions.
>
> generic/336 does:
> touch $SCRATCH_MNT/a/foo
> ln $SCRATCH_MNT/a/foo $SCRATCH_MNT/b/foo_link
> touch $SCRATCH_MNT/b/bar
> sync
> unlink $SCRATCH_MNT/b/foo_link
> mv $SCRATCH_MNT/b/bar $SCRATCH_MNT/c/
> $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo

This is probably what's happening in this particular test :

SOMC requires:
          fsync(a/foo) must ensure unlink(b/foo_link)  (because they
were linked at some point)

But what happens is:
           fsync(a/foo)          -->  unlink(b/foo_link)
           unlink(b/foo_link)  -->  fsync(b)
           fsync(b)                 -->  rename goes through

SOMC should only require that the unlink persists. The rename
operation persists due to the side-effect of SOMC. So we should be
only testing if the unlink operation went through.

Take a look at this thread that describes the bug which resulted in
this test case (https://patchwork.kernel.org/patch/8293181/).
The file bar was lost during the rename operation and I think this
test simply wanted to verify that file bar was intact in either one of
the directories. Probably the reason this test was written was not to
test SOMC, but to verify that the log replay in btrfs does not delete
the renamed file - luckily it happens to pass on other file systems
because they all persist the rename operation as a side effect of
SOMC. To me, this test currently checks for something more than SOMC.
The check must be modified to only test if b/foo_link is removed, and
that the file bar is *either* in directory b or c.

Test 343 is a similar case where in btrfs log replay ended up
persisting the file in both the source and destination directories
after a rename operation
(https://patchwork.kernel.org/patch/8766401/). The checks must be
updated here as well.

Thanks,
Jayashree

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-07  7:52             ` Amir Goldstein
  2019-03-07 23:19               ` Jayashree Mohan
@ 2019-03-08  3:46               ` Dave Chinner
  1 sibling, 0 replies; 22+ messages in thread
From: Dave Chinner @ 2019-03-08  3:46 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Thu, Mar 07, 2019 at 09:52:03AM +0200, Amir Goldstein wrote:
> On Wed, Mar 6, 2019 at 11:48 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Wed, Mar 06, 2019 at 09:51:23AM +0200, Amir Goldstein wrote:
> > > On Wed, Mar 6, 2019 at 12:33 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > > So the reason this is working is because 2nd fsync needs to
> > > > > persist ctime of B and not because it needs to persist the
> > > > > truncate.
> > > >
> > > > ctime modifications during rename are irrelevent because there's no
> > > > fsync between the truncate and the rename so the file inode is
> > > > already dirty due to the truncate. I think you've got the wrong end
> > > > of the stick here, Amir. :)
> > >
> > > Doh! The discussion is still interesting because people have
> > > hard time to understand that those hidden details like ctime
> > > update on rename may have different behavior on different fs
> > > regardless if they obay ordered metadata or not.
> > > Btrfs is different in the respect of metadata dependencies from
> > > xfs/ext4 in many ways as seen in the different rename/link
> > > crash consistency discussions.
> >
> > Yes, little things like can result in different behaviour, but what
> > we are trying to do is get to the point where there is minimal
> > difference between all crash-recovery-capable linux filesystems.
> >
> > e.g. what we see here is that by always including the inode being
> > moved in the rename transaction (regardless of how a filesystem
> > acheives that), we provide consistent, reliable, predictable
> > behaviour in all cases of "fsync after rename". IOWs, the SOMC model
> > that _require_metadata_journaling tests are supposed to conform to
> > is far more strict that POSIX requires and our tests need to reflect
> > this stricter consistency model.
> >
> > IOWs, we should be encoding the behaviour we want in these tests
> > rather than implementing yet another "test POSIX compatible
> > behaviour" - POSIX is a complete crapshoot when it comes to
> > persistence requirements. And if a filesystem fails a SOMC-model
> > test, then the filesystem needs to be fixed, not have the test
> > "relaxed" to only exercise POSIX-defined behaviour.
> >
> 
> Agreed! v1 is better than v2. Sorry for my mistake in v1 review.
> 
> I went back to look at similar fsync tests by Filipe:
> generic/{106,107,335,336,341,342,343,348,498,501,502,509,510,512}
> 
> I found some alleged subtle mistakes about SOMC assumptions.
> 
> generic/336 does:
> touch $SCRATCH_MNT/a/foo
> ln $SCRATCH_MNT/a/foo $SCRATCH_MNT/b/foo_link
> touch $SCRATCH_MNT/b/bar
> sync
> unlink $SCRATCH_MNT/b/foo_link
> mv $SCRATCH_MNT/b/bar $SCRATCH_MNT/c/
> $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo
> 
> And expects both unlink and rename to persist.

*nod*. Yup, that's a typical async transactional ordering
behaviour because the dirty state is carried on the inode, not the
path or fd that is being used to access it.

> However, this is only true in the *very likely* case that there is no
> journal commit in between unlink and rename, because fsync foo
> is only guaranteed to persist metadata changes that depend on the
> unlink and happened BEFORE it, which is not the case for the rename
> of bar.

Yup, most likely. I haven't looked at any of these btrfs-inspired
fsync tests in any detail so it wouldn't surprise me that there are
issues like this in them - I just don't have time to look at
everything.

Indeed, if there are mistaken assumptions in the tests based around
pending dirty state and async transaction aggregation, then running
the tests with "-o dirsync" or even "-o wsync" should cause such
tests to fail.

> At first glance, generic/498 is actually broken (for xfs) or at least
> I don't understand why it works.

The test does this:

mkdir $SCRATCH_MNT/A
mkdir $SCRATCH_MNT/B
mkdir $SCRATCH_MNT/A/C
touch $SCRATCH_MNT/B/foo
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/B/foo

# It is important the new hard link is located in a hierarchy of new directories
# (not yet persisted).
ln $SCRATCH_MNT/B/foo $SCRATCH_MNT/A/C/foo
$XFS_IO_PROG -c "fsync" $SCRATCH_MNT/A

_flakey_drop_and_remount

And it expects /A and B/foo to be there afterwards. The comment
"(not yet persisted)" is about the buggy btrfs behaviour, not how a
SOMC fs should behave.

So, yeah, that should always work on XFS, because the first fsync
persists everything the test checks (due to common ancestors), and
second fsync is a no-opt because dir A has already been
checkpointed.


Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-07 23:19               ` Jayashree Mohan
@ 2019-03-08  4:35                 ` Dave Chinner
  2019-03-08 15:11                   ` Vijay Chidambaram
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Chinner @ 2019-03-08  4:35 UTC (permalink / raw)
  To: Jayashree Mohan
  Cc: Amir Goldstein, Filipe Manana, fstests, Linux Btrfs, Filipe Manana

On Thu, Mar 07, 2019 at 05:19:51PM -0600, Jayashree Mohan wrote:
> Hi Amir,
> 
> > I went back to look at similar fsync tests by Filipe:
> > generic/{106,107,335,336,341,342,343,348,498,501,502,509,510,512}
> >
> > I found some alleged subtle mistakes about SOMC assumptions.
> >
> > generic/336 does:
> > touch $SCRATCH_MNT/a/foo
> > ln $SCRATCH_MNT/a/foo $SCRATCH_MNT/b/foo_link
> > touch $SCRATCH_MNT/b/bar
> > sync
> > unlink $SCRATCH_MNT/b/foo_link
> > mv $SCRATCH_MNT/b/bar $SCRATCH_MNT/c/
> > $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo
> 
> This is probably what's happening in this particular test :
> 
> SOMC requires:
>           fsync(a/foo) must ensure unlink(b/foo_link)  (because they
> were linked at some point)
> 
> But what happens is:
>            fsync(a/foo)          -->  unlink(b/foo_link)
>            unlink(b/foo_link)  -->  fsync(b)
>            fsync(b)                 -->  rename goes through
> 
> SOMC should only require that the unlink persists.

That's a /fsync/ requirement, not SOMC.

SOMC says:

"If the rename after the fsync()d unlink is present after recovery,
then every metadata operation completed between the unlink and the
rename must also be present after recovery."

> The rename
> operation persists due to the side-effect of SOMC. So we should be
> only testing if the unlink operation went through.

No, it persists as a side effect of the a filesystem fsync()
implementation, not SOMC.

i.e. on fsync(), the filesystem must commit the journal at a point
in time that *includes the unlink*. That means it can chose any time
between the unlink and the moment the fsync() is received by the
filesystem. Once the point in time has been chosen by the fsync
operation, SOMC then defines what else must be journalled and
recovered with the information that must be fsync()d.

> Take a look at this thread that describes the bug which resulted in
> this test case (https://patchwork.kernel.org/patch/8293181/).

I really wouldn't try to infer anything from the bugs in btrfs fsync
behaviour or the test cases that expose them. 'Behave like other
filesystems" is not a substitute for having solid fundamental
algorithms...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-08  4:35                 ` Dave Chinner
@ 2019-03-08 15:11                   ` Vijay Chidambaram
  2019-03-19  1:13                     ` Dave Chinner
  0 siblings, 1 reply; 22+ messages in thread
From: Vijay Chidambaram @ 2019-03-08 15:11 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Jayashree Mohan, Amir Goldstein, Filipe Manana, fstests,
	Linux Btrfs, Filipe Manana

On Thu, Mar 7, 2019 at 10:35 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Thu, Mar 07, 2019 at 05:19:51PM -0600, Jayashree Mohan wrote:
> > Hi Amir,
> >
> > > I went back to look at similar fsync tests by Filipe:
> > > generic/{106,107,335,336,341,342,343,348,498,501,502,509,510,512}
> > >
> > > I found some alleged subtle mistakes about SOMC assumptions.
> > >
> > > generic/336 does:
> > > touch $SCRATCH_MNT/a/foo
> > > ln $SCRATCH_MNT/a/foo $SCRATCH_MNT/b/foo_link
> > > touch $SCRATCH_MNT/b/bar
> > > sync
> > > unlink $SCRATCH_MNT/b/foo_link
> > > mv $SCRATCH_MNT/b/bar $SCRATCH_MNT/c/
> > > $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo
> >
> > This is probably what's happening in this particular test :
> >
> > SOMC requires:
> >           fsync(a/foo) must ensure unlink(b/foo_link)  (because they
> > were linked at some point)
> >
> > But what happens is:
> >            fsync(a/foo)          -->  unlink(b/foo_link)
> >            unlink(b/foo_link)  -->  fsync(b)
> >            fsync(b)                 -->  rename goes through
> >
> > SOMC should only require that the unlink persists.
>
> That's a /fsync/ requirement, not SOMC.
>
> SOMC says:
>
> "If the rename after the fsync()d unlink is present after recovery,
> then every metadata operation completed between the unlink and the
> rename must also be present after recovery."

Isn't this a bit too broad? That sounds more like total ordering of
metadata operations rather than SOMC. Shouldn't SOMC say "all
operations dependent upon the rename should be present after recovery
if rename is present?"

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] generic: add test for fsync after shrinking truncate and rename
  2019-03-08 15:11                   ` Vijay Chidambaram
@ 2019-03-19  1:13                     ` Dave Chinner
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Chinner @ 2019-03-19  1:13 UTC (permalink / raw)
  To: Vijay Chidambaram
  Cc: Jayashree Mohan, Amir Goldstein, Filipe Manana, fstests,
	Linux Btrfs, Filipe Manana

(Sorry, missed this email and only just noticed it...)

On Fri, Mar 08, 2019 at 09:11:19AM -0600, Vijay Chidambaram wrote:
> On Thu, Mar 7, 2019 at 10:35 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Thu, Mar 07, 2019 at 05:19:51PM -0600, Jayashree Mohan wrote:
> > > Hi Amir,
> > >
> > > > I went back to look at similar fsync tests by Filipe:
> > > > generic/{106,107,335,336,341,342,343,348,498,501,502,509,510,512}
> > > >
> > > > I found some alleged subtle mistakes about SOMC assumptions.
> > > >
> > > > generic/336 does:
> > > > touch $SCRATCH_MNT/a/foo
> > > > ln $SCRATCH_MNT/a/foo $SCRATCH_MNT/b/foo_link
> > > > touch $SCRATCH_MNT/b/bar
> > > > sync
> > > > unlink $SCRATCH_MNT/b/foo_link
> > > > mv $SCRATCH_MNT/b/bar $SCRATCH_MNT/c/
> > > > $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/foo
> > >
> > > This is probably what's happening in this particular test :
> > >
> > > SOMC requires:
> > >           fsync(a/foo) must ensure unlink(b/foo_link)  (because they
> > > were linked at some point)
> > >
> > > But what happens is:
> > >            fsync(a/foo)          -->  unlink(b/foo_link)
> > >            unlink(b/foo_link)  -->  fsync(b)
> > >            fsync(b)                 -->  rename goes through
> > >
> > > SOMC should only require that the unlink persists.
> >
> > That's a /fsync/ requirement, not SOMC.
> >
> > SOMC says:
> >
> > "If the rename after the fsync()d unlink is present after recovery,
> > then every metadata operation completed between the unlink and the
> > rename must also be present after recovery."
> 
> Isn't this a bit too broad? That sounds more like total ordering of
> metadata operations rather than SOMC. Shouldn't SOMC say "all
> operations dependent upon the rename should be present after recovery
> if rename is present?"

Yes. I was describing the explicit behaivour to expect in the
context of the given example, in which all the operations between
the unlink and rename were dependent operations. You've just
restated it as the general rule that I applied...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-03-19  1:13 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-04 14:06 [PATCH] generic: add test for fsync after shrinking truncate and rename fdmanana
2019-03-04 15:04 ` Amir Goldstein
2019-03-04 15:23   ` Filipe Manana
2019-03-04 17:59     ` Amir Goldstein
2019-03-04 22:30       ` Filipe Manana
2019-03-05  5:59         ` Amir Goldstein
2019-03-05  9:26           ` Filipe Manana
2019-03-05 10:51             ` Amir Goldstein
2019-03-05  0:50   ` Dave Chinner
2019-03-05  1:00     ` Dave Chinner
2019-03-05  1:08       ` Vijay Chidambaram
2019-03-05  5:39     ` Amir Goldstein
2019-03-05 22:33       ` Dave Chinner
2019-03-06  7:51         ` Amir Goldstein
2019-03-06 21:48           ` Dave Chinner
2019-03-07  7:52             ` Amir Goldstein
2019-03-07 23:19               ` Jayashree Mohan
2019-03-08  4:35                 ` Dave Chinner
2019-03-08 15:11                   ` Vijay Chidambaram
2019-03-19  1:13                     ` Dave Chinner
2019-03-08  3:46               ` Dave Chinner
2019-03-05  9:26 ` [PATCH v2] " fdmanana

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.