* [PATCH] btrfs: add test for seeing unseen fsync errors on newly open files
From: Jeff Layton @ 2018-05-08 12:56 UTC (permalink / raw)
  To: guaneryu; +Cc: fstests, willy, andres, david, amir73il

From: Jeff Layton <jlayton@redhat.com>

This adds a regression test for the following kernel patch:

    errseq: Always report a writeback error once

This is motivated by some rather odd behavior in PostgreSQL. The main
database writers offload their fsync calls to a separate process, which
may open a file only after a writeback error has already occurred on
it.

This used to work with older kernels that reported the error to only
one fd, but with the errseq_t changes we lost the ability to see
errors that occurred before the open. The above patch restores that
behavior.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 tests/btrfs/999     | 110 ++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/999.out |   5 ++
 tests/btrfs/group   |   1 +
 3 files changed, 116 insertions(+)
 create mode 100755 tests/btrfs/999
 create mode 100644 tests/btrfs/999.out

diff --git a/tests/btrfs/999 b/tests/btrfs/999
new file mode 100755
index 000000000000..0f68942a91da
--- /dev/null
+++ b/tests/btrfs/999
@@ -0,0 +1,110 @@
+#! /bin/bash
+# FS QA Test No. 999
+#
+# Open a file and write to it and fsync. Then flip the data device to throw
+# errors, write to it again and call sync. Close the file, reopen it and
+# then call fsync on it. Is the error reported?
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2018, Jeff Layton <jlayton@redhat.com>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1    # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -rf $tmp.* $testdir
+	_dmerror_cleanup
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmerror
+
+# real QA test starts here
+_supported_os Linux
+_supported_fs btrfs
+# This test uses "dm" without taking into account the data could be on
+# realtime subvolume, thus the test will fail with rtinherit=1
+_require_no_rtinherit
+_require_scratch_dev_pool
+
+_require_dm_target error
+
+rm -f $seqres.full
+
+# bring up dmerror device
+_scratch_unmount
+_dmerror_init
+
+# Replace first device with error-test device
+old_SCRATCH_DEV=$SCRATCH_DEV
+SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | perl -pe "s#$SCRATCH_DEV#$DMERROR_DEV#"`
+SCRATCH_DEV=$DMERROR_DEV
+
+echo "Format and mount"
+_scratch_pool_mkfs "-d raid0 -m raid1" > $seqres.full 2>&1
+_scratch_mount
+
+# How much do we need to write? We need to hit all of the stripes. btrfs uses a
+# fixed 64k stripesize, so write enough to hit each one. In the case of
+# compression, each 128K input data chunk will be compressed to 4K (because
+# the characters written are all identical). Therefore we have to write
+# (128K * 16) = 2048K to make sure every stripe can be hit.
+number_of_devices=`echo $SCRATCH_DEV_POOL | wc -w`
+write_kb=$(($number_of_devices * 2048))
+_require_fs_space $SCRATCH_MNT $write_kb
+datalen=$((($write_kb * 1024)-1))
+
+# use fd 5 to hold file open
+testfile=$SCRATCH_MNT/fsync-open-after-err
+exec 5>$testfile
+
+# write some data to file and fsync it out
+$XFS_IO_PROG -c "pwrite -q 0 $datalen" -c fsync $testfile
+
+# flip device to non-working mode
+_dmerror_load_error_table
+
+# rewrite the data, call sync to ensure it's written back w/o scraping error
+$XFS_IO_PROG -c "pwrite -q 0 $datalen" -c sync $testfile
+
+# heal the device error
+_dmerror_load_working_table
+
+# open again and call fsync
+echo "The following fsync should fail with EIO:"
+$XFS_IO_PROG -c fsync $testfile
+echo "done"
+
+# close file
+exec 5>&-
+
+# success, all done
+_dmerror_unmount
+_dmerror_cleanup
+
+status=0
+exit
diff --git a/tests/btrfs/999.out b/tests/btrfs/999.out
new file mode 100644
index 000000000000..38d2d7f6495f
--- /dev/null
+++ b/tests/btrfs/999.out
@@ -0,0 +1,5 @@
+QA output created by 999
+Format and mount
+The following fsync should fail with EIO:
+fsync: Input/output error
+done
diff --git a/tests/btrfs/group b/tests/btrfs/group
index ba766f6b84f8..8550f87a2305 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -162,3 +162,4 @@
 157 auto quick raid
 158 auto quick raid scrub
 159 auto quick
+999 auto quick
-- 
2.17.0
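
The sizing comment in the test can be worked through as a standalone
sketch. The three-device pool below is a made-up example (in the test
the list comes from the fstests configuration), and the intermediate
variable names are illustrative only:

```shell
#!/bin/bash
# Hypothetical pool; the real test takes this from the fstests config.
SCRATCH_DEV_POOL="/dev/sdb /dev/sdc /dev/sdd"

stripe_kb=64        # btrfs stripe size, per the comment in the test
chunk_kb=128        # input chunk size that gets compressed
compressed_kb=4     # on-disk size of one such chunk of identical bytes

# 64K / 4K = 16 compressed chunks are needed to fill one stripe,
# which corresponds to 16 * 128K = 2048K of input data per device.
chunks_per_stripe=$((stripe_kb / compressed_kb))
per_device_kb=$((chunk_kb * chunks_per_stripe))

number_of_devices=$(( $(echo $SCRATCH_DEV_POOL | wc -w) ))
write_kb=$((number_of_devices * per_device_kb))
datalen=$((write_kb * 1024 - 1))

echo "devices=$number_of_devices write_kb=$write_kb datalen=$datalen"
```

With three devices this gives write_kb=6144 and datalen=6291455, which
are the values the pwrite calls in the test would use for such a pool.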



* Re: [PATCH] btrfs: add test for seeing unseen fsync errors on newly open files
From: Lu Fengqi @ 2018-05-09 12:58 UTC (permalink / raw)
  To: Jeff Layton, guaneryu; +Cc: fstests, willy, andres, david, amir73il

On 05/08/2018 08:56 PM, Jeff Layton wrote:
> From: Jeff Layton <jlayton@redhat.com>
> 
> This adds a regression test for the following kernel patch:
> 
>      errseq: Always report a writeback error once
> 
> This is motivated by some rather odd behavior done by the PostgreSQL
> project. The main database writers will offload the fsync calls to a
> separate process, which can open files after a writeback error has
> already occurred.
> 
> This used to work with older kernels that reported the error to only
> one fd, but with the errseq_t changes we lost the ability to see
> errors that occurred before the open. The above patch restores that
> behavior.
> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>   tests/btrfs/999     | 110 ++++++++++++++++++++++++++++++++++++++++++++
>   tests/btrfs/999.out |   5 ++
>   tests/btrfs/group   |   1 +
>   3 files changed, 116 insertions(+)
>   create mode 100755 tests/btrfs/999
>   create mode 100644 tests/btrfs/999.out
> 
> diff --git a/tests/btrfs/999 b/tests/btrfs/999
> new file mode 100755
> index 000000000000..0f68942a91da
> --- /dev/null
> +++ b/tests/btrfs/999
> @@ -0,0 +1,110 @@
> +#! /bin/bash
> +# FS QA Test No. 999
> +#
> +# Open a file and write to it and fsync. Then flip the data device to throw
> +# errors, write to it again and call sync. Close the file, reopen it and
> +# then call fsync on it. Is the error reported?
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2018, Jeff Layton <jlayton@redhat.com>
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#-----------------------------------------------------------------------
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1    # failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -rf $tmp.* $testdir
> +	_dmerror_cleanup
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/dmerror
> +
> +# real QA test starts here
> +_supported_os Linux
> +_supported_fs btrfs
> +# This test uses "dm" without taking into account the data could be on
> +# realtime subvolume, thus the test will fail with rtinherit=1
> +_require_no_rtinherit
> +_require_scratch_dev_pool
> +
> +_require_dm_target error
> +
> +rm -f $seqres.full
> +
> +# bring up dmerror device
> +_scratch_unmount

# diff -u tests/btrfs/999.out results/btrfs/999.out.bad
--- tests/btrfs/999.out 2018-05-09 13:01:21.605173303 +0800
+++ results/btrfs/999.out.bad 2018-05-09 20:50:46.721383569 +0800
@@ -1,4 +1,5 @@
  QA output created by 999
+umount: /mnt/scratch: not mounted.
  Format and mount
  The following fsync should fail with EIO:
  fsync: Input/output error

The _require_scratch_dev_pool call above has already unmounted the
scratch filesystem, so this _scratch_unmount fails and its error message
ends up in the test output.
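
A minimal sketch of one way to keep the stray message out of the output,
assuming the only goal is to make sure nothing is mounted before the
dm-error device is set up. The helper name unmount_if_mounted is
hypothetical and not part of fstests; it relies on mountpoint(1) from
util-linux. Simply dropping the redundant _scratch_unmount call would be
the other option.

```shell
#!/bin/bash
# Hypothetical helper, not part of fstests: unmount only when the path
# is currently a mount point, so an already-unmounted scratch directory
# produces no output.
unmount_if_mounted()
{
	local mnt="$1"

	if mountpoint -q "$mnt"; then
		umount "$mnt"
	fi
}

# In the test this might be called as: unmount_if_mounted "$SCRATCH_MNT"
```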

-- 
Thanks,
Lu
