fstests.vger.kernel.org archive mirror
* [PATCHSET 0/2] fstests: exercise code refactored in 5.14
@ 2021-07-20  1:08 Darrick J. Wong
  2021-07-20  1:08 ` [PATCH 1/2] generic: test xattr operations only Darrick J. Wong
  2021-07-20  1:08 ` [PATCH 2/2] generic: test shutdowns of a nested filesystem Darrick J. Wong
  0 siblings, 2 replies; 11+ messages in thread
From: Darrick J. Wong @ 2021-07-20  1:08 UTC (permalink / raw)
  To: djwong, guaneryu; +Cc: linux-xfs, fstests, guan

Hi all,

Add a few tests to exercise code that got refactored in 5.14.  The xattr
tests shook out some bugs in the big extended attributes refactoring,
and the nested shutdown test simulates the process of recovering after a
VM host filesystem goes down and the guests have to recover.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=new-tests-for-5.14
---
 tests/generic/724     |   57 +++++++++++++++++++++
 tests/generic/724.out |    2 +
 tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/725.out |    2 +
 4 files changed, 197 insertions(+)
 create mode 100755 tests/generic/724
 create mode 100644 tests/generic/724.out
 create mode 100755 tests/generic/725
 create mode 100644 tests/generic/725.out



* [PATCH 1/2] generic: test xattr operations only
  2021-07-20  1:08 [PATCHSET 0/2] fstests: exercise code refactored in 5.14 Darrick J. Wong
@ 2021-07-20  1:08 ` Darrick J. Wong
  2021-07-25 16:02   ` Eryu Guan
  2021-07-20  1:08 ` [PATCH 2/2] generic: test shutdowns of a nested filesystem Darrick J. Wong
  1 sibling, 1 reply; 11+ messages in thread
From: Darrick J. Wong @ 2021-07-20  1:08 UTC (permalink / raw)
  To: djwong, guaneryu; +Cc: linux-xfs, fstests, guan

From: Darrick J. Wong <djwong@kernel.org>

Exercise extended attribute operations.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 tests/generic/724     |   57 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/724.out |    2 ++
 2 files changed, 59 insertions(+)
 create mode 100755 tests/generic/724
 create mode 100644 tests/generic/724.out


diff --git a/tests/generic/724 b/tests/generic/724
new file mode 100755
index 00000000..f2f4a2ec
--- /dev/null
+++ b/tests/generic/724
@@ -0,0 +1,57 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 724
+#
+# Run an extended attributes fsstress run with multiple threads to shake out
+# bugs in the xattr code.
+#
+. ./common/preamble
+_begin_fstest auto attr
+
+_cleanup()
+{
+	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+	cd /
+	rm -f $tmp.*
+}
+
+# Modify as appropriate.
+_supported_fs generic
+
+_require_scratch
+_require_command "$KILLALL_PROG" "killall"
+
+echo "Silence is golden."
+
+_scratch_mkfs > $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+nr_cpus=$((LOAD_FACTOR * 4))
+nr_ops=$((700000 * nr_cpus * TIME_FACTOR))
+
+args=('-z' '-S' 'c')
+
+# Do some directory tree modifications, but the bulk of this is geared towards
+# exercising the xattr code, especially attr_set which can do up to 10k values.
+for verb in unlink rmdir; do
+	args+=('-f' "${verb}=1")
+done
+for verb in creat mkdir; do
+	args+=('-f' "${verb}=2")
+done
+for verb in getfattr listfattr; do
+	args+=('-f' "${verb}=3")
+done
+for verb in attr_remove removefattr; do
+	args+=('-f' "${verb}=4")
+done
+args+=('-f' "setfattr=20")
+args+=('-f' "attr_set=60")	# sets larger xattrs
+
+$FSSTRESS_PROG "${args[@]}" $FSSTRESS_AVOID -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus >> $seqres.full
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/724.out b/tests/generic/724.out
new file mode 100644
index 00000000..164cfffb
--- /dev/null
+++ b/tests/generic/724.out
@@ -0,0 +1,2 @@
+QA output created by 724
+Silence is golden.



* [PATCH 2/2] generic: test shutdowns of a nested filesystem
  2021-07-20  1:08 [PATCHSET 0/2] fstests: exercise code refactored in 5.14 Darrick J. Wong
  2021-07-20  1:08 ` [PATCH 1/2] generic: test xattr operations only Darrick J. Wong
@ 2021-07-20  1:08 ` Darrick J. Wong
  1 sibling, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2021-07-20  1:08 UTC (permalink / raw)
  To: djwong, guaneryu; +Cc: linux-xfs, fstests, guan

From: Darrick J. Wong <djwong@kernel.org>

generic/475, but we're running fsstress on a disk image inside the
scratch filesystem

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/725.out |    2 +
 2 files changed, 138 insertions(+)
 create mode 100755 tests/generic/725
 create mode 100644 tests/generic/725.out


diff --git a/tests/generic/725 b/tests/generic/725
new file mode 100755
index 00000000..f4a42d62
--- /dev/null
+++ b/tests/generic/725
@@ -0,0 +1,136 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 725
+#
+# Test nested log recovery with repeated (simulated) disk failures.  We kick
+# off fsstress on a loopback filesystem mounted on the scratch fs, then switch
+# out the underlying scratch device with dm-error to see what happens when the
+# disk goes down.  Having taken down both fses in this manner, remount them and
+# repeat.  This test simulates VM hosts crashing to try to shake out CoW bugs
+# in writeback on the host that cause VM guests to fail to recover.
+#
+. ./common/preamble
+_begin_fstest shutdown auto log metadata eio
+
+_cleanup()
+{
+	cd /
+	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+	wait
+	if [ -n "$loopmnt" ]; then
+		umount $loopmnt 2>/dev/null
+		rm -r -f $loopmnt
+	fi
+	rm -f $tmp.*
+	_dmerror_unmount
+	_dmerror_cleanup
+}
+
+# Import common functions.
+. ./common/dmerror
+. ./common/reflink
+
+# Modify as appropriate.
+_supported_fs generic
+
+_require_scratch_reflink
+_require_cp_reflink
+_require_dm_target error
+_require_command "$KILLALL_PROG" "killall"
+
+echo "Silence is golden."
+
+_scratch_mkfs >> $seqres.full 2>&1
+_require_metadata_journaling $SCRATCH_DEV
+_dmerror_init
+_dmerror_mount
+
+# Create a fs image consuming 1/3 of the scratch fs
+scratch_freesp_bytes=$(stat -f -c '%a * %S' $SCRATCH_MNT | bc)
+loopimg_bytes=$((scratch_freesp_bytes / 3))
+
+loopimg=$SCRATCH_MNT/testfs
+truncate -s $loopimg_bytes $loopimg
+_mkfs_dev $loopimg
+
+loopmnt=$tmp.mount
+mkdir -p $loopmnt
+
+scratch_aliveflag=$tmp.runsnap
+snap_aliveflag=$tmp.snapping
+
+snap_loop_fs() {
+	touch "$snap_aliveflag"
+	while [ -e "$scratch_aliveflag" ]; do
+		rm -f $loopimg.a
+		_cp_reflink $loopimg $loopimg.a
+		sleep 1
+	done
+	rm -f "$snap_aliveflag"
+}
+
+fsstress=($FSSTRESS_PROG $FSSTRESS_AVOID -d "$loopmnt" -n 999999 -p "$((LOAD_FACTOR * 4))")
+
+for i in $(seq 1 $((50 * TIME_FACTOR)) ); do
+	touch $scratch_aliveflag
+	snap_loop_fs >> $seqres.full 2>&1 &
+
+	if ! _mount $loopimg $loopmnt -o loop; then
+		rm -f $scratch_aliveflag
+		_fail "loop mount failed"
+		break
+	fi
+
+	("${fsstress[@]}" >> $seqres.full &) > /dev/null 2>&1
+
+	# purposely include 0 second sleeps to test shutdown immediately after
+	# recovery
+	sleep $((RANDOM % (3 * TIME_FACTOR) ))
+	rm -f $scratch_aliveflag
+
+	# This test aims to simulate sudden disk failure, which means that we
+	# do not want to quiesce the filesystem or otherwise give it a chance
+	# to flush its logs.  Therefore we want to call dmsetup with the
+	# --nolockfs parameter; to make this happen we must call the load
+	# error table helper *without* 'lockfs'.
+	_dmerror_load_error_table
+
+	ps -e | grep fsstress > /dev/null 2>&1
+	while [ $? -eq 0 ]; do
+		$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+		wait > /dev/null 2>&1
+		ps -e | grep fsstress > /dev/null 2>&1
+	done
+	for ((i = 0; i < 10; i++)); do
+		test -e "$snap_aliveflag" || break
+		sleep 1
+	done
+
+	# Mount again to replay log after loading working table, so we have a
+	# consistent XFS after test.
+	$UMOUNT_PROG $loopmnt
+	_dmerror_unmount || _fail "unmount failed"
+	_dmerror_load_working_table
+	if ! _dmerror_mount; then
+		dmsetup table | tee -a /dev/ttyprintk
+		lsblk | tee -a /dev/ttyprintk
+		$XFS_METADUMP_PROG -a -g -o $DMERROR_DEV $seqres.dmfail.md
+		_fail "mount failed"
+	fi
+done
+
+# Make sure the fs image file is ok
+if [ -f "$loopimg" ]; then
+	if _mount $loopimg $loopmnt -o loop; then
+		$UMOUNT_PROG $loopmnt &> /dev/null
+	else
+		echo "final loop mount failed"
+	fi
+	_check_xfs_filesystem $loopimg none none
+fi
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/725.out b/tests/generic/725.out
new file mode 100644
index 00000000..ed73a9fc
--- /dev/null
+++ b/tests/generic/725.out
@@ -0,0 +1,2 @@
+QA output created by 725
+Silence is golden.



* Re: [PATCH 1/2] generic: test xattr operations only
  2021-07-20  1:08 ` [PATCH 1/2] generic: test xattr operations only Darrick J. Wong
@ 2021-07-25 16:02   ` Eryu Guan
  2021-07-26 17:01     ` Darrick J. Wong
  0 siblings, 1 reply; 11+ messages in thread
From: Eryu Guan @ 2021-07-25 16:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Mon, Jul 19, 2021 at 06:08:48PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Exercise extended attribute operations.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  tests/generic/724     |   57 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/724.out |    2 ++
>  2 files changed, 59 insertions(+)
>  create mode 100755 tests/generic/724
>  create mode 100644 tests/generic/724.out
> 
> 
> diff --git a/tests/generic/724 b/tests/generic/724
> new file mode 100755
> index 00000000..f2f4a2ec
> --- /dev/null
> +++ b/tests/generic/724
> @@ -0,0 +1,57 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
> +#
> +# FS QA Test No. 724
> +#
> +# Run an extended attributes fsstress run with multiple threads to shake out
> +# bugs in the xattr code.
> +#
> +. ./common/preamble
> +_begin_fstest auto attr
> +
> +_cleanup()
> +{
> +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> +	cd /
> +	rm -f $tmp.*
> +}
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +
> +_require_scratch
> +_require_command "$KILLALL_PROG" "killall"
> +
> +echo "Silence is golden."
> +
> +_scratch_mkfs > $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +nr_cpus=$((LOAD_FACTOR * 4))
> +nr_ops=$((700000 * nr_cpus * TIME_FACTOR))

This takes too long to be an 'auto' test; it has been running for 20
minutes on my test VM and still isn't done, so I think I'll have to kill
it manually.

I noticed that generic/52[12] run long-running fsx tests, and they are in
the 'soak long_rw' groups; perhaps this one fits there as well? And maybe
the 'stress' group too.

Thanks,
Eryu

> +
> +args=('-z' '-S' 'c')
> +
> +# Do some directory tree modifications, but the bulk of this is geared towards
> +# exercising the xattr code, especially attr_set which can do up to 10k values.
> +for verb in unlink rmdir; do
> +	args+=('-f' "${verb}=1")
> +done
> +for verb in creat mkdir; do
> +	args+=('-f' "${verb}=2")
> +done
> +for verb in getfattr listfattr; do
> +	args+=('-f' "${verb}=3")
> +done
> +for verb in attr_remove removefattr; do
> +	args+=('-f' "${verb}=4")
> +done
> +args+=('-f' "setfattr=20")
> +args+=('-f' "attr_set=60")	# sets larger xattrs
> +
> +$FSSTRESS_PROG "${args[@]}" $FSSTRESS_AVOID -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus >> $seqres.full
> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/724.out b/tests/generic/724.out
> new file mode 100644
> index 00000000..164cfffb
> --- /dev/null
> +++ b/tests/generic/724.out
> @@ -0,0 +1,2 @@
> +QA output created by 724
> +Silence is golden.
> 


* Re: [PATCH 1/2] generic: test xattr operations only
  2021-07-25 16:02   ` Eryu Guan
@ 2021-07-26 17:01     ` Darrick J. Wong
  0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2021-07-26 17:01 UTC (permalink / raw)
  To: Eryu Guan; +Cc: guaneryu, linux-xfs, fstests

On Mon, Jul 26, 2021 at 12:02:45AM +0800, Eryu Guan wrote:
> On Mon, Jul 19, 2021 at 06:08:48PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Exercise extended attribute operations.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  tests/generic/724     |   57 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/724.out |    2 ++
> >  2 files changed, 59 insertions(+)
> >  create mode 100755 tests/generic/724
> >  create mode 100644 tests/generic/724.out
> > 
> > 
> > diff --git a/tests/generic/724 b/tests/generic/724
> > new file mode 100755
> > index 00000000..f2f4a2ec
> > --- /dev/null
> > +++ b/tests/generic/724
> > @@ -0,0 +1,57 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test No. 724
> > +#
> > +# Run an extended attributes fsstress run with multiple threads to shake out
> > +# bugs in the xattr code.
> > +#
> > +. ./common/preamble
> > +_begin_fstest auto attr
> > +
> > +_cleanup()
> > +{
> > +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> > +	cd /
> > +	rm -f $tmp.*
> > +}
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> > +
> > +_require_scratch
> > +_require_command "$KILLALL_PROG" "killall"
> > +
> > +echo "Silence is golden."
> > +
> > +_scratch_mkfs > $seqres.full 2>&1
> > +_scratch_mount >> $seqres.full 2>&1
> > +
> > +nr_cpus=$((LOAD_FACTOR * 4))
> > +nr_ops=$((700000 * nr_cpus * TIME_FACTOR))
> 
> This takes too long to be an 'auto' test; it has been running for 20
> minutes on my test VM and still isn't done, so I think I'll have to kill
> it manually.
> 
> I noticed that generic/52[12] run long-running fsx tests, and they are in
> the 'soak long_rw' groups; perhaps this one fits there as well? And maybe
> the 'stress' group too.

Oh, yikes, I sent this out configured for 700,000 xattr ops.  Let me
lower that to 70k.
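
As a rough sketch of what that does to the count handed to fsstress -n:
this assumes LOAD_FACTOR and TIME_FACTOR are both 1 and that only the
700000 constant changes, and nr_ops_for is a made-up helper name used
just for this illustration.

```shell
#!/bin/bash
# nr_ops_for BASE LOAD_FACTOR TIME_FACTOR
# Hypothetical helper mirroring the nr_cpus/nr_ops arithmetic in the test.
nr_ops_for() {
	local base="$1" load_factor="$2" time_factor="$3"
	local nr_cpus=$((load_factor * 4))

	echo $((base * nr_cpus * time_factor))
}

# Assumed factors of 1; real runs may set LOAD_FACTOR/TIME_FACTOR higher.
echo "as posted: $(nr_ops_for 700000 1 1)"
echo "lowered:   $(nr_ops_for 70000 1 1)"
```

Either way the result still scales with both factors, so a larger
TIME_FACTOR lengthens the run proportionally.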

--D

> 
> Thanks,
> Eryu
> 
> > +
> > +args=('-z' '-S' 'c')
> > +
> > +# Do some directory tree modifications, but the bulk of this is geared towards
> > +# exercising the xattr code, especially attr_set which can do up to 10k values.
> > +for verb in unlink rmdir; do
> > +	args+=('-f' "${verb}=1")
> > +done
> > +for verb in creat mkdir; do
> > +	args+=('-f' "${verb}=2")
> > +done
> > +for verb in getfattr listfattr; do
> > +	args+=('-f' "${verb}=3")
> > +done
> > +for verb in attr_remove removefattr; do
> > +	args+=('-f' "${verb}=4")
> > +done
> > +args+=('-f' "setfattr=20")
> > +args+=('-f' "attr_set=60")	# sets larger xattrs
> > +
> > +$FSSTRESS_PROG "${args[@]}" $FSSTRESS_AVOID -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus >> $seqres.full
> > +
> > +# success, all done
> > +status=0
> > +exit
> > diff --git a/tests/generic/724.out b/tests/generic/724.out
> > new file mode 100644
> > index 00000000..164cfffb
> > --- /dev/null
> > +++ b/tests/generic/724.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 724
> > +Silence is golden.
> > 


* Re: [PATCH 2/2] generic: test shutdowns of a nested filesystem
  2021-08-22 11:18   ` Eryu Guan
@ 2021-08-22 17:23     ` Darrick J. Wong
  0 siblings, 0 replies; 11+ messages in thread
From: Darrick J. Wong @ 2021-08-22 17:23 UTC (permalink / raw)
  To: Eryu Guan; +Cc: guaneryu, linux-xfs, fstests

On Sun, Aug 22, 2021 at 07:18:49PM +0800, Eryu Guan wrote:
> On Tue, Aug 17, 2021 at 04:53:25PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > generic/475, but we're running fsstress on a disk image inside the
> > scratch filesystem
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  common/rc             |   20 +++++++
> >  tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/725.out |    2 +
> >  3 files changed, 158 insertions(+)
> >  create mode 100755 tests/generic/725
> >  create mode 100644 tests/generic/725.out
> > 
> > 
> > diff --git a/common/rc b/common/rc
> > index 84757fc1..473bfb0a 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -631,6 +631,26 @@ _ext4_metadump()
> >  		$DUMP_COMPRESSOR -f "$dumpfile" &>> "$seqres.full"
> >  }
> >  
> > +# Capture the metadata of a filesystem in a dump file for offline analysis
> > +_metadump_dev() {
> > +	local device="$1"
> > +	local dumpfile="$2"
> > +	local compressopt="$3"
> > +
> > +	case "$FSTYP" in
> > +	ext*)
> > +		_ext4_metadump $device $dumpfile $compressopt
> > +		;;
> > +	xfs)
> > +		_xfs_metadump $dumpfile $device none $compressopt
> > +		;;
> > +	*)
> > +		echo "Don't know how to metadump $FSTYP"
> 
> This breaks tests on filesystems other than ext* and xfs. I think it's
> OK if we only want to use it in the failure path, but it would be better
> to describe the use case in the comment.

Ok, I'll make a note of that in the comment.

"Capture the metadata of a filesystem in a dump file for offline
analysis.  Not all filesystems support this, so this function should
only be used to capture information about a previous test failure."

> And I'm wondering if we should honor DUMP_CORRUPT_FS, and only do the
> dump when it's set.

Yes.  Will fix that in the next release.
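
Roughly what that gating could look like, as a sketch only and not the
final patch: _maybe_metadump_dev is a hypothetical wrapper name,
_metadump_dev is stubbed out so the snippet runs outside fstests, and
treating any non-empty DUMP_CORRUPT_FS value as "set" is an assumption.

```shell
#!/bin/bash
# Stub standing in for the real _metadump_dev helper in common/rc, so this
# sketch can run on its own.
_metadump_dev() {
	echo "dumping $1 to $2"
}

# Hypothetical wrapper: skip the dump unless DUMP_CORRUPT_FS is non-empty
# (assumption about how the variable is interpreted).
_maybe_metadump_dev() {
	test -n "$DUMP_CORRUPT_FS" || return 0
	_metadump_dev "$@"
}

# Variable empty: nothing is dumped and the call still succeeds.
DUMP_CORRUPT_FS= _maybe_metadump_dev /dev/loop0 /tmp/loop.md

# Variable set: the (stubbed) dump runs.
DUMP_CORRUPT_FS=1 _maybe_metadump_dev /dev/loop0 /tmp/loop.md
```

Returning 0 when the variable is unset keeps the skipped dump from
looking like a second failure in the caller's error path.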

--D

> Thanks,
> Eryu


* Re: [PATCH 2/2] generic: test shutdowns of a nested filesystem
  2021-08-17 23:53 ` [PATCH 2/2] generic: test shutdowns of a nested filesystem Darrick J. Wong
  2021-08-18  7:06   ` Zorro Lang
@ 2021-08-22 11:18   ` Eryu Guan
  2021-08-22 17:23     ` Darrick J. Wong
  1 sibling, 1 reply; 11+ messages in thread
From: Eryu Guan @ 2021-08-22 11:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Aug 17, 2021 at 04:53:25PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> generic/475, but we're running fsstress on a disk image inside the
> scratch filesystem
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  common/rc             |   20 +++++++
>  tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/725.out |    2 +
>  3 files changed, 158 insertions(+)
>  create mode 100755 tests/generic/725
>  create mode 100644 tests/generic/725.out
> 
> 
> diff --git a/common/rc b/common/rc
> index 84757fc1..473bfb0a 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -631,6 +631,26 @@ _ext4_metadump()
>  		$DUMP_COMPRESSOR -f "$dumpfile" &>> "$seqres.full"
>  }
>  
> +# Capture the metadata of a filesystem in a dump file for offline analysis
> +_metadump_dev() {
> +	local device="$1"
> +	local dumpfile="$2"
> +	local compressopt="$3"
> +
> +	case "$FSTYP" in
> +	ext*)
> +		_ext4_metadump $device $dumpfile $compressopt
> +		;;
> +	xfs)
> +		_xfs_metadump $dumpfile $device none $compressopt
> +		;;
> +	*)
> +		echo "Don't know how to metadump $FSTYP"

This breaks tests on filesystems other than ext* and xfs. I think it's
OK if we only want to use it in the failure path, but it would be better
to describe the use case in the comment.

And I'm wondering if we should honor DUMP_CORRUPT_FS, and only do the
dump when it's set.

Thanks,
Eryu


* Re: [PATCH 2/2] generic: test shutdowns of a nested filesystem
  2021-08-18 15:55     ` Darrick J. Wong
@ 2021-08-18 17:18       ` Zorro Lang
  0 siblings, 0 replies; 11+ messages in thread
From: Zorro Lang @ 2021-08-18 17:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests, guan

On Wed, Aug 18, 2021 at 08:55:26AM -0700, Darrick J. Wong wrote:
> On Wed, Aug 18, 2021 at 03:06:54PM +0800, Zorro Lang wrote:
> > On Tue, Aug 17, 2021 at 04:53:25PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > generic/475, but we're running fsstress on a disk image inside the
> > > scratch filesystem
> > > 
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > ---
> > 
> > > Looks good to me, thanks for this helpful test case. Just one
> > > question: is it better to use xfs_metadump with the "-o" option by
> > > default?
> 
> _xfs_metadump already passes -a and -o.

Oh, sorry, I didn't notice this line:

test -z "$options" && options="-a -o".
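
For anyone else who misses it, that is the usual shell idiom for a
default value. A standalone sketch, where show_metadump_options is a
made-up function name and echo stands in for the real xfs_metadump call:

```shell
#!/bin/bash
# Hypothetical stand-in for a helper that accepts optional flags.
show_metadump_options() {
	local options="$1"

	# Fall back to "-a -o" only when the caller passed nothing.
	test -z "$options" && options="-a -o"
	echo "xfs_metadump $options"
}

show_metadump_options        # no argument, so the default flags apply
show_metadump_options "-g"   # caller-supplied flags replace the default
```

So callers that pass no options already get -a and -o without asking
for them.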

> 
> --D
> 
> > Reviewed-by: Zorro Lang <zlang@redhat.com>
> > 
> > >  common/rc             |   20 +++++++
> > >  tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
> > >  tests/generic/725.out |    2 +
> > >  3 files changed, 158 insertions(+)
> > >  create mode 100755 tests/generic/725
> > >  create mode 100644 tests/generic/725.out
> > > 
> > > 
> > > diff --git a/common/rc b/common/rc
> > > index 84757fc1..473bfb0a 100644
> > > --- a/common/rc
> > > +++ b/common/rc
> > > @@ -631,6 +631,26 @@ _ext4_metadump()
> > >  		$DUMP_COMPRESSOR -f "$dumpfile" &>> "$seqres.full"
> > >  }
> > >  
> > > +# Capture the metadata of a filesystem in a dump file for offline analysis
> > > +_metadump_dev() {
> > > +	local device="$1"
> > > +	local dumpfile="$2"
> > > +	local compressopt="$3"
> > > +
> > > +	case "$FSTYP" in
> > > +	ext*)
> > > +		_ext4_metadump $device $dumpfile $compressopt
> > > +		;;
> > > +	xfs)
> > > +		_xfs_metadump $dumpfile $device none $compressopt
> > > +		;;
> > > +	*)
> > > +		echo "Don't know how to metadump $FSTYP"
> > > +		return 1
> > > +		;;
> > > +	esac
> > > +}
> > > +
> > >  _test_mkfs()
> > >  {
> > >      case $FSTYP in
> > > diff --git a/tests/generic/725 b/tests/generic/725
> > > new file mode 100755
> > > index 00000000..ac008fdb
> > > --- /dev/null
> > > +++ b/tests/generic/725
> > > @@ -0,0 +1,136 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
> > > +#
> > > +# FS QA Test No. 725
> > > +#
> > > +# Test nested log recovery with repeated (simulated) disk failures.  We kick
> > > +# off fsstress on a loopback filesystem mounted on the scratch fs, then switch
> > > +# out the underlying scratch device with dm-error to see what happens when the
> > > +# disk goes down.  Having taken down both fses in this manner, remount them and
> > > +# repeat.  This test simulates VM hosts crashing to try to shake out CoW bugs
> > > +# in writeback on the host that cause VM guests to fail to recover.
> > > +#
> > > +. ./common/preamble
> > > +_begin_fstest shutdown auto log metadata eio recoveryloop
> > > +
> > > +_cleanup()
> > > +{
> > > +	cd /
> > > +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> > > +	wait
> > > +	if [ -n "$loopmnt" ]; then
> > > +		$UMOUNT_PROG $loopmnt 2>/dev/null
> > > +		rm -r -f $loopmnt
> > > +	fi
> > > +	rm -f $tmp.*
> > > +	_dmerror_unmount
> > > +	_dmerror_cleanup
> > > +}
> > > +
> > > +# Import common functions.
> > > +. ./common/dmerror
> > > +. ./common/reflink
> > > +
> > > +# Modify as appropriate.
> > > +_supported_fs generic
> > > +
> > > +_require_scratch_reflink
> > > +_require_cp_reflink
> > > +_require_dm_target error
> > > +_require_command "$KILLALL_PROG" "killall"
> > > +
> > > +echo "Silence is golden."
> > > +
> > > +_scratch_mkfs >> $seqres.full 2>&1
> > > +_require_metadata_journaling $SCRATCH_DEV
> > > +_dmerror_init
> > > +_dmerror_mount
> > > +
> > > +# Create a fs image consuming 1/3 of the scratch fs
> > > +scratch_freesp_bytes=$(_get_available_space $SCRATCH_MNT)
> > > +loopimg_bytes=$((scratch_freesp_bytes / 3))
> > > +
> > > +loopimg=$SCRATCH_MNT/testfs
> > > +truncate -s $loopimg_bytes $loopimg
> > > +_mkfs_dev $loopimg
> > > +
> > > +loopmnt=$tmp.mount
> > > +mkdir -p $loopmnt
> > > +
> > > +scratch_aliveflag=$tmp.runsnap
> > > +snap_aliveflag=$tmp.snapping
> > > +
> > > +snap_loop_fs() {
> > > +	touch "$snap_aliveflag"
> > > +	while [ -e "$scratch_aliveflag" ]; do
> > > +		rm -f $loopimg.a
> > > +		_cp_reflink $loopimg $loopimg.a
> > > +		sleep 1
> > > +	done
> > > +	rm -f "$snap_aliveflag"
> > > +}
> > > +
> > > +fsstress=($FSSTRESS_PROG $FSSTRESS_AVOID -d "$loopmnt" -n 999999 -p "$((LOAD_FACTOR * 4))")
> > > +
> > > +for i in $(seq 1 $((25 * TIME_FACTOR)) ); do
> > > +	touch $scratch_aliveflag
> > > +	snap_loop_fs >> $seqres.full 2>&1 &
> > > +
> > > +	if ! _mount $loopimg $loopmnt -o loop; then
> > > +		rm -f $scratch_aliveflag
> > > +		_metadump_dev $loopimg $seqres.loop.$i.md
> > > +		_fail "iteration $i loopimg mount failed"
> > > +		break
> > > +	fi
> > > +
> > > +	("${fsstress[@]}" >> $seqres.full &) > /dev/null 2>&1
> > > +
> > > +	# purposely include 0 second sleeps to test shutdown immediately after
> > > +	# recovery
> > > +	sleep $((RANDOM % (3 * TIME_FACTOR) ))
> > > +	rm -f $scratch_aliveflag
> > > +
> > > +	# This test aims to simulate sudden disk failure, which means that we
> > > +	# do not want to quiesce the filesystem or otherwise give it a chance
> > > +	# to flush its logs.  Therefore we want to call dmsetup with the
> > > +	# --nolockfs parameter; to make this happen we must call the load
> > > +	# error table helper *without* 'lockfs'.
> > > +	_dmerror_load_error_table
> > > +
> > > +	ps -e | grep fsstress > /dev/null 2>&1
> > > +	while [ $? -eq 0 ]; do
> > > +		$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> > > +		wait > /dev/null 2>&1
> > > +		ps -e | grep fsstress > /dev/null 2>&1
> > > +	done
> > > +	for ((i = 0; i < 10; i++)); do
> > > +		test -e "$snap_aliveflag" || break
> > > +		sleep 1
> > > +	done
> > > +
> > > +	# Mount again to replay log after loading working table, so we have a
> > > +	# consistent fs after test.
> > > +	$UMOUNT_PROG $loopmnt
> > > +	_dmerror_unmount || _fail "iteration $i scratch unmount failed"
> > > +	_dmerror_load_working_table
> > > +	if ! _dmerror_mount; then
> > > +		_metadump_dev $DMERROR_DEV $seqres.scratch.$i.md
> > > +		_fail "iteration $i scratch mount failed"
> > > +	fi
> > > +done
> > > +
> > > +# Make sure the fs image file is ok
> > > +if [ -f "$loopimg" ]; then
> > > +	if _mount $loopimg $loopmnt -o loop; then
> > > +		$UMOUNT_PROG $loopmnt &> /dev/null
> > > +	else
> > > +		_metadump_dev $DMERROR_DEV $seqres.scratch.final.md
> > > +		echo "final scratch mount failed"
> > > +	fi
> > > +	SCRATCH_RTDEV= SCRATCH_LOGDEV= _check_scratch_fs $loopimg
> > > +fi
> > > +
> > > +# success, all done; let the test harness check the scratch fs
> > > +status=0
> > > +exit
> > > diff --git a/tests/generic/725.out b/tests/generic/725.out
> > > new file mode 100644
> > > index 00000000..ed73a9fc
> > > --- /dev/null
> > > +++ b/tests/generic/725.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 725
> > > +Silence is golden.
> > > 
> > 
> 



* Re: [PATCH 2/2] generic: test shutdowns of a nested filesystem
  2021-08-18  7:06   ` Zorro Lang
@ 2021-08-18 15:55     ` Darrick J. Wong
  2021-08-18 17:18       ` Zorro Lang
  0 siblings, 1 reply; 11+ messages in thread
From: Darrick J. Wong @ 2021-08-18 15:55 UTC (permalink / raw)
  To: guaneryu, linux-xfs, fstests, guan

On Wed, Aug 18, 2021 at 03:06:54PM +0800, Zorro Lang wrote:
> On Tue, Aug 17, 2021 at 04:53:25PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > generic/475, but we're running fsstress on a disk image inside the
> > scratch filesystem
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> 
> Looks good to me, thanks for this helpful test case. Just one question:
> is it better to use xfs_metadump with the "-o" option by default?

_xfs_metadump already passes -a and -o.

--D

> Reviewed-by: Zorro Lang <zlang@redhat.com>
> 
> >  common/rc             |   20 +++++++
> >  tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/generic/725.out |    2 +
> >  3 files changed, 158 insertions(+)
> >  create mode 100755 tests/generic/725
> >  create mode 100644 tests/generic/725.out
> > 
> > 
> > diff --git a/common/rc b/common/rc
> > index 84757fc1..473bfb0a 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -631,6 +631,26 @@ _ext4_metadump()
> >  		$DUMP_COMPRESSOR -f "$dumpfile" &>> "$seqres.full"
> >  }
> >  
> > +# Capture the metadata of a filesystem in a dump file for offline analysis
> > +_metadump_dev() {
> > +	local device="$1"
> > +	local dumpfile="$2"
> > +	local compressopt="$3"
> > +
> > +	case "$FSTYP" in
> > +	ext*)
> > +		_ext4_metadump $device $dumpfile $compressopt
> > +		;;
> > +	xfs)
> > +		_xfs_metadump $dumpfile $device none $compressopt
> > +		;;
> > +	*)
> > +		echo "Don't know how to metadump $FSTYP"
> > +		return 1
> > +		;;
> > +	esac
> > +}
> > +
> >  _test_mkfs()
> >  {
> >      case $FSTYP in
> > diff --git a/tests/generic/725 b/tests/generic/725
> > new file mode 100755
> > index 00000000..ac008fdb
> > --- /dev/null
> > +++ b/tests/generic/725
> > @@ -0,0 +1,136 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
> > +#
> > +# FS QA Test No. 725
> > +#
> > +# Test nested log recovery with repeated (simulated) disk failures.  We kick
> > +# off fsstress on a loopback filesystem mounted on the scratch fs, then switch
> > +# out the underlying scratch device with dm-error to see what happens when the
> > +# disk goes down.  Having taken down both fses in this manner, remount them and
> > +# repeat.  This test simulates VM hosts crashing to try to shake out CoW bugs
> > +# in writeback on the host that cause VM guests to fail to recover.
> > +#
> > +. ./common/preamble
> > +_begin_fstest shutdown auto log metadata eio recoveryloop
> > +
> > +_cleanup()
> > +{
> > +	cd /
> > +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> > +	wait
> > +	if [ -n "$loopmnt" ]; then
> > +		$UMOUNT_PROG $loopmnt 2>/dev/null
> > +		rm -r -f $loopmnt
> > +	fi
> > +	rm -f $tmp.*
> > +	_dmerror_unmount
> > +	_dmerror_cleanup
> > +}
> > +
> > +# Import common functions.
> > +. ./common/dmerror
> > +. ./common/reflink
> > +
> > +# Modify as appropriate.
> > +_supported_fs generic
> > +
> > +_require_scratch_reflink
> > +_require_cp_reflink
> > +_require_dm_target error
> > +_require_command "$KILLALL_PROG" "killall"
> > +
> > +echo "Silence is golden."
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_require_metadata_journaling $SCRATCH_DEV
> > +_dmerror_init
> > +_dmerror_mount
> > +
> > +# Create a fs image consuming 1/3 of the scratch fs
> > +scratch_freesp_bytes=$(_get_available_space $SCRATCH_MNT)
> > +loopimg_bytes=$((scratch_freesp_bytes / 3))
> > +
> > +loopimg=$SCRATCH_MNT/testfs
> > +truncate -s $loopimg_bytes $loopimg
> > +_mkfs_dev $loopimg
> > +
> > +loopmnt=$tmp.mount
> > +mkdir -p $loopmnt
> > +
> > +scratch_aliveflag=$tmp.runsnap
> > +snap_aliveflag=$tmp.snapping
> > +
> > +snap_loop_fs() {
> > +	touch "$snap_aliveflag"
> > +	while [ -e "$scratch_aliveflag" ]; do
> > +		rm -f $loopimg.a
> > +		_cp_reflink $loopimg $loopimg.a
> > +		sleep 1
> > +	done
> > +	rm -f "$snap_aliveflag"
> > +}
> > +
> > +fsstress=($FSSTRESS_PROG $FSSTRESS_AVOID -d "$loopmnt" -n 999999 -p "$((LOAD_FACTOR * 4))")
> > +
> > +for i in $(seq 1 $((25 * TIME_FACTOR)) ); do
> > +	touch $scratch_aliveflag
> > +	snap_loop_fs >> $seqres.full 2>&1 &
> > +
> > +	if ! _mount $loopimg $loopmnt -o loop; then
> > +		rm -f $scratch_aliveflag
> > +		_metadump_dev $loopimg $seqres.loop.$i.md
> > +		_fail "iteration $i loopimg mount failed"
> > +		break
> > +	fi
> > +
> > +	("${fsstress[@]}" >> $seqres.full &) > /dev/null 2>&1
> > +
> > +	# purposely include 0 second sleeps to test shutdown immediately after
> > +	# recovery
> > +	sleep $((RANDOM % (3 * TIME_FACTOR) ))
> > +	rm -f $scratch_aliveflag
> > +
> > +	# This test aims to simulate sudden disk failure, which means that we
> > +	# do not want to quiesce the filesystem or otherwise give it a chance
> > +	# to flush its logs.  Therefore we want to call dmsetup with the
> > +	# --nolockfs parameter; to make this happen we must call the load
> > +	# error table helper *without* 'lockfs'.
> > +	_dmerror_load_error_table
> > +
> > +	ps -e | grep fsstress > /dev/null 2>&1
> > +	while [ $? -eq 0 ]; do
> > +		$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> > +		wait > /dev/null 2>&1
> > +		ps -e | grep fsstress > /dev/null 2>&1
> > +	done
> > +	for ((j = 0; j < 10; j++)); do
> > +		test -e "$snap_aliveflag" || break
> > +		sleep 1
> > +	done
> > +
> > +	# Mount again to replay log after loading working table, so we have a
> > +	# consistent fs after test.
> > +	$UMOUNT_PROG $loopmnt
> > +	_dmerror_unmount || _fail "iteration $i scratch unmount failed"
> > +	_dmerror_load_working_table
> > +	if ! _dmerror_mount; then
> > +		_metadump_dev $DMERROR_DEV $seqres.scratch.$i.md
> > +		_fail "iteration $i scratch mount failed"
> > +	fi
> > +done
> > +
> > +# Make sure the fs image file is ok
> > +if [ -f "$loopimg" ]; then
> > +	if _mount $loopimg $loopmnt -o loop; then
> > +		$UMOUNT_PROG $loopmnt &> /dev/null
> > +	else
> > +		_metadump_dev $DMERROR_DEV $seqres.scratch.final.md
> > +		echo "final scratch mount failed"
> > +	fi
> > +	SCRATCH_RTDEV= SCRATCH_LOGDEV= _check_scratch_fs $loopimg
> > +fi
> > +
> > +# success, all done; let the test harness check the scratch fs
> > +status=0
> > +exit
> > diff --git a/tests/generic/725.out b/tests/generic/725.out
> > new file mode 100644
> > index 00000000..ed73a9fc
> > --- /dev/null
> > +++ b/tests/generic/725.out
> > @@ -0,0 +1,2 @@
> > +QA output created by 725
> > +Silence is golden.
> > 
> 

* Re: [PATCH 2/2] generic: test shutdowns of a nested filesystem
  2021-08-17 23:53 ` [PATCH 2/2] generic: test shutdowns of a nested filesystem Darrick J. Wong
@ 2021-08-18  7:06   ` Zorro Lang
  2021-08-18 15:55     ` Darrick J. Wong
  2021-08-22 11:18   ` Eryu Guan
  1 sibling, 1 reply; 11+ messages in thread
From: Zorro Lang @ 2021-08-18  7:06 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests, guan

On Tue, Aug 17, 2021 at 04:53:25PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Like generic/475, but run fsstress on a loop-mounted disk image that
> lives inside the scratch filesystem.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---

Looks good to me, thanks for this helpful test case. Just one question:
would it be better to run xfs_metadump with the "-o" option by default?

Reviewed-by: Zorro Lang <zlang@redhat.com>
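
For context on the question above: xfs_metadump obfuscates file names and
extended attribute names by default, and its -o flag turns that obfuscation
off, which tends to make a dump easier to read during triage. A minimal
sketch of a wrapper that passes -o unless told otherwise might look like
this. The function build_metadump_cmd, its third argument, and the device
and dump paths are hypothetical names used only for illustration; none of
them are part of fstests.

```shell
#!/bin/bash
# Hypothetical helper (not part of fstests): print the xfs_metadump command
# line that would be used for a given device and dump file.
build_metadump_cmd() {
	local device="$1"
	local dumpfile="$2"
	local obfuscate="${3:-no}"	# assumed default: leave names readable
	local opts=""

	# xfs_metadump -o disables obfuscation of file names and xattrs.
	[ "$obfuscate" = "yes" ] || opts="-o"

	# $opts is intentionally unquoted so an empty value adds no argument.
	echo xfs_metadump $opts "$device" "$dumpfile"
}

# Placeholder device and dump paths, for illustration only.
build_metadump_cmd /dev/sdX1 /tmp/scratch.md
build_metadump_cmd /dev/sdX1 /tmp/scratch.md yes
```

The sketch only prints the command instead of running it, so it can be
tried without a real XFS device.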

>  common/rc             |   20 +++++++
>  tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/725.out |    2 +
>  3 files changed, 158 insertions(+)
>  create mode 100755 tests/generic/725
>  create mode 100644 tests/generic/725.out
> 
> 
> diff --git a/common/rc b/common/rc
> index 84757fc1..473bfb0a 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -631,6 +631,26 @@ _ext4_metadump()
>  		$DUMP_COMPRESSOR -f "$dumpfile" &>> "$seqres.full"
>  }
>  
> +# Capture the metadata of a filesystem in a dump file for offline analysis
> +_metadump_dev() {
> +	local device="$1"
> +	local dumpfile="$2"
> +	local compressopt="$3"
> +
> +	case "$FSTYP" in
> +	ext*)
> +		_ext4_metadump $device $dumpfile $compressopt
> +		;;
> +	xfs)
> +		_xfs_metadump $dumpfile $device none $compressopt
> +		;;
> +	*)
> +		echo "Don't know how to metadump $FSTYP"
> +		return 1
> +		;;
> +	esac
> +}
> +
>  _test_mkfs()
>  {
>      case $FSTYP in
> diff --git a/tests/generic/725 b/tests/generic/725
> new file mode 100755
> index 00000000..ac008fdb
> --- /dev/null
> +++ b/tests/generic/725
> @@ -0,0 +1,136 @@
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
> +#
> +# FS QA Test No. 725
> +#
> +# Test nested log recovery with repeated (simulated) disk failures.  We kick
> +# off fsstress on a loopback filesystem mounted on the scratch fs, then switch
> +# out the underlying scratch device with dm-error to see what happens when the
> +# disk goes down.  Having taken down both fses in this manner, remount them and
> +# repeat.  This test simulates VM hosts crashing to try to shake out CoW bugs
> +# in writeback on the host that cause VM guests to fail to recover.
> +#
> +. ./common/preamble
> +_begin_fstest shutdown auto log metadata eio recoveryloop
> +
> +_cleanup()
> +{
> +	cd /
> +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> +	wait
> +	if [ -n "$loopmnt" ]; then
> +		$UMOUNT_PROG $loopmnt 2>/dev/null
> +		rm -r -f $loopmnt
> +	fi
> +	rm -f $tmp.*
> +	_dmerror_unmount
> +	_dmerror_cleanup
> +}
> +
> +# Import common functions.
> +. ./common/dmerror
> +. ./common/reflink
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +
> +_require_scratch_reflink
> +_require_cp_reflink
> +_require_dm_target error
> +_require_command "$KILLALL_PROG" "killall"
> +
> +echo "Silence is golden."
> +
> +_scratch_mkfs >> $seqres.full 2>&1
> +_require_metadata_journaling $SCRATCH_DEV
> +_dmerror_init
> +_dmerror_mount
> +
> +# Create a fs image consuming 1/3 of the scratch fs
> +scratch_freesp_bytes=$(_get_available_space $SCRATCH_MNT)
> +loopimg_bytes=$((scratch_freesp_bytes / 3))
> +
> +loopimg=$SCRATCH_MNT/testfs
> +truncate -s $loopimg_bytes $loopimg
> +_mkfs_dev $loopimg
> +
> +loopmnt=$tmp.mount
> +mkdir -p $loopmnt
> +
> +scratch_aliveflag=$tmp.runsnap
> +snap_aliveflag=$tmp.snapping
> +
> +snap_loop_fs() {
> +	touch "$snap_aliveflag"
> +	while [ -e "$scratch_aliveflag" ]; do
> +		rm -f $loopimg.a
> +		_cp_reflink $loopimg $loopimg.a
> +		sleep 1
> +	done
> +	rm -f "$snap_aliveflag"
> +}
> +
> +fsstress=($FSSTRESS_PROG $FSSTRESS_AVOID -d "$loopmnt" -n 999999 -p "$((LOAD_FACTOR * 4))")
> +
> +for i in $(seq 1 $((25 * TIME_FACTOR)) ); do
> +	touch $scratch_aliveflag
> +	snap_loop_fs >> $seqres.full 2>&1 &
> +
> +	if ! _mount $loopimg $loopmnt -o loop; then
> +		rm -f $scratch_aliveflag
> +		_metadump_dev $loopimg $seqres.loop.$i.md
> +		_fail "iteration $i loopimg mount failed"
> +		break
> +	fi
> +
> +	("${fsstress[@]}" >> $seqres.full &) > /dev/null 2>&1
> +
> +	# purposely include 0 second sleeps to test shutdown immediately after
> +	# recovery
> +	sleep $((RANDOM % (3 * TIME_FACTOR) ))
> +	rm -f $scratch_aliveflag
> +
> +	# This test aims to simulate sudden disk failure, which means that we
> +	# do not want to quiesce the filesystem or otherwise give it a chance
> +	# to flush its logs.  Therefore we want to call dmsetup with the
> +	# --nolockfs parameter; to make this happen we must call the load
> +	# error table helper *without* 'lockfs'.
> +	_dmerror_load_error_table
> +
> +	ps -e | grep fsstress > /dev/null 2>&1
> +	while [ $? -eq 0 ]; do
> +		$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> +		wait > /dev/null 2>&1
> +		ps -e | grep fsstress > /dev/null 2>&1
> +	done
> +	for ((j = 0; j < 10; j++)); do
> +		test -e "$snap_aliveflag" || break
> +		sleep 1
> +	done
> +
> +	# Mount again to replay log after loading working table, so we have a
> +	# consistent fs after test.
> +	$UMOUNT_PROG $loopmnt
> +	_dmerror_unmount || _fail "iteration $i scratch unmount failed"
> +	_dmerror_load_working_table
> +	if ! _dmerror_mount; then
> +		_metadump_dev $DMERROR_DEV $seqres.scratch.$i.md
> +		_fail "iteration $i scratch mount failed"
> +	fi
> +done
> +
> +# Make sure the fs image file is ok
> +if [ -f "$loopimg" ]; then
> +	if _mount $loopimg $loopmnt -o loop; then
> +		$UMOUNT_PROG $loopmnt &> /dev/null
> +	else
> +		_metadump_dev $DMERROR_DEV $seqres.scratch.final.md
> +		echo "final scratch mount failed"
> +	fi
> +	SCRATCH_RTDEV= SCRATCH_LOGDEV= _check_scratch_fs $loopimg
> +fi
> +
> +# success, all done; let the test harness check the scratch fs
> +status=0
> +exit
> diff --git a/tests/generic/725.out b/tests/generic/725.out
> new file mode 100644
> index 00000000..ed73a9fc
> --- /dev/null
> +++ b/tests/generic/725.out
> @@ -0,0 +1,2 @@
> +QA output created by 725
> +Silence is golden.
> 


* [PATCH 2/2] generic: test shutdowns of a nested filesystem
  2021-08-17 23:53 [PATCHSET v2 0/2] fstests: exercise code refactored in 5.14 Darrick J. Wong
@ 2021-08-17 23:53 ` Darrick J. Wong
  2021-08-18  7:06   ` Zorro Lang
  2021-08-22 11:18   ` Eryu Guan
  0 siblings, 2 replies; 11+ messages in thread
From: Darrick J. Wong @ 2021-08-17 23:53 UTC (permalink / raw)
  To: djwong, guaneryu; +Cc: linux-xfs, fstests, guan

From: Darrick J. Wong <djwong@kernel.org>

Like generic/475, but run fsstress on a loop-mounted disk image that
lives inside the scratch filesystem.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 common/rc             |   20 +++++++
 tests/generic/725     |  136 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/725.out |    2 +
 3 files changed, 158 insertions(+)
 create mode 100755 tests/generic/725
 create mode 100644 tests/generic/725.out
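
The test coordinates its background snapshot loop with two flag files:
scratch_aliveflag tells snap_loop_fs to keep running, and snap_aliveflag
lets the main loop see when that worker has exited. The stand-alone sketch
below shows the handshake in isolation, assuming only a shell with mktemp
available. The names workdir, run_flag, busy_flag, and worker are
placeholders and do not appear in the patch, and the sleep stands in for
the reflink copy.

```shell
#!/bin/bash
# Stand-alone sketch of the two-flag handshake; no filesystems are touched.
workdir=$(mktemp -d)
run_flag=$workdir/run		# exists while the worker should keep going
busy_flag=$workdir/busy		# exists while the worker is still running

worker() {
	touch "$busy_flag"
	while [ -e "$run_flag" ]; do
		# Placeholder for the real work (the test reflinks the image).
		sleep 1
	done
	rm -f "$busy_flag"
}

touch "$run_flag"
worker &

sleep 1			# let the worker run for a moment
rm -f "$run_flag"	# ask the worker to stop

# Wait up to about 10 seconds for the worker to drop its own flag.
tries=0
while [ "$tries" -lt 10 ] && [ -e "$busy_flag" ]; do
	sleep 1
	tries=$((tries + 1))
done

if [ -e "$busy_flag" ]; then
	handshake="timeout"
else
	handshake="stopped"
fi
echo "worker $handshake"

wait			# reap the background worker before cleaning up
rm -rf "$workdir"
```

Polling a flag file rather than sending a signal lets the worker finish its
current pass before exiting, at the cost of up to one sleep interval of
latency.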


diff --git a/common/rc b/common/rc
index 84757fc1..473bfb0a 100644
--- a/common/rc
+++ b/common/rc
@@ -631,6 +631,26 @@ _ext4_metadump()
 		$DUMP_COMPRESSOR -f "$dumpfile" &>> "$seqres.full"
 }
 
+# Capture the metadata of a filesystem in a dump file for offline analysis
+_metadump_dev() {
+	local device="$1"
+	local dumpfile="$2"
+	local compressopt="$3"
+
+	case "$FSTYP" in
+	ext*)
+		_ext4_metadump $device $dumpfile $compressopt
+		;;
+	xfs)
+		_xfs_metadump $dumpfile $device none $compressopt
+		;;
+	*)
+		echo "Don't know how to metadump $FSTYP"
+		return 1
+		;;
+	esac
+}
+
 _test_mkfs()
 {
     case $FSTYP in
diff --git a/tests/generic/725 b/tests/generic/725
new file mode 100755
index 00000000..ac008fdb
--- /dev/null
+++ b/tests/generic/725
@@ -0,0 +1,136 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2021 Oracle, Inc.  All Rights Reserved.
+#
+# FS QA Test No. 725
+#
+# Test nested log recovery with repeated (simulated) disk failures.  We kick
+# off fsstress on a loopback filesystem mounted on the scratch fs, then switch
+# out the underlying scratch device with dm-error to see what happens when the
+# disk goes down.  Having taken down both fses in this manner, remount them and
+# repeat.  This test simulates VM hosts crashing to try to shake out CoW bugs
+# in writeback on the host that cause VM guests to fail to recover.
+#
+. ./common/preamble
+_begin_fstest shutdown auto log metadata eio recoveryloop
+
+_cleanup()
+{
+	cd /
+	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+	wait
+	if [ -n "$loopmnt" ]; then
+		$UMOUNT_PROG $loopmnt 2>/dev/null
+		rm -r -f $loopmnt
+	fi
+	rm -f $tmp.*
+	_dmerror_unmount
+	_dmerror_cleanup
+}
+
+# Import common functions.
+. ./common/dmerror
+. ./common/reflink
+
+# Modify as appropriate.
+_supported_fs generic
+
+_require_scratch_reflink
+_require_cp_reflink
+_require_dm_target error
+_require_command "$KILLALL_PROG" "killall"
+
+echo "Silence is golden."
+
+_scratch_mkfs >> $seqres.full 2>&1
+_require_metadata_journaling $SCRATCH_DEV
+_dmerror_init
+_dmerror_mount
+
+# Create a fs image consuming 1/3 of the scratch fs
+scratch_freesp_bytes=$(_get_available_space $SCRATCH_MNT)
+loopimg_bytes=$((scratch_freesp_bytes / 3))
+
+loopimg=$SCRATCH_MNT/testfs
+truncate -s $loopimg_bytes $loopimg
+_mkfs_dev $loopimg
+
+loopmnt=$tmp.mount
+mkdir -p $loopmnt
+
+scratch_aliveflag=$tmp.runsnap
+snap_aliveflag=$tmp.snapping
+
+snap_loop_fs() {
+	touch "$snap_aliveflag"
+	while [ -e "$scratch_aliveflag" ]; do
+		rm -f $loopimg.a
+		_cp_reflink $loopimg $loopimg.a
+		sleep 1
+	done
+	rm -f "$snap_aliveflag"
+}
+
+fsstress=($FSSTRESS_PROG $FSSTRESS_AVOID -d "$loopmnt" -n 999999 -p "$((LOAD_FACTOR * 4))")
+
+for i in $(seq 1 $((25 * TIME_FACTOR)) ); do
+	touch $scratch_aliveflag
+	snap_loop_fs >> $seqres.full 2>&1 &
+
+	if ! _mount $loopimg $loopmnt -o loop; then
+		rm -f $scratch_aliveflag
+		_metadump_dev $loopimg $seqres.loop.$i.md
+		_fail "iteration $i loopimg mount failed"
+		break
+	fi
+
+	("${fsstress[@]}" >> $seqres.full &) > /dev/null 2>&1
+
+	# purposely include 0 second sleeps to test shutdown immediately after
+	# recovery
+	sleep $((RANDOM % (3 * TIME_FACTOR) ))
+	rm -f $scratch_aliveflag
+
+	# This test aims to simulate sudden disk failure, which means that we
+	# do not want to quiesce the filesystem or otherwise give it a chance
+	# to flush its logs.  Therefore we want to call dmsetup with the
+	# --nolockfs parameter; to make this happen we must call the load
+	# error table helper *without* 'lockfs'.
+	_dmerror_load_error_table
+
+	ps -e | grep fsstress > /dev/null 2>&1
+	while [ $? -eq 0 ]; do
+		$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+		wait > /dev/null 2>&1
+		ps -e | grep fsstress > /dev/null 2>&1
+	done
+	for ((j = 0; j < 10; j++)); do
+		test -e "$snap_aliveflag" || break
+		sleep 1
+	done
+
+	# Mount again to replay log after loading working table, so we have a
+	# consistent fs after test.
+	$UMOUNT_PROG $loopmnt
+	_dmerror_unmount || _fail "iteration $i scratch unmount failed"
+	_dmerror_load_working_table
+	if ! _dmerror_mount; then
+		_metadump_dev $DMERROR_DEV $seqres.scratch.$i.md
+		_fail "iteration $i scratch mount failed"
+	fi
+done
+
+# Make sure the fs image file is ok
+if [ -f "$loopimg" ]; then
+	if _mount $loopimg $loopmnt -o loop; then
+		$UMOUNT_PROG $loopmnt &> /dev/null
+	else
+		_metadump_dev $DMERROR_DEV $seqres.scratch.final.md
+		echo "final scratch mount failed"
+	fi
+	SCRATCH_RTDEV= SCRATCH_LOGDEV= _check_scratch_fs $loopimg
+fi
+
+# success, all done; let the test harness check the scratch fs
+status=0
+exit
diff --git a/tests/generic/725.out b/tests/generic/725.out
new file mode 100644
index 00000000..ed73a9fc
--- /dev/null
+++ b/tests/generic/725.out
@@ -0,0 +1,2 @@
+QA output created by 725
+Silence is golden.


Thread overview: 11+ messages
2021-07-20  1:08 [PATCHSET 0/2] fstests: exercise code refactored in 5.14 Darrick J. Wong
2021-07-20  1:08 ` [PATCH 1/2] generic: test xattr operations only Darrick J. Wong
2021-07-25 16:02   ` Eryu Guan
2021-07-26 17:01     ` Darrick J. Wong
2021-07-20  1:08 ` [PATCH 2/2] generic: test shutdowns of a nested filesystem Darrick J. Wong
2021-08-17 23:53 [PATCHSET v2 0/2] fstests: exercise code refactored in 5.14 Darrick J. Wong
2021-08-17 23:53 ` [PATCH 2/2] generic: test shutdowns of a nested filesystem Darrick J. Wong
2021-08-18  7:06   ` Zorro Lang
2021-08-18 15:55     ` Darrick J. Wong
2021-08-18 17:18       ` Zorro Lang
2021-08-22 11:18   ` Eryu Guan
2021-08-22 17:23     ` Darrick J. Wong