* [PATCH 0/8] weekly fstests changes
@ 2017-12-13  6:03 Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 1/8] common/rc: report kmemleak errors Darrick J. Wong
                   ` (10 more replies)
  0 siblings, 11 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:03 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

Hi all,

Here's the usual weekly pile of fstests changes. :)

We start with the same patch as last time that scans kmemleak for leaks
after each test.

Then we move on to a fix in the scrub probe code because upstream
xfsprogs changed xfs_io syntax again.

Third is a patch that adds Unicode linedraw character tests to the two
tests that check that we can store arbitrary byte patterns without
screwing things up.

The fourth patch amends various xfs tests to deal with our gradual
removal of zero-alloc transactions, which can cause unexpected fs
shutdowns when free space nears zero.

The fifth patch adds a new test to invoke dm-error on an fsstress run;
this test simulates (from the filesystem's point of view) an internal
"yank all the disk cables out" test.

The sixth patch adds FICLONERANGE/FIDEDUPERANGE support to fsstress.
This was helpful in finding a bunch of unhandled corner cases in the xfs
reflink implementation.

Patch seven tests write-only fsstress to maximize testing of the write
paths, where coding mistakes cannot be backed out of so easily.

The final patch in the series fixes the dir/file counts that are
hardcoded in the xfs/068 golden output, since with fsstress doing
reflink now we might get a lot more "files" than we used to.

--D


* [PATCH 1/8] common/rc: report kmemleak errors
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
@ 2017-12-13  6:03 ` Darrick J. Wong
  2017-12-14  9:37   ` Eryu Guan
  2017-12-13  6:03 ` [PATCH 2/8] common/xfs: fix scrub support probing again Darrick J. Wong
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:03 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

If kmemleak is enabled, scan and report memory leaks after every test.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
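As a side note for anyone who wants to poke at the interface by hand,
this is roughly what the helpers below do, assuming a kernel built with
CONFIG_DEBUG_KMEMLEAK and debugfs mounted at /sys/kernel/debug (what
fstests calls $DEBUGFS_MNT):

    # ask kmemleak to scan now, dump whatever it found, then reset it
    echo scan > /sys/kernel/debug/kmemleak
    cat /sys/kernel/debug/kmemleak
    echo clear > /sys/kernel/debug/kmemleak
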
 check     |    2 ++
 common/rc |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)


diff --git a/check b/check
index b2d251a..469188e 100755
--- a/check
+++ b/check
@@ -497,6 +497,7 @@ _check_filesystems()
 	fi
 }
 
+_init_kmemleak
 _prepare_test_list
 
 if $OPTIONS_HAVE_SECTIONS; then
@@ -793,6 +794,7 @@ for section in $HOST_OPTIONS_SECTIONS; do
 		    n_try=`expr $n_try + 1`
 		    _check_filesystems
 		    _check_dmesg || err=true
+		    _check_kmemleak || err=true
 		fi
 
 	    fi
diff --git a/common/rc b/common/rc
index cb83918..a2bed36 100644
--- a/common/rc
+++ b/common/rc
@@ -3339,6 +3339,58 @@ _check_dmesg()
 	fi
 }
 
+# capture the kmemleak report
+_capture_kmemleak()
+{
+	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
+	local _leak_file="$1"
+
+	# Tell the kernel to scan for memory leaks.  Apparently the write
+	# returns before the scan is complete, so do it twice in the hopes
+	# that twice is enough to capture all the leaks.
+	echo "scan" > "${_kern_knob}"
+	cat "${_kern_knob}" > /dev/null
+	echo "scan" > "${_kern_knob}"
+	cat "${_kern_knob}" > "${_leak_file}"
+	echo "clear" > "${_kern_knob}"
+}
+
+# set up kmemleak
+_init_kmemleak()
+{
+	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
+
+	if [ ! -w "${_kern_knob}" ]; then
+		return 0
+	fi
+
+	# Disable the automatic scan so that we can control it completely,
+	# then dump all the leaks recorded so far.
+	echo "scan=off" > "${_kern_knob}"
+	_capture_kmemleak /dev/null
+}
+
+# check kmemleak log
+_check_kmemleak()
+{
+	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
+	local _leak_file="${seqres}.kmemleak"
+
+	if [ ! -w "${_kern_knob}" ]; then
+		return 0
+	fi
+
+	# Capture and report any leaks
+	_capture_kmemleak "${_leak_file}"
+	if [ -s "${_leak_file}" ]; then
+		_dump_err "_check_kmemleak: something found in kmemleak (see ${_leak_file})"
+		return 1
+	else
+		rm -f "${_leak_file}"
+		return 0
+	fi
+}
+
 # don't check dmesg log after test
 _disable_dmesg_check()
 {



* [PATCH 2/8] common/xfs: fix scrub support probing again
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 1/8] common/rc: report kmemleak errors Darrick J. Wong
@ 2017-12-13  6:03 ` Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 3/8] generic/45[34]: test line draw characters in file/attr names Darrick J. Wong
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:03 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

In the final version of the xfs_io scrub command we don't allow the
probe function to take any parameters, so fix the helper to abide by
that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
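For reference, the probing sequence the helper ends up doing looks
roughly like this (with /mnt/scratch standing in for whatever mountpoint
it is given):

    # userspace support: does this xfs_io know about scrub at all?
    xfs_io -c 'help scrub'
    # kernel support: the probe now takes no arguments; "Inappropriate
    # ioctl" in the output means the kernel can't scrub this filesystem
    xfs_io -c 'scrub probe' /mnt/scratch
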
 common/xfs |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/common/xfs b/common/xfs
index 76f94fb..f3e3764 100644
--- a/common/xfs
+++ b/common/xfs
@@ -314,7 +314,7 @@ _supports_xfs_scrub()
 
 	# Probe for kernel support...
 	$XFS_IO_PROG -c 'help scrub' 2>&1 | grep -q 'types are:.*probe' || return 1
-	$XFS_IO_PROG -c "scrub probe 0" "$mountpoint" 2>&1 | grep -q "Inappropriate ioctl" && return 1
+	$XFS_IO_PROG -c "scrub probe" "$mountpoint" 2>&1 | grep -q "Inappropriate ioctl" && return 1
 
 	# Scrub can't run on norecovery mounts
 	_fs_options "$device" | grep -q "norecovery" && return 1



* [PATCH 3/8] generic/45[34]: test line draw characters in file/attr names
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 1/8] common/rc: report kmemleak errors Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 2/8] common/xfs: fix scrub support probing again Darrick J. Wong
@ 2017-12-13  6:03 ` Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature Darrick J. Wong
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:03 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Try to draw a multiline rectangular outline in a file name and xattr
name, just to see if we can.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
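For the curious, the long escape string added below decodes (as UTF-8)
to "linedraw_", a newline, roughly the following five-line box, another
newline, and a ".txt" suffix:

    ╔═══════════╗
    ║ metatable ║
    ╟───────────╢
    ║ __index   ║
    ╚═══════════╝
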
 tests/generic/453 |    5 +++++
 tests/generic/454 |    5 +++++
 2 files changed, 10 insertions(+)


diff --git a/tests/generic/453 b/tests/generic/453
index 38adc8d..424412e 100755
--- a/tests/generic/453
+++ b/tests/generic/453
@@ -109,6 +109,9 @@ setf "urk\xc0\xafmoo" "FAKESLASH"
 # Emoji: octopus butterfly owl giraffe
 setf "emoji_\xf0\x9f\xa6\x91\xf0\x9f\xa6\x8b\xf0\x9f\xa6\x89\xf0\x9f\xa6\x92.txt" "octopus butterfly owl giraffe emoji"
 
+# Line draw characters, because why not?
+setf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
+
 ls -la $testdir >> $seqres.full
 
 echo "Test files"
@@ -130,6 +133,8 @@ testf "urk\xc0\xafmoo" "FAKESLASH"
 
 testf "emoji_\xf0\x9f\xa6\x91\xf0\x9f\xa6\x8b\xf0\x9f\xa6\x89\xf0\x9f\xa6\x92.txt" "octopus butterfly owl giraffe emoji"
 
+testf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
+
 echo "Uniqueness of inodes?"
 stat -c '%i' "${testdir}/"* | sort | uniq -c | while read nr inum; do
 	if [ "${nr}" -gt 1 ]; then
diff --git a/tests/generic/454 b/tests/generic/454
index f77ab77..24727ab 100755
--- a/tests/generic/454
+++ b/tests/generic/454
@@ -107,6 +107,9 @@ setf "urk\xc0\xafmoo" "FAKESLASH"
 # Emoji: octopus butterfly owl giraffe
 setf "emoji_\xf0\x9f\xa6\x91\xf0\x9f\xa6\x8b\xf0\x9f\xa6\x89\xf0\x9f\xa6\x92.txt" "octopus butterfly owl giraffe emoji"
 
+# Line draw characters, because why not?
+setf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
+
 $GETFATTR_PROG --absolute-names -d "${testfile}" >> $seqres.full
 
 echo "Test files"
@@ -128,6 +131,8 @@ testf "urk\xc0\xafmoo" "FAKESLASH"
 
 testf "emoji_\xf0\x9f\xa6\x91\xf0\x9f\xa6\x8b\xf0\x9f\xa6\x89\xf0\x9f\xa6\x92.txt" "octopus butterfly owl giraffe emoji"
 
+testf "\x6c\x69\x6e\x65\x64\x72\x61\x77\x5f\x0a\xe2\x95\x94\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x97\x0a\xe2\x95\x91\x20\x6d\x65\x74\x61\x74\x61\x62\x6c\x65\x20\xe2\x95\x91\x0a\xe2\x95\x9f\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x94\x80\xe2\x95\xa2\x0a\xe2\x95\x91\x20\x5f\x5f\x69\x6e\x64\x65\x78\x20\x20\x20\xe2\x95\x91\x0a\xe2\x95\x9a\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x90\xe2\x95\x9d\x0a.txt" "ugly box because we can"
+
 echo "Uniqueness of keys?"
 crazy_keys="$($GETFATTR_PROG --absolute-names -d "${testfile}" | egrep -c '(french_|chinese_|greek_|arabic_|urk)')"
 expected_keys=11



* [PATCH 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (2 preceding siblings ...)
  2017-12-13  6:03 ` [PATCH 3/8] generic/45[34]: test line draw characters in file/attr names Darrick J. Wong
@ 2017-12-13  6:03 ` Darrick J. Wong
  2017-12-13 22:12   ` Dave Chinner
  2017-12-13 22:45   ` [PATCH v2 " Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 5/8] generic: test error shutdown while stressing filesystem Darrick J. Wong
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:03 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

We're removing from XFS the ability to perform no-allocation file
creation.  This was added years ago because someone at SGI demanded that
we still be able to create (empty?) files with zero free blocks
remaining so long as there were free inodes and space in existing
directory blocks.  This came at an unacceptable risk of ENOSPC'ing
midway through a transaction and shutting down the fs, so we're removing
it for the create case.

However, some tests fail as a result, so fix them to be more flexible
about not failing when a dir/file creation fails due to ENOSPC.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
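The common theme of the fixes is to tolerate ENOSPC (and only ENOSPC)
and keep going.  A minimal sketch of that idea (not the test's own
filter_enospc helper, whose definition lives in xfs/013), with $src and
$dst as placeholder directories:

    # hide only the expected ENOSPC noise; any other error still lands
    # in the test output and trips the golden-output comparison
    cp -Rl "$src" "$dst" 2>&1 | grep -v "No space left on device"
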
 tests/xfs/013 |    6 ++++--
 tests/xfs/014 |    3 +++
 tests/xfs/104 |    2 +-
 tests/xfs/109 |    2 +-
 4 files changed, 9 insertions(+), 4 deletions(-)


diff --git a/tests/xfs/013 b/tests/xfs/013
index 80298ca..394b9bc 100755
--- a/tests/xfs/013
+++ b/tests/xfs/013
@@ -145,8 +145,10 @@ $FSSTRESS_PROG -d $SCRATCH_MNT/fsstress -n 9999999 -p 2 -S t \
 for i in $(seq 1 $LOOPS)
 do
 	# hard link the content of the current directory to the next
-	cp -Rl $SCRATCH_MNT/dir$i $SCRATCH_MNT/dir$((i+1)) 2>&1 | \
-		filter_enospc
+	while ! test -d $SCRATCH_MNT/dir$((i+1)); do
+		cp -Rl $SCRATCH_MNT/dir$i $SCRATCH_MNT/dir$((i+1)) 2>&1 | \
+			filter_enospc
+	done
 
 	# do a random replacement of files in the new directory
 	_rand_replace $SCRATCH_MNT/dir$((i+1)) $COUNT
diff --git a/tests/xfs/014 b/tests/xfs/014
index 875ab40..08cd001 100755
--- a/tests/xfs/014
+++ b/tests/xfs/014
@@ -112,6 +112,9 @@ _test_enospc()
 	# consume 1/2 of the current preallocation across the set of 4 writers
 	write_size=$((TOTAL_PREALLOC / 2 / 4))
 	for i in $(seq 0 3); do
+		touch $dir/file.$i
+	done
+	for i in $(seq 0 3); do
 		$XFS_IO_PROG -f -c "pwrite 0 $write_size" $dir/file.$i \
 			>> $seqres.full &
 	done
diff --git a/tests/xfs/104 b/tests/xfs/104
index 785027e..c3b5977 100755
--- a/tests/xfs/104
+++ b/tests/xfs/104
@@ -65,7 +65,7 @@ _stress_scratch()
 	# -w ensures that the only ops are ones which cause write I/O
 	FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -w -p $procs \
 	    -n $nops $FSSTRESS_AVOID`
-	$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full &
+	$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full 2>&1 &
 }
 
 # real QA test starts here
diff --git a/tests/xfs/109 b/tests/xfs/109
index e0fdec3..2625f15 100755
--- a/tests/xfs/109
+++ b/tests/xfs/109
@@ -79,7 +79,7 @@ allocate()
 			while [ $j -lt 100 ]; do
 				$XFS_IO_PROG -f -c 'pwrite -b 64k 0 16m' $file \
 					>/dev/null 2>&1
-				rm $file
+				test -e $file && rm $file
 				let j=$j+1
 			done
 		} &



* [PATCH 5/8] generic: test error shutdown while stressing filesystem
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (3 preceding siblings ...)
  2017-12-13  6:03 ` [PATCH 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature Darrick J. Wong
@ 2017-12-13  6:03 ` Darrick J. Wong
  2017-12-13  6:03 ` [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls Darrick J. Wong
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:03 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
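For background, swapping in the dm error target is conceptually what a
plain dmsetup session would do; a rough sketch, with the device name and
size as placeholders rather than whatever common/dmerror actually uses:

    sectors=$(blockdev --getsz /dev/mapper/error-test)
    dmsetup suspend error-test       # by default this also freezes the fs
    dmsetup load error-test --table "0 $sectors error"
    dmsetup resume error-test        # from now on every I/O gets an error
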
 tests/generic/932     |   97 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/932.out |    2 +
 tests/generic/group   |    1 +
 3 files changed, 100 insertions(+)
 create mode 100755 tests/generic/932
 create mode 100644 tests/generic/932.out


diff --git a/tests/generic/932 b/tests/generic/932
new file mode 100755
index 0000000..a440a75
--- /dev/null
+++ b/tests/generic/932
@@ -0,0 +1,97 @@
+#! /bin/bash
+# FS QA Test No. 932
+#
+# Test log recovery with repeated (simulated) disk failures.  We kick
+# off fsstress on the scratch fs, then switch out the underlying device
+# with dm-error to see what happens when the disk goes down.  Having
+# taken down the fs in this manner, remount it and repeat.  This test
+# is a Good Enough (tm) simulation of our internal multipath failure
+# testing efforts.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017 Oracle, Inc.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+	_dmerror_unmount
+	_dmerror_cleanup
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/dmerror
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+
+_require_scratch
+_require_dm_target error
+_require_command "$KILLALL_PROG" "killall"
+
+rm -f $seqres.full
+
+echo "Silence is golden."
+
+_scratch_mkfs >> $seqres.full 2>&1
+_require_metadata_journaling $SCRATCH_DEV
+_dmerror_init
+_dmerror_mount
+
+for i in $(seq 1 $((50 * TIME_FACTOR)) ); do
+	($FSSTRESS_PROG $FSSTRESS_AVOID -d $SCRATCH_MNT -n 999999 -p $((LOAD_FACTOR * 4)) >> $seqres.full &) \
+		> /dev/null 2>&1
+
+	# purposely include 0 second sleeps to test shutdown immediately after
+	# recovery
+	sleep $((RANDOM % 3))
+
+	# Load the error table without the "--nolockfs" option.  Since
+	# "--nolockfs" won't freeze the fs, some running I/Os could shut
+	# down XFS prematurely, and that's not what we want to test.
+	_dmerror_load_error_table
+
+	ps -e | grep fsstress > /dev/null 2>&1
+	while [ $? -eq 0 ]; do
+		$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+		wait > /dev/null 2>&1
+		ps -e | grep fsstress > /dev/null 2>&1
+	done
+
+	# Mount again to replay log after loading working table, so we have a
+	# consistent XFS after test.
+	_dmerror_unmount || _fail "unmount failed"
+	_dmerror_load_working_table
+	_dmerror_mount || _fail "mount failed"
+done
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/932.out b/tests/generic/932.out
new file mode 100644
index 0000000..1bb668c
--- /dev/null
+++ b/tests/generic/932.out
@@ -0,0 +1,2 @@
+QA output created by 932
+Silence is golden.
diff --git a/tests/generic/group b/tests/generic/group
index 6c3bb03..ff1ddc9 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -472,3 +472,4 @@
 467 auto quick exportfs
 468 shutdown auto quick metadata
 469 auto quick
+932 shutdown auto log metadata



* [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (4 preceding siblings ...)
  2017-12-13  6:03 ` [PATCH 5/8] generic: test error shutdown while stressing filesystem Darrick J. Wong
@ 2017-12-13  6:03 ` Darrick J. Wong
  2017-12-14  6:39   ` Amir Goldstein
  2017-12-15  2:07   ` [PATCH v2 " Darrick J. Wong
  2017-12-13  6:04 ` [PATCH 7/8] generic: run a long-soak write-only fsstress test Darrick J. Wong
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:03 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Mix it up a bit by reflinking and deduping data blocks when possible.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
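The two new ops drive FICLONERANGE and FIDEDUPERANGE directly from C;
for a manual feel of what they do, the rough xfs_io equivalents are
(file names and sizes here are just examples):

    # share src's first 1MiB with dst at offset 0 (FICLONERANGE)
    xfs_io -c "reflink src 0 0 1m" dst
    # dedupe the same range, but only if the contents already match
    # (FIDEDUPERANGE)
    xfs_io -c "dedupe src 0 0 1m" dst
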
 ltp/fsstress.c |  440 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 440 insertions(+)


diff --git a/ltp/fsstress.c b/ltp/fsstress.c
index 96f48b1..e2dfa5e 100644
--- a/ltp/fsstress.c
+++ b/ltp/fsstress.c
@@ -68,7 +68,9 @@ typedef enum {
 	OP_BULKSTAT,
 	OP_BULKSTAT1,
 	OP_CHOWN,
+	OP_CLONERANGE,
 	OP_CREAT,
+	OP_DEDUPERANGE,
 	OP_DREAD,
 	OP_DWRITE,
 	OP_FALLOCATE,
@@ -174,7 +176,9 @@ void	awrite_f(int, long);
 void	bulkstat_f(int, long);
 void	bulkstat1_f(int, long);
 void	chown_f(int, long);
+void	clonerange_f(int, long);
 void	creat_f(int, long);
+void	deduperange_f(int, long);
 void	dread_f(int, long);
 void	dwrite_f(int, long);
 void	fallocate_f(int, long);
@@ -221,7 +225,9 @@ opdesc_t	ops[] = {
 	{ OP_BULKSTAT, "bulkstat", bulkstat_f, 1, 0 },
 	{ OP_BULKSTAT1, "bulkstat1", bulkstat1_f, 1, 0 },
 	{ OP_CHOWN, "chown", chown_f, 3, 1 },
+	{ OP_CLONERANGE, "clonerange", clonerange_f, 4, 1 },
 	{ OP_CREAT, "creat", creat_f, 4, 1 },
+	{ OP_DEDUPERANGE, "deduperange", deduperange_f, 4, 1},
 	{ OP_DREAD, "dread", dread_f, 4, 0 },
 	{ OP_DWRITE, "dwrite", dwrite_f, 4, 1 },
 	{ OP_FALLOCATE, "fallocate", fallocate_f, 1, 1 },
@@ -1312,6 +1318,16 @@ make_freq_table(void)
 	}
 }
 
+void
+free_freq_table(void)
+{
+	if (!freq_table)
+		return;
+	free(freq_table);
+	freq_table = NULL;
+	freq_table_size = 0;
+}
+
 int
 mkdir_path(pathname_t *name, mode_t mode)
 {
@@ -2189,6 +2205,430 @@ chown_f(int opno, long r)
 	free_pathname(&f);
 }
 
+static void
+disable_op(opty_t opt)
+{
+	opdesc_t	*p;
+
+	for (p = ops; p < ops_end; p++) {
+		if (opt == p->op) {
+			p->freq = 0;
+			free_freq_table();
+			make_freq_table();
+			return;
+		}
+	}
+}
+
+/* reflink some arbitrary range of f1 to f2. */
+void
+clonerange_f(
+	int			opno,
+	long			r)
+{
+#ifdef FICLONERANGE
+	struct file_clone_range	fcr;
+	struct pathname		fpath1;
+	struct pathname		fpath2;
+	struct stat64		stat1;
+	struct stat64		stat2;
+	char			inoinfo1[1024];
+	char			inoinfo2[1024];
+	off64_t			lr;
+	off64_t			off1;
+	off64_t			off2;
+	size_t			len;
+	int			v1;
+	int			v2;
+	int			fd1;
+	int			fd2;
+	int			ret;
+	int			e;
+
+	/* Load paths */
+	init_pathname(&fpath1);
+	if (!get_fname(FT_REGm, r, &fpath1, NULL, NULL, &v1)) {
+		if (v1)
+			printf("%d/%d: clonerange read - no filename\n",
+				procid, opno);
+		goto out_fpath1;
+	}
+
+	init_pathname(&fpath2);
+	if (!get_fname(FT_REGm, random(), &fpath2, NULL, NULL, &v2)) {
+		if (v2)
+			printf("%d/%d: clonerange write - no filename\n",
+				procid, opno);
+		goto out_fpath2;
+	}
+
+	/* Open files */
+	fd1 = open_path(&fpath1, O_RDONLY);
+	e = fd1 < 0 ? errno : 0;
+	check_cwd();
+	if (fd1 < 0) {
+		if (v1)
+			printf("%d/%d: clonerange read - open %s failed %d\n",
+				procid, opno, fpath1.path, e);
+		goto out_fpath2;
+	}
+
+	fd2 = open_path(&fpath2, O_WRONLY);
+	e = fd2 < 0 ? errno : 0;
+	check_cwd();
+	if (fd2 < 0) {
+		if (v2)
+			printf("%d/%d: clonerange write - open %s failed %d\n",
+				procid, opno, fpath2.path, e);
+		goto out_fd1;
+	}
+
+	/* Get file stats */
+	if (fstat64(fd1, &stat1) < 0) {
+		if (v1)
+			printf("%d/%d: clonerange read - fstat64 %s failed %d\n",
+				procid, opno, fpath1.path, errno);
+		goto out_fd2;
+	}
+	inode_info(inoinfo1, sizeof(inoinfo1), &stat1, v1);
+
+	if (fstat64(fd2, &stat2) < 0) {
+		if (v2)
+			printf("%d/%d: clonerange write - fstat64 %s failed %d\n",
+				procid, opno, fpath2.path, errno);
+		goto out_fd2;
+	}
+	inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v2);
+
+	/* Calculate offsets */
+	len = (random() % FILELEN_MAX) + 1;
+	len &= ~(stat1.st_blksize - 1);
+	if (len == 0)
+		len = stat1.st_blksize;
+	if (len > stat1.st_size)
+		len = stat1.st_size;
+
+	lr = ((__int64_t)random() << 32) + random();
+	if (stat1.st_size == len)
+		off1 = 0;
+	else
+		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
+	off1 %= maxfsize;
+	off1 &= ~(stat1.st_blksize - 1);
+
+	/*
+	 * If srcfile == destfile, randomly generate destination ranges
+	 * until we find one that doesn't overlap the source range.
+	 */
+	do {
+		lr = ((__int64_t)random() << 32) + random();
+		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
+		off2 %= maxfsize;
+		off2 &= ~(stat2.st_blksize - 1);
+	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
+
+	/* Clone data blocks */
+	fcr.src_fd = fd1;
+	fcr.src_offset = off1;
+	fcr.src_length = len;
+	fcr.dest_offset = off2;
+
+	ret = ioctl(fd2, FICLONERANGE, &fcr);
+	if (ret < 0 && (errno == EOPNOTSUPP || errno == ENOTTY)) {
+		/*
+		 * If the fs doesn't support reflink, just stop running it.
+		 * Make a note of this in the log.
+		 */
+		printf("%d/%d: clonerange not supported\n", procid, opno);
+		disable_op(OP_CLONERANGE);
+		ret = 0;
+		goto out_fd2;
+	}
+	e = ret < 0 ? errno : 0;
+	if (v1 || v2 || ret < 0) {
+		printf("%d/%d: clonerange %s%s [%lld,%lld] -> %s%s [%lld,%lld]",
+			procid, opno,
+			fpath1.path, inoinfo1, (long long)off1, (long long)len,
+			fpath2.path, inoinfo2, (long long)off2, (long long)len);
+
+		if (ret < 0)
+			printf(" error %d", e);
+		printf("\n");
+	}
+
+out_fd2:
+	close(fd2);
+out_fd1:
+	close(fd1);
+out_fpath2:
+	free_pathname(&fpath2);
+out_fpath1:
+	free_pathname(&fpath1);
+#endif
+}
+
+/* dedupe some arbitrary range of f1 to f2...fn. */
+void
+deduperange_f(
+	int			opno,
+	long			r)
+{
+#ifdef FIDEDUPERANGE
+#define INFO_SZ			1024
+	struct file_dedupe_range *fdr;
+	struct pathname		*fpath;
+	struct stat64		*stat;
+	char			*info;
+	off64_t			*off;
+	int			*v;
+	int			*fd;
+	int			nr;
+	off64_t			lr;
+	size_t			len;
+	int			ret;
+	int			i;
+	int			e;
+
+	if (flist[FT_REG].nfiles < 2)
+		return;
+
+	/* Pick somewhere between 2 and 128 files. */
+	do {
+		nr = random() % (flist[FT_REG].nfiles + 1);
+	} while (nr < 2 || nr > 128);
+
+	/* Alloc memory */
+	fdr = malloc(nr * sizeof(struct file_dedupe_range_info) +
+		     sizeof(struct file_dedupe_range));
+	if (!fdr) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		return;
+	}
+	memset(fdr, 0, (nr * sizeof(struct file_dedupe_range_info) +
+			sizeof(struct file_dedupe_range)));
+
+	fpath = calloc(nr, sizeof(struct pathname));
+	if (!fpath) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_fdr;
+	}
+
+	stat = calloc(nr, sizeof(struct stat64));
+	if (!stat) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_paths;
+	}
+
+	info = calloc(nr, INFO_SZ);
+	if (!info) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_stats;
+	}
+
+	off = calloc(nr, sizeof(off64_t));
+	if (!off) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_info;
+	}
+
+	v = calloc(nr, sizeof(int));
+	if (!v) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_offsets;
+	}
+	fd = calloc(nr, sizeof(int));
+	if (!fd) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_v;
+	}
+	memset(fd, 0xFF, nr * sizeof(int));
+
+	/* Get paths for all files */
+	for (i = 0; i < nr; i++)
+		init_pathname(&fpath[i]);
+
+	if (!get_fname(FT_REGm, r, &fpath[0], NULL, NULL, &v[0])) {
+		if (v[0])
+			printf("%d/%d: deduperange read - no filename\n",
+				procid, opno);
+		goto out_pathnames;
+	}
+
+	for (i = 1; i < nr; i++) {
+		if (!get_fname(FT_REGm, random(), &fpath[i], NULL, NULL, &v[i])) {
+			if (v[i])
+				printf("%d/%d: deduperange write - no filename\n",
+					procid, opno);
+			goto out_pathnames;
+		}
+	}
+
+	/* Open files */
+	fd[0] = open_path(&fpath[0], O_RDONLY);
+	e = fd[0] < 0 ? errno : 0;
+	check_cwd();
+	if (fd[0] < 0) {
+		if (v[0])
+			printf("%d/%d: deduperange read - open %s failed %d\n",
+				procid, opno, fpath[0].path, e);
+		goto out_pathnames;
+	}
+
+	for (i = 1; i < nr; i++) {
+		fd[i] = open_path(&fpath[i], O_WRONLY);
+		e = fd[i] < 0 ? errno : 0;
+		check_cwd();
+		if (fd[i] < 0) {
+			if (v[i])
+				printf("%d/%d: deduperange write - open %s failed %d\n",
+					procid, opno, fpath[i].path, e);
+			goto out_fds;
+		}
+	}
+
+	/* Get file stats */
+	if (fstat64(fd[0], &stat[0]) < 0) {
+		if (v[0])
+			printf("%d/%d: deduperange read - fstat64 %s failed %d\n",
+				procid, opno, fpath[0].path, errno);
+		goto out_fds;
+	}
+
+	inode_info(&info[0], INFO_SZ, &stat[0], v[0]);
+
+	for (i = 1; i < nr; i++) {
+		if (fstat64(fd[i], &stat[i]) < 0) {
+			if (v[i])
+				printf("%d/%d: deduperange write - fstat64 %s failed %d\n",
+					procid, opno, fpath[i].path, errno);
+			goto out_fds;
+		}
+		inode_info(&info[i * INFO_SZ], INFO_SZ, &stat[i], v[i]);
+	}
+
+	/* Never try to dedupe more than half of the src file. */
+	len = (random() % FILELEN_MAX) + 1;
+	len &= ~(stat[0].st_blksize - 1);
+	if (len == 0)
+		len = stat[0].st_blksize / 2;
+	if (len > stat[0].st_size / 2)
+		len = stat[0].st_size / 2;
+
+	/* Calculate offsets */
+	lr = ((__int64_t)random() << 32) + random();
+	if (stat[0].st_size == len)
+		off[0] = 0;
+	else
+		off[0] = (off64_t)(lr % MIN(stat[0].st_size - len, MAXFSIZE));
+	off[0] %= maxfsize;
+	off[0] &= ~(stat[0].st_blksize - 1);
+
+	/*
+	 * If srcfile == destfile[i], randomly generate destination ranges
+	 * until we find one that doesn't overlap the source range.
+	 */
+	for (i = 1; i < nr; i++) {
+		int	tries = 0;
+
+		do {
+			lr = ((__int64_t)random() << 32) + random();
+			if (stat[i].st_size <= len)
+				off[i] = 0;
+			else
+				off[i] = (off64_t)(lr % MIN(stat[i].st_size - len, MAXFSIZE));
+			off[i] %= maxfsize;
+			off[i] &= ~(stat[i].st_blksize - 1);
+		} while (stat[0].st_ino == stat[i].st_ino &&
+			 llabs(off[i] - off[0]) < len &&
+			 tries++ < 10);
+	}
+
+	/* Clone data blocks */
+	fdr->src_offset = off[0];
+	fdr->src_length = len;
+	fdr->dest_count = nr - 1;
+	for (i = 1; i < nr; i++) {
+		fdr->info[i - 1].dest_fd = fd[i];
+		fdr->info[i - 1].dest_offset = off[i];
+	}
+
+	ret = ioctl(fd[0], FIDEDUPERANGE, fdr);
+	if ((ret < 0 && (errno == EOPNOTSUPP || errno == ENOTTY)) ||
+	    fdr->info[0].status == -EINVAL) {
+		/*
+		 * If the fs doesn't support reflink, just stop running it.
+		 * Make a note of this in the log.
+		 *
+		 * (Yes, we can indicate lack of support by returning -EINVAL
+		 * in the status field...)
+		 */
+		printf("%d/%d: deduperange not supported\n", procid, opno);
+		disable_op(OP_DEDUPERANGE);
+		ret = 0;
+		goto out_fds;
+	}
+	e = ret < 0 ? errno : 0;
+	if (v[0]) {
+		printf("%d/%d: deduperange from %s%s [%lld,%lld]",
+			procid, opno,
+			fpath[0].path, &info[0], (long long)off[0],
+			(long long)len);
+		if (ret < 0)
+			printf(" error %d", e);
+		printf("\n");
+	}
+	if (ret < 0)
+		goto out_fds;
+
+	for (i = 1; i < nr; i++) {
+		e = fdr->info[i - 1].status < 0 ? fdr->info[i - 1].status : 0;
+		if (v[i] || fdr->info[i - 1].status < 0) {
+			printf("%d/%d: ...to %s%s [%lld,%lld]",
+				procid, opno,
+				fpath[i].path, &info[i * INFO_SZ],
+				(long long)off[i], (long long)len);
+			if (fdr->info[i - 1].status < 0)
+				printf(" error %d", e);
+			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_SAME)
+				printf(" %llu bytes deduplicated",
+					fdr->info[i - 1].bytes_deduped);
+			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_DIFFERS)
+				printf(" differed");
+			printf("\n");
+		}
+	}
+
+out_fds:
+	for (i = 0; i < nr; i++)
+		if (fd[i] >= 0)
+			close(fd[i]);
+out_pathnames:
+	for (i = 0; i < nr; i++)
+		free_pathname(&fpath[i]);
+
+	free(fd);
+out_v:
+	free(v);
+out_offsets:
+	free(off);
+out_info:
+	free(info);
+out_stats:
+	free(stat);
+out_paths:
+	free(fpath);
+out_fdr:
+	free(fdr);
+#endif
+}
+
 void
 setxattr_f(int opno, long r)
 {



* [PATCH 7/8] generic: run a long-soak write-only fsstress test
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (5 preceding siblings ...)
  2017-12-13  6:03 ` [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls Darrick J. Wong
@ 2017-12-13  6:04 ` Darrick J. Wong
  2018-01-07 15:34   ` Eryu Guan
  2017-12-13  6:04 ` [PATCH 8/8] xfs/068: fix variability problems in file/dir count output Darrick J. Wong
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:04 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Let a lot of writes soak in with multithreaded fsstress to look for bugs
and other problems.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
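As a concrete example of the scaling, assuming the usual fstests
defaults of LOAD_FACTOR=1 and TIME_FACTOR=1, the test below boils down
to:

    nr_cpus=$((1 * 4))          # 4 fsstress processes
    nr_ops=$((25000 * 4 * 1))   # 100000 operations
    fsstress -w -d $SCRATCH_MNT -n 100000 -p 4 -v
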
 tests/generic/933     |   64 +++++++++++++++++++++++++++++++++++++++++++++++++
 tests/generic/933.out |    2 ++
 tests/generic/group   |    1 +
 3 files changed, 67 insertions(+)
 create mode 100755 tests/generic/933
 create mode 100644 tests/generic/933.out


diff --git a/tests/generic/933 b/tests/generic/933
new file mode 100755
index 0000000..0cbd081
--- /dev/null
+++ b/tests/generic/933
@@ -0,0 +1,64 @@
+#! /bin/bash
+# FS QA Test No. 933
+#
+# Run an all-writes fsstress run with multiple threads to shake out
+# bugs in the write path.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017 Oracle, Inc.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+
+# Modify as appropriate.
+_supported_fs generic
+_supported_os Linux
+
+_require_scratch
+_require_command "$KILLALL_PROG" "killall"
+
+rm -f $seqres.full
+
+echo "Silence is golden."
+
+_scratch_mkfs > $seqres.full 2>&1
+_scratch_mount >> $seqres.full 2>&1
+
+nr_cpus=$((LOAD_FACTOR * 4))
+nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
+$FSSTRESS_PROG $FSSTRESS_AVOID -w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus -v >> $seqres.full
+
+# success, all done
+status=0
+exit
diff --git a/tests/generic/933.out b/tests/generic/933.out
new file mode 100644
index 0000000..758765d
--- /dev/null
+++ b/tests/generic/933.out
@@ -0,0 +1,2 @@
+QA output created by 933
+Silence is golden.
diff --git a/tests/generic/group b/tests/generic/group
index ff1ddc9..3c4fff5 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -473,3 +473,4 @@
 468 shutdown auto quick metadata
 469 auto quick
 932 shutdown auto log metadata
+933 auto rw clone



* [PATCH 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (6 preceding siblings ...)
  2017-12-13  6:04 ` [PATCH 7/8] generic: run a long-soak write-only fsstress test Darrick J. Wong
@ 2017-12-13  6:04 ` Darrick J. Wong
  2017-12-13 22:20   ` Dave Chinner
                     ` (3 more replies)
  2017-12-15  8:55 ` [PATCH 0/8] weekly fstests changes Eryu Guan
                   ` (2 subsequent siblings)
  10 siblings, 4 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13  6:04 UTC (permalink / raw)
  To: eguan, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

In this test we use fsstress to create some number of files and then
exercise xfsdump/xfsrestore on them.  Depending on the fsstress config
we may end up with a different number of files than is hardcoded in the
golden output (particularly after adding reflink support to fsstress)
and thereby fail the test.  Since we're not really testing how many
files fsstress can create, just turn the counts into XXX/YYY.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/068     |    4 +++-
 tests/xfs/068.out |    2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)


diff --git a/tests/xfs/068 b/tests/xfs/068
index 7151e28..119b204 100755
--- a/tests/xfs/068
+++ b/tests/xfs/068
@@ -44,7 +44,9 @@ _supported_fs xfs
 _supported_os Linux
 
 _create_dumpdir_stress_num 4096
-_do_dump_restore
+# Fancy filtering here because fsstress doesn't always create the
+# same number of files (depending on the fs geometry)
+_do_dump_restore | sed -e 's/xfsrestore: [0-9]* directories and [0-9]* entries/xfsrestore: XXX directories and YYY entries/g'
 
 # success, all done
 exit
diff --git a/tests/xfs/068.out b/tests/xfs/068.out
index fa3a552..f53c555 100644
--- a/tests/xfs/068.out
+++ b/tests/xfs/068.out
@@ -22,7 +22,7 @@ xfsrestore: session id: ID
 xfsrestore: media ID: ID
 xfsrestore: searching media for directory dump
 xfsrestore: reading directories
-xfsrestore: 383 directories and 1335 entries processed
+xfsrestore: XXX directories and YYY entries processed
 xfsrestore: directory post-processing
 xfsrestore: restoring non-directory files
 xfsrestore: restore complete: SECS seconds elapsed



* Re: [PATCH 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature
  2017-12-13  6:03 ` [PATCH 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature Darrick J. Wong
@ 2017-12-13 22:12   ` Dave Chinner
  2017-12-13 22:45   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 0 replies; 55+ messages in thread
From: Dave Chinner @ 2017-12-13 22:12 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Tue, Dec 12, 2017 at 10:03:42PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> We're removing from XFS the ability to perform no-allocation file
> creation.  This was added years ago because someone at SGI demanded that
> we still be able to create (empty?) files with zero free blocks

Just to correct the record, it wasn't SGI that demanded this - it
was implemented by SGI a long, long time ago (~1998) in response to
customer demands.

> remaining so long as there were free inodes and space in existing
> directory blocks.  This came at an unacceptable risk of ENOSPC'ing
> midway through a transaction and shutting down the fs, so we're removing
> it for the create case.

Well, 20 years later we consider it an unacceptable risk. :P

> However, some tests fail as a result, so fix them to be more flexible
> about not failing when a dir/file creation fails due to ENOSPC.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks fine.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13  6:04 ` [PATCH 8/8] xfs/068: fix variability problems in file/dir count output Darrick J. Wong
@ 2017-12-13 22:20   ` Dave Chinner
  2017-12-13 22:23     ` Darrick J. Wong
  2017-12-13 23:28   ` [PATCH v2 8/8] xfs/068: fix clonerange " Darrick J. Wong
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 55+ messages in thread
From: Dave Chinner @ 2017-12-13 22:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Tue, Dec 12, 2017 at 10:04:11PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In this test we use fsstress to create some number of files and then
> exercise xfsdump/xfsrestore on them.  Depending on the fsstress config
> we may end up with a different number of files than is hardcoded in the
> golden output (particularly after adding reflink support to fsstress)
> and thereby fail the test.  Since we're not really testing how many
> files fsstress can create, just turn the counts into XXX/YYY.

Hmmmm. those numbers were in the golden output specifically because
fsstress is supposed to be deterministic for a given random seed.
What it is supposed to be testing is that xfsdump actually dumped
all the files that were created, and xfs-restore was able to process
them all. If either barf on a file, they'll silently skip it, and
the numbers won't come out properly.

The typical class of bug this test finds is bulkstat iteration
problems - if bulkstat misses an inode it shouldn't, then the
xfsrestore numbers come out wrong. By making the data set
non-deterministic and not checking the numbers, we end up losing the
ability of this test to check bulkstat iteration and dump/restore
completeness....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13 22:20   ` Dave Chinner
@ 2017-12-13 22:23     ` Darrick J. Wong
  2017-12-13 22:45       ` Dave Chinner
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13 22:23 UTC (permalink / raw)
  To: Dave Chinner; +Cc: eguan, linux-xfs, fstests

On Thu, Dec 14, 2017 at 09:20:46AM +1100, Dave Chinner wrote:
> On Tue, Dec 12, 2017 at 10:04:11PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > In this test we use fsstress to create some number of files and then
> > exercise xfsdump/xfsrestore on them.  Depending on the fsstress config
> > we may end up with a different number of files than is hardcoded in the
> > golden output (particularly after adding reflink support to fsstress)
> > and thereby fail the test.  Since we're not really testing how many
> > files fsstress can create, just turn the counts into XXX/YYY.
> 
> Hmmmm. those numbers were in the golden output specifically because
> fsstress is supposed to be deterministic for a given random seed.
> What it is supposed to be testing is that xfsdump actually dumped
> all the files that were created, and xfs-restore was able to process
> them all. If either barf on a file, they'll silently skip it, and
> the numbers won't come out properly.
> 
> The typical class of bug this test finds is bulkstat iteration
> problems - if bulkstat misses an inode it shouldn't, then the
> xfsrestore numbers come out wrong. By making the data set
> non-deterministic and not checking the numbers, we end up losing the
> ability of this test to check bulkstat iteration and dump/restore
> completeness....

Ah, fun.  Ok, in that case I think the correct fix for this problem is
to turn off clonerange/deduperange in the fsstress command line so that
we get back to deterministic(?) counts...

...unless a better solution is to count the number of dirs/files and
compare that to whatever xfsrestore says?

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* [PATCH v2 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature
  2017-12-13  6:03 ` [PATCH 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature Darrick J. Wong
  2017-12-13 22:12   ` Dave Chinner
@ 2017-12-13 22:45   ` Darrick J. Wong
  1 sibling, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13 22:45 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

We're removing from XFS the ability to perform no-allocation file
creation.  This was added years ago because some customer of SGI
demanded that we still be able to create (empty?) files with zero free
blocks remaining so long as there were free inodes and space in existing
directory blocks.  This came at an unacceptable risk of ENOSPC'ing
midway through a transaction and shutting down the fs, so we're removing
it for the create case, having changed our minds 20 years later.

However, some tests fail as a result, so fix them to be more flexible
about not failing when a dir/file creation fails due to ENOSPC.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
v2: fix commit message
---
 tests/xfs/013 |    6 ++++--
 tests/xfs/014 |    3 +++
 tests/xfs/104 |    2 +-
 tests/xfs/109 |    2 +-
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/tests/xfs/013 b/tests/xfs/013
index 80298ca..394b9bc 100755
--- a/tests/xfs/013
+++ b/tests/xfs/013
@@ -145,8 +145,10 @@ $FSSTRESS_PROG -d $SCRATCH_MNT/fsstress -n 9999999 -p 2 -S t \
 for i in $(seq 1 $LOOPS)
 do
 	# hard link the content of the current directory to the next
-	cp -Rl $SCRATCH_MNT/dir$i $SCRATCH_MNT/dir$((i+1)) 2>&1 | \
-		filter_enospc
+	while ! test -d $SCRATCH_MNT/dir$((i+1)); do
+		cp -Rl $SCRATCH_MNT/dir$i $SCRATCH_MNT/dir$((i+1)) 2>&1 | \
+			filter_enospc
+	done
 
 	# do a random replacement of files in the new directory
 	_rand_replace $SCRATCH_MNT/dir$((i+1)) $COUNT
diff --git a/tests/xfs/014 b/tests/xfs/014
index 875ab40..08cd001 100755
--- a/tests/xfs/014
+++ b/tests/xfs/014
@@ -112,6 +112,9 @@ _test_enospc()
 	# consume 1/2 of the current preallocation across the set of 4 writers
 	write_size=$((TOTAL_PREALLOC / 2 / 4))
 	for i in $(seq 0 3); do
+		touch $dir/file.$i
+	done
+	for i in $(seq 0 3); do
 		$XFS_IO_PROG -f -c "pwrite 0 $write_size" $dir/file.$i \
 			>> $seqres.full &
 	done
diff --git a/tests/xfs/104 b/tests/xfs/104
index 785027e..c3b5977 100755
--- a/tests/xfs/104
+++ b/tests/xfs/104
@@ -65,7 +65,7 @@ _stress_scratch()
 	# -w ensures that the only ops are ones which cause write I/O
 	FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -w -p $procs \
 	    -n $nops $FSSTRESS_AVOID`
-	$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full &
+	$FSSTRESS_PROG $FSSTRESS_ARGS >> $seqres.full 2>&1 &
 }
 
 # real QA test starts here
diff --git a/tests/xfs/109 b/tests/xfs/109
index e0fdec3..2625f15 100755
--- a/tests/xfs/109
+++ b/tests/xfs/109
@@ -79,7 +79,7 @@ allocate()
 			while [ $j -lt 100 ]; do
 				$XFS_IO_PROG -f -c 'pwrite -b 64k 0 16m' $file \
 					>/dev/null 2>&1
-				rm $file
+				test -e $file && rm $file
 				let j=$j+1
 			done
 		} &


* Re: [PATCH 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13 22:23     ` Darrick J. Wong
@ 2017-12-13 22:45       ` Dave Chinner
  2017-12-13 23:17         ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Dave Chinner @ 2017-12-13 22:45 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Wed, Dec 13, 2017 at 02:23:52PM -0800, Darrick J. Wong wrote:
> On Thu, Dec 14, 2017 at 09:20:46AM +1100, Dave Chinner wrote:
> > On Tue, Dec 12, 2017 at 10:04:11PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > In this test we use fsstress to create some number of files and then
> > > exercise xfsdump/xfsrestore on them.  Depending on the fsstress config
> > > we may end up with a different number of files than is hardcoded in the
> > > golden output (particularly after adding reflink support to fsstress)
> > > and thereby fail the test.  Since we're not really testing how many
> > > files fsstress can create, just turn the counts into XXX/YYY.
> > 
> > Hmmmm. those numbers were in the golden output specifically because
> > fsstress is supposed to be deterministic for a given random seed.
> > What it is supposed to be testing is that xfsdump actually dumped
> > all the files that were created, and xfs-restore was able to process
> > them all. If either barf on a file, they'll silently skip it, and
> > the numbers won't come out properly.
> > 
> > The typical class of bug this test finds is bulkstat iteration
> > problems - if bulkstat misses an inode it shouldn't, then the
> > xfsrestore numbers come out wrong. By making the data set
> > non-deterministic and not checking the numbers, we end up losing the
> > ability of this test to check bulkstat iteration and dump/restore
> > completeness....
> 
> Ah, fun.  Ok, in that case I think the correct fix for this problem is
> to turn off clonerange/deduperange in the fsstress command line so that
> we get back to deterministic(?) counts...

Why aren't the clonerange/deduperange operations deterministic?
Shouldn't these always do the same thing from the POV of
xfsdump/restore?

> ...unless a better solution to count the number of dirs/files and compare
> to whatever xfsrestore says?

Haven't looked recently, but there were reasons for doing it this
way that I don't recall off the top of my head...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13 22:45       ` Dave Chinner
@ 2017-12-13 23:17         ` Darrick J. Wong
  2017-12-13 23:42           ` Dave Chinner
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13 23:17 UTC (permalink / raw)
  To: Dave Chinner; +Cc: eguan, linux-xfs, fstests

On Thu, Dec 14, 2017 at 09:45:53AM +1100, Dave Chinner wrote:
> On Wed, Dec 13, 2017 at 02:23:52PM -0800, Darrick J. Wong wrote:
> > On Thu, Dec 14, 2017 at 09:20:46AM +1100, Dave Chinner wrote:
> > > On Tue, Dec 12, 2017 at 10:04:11PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > In this test we use fsstress to create some number of files and then
> > > > exercise xfsdump/xfsrestore on them.  Depending on the fsstress config
> > > > we may end up with a different number of files than is hardcoded in the
> > > > golden output (particularly after adding reflink support to fsstress)
> > > > and thereby fail the test.  Since we're not really testing how many
> > > > files fsstress can create, just turn the counts into XXX/YYY.
> > > 
> > > Hmmmm. those numbers were in the golden output specifically because
> > > fsstress is supposed to be deterministic for a given random seed.
> > > What it is supposed to be testing is that xfsdump actually dumped
> > > all the files that were created, and xfs-restore was able to process
> > > them all. If either barf on a file, they'll silently skip it, and
> > > the numbers won't come out properly.
> > > 
> > > The typical class of bug this test finds is bulkstat iteration
> > > problems - if bulkstat misses an inode it shouldn't, then the
> > > xfsrestore numbers come out wrong. By making the data set
> > > non-deterministic and not checking the numbers, we end up losing the
> > > ability of this test to check bulkstat iteration and dump/restore
> > > completeness....
> > 
> > Ah, fun.  Ok, in that case I think the correct fix for this problem is
> > to turn off clonerange/deduperange in the fsstress command line so that
> > we get back to deterministic(?) counts...
> 
> Why aren't the clonerange/deduperange operations deterministic?
> Shouldn't these always do the same thing from the POV of
> xfsdump/restore?

The operations themselves are deterministic, but adding the two commands
for clone & dedupe changed the size of the ops table, which means that
fsstress pursues a different sequence of operations for a given nproc
and seed input.  Furthermore, the outcome of the operations will differ
depending on whether or not the xfs supports reflink, because a
clonerange that fails with EOPNOTSUPP causes the commands' frequency to
be zeroed in the command table.

--D

> > ...unless a better solution to count the number of dirs/files and compare
> > to whatever xfsrestore says?
> 
> Haven't looked recently, but there were reasons for doing it this
> way that I don't recall off the top of my head...
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com


* [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-13  6:04 ` [PATCH 8/8] xfs/068: fix variability problems in file/dir count output Darrick J. Wong
  2017-12-13 22:20   ` Dave Chinner
@ 2017-12-13 23:28   ` Darrick J. Wong
  2017-12-13 23:44     ` Dave Chinner
  2017-12-15  2:08   ` [PATCH v3 8/8] xfs/068: fix variability " Darrick J. Wong
  2017-12-15  2:17   ` [PATCH v4 " Darrick J. Wong
  3 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-13 23:28 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

In this test we use a fixed sequence of operations in fsstress to create
some number of files and dirs and then exercise xfsdump/xfsrestore on
them.  Since clonerange/deduperange are not supported on all xfs
configurations, detect if they're in fsstress and disable them so that
we always execute exactly the same sequence of operations no matter how
the filesystem is configured.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/068 |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tests/xfs/068 b/tests/xfs/068
index 7151e28..f95a539 100755
--- a/tests/xfs/068
+++ b/tests/xfs/068
@@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
 _supported_fs xfs
 _supported_os Linux
 
+# Remove fsstress commands that aren't supported on all xfs configs
+if $FSSTRESS_PROG | grep -q clonerange; then
+	FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
+fi
+if $FSSTRESS_PROG | grep -q deduperange; then
+	FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
+fi
+
 _create_dumpdir_stress_num 4096
 _do_dump_restore
 


* Re: [PATCH 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13 23:17         ` Darrick J. Wong
@ 2017-12-13 23:42           ` Dave Chinner
  0 siblings, 0 replies; 55+ messages in thread
From: Dave Chinner @ 2017-12-13 23:42 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Wed, Dec 13, 2017 at 03:17:45PM -0800, Darrick J. Wong wrote:
> On Thu, Dec 14, 2017 at 09:45:53AM +1100, Dave Chinner wrote:
> > On Wed, Dec 13, 2017 at 02:23:52PM -0800, Darrick J. Wong wrote:
> > > On Thu, Dec 14, 2017 at 09:20:46AM +1100, Dave Chinner wrote:
> > > > On Tue, Dec 12, 2017 at 10:04:11PM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > In this test we use fsstress to create some number of files and then
> > > > > exercise xfsdump/xfsrestore on them.  Depending on the fsstress config
> > > > > we may end up with a different number of files than is hardcoded in the
> > > > > golden output (particularly after adding reflink support to fsstress)
> > > > > and thereby fail the test.  Since we're not really testing how many
> > > > > files fsstress can create, just turn the counts into XXX/YYY.
> > > > 
> > > > Hmmmm. those numbers were in the golden output specifically because
> > > > fsstress is supposed to be deterministic for a given random seed.
> > > > What it is supposed to be testing is that xfsdump actually dumped
> > > > all the files that were created, and xfs-restore was able to process
> > > > them all. If either barf on a file, they'll silently skip it, and
> > > > the numbers won't come out properly.
> > > > 
> > > > The typical class of bug this test finds is bulkstat iteration
> > > > problems - if bulkstat misses an inode it shouldn't, then the
> > > > xfsrestore numbers come out wrong. By making the data set
> > > > non-deterministic and not checking the numbers, we end up losing the
> > > > ability of this test to check bulkstat iteration and dump/restore
> > > > completeness....
> > > 
> > > Ah, fun.  Ok, in that case I think the correct fix for this problem is
> > > to turn off clonerange/deduperange in the fsstress command line so that
> > > we get back to deterministic(?) counts...
> > 
> > Why aren't the clonerange/deduperange operations deterministic?
> > Shouldn't these always do the same thing from the POV of
> > xfsdump/restore?
> 
> The operations themselves are deterministic, but adding the two commands
> for clone & dedupe changed the size of the ops table, which means that
> fsstress pursues a different sequence of operations for a given nproc
> and seed input.  Furthermore, the outcome of the operations will differ
> depending on whether or not the xfs supports reflink, because a
> clonerange that fails with EOPNOTSUPP causes the commands' frequency to
> be zeroed in the command table.

Ah, ok. So it's the dynamic nature of newly supported operations
that causes the problems for this test, not that the options are
supported. Seems reasonable just to disable them for these tests,
then.

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-13 23:28   ` [PATCH v2 8/8] xfs/068: fix clonerange " Darrick J. Wong
@ 2017-12-13 23:44     ` Dave Chinner
  2017-12-14  6:52       ` Amir Goldstein
  0 siblings, 1 reply; 55+ messages in thread
From: Dave Chinner @ 2017-12-13 23:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In this test we use a fixed sequence of operations in fsstress to create
> some number of files and dirs and then exercise xfsdump/xfsrestore on
> them.  Since clonerange/deduperange are not supported on all xfs
> configurations, detect if they're in fsstress and disable them so that
> we always execute exactly the same sequence of operations no matter how
> the filesystem is configured.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  tests/xfs/068 |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tests/xfs/068 b/tests/xfs/068
> index 7151e28..f95a539 100755
> --- a/tests/xfs/068
> +++ b/tests/xfs/068
> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
>  _supported_fs xfs
>  _supported_os Linux
>  
> +# Remove fsstress commands that aren't supported on all xfs configs
> +if $FSSTRESS_PROG | grep -q clonerange; then
> +	FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
> +fi
> +if $FSSTRESS_PROG | grep -q deduperange; then
> +	FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
> +fi
> +

I'd put this inside _create_dumpdir_stress_num as it's supposed to
DTRT for the dump/restore that follows. Otherwise looks fine.
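
Something like this, maybe (untested sketch; _filter_unsupported_fsstress_ops
is a made-up name, the checks could equally just sit inline at the top of
_create_dumpdir_stress_num before fsstress is invoked):

        _filter_unsupported_fsstress_ops()
        {
                # keep the fsstress op stream identical no matter how
                # the xfs under test is configured
                if $FSSTRESS_PROG | grep -q clonerange; then
                        FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
                fi
                if $FSSTRESS_PROG | grep -q deduperange; then
                        FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
                fi
        }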

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls
  2017-12-13  6:03 ` [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls Darrick J. Wong
@ 2017-12-14  6:39   ` Amir Goldstein
  2017-12-14  7:32     ` Eryu Guan
  2017-12-15  2:07   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 1 reply; 55+ messages in thread
From: Amir Goldstein @ 2017-12-14  6:39 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Eryu Guan, linux-xfs, fstests

On Wed, Dec 13, 2017 at 8:03 AM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Mix it up a bit by reflinking and deduping data blocks when possible.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  ltp/fsstress.c |  440 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 440 insertions(+)
>
>
> diff --git a/ltp/fsstress.c b/ltp/fsstress.c
> index 96f48b1..e2dfa5e 100644
> --- a/ltp/fsstress.c
> +++ b/ltp/fsstress.c
> @@ -68,7 +68,9 @@ typedef enum {
>         OP_BULKSTAT,
>         OP_BULKSTAT1,
>         OP_CHOWN,
> +       OP_CLONERANGE,
>         OP_CREAT,
> +       OP_DEDUPERANGE,
>         OP_DREAD,
>         OP_DWRITE,
>         OP_FALLOCATE,
> @@ -174,7 +176,9 @@ void        awrite_f(int, long);
>  void   bulkstat_f(int, long);
>  void   bulkstat1_f(int, long);
>  void   chown_f(int, long);
> +void   clonerange_f(int, long);
>  void   creat_f(int, long);
> +void   deduperange_f(int, long);
>  void   dread_f(int, long);
>  void   dwrite_f(int, long);
>  void   fallocate_f(int, long);
> @@ -221,7 +225,9 @@ opdesc_t    ops[] = {
>         { OP_BULKSTAT, "bulkstat", bulkstat_f, 1, 0 },
>         { OP_BULKSTAT1, "bulkstat1", bulkstat1_f, 1, 0 },
>         { OP_CHOWN, "chown", chown_f, 3, 1 },
> +       { OP_CLONERANGE, "clonerange", clonerange_f, 4, 1 },
>         { OP_CREAT, "creat", creat_f, 4, 1 },
> +       { OP_DEDUPERANGE, "deduperange", deduperange_f, 4, 1},
>         { OP_DREAD, "dread", dread_f, 4, 0 },
>         { OP_DWRITE, "dwrite", dwrite_f, 4, 1 },
>         { OP_FALLOCATE, "fallocate", fallocate_f, 1, 1 },
> @@ -1312,6 +1318,16 @@ make_freq_table(void)
>         }
>  }
>
> +void
> +free_freq_table(void)
> +{
> +       if (!freq_table)
> +               return;
> +       free(freq_table);
> +       freq_table = NULL;
> +       freq_table_size = 0;
> +}
> +
>  int
>  mkdir_path(pathname_t *name, mode_t mode)
>  {
> @@ -2189,6 +2205,430 @@ chown_f(int opno, long r)
>         free_pathname(&f);
>  }
>
> +static void
> +disable_op(opty_t opt)
> +{
> +       opdesc_t        *p;
> +
> +       for (p = ops; p < ops_end; p++) {
> +               if (opt == p->op) {
> +                       p->freq = 0;
> +                       free_freq_table();
> +                       make_freq_table();
> +                       return;
> +               }
> +       }
> +}
> +

If we want to go down the path of disabling ops at runtime, the question is:
why disable_op clonerange/deduperange and not disable_op
insert/collapse/zero/punch? There are probably other ops as well.
This is also inconsistent in that build-time disabling of
FALLOC or CLONERANGE/DEDUPERANGE does not
change the random op sequence, while runtime disabling does.

This is related to the conversation of the golden output of xfs/068.
Will take my arguments there now...

Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-13 23:44     ` Dave Chinner
@ 2017-12-14  6:52       ` Amir Goldstein
  2017-12-14  7:37         ` Amir Goldstein
  2017-12-14  7:49         ` Eryu Guan
  0 siblings, 2 replies; 55+ messages in thread
From: Amir Goldstein @ 2017-12-14  6:52 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Darrick J. Wong, Eryu Guan, linux-xfs, fstests

On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
>> From: Darrick J. Wong <darrick.wong@oracle.com>
>>
>> In this test we use a fixed sequence of operations in fsstress to create
>> some number of files and dirs and then exercise xfsdump/xfsrestore on
>> them.  Since clonerange/deduperange are not supported on all xfs
>> configurations, detect if they're in fsstress and disable them so that
>> we always execute exactly the same sequence of operations no matter how
>> the filesystem is configured.
>>
>> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>> ---
>>  tests/xfs/068 |    8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/tests/xfs/068 b/tests/xfs/068
>> index 7151e28..f95a539 100755
>> --- a/tests/xfs/068
>> +++ b/tests/xfs/068
>> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
>>  _supported_fs xfs
>>  _supported_os Linux
>>
>> +# Remove fsstress commands that aren't supported on all xfs configs
>> +if $FSSTRESS_PROG | grep -q clonerange; then
>> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
>> +fi
>> +if $FSSTRESS_PROG | grep -q deduperange; then
>> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
>> +fi
>> +
>
> I'd put this inside _create_dumpdir_stress_num as it's supposed to
> DTRT for the dump/restore that follows. Otherwise looks fine.
>

Guys,

Please take a look at the only 2 changes in the history of this test.
I would like to make sure we are not in a loop:

5d36d85 xfs/068: update golden output due to new operations in fsstress
6e5194d fsstress: Add fallocate insert range operation

The first change excludes the new insert op (by dchinner on commit)
The second change re-includes insert op, does not exclude new
mread/mwrite ops and updates golden output, following this discussion:
https://marc.info/?l=fstests&m=149014697111838&w=2
(the referenced thread ends with a ? to Dave, but was followed by v6..v8
 that were "silently acked" by Dave).

I personally argued that the blacklist approach to xfs/068 is fragile, and
indeed this is the third time in the history I know of that the test has
broken because of newly added fsstress ops. Fine. As long as we at least
stay consistent with a decision about updating golden output vs. excluding
ops, and document the decision in a comment with the reasoning, so we won't
have to repeat this discussion next time.

Darrick,

IMO, we should follow the path of updating golden output and instead of
dropping clone/dedupe from ops table in runtime, you should make them
a noop or ignore the error, keeping the random sequence unchanged.
This is more or less what happens with insert/collapse (error is ignored)
already, so it would be weird to make exceptions.

For reference, fsx does disable insert/collapse/zero/punch at runtime
and that does change the random sequence of fsx.

Cheers,
Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls
  2017-12-14  6:39   ` Amir Goldstein
@ 2017-12-14  7:32     ` Eryu Guan
  2017-12-14 20:20       ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Eryu Guan @ 2017-12-14  7:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Darrick J. Wong, linux-xfs, fstests

On Thu, Dec 14, 2017 at 08:39:38AM +0200, Amir Goldstein wrote:
> On Wed, Dec 13, 2017 at 8:03 AM, Darrick J. Wong
> <darrick.wong@oracle.com> wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Mix it up a bit by reflinking and deduping data blocks when possible.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  ltp/fsstress.c |  440 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 440 insertions(+)
> >
> >
> > diff --git a/ltp/fsstress.c b/ltp/fsstress.c
> > index 96f48b1..e2dfa5e 100644
> > --- a/ltp/fsstress.c
> > +++ b/ltp/fsstress.c
> > @@ -68,7 +68,9 @@ typedef enum {
> >         OP_BULKSTAT,
> >         OP_BULKSTAT1,
> >         OP_CHOWN,
> > +       OP_CLONERANGE,
> >         OP_CREAT,
> > +       OP_DEDUPERANGE,
> >         OP_DREAD,
> >         OP_DWRITE,
> >         OP_FALLOCATE,
> > @@ -174,7 +176,9 @@ void        awrite_f(int, long);
> >  void   bulkstat_f(int, long);
> >  void   bulkstat1_f(int, long);
> >  void   chown_f(int, long);
> > +void   clonerange_f(int, long);
> >  void   creat_f(int, long);
> > +void   deduperange_f(int, long);
> >  void   dread_f(int, long);
> >  void   dwrite_f(int, long);
> >  void   fallocate_f(int, long);
> > @@ -221,7 +225,9 @@ opdesc_t    ops[] = {
> >         { OP_BULKSTAT, "bulkstat", bulkstat_f, 1, 0 },
> >         { OP_BULKSTAT1, "bulkstat1", bulkstat1_f, 1, 0 },
> >         { OP_CHOWN, "chown", chown_f, 3, 1 },
> > +       { OP_CLONERANGE, "clonerange", clonerange_f, 4, 1 },
> >         { OP_CREAT, "creat", creat_f, 4, 1 },
> > +       { OP_DEDUPERANGE, "deduperange", deduperange_f, 4, 1},
> >         { OP_DREAD, "dread", dread_f, 4, 0 },
> >         { OP_DWRITE, "dwrite", dwrite_f, 4, 1 },
> >         { OP_FALLOCATE, "fallocate", fallocate_f, 1, 1 },
> > @@ -1312,6 +1318,16 @@ make_freq_table(void)
> >         }
> >  }
> >
> > +void
> > +free_freq_table(void)
> > +{
> > +       if (!freq_table)
> > +               return;
> > +       free(freq_table);
> > +       freq_table = NULL;
> > +       freq_table_size = 0;
> > +}
> > +
> >  int
> >  mkdir_path(pathname_t *name, mode_t mode)
> >  {
> > @@ -2189,6 +2205,430 @@ chown_f(int opno, long r)
> >         free_pathname(&f);
> >  }
> >
> > +static void
> > +disable_op(opty_t opt)
> > +{
> > +       opdesc_t        *p;
> > +
> > +       for (p = ops; p < ops_end; p++) {
> > +               if (opt == p->op) {
> > +                       p->freq = 0;
> > +                       free_freq_table();
> > +                       make_freq_table();
> > +                       return;
> > +               }
> > +       }
> > +}
> > +
> 
> If we want to go down the path of disabling ops at runtime, the question is:
> why disable_op clonerange/deduperange and not disable_op
> insert/collapse/zero/punch? There are probably other ops as well.
> This is also inconsistent in that build-time disabling of
> FALLOC or CLONERANGE/DEDUPERANGE does not
> change the random op sequence, while runtime disabling does.

That's a good point. I don't think we want to disable unsupported
operations dynamically; fsstress accepts/ignores whatever result
the operation gives and just prints the errno to the log in verbose mode.
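
For example, running it single-threaded with a fixed seed in verbose mode
shows that behaviour directly (the directory and log destination here are
only for illustration):

        # failed ops just log their errno in the verbose output and
        # fsstress carries on with the next op in the sequence
        $FSSTRESS_PROG -v -d $SCRATCH_MNT/stress -p 1 -n 100 -s 1 >> $seqres.full 2>&1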

Thanks,
Eryu

> 
> This is related to the conversation of the golden output of xfs/068.
> Will take my arguments there now...
> 
> Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-14  6:52       ` Amir Goldstein
@ 2017-12-14  7:37         ` Amir Goldstein
  2017-12-14  7:49         ` Eryu Guan
  1 sibling, 0 replies; 55+ messages in thread
From: Amir Goldstein @ 2017-12-14  7:37 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Darrick J. Wong, Eryu Guan, linux-xfs, fstests

On Thu, Dec 14, 2017 at 8:52 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
>> On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
>>> From: Darrick J. Wong <darrick.wong@oracle.com>
>>>
>>> In this test we use a fixed sequence of operations in fsstress to create
>>> some number of files and dirs and then exercise xfsdump/xfsrestore on
>>> them.  Since clonerange/deduperange are not supported on all xfs
>>> configurations, detect if they're in fsstress and disable them so that
>>> we always execute exactly the same sequence of operations no matter how
>>> the filesystem is configured.
>>>
>>> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>>> ---
>>>  tests/xfs/068 |    8 ++++++++
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/tests/xfs/068 b/tests/xfs/068
>>> index 7151e28..f95a539 100755
>>> --- a/tests/xfs/068
>>> +++ b/tests/xfs/068
>>> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
>>>  _supported_fs xfs
>>>  _supported_os Linux
>>>
>>> +# Remove fsstress commands that aren't supported on all xfs configs
>>> +if $FSSTRESS_PROG | grep -q clonerange; then
>>> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
>>> +fi
>>> +if $FSSTRESS_PROG | grep -q deduperange; then
>>> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
>>> +fi
>>> +
>>
>> I'd put this inside _create_dumpdir_stress_num as it's supposed to
>> DTRT for the dump/restore that follows. Otherwise looks fine.
>>
>
> Guys,
>
> Please take a look at the only 2 changes in the history of this test.
> I would like to make sure we are not in a loop:
>
> 5d36d85 xfs/068: update golden output due to new operations in fsstress
> 6e5194d fsstress: Add fallocate insert range operation
>
> The first change excludes the new insert op (by dchinner on commit)
> The second change re-includes insert op, does not exclude new
> mread/mwrite ops and updates golden output, following this discussion:
> https://marc.info/?l=fstests&m=149014697111838&w=2
> (the referenced thread ends with a ? to Dave, but was followed by v6..v8
>  that were "silently acked" by Dave).
>
> I personally argued that the blacklist approach to xfs/068 is fragile and indeed
> this is the third time the test breaks in the history I know of,
> because of added
> fsstress ops. Fine. As long as we at least stay consistent with a decision about
> update golden output vs. exclude ops and document the decision in a comment
> with the reasoning, so we won't have to repeat this discussion next time.
>
> Darrick,
>
> IMO, we should follow the path of updating golden output and instead of
> dropping clone/dedupe from ops table in runtime, you should make them
> a noop or ignore the error, keeping the random sequence unchanged.
> This is more or less what happens with insert/collapse (error is ignored)
> already, so it would be weird to make exceptions.
>
> For reference, fsx does disable insert/collapse/zero/punch at runtime
> and that does change the random sequence of fsx.
>

Looking again, I think this test is currently broken if fallocate is
disabled at build time, and it will also be broken if we fix the golden
output for clonerange and clonerange is then disabled at build time,
because the rand() calls on the operation params won't happen.

Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-14  6:52       ` Amir Goldstein
  2017-12-14  7:37         ` Amir Goldstein
@ 2017-12-14  7:49         ` Eryu Guan
  2017-12-14  8:15           ` Amir Goldstein
  2017-12-14 21:35           ` Dave Chinner
  1 sibling, 2 replies; 55+ messages in thread
From: Eryu Guan @ 2017-12-14  7:49 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Dave Chinner, Darrick J. Wong, linux-xfs, fstests

On Thu, Dec 14, 2017 at 08:52:32AM +0200, Amir Goldstein wrote:
> On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
> >> From: Darrick J. Wong <darrick.wong@oracle.com>
> >>
> >> In this test we use a fixed sequence of operations in fsstress to create
> >> some number of files and dirs and then exercise xfsdump/xfsrestore on
> >> them.  Since clonerange/deduperange are not supported on all xfs
> >> configurations, detect if they're in fsstress and disable them so that
> >> we always execute exactly the same sequence of operations no matter how
> >> the filesystem is configured.
> >>
> >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> >> ---
> >>  tests/xfs/068 |    8 ++++++++
> >>  1 file changed, 8 insertions(+)
> >>
> >> diff --git a/tests/xfs/068 b/tests/xfs/068
> >> index 7151e28..f95a539 100755
> >> --- a/tests/xfs/068
> >> +++ b/tests/xfs/068
> >> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
> >>  _supported_fs xfs
> >>  _supported_os Linux
> >>
> >> +# Remove fsstress commands that aren't supported on all xfs configs
> >> +if $FSSTRESS_PROG | grep -q clonerange; then
> >> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
> >> +fi
> >> +if $FSSTRESS_PROG | grep -q deduperange; then
> >> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
> >> +fi
> >> +
> >
> > I'd put this inside _create_dumpdir_stress_num as it's supposed to
> > DTRT for the dump/restore that follows. Otherwise looks fine.
> >
> 
> Guys,
> 
> Please take a look at the only 2 changes in the history of this test.
> I would like to make sure we are not in a loop:
> 
> 5d36d85 xfs/068: update golden output due to new operations in fsstress
> 6e5194d fsstress: Add fallocate insert range operation
> 
> The first change excludes the new insert op (by dchinner on commit)
> The second change re-includes insert op, does not exclude new
> mread/mwrite ops and updates golden output, following this discussion:
> https://marc.info/?l=fstests&m=149014697111838&w=2
> (the referenced thread ends with a ? to Dave, but was followed by v6..v8
>  that were "silently acked" by Dave).
> 
> I personally argued that the blacklist approach to xfs/068 is fragile and indeed
> this is the third time the test breaks in the history I know of,
> because of added
> fsstress ops. Fine. As long as we at least stay consistent with a decision about
> update golden output vs. exclude ops and document the decision in a comment
> with the reasoning, so we won't have to repeat this discussion next time.

I think the fundamental problem of xfs/068 is the hardcoded file numbers
in .out file, perhaps we should calculate the expected number of
files/dirs to be dumped/restored before the dump test and extract the
actual restored number of files/dirs from xfsrestore output and do a
comparison. (or save the whole tree structure for comparison? I haven't
done any test yet, just some random thoughts for now.)

Currently, xfs/068 will easily break if there's user-defined
FSSTRESS_AVOID, e.g. FSSTRESS_AVOID="-ffallocate=0", and that's a totally
legal test configuration.

IMHO we really should fix xfs/068 first to avoid hitting the same
problem again and again.

Thanks,
Eryu

> 
> Darrick,
> 
> IMO, we should follow the path of updating golden output and instead of
> dropping clone/dedupe from ops table in runtime, you should make them
> a noop or ignore the error, keeping the random sequence unchanged.
> This is more or less what happens with insert/collapse (error is ignored)
> already, so it would be weird to make exceptions.
> 
> For reference, fsx does disable insert/collapse/zero/punch at runtime
> and that does change the random sequence of fsx.
> 
> Cheers,
> Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-14  7:49         ` Eryu Guan
@ 2017-12-14  8:15           ` Amir Goldstein
  2017-12-14 21:35           ` Dave Chinner
  1 sibling, 0 replies; 55+ messages in thread
From: Amir Goldstein @ 2017-12-14  8:15 UTC (permalink / raw)
  To: Eryu Guan; +Cc: Dave Chinner, Darrick J. Wong, linux-xfs, fstests

On Thu, Dec 14, 2017 at 9:49 AM, Eryu Guan <eguan@redhat.com> wrote:
> On Thu, Dec 14, 2017 at 08:52:32AM +0200, Amir Goldstein wrote:
>> On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
>> >> From: Darrick J. Wong <darrick.wong@oracle.com>
>> >>
>> >> In this test we use a fixed sequence of operations in fsstress to create
>> >> some number of files and dirs and then exercise xfsdump/xfsrestore on
>> >> them.  Since clonerange/deduperange are not supported on all xfs
>> >> configurations, detect if they're in fsstress and disable them so that
>> >> we always execute exactly the same sequence of operations no matter how
>> >> the filesystem is configured.
>> >>
>> >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>> >> ---
>> >>  tests/xfs/068 |    8 ++++++++
>> >>  1 file changed, 8 insertions(+)
>> >>
>> >> diff --git a/tests/xfs/068 b/tests/xfs/068
>> >> index 7151e28..f95a539 100755
>> >> --- a/tests/xfs/068
>> >> +++ b/tests/xfs/068
>> >> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
>> >>  _supported_fs xfs
>> >>  _supported_os Linux
>> >>
>> >> +# Remove fsstress commands that aren't supported on all xfs configs
>> >> +if $FSSTRESS_PROG | grep -q clonerange; then
>> >> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
>> >> +fi
>> >> +if $FSSTRESS_PROG | grep -q deduperange; then
>> >> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
>> >> +fi
>> >> +
>> >
>> > I'd put this inside _create_dumpdir_stress_num as it's supposed to
>> > DTRT for the dump/restore that follows. Otherwise looks fine.
>> >
>>
>> Guys,
>>
>> Please take a look at the only 2 changes in the history of this test.
>> I would like to make sure we are not in a loop:
>>
>> 5d36d85 xfs/068: update golden output due to new operations in fsstress
>> 6e5194d fsstress: Add fallocate insert range operation
>>
>> The first change excludes the new insert op (by dchinner on commit)
>> The second change re-includes insert op, does not exclude new
>> mread/mwrite ops and updates golden output, following this discussion:
>> https://marc.info/?l=fstests&m=149014697111838&w=2
>> (the referenced thread ends with a ? to Dave, but was followed by v6..v8
>>  that were "silently acked" by Dave).
>>
>> I personally argued that the blacklist approach to xfs/068 is fragile and indeed
>> this is the third time the test breaks in the history I know of,
>> because of added
>> fsstress ops. Fine. As long as we at least stay consistent with a decision about
>> update golden output vs. exclude ops and document the decision in a comment
>> with the reasoning, so we won't have to repeat this discussion next time.
>
> I think the fundamental problem of xfs/068 is the hardcoded file numbers
> in .out file, perhaps we should calculate the expected number of
> files/dirs to be dumped/restored before the dump test and extract the
> actual restored number of files/dirs from xfsrestore output and do a
> comparison. (or save the whole tree structure for comparison? I haven't
> done any test yet, just some random thoughts for now.)
>
> Currently, xfs/068 will easily break if there's user-defined
> FSSTRESS_AVOID, e.g. FSSTRESS_AVOID="-ffallocate=0", and that's totally
> legal test configuration.
>
> IHMO we really should fix xfs/068 first to avoid hitting the same
> problem again and again.
>

Agreed.
And while we're at it, we need to redirect:

echo "fsstress : $_param"

to $seq.full, remove it from the golden output, and print the
actual parameters in the log, including $FSSTRESS_AVOID, not the
make-believe params.
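
i.e. roughly this (sketch; whether the log file is $seq.full or
$seqres.full depends on the helper):

        # log the real fsstress invocation instead of printing it
        # to the golden output
        echo "fsstress : $_param $FSSTRESS_AVOID" >> $seqres.full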

Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 1/8] common/rc: report kmemleak errors
  2017-12-13  6:03 ` [PATCH 1/8] common/rc: report kmemleak errors Darrick J. Wong
@ 2017-12-14  9:37   ` Eryu Guan
  2017-12-14 18:15     ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Eryu Guan @ 2017-12-14  9:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Tue, Dec 12, 2017 at 10:03:18PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If kmemleak is enabled, scan and report memory leaks after every test.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  check     |    2 ++
>  common/rc |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+)
> 
> 
> diff --git a/check b/check
> index b2d251a..469188e 100755
> --- a/check
> +++ b/check
> @@ -497,6 +497,7 @@ _check_filesystems()
>  	fi
>  }
>  
> +_init_kmemleak
>  _prepare_test_list
>  
>  if $OPTIONS_HAVE_SECTIONS; then
> @@ -793,6 +794,7 @@ for section in $HOST_OPTIONS_SECTIONS; do
>  		    n_try=`expr $n_try + 1`
>  		    _check_filesystems
>  		    _check_dmesg || err=true
> +		    _check_kmemleak || err=true
>  		fi
>  
>  	    fi
> diff --git a/common/rc b/common/rc
> index cb83918..a2bed36 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -3339,6 +3339,58 @@ _check_dmesg()
>  	fi
>  }
>  
> +# capture the kmemleak report
> +_capture_kmemleak()
> +{
> +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> +	local _leak_file="$1"
> +
> +	# Tell the kernel to scan for memory leaks.  Apparently the write
> +	# returns before the scan is complete, so do it twice in the hopes
> +	# that twice is enough to capture all the leaks.
> +	echo "scan" > "${_kern_knob}"
> +	cat "${_kern_knob}" > /dev/null
> +	echo "scan" > "${_kern_knob}"
> +	cat "${_kern_knob}" > "${_leak_file}"
> +	echo "clear" > "${_kern_knob}"

Hmm, two scans don't seem to be enough either; I could see false positives
easily in a 'quick' group run, because some leaks are not reported
immediately after the test but only after the next test or the next few
tests. E.g. I saw generic/008 (tested on XFS) being reported as leaking
memory, and in 008.kmemleak I saw:

unreferenced object 0xffff880277679800 (size 512):
  comm "nametest", pid 25007, jiffies 4300176958 (age 9.854s)
...

But "nametest" is only used in generic/007, the leak should be triggered
by generic/007 too, but 007 was reported as PASS in my case.

I'm not sure what the best way to deal with these false positives is;
adding more scans seems to work, but that's ugly and requires more test
time. What do you think?

Otherwise the whole check kmemleak framework looks fine to me.

Thanks,
Eryu

> +}
> +
> +# set up kmemleak
> +_init_kmemleak()
> +{
> +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> +
> +	if [ ! -w "${_kern_knob}" ]; then
> +		return 0
> +	fi
> +
> +	# Disable the automatic scan so that we can control it completely,
> +	# then dump all the leaks recorded so far.
> +	echo "scan=off" > "${_kern_knob}"
> +	_capture_kmemleak /dev/null
> +}
> +
> +# check kmemleak log
> +_check_kmemleak()
> +{
> +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> +	local _leak_file="${seqres}.kmemleak"
> +
> +	if [ ! -w "${_kern_knob}" ]; then
> +		return 0
> +	fi
> +
> +	# Capture and report any leaks
> +	_capture_kmemleak "${_leak_file}"
> +	if [ -s "${_leak_file}" ]; then
> +		_dump_err "_check_kmemleak: something found in kmemleak (see ${_leak_file})"
> +		return 1
> +	else
> +		rm -f "${_leak_file}"
> +		return 0
> +	fi
> +}
> +
>  # don't check dmesg log after test
>  _disable_dmesg_check()
>  {
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 1/8] common/rc: report kmemleak errors
  2017-12-14  9:37   ` Eryu Guan
@ 2017-12-14 18:15     ` Darrick J. Wong
  2018-01-05  8:02       ` Eryu Guan
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-14 18:15 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Thu, Dec 14, 2017 at 05:37:18PM +0800, Eryu Guan wrote:
> On Tue, Dec 12, 2017 at 10:03:18PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If kmemleak is enabled, scan and report memory leaks after every test.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  check     |    2 ++
> >  common/rc |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 54 insertions(+)
> > 
> > 
> > diff --git a/check b/check
> > index b2d251a..469188e 100755
> > --- a/check
> > +++ b/check
> > @@ -497,6 +497,7 @@ _check_filesystems()
> >  	fi
> >  }
> >  
> > +_init_kmemleak
> >  _prepare_test_list
> >  
> >  if $OPTIONS_HAVE_SECTIONS; then
> > @@ -793,6 +794,7 @@ for section in $HOST_OPTIONS_SECTIONS; do
> >  		    n_try=`expr $n_try + 1`
> >  		    _check_filesystems
> >  		    _check_dmesg || err=true
> > +		    _check_kmemleak || err=true
> >  		fi
> >  
> >  	    fi
> > diff --git a/common/rc b/common/rc
> > index cb83918..a2bed36 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -3339,6 +3339,58 @@ _check_dmesg()
> >  	fi
> >  }
> >  
> > +# capture the kmemleak report
> > +_capture_kmemleak()
> > +{
> > +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> > +	local _leak_file="$1"
> > +
> > +	# Tell the kernel to scan for memory leaks.  Apparently the write
> > +	# returns before the scan is complete, so do it twice in the hopes
> > +	# that twice is enough to capture all the leaks.
> > +	echo "scan" > "${_kern_knob}"
> > +	cat "${_kern_knob}" > /dev/null
> > +	echo "scan" > "${_kern_knob}"
> > +	cat "${_kern_knob}" > "${_leak_file}"
> > +	echo "clear" > "${_kern_knob}"
> 
> Hmm, two scans seem not enough either, I could see false positive easily
> in a 'quick' group run, because some leaks are not reported immediately
> after the test but after next test or next few tests. e.g. I saw
> generic/008 (tested on XFS) being reported as leaking memory, and from
> 008.kmemleak I saw:
> 
> unreferenced object 0xffff880277679800 (size 512):
>   comm "nametest", pid 25007, jiffies 4300176958 (age 9.854s)
> ...
> 
> But "nametest" is only used in generic/007, the leak should be triggered
> by generic/007 too, but 007 was reported as PASS in my case.
> 
> Not sure what's the best way to deal with these false positive, adding
> more scans seem to work, but that's ugly and requires more test time..
> What do you think?

I'm not sure either -- the brief scan I made of mm/kmemleak.c didn't
reveal anything obvious that would explain the behavior we see.  It
might just take a while to determine positively that an item isn't
gray.

We could change the message to state that found leaks might have
resulted from the previous test?  That's rather unsatisfying, but I
don't know what else to do.
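
i.e. just something like this (sketch):

        _dump_err "_check_kmemleak: leaks detected, possibly from an earlier test (see ${_leak_file})"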

Or maybe a sleep 1 in between scans to see if that makes it more likely
to attribute a leak to the correct test?  I don't anticipate running
xfstests with kmemleak=on too terribly often, so the extra ~700s won't
bother me too much.
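
Concretely, something like this in _capture_kmemleak (untested sketch):

        # pause between scans; per the above this may make it more likely
        # that a leak gets attributed to the test that actually caused it
        echo "scan" > "${_kern_knob}"
        sleep 1
        cat "${_kern_knob}" > /dev/null
        echo "scan" > "${_kern_knob}"
        sleep 1
        cat "${_kern_knob}" > "${_leak_file}"
        echo "clear" > "${_kern_knob}"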

--D

> Otherwise the whole check kmemleak framework looks fine to me.
> 
> Thanks,
> Eryu
> 
> > +}
> > +
> > +# set up kmemleak
> > +_init_kmemleak()
> > +{
> > +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> > +
> > +	if [ ! -w "${_kern_knob}" ]; then
> > +		return 0
> > +	fi
> > +
> > +	# Disable the automatic scan so that we can control it completely,
> > +	# then dump all the leaks recorded so far.
> > +	echo "scan=off" > "${_kern_knob}"
> > +	_capture_kmemleak /dev/null
> > +}
> > +
> > +# check kmemleak log
> > +_check_kmemleak()
> > +{
> > +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> > +	local _leak_file="${seqres}.kmemleak"
> > +
> > +	if [ ! -w "${_kern_knob}" ]; then
> > +		return 0
> > +	fi
> > +
> > +	# Capture and report any leaks
> > +	_capture_kmemleak "${_leak_file}"
> > +	if [ -s "${_leak_file}" ]; then
> > +		_dump_err "_check_kmemleak: something found in kmemleak (see ${_leak_file})"
> > +		return 1
> > +	else
> > +		rm -f "${_leak_file}"
> > +		return 0
> > +	fi
> > +}
> > +
> >  # don't check dmesg log after test
> >  _disable_dmesg_check()
> >  {
> > 

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls
  2017-12-14  7:32     ` Eryu Guan
@ 2017-12-14 20:20       ` Darrick J. Wong
  0 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-14 20:20 UTC (permalink / raw)
  To: Eryu Guan; +Cc: Amir Goldstein, linux-xfs, fstests

On Thu, Dec 14, 2017 at 03:32:46PM +0800, Eryu Guan wrote:
> On Thu, Dec 14, 2017 at 08:39:38AM +0200, Amir Goldstein wrote:
> > On Wed, Dec 13, 2017 at 8:03 AM, Darrick J. Wong
> > <darrick.wong@oracle.com> wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > >
> > > Mix it up a bit by reflinking and deduping data blocks when possible.
> > >
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  ltp/fsstress.c |  440 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 440 insertions(+)
> > >
> > >
> > > diff --git a/ltp/fsstress.c b/ltp/fsstress.c
> > > index 96f48b1..e2dfa5e 100644
> > > --- a/ltp/fsstress.c
> > > +++ b/ltp/fsstress.c
> > > @@ -68,7 +68,9 @@ typedef enum {
> > >         OP_BULKSTAT,
> > >         OP_BULKSTAT1,
> > >         OP_CHOWN,
> > > +       OP_CLONERANGE,
> > >         OP_CREAT,
> > > +       OP_DEDUPERANGE,
> > >         OP_DREAD,
> > >         OP_DWRITE,
> > >         OP_FALLOCATE,
> > > @@ -174,7 +176,9 @@ void        awrite_f(int, long);
> > >  void   bulkstat_f(int, long);
> > >  void   bulkstat1_f(int, long);
> > >  void   chown_f(int, long);
> > > +void   clonerange_f(int, long);
> > >  void   creat_f(int, long);
> > > +void   deduperange_f(int, long);
> > >  void   dread_f(int, long);
> > >  void   dwrite_f(int, long);
> > >  void   fallocate_f(int, long);
> > > @@ -221,7 +225,9 @@ opdesc_t    ops[] = {
> > >         { OP_BULKSTAT, "bulkstat", bulkstat_f, 1, 0 },
> > >         { OP_BULKSTAT1, "bulkstat1", bulkstat1_f, 1, 0 },
> > >         { OP_CHOWN, "chown", chown_f, 3, 1 },
> > > +       { OP_CLONERANGE, "clonerange", clonerange_f, 4, 1 },
> > >         { OP_CREAT, "creat", creat_f, 4, 1 },
> > > +       { OP_DEDUPERANGE, "deduperange", deduperange_f, 4, 1},
> > >         { OP_DREAD, "dread", dread_f, 4, 0 },
> > >         { OP_DWRITE, "dwrite", dwrite_f, 4, 1 },
> > >         { OP_FALLOCATE, "fallocate", fallocate_f, 1, 1 },
> > > @@ -1312,6 +1318,16 @@ make_freq_table(void)
> > >         }
> > >  }
> > >
> > > +void
> > > +free_freq_table(void)
> > > +{
> > > +       if (!freq_table)
> > > +               return;
> > > +       free(freq_table);
> > > +       freq_table = NULL;
> > > +       freq_table_size = 0;
> > > +}
> > > +
> > >  int
> > >  mkdir_path(pathname_t *name, mode_t mode)
> > >  {
> > > @@ -2189,6 +2205,430 @@ chown_f(int opno, long r)
> > >         free_pathname(&f);
> > >  }
> > >
> > > +static void
> > > +disable_op(opty_t opt)
> > > +{
> > > +       opdesc_t        *p;
> > > +
> > > +       for (p = ops; p < ops_end; p++) {
> > > +               if (opt == p->op) {
> > > +                       p->freq = 0;
> > > +                       free_freq_table();
> > > +                       make_freq_table();
> > > +                       return;
> > > +               }
> > > +       }
> > > +}
> > > +
> > 
> > If we want to go down the path of disabling ops at runtime, the question is:
> > why disable_op clonerange/deduperange and not disable_op
> > insert/collapse/zero/punch? There are probably other ops as well.
> > This is also inconsistent in that build-time disabling of
> > FALLOC or CLONERANGE/DEDUPERANGE does not
> > change the random op sequence, while runtime disabling does.
> 
> That's a good point. I don't think we want to disable unsupported
> operations dynamically, fsstress just accepts/ignores whatever result
> the operation gives and just prints the errno to log in verbose mode.

Ok, I'll remove all the auto-disabling stuff & resubmit.

--D

> Thanks,
> Eryu
> 
> > 
> > This is related to the conversation of the golden output of xfs/068.
> > Will take my arguments there now...
> > 
> > Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-14  7:49         ` Eryu Guan
  2017-12-14  8:15           ` Amir Goldstein
@ 2017-12-14 21:35           ` Dave Chinner
  2017-12-15  2:04             ` Darrick J. Wong
  1 sibling, 1 reply; 55+ messages in thread
From: Dave Chinner @ 2017-12-14 21:35 UTC (permalink / raw)
  To: Eryu Guan; +Cc: Amir Goldstein, Darrick J. Wong, linux-xfs, fstests

On Thu, Dec 14, 2017 at 03:49:47PM +0800, Eryu Guan wrote:
> On Thu, Dec 14, 2017 at 08:52:32AM +0200, Amir Goldstein wrote:
> > On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
> > > On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
> > >> From: Darrick J. Wong <darrick.wong@oracle.com>
> > >>
> > >> In this test we use a fixed sequence of operations in fsstress to create
> > >> some number of files and dirs and then exercise xfsdump/xfsrestore on
> > >> them.  Since clonerange/deduperange are not supported on all xfs
> > >> configurations, detect if they're in fsstress and disable them so that
> > >> we always execute exactly the same sequence of operations no matter how
> > >> the filesystem is configured.
> > >>
> > >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > >> ---
> > >>  tests/xfs/068 |    8 ++++++++
> > >>  1 file changed, 8 insertions(+)
> > >>
> > >> diff --git a/tests/xfs/068 b/tests/xfs/068
> > >> index 7151e28..f95a539 100755
> > >> --- a/tests/xfs/068
> > >> +++ b/tests/xfs/068
> > >> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
> > >>  _supported_fs xfs
> > >>  _supported_os Linux
> > >>
> > >> +# Remove fsstress commands that aren't supported on all xfs configs
> > >> +if $FSSTRESS_PROG | grep -q clonerange; then
> > >> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
> > >> +fi
> > >> +if $FSSTRESS_PROG | grep -q deduperange; then
> > >> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
> > >> +fi
> > >> +
> > >
> > > I'd put this inside _create_dumpdir_stress_num as it's supposed to
> > > DTRT for the dump/restore that follows. Otherwise looks fine.
> > >
> > 
> > Guys,
> > 
> > Please take a look at the only 2 changes in the history of this test.
> > I would like to make sure we are not in a loop:
> > 
> > 5d36d85 xfs/068: update golden output due to new operations in fsstress
> > 6e5194d fsstress: Add fallocate insert range operation
> > 
> > The first change excludes the new insert op (by dchinner on commit)
> > The second change re-includes insert op, does not exclude new
> > mread/mwrite ops and updates golden output, following this discussion:
> > https://marc.info/?l=fstests&m=149014697111838&w=2
> > (the referenced thread ends with a ? to Dave, but was followed by v6..v8
> >  that were "silently acked" by Dave).
> > 
> > I personally argued that the blacklist approach to xfs/068 is fragile and indeed
> > this is the third time the test breaks in the history I know of,
> > because of added
> > fsstress ops. Fine. As long as we at least stay consistent with a decision about
> > update golden output vs. exclude ops and document the decision in a comment
> > with the reasoning, so we won't have to repeat this discussion next time.
> 
> I think the fundamental problem of xfs/068 is the hardcoded file numbers
> in .out file, perhaps we should calculate the expected number of
> files/dirs to be dumped/restored before the dump test and extract the
> actual restored number of files/dirs from xfsrestore output and do a
> comparison. (or save the whole tree structure for comparison? I haven't
> done any test yet, just some random thoughts for now.)

Or we don't waste any more time on trying to make a reliable, stable
regression test that has a history of detecting bulkstat regressions
work differently?

Indeed, the problem here is our "turn on new functionality in
fsstress as it is added" process will always break older tests that
require fixed functionality to test.  Having tests fail when we do
this is perfectly reasonable - it means we have to consider whether
that new fsstress operation is valid for the test being run. 

Making tests silently accept new operations that may not be valid
for the thing being tested doesn't improve our test coverage. What
it does is take away a warning canary that tells us we may have
broken something we didn't intend to break. e.g. maybe this test is
telling us reflink breaks xfsdump or xfsrestore? That's the point of
having hard coded numbers in the golden output - any change whether
intended, expected or otherwise requires us to go look at whether
that new functionality has broken xfsdump/restore.

That's what regression tests are for, and taking that away from the
test under the guise of "easier test maintenance" is misguided.
Regression tests require validation and checking when new
functionality is added to the tools they use. Having old tests fail
when new features are added is exactly what we want the regression
tests to do, otherwise we'll miss regressions that the current code
actually catches.

> Currently, xfs/068 will easily break if there's user-defined
> FSSTRESS_AVOID, e.g. FSSTRESS_AVOID="-ffallocate=0", and that's totally
> legal test configuration.

I think that's a strawman argument - who tests XFS without fallocate
enabled these days? Indeed, fsstress will still be doing
preallocation via the old ALLOCSP and RESVSP ioctls that predate
fallocate....

> IHMO we really should fix xfs/068 first to avoid hitting the same
> problem again and again.

IMO, we should not be changing the way old tests work, especially
those that have, in the past, been very good at exposing bugs in
kernel interfaces.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-14 21:35           ` Dave Chinner
@ 2017-12-15  2:04             ` Darrick J. Wong
  2017-12-15  4:37               ` Dave Chinner
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-15  2:04 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Eryu Guan, Amir Goldstein, linux-xfs, fstests

On Fri, Dec 15, 2017 at 08:35:41AM +1100, Dave Chinner wrote:
> On Thu, Dec 14, 2017 at 03:49:47PM +0800, Eryu Guan wrote:
> > On Thu, Dec 14, 2017 at 08:52:32AM +0200, Amir Goldstein wrote:
> > > On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
> > > > On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
> > > >> From: Darrick J. Wong <darrick.wong@oracle.com>
> > > >>
> > > >> In this test we use a fixed sequence of operations in fsstress to create
> > > >> some number of files and dirs and then exercise xfsdump/xfsrestore on
> > > >> them.  Since clonerange/deduperange are not supported on all xfs
> > > >> configurations, detect if they're in fsstress and disable them so that
> > > >> we always execute exactly the same sequence of operations no matter how
> > > >> the filesystem is configured.
> > > >>
> > > >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > >> ---
> > > >>  tests/xfs/068 |    8 ++++++++
> > > >>  1 file changed, 8 insertions(+)
> > > >>
> > > >> diff --git a/tests/xfs/068 b/tests/xfs/068
> > > >> index 7151e28..f95a539 100755
> > > >> --- a/tests/xfs/068
> > > >> +++ b/tests/xfs/068
> > > >> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
> > > >>  _supported_fs xfs
> > > >>  _supported_os Linux
> > > >>
> > > >> +# Remove fsstress commands that aren't supported on all xfs configs
> > > >> +if $FSSTRESS_PROG | grep -q clonerange; then
> > > >> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
> > > >> +fi
> > > >> +if $FSSTRESS_PROG | grep -q deduperange; then
> > > >> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
> > > >> +fi
> > > >> +
> > > >
> > > > I'd put this inside _create_dumpdir_stress_num as it's supposed to
> > > > DTRT for the dump/restore that follows. Otherwise looks fine.
> > > >
> > > 
> > > Guys,
> > > 
> > > Please take a look at the only 2 changes in the history of this test.
> > > I would like to make sure we are not in a loop:
> > > 
> > > 5d36d85 xfs/068: update golden output due to new operations in fsstress
> > > 6e5194d fsstress: Add fallocate insert range operation
> > > 
> > > The first change excludes the new insert op (by dchinner on commit)
> > > The second change re-includes insert op, does not exclude new
> > > mread/mwrite ops and updates golden output, following this discussion:
> > > https://marc.info/?l=fstests&m=149014697111838&w=2
> > > (the referenced thread ends with a ? to Dave, but was followed by v6..v8
> > >  that were "silently acked" by Dave).
> > > 
> > > I personally argued that the blacklist approach to xfs/068 is fragile and indeed
> > > this is the third time the test breaks in the history I know of,
> > > because of added
> > > fsstress ops. Fine. As long as we at least stay consistent with a decision about
> > > update golden output vs. exclude ops and document the decision in a comment
> > > with the reasoning, so we won't have to repeat this discussion next time.
> > 
> > I think the fundamental problem of xfs/068 is the hardcoded file numbers
> > in .out file, perhaps we should calculate the expected number of
> > files/dirs to be dumped/restored before the dump test and extract the
> > actual restored number of files/dirs from xfsrestore output and do a
> > comparison. (or save the whole tree structure for comparison? I haven't
> > done any test yet, just some random thoughts for now.)
> 
> Or we don't waste any more time on trying to make a reliable, stable
> regression test that has a history of detecting bulkstat regressions
> work differently?

<shrug> See now, the frustrating part about fixing this testcase is that
I still don't feel like I have a good grasp on what this thing is trying
to test -- apparently we're checking for bulkstat regressions, dump
problems, and restore problems?  Are we also looking for problems that
might crop up with the newer APIs, whatever those might be?

Currently I have a reworked version of this patch that runs fsstress,
measures the number of directories and inodes in $dump_dir, then
programmatically compares that to whatever xfsrestore tells us it
restored.  This ought to be enough that we can create a sufficiently
messy filesystem with whatever sequence of syscalls we want, and make
sure that dump/restore actually work on them.

First we run fsstress, then we count the number of dirs, the number
of fs objects, take a snapshot of the 'find .' output, and md5sum every
file in the dump directory.

If fsstress creates fewer than 100 dirs or 600 inodes, we fail the test
because that wasn't enough.

If bulkstat fails to iterate all the inodes, restore's output will
reflect fewer files than was expected.

If dump fails to generate a full dump, restore's output will
reflect fewer files than was expected.

If restore fails to restore the full dump, restore's output will
reflect fewer files than was expected.

If the restore output doesn't reflect the number of dirs/inodes we
counted at the beginning, we fail the test.

If the 'find .' output of the restored dir doesn't match the original,
we fail the test.

If the md5sum -c output shows corrupt files, we fail the test.
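
In shell the flow is roughly this (a sketch of the rework, not the final
patch; $restore_dir and $dump_sdir stand in for whatever common/dump
actually exports):

        _create_dumpdir_stress_num 4096

        # snapshot the tree we are about to dump
        ndirs=$(find $dump_dir -type d | wc -l)
        nobjs=$(find $dump_dir | wc -l)
        [ $ndirs -ge 100 -a $nobjs -ge 600 ] || _fail "fsstress created too few files"
        (cd $dump_dir ; find . | sort) > $tmp.before
        (cd $dump_dir ; find . -type f -print0 | xargs -0 md5sum | sort) > $tmp.md5

        _do_dump_restore

        # bulkstat, dump or restore losing anything shows up as a mismatch
        (cd $restore_dir/$dump_sdir ; find . | sort) > $tmp.after
        diff -u $tmp.before $tmp.after || _fail "restored tree does not match"
        (cd $restore_dir/$dump_sdir ; md5sum -c $tmp.md5 | grep -v "OK$") && \
                _fail "corrupt file data after restore"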

So now I really have no idea -- is that enough to check that everything
works?  I felt like it does, but given all the back and forth now I'm
wondering if even this is enough.

(Yeah, I'm frustrated because the fsstress additions have been very
helpful at flushing out more reflink bugs and I feel like I'm making
very little progress on this xfs/068 thing.  Sorry.)

--D

> Indeed, the problem here is our "turn on new functionality in
> fsstress as it is added" process will always break older tests that
> require fixed functionality to test.  Having tests fail when we do
> this is perfectly reasonable - it means we have to consider whether
> that new fsstress operation is valid for the test being run. 
> 
> Making tests silently accept new operations that may not be valid
> for the thing being tested doesn't improve our test coverage. What
> it does is take away a warning canary that tells us we may have
> broken something we didn't intend to break. e.g. maybe this test is
> telling us reflink breaks xfsdump or xfsrestore? That's the point of
> having hard coded numbers in the golden output - any change whether
> intended, expected or otherwise requires us to go look at whether
> that new functionality has broken xfsdump/restore.
> 
> That's what regression tests are for, and taking that away from the
> test under the guise of "easier test maintenance" is misguided.
> Regression tests require validation and checking when new
> functionality is added to the tools they use. Having old tests fail
> when new features are added is exactly what we want the regression
> tests to do, otherwise we'll miss regressions that the current code
> actually catches.
> 
> > Currently, xfs/068 will easily break if there's user-defined
> > FSSTRESS_AVOID, e.g. FSSTRESS_AVOID="-ffallocate=0", and that's totally
> > legal test configuration.
> 
> I think that's a strawman argument - who tests XFS without fallocate
> enabled these days? Indeed, fsstress will still be doing
> preallocation via the old ALLOCSP and RESVSP ioctls that predate
> fallocate....
> 
> > IHMO we really should fix xfs/068 first to avoid hitting the same
> > problem again and again.
> 
> IMO, we should not be changing the way old tests work, especially
> those that have, in the past, been very good at exposing bugs in
> kernel interfaces.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2017-12-13  6:03 ` [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls Darrick J. Wong
  2017-12-14  6:39   ` Amir Goldstein
@ 2017-12-15  2:07   ` Darrick J. Wong
  2018-01-03  8:48     ` Eryu Guan
  2018-02-22 16:06     ` Luis Henriques
  1 sibling, 2 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-15  2:07 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Mix it up a bit by reflinking and deduping data blocks when possible.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: don't disable broken commands, just ignore them
---
 ltp/fsstress.c |  391 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 391 insertions(+)

diff --git a/ltp/fsstress.c b/ltp/fsstress.c
index 96f48b1..b02cb0c 100644
--- a/ltp/fsstress.c
+++ b/ltp/fsstress.c
@@ -68,7 +68,9 @@ typedef enum {
 	OP_BULKSTAT,
 	OP_BULKSTAT1,
 	OP_CHOWN,
+	OP_CLONERANGE,
 	OP_CREAT,
+	OP_DEDUPERANGE,
 	OP_DREAD,
 	OP_DWRITE,
 	OP_FALLOCATE,
@@ -174,7 +176,9 @@ void	awrite_f(int, long);
 void	bulkstat_f(int, long);
 void	bulkstat1_f(int, long);
 void	chown_f(int, long);
+void	clonerange_f(int, long);
 void	creat_f(int, long);
+void	deduperange_f(int, long);
 void	dread_f(int, long);
 void	dwrite_f(int, long);
 void	fallocate_f(int, long);
@@ -221,7 +225,9 @@ opdesc_t	ops[] = {
 	{ OP_BULKSTAT, "bulkstat", bulkstat_f, 1, 0 },
 	{ OP_BULKSTAT1, "bulkstat1", bulkstat1_f, 1, 0 },
 	{ OP_CHOWN, "chown", chown_f, 3, 1 },
+	{ OP_CLONERANGE, "clonerange", clonerange_f, 4, 1 },
 	{ OP_CREAT, "creat", creat_f, 4, 1 },
+	{ OP_DEDUPERANGE, "deduperange", deduperange_f, 4, 1},
 	{ OP_DREAD, "dread", dread_f, 4, 0 },
 	{ OP_DWRITE, "dwrite", dwrite_f, 4, 1 },
 	{ OP_FALLOCATE, "fallocate", fallocate_f, 1, 1 },
@@ -2189,6 +2195,391 @@ chown_f(int opno, long r)
 	free_pathname(&f);
 }
 
+/* reflink some arbitrary range of f1 to f2. */
+void
+clonerange_f(
+	int			opno,
+	long			r)
+{
+#ifdef FICLONERANGE
+	struct file_clone_range	fcr;
+	struct pathname		fpath1;
+	struct pathname		fpath2;
+	struct stat64		stat1;
+	struct stat64		stat2;
+	char			inoinfo1[1024];
+	char			inoinfo2[1024];
+	off64_t			lr;
+	off64_t			off1;
+	off64_t			off2;
+	size_t			len;
+	int			v1;
+	int			v2;
+	int			fd1;
+	int			fd2;
+	int			ret;
+	int			e;
+
+	/* Load paths */
+	init_pathname(&fpath1);
+	if (!get_fname(FT_REGm, r, &fpath1, NULL, NULL, &v1)) {
+		if (v1)
+			printf("%d/%d: clonerange read - no filename\n",
+				procid, opno);
+		goto out_fpath1;
+	}
+
+	init_pathname(&fpath2);
+	if (!get_fname(FT_REGm, random(), &fpath2, NULL, NULL, &v2)) {
+		if (v2)
+			printf("%d/%d: clonerange write - no filename\n",
+				procid, opno);
+		goto out_fpath2;
+	}
+
+	/* Open files */
+	fd1 = open_path(&fpath1, O_RDONLY);
+	e = fd1 < 0 ? errno : 0;
+	check_cwd();
+	if (fd1 < 0) {
+		if (v1)
+			printf("%d/%d: clonerange read - open %s failed %d\n",
+				procid, opno, fpath1.path, e);
+		goto out_fpath2;
+	}
+
+	fd2 = open_path(&fpath2, O_WRONLY);
+	e = fd2 < 0 ? errno : 0;
+	check_cwd();
+	if (fd2 < 0) {
+		if (v2)
+			printf("%d/%d: clonerange write - open %s failed %d\n",
+				procid, opno, fpath2.path, e);
+		goto out_fd1;
+	}
+
+	/* Get file stats */
+	if (fstat64(fd1, &stat1) < 0) {
+		if (v1)
+			printf("%d/%d: clonerange read - fstat64 %s failed %d\n",
+				procid, opno, fpath1.path, errno);
+		goto out_fd2;
+	}
+	inode_info(inoinfo1, sizeof(inoinfo1), &stat1, v1);
+
+	if (fstat64(fd2, &stat2) < 0) {
+		if (v2)
+			printf("%d/%d: clonerange write - fstat64 %s failed %d\n",
+				procid, opno, fpath2.path, errno);
+		goto out_fd2;
+	}
+	inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v1);
+
+	/* Calculate offsets */
+	len = (random() % FILELEN_MAX) + 1;
+	len &= ~(stat1.st_blksize - 1);
+	if (len == 0)
+		len = stat1.st_blksize;
+	if (len > stat1.st_size)
+		len = stat1.st_size;
+
+	lr = ((__int64_t)random() << 32) + random();
+	if (stat1.st_size == len)
+		off1 = 0;
+	else
+		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
+	off1 %= maxfsize;
+	off1 &= ~(stat1.st_blksize - 1);
+
+	/*
+	 * If srcfile == destfile, randomly generate destination ranges
+	 * until we find one that doesn't overlap the source range.
+	 */
+	do {
+		lr = ((__int64_t)random() << 32) + random();
+		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
+		off2 %= maxfsize;
+		off2 &= ~(stat2.st_blksize - 1);
+	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
+
+	/* Clone data blocks */
+	fcr.src_fd = fd1;
+	fcr.src_offset = off1;
+	fcr.src_length = len;
+	fcr.dest_offset = off2;
+
+	ret = ioctl(fd2, FICLONERANGE, &fcr);
+	e = ret < 0 ? errno : 0;
+	if (v1 || v2) {
+		printf("%d/%d: clonerange %s%s [%lld,%lld] -> %s%s [%lld,%lld]",
+			procid, opno,
+			fpath1.path, inoinfo1, (long long)off1, (long long)len,
+			fpath2.path, inoinfo2, (long long)off2, (long long)len);
+
+		if (ret < 0)
+			printf(" error %d", e);
+		printf("\n");
+	}
+
+out_fd2:
+	close(fd2);
+out_fd1:
+	close(fd1);
+out_fpath2:
+	free_pathname(&fpath2);
+out_fpath1:
+	free_pathname(&fpath1);
+#endif
+}
+
+/* dedupe some arbitrary range of f1 to f2...fn. */
+void
+deduperange_f(
+	int			opno,
+	long			r)
+{
+#ifdef FIDEDUPERANGE
+#define INFO_SZ			1024
+	struct file_dedupe_range *fdr;
+	struct pathname		*fpath;
+	struct stat64		*stat;
+	char			*info;
+	off64_t			*off;
+	int			*v;
+	int			*fd;
+	int			nr;
+	off64_t			lr;
+	size_t			len;
+	int			ret;
+	int			i;
+	int			e;
+
+	if (flist[FT_REG].nfiles < 2)
+		return;
+
+	/* Pick somewhere between 2 and 128 files. */
+	do {
+		nr = random() % (flist[FT_REG].nfiles + 1);
+	} while (nr < 2 || nr > 128);
+
+	/* Alloc memory */
+	fdr = malloc(nr * sizeof(struct file_dedupe_range_info) +
+		     sizeof(struct file_dedupe_range));
+	if (!fdr) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		return;
+	}
+	memset(fdr, 0, (nr * sizeof(struct file_dedupe_range_info) +
+			sizeof(struct file_dedupe_range)));
+
+	fpath = calloc(nr, sizeof(struct pathname));
+	if (!fpath) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_fdr;
+	}
+
+	stat = calloc(nr, sizeof(struct stat64));
+	if (!stat) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_paths;
+	}
+
+	info = calloc(nr, INFO_SZ);
+	if (!info) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_stats;
+	}
+
+	off = calloc(nr, sizeof(off64_t));
+	if (!off) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_info;
+	}
+
+	v = calloc(nr, sizeof(int));
+	if (!v) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_offsets;
+	}
+	fd = calloc(nr, sizeof(int));
+	if (!fd) {
+		printf("%d/%d: line %d error %d\n",
+			procid, opno, __LINE__, errno);
+		goto out_v;
+	}
+	memset(fd, 0xFF, nr * sizeof(int));
+
+	/* Get paths for all files */
+	for (i = 0; i < nr; i++)
+		init_pathname(&fpath[i]);
+
+	if (!get_fname(FT_REGm, r, &fpath[0], NULL, NULL, &v[0])) {
+		if (v[0])
+			printf("%d/%d: deduperange read - no filename\n",
+				procid, opno);
+		goto out_pathnames;
+	}
+
+	for (i = 1; i < nr; i++) {
+		if (!get_fname(FT_REGm, random(), &fpath[i], NULL, NULL, &v[i])) {
+			if (v[i])
+				printf("%d/%d: deduperange write - no filename\n",
+					procid, opno);
+			goto out_pathnames;
+		}
+	}
+
+	/* Open files */
+	fd[0] = open_path(&fpath[0], O_RDONLY);
+	e = fd[0] < 0 ? errno : 0;
+	check_cwd();
+	if (fd[0] < 0) {
+		if (v[0])
+			printf("%d/%d: deduperange read - open %s failed %d\n",
+				procid, opno, fpath[0].path, e);
+		goto out_pathnames;
+	}
+
+	for (i = 1; i < nr; i++) {
+		fd[i] = open_path(&fpath[i], O_WRONLY);
+		e = fd[i] < 0 ? errno : 0;
+		check_cwd();
+		if (fd[i] < 0) {
+			if (v[i])
+				printf("%d/%d: deduperange write - open %s failed %d\n",
+					procid, opno, fpath[i].path, e);
+			goto out_fds;
+		}
+	}
+
+	/* Get file stats */
+	if (fstat64(fd[0], &stat[0]) < 0) {
+		if (v[0])
+			printf("%d/%d: deduperange read - fstat64 %s failed %d\n",
+				procid, opno, fpath[0].path, errno);
+		goto out_fds;
+	}
+
+	inode_info(&info[0], INFO_SZ, &stat[0], v[0]);
+
+	for (i = 1; i < nr; i++) {
+		if (fstat64(fd[i], &stat[i]) < 0) {
+			if (v[i])
+				printf("%d/%d: deduperange write - fstat64 %s failed %d\n",
+					procid, opno, fpath[i].path, errno);
+			goto out_fds;
+		}
+		inode_info(&info[i * INFO_SZ], INFO_SZ, &stat[i], v[i]);
+	}
+
+	/* Never try to dedupe more than half of the src file. */
+	len = (random() % FILELEN_MAX) + 1;
+	len &= ~(stat[0].st_blksize - 1);
+	if (len == 0)
+		len = stat[0].st_blksize / 2;
+	if (len > stat[0].st_size / 2)
+		len = stat[0].st_size / 2;
+
+	/* Calculate offsets */
+	lr = ((__int64_t)random() << 32) + random();
+	if (stat[0].st_size == len)
+		off[0] = 0;
+	else
+		off[0] = (off64_t)(lr % MIN(stat[0].st_size - len, MAXFSIZE));
+	off[0] %= maxfsize;
+	off[0] &= ~(stat[0].st_blksize - 1);
+
+	/*
+	 * If srcfile == destfile[i], randomly generate destination ranges
+	 * until we find one that doesn't overlap the source range.
+	 */
+	for (i = 1; i < nr; i++) {
+		int	tries = 0;
+
+		do {
+			lr = ((__int64_t)random() << 32) + random();
+			if (stat[i].st_size <= len)
+				off[i] = 0;
+			else
+				off[i] = (off64_t)(lr % MIN(stat[i].st_size - len, MAXFSIZE));
+			off[i] %= maxfsize;
+			off[i] &= ~(stat[i].st_blksize - 1);
+		} while (stat[0].st_ino == stat[i].st_ino &&
+			 llabs(off[i] - off[0]) < len &&
+			 tries++ < 10);
+	}
+
+	/* Clone data blocks */
+	fdr->src_offset = off[0];
+	fdr->src_length = len;
+	fdr->dest_count = nr - 1;
+	for (i = 1; i < nr; i++) {
+		fdr->info[i - 1].dest_fd = fd[i];
+		fdr->info[i - 1].dest_offset = off[i];
+	}
+
+	ret = ioctl(fd[0], FIDEDUPERANGE, fdr);
+	e = ret < 0 ? errno : 0;
+	if (v[0]) {
+		printf("%d/%d: deduperange from %s%s [%lld,%lld]",
+			procid, opno,
+			fpath[0].path, &info[0], (long long)off[0],
+			(long long)len);
+		if (ret < 0)
+			printf(" error %d", e);
+		printf("\n");
+	}
+	if (ret < 0)
+		goto out_fds;
+
+	for (i = 1; i < nr; i++) {
+		e = fdr->info[i - 1].status < 0 ? fdr->info[i - 1].status : 0;
+		if (v[i]) {
+			printf("%d/%d: ...to %s%s [%lld,%lld]",
+				procid, opno,
+				fpath[i].path, &info[i * INFO_SZ],
+				(long long)off[i], (long long)len);
+			if (fdr->info[i - 1].status < 0)
+				printf(" error %d", e);
+			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_SAME)
+				printf(" %llu bytes deduplicated",
+					fdr->info[i - 1].bytes_deduped);
+			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_DIFFERS)
+				printf(" differed");
+			printf("\n");
+		}
+	}
+
+out_fds:
+	for (i = 0; i < nr; i++)
+		if (fd[i] >= 0)
+			close(fd[i]);
+out_pathnames:
+	for (i = 0; i < nr; i++)
+		free_pathname(&fpath[i]);
+
+	free(fd);
+out_v:
+	free(v);
+out_offsets:
+	free(off);
+out_info:
+	free(info);
+out_stats:
+	free(stat);
+out_paths:
+	free(fpath);
+out_fdr:
+	free(fdr);
+#endif
+}
+
 void
 setxattr_f(int opno, long r)
 {

^ permalink raw reply related	[flat|nested] 55+ messages in thread
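
For reference, a minimal sketch of an fsstress run that leans on the new ops
(this assumes fsstress's documented -z "zero all frequencies" and -f op=freq
options, and leaves creat/write enabled so clonerange/deduperange always have
regular files to source from; the $SCRATCH_MNT/stress path is just an example):

    mkdir -p $SCRATCH_MNT/stress
    # zero every op frequency, then re-enable a small mix so the
    # clone/dedupe ops have data to operate on
    $FSSTRESS_PROG -z \
        -f creat=10 -f write=10 \
        -f clonerange=5 -f deduperange=5 \
        -n 10000 -p 4 -s 1 -d $SCRATCH_MNT/stress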

* [PATCH v3 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13  6:04 ` [PATCH 8/8] xfs/068: fix variability problems in file/dir count output Darrick J. Wong
  2017-12-13 22:20   ` Dave Chinner
  2017-12-13 23:28   ` [PATCH v2 8/8] xfs/068: fix clonerange " Darrick J. Wong
@ 2017-12-15  2:08   ` Darrick J. Wong
  2017-12-15  2:16     ` Darrick J. Wong
  2017-12-15  2:17   ` [PATCH v4 " Darrick J. Wong
  3 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-15  2:08 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

In these tests we use a fixed sequence of operations in fsstress to
create a directory tree and then exercise xfsdump/xfsrestore on that.
However, this changes every time someone adds a new fsstress command,
or someone runs with FSSTRESS_AVOID, etc.

Therefore, check the counts directly from xfsrestore output instead
of relying on the golden output and do much more rigorous checking of
the dir tree complexity and the intactness of the dirs and files after
restoring them.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v3: moar checking
---
 common/dump       |   40 ++++++++++++++++++++++++++++++++++++++--
 tests/xfs/027.out |    2 +-
 tests/xfs/068.out |    2 +-
 3 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/common/dump b/common/dump
index 898aaa4..db2e156 100644
--- a/common/dump
+++ b/common/dump
@@ -316,7 +316,8 @@ _create_dumpdir_stress_num()
     echo "-----------------------------------------------"
     echo "fsstress : $_param"
     echo "-----------------------------------------------"
-    if ! $here/ltp/fsstress $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >$tmp.out 2>&1
+    echo $FSSTRESS_PROG $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >> $seqres.full
+    if ! $FSSTRESS_PROG $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >$tmp.out 2>&1
     then
         echo "    fsstress (count=$_count) returned $? - see $seqres.full"
 
@@ -1240,9 +1241,44 @@ _do_dump_restore()
     echo "xfsdump|xfsrestore ..."
     restore_opts="$_restore_debug$restore_args - $restore_dir"
     dump_opts="$_dump_debug$dump_args -s $dump_sdir - $SCRATCH_MNT"
+
+    # We expect there to be one more dir (and inode) than what's in dump_dir.
+    # Construct the string we expect to see in the output, since fsstress
+    # will create different directory structures every time someone adds
+    # a new command, runs with a different FSSTRESS_AVOID, etc.
+    expected_dirs=$(find $dump_dir -type d | wc -l)
+    expected_inodes=$(find $dump_dir | wc -l)
+    expected_str=": $((expected_dirs + 1)) directories and $((expected_inodes + 1)) entries processed"
+
+    if [ $expected_dirs -lt 100 ] || [ $expected_inodes -lt 600 ]; then
+        echo "Oddly small dir tree$expected_str"
+    fi
+
+    # Measure the md5 of every file...
+    (cd $dump_dir ; find . -type f -print0 | xargs -0 md5sum) > $seqres.md5
+    (cd $dump_dir ; find | sort) > $seqres.fstree
+
     echo "xfsdump $dump_opts | xfsrestore $restore_opts" | _dir_filter
-    $XFSDUMP_PROG $dump_opts 2>$tmp.dump.mlog | $XFSRESTORE_PROG $restore_opts 2>&1 | tee -a $seqres.full | _dump_filter
+    $XFSDUMP_PROG $dump_opts 2>$tmp.dump.mlog | $XFSRESTORE_PROG $restore_opts > $tmp.restore.mlog 2>&1
+    cat $tmp.restore.mlog >> $seqres.full
+    echo "xfsrestore output should contain$expected_str" >> $seqres.full
+    cat $tmp.restore.mlog | _dump_filter | sed -e 's/: \([0-9]*\) directories and \([0-9]*\) entries/: XXX directories and YYY entries/g'
     _dump_filter <$tmp.dump.mlog
+
+    # Did we actually restore as many dirs/files as we had?
+    if ! grep -q "$expected_str" $tmp.restore.mlog; then
+        echo "mismatch counts between directory tree and restored filesystem"
+        grep "directories and.*entries processed" $tmp.restore.mlog | sed -e 's/^.*:/found:/g'
+        echo "expected$expected_str"
+    fi
+
+    # Does the directory tree match?
+    diff -u $seqres.fstree <(cd $restore_dir/$dump_sdir ; find | sort)
+
+    # Measure the md5 of every restored file...
+    (cd $restore_dir/$dump_sdir ; md5sum --quiet -c $seqres.md5)
+
+    rm -rf $seqres.md5 $seqres.fstree $tmp.restore.mlog $tmp.dump.mlog
 }
 
 #
diff --git a/tests/xfs/027.out b/tests/xfs/027.out
index ba425a3..7665021 100644
--- a/tests/xfs/027.out
+++ b/tests/xfs/027.out
@@ -19,7 +19,7 @@ xfsrestore: session id: ID
 xfsrestore: media ID: ID
 xfsrestore: searching media for directory dump
 xfsrestore: reading directories
-xfsrestore: 3 directories and 39 entries processed
+xfsrestore: XXX directories and YYY entries processed
 xfsrestore: directory post-processing
 xfsrestore: restoring non-directory files
 xfsrestore: restore complete: SECS seconds elapsed
diff --git a/tests/xfs/068.out b/tests/xfs/068.out
index fa3a552..f53c555 100644
--- a/tests/xfs/068.out
+++ b/tests/xfs/068.out
@@ -22,7 +22,7 @@ xfsrestore: session id: ID
 xfsrestore: media ID: ID
 xfsrestore: searching media for directory dump
 xfsrestore: reading directories
-xfsrestore: 383 directories and 1335 entries processed
+xfsrestore: XXX directories and YYY entries processed
 xfsrestore: directory post-processing
 xfsrestore: restoring non-directory files
 xfsrestore: restore complete: SECS seconds elapsed

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v3 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-15  2:08   ` [PATCH v3 8/8] xfs/068: fix variability " Darrick J. Wong
@ 2017-12-15  2:16     ` Darrick J. Wong
  0 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-15  2:16 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

On Thu, Dec 14, 2017 at 06:08:59PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> In these tests we use a fixed sequence of operations in fsstress to
> create a directory tree and then exercise xfsdump/xfsrestore on that.
> However, this changes every time someone adds a new fsstress command,
> or someone runs with FSSTRESS_AVOID, etc.
> 
> Therefore, check the counts directly from xfsrestore output instead
> of relying on the golden output and do much more rigorous checking of
> the dir tree complexity and the intactness of the dirs and files after
> restoring them.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> v3: moar checking
> ---
>  common/dump       |   40 ++++++++++++++++++++++++++++++++++++++--
>  tests/xfs/027.out |    2 +-
>  tests/xfs/068.out |    2 +-
>  3 files changed, 40 insertions(+), 4 deletions(-)
> 
> diff --git a/common/dump b/common/dump
> index 898aaa4..db2e156 100644
> --- a/common/dump
> +++ b/common/dump
> @@ -316,7 +316,8 @@ _create_dumpdir_stress_num()
>      echo "-----------------------------------------------"
>      echo "fsstress : $_param"
>      echo "-----------------------------------------------"
> -    if ! $here/ltp/fsstress $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >$tmp.out 2>&1
> +    echo $FSSTRESS_PROG $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >> $seqres.full
> +    if ! $FSSTRESS_PROG $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >$tmp.out 2>&1
>      then
>          echo "    fsstress (count=$_count) returned $? - see $seqres.full"
>  
> @@ -1240,9 +1241,44 @@ _do_dump_restore()
>      echo "xfsdump|xfsrestore ..."
>      restore_opts="$_restore_debug$restore_args - $restore_dir"
>      dump_opts="$_dump_debug$dump_args -s $dump_sdir - $SCRATCH_MNT"
> +
> +    # We expect there to be one more dir (and inode) than what's in dump_dir.
> +    # Construct the string we expect to see in the output, since fsstress
> +    # will create different directory structures every time someone adds
> +    # a new command, runs with a different FSSTRESS_AVOID, etc.
> +    expected_dirs=$(find $dump_dir -type d | wc -l)
> +    expected_inodes=$(find $dump_dir | wc -l)
> +    expected_str=": $((expected_dirs + 1)) directories and $((expected_inodes + 1)) entries processed"
> +
> +    if [ $expected_dirs -lt 100 ] || [ $expected_inodes -lt 600 ]; then
> +        echo "Oddly small dir tree$expected_str"
> +    fi

...aaand of course I sent the wrong version from earlier, so NAK.

--D

> +
> +    # Measure the md5 of every file...
> +    (cd $dump_dir ; find . -type f -print0 | xargs -0 md5sum) > $seqres.md5
> +    (cd $dump_dir ; find | sort) > $seqres.fstree
> +
>      echo "xfsdump $dump_opts | xfsrestore $restore_opts" | _dir_filter
> -    $XFSDUMP_PROG $dump_opts 2>$tmp.dump.mlog | $XFSRESTORE_PROG $restore_opts 2>&1 | tee -a $seqres.full | _dump_filter
> +    $XFSDUMP_PROG $dump_opts 2>$tmp.dump.mlog | $XFSRESTORE_PROG $restore_opts > $tmp.restore.mlog 2>&1
> +    cat $tmp.restore.mlog >> $seqres.full
> +    echo "xfsrestore output should contain$expected_str" >> $seqres.full
> +    cat $tmp.restore.mlog | _dump_filter | sed -e 's/: \([0-9]*\) directories and \([0-9]*\) entries/: XXX directories and YYY entries/g'
>      _dump_filter <$tmp.dump.mlog
> +
> +    # Did we actually restore as many dirs/files as we had?
> +    if ! grep -q "$expected_str" $tmp.restore.mlog; then
> +        echo "mismatch counts between directory tree and restored filesystem"
> +        grep "directories and.*entries processed" $tmp.restore.mlog | sed -e 's/^.*:/found:/g'
> +        echo "expected$expected_str"
> +    fi
> +
> +    # Does the directory tree match?
> +    diff -u $seqres.fstree <(cd $restore_dir/$dump_sdir ; find | sort)
> +
> +    # Measure the md5 of every restored file...
> +    (cd $restore_dir/$dump_sdir ; md5sum --quiet -c $seqres.md5)
> +
> +    rm -rf $seqres.md5 $seqres.fstree $tmp.restore.mlog $tmp.dump.mlog
>  }
>  
>  #
> diff --git a/tests/xfs/027.out b/tests/xfs/027.out
> index ba425a3..7665021 100644
> --- a/tests/xfs/027.out
> +++ b/tests/xfs/027.out
> @@ -19,7 +19,7 @@ xfsrestore: session id: ID
>  xfsrestore: media ID: ID
>  xfsrestore: searching media for directory dump
>  xfsrestore: reading directories
> -xfsrestore: 3 directories and 39 entries processed
> +xfsrestore: XXX directories and YYY entries processed
>  xfsrestore: directory post-processing
>  xfsrestore: restoring non-directory files
>  xfsrestore: restore complete: SECS seconds elapsed
> diff --git a/tests/xfs/068.out b/tests/xfs/068.out
> index fa3a552..f53c555 100644
> --- a/tests/xfs/068.out
> +++ b/tests/xfs/068.out
> @@ -22,7 +22,7 @@ xfsrestore: session id: ID
>  xfsrestore: media ID: ID
>  xfsrestore: searching media for directory dump
>  xfsrestore: reading directories
> -xfsrestore: 383 directories and 1335 entries processed
> +xfsrestore: XXX directories and YYY entries processed
>  xfsrestore: directory post-processing
>  xfsrestore: restoring non-directory files
>  xfsrestore: restore complete: SECS seconds elapsed
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH v4 8/8] xfs/068: fix variability problems in file/dir count output
  2017-12-13  6:04 ` [PATCH 8/8] xfs/068: fix variability problems in file/dir count output Darrick J. Wong
                     ` (2 preceding siblings ...)
  2017-12-15  2:08   ` [PATCH v3 8/8] xfs/068: fix variability " Darrick J. Wong
@ 2017-12-15  2:17   ` Darrick J. Wong
  3 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2017-12-15  2:17 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

In these tests we use a fixed sequence of operations in fsstress to
create a directory tree and then exercise xfsdump/xfsrestore on that.
However, this changes every time someone adds a new fsstress command,
or someone runs with FSSTRESS_AVOID, etc.

Therefore, check the counts directly from xfsrestore output instead
of relying on the golden output; check that the paths of every file
in the directory tree match; and check that the md5 of every regular
file matches.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v4: add lots more checking of everything we restored
---
 common/dump       |   36 ++++++++++++++++++++++++++++++++++--
 tests/xfs/027.out |    2 +-
 tests/xfs/068     |    9 +++++++++
 tests/xfs/068.out |    2 +-
 4 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/common/dump b/common/dump
index 898aaa4..f4edd53 100644
--- a/common/dump
+++ b/common/dump
@@ -316,7 +316,8 @@ _create_dumpdir_stress_num()
     echo "-----------------------------------------------"
     echo "fsstress : $_param"
     echo "-----------------------------------------------"
-    if ! $here/ltp/fsstress $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >$tmp.out 2>&1
+    echo $FSSTRESS_PROG $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >> $seqres.full
+    if ! $FSSTRESS_PROG $_param -s 1 $FSSTRESS_AVOID -n $_count -d $dump_dir >$tmp.out 2>&1
     then
         echo "    fsstress (count=$_count) returned $? - see $seqres.full"
 
@@ -1240,9 +1241,40 @@ _do_dump_restore()
     echo "xfsdump|xfsrestore ..."
     restore_opts="$_restore_debug$restore_args - $restore_dir"
     dump_opts="$_dump_debug$dump_args -s $dump_sdir - $SCRATCH_MNT"
+
+    # We expect there to be one more dir (and inode) than what's in dump_dir.
+    # Construct the string we expect to see in the output, since fsstress
+    # will create different directory structures every time someone adds
+    # a new command, runs with a different FSSTRESS_AVOID, etc.
+    expected_dirs=$(find $dump_dir -type d | wc -l)
+    expected_inodes=$(find $dump_dir | wc -l)
+    expected_str=": $((expected_dirs + 1)) directories and $((expected_inodes + 1)) entries processed"
+
+    # Measure the md5 of every file...
+    (cd $dump_dir ; find . -type f -print0 | xargs -0 md5sum) > $seqres.md5
+    (cd $dump_dir ; find | sort) > $seqres.fstree
+
     echo "xfsdump $dump_opts | xfsrestore $restore_opts" | _dir_filter
-    $XFSDUMP_PROG $dump_opts 2>$tmp.dump.mlog | $XFSRESTORE_PROG $restore_opts 2>&1 | tee -a $seqres.full | _dump_filter
+    $XFSDUMP_PROG $dump_opts 2>$tmp.dump.mlog | $XFSRESTORE_PROG $restore_opts > $tmp.restore.mlog 2>&1
+    cat $tmp.restore.mlog >> $seqres.full
+    echo "xfsrestore output should contain$expected_str" >> $seqres.full
+    cat $tmp.restore.mlog | _dump_filter | sed -e 's/: \([0-9]*\) directories and \([0-9]*\) entries/: XXX directories and YYY entries/g'
     _dump_filter <$tmp.dump.mlog
+
+    # Did we actually restore as many dirs/files as we had?
+    if ! grep -q "$expected_str" $tmp.restore.mlog; then
+        echo "mismatch counts between directory tree and restored filesystem"
+        grep "directories and.*entries processed" $tmp.restore.mlog | sed -e 's/^.*:/found:/g'
+        echo "expected$expected_str"
+    fi
+
+    # Does the directory tree match?
+    diff -u $seqres.fstree <(cd $restore_dir/$dump_sdir ; find | sort)
+
+    # Measure the md5 of every restored file...
+    (cd $restore_dir/$dump_sdir ; md5sum --quiet -c $seqres.md5)
+
+    rm -rf $seqres.md5 $seqres.fstree $tmp.restore.mlog $tmp.dump.mlog
 }
 
 #
diff --git a/tests/xfs/027.out b/tests/xfs/027.out
index ba425a3..7665021 100644
--- a/tests/xfs/027.out
+++ b/tests/xfs/027.out
@@ -19,7 +19,7 @@ xfsrestore: session id: ID
 xfsrestore: media ID: ID
 xfsrestore: searching media for directory dump
 xfsrestore: reading directories
-xfsrestore: 3 directories and 39 entries processed
+xfsrestore: XXX directories and YYY entries processed
 xfsrestore: directory post-processing
 xfsrestore: restoring non-directory files
 xfsrestore: restore complete: SECS seconds elapsed
diff --git a/tests/xfs/068 b/tests/xfs/068
index 7151e28..9ecb836 100755
--- a/tests/xfs/068
+++ b/tests/xfs/068
@@ -44,6 +44,15 @@ _supported_fs xfs
 _supported_os Linux
 
 _create_dumpdir_stress_num 4096
+
+# Let's make sure there's a largeish number of dirs/files here.
+expected_dirs=$(find $dump_dir -type d | wc -l)
+expected_inodes=$(find $dump_dir | wc -l)
+
+if [ $expected_dirs -lt 100 ] || [ $expected_inodes -lt 600 ]; then
+	echo "Oddly small dir tree: $((expected_dirs + 1)) directories and $((expected_inodes + 1)) entries processed"
+fi
+
 _do_dump_restore
 
 # success, all done
diff --git a/tests/xfs/068.out b/tests/xfs/068.out
index fa3a552..f53c555 100644
--- a/tests/xfs/068.out
+++ b/tests/xfs/068.out
@@ -22,7 +22,7 @@ xfsrestore: session id: ID
 xfsrestore: media ID: ID
 xfsrestore: searching media for directory dump
 xfsrestore: reading directories
-xfsrestore: 383 directories and 1335 entries processed
+xfsrestore: XXX directories and YYY entries processed
 xfsrestore: directory post-processing
 xfsrestore: restoring non-directory files
 xfsrestore: restore complete: SECS seconds elapsed

^ permalink raw reply related	[flat|nested] 55+ messages in thread
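
Condensed, the check that _do_dump_restore performs after this patch is
roughly the following sketch (variable names taken from the patch above;
this is not the full helper):

    # count what we are about to dump; xfsrestore reports one extra
    # directory (and entry) for the restore root
    expected_dirs=$(find $dump_dir -type d | wc -l)
    expected_inodes=$(find $dump_dir | wc -l)
    expected_str=": $((expected_dirs + 1)) directories and $((expected_inodes + 1)) entries processed"

    # after the dump|restore pipeline, complain if the counts, the
    # directory tree, or the file contents fail to match
    grep -q "$expected_str" $tmp.restore.mlog || \
        echo "mismatch counts between directory tree and restored filesystem"
    diff -u $seqres.fstree <(cd $restore_dir/$dump_sdir ; find | sort)
    (cd $restore_dir/$dump_sdir ; md5sum --quiet -c $seqres.md5)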

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-15  2:04             ` Darrick J. Wong
@ 2017-12-15  4:37               ` Dave Chinner
  2017-12-15  7:06                 ` Amir Goldstein
  0 siblings, 1 reply; 55+ messages in thread
From: Dave Chinner @ 2017-12-15  4:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Eryu Guan, Amir Goldstein, linux-xfs, fstests

On Thu, Dec 14, 2017 at 06:04:19PM -0800, Darrick J. Wong wrote:
> On Fri, Dec 15, 2017 at 08:35:41AM +1100, Dave Chinner wrote:
> > On Thu, Dec 14, 2017 at 03:49:47PM +0800, Eryu Guan wrote:
> > > On Thu, Dec 14, 2017 at 08:52:32AM +0200, Amir Goldstein wrote:
> > > > On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
> > > > > On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
> > > > >> From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > >>
> > > > >> In this test we use a fixed sequence of operations in fsstress to create
> > > > >> some number of files and dirs and then exercise xfsdump/xfsrestore on
> > > > >> them.  Since clonerange/deduperange are not supported on all xfs
> > > > >> configurations, detect if they're in fsstress and disable them so that
> > > > >> we always execute exactly the same sequence of operations no matter how
> > > > >> the filesystem is configured.
> > > > >>
> > > > >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > >> ---
> > > > >>  tests/xfs/068 |    8 ++++++++
> > > > >>  1 file changed, 8 insertions(+)
> > > > >>
> > > > >> diff --git a/tests/xfs/068 b/tests/xfs/068
> > > > >> index 7151e28..f95a539 100755
> > > > >> --- a/tests/xfs/068
> > > > >> +++ b/tests/xfs/068
> > > > >> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
> > > > >>  _supported_fs xfs
> > > > >>  _supported_os Linux
> > > > >>
> > > > >> +# Remove fsstress commands that aren't supported on all xfs configs
> > > > >> +if $FSSTRESS_PROG | grep -q clonerange; then
> > > > >> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
> > > > >> +fi
> > > > >> +if $FSSTRESS_PROG | grep -q deduperange; then
> > > > >> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
> > > > >> +fi
> > > > >> +
> > > > >
> > > > > I'd put this inside _create_dumpdir_stress_num as it's supposed to
> > > > > DTRT for the dump/restore that follows. Otherwise looks fine.
> > > > >
> > > > 
> > > > Guys,
> > > > 
> > > > Please take a look at the only 2 changes in the history of this test.
> > > > I would like to make sure we are not in a loop:
> > > > 
> > > > 5d36d85 xfs/068: update golden output due to new operations in fsstress
> > > > 6e5194d fsstress: Add fallocate insert range operation
> > > > 
> > > > The first change excludes the new insert op (by dchinner on commit)
> > > > The second change re-includes insert op, does not exclude new
> > > > mread/mwrite ops and updates golden output, following this discussion:
> > > > https://marc.info/?l=fstests&m=149014697111838&w=2
> > > > (the referenced thread ends with a ? to Dave, but was followed by v6..v8
> > > >  that were "silently acked" by Dave).
> > > > 
> > > > I personally argued that the blacklist approach to xfs/068 is fragile and indeed
> > > > this is the third time the test breaks in the history I know of,
> > > > because of added
> > > > fsstress ops. Fine. As long as we at least stay consistent with a decision about
> > > > update golden output vs. exclude ops and document the decision in a comment
> > > > with the reasoning, so we won't have to repeat this discussion next time.
> > > 
> > > I think the fundamental problem of xfs/068 is the hardcoded file numbers
> > > in .out file, perhaps we should calculate the expected number of
> > > files/dirs to be dumped/restored before the dump test and extract the
> > > actual restored number of files/dirs from xfsrestore output and do a
> > > comparison. (or save the whole tree structure for comparison? I haven't
> > > done any test yet, just some random thoughts for now.)
> > 
> > Or we don't waste any more time on trying to make a reliable, stable
> > regression test that has a history of detecting bulkstat regressions
> > work differently?
> 
> <shrug> See now, the frustrating part about fixing this testcase is that
> I still don't feel like I have a good grasp on what this thing is trying
> to test -- apparently we're checking for bulkstat regressions, dump
> problems, and restore problems? 

commit 481c28f52fd4ed3976f2733a1c65f92760138258
Author: Eric Sandeen <sandeen@redhat.com>
Date:   Tue Oct 14 22:59:39 2014 +1100

    xfs: test larger dump/restore to/from file
    
    This test creates a large-ish directory structure using
    fsstress, and does a dump/restore to make sure we dump
    all the files.
    
    Without the fix for the regression caused by:
    c7cb51d xfs: fix error handling at xfs_inumbers
    
    we will see failures like:
    
        -xfsrestore: 486 directories and 1590 entries processed
        +xfsrestore: 30 directories and 227 entries processed
    
    as it fails to process all inodes.
    
    I think that existing tests have a much smaller set of files,
    and so don't trip the bug.
    
    I don't do a file-by-file comparison here, because for some
    reason the diff output gets garbled; this test only checks
    that we've dumped & restored the correct number of files.
    
    Signed-off-by: Eric Sandeen <sandeen@redhat.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Dave Chinner <david@fromorbit.com>

FWIW, I'm pretty sure the diff problems were related to binary file
contents, so it was dropped as it wasn't critical to validating
that bulkstat and inode number iteration worked correctly.

> Are we also looking for problems that
> might crop up with the newer APIs, whatever those might be?

No, we're explicitly using fsstress to generate a dataset large
enough to exercise iteration over the various APIs xfsdump relies on
and make sure they work correctly. i.e. the features fsstress has
are irrelevant to the functioning of this test - we want it to
generate a specific, consistent, deterministic data set and that's it.

Really, all I care about is that we don't overcomplicate the
problem and the solution. Just adding commands to the avoid list
for fsstress is a perfectly acceptable, simple solution - we've
done it twice in 3 years for this test, and we've done it for other
tests, too. It's hardly a crippling maintenance burden.

And, FWIW, we check the file count from xfsrestore in the golden
output of pretty much every xfsdump/restore test:

$ git grep "entries processed" tests/xfs
tests/xfs/022:_do_restore | sed -e "/entries processed$/s/[0-9][0-9]*/NUM/g"
tests/xfs/022.out:xfsrestore: NUM directories and NUM entries processed
tests/xfs/023.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/024.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/025.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/026.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/027.out:xfsrestore: 3 directories and 39 entries processed
tests/xfs/035.out:xfsrestore: 3 directories and 6 entries processed
tests/xfs/036.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/037.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/038.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/039.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/043.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/046.out:xfsrestore: 3 directories and 10 entries processed
tests/xfs/055.out:xfsrestore: 3 directories and 38 entries processed
tests/xfs/056.out:xfsrestore: 7 directories and 11 entries processed
tests/xfs/060.out:xfsrestore: 3 directories and 41 entries processed
tests/xfs/061.out:xfsrestore: 7 directories and 11 entries processed
tests/xfs/063.out:xfsrestore: 4 directories and 21 entries processed
.....

Really, I don't see a need to do anything else than avoid the
fsstress ops that caused the change of behaviour. All the other
xfsdump/restore tests do file and directory tree validations, so
they are going to catch any regression on that side of things. This
test just exercises iteration of various APIs that we've broken in
the past...

> Currently I have a reworked version of this patch that runs
> fsstress, measures the number of directories and inodes in
> $dump_dir, then programmatically compares that to whatever
> xfsrestore tells us it restored.  This ought to be enough that we
> can create a sufficiently messy filesystem with whatever sequence
> of syscalls we want, and make sure that dump/restore actually work
> on them.
> 
> First we run fsstress, then we count the number of dirs, the
> number of fs objects, take a snapshot of the 'find .' output, and
> md5sum every file in the dump directory.
> 
> If fsstress creates fewer than 100 dirs or 600 inodes, we fail the
> test because that wasn't enough.
> 
> If bulkstat fails to iterate all the inodes, restore's output will
> reflect fewer files than was expected.
> 
> If dump fails to generate a full dump, restore's output will
> reflect fewer files than was expected.
> 
> If restore fails to restore the full dump, restore's output will
> reflect fewer files than was expected.
> 
> If the restore output doesn't reflect the number of dirs/inodes we
> counted at the beginning, we fail the test.
> 
> If the 'find .' output of the restored dir doesn't match the
> original, we fail the test.
> 
> If the md5sum -c output shows corrupt files, we fail the test.
> 
> So now I really have no idea -- is that enough to check that
> everything works?  I felt like it does, but given all the back and
> forth now I'm wondering if even this is enough.

What did I say about not wanting to overcomplicate the problem and
the solution? :/

Folks, I don't say "leave it alone, it's fine" without a good
reason.  If you've never tried to debug xfsdump or xfsrestore, and
you aren't familiar with the ancient xfsdump and restore unit tests
that were written before anyone here was working on linux, then
don't suggest we rewrite them to make them nicer. Their value is in
the fact they've been around almost entirely unchanged for 15 years
and they still catch bugs....

Make whatever changes are necessary to keep them running
exactly as they are and don't change them unless xfsdump/restore
testing requires them to be changed.

> (Yeah, I'm frustrated because the fsstress additions have been
> very helpful at flushing out more reflink bugs and I feel like I'm
> making very little progress on this xfs/068 thing.  Sorry.)

Well, I thought it was all sorted until people started suggesting we
do crazy things like you've now gone and done.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 55+ messages in thread
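
The avoid-list approach argued for here is the one in the v2 patch quoted
above; generalized slightly, it is roughly this sketch (assuming, as that
patch does, that fsstress lists its supported operations when run with no
arguments):

    # keep the op stream identical on every config: if this fsstress
    # binary knows about an op that isn't universally supported,
    # pin its frequency to zero
    for op in clonerange deduperange; do
        if $FSSTRESS_PROG | grep -q "$op"; then
            FSSTRESS_AVOID="-f $op=0 $FSSTRESS_AVOID"
        fi
    done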

* Re: [PATCH v2 8/8] xfs/068: fix clonerange problems in file/dir count output
  2017-12-15  4:37               ` Dave Chinner
@ 2017-12-15  7:06                 ` Amir Goldstein
  0 siblings, 0 replies; 55+ messages in thread
From: Amir Goldstein @ 2017-12-15  7:06 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Darrick J. Wong, Eryu Guan, linux-xfs, fstests

On Fri, Dec 15, 2017 at 6:37 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Dec 14, 2017 at 06:04:19PM -0800, Darrick J. Wong wrote:
>> On Fri, Dec 15, 2017 at 08:35:41AM +1100, Dave Chinner wrote:
>> > On Thu, Dec 14, 2017 at 03:49:47PM +0800, Eryu Guan wrote:
>> > > On Thu, Dec 14, 2017 at 08:52:32AM +0200, Amir Goldstein wrote:
>> > > > On Thu, Dec 14, 2017 at 1:44 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > > > > On Wed, Dec 13, 2017 at 03:28:05PM -0800, Darrick J. Wong wrote:
>> > > > >> From: Darrick J. Wong <darrick.wong@oracle.com>
>> > > > >>
>> > > > >> In this test we use a fixed sequence of operations in fsstress to create
>> > > > >> some number of files and dirs and then exercise xfsdump/xfsrestore on
>> > > > >> them.  Since clonerange/deduperange are not supported on all xfs
>> > > > >> configurations, detect if they're in fsstress and disable them so that
>> > > > >> we always execute exactly the same sequence of operations no matter how
>> > > > >> the filesystem is configured.
>> > > > >>
>> > > > >> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
>> > > > >> ---
>> > > > >>  tests/xfs/068 |    8 ++++++++
>> > > > >>  1 file changed, 8 insertions(+)
>> > > > >>
>> > > > >> diff --git a/tests/xfs/068 b/tests/xfs/068
>> > > > >> index 7151e28..f95a539 100755
>> > > > >> --- a/tests/xfs/068
>> > > > >> +++ b/tests/xfs/068
>> > > > >> @@ -43,6 +43,14 @@ trap "rm -rf $tmp.*; exit \$status" 0 1 2 3 15
>> > > > >>  _supported_fs xfs
>> > > > >>  _supported_os Linux
>> > > > >>
>> > > > >> +# Remove fsstress commands that aren't supported on all xfs configs
>> > > > >> +if $FSSTRESS_PROG | grep -q clonerange; then
>> > > > >> +     FSSTRESS_AVOID="-f clonerange=0 $FSSTRESS_AVOID"
>> > > > >> +fi
>> > > > >> +if $FSSTRESS_PROG | grep -q deduperange; then
>> > > > >> +     FSSTRESS_AVOID="-f deduperange=0 $FSSTRESS_AVOID"
>> > > > >> +fi
>> > > > >> +
>> > > > >
>> > > > > I'd put this inside _create_dumpdir_stress_num as it's supposed to
>> > > > > DTRT for the dump/restore that follows. Otherwise looks fine.
>> > > > >
>> > > >
>> > > > Guys,
>> > > >
>> > > > Please take a look at the only 2 changes in the history of this test.
>> > > > I would like to make sure we are not in a loop:
>> > > >
>> > > > 5d36d85 xfs/068: update golden output due to new operations in fsstress
>> > > > 6e5194d fsstress: Add fallocate insert range operation
>> > > >
>> > > > The first change excludes the new insert op (by dchinner on commit)
>> > > > The second change re-includes insert op, does not exclude new
>> > > > mread/mwrite ops and updates golden output, following this discussion:
>> > > > https://marc.info/?l=fstests&m=149014697111838&w=2
>> > > > (the referenced thread ends with a ? to Dave, but was followed by v6..v8
>> > > >  that were "silently acked" by Dave).
>> > > >
>> > > > I personally argued that the blacklist approach to xfs/068 is fragile and indeed
>> > > > this is the third time the test breaks in the history I know of,
>> > > > because of added
>> > > > fsstress ops. Fine. As long as we at least stay consistent with a decision about
>> > > > update golden output vs. exclude ops and document the decision in a comment
>> > > > with the reasoning, so we won't have to repeat this discussion next time.
>> > >
>> > > I think the fundamental problem of xfs/068 is the hardcoded file numbers
>> > > in .out file, perhaps we should calculate the expected number of
>> > > files/dirs to be dumped/restored before the dump test and extract the
>> > > actual restored number of files/dirs from xfsrestore output and do a
>> > > comparison. (or save the whole tree structure for comparison? I haven't
>> > > done any test yet, just some random thoughts for now.)
>> >
>> > Or we don't waste any more time on trying to make a reliable, stable
>> > regression test that has a history of detecting bulkstat regressions
>> > work differently?
>>
>> <shrug> See now, the frustrating part about fixing this testcase is that
>> I still don't feel like I have a good grasp on what this thing is trying
>> to test -- apparently we're checking for bulkstat regressions, dump
>> problems, and restore problems?
>
> commit 481c28f52fd4ed3976f2733a1c65f92760138258
> Author: Eric Sandeen <sandeen@redhat.com>
> Date:   Tue Oct 14 22:59:39 2014 +1100
>
>     xfs: test larger dump/restore to/from file
>
>     This test creates a large-ish directory structure using
>     fsstress, and does a dump/restore to make sure we dump
>     all the files.
>
>     Without the fix for the regression caused by:
>     c7cb51d xfs: fix error handling at xfs_inumbers
>
>     we will see failures like:
>
>         -xfsrestore: 486 directories and 1590 entries processed
>         +xfsrestore: 30 directories and 227 entries processed
>
>     as it fails to process all inodes.
>
>     I think that existing tests have a much smaller set of files,
>     and so don't trip the bug.
>
>     I don't do a file-by-file comparison here, because for some
>     reason the diff output gets garbled; this test only checks
>     that we've dumped & restored the correct number of files.
>
>     Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>     Reviewed-by: Dave Chinner <dchinner@redhat.com>
>     Signed-off-by: Dave Chinner <david@fromorbit.com>
>
> FWIW, I'm pretty sure the diff problems were related to binary file
> contents, so it was dropped as it wasn't critical to validating
> that bulkstat and inode number iteration worked correctly.
>
>> Are we also looking for problems that
>> might crop up with the newer APIs, whatever those might be?
>
> No, we're explicitly using fsstress to generate a dataset large
> enough to exercise iteration over the various APIs xfsdump relies on
> and make sure they work correctly. i.e. the features fsstress has
> are irrelevant to the functioning of this test - we want it to
> generate a specific, consistent, deterministic data set and that's it.
>
> Really, all I care about is that we don't overcomplicate the
> problem and the solution. Just adding commands to the avoid list
> for fsstress is a perfectly acceptable, simple solution - we've
> done it twice in 3 years for this test, and we've done it for other
> tests, too. It's hardly a crippling maintenance burden.
>
> And, FWIW, we check the file count from xfsrestore in the golden
> output of pretty much every xfsdump/restore test:
>
> $ git grep "entries processed" tests/xfs
> tests/xfs/022:_do_restore | sed -e "/entries processed$/s/[0-9][0-9]*/NUM/g"
> tests/xfs/022.out:xfsrestore: NUM directories and NUM entries processed
> tests/xfs/023.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/024.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/025.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/026.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/027.out:xfsrestore: 3 directories and 39 entries processed
> tests/xfs/035.out:xfsrestore: 3 directories and 6 entries processed
> tests/xfs/036.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/037.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/038.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/039.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/043.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/046.out:xfsrestore: 3 directories and 10 entries processed
> tests/xfs/055.out:xfsrestore: 3 directories and 38 entries processed
> tests/xfs/056.out:xfsrestore: 7 directories and 11 entries processed
> tests/xfs/060.out:xfsrestore: 3 directories and 41 entries processed
> tests/xfs/061.out:xfsrestore: 7 directories and 11 entries processed
> tests/xfs/063.out:xfsrestore: 4 directories and 21 entries processed
> .....
>
> Really, I don't see a need to do anything else than avoid the
> fsstress ops that caused the change of behaviour. All the other
> xfsdump/restore tests do file and directory tree validations, so
> they are going to catch any regression on that side of things. This
> test just exercises iteration of various APIs that we've broken in
> the past...
>
>> Currently I have a reworked version of this patch that runs
>> fsstress, measures the number of directories and inodes in
>> $dump_dir, then programmatically compares that to whatever
>> xfsrestore tells us it restored.  This ought to be enough that we
>> can create a sufficiently messy filesystem with whatever sequence
>> of syscalls we want, and make sure that dump/restore actually work
>> on them.
>>
>> First we run fsstress, then we count the number of dirs, the
>> number of fs objects, take a snapshot of the 'find .' output, and
>> md5sum every file in the dump directory.
>>
>> If fsstress creates fewer than 100 dirs or 600 inodes, we fail the
>> test because that wasn't enough.
>>
>> If bulkstat fails to iterate all the inodes, restore's output will
>> reflect fewer files than was expected.
>>
>> If dump fails to generate a full dump, restore's output will
>> reflect fewer files than was expected.
>>
>> If restore fails to restore the full dump, restore's output will
>> reflect fewer files than was expected.
>>
>> If the restore output doesn't reflect the number of dirs/inodes we
>> counted at the beginning, we fail the test.
>>
>> If the 'find .' output of the restored dir doesn't match the
>> original, we fail the test.
>>
>> If the md5sum -c output shows corrupt files, we fail the test.
>>
>> So now I really have no idea -- is that enough to check that
>> everything works?  I felt like it does, but given all the back and
>> forth now I'm wondering if even this is enough.
>
> What did I say about not wanting to overcomplicate the problem and
> the solution? :/
>
> Folks, I don't say "leave it alone, it's fine" without a good
> reason.  If you've never tried to debug xfsdump or xfsrestore, and
> you aren't familiar with the ancient xfsdump and restore unit tests
> that were written before anyone here was working on linux, then
> don't suggest we rewrite them to make them nicer. Their value is in
> the fact they've been around almost entirely unchanged for 15 years
> and they still catch bugs....
>
> Make whatever changes are necessary to keep them running
> exactly as they are and don't change them unless xfsdump/restore
> testing requires them to be changed.
>
>> (Yeah, I'm frustrated because the fsstress additions have been
>> very helpful at flushing out more reflink bugs and I feel like I'm
>> making very little progress on this xfs/068 thing.  Sorry.)
>
> Well, I thought it was all sorted until people started suggesting we
> do crazy things like you've now gone and done.
>

Please read this test description out loud and tell yourself this is a good
sanity test:
1. perform a $series of fs operations
2. backup & restore
3. count the number of files & dirs
4. make sure the $count didn't change across $kernel & user $tools releases

Now a developer changes the test variable $series, so the resulting $count
obviously changes, so the developer updates the expected count WITHOUT
verifying that the expected count is actually correct.

There is NOTHING in the current test (before Darrick's "crazy" changes) that
promises we are not actually validating that bulkstat *is* broken.

Amir.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 0/8] weekly fstests changes
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (7 preceding siblings ...)
  2017-12-13  6:04 ` [PATCH 8/8] xfs/068: fix variability problems in file/dir count output Darrick J. Wong
@ 2017-12-15  8:55 ` Eryu Guan
  2018-01-03 19:22 ` [PATCH 9/8] xfs: find libxfs api violations Darrick J. Wong
  2018-01-03 19:26 ` [PATCH 10/8] xfs: check that fs freeze minimizes required recovery Darrick J. Wong
  10 siblings, 0 replies; 55+ messages in thread
From: Eryu Guan @ 2017-12-15  8:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Tue, Dec 12, 2017 at 10:03:10PM -0800, Darrick J. Wong wrote:
> Hi all,
> 
> Here's the usual weekly pile of fstests changes. :)
> 
> We start with the same patch as last time that scans kmemleak for leaks
> after each test.
> 
> Then we move on to a fix in the scrub probe code because upstream
> xfsprogs changed xfs_io syntax again.
> 
> Third is a patch that adds Unicode linedraw character tests to the two
> tests that check that we can store arbitrary byte patterns without
> screwing things up.
> 
> The fourth patch amends various xfs tests to deal with our slow removal
> of zero-alloc transactions, which is causing unexpected fs shutdowns
> when free space nears zero.

I'm taking patches 2-4 for this week's update; I'll need more time to look
at, test, and think more about the other patches.

Thanks,
Eryu

> 
> The fifth patch adds a new test to invoke dm-error on a fsstress run;
> this test simulates (for the filesystem) an internal "yank all the disk
> cables out" test.
> 
> The sixth patch adds FICLONERANGE/FIDEDUPERANGE support to fsstress.
> This was helpful in finding a bunch of unhandled corner cases in the xfs
> reflink implementation.
> 
> Patch seven test write-only fsstress to maximize testing of write paths
> where coding mistakes cannot be backed out of so easily.
> 
> The final patch in the series fixes the dir/file counts that are
> hardcoded in the xfs/068 golden output, since with fsstress doing
> reflink now we might get a lot more "files" than we used to.
> 
> --D
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2017-12-15  2:07   ` [PATCH v2 " Darrick J. Wong
@ 2018-01-03  8:48     ` Eryu Guan
  2018-01-03 17:12       ` Darrick J. Wong
  2018-02-22 16:06     ` Luis Henriques
  1 sibling, 1 reply; 55+ messages in thread
From: Eryu Guan @ 2018-01-03  8:48 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Mix it up a bit by reflinking and deduping data blocks when possible.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

This looks fine overall, but I noticed a soft lockup bug in generic/083
and generic/269 (both tests exercise ENOSPC behavior); the test config is
reflink+rmapbt XFS with a 4k block size. I'm not sure yet whether the soft
lockup is related to the clonerange/deduperange ops in fsstress, I'll
confirm without the clone/dedupe ops.

[12968.100008] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [fsstress:6903]
[12968.100038] Modules linked in: loop dm_flakey xfs ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc 8139too 8139cp i2c_piix4 joydev mii pcspkr virtio_balloon virtio_pci serio_raw virtio_ring virtio floppy ata_generic pata_acpi
[12968.104043] irq event stamp: 23222196
[12968.104043] hardirqs last  enabled at (23222195): [<000000007d0c2e75>] restore_regs_and_return_to_kernel+0x0/0x2e
[12968.105111] hardirqs last disabled at (23222196): [<000000008f80dc57>] apic_timer_interrupt+0xa7/0xc0
[12968.105111] softirqs last  enabled at (877594): [<0000000034c53d5e>] __do_softirq+0x392/0x502
[12968.105111] softirqs last disabled at (877585): [<000000003f4d9e0b>] irq_exit+0x102/0x110
[12968.105111] CPU: 2 PID: 6903 Comm: fsstress Tainted: G        W    L   4.15.0-rc5 #10
[12968.105111] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[12968.108043] RIP: 0010:xfs_bmapi_update_map+0xc/0xc0 [xfs]
[12968.108043] RSP: 0018:ffffb8cbc2b8ba88 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
[12968.109028] RAX: ffffb8cbc2b8bc50 RBX: 0000000000000a40 RCX: 000000000000012b
[12968.109111] RDX: ffffb8cbc2b8bb00 RSI: ffffb8cbc2b8bb08 RDI: ffffb8cbc2b8baf8
[12968.109111] RBP: ffffb8cbc2b8bc10 R08: 000000000000012c R09: ffffb8cbc2b8bb14
[12968.109111] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb8cbc2b8bb28
[12968.109111] R13: ffffb8cbc2b8bb68 R14: 000000000000012c R15: 0000000000000001
[12968.109111] FS:  00007fed71507b80(0000) GS:ffff98f457200000(0000) knlGS:0000000000000000
[12968.112047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[12968.112047] CR2: 00007fed71503000 CR3: 000000020f50d000 CR4: 00000000000006e0
[12968.113049] Call Trace:
[12968.113049]  xfs_bmapi_write+0x33e/0xcc0 [xfs]
[12968.113049]  xfs_reflink_convert_cow+0x8c/0xc0 [xfs]
[12968.113049]  ? xfs_vm_writepages+0x54/0xd0 [xfs]
[12968.113049]  xfs_submit_ioend+0x18f/0x1f0 [xfs]
[12968.113049]  xfs_vm_writepages+0xc5/0xd0 [xfs]
[12968.113049]  do_writepages+0x48/0xf0
[12968.113049]  ? __filemap_fdatawrite_range+0xb4/0x100
[12968.116073]  ? __filemap_fdatawrite_range+0xc1/0x100
[12968.116073]  __filemap_fdatawrite_range+0xc1/0x100
[12968.116073]  xfs_release+0x11c/0x160 [xfs]
[12968.117049]  __fput+0xe6/0x1f0
[12968.117049]  task_work_run+0x82/0xb0
[12968.117049]  exit_to_usermode_loop+0xa8/0xb0
[12968.117049]  syscall_return_slowpath+0x153/0x160
[12968.117049]  entry_SYSCALL_64_fastpath+0x94/0x96
[12968.117049] RIP: 0033:0x7fed70cddcb1
[12968.117049] RSP: 002b:00007ffd8d566118 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[12968.117049] RAX: 0000000000000000 RBX: 00000000000002da RCX: 00007fed70cddcb1
[12968.117049] RDX: 0000000000c1f440 RSI: 0000000000c1e010 RDI: 0000000000000003
[12968.120048] RBP: 0000000000000003 R08: 0000000000000006 R09: 00007ffd8d56612c
[12968.120048] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000012bd3b
[12968.121048] R13: 00000000004073c0 R14: 0000000000000000 R15: 0000000000000000

> ---
> v2: don't disable broken commands, just ignore them
> ---
>  ltp/fsstress.c |  391 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 391 insertions(+)
> 
> diff --git a/ltp/fsstress.c b/ltp/fsstress.c
> index 96f48b1..b02cb0c 100644
> --- a/ltp/fsstress.c
> +++ b/ltp/fsstress.c
> @@ -68,7 +68,9 @@ typedef enum {
>  	OP_BULKSTAT,
>  	OP_BULKSTAT1,
>  	OP_CHOWN,
> +	OP_CLONERANGE,
>  	OP_CREAT,
> +	OP_DEDUPERANGE,
>  	OP_DREAD,
>  	OP_DWRITE,
>  	OP_FALLOCATE,
> @@ -174,7 +176,9 @@ void	awrite_f(int, long);
>  void	bulkstat_f(int, long);
>  void	bulkstat1_f(int, long);
>  void	chown_f(int, long);
> +void	clonerange_f(int, long);
>  void	creat_f(int, long);
> +void	deduperange_f(int, long);
>  void	dread_f(int, long);
>  void	dwrite_f(int, long);
>  void	fallocate_f(int, long);
> @@ -221,7 +225,9 @@ opdesc_t	ops[] = {
>  	{ OP_BULKSTAT, "bulkstat", bulkstat_f, 1, 0 },
>  	{ OP_BULKSTAT1, "bulkstat1", bulkstat1_f, 1, 0 },
>  	{ OP_CHOWN, "chown", chown_f, 3, 1 },
> +	{ OP_CLONERANGE, "clonerange", clonerange_f, 4, 1 },
>  	{ OP_CREAT, "creat", creat_f, 4, 1 },
> +	{ OP_DEDUPERANGE, "deduperange", deduperange_f, 4, 1},
>  	{ OP_DREAD, "dread", dread_f, 4, 0 },
>  	{ OP_DWRITE, "dwrite", dwrite_f, 4, 1 },
>  	{ OP_FALLOCATE, "fallocate", fallocate_f, 1, 1 },
> @@ -2189,6 +2195,391 @@ chown_f(int opno, long r)
>  	free_pathname(&f);
>  }
>  
> +/* reflink some arbitrary range of f1 to f2. */
> +void
> +clonerange_f(
> +	int			opno,
> +	long			r)
> +{
> +#ifdef FICLONERANGE
> +	struct file_clone_range	fcr;
> +	struct pathname		fpath1;
> +	struct pathname		fpath2;
> +	struct stat64		stat1;
> +	struct stat64		stat2;
> +	char			inoinfo1[1024];
> +	char			inoinfo2[1024];
> +	off64_t			lr;
> +	off64_t			off1;
> +	off64_t			off2;
> +	size_t			len;
> +	int			v1;
> +	int			v2;
> +	int			fd1;
> +	int			fd2;
> +	int			ret;
> +	int			e;
> +
> +	/* Load paths */
> +	init_pathname(&fpath1);
> +	if (!get_fname(FT_REGm, r, &fpath1, NULL, NULL, &v1)) {
> +		if (v1)
> +			printf("%d/%d: clonerange read - no filename\n",
> +				procid, opno);
> +		goto out_fpath1;
> +	}
> +
> +	init_pathname(&fpath2);
> +	if (!get_fname(FT_REGm, random(), &fpath2, NULL, NULL, &v2)) {
> +		if (v2)
> +			printf("%d/%d: clonerange write - no filename\n",
> +				procid, opno);
> +		goto out_fpath2;
> +	}
> +
> +	/* Open files */
> +	fd1 = open_path(&fpath1, O_RDONLY);
> +	e = fd1 < 0 ? errno : 0;
> +	check_cwd();
> +	if (fd1 < 0) {
> +		if (v1)
> +			printf("%d/%d: clonerange read - open %s failed %d\n",
> +				procid, opno, fpath1.path, e);
> +		goto out_fpath2;
> +	}
> +
> +	fd2 = open_path(&fpath2, O_WRONLY);
> +	e = fd2 < 0 ? errno : 0;
> +	check_cwd();
> +	if (fd2 < 0) {
> +		if (v2)
> +			printf("%d/%d: clonerange write - open %s failed %d\n",
> +				procid, opno, fpath2.path, e);
> +		goto out_fd1;
> +	}
> +
> +	/* Get file stats */
> +	if (fstat64(fd1, &stat1) < 0) {
> +		if (v1)
> +			printf("%d/%d: clonerange read - fstat64 %s failed %d\n",
> +				procid, opno, fpath1.path, errno);
> +		goto out_fd2;
> +	}
> +	inode_info(inoinfo1, sizeof(inoinfo1), &stat1, v1);
> +
> +	if (fstat64(fd2, &stat2) < 0) {
> +		if (v2)
> +			printf("%d/%d: clonerange write - fstat64 %s failed %d\n",
> +				procid, opno, fpath2.path, errno);
> +		goto out_fd2;
> +	}
> +	inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v1);
                                                      ^^^^ should be v2?
> +
> +	/* Calculate offsets */
> +	len = (random() % FILELEN_MAX) + 1;
> +	len &= ~(stat1.st_blksize - 1);
> +	if (len == 0)
> +		len = stat1.st_blksize;
> +	if (len > stat1.st_size)
> +		len = stat1.st_size;
> +
> +	lr = ((__int64_t)random() << 32) + random();
> +	if (stat1.st_size == len)
> +		off1 = 0;
> +	else
> +		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
> +	off1 %= maxfsize;
> +	off1 &= ~(stat1.st_blksize - 1);

It seems that the offset and len are not required to be block-size
aligned; mind adding some comments on the considerations behind the
offset and len calculations, in both the clonerange and deduperange cases?

Thanks,
Eryu
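
For reference, the calculation being asked about boils down to roughly the
sketch below.  This is only a paraphrase of the quoted clonerange_f code, not
part of the patch; the helper name and parameters (pick_clone_range, fsize,
blksize, maxfsize, filelen_max) are invented here so the fragment stands
alone, whereas fsstress itself uses its FILELEN_MAX/MAXFSIZE constants and
the global maxfsize.  deduperange_f applies the same rounding, with the extra
rule that the length is capped at half of the source file's size.

#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define MIN(a, b)	((a) < (b) ? (a) : (b))

/*
 * Pick a random clone length and source offset for a file of size 'fsize'
 * and block size 'blksize'.  Both values are rounded down to a block
 * boundary, kept inside the file, and kept below the fs-wide 'maxfsize' cap.
 */
static void
pick_clone_range(off_t fsize, unsigned int blksize, off_t maxfsize,
		 long filelen_max, off_t *offp, size_t *lenp)
{
	int64_t	lr;
	off_t	off;
	size_t	len;

	/* Random length, rounded down to a whole block... */
	len = (random() % filelen_max) + 1;
	len &= ~((size_t)blksize - 1);
	if (len == 0)
		len = blksize;		/* ...but never zero... */
	if (len > (size_t)fsize)
		len = fsize;		/* ...and never past EOF. */

	/* 64 bits of randomness to derive the source offset from. */
	lr = ((int64_t)random() << 32) + random();
	if (fsize == (off_t)len)
		off = 0;		/* cloning the whole file */
	else
		off = (off_t)(lr % MIN(fsize - (off_t)len, maxfsize));
	off %= maxfsize;		/* stay below the fs size limit */
	off &= ~((off_t)blksize - 1);	/* block-align the offset */

	*offp = off;
	*lenp = len;
}

Presumably the block-size rounding is there because FICLONERANGE on most
filesystems rejects ranges that are not block aligned (except at EOF), so
unaligned random offsets would mostly just return EINVAL.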

> +
> +	/*
> +	 * If srcfile == destfile, randomly generate destination ranges
> +	 * until we find one that doesn't overlap the source range.
> +	 */
> +	do {
> +		lr = ((__int64_t)random() << 32) + random();
> +		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> +		off2 %= maxfsize;
> +		off2 &= ~(stat2.st_blksize - 1);
> +	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
> +
> +	/* Clone data blocks */
> +	fcr.src_fd = fd1;
> +	fcr.src_offset = off1;
> +	fcr.src_length = len;
> +	fcr.dest_offset = off2;
> +
> +	ret = ioctl(fd2, FICLONERANGE, &fcr);
> +	e = ret < 0 ? errno : 0;
> +	if (v1 || v2) {
> +		printf("%d/%d: clonerange %s%s [%lld,%lld] -> %s%s [%lld,%lld]",
> +			procid, opno,
> +			fpath1.path, inoinfo1, (long long)off1, (long long)len,
> +			fpath2.path, inoinfo2, (long long)off2, (long long)len);
> +
> +		if (ret < 0)
> +			printf(" error %d", e);
> +		printf("\n");
> +	}
> +
> +out_fd2:
> +	close(fd2);
> +out_fd1:
> +	close(fd1);
> +out_fpath2:
> +	free_pathname(&fpath2);
> +out_fpath1:
> +	free_pathname(&fpath1);
> +#endif
> +}
> +
> +/* dedupe some arbitrary range of f1 to f2...fn. */
> +void
> +deduperange_f(
> +	int			opno,
> +	long			r)
> +{
> +#ifdef FIDEDUPERANGE
> +#define INFO_SZ			1024
> +	struct file_dedupe_range *fdr;
> +	struct pathname		*fpath;
> +	struct stat64		*stat;
> +	char			*info;
> +	off64_t			*off;
> +	int			*v;
> +	int			*fd;
> +	int			nr;
> +	off64_t			lr;
> +	size_t			len;
> +	int			ret;
> +	int			i;
> +	int			e;
> +
> +	if (flist[FT_REG].nfiles < 2)
> +		return;
> +
> +	/* Pick somewhere between 2 and 128 files. */
> +	do {
> +		nr = random() % (flist[FT_REG].nfiles + 1);
> +	} while (nr < 2 || nr > 128);
> +
> +	/* Alloc memory */
> +	fdr = malloc(nr * sizeof(struct file_dedupe_range_info) +
> +		     sizeof(struct file_dedupe_range));
> +	if (!fdr) {
> +		printf("%d/%d: line %d error %d\n",
> +			procid, opno, __LINE__, errno);
> +		return;
> +	}
> +	memset(fdr, 0, (nr * sizeof(struct file_dedupe_range_info) +
> +			sizeof(struct file_dedupe_range)));
> +
> +	fpath = calloc(nr, sizeof(struct pathname));
> +	if (!fpath) {
> +		printf("%d/%d: line %d error %d\n",
> +			procid, opno, __LINE__, errno);
> +		goto out_fdr;
> +	}
> +
> +	stat = calloc(nr, sizeof(struct stat64));
> +	if (!stat) {
> +		printf("%d/%d: line %d error %d\n",
> +			procid, opno, __LINE__, errno);
> +		goto out_paths;
> +	}
> +
> +	info = calloc(nr, INFO_SZ);
> +	if (!info) {
> +		printf("%d/%d: line %d error %d\n",
> +			procid, opno, __LINE__, errno);
> +		goto out_stats;
> +	}
> +
> +	off = calloc(nr, sizeof(off64_t));
> +	if (!off) {
> +		printf("%d/%d: line %d error %d\n",
> +			procid, opno, __LINE__, errno);
> +		goto out_info;
> +	}
> +
> +	v = calloc(nr, sizeof(int));
> +	if (!v) {
> +		printf("%d/%d: line %d error %d\n",
> +			procid, opno, __LINE__, errno);
> +		goto out_offsets;
> +	}
> +	fd = calloc(nr, sizeof(int));
> +	if (!fd) {
> +		printf("%d/%d: line %d error %d\n",
> +			procid, opno, __LINE__, errno);
> +		goto out_v;
> +	}
> +	memset(fd, 0xFF, nr * sizeof(int));
> +
> +	/* Get paths for all files */
> +	for (i = 0; i < nr; i++)
> +		init_pathname(&fpath[i]);
> +
> +	if (!get_fname(FT_REGm, r, &fpath[0], NULL, NULL, &v[0])) {
> +		if (v[0])
> +			printf("%d/%d: deduperange read - no filename\n",
> +				procid, opno);
> +		goto out_pathnames;
> +	}
> +
> +	for (i = 1; i < nr; i++) {
> +		if (!get_fname(FT_REGm, random(), &fpath[i], NULL, NULL, &v[i])) {
> +			if (v[i])
> +				printf("%d/%d: deduperange write - no filename\n",
> +					procid, opno);
> +			goto out_pathnames;
> +		}
> +	}
> +
> +	/* Open files */
> +	fd[0] = open_path(&fpath[0], O_RDONLY);
> +	e = fd[0] < 0 ? errno : 0;
> +	check_cwd();
> +	if (fd[0] < 0) {
> +		if (v[0])
> +			printf("%d/%d: deduperange read - open %s failed %d\n",
> +				procid, opno, fpath[0].path, e);
> +		goto out_pathnames;
> +	}
> +
> +	for (i = 1; i < nr; i++) {
> +		fd[i] = open_path(&fpath[i], O_WRONLY);
> +		e = fd[i] < 0 ? errno : 0;
> +		check_cwd();
> +		if (fd[i] < 0) {
> +			if (v[i])
> +				printf("%d/%d: deduperange write - open %s failed %d\n",
> +					procid, opno, fpath[i].path, e);
> +			goto out_fds;
> +		}
> +	}
> +
> +	/* Get file stats */
> +	if (fstat64(fd[0], &stat[0]) < 0) {
> +		if (v[0])
> +			printf("%d/%d: deduperange read - fstat64 %s failed %d\n",
> +				procid, opno, fpath[0].path, errno);
> +		goto out_fds;
> +	}
> +
> +	inode_info(&info[0], INFO_SZ, &stat[0], v[0]);
> +
> +	for (i = 1; i < nr; i++) {
> +		if (fstat64(fd[i], &stat[i]) < 0) {
> +			if (v[i])
> +				printf("%d/%d: deduperange write - fstat64 %s failed %d\n",
> +					procid, opno, fpath[i].path, errno);
> +			goto out_fds;
> +		}
> +		inode_info(&info[i * INFO_SZ], INFO_SZ, &stat[i], v[i]);
> +	}
> +
> +	/* Never try to dedupe more than half of the src file. */
> +	len = (random() % FILELEN_MAX) + 1;
> +	len &= ~(stat[0].st_blksize - 1);
> +	if (len == 0)
> +		len = stat[0].st_blksize / 2;
> +	if (len > stat[0].st_size / 2)
> +		len = stat[0].st_size / 2;
> +
> +	/* Calculate offsets */
> +	lr = ((__int64_t)random() << 32) + random();
> +	if (stat[0].st_size == len)
> +		off[0] = 0;
> +	else
> +		off[0] = (off64_t)(lr % MIN(stat[0].st_size - len, MAXFSIZE));
> +	off[0] %= maxfsize;
> +	off[0] &= ~(stat[0].st_blksize - 1);
> +
> +	/*
> +	 * If srcfile == destfile[i], randomly generate destination ranges
> +	 * until we find one that doesn't overlap the source range.
> +	 */
> +	for (i = 1; i < nr; i++) {
> +		int	tries = 0;
> +
> +		do {
> +			lr = ((__int64_t)random() << 32) + random();
> +			if (stat[i].st_size <= len)
> +				off[i] = 0;
> +			else
> +				off[i] = (off64_t)(lr % MIN(stat[i].st_size - len, MAXFSIZE));
> +			off[i] %= maxfsize;
> +			off[i] &= ~(stat[i].st_blksize - 1);
> +		} while (stat[0].st_ino == stat[i].st_ino &&
> +			 llabs(off[i] - off[0]) < len &&
> +			 tries++ < 10);
> +	}
> +
> +	/* Clone data blocks */
> +	fdr->src_offset = off[0];
> +	fdr->src_length = len;
> +	fdr->dest_count = nr - 1;
> +	for (i = 1; i < nr; i++) {
> +		fdr->info[i - 1].dest_fd = fd[i];
> +		fdr->info[i - 1].dest_offset = off[i];
> +	}
> +
> +	ret = ioctl(fd[0], FIDEDUPERANGE, fdr);
> +	e = ret < 0 ? errno : 0;
> +	if (v[0]) {
> +		printf("%d/%d: deduperange from %s%s [%lld,%lld]",
> +			procid, opno,
> +			fpath[0].path, &info[0], (long long)off[0],
> +			(long long)len);
> +		if (ret < 0)
> +			printf(" error %d", e);
> +		printf("\n");
> +	}
> +	if (ret < 0)
> +		goto out_fds;
> +
> +	for (i = 1; i < nr; i++) {
> +		e = fdr->info[i - 1].status < 0 ? fdr->info[i - 1].status : 0;
> +		if (v[i]) {
> +			printf("%d/%d: ...to %s%s [%lld,%lld]",
> +				procid, opno,
> +				fpath[i].path, &info[i * INFO_SZ],
> +				(long long)off[i], (long long)len);
> +			if (fdr->info[i - 1].status < 0)
> +				printf(" error %d", e);
> +			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_SAME)
> +				printf(" %llu bytes deduplicated",
> +					fdr->info[i - 1].bytes_deduped);
> +			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_DIFFERS)
> +				printf(" differed");
> +			printf("\n");
> +		}
> +	}
> +
> +out_fds:
> +	for (i = 0; i < nr; i++)
> +		if (fd[i] >= 0)
> +			close(fd[i]);
> +out_pathnames:
> +	for (i = 0; i < nr; i++)
> +		free_pathname(&fpath[i]);
> +
> +	free(fd);
> +out_v:
> +	free(v);
> +out_offsets:
> +	free(off);
> +out_info:
> +	free(info);
> +out_stats:
> +	free(stat);
> +out_paths:
> +	free(fpath);
> +out_fdr:
> +	free(fdr);
> +#endif
> +}
> +
>  void
>  setxattr_f(int opno, long r)
>  {

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-01-03  8:48     ` Eryu Guan
@ 2018-01-03 17:12       ` Darrick J. Wong
  2018-01-05  4:35         ` Eryu Guan
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-03 17:12 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Wed, Jan 03, 2018 at 04:48:01PM +0800, Eryu Guan wrote:
> On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Mix it up a bit by reflinking and deduping data blocks when possible.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> This looks fine overall, but I noticed a soft lockup bug in generic/083
> and generic/269 (both tests exercise ENOSPC behavior), test config is
> reflink+rmapbt XFS with 4k block size. Not sure if the soft lockup is
> related to the clonerange/deduperange ops in fsstress yet, will confirm
> without clone/dedupe ops.
> 
> [12968.100008] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [fsstress:6903]
> [12968.100038] Modules linked in: loop dm_flakey xfs ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc 8139too 8139cp i2c_piix4 joydev mii pcspkr virtio_balloon virtio_pci serio_raw virtio_ring virtio floppy ata_generic pata_acpi
> [12968.104043] irq event stamp: 23222196
> [12968.104043] hardirqs last  enabled at (23222195): [<000000007d0c2e75>] restore_regs_and_return_to_kernel+0x0/0x2e
> [12968.105111] hardirqs last disabled at (23222196): [<000000008f80dc57>] apic_timer_interrupt+0xa7/0xc0
> [12968.105111] softirqs last  enabled at (877594): [<0000000034c53d5e>] __do_softirq+0x392/0x502
> [12968.105111] softirqs last disabled at (877585): [<000000003f4d9e0b>] irq_exit+0x102/0x110
> [12968.105111] CPU: 2 PID: 6903 Comm: fsstress Tainted: G        W    L   4.15.0-rc5 #10
> [12968.105111] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> [12968.108043] RIP: 0010:xfs_bmapi_update_map+0xc/0xc0 [xfs]

Hmmm, I haven't seen such a hang; I wonder if we're doing something
we shouldn't be doing and looping in bmapi_write.  In any case it's
a bug with xfs, not fsstress.

--D

> [12968.108043] RSP: 0018:ffffb8cbc2b8ba88 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
> [12968.109028] RAX: ffffb8cbc2b8bc50 RBX: 0000000000000a40 RCX: 000000000000012b
> [12968.109111] RDX: ffffb8cbc2b8bb00 RSI: ffffb8cbc2b8bb08 RDI: ffffb8cbc2b8baf8
> [12968.109111] RBP: ffffb8cbc2b8bc10 R08: 000000000000012c R09: ffffb8cbc2b8bb14
> [12968.109111] R10: 0000000000000000 R11: 0000000000000000 R12: ffffb8cbc2b8bb28
> [12968.109111] R13: ffffb8cbc2b8bb68 R14: 000000000000012c R15: 0000000000000001
> [12968.109111] FS:  00007fed71507b80(0000) GS:ffff98f457200000(0000) knlGS:0000000000000000
> [12968.112047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [12968.112047] CR2: 00007fed71503000 CR3: 000000020f50d000 CR4: 00000000000006e0
> [12968.113049] Call Trace:
> [12968.113049]  xfs_bmapi_write+0x33e/0xcc0 [xfs]
> [12968.113049]  xfs_reflink_convert_cow+0x8c/0xc0 [xfs]
> [12968.113049]  ? xfs_vm_writepages+0x54/0xd0 [xfs]
> [12968.113049]  xfs_submit_ioend+0x18f/0x1f0 [xfs]
> [12968.113049]  xfs_vm_writepages+0xc5/0xd0 [xfs]
> [12968.113049]  do_writepages+0x48/0xf0
> [12968.113049]  ? __filemap_fdatawrite_range+0xb4/0x100
> [12968.116073]  ? __filemap_fdatawrite_range+0xc1/0x100
> [12968.116073]  __filemap_fdatawrite_range+0xc1/0x100
> [12968.116073]  xfs_release+0x11c/0x160 [xfs]
> [12968.117049]  __fput+0xe6/0x1f0
> [12968.117049]  task_work_run+0x82/0xb0
> [12968.117049]  exit_to_usermode_loop+0xa8/0xb0
> [12968.117049]  syscall_return_slowpath+0x153/0x160
> [12968.117049]  entry_SYSCALL_64_fastpath+0x94/0x96
> [12968.117049] RIP: 0033:0x7fed70cddcb1
> [12968.117049] RSP: 002b:00007ffd8d566118 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
> [12968.117049] RAX: 0000000000000000 RBX: 00000000000002da RCX: 00007fed70cddcb1
> [12968.117049] RDX: 0000000000c1f440 RSI: 0000000000c1e010 RDI: 0000000000000003
> [12968.120048] RBP: 0000000000000003 R08: 0000000000000006 R09: 00007ffd8d56612c
> [12968.120048] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000012bd3b
> [12968.121048] R13: 00000000004073c0 R14: 0000000000000000 R15: 0000000000000000
> 
> > ---
> > v2: don't disable broken commands, just ignore them
> > ---
> >  ltp/fsstress.c |  391 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 391 insertions(+)
> > 
> > diff --git a/ltp/fsstress.c b/ltp/fsstress.c
> > index 96f48b1..b02cb0c 100644
> > --- a/ltp/fsstress.c
> > +++ b/ltp/fsstress.c
> > @@ -68,7 +68,9 @@ typedef enum {
> >  	OP_BULKSTAT,
> >  	OP_BULKSTAT1,
> >  	OP_CHOWN,
> > +	OP_CLONERANGE,
> >  	OP_CREAT,
> > +	OP_DEDUPERANGE,
> >  	OP_DREAD,
> >  	OP_DWRITE,
> >  	OP_FALLOCATE,
> > @@ -174,7 +176,9 @@ void	awrite_f(int, long);
> >  void	bulkstat_f(int, long);
> >  void	bulkstat1_f(int, long);
> >  void	chown_f(int, long);
> > +void	clonerange_f(int, long);
> >  void	creat_f(int, long);
> > +void	deduperange_f(int, long);
> >  void	dread_f(int, long);
> >  void	dwrite_f(int, long);
> >  void	fallocate_f(int, long);
> > @@ -221,7 +225,9 @@ opdesc_t	ops[] = {
> >  	{ OP_BULKSTAT, "bulkstat", bulkstat_f, 1, 0 },
> >  	{ OP_BULKSTAT1, "bulkstat1", bulkstat1_f, 1, 0 },
> >  	{ OP_CHOWN, "chown", chown_f, 3, 1 },
> > +	{ OP_CLONERANGE, "clonerange", clonerange_f, 4, 1 },
> >  	{ OP_CREAT, "creat", creat_f, 4, 1 },
> > +	{ OP_DEDUPERANGE, "deduperange", deduperange_f, 4, 1},
> >  	{ OP_DREAD, "dread", dread_f, 4, 0 },
> >  	{ OP_DWRITE, "dwrite", dwrite_f, 4, 1 },
> >  	{ OP_FALLOCATE, "fallocate", fallocate_f, 1, 1 },
> > @@ -2189,6 +2195,391 @@ chown_f(int opno, long r)
> >  	free_pathname(&f);
> >  }
> >  
> > +/* reflink some arbitrary range of f1 to f2. */
> > +void
> > +clonerange_f(
> > +	int			opno,
> > +	long			r)
> > +{
> > +#ifdef FICLONERANGE
> > +	struct file_clone_range	fcr;
> > +	struct pathname		fpath1;
> > +	struct pathname		fpath2;
> > +	struct stat64		stat1;
> > +	struct stat64		stat2;
> > +	char			inoinfo1[1024];
> > +	char			inoinfo2[1024];
> > +	off64_t			lr;
> > +	off64_t			off1;
> > +	off64_t			off2;
> > +	size_t			len;
> > +	int			v1;
> > +	int			v2;
> > +	int			fd1;
> > +	int			fd2;
> > +	int			ret;
> > +	int			e;
> > +
> > +	/* Load paths */
> > +	init_pathname(&fpath1);
> > +	if (!get_fname(FT_REGm, r, &fpath1, NULL, NULL, &v1)) {
> > +		if (v1)
> > +			printf("%d/%d: clonerange read - no filename\n",
> > +				procid, opno);
> > +		goto out_fpath1;
> > +	}
> > +
> > +	init_pathname(&fpath2);
> > +	if (!get_fname(FT_REGm, random(), &fpath2, NULL, NULL, &v2)) {
> > +		if (v2)
> > +			printf("%d/%d: clonerange write - no filename\n",
> > +				procid, opno);
> > +		goto out_fpath2;
> > +	}
> > +
> > +	/* Open files */
> > +	fd1 = open_path(&fpath1, O_RDONLY);
> > +	e = fd1 < 0 ? errno : 0;
> > +	check_cwd();
> > +	if (fd1 < 0) {
> > +		if (v1)
> > +			printf("%d/%d: clonerange read - open %s failed %d\n",
> > +				procid, opno, fpath1.path, e);
> > +		goto out_fpath2;
> > +	}
> > +
> > +	fd2 = open_path(&fpath2, O_WRONLY);
> > +	e = fd2 < 0 ? errno : 0;
> > +	check_cwd();
> > +	if (fd2 < 0) {
> > +		if (v2)
> > +			printf("%d/%d: clonerange write - open %s failed %d\n",
> > +				procid, opno, fpath2.path, e);
> > +		goto out_fd1;
> > +	}
> > +
> > +	/* Get file stats */
> > +	if (fstat64(fd1, &stat1) < 0) {
> > +		if (v1)
> > +			printf("%d/%d: clonerange read - fstat64 %s failed %d\n",
> > +				procid, opno, fpath1.path, errno);
> > +		goto out_fd2;
> > +	}
> > +	inode_info(inoinfo1, sizeof(inoinfo1), &stat1, v1);
> > +
> > +	if (fstat64(fd2, &stat2) < 0) {
> > +		if (v2)
> > +			printf("%d/%d: clonerange write - fstat64 %s failed %d\n",
> > +				procid, opno, fpath2.path, errno);
> > +		goto out_fd2;
> > +	}
> > +	inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v1);
>                                                       ^^^^ should be v2?
> > +
> > +	/* Calculate offsets */
> > +	len = (random() % FILELEN_MAX) + 1;
> > +	len &= ~(stat1.st_blksize - 1);
> > +	if (len == 0)
> > +		len = stat1.st_blksize;
> > +	if (len > stat1.st_size)
> > +		len = stat1.st_size;
> > +
> > +	lr = ((__int64_t)random() << 32) + random();
> > +	if (stat1.st_size == len)
> > +		off1 = 0;
> > +	else
> > +		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
> > +	off1 %= maxfsize;
> > +	off1 &= ~(stat1.st_blksize - 1);
> 
> It seems that the offset and len are not required to be block-size
> aligned; mind adding some comments on the considerations behind the
> offset and len calculations, in both the clonerange and deduperange cases?
> 
> Thanks,
> Eryu
> 
> > +
> > +	/*
> > +	 * If srcfile == destfile, randomly generate destination ranges
> > +	 * until we find one that doesn't overlap the source range.
> > +	 */
> > +	do {
> > +		lr = ((__int64_t)random() << 32) + random();
> > +		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> > +		off2 %= maxfsize;
> > +		off2 &= ~(stat2.st_blksize - 1);
> > +	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
> > +
> > +	/* Clone data blocks */
> > +	fcr.src_fd = fd1;
> > +	fcr.src_offset = off1;
> > +	fcr.src_length = len;
> > +	fcr.dest_offset = off2;
> > +
> > +	ret = ioctl(fd2, FICLONERANGE, &fcr);
> > +	e = ret < 0 ? errno : 0;
> > +	if (v1 || v2) {
> > +		printf("%d/%d: clonerange %s%s [%lld,%lld] -> %s%s [%lld,%lld]",
> > +			procid, opno,
> > +			fpath1.path, inoinfo1, (long long)off1, (long long)len,
> > +			fpath2.path, inoinfo2, (long long)off2, (long long)len);
> > +
> > +		if (ret < 0)
> > +			printf(" error %d", e);
> > +		printf("\n");
> > +	}
> > +
> > +out_fd2:
> > +	close(fd2);
> > +out_fd1:
> > +	close(fd1);
> > +out_fpath2:
> > +	free_pathname(&fpath2);
> > +out_fpath1:
> > +	free_pathname(&fpath1);
> > +#endif
> > +}
> > +
> > +/* dedupe some arbitrary range of f1 to f2...fn. */
> > +void
> > +deduperange_f(
> > +	int			opno,
> > +	long			r)
> > +{
> > +#ifdef FIDEDUPERANGE
> > +#define INFO_SZ			1024
> > +	struct file_dedupe_range *fdr;
> > +	struct pathname		*fpath;
> > +	struct stat64		*stat;
> > +	char			*info;
> > +	off64_t			*off;
> > +	int			*v;
> > +	int			*fd;
> > +	int			nr;
> > +	off64_t			lr;
> > +	size_t			len;
> > +	int			ret;
> > +	int			i;
> > +	int			e;
> > +
> > +	if (flist[FT_REG].nfiles < 2)
> > +		return;
> > +
> > +	/* Pick somewhere between 2 and 128 files. */
> > +	do {
> > +		nr = random() % (flist[FT_REG].nfiles + 1);
> > +	} while (nr < 2 || nr > 128);
> > +
> > +	/* Alloc memory */
> > +	fdr = malloc(nr * sizeof(struct file_dedupe_range_info) +
> > +		     sizeof(struct file_dedupe_range));
> > +	if (!fdr) {
> > +		printf("%d/%d: line %d error %d\n",
> > +			procid, opno, __LINE__, errno);
> > +		return;
> > +	}
> > +	memset(fdr, 0, (nr * sizeof(struct file_dedupe_range_info) +
> > +			sizeof(struct file_dedupe_range)));
> > +
> > +	fpath = calloc(nr, sizeof(struct pathname));
> > +	if (!fpath) {
> > +		printf("%d/%d: line %d error %d\n",
> > +			procid, opno, __LINE__, errno);
> > +		goto out_fdr;
> > +	}
> > +
> > +	stat = calloc(nr, sizeof(struct stat64));
> > +	if (!stat) {
> > +		printf("%d/%d: line %d error %d\n",
> > +			procid, opno, __LINE__, errno);
> > +		goto out_paths;
> > +	}
> > +
> > +	info = calloc(nr, INFO_SZ);
> > +	if (!info) {
> > +		printf("%d/%d: line %d error %d\n",
> > +			procid, opno, __LINE__, errno);
> > +		goto out_stats;
> > +	}
> > +
> > +	off = calloc(nr, sizeof(off64_t));
> > +	if (!off) {
> > +		printf("%d/%d: line %d error %d\n",
> > +			procid, opno, __LINE__, errno);
> > +		goto out_info;
> > +	}
> > +
> > +	v = calloc(nr, sizeof(int));
> > +	if (!v) {
> > +		printf("%d/%d: line %d error %d\n",
> > +			procid, opno, __LINE__, errno);
> > +		goto out_offsets;
> > +	}
> > +	fd = calloc(nr, sizeof(int));
> > +	if (!fd) {
> > +		printf("%d/%d: line %d error %d\n",
> > +			procid, opno, __LINE__, errno);
> > +		goto out_v;
> > +	}
> > +	memset(fd, 0xFF, nr * sizeof(int));
> > +
> > +	/* Get paths for all files */
> > +	for (i = 0; i < nr; i++)
> > +		init_pathname(&fpath[i]);
> > +
> > +	if (!get_fname(FT_REGm, r, &fpath[0], NULL, NULL, &v[0])) {
> > +		if (v[0])
> > +			printf("%d/%d: deduperange read - no filename\n",
> > +				procid, opno);
> > +		goto out_pathnames;
> > +	}
> > +
> > +	for (i = 1; i < nr; i++) {
> > +		if (!get_fname(FT_REGm, random(), &fpath[i], NULL, NULL, &v[i])) {
> > +			if (v[i])
> > +				printf("%d/%d: deduperange write - no filename\n",
> > +					procid, opno);
> > +			goto out_pathnames;
> > +		}
> > +	}
> > +
> > +	/* Open files */
> > +	fd[0] = open_path(&fpath[0], O_RDONLY);
> > +	e = fd[0] < 0 ? errno : 0;
> > +	check_cwd();
> > +	if (fd[0] < 0) {
> > +		if (v[0])
> > +			printf("%d/%d: deduperange read - open %s failed %d\n",
> > +				procid, opno, fpath[0].path, e);
> > +		goto out_pathnames;
> > +	}
> > +
> > +	for (i = 1; i < nr; i++) {
> > +		fd[i] = open_path(&fpath[i], O_WRONLY);
> > +		e = fd[i] < 0 ? errno : 0;
> > +		check_cwd();
> > +		if (fd[i] < 0) {
> > +			if (v[i])
> > +				printf("%d/%d: deduperange write - open %s failed %d\n",
> > +					procid, opno, fpath[i].path, e);
> > +			goto out_fds;
> > +		}
> > +	}
> > +
> > +	/* Get file stats */
> > +	if (fstat64(fd[0], &stat[0]) < 0) {
> > +		if (v[0])
> > +			printf("%d/%d: deduperange read - fstat64 %s failed %d\n",
> > +				procid, opno, fpath[0].path, errno);
> > +		goto out_fds;
> > +	}
> > +
> > +	inode_info(&info[0], INFO_SZ, &stat[0], v[0]);
> > +
> > +	for (i = 1; i < nr; i++) {
> > +		if (fstat64(fd[i], &stat[i]) < 0) {
> > +			if (v[i])
> > +				printf("%d/%d: deduperange write - fstat64 %s failed %d\n",
> > +					procid, opno, fpath[i].path, errno);
> > +			goto out_fds;
> > +		}
> > +		inode_info(&info[i * INFO_SZ], INFO_SZ, &stat[i], v[i]);
> > +	}
> > +
> > +	/* Never try to dedupe more than half of the src file. */
> > +	len = (random() % FILELEN_MAX) + 1;
> > +	len &= ~(stat[0].st_blksize - 1);
> > +	if (len == 0)
> > +		len = stat[0].st_blksize / 2;
> > +	if (len > stat[0].st_size / 2)
> > +		len = stat[0].st_size / 2;
> > +
> > +	/* Calculate offsets */
> > +	lr = ((__int64_t)random() << 32) + random();
> > +	if (stat[0].st_size == len)
> > +		off[0] = 0;
> > +	else
> > +		off[0] = (off64_t)(lr % MIN(stat[0].st_size - len, MAXFSIZE));
> > +	off[0] %= maxfsize;
> > +	off[0] &= ~(stat[0].st_blksize - 1);
> > +
> > +	/*
> > +	 * If srcfile == destfile[i], randomly generate destination ranges
> > +	 * until we find one that doesn't overlap the source range.
> > +	 */
> > +	for (i = 1; i < nr; i++) {
> > +		int	tries = 0;
> > +
> > +		do {
> > +			lr = ((__int64_t)random() << 32) + random();
> > +			if (stat[i].st_size <= len)
> > +				off[i] = 0;
> > +			else
> > +				off[i] = (off64_t)(lr % MIN(stat[i].st_size - len, MAXFSIZE));
> > +			off[i] %= maxfsize;
> > +			off[i] &= ~(stat[i].st_blksize - 1);
> > +		} while (stat[0].st_ino == stat[i].st_ino &&
> > +			 llabs(off[i] - off[0]) < len &&
> > +			 tries++ < 10);
> > +	}
> > +
> > +	/* Clone data blocks */
> > +	fdr->src_offset = off[0];
> > +	fdr->src_length = len;
> > +	fdr->dest_count = nr - 1;
> > +	for (i = 1; i < nr; i++) {
> > +		fdr->info[i - 1].dest_fd = fd[i];
> > +		fdr->info[i - 1].dest_offset = off[i];
> > +	}
> > +
> > +	ret = ioctl(fd[0], FIDEDUPERANGE, fdr);
> > +	e = ret < 0 ? errno : 0;
> > +	if (v[0]) {
> > +		printf("%d/%d: deduperange from %s%s [%lld,%lld]",
> > +			procid, opno,
> > +			fpath[0].path, &info[0], (long long)off[0],
> > +			(long long)len);
> > +		if (ret < 0)
> > +			printf(" error %d", e);
> > +		printf("\n");
> > +	}
> > +	if (ret < 0)
> > +		goto out_fds;
> > +
> > +	for (i = 1; i < nr; i++) {
> > +		e = fdr->info[i - 1].status < 0 ? fdr->info[i - 1].status : 0;
> > +		if (v[i]) {
> > +			printf("%d/%d: ...to %s%s [%lld,%lld]",
> > +				procid, opno,
> > +				fpath[i].path, &info[i * INFO_SZ],
> > +				(long long)off[i], (long long)len);
> > +			if (fdr->info[i - 1].status < 0)
> > +				printf(" error %d", e);
> > +			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_SAME)
> > +				printf(" %llu bytes deduplicated",
> > +					fdr->info[i - 1].bytes_deduped);
> > +			if (fdr->info[i - 1].status == FILE_DEDUPE_RANGE_DIFFERS)
> > +				printf(" differed");
> > +			printf("\n");
> > +		}
> > +	}
> > +
> > +out_fds:
> > +	for (i = 0; i < nr; i++)
> > +		if (fd[i] >= 0)
> > +			close(fd[i]);
> > +out_pathnames:
> > +	for (i = 0; i < nr; i++)
> > +		free_pathname(&fpath[i]);
> > +
> > +	free(fd);
> > +out_v:
> > +	free(v);
> > +out_offsets:
> > +	free(off);
> > +out_info:
> > +	free(info);
> > +out_stats:
> > +	free(stat);
> > +out_paths:
> > +	free(fpath);
> > +out_fdr:
> > +	free(fdr);
> > +#endif
> > +}
> > +
> >  void
> >  setxattr_f(int opno, long r)
> >  {

^ permalink raw reply	[flat|nested] 55+ messages in thread

* [PATCH 9/8] xfs: find libxfs api violations
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (8 preceding siblings ...)
  2017-12-15  8:55 ` [PATCH 0/8] weekly fstests changes Eryu Guan
@ 2018-01-03 19:22 ` Darrick J. Wong
  2018-01-03 19:26 ` [PATCH 10/8] xfs: check that fs freeze minimizes required recovery Darrick J. Wong
  10 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-03 19:22 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

New test to run tools/find-api-violations.sh in xfsprogs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/708     |   56 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/708.out |    2 ++
 tests/xfs/group   |    1 +
 3 files changed, 59 insertions(+)
 create mode 100755 tests/xfs/708
 create mode 100644 tests/xfs/708.out

diff --git a/tests/xfs/708 b/tests/xfs/708
new file mode 100755
index 0000000..4877bcf
--- /dev/null
+++ b/tests/xfs/708
@@ -0,0 +1,56 @@
+#! /bin/bash
+# FS QA Test No. 708
+#
+# find-api-violations test
+#
+# The purpose of this test is to ensure that the xfsprogs programs use the
+# libxfs_ symbols (in libxfs-api-defs.h) instead of raw xfs_ functions.
+# This is for the maintainers; it's not a functionality test.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2018 Oracle, Inc.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "rm -f $tmp.*; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+[ -z "$WORKAREA" ] && \
+	_notrun "Can't run find-api-violations.sh without WORKAREA set"
+[ -f "$WORKAREA/tools/find-api-violations.sh" ] || \
+	_notrun "Can't find find-api-violations.sh tool under \"$WORKAREA\""
+
+echo "Silence is golden."
+
+# Look for API usage problems.  Old versions of the script have an improperly
+# specified grep pattern that is mistaken for a (broken) range specifier in
+# LC_ALL=C, so use English instead.
+(cd "$WORKAREA" ; LC_ALL="en_US.UTF-8" bash ./tools/find-api-violations.sh ) | tee -a $seqres.full
+
+# success, all done
+status=0
+exit
diff --git a/tests/xfs/708.out b/tests/xfs/708.out
new file mode 100644
index 0000000..b40823e
--- /dev/null
+++ b/tests/xfs/708.out
@@ -0,0 +1,2 @@
+QA output created by 708
+Silence is golden.
diff --git a/tests/xfs/group b/tests/xfs/group
index d230060..e1b1582 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -434,3 +434,4 @@
 434 auto quick clone fsr
 435 auto quick clone
 436 auto quick clone fsr
+708 auto quick other

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 10/8] xfs: check that fs freeze minimizes required recovery
  2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
                   ` (9 preceding siblings ...)
  2018-01-03 19:22 ` [PATCH 9/8] xfs: find libxfs api violations Darrick J. Wong
@ 2018-01-03 19:26 ` Darrick J. Wong
  2018-01-09 11:33   ` Eryu Guan
  10 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-03 19:26 UTC (permalink / raw)
  To: eguan; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Make sure that a fs freeze operation cleans up as much of the filesystem as
possible, so as to minimize the recovery required in a crash/remount scenario.  In
particular we want to check that we don't leave CoW preallocations
sitting around in the refcountbt, though this test looks for anything
out of the ordinary on the frozen fs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/903     |  107 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/xfs/903.out |   10 +++++
 tests/xfs/group   |    1 
 3 files changed, 118 insertions(+)
 create mode 100755 tests/xfs/903
 create mode 100644 tests/xfs/903.out

diff --git a/tests/xfs/903 b/tests/xfs/903
new file mode 100755
index 0000000..1686356
--- /dev/null
+++ b/tests/xfs/903
@@ -0,0 +1,107 @@
+#! /bin/bash
+# FS QA Test No. 903
+#
+# Test that frozen filesystems are relatively clean and not full of errors.
+# Prior to freezing a filesystem, we want to minimize the amount of recovery
+# that will have to happen if the system goes down while the fs is frozen.
+# Therefore, start up fsstress and cycle through a few freeze/thaw cycles
+# to ensure that nothing blows up when we try to do this.
+#
+# Unfortunately the log will probably still be dirty, so we can't do much
+# about enforcing a clean repair -n run.
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2000-2002 Silicon Graphics, Inc.  All Rights Reserved.
+# Copyright (c) 2018 Oracle.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; rm -f $tmp.*; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	# Make sure we thaw the fs before we unmount or else we remove the
+	# mount without actually deactivating the filesystem(!)
+	$XFS_IO_PROG -x -c "thaw" $SCRATCH_MNT 2> /dev/null
+	echo "*** unmount"
+	_scratch_unmount 2>/dev/null
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs xfs
+_supported_os Linux
+
+_require_scratch
+
+# xfs_db will OOM kill the machine if you don't have huge amounts of RAM, so
+# don't run this on large filesystems.
+_require_no_large_scratch_dev
+
+echo "*** init FS"
+
+rm -f $seqres.full
+_scratch_unmount >/dev/null 2>&1
+echo "*** MKFS ***" >>$seqres.full
+echo "" >>$seqres.full
+_scratch_mkfs_xfs >>$seqres.full 2>&1 || _fail "mkfs failed"
+_scratch_mount >>$seqres.full 2>&1 || _fail "mount failed"
+
+echo "*** test"
+
+for l in 0 1 2 3 4
+do
+	echo "    *** test $l"
+	FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -n 1000 $FSSTRESS_AVOID`
+	$FSSTRESS_PROG  $FSSTRESS_ARGS >>$seqres.full
+
+	$XFS_IO_PROG -x -c 'freeze' $SCRATCH_MNT
+
+	# Log will probably be dirty after the freeze, record state
+	echo "" >>$seqres.full
+	echo "*** xfs_logprint ***" >>$seqres.full
+	echo "" >>$seqres.full
+	log=clean
+	_scratch_xfs_logprint -tb 2>&1 | tee -a $seqres.full \
+		| head | grep -q "<CLEAN>" || log=dirty
+
+	# Fail if repair complains and the log is clean
+	echo "" >>$seqres.full
+	echo "*** XFS_REPAIR -n ***" >>$seqres.full
+	echo "" >>$seqres.full
+	_scratch_xfs_repair -f -n >> $seqres.full 2>&1
+
+	if [ $? -ne 0 ] && [ "$log" = "clean" ]; then
+		_fail "xfs_repair failed"
+	fi
+
+	$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT
+done
+
+echo "*** done"
+status=0
+exit 0
diff --git a/tests/xfs/903.out b/tests/xfs/903.out
new file mode 100644
index 0000000..378f0cb
--- /dev/null
+++ b/tests/xfs/903.out
@@ -0,0 +1,10 @@
+QA output created by 903
+*** init FS
+*** test
+    *** test 0
+    *** test 1
+    *** test 2
+    *** test 3
+    *** test 4
+*** done
+*** unmount
diff --git a/tests/xfs/group b/tests/xfs/group
index e1b1582..23c26c2 100644
--- a/tests/xfs/group
+++ b/tests/xfs/group
@@ -435,3 +435,4 @@
 435 auto quick clone
 436 auto quick clone fsr
 708 auto quick other
+903 mount auto quick stress

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-01-03 17:12       ` Darrick J. Wong
@ 2018-01-05  4:35         ` Eryu Guan
  2018-01-05  4:54           ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Eryu Guan @ 2018-01-05  4:35 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Wed, Jan 03, 2018 at 09:12:11AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 03, 2018 at 04:48:01PM +0800, Eryu Guan wrote:
> > On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Mix it up a bit by reflinking and deduping data blocks when possible.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > This looks fine overall, but I noticed a soft lockup bug in generic/083
> > and generic/269 (both tests exercise ENOSPC behavior), test config is
> > reflink+rmapbt XFS with 4k block size. Not sure if the soft lockup is
> > related to the clonerange/deduperange ops in fsstress yet, will confirm
> > without clone/dedupe ops.

More testing showed that this may have something to do with the
deduperange operations. (I was testing with Fedora rawhide and a
v4.15-rc5 kernel; I didn't see a hang or soft lockup on my RHEL7-based
host, because there's no FIDEDUPERANGE defined there.)

I reverted the whole clonerange/deduperange support and retested for two
rounds of a full '-g auto' run without hitting any hang or soft lockup.
Then I commented out the deduperange ops and left the clonerange ops
there: no hang/lockup either. Finally I commented out the clonerange ops
but left the deduperange ops there, and I hit a different hang in
generic/270 (still an ENOSPC test). I pasted partial sysrq-w output here;
if the full output is needed please let me know.

[79200.901901] 14266.fsstress. D12200 14533  14460 0x00000000
[79200.902419] Call Trace:
[79200.902655]  ? __schedule+0x2e3/0xb90
[79200.902969]  ? _raw_spin_unlock_irqrestore+0x32/0x60
[79200.903442]  schedule+0x2f/0x90   
[79200.903727]  schedule_timeout+0x1dd/0x540
[79200.904114]  ? __next_timer_interrupt+0xc0/0xc0
[79200.904535]  xfs_inode_ag_walk.isra.12+0x3cc/0x670 [xfs]
[79200.905009]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[79200.905563]  ? kvm_clock_read+0x21/0x30
[79200.905891]  ? sched_clock+0x5/0x10
[79200.906243]  ? sched_clock_local+0x12/0x80
[79200.906598]  ? kvm_clock_read+0x21/0x30
[79200.906920]  ? sched_clock+0x5/0x10
[79200.907273]  ? sched_clock_local+0x12/0x80
[79200.907636]  ? __lock_is_held+0x59/0xa0
[79200.907988]  ? xfs_inode_ag_iterator_tag+0x46/0xb0 [xfs]
[79200.908497]  ? rcu_read_lock_sched_held+0x6b/0x80
[79200.908926]  ? xfs_perag_get_tag+0x28b/0x2f0 [xfs]
[79200.909416]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[79200.909922]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
[79200.910446]  xfs_file_buffered_aio_write+0x348/0x370 [xfs]
[79200.910948]  xfs_file_write_iter+0x99/0x140 [xfs]
[79200.911400]  __vfs_write+0xfc/0x170
[79200.911726]  vfs_write+0xc1/0x1b0
[79200.912063]  SyS_write+0x55/0xc0
[79200.912347]  entry_SYSCALL_64_fastpath+0x1f/0x96

Seems the other hanging fsstress processes were all waiting for I/O
completion of writeback (sleeping in wb_wait_for_completion).

> > 
> > [12968.100008] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [fsstress:6903]
> > [12968.100038] Modules linked in: loop dm_flakey xfs ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc 8139too 8139cp i2c_piix4 joydev mii pcspkr virtio_balloon virtio_pci serio_raw virtio_ring virtio floppy ata_generic pata_acpi
> > [12968.104043] irq event stamp: 23222196
> > [12968.104043] hardirqs last  enabled at (23222195): [<000000007d0c2e75>] restore_regs_and_return_to_kernel+0x0/0x2e
> > [12968.105111] hardirqs last disabled at (23222196): [<000000008f80dc57>] apic_timer_interrupt+0xa7/0xc0
> > [12968.105111] softirqs last  enabled at (877594): [<0000000034c53d5e>] __do_softirq+0x392/0x502
> > [12968.105111] softirqs last disabled at (877585): [<000000003f4d9e0b>] irq_exit+0x102/0x110
> > [12968.105111] CPU: 2 PID: 6903 Comm: fsstress Tainted: G        W    L   4.15.0-rc5 #10
> > [12968.105111] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> > [12968.108043] RIP: 0010:xfs_bmapi_update_map+0xc/0xc0 [xfs]
> 
> Hmmm, I haven't seen such a hang; I wonder if we're doing something
> we shouldn't be doing and looping in bmapi_write.  In any case it's
> a bug with xfs, not fsstress.

Agreed, I'm planning to pull this patch in this week's update, with the
following fix

- inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v1);
+ inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v2);

Also, I'd follow Dave's suggestion on the xfs/068 fix and move the
FSSTRESS_AVOID handling to common/dump on commit. Please let me know if
you have a different plan now.

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-01-05  4:35         ` Eryu Guan
@ 2018-01-05  4:54           ` Darrick J. Wong
  2018-01-06  1:46             ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-05  4:54 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Fri, Jan 05, 2018 at 12:35:49PM +0800, Eryu Guan wrote:
> On Wed, Jan 03, 2018 at 09:12:11AM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 03, 2018 at 04:48:01PM +0800, Eryu Guan wrote:
> > > On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Mix it up a bit by reflinking and deduping data blocks when possible.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > This looks fine overall, but I noticed a soft lockup bug in generic/083
> > > and generic/269 (both test exercise ENOSPC behavior), test config is
> > > reflink+rmapbt XFS with 4k block size. Not sure if the soft lockup is
> > > related to the clonerange/deduperange ops in fsstress yet, will confirm
> > > without clone/dedupe ops.
> 
> More testing showed that this may have something to do with the
> deduperange operations. (I was testing with Fedora rawhide and a
> v4.15-rc5 kernel; I didn't see a hang or soft lockup on my RHEL7-based
> host, because there's no FIDEDUPERANGE defined there.)
> 
> I reverted the whole clonerange/deduperange support and retested for two
> rounds of a full '-g auto' run without hitting any hang or soft lockup.
> Then I commented out the deduperange ops and left the clonerange ops
> there: no hang/lockup either. Finally I commented out the clonerange ops
> but left the deduperange ops there, and I hit a different hang in
> generic/270 (still an ENOSPC test). I pasted partial sysrq-w output here;
> if the full output is needed please let me know.
> 
> [79200.901901] 14266.fsstress. D12200 14533  14460 0x00000000
> [79200.902419] Call Trace:
> [79200.902655]  ? __schedule+0x2e3/0xb90
> [79200.902969]  ? _raw_spin_unlock_irqrestore+0x32/0x60
> [79200.903442]  schedule+0x2f/0x90   
> [79200.903727]  schedule_timeout+0x1dd/0x540
> [79200.904114]  ? __next_timer_interrupt+0xc0/0xc0
> [79200.904535]  xfs_inode_ag_walk.isra.12+0x3cc/0x670 [xfs]
> [79200.905009]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [79200.905563]  ? kvm_clock_read+0x21/0x30
> [79200.905891]  ? sched_clock+0x5/0x10
> [79200.906243]  ? sched_clock_local+0x12/0x80
> [79200.906598]  ? kvm_clock_read+0x21/0x30
> [79200.906920]  ? sched_clock+0x5/0x10
> [79200.907273]  ? sched_clock_local+0x12/0x80
> [79200.907636]  ? __lock_is_held+0x59/0xa0
> [79200.907988]  ? xfs_inode_ag_iterator_tag+0x46/0xb0 [xfs]
> [79200.908497]  ? rcu_read_lock_sched_held+0x6b/0x80
> [79200.908926]  ? xfs_perag_get_tag+0x28b/0x2f0 [xfs]
> [79200.909416]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [79200.909922]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> [79200.910446]  xfs_file_buffered_aio_write+0x348/0x370 [xfs]
> [79200.910948]  xfs_file_write_iter+0x99/0x140 [xfs]
> [79200.911400]  __vfs_write+0xfc/0x170
> [79200.911726]  vfs_write+0xc1/0x1b0
> [79200.912063]  SyS_write+0x55/0xc0
> [79200.912347]  entry_SYSCALL_64_fastpath+0x1f/0x96
> 
> Seems the other hanging fsstress processes were all waiting for I/O
> completion of writeback (sleeping in wb_wait_for_completion).

Hmm, I'll badger it some more, though I did see:

[ 4349.832516] XFS: Assertion failed: xfs_is_reflink_inode(ip), file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_reflink.c, line: 651
[ 4349.847730] WARNING: CPU: 3 PID: 3600 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
[ 4349.849142] Modules linked in: xfs libcrc32c dm_snapshot dm_bufio dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
[ 4349.850603] CPU: 3 PID: 3600 Comm: fsstress Not tainted 4.15.0-rc6-xfsx #9
[ 4349.851417] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
[ 4349.852594] RIP: 0010:assfail+0x2e/0x60 [xfs]
[ 4349.853156] RSP: 0018:ffffc90002d97a80 EFLAGS: 00010246
[ 4349.853785] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
[ 4349.854621] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa0270585
[ 4349.855457] RBP: ffff88001d41d100 R08: 0000000000000000 R09: 0000000000000000
[ 4349.856296] R10: ffffc90002d97a28 R11: f000000000000000 R12: 0000000000000000
[ 4349.857142] R13: ffffffffffffffff R14: 0000000000000000 R15: 0000000000000008
[ 4349.857969] FS:  00007f0712dc8700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000
[ 4349.858918] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4349.859596] CR2: 00007f0711e7e008 CR3: 0000000004265001 CR4: 00000000001606e0
[ 4349.860462] Call Trace:
[ 4349.860860]  xfs_reflink_cancel_cow_range+0x3f6/0x650 [xfs]
[ 4349.861596]  ? down_write_nested+0x94/0xb0
[ 4349.862165]  ? xfs_ilock+0x2ac/0x450 [xfs]
[ 4349.862719]  xfs_inode_free_cowblocks+0x38e/0x620 [xfs]
[ 4349.863376]  xfs_inode_ag_walk+0x327/0xc30 [xfs]
[ 4349.863972]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
[ 4349.864600]  ? try_to_wake_up+0x30/0x560
[ 4349.865105]  ? _raw_spin_unlock_irqrestore+0x46/0x70
[ 4349.865667]  ? try_to_wake_up+0x49/0x560
[ 4349.866159]  ? radix_tree_gang_lookup_tag+0xf4/0x150
[ 4349.866795]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
[ 4349.867438]  ? xfs_perag_get_tag+0x205/0x470 [xfs]
[ 4349.868042]  ? xfs_perag_put+0x15f/0x2e0 [xfs]
[ 4349.868573]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
[ 4349.869243]  xfs_inode_ag_iterator_tag+0x65/0xa0 [xfs]
[ 4349.869876]  xfs_file_buffered_aio_write+0x203/0x5b0 [xfs]
[ 4349.870575]  xfs_file_write_iter+0x298/0x4f0 [xfs]
[ 4349.871164]  __vfs_write+0x130/0x1a0
[ 4349.871585]  vfs_write+0xc8/0x1c0
[ 4349.872001]  SyS_write+0x45/0xa0
[ 4349.872394]  entry_SYSCALL_64_fastpath+0x1f/0x96
[ 4349.872950] RIP: 0033:0x7f071259c4a0
[ 4349.873377] RSP: 002b:00007ffea07c4588 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 4349.874226] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f071259c4a0
[ 4349.875066] RDX: 000000000000aebc RSI: 000000000111b260 RDI: 0000000000000003
[ 4349.875866] RBP: 0000000000000001 R08: 000000000000006e R09: 0000000000000004
[ 4349.876658] R10: 00007f0712586b78 R11: 0000000000000246 R12: 00000000000007a9
[ 4349.877433] R13: 0000000000000003 R14: 00000000001c6200 R15: 000000000001e000
[ 4349.878273] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 18 3e 28 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 28 44 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 90 bd 3d a0 e8 3f 13 1c e1 eb e6 
[ 4349.880399] ---[ end trace 1e05700f283b7cc1 ]---

So maybe I need to take a closer look at all this machinery tomorrow...

> 
> > > 
> > > [12968.100008] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [fsstress:6903]
> > > [12968.100038] Modules linked in: loop dm_flakey xfs ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc 8139too 8139cp i2c_piix4 joydev mii pcspkr virtio_balloon virtio_pci serio_raw virtio_ring virtio floppy ata_generic pata_acpi
> > > [12968.104043] irq event stamp: 23222196
> > > [12968.104043] hardirqs last  enabled at (23222195): [<000000007d0c2e75>] restore_regs_and_return_to_kernel+0x0/0x2e
> > > [12968.105111] hardirqs last disabled at (23222196): [<000000008f80dc57>] apic_timer_interrupt+0xa7/0xc0
> > > [12968.105111] softirqs last  enabled at (877594): [<0000000034c53d5e>] __do_softirq+0x392/0x502
> > > [12968.105111] softirqs last disabled at (877585): [<000000003f4d9e0b>] irq_exit+0x102/0x110
> > > [12968.105111] CPU: 2 PID: 6903 Comm: fsstress Tainted: G        W    L   4.15.0-rc5 #10
> > > [12968.105111] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> > > [12968.108043] RIP: 0010:xfs_bmapi_update_map+0xc/0xc0 [xfs]
> > 
> > Hmmm, I haven't seen such a hang; I wonder if we're doing something
> > we shouldn't be doing and looping in bmapi_write.  In any case it's
> > a bug with xfs, not fsstress.
> 
> Agreed, I'm planning to pull this patch in this week's update, with the
> following fix
> 
> - inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v1);
> + inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v2);
> 
> Also, I'd follow Dave's suggestion on the xfs/068 fix and move the
> FSSTRESS_AVOID handling to common/dump on commit. Please let me know if
> you have a different plan now.

I was just gonna go back to amending only xfs/068 to turn off clone/dedupe.

--D

> Thanks,
> Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 1/8] common/rc: report kmemleak errors
  2017-12-14 18:15     ` Darrick J. Wong
@ 2018-01-05  8:02       ` Eryu Guan
  2018-01-05 17:02         ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Eryu Guan @ 2018-01-05  8:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Thu, Dec 14, 2017 at 10:15:08AM -0800, Darrick J. Wong wrote:
> On Thu, Dec 14, 2017 at 05:37:18PM +0800, Eryu Guan wrote:
> > On Tue, Dec 12, 2017 at 10:03:18PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > If kmemleak is enabled, scan and report memory leaks after every test.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  check     |    2 ++
> > >  common/rc |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 54 insertions(+)
> > > 
> > > 
> > > diff --git a/check b/check
> > > index b2d251a..469188e 100755
> > > --- a/check
> > > +++ b/check
> > > @@ -497,6 +497,7 @@ _check_filesystems()
> > >  	fi
> > >  }
> > >  
> > > +_init_kmemleak
> > >  _prepare_test_list
> > >  
> > >  if $OPTIONS_HAVE_SECTIONS; then
> > > @@ -793,6 +794,7 @@ for section in $HOST_OPTIONS_SECTIONS; do
> > >  		    n_try=`expr $n_try + 1`
> > >  		    _check_filesystems
> > >  		    _check_dmesg || err=true
> > > +		    _check_kmemleak || err=true
> > >  		fi
> > >  
> > >  	    fi
> > > diff --git a/common/rc b/common/rc
> > > index cb83918..a2bed36 100644
> > > --- a/common/rc
> > > +++ b/common/rc
> > > @@ -3339,6 +3339,58 @@ _check_dmesg()
> > >  	fi
> > >  }
> > >  
> > > +# capture the kmemleak report
> > > +_capture_kmemleak()
> > > +{
> > > +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> > > +	local _leak_file="$1"
> > > +
> > > +	# Tell the kernel to scan for memory leaks.  Apparently the write
> > > +	# returns before the scan is complete, so do it twice in the hopes
> > > +	# that twice is enough to capture all the leaks.
> > > +	echo "scan" > "${_kern_knob}"
> > > +	cat "${_kern_knob}" > /dev/null
> > > +	echo "scan" > "${_kern_knob}"
> > > +	cat "${_kern_knob}" > "${_leak_file}"
> > > +	echo "clear" > "${_kern_knob}"
> > 
> > Hmm, two scans don't seem to be enough either; I could see false positives
> > easily in a 'quick' group run, because some leaks are not reported
> > immediately after the test but only after the next test or the next few
> > tests. E.g. I saw generic/008 (tested on XFS) being reported as leaking
> > memory, and in 008.kmemleak I saw:
> > 
> > unreferenced object 0xffff880277679800 (size 512):
> >   comm "nametest", pid 25007, jiffies 4300176958 (age 9.854s)
> > ...
> > 
> > But "nametest" is only used in generic/007, so the leak should have been
> > triggered by generic/007 too, yet 007 was reported as PASS in my case.
> > 
> > Not sure what's the best way to deal with these false positives; adding
> > more scans seems to work, but that's ugly and requires more test time.
> > What do you think?
> 
> I'm not sure either -- the brief scan I made of mm/kmemleak.c didn't
> reveal anything obvious that would explain the behavior we see.  It
> might just take a while to determine positively that an item isn't
> gray.

Seems so; I did read similar statements elsewhere, but I can't remember
now.

> 
> We could change the message to state that found leaks might have
> resulted from the previous test?  That's rather unsatisfying, but I
> don't know what else to do.

Seems like a reasonable way to go at this stage. I also noticed that some
leaks were probably not from the test we ran, nor even fs-related, but from
other processes on the system, e.g.

unreferenced object 0xffff8801768be3c0 (size 256):
  comm "softirq", pid 0, jiffies 4299031078 (age 14.234s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 03 00 00 00 00 03 00 00  ................
    b7 fd 01 00 00 00 00 00 d8 f6 1f 79 02 88 ff ff  ...........y....
  backtrace:
    [<ffffffffa077cae8>] init_conntrack+0x4a8/0x4c0 [nf_conntrack]
    [<ffffffffa077d2c4>] nf_conntrack_in+0x494/0x510 [nf_conntrack]
    [<ffffffff815f32d7>] nf_hook_slow+0x37/0xb0
    [<ffffffff815fd6a0>] ip_rcv+0x2f0/0x3c0
    [<ffffffff815b5833>] __netif_receive_skb_core+0x3d3/0xaa0
    [<ffffffff815b8154>] netif_receive_skb_internal+0x34/0xc0
    [<ffffffffa0356654>] br_pass_frame_up+0xb4/0x140 [bridge]
    [<ffffffffa03568eb>] br_handle_frame_finish+0x20b/0x3f0 [bridge]
    [<ffffffffa0356c7b>] br_handle_frame+0x16b/0x2c0 [bridge]
    [<ffffffff815b5651>] __netif_receive_skb_core+0x1f1/0xaa0
    [<ffffffff815b8154>] netif_receive_skb_internal+0x34/0xc0
    [<ffffffff815b8dbc>] napi_gro_receive+0xbc/0xe0
    [<ffffffffa004f64c>] bnx2_poll_work+0x8fc/0x1190 [bnx2]
    [<ffffffffa004ff13>] bnx2_poll_msix+0x33/0xb0 [bnx2]
    [<ffffffff815b868e>] net_rx_action+0x26e/0x3a0
    [<ffffffff816e8778>] __do_softirq+0xc8/0x26c

Perhaps we can mark the kmemleak check as "experimental" or so, by adding
some kind of "disclaimer" at the beginning of the $seqres.kmemleak file, so
people have the right expectations about these kmemleak failures.

> 
> Or maybe a sleep 1 in between scans to see if that makes it more likely
> to attribute a leak to the correct test?  I don't anticipate running
> xfstests with kmemleak=on too terribly often, so the extra ~700s won't
> bother me too much.

This doesn't improve anything for me; 7 of the first 8 tests failed the
kmemleak check after adding 'sleep 1' between scans.

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 1/8] common/rc: report kmemleak errors
  2018-01-05  8:02       ` Eryu Guan
@ 2018-01-05 17:02         ` Darrick J. Wong
  2018-01-07 15:25           ` Eryu Guan
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-05 17:02 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Fri, Jan 05, 2018 at 04:02:55PM +0800, Eryu Guan wrote:
> On Thu, Dec 14, 2017 at 10:15:08AM -0800, Darrick J. Wong wrote:
> > On Thu, Dec 14, 2017 at 05:37:18PM +0800, Eryu Guan wrote:
> > > On Tue, Dec 12, 2017 at 10:03:18PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > If kmemleak is enabled, scan and report memory leaks after every test.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  check     |    2 ++
> > > >  common/rc |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 54 insertions(+)
> > > > 
> > > > 
> > > > diff --git a/check b/check
> > > > index b2d251a..469188e 100755
> > > > --- a/check
> > > > +++ b/check
> > > > @@ -497,6 +497,7 @@ _check_filesystems()
> > > >  	fi
> > > >  }
> > > >  
> > > > +_init_kmemleak
> > > >  _prepare_test_list
> > > >  
> > > >  if $OPTIONS_HAVE_SECTIONS; then
> > > > @@ -793,6 +794,7 @@ for section in $HOST_OPTIONS_SECTIONS; do
> > > >  		    n_try=`expr $n_try + 1`
> > > >  		    _check_filesystems
> > > >  		    _check_dmesg || err=true
> > > > +		    _check_kmemleak || err=true
> > > >  		fi
> > > >  
> > > >  	    fi
> > > > diff --git a/common/rc b/common/rc
> > > > index cb83918..a2bed36 100644
> > > > --- a/common/rc
> > > > +++ b/common/rc
> > > > @@ -3339,6 +3339,58 @@ _check_dmesg()
> > > >  	fi
> > > >  }
> > > >  
> > > > +# capture the kmemleak report
> > > > +_capture_kmemleak()
> > > > +{
> > > > +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> > > > +	local _leak_file="$1"
> > > > +
> > > > +	# Tell the kernel to scan for memory leaks.  Apparently the write
> > > > +	# returns before the scan is complete, so do it twice in the hopes
> > > > +	# that twice is enough to capture all the leaks.
> > > > +	echo "scan" > "${_kern_knob}"
> > > > +	cat "${_kern_knob}" > /dev/null
> > > > +	echo "scan" > "${_kern_knob}"
> > > > +	cat "${_kern_knob}" > "${_leak_file}"
> > > > +	echo "clear" > "${_kern_knob}"
> > > 
> > > Hmm, two scans don't seem to be enough either; I could see false positives
> > > easily in a 'quick' group run, because some leaks are not reported
> > > immediately after the test but only after the next test or the next few
> > > tests. E.g. I saw generic/008 (tested on XFS) being reported as leaking
> > > memory, and in 008.kmemleak I saw:
> > > 
> > > unreferenced object 0xffff880277679800 (size 512):
> > >   comm "nametest", pid 25007, jiffies 4300176958 (age 9.854s)
> > > ...
> > > 
> > > But "nametest" is only used in generic/007, so the leak should have been
> > > triggered by generic/007 too, yet 007 was reported as PASS in my case.
> > > 
> > > Not sure what's the best way to deal with these false positives; adding
> > > more scans seems to work, but that's ugly and requires more test time.
> > > What do you think?
> > 
> > I'm not sure either -- the brief scan I made of mm/kmemleak.c didn't
> > reveal anything obvious that would explain the behavior we see.  It
> > might just take a while to determine positively that an item isn't
> > gray.
> 
> Seems so, I did read similar statements elsewhere, but I can't remember
> now..
> 
> > 
> > We could change the message to state that found leaks might have
> > resulted from the previous test?  That's rather unsatisfying, but I
> > don't know what else to do.
> 
> Seems like a reasonable way to go at this stage. I also noticed some
> leaks probably were not from the test we ran nor fs-related, but other
> processes on the system, e.g. 
> 
> unreferenced object 0xffff8801768be3c0 (size 256):
>   comm "softirq", pid 0, jiffies 4299031078 (age 14.234s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 03 00 00 00 00 03 00 00  ................
>     b7 fd 01 00 00 00 00 00 d8 f6 1f 79 02 88 ff ff  ...........y....
>   backtrace:
>     [<ffffffffa077cae8>] init_conntrack+0x4a8/0x4c0 [nf_conntrack]
>     [<ffffffffa077d2c4>] nf_conntrack_in+0x494/0x510 [nf_conntrack]
>     [<ffffffff815f32d7>] nf_hook_slow+0x37/0xb0
>     [<ffffffff815fd6a0>] ip_rcv+0x2f0/0x3c0
>     [<ffffffff815b5833>] __netif_receive_skb_core+0x3d3/0xaa0
>     [<ffffffff815b8154>] netif_receive_skb_internal+0x34/0xc0
>     [<ffffffffa0356654>] br_pass_frame_up+0xb4/0x140 [bridge]
>     [<ffffffffa03568eb>] br_handle_frame_finish+0x20b/0x3f0 [bridge]
>     [<ffffffffa0356c7b>] br_handle_frame+0x16b/0x2c0 [bridge]
>     [<ffffffff815b5651>] __netif_receive_skb_core+0x1f1/0xaa0
>     [<ffffffff815b8154>] netif_receive_skb_internal+0x34/0xc0
>     [<ffffffff815b8dbc>] napi_gro_receive+0xbc/0xe0
>     [<ffffffffa004f64c>] bnx2_poll_work+0x8fc/0x1190 [bnx2]
>     [<ffffffffa004ff13>] bnx2_poll_msix+0x33/0xb0 [bnx2]
>     [<ffffffff815b868e>] net_rx_action+0x26e/0x3a0
>     [<ffffffff816e8778>] __do_softirq+0xc8/0x26c
> 
> Perhaps we can mark the kmemleak check as "experimental" or so? By
> adding some kind of "disclaimer" in the beginning of $seqres.kmemleak
> file? So people could have the right expectation on these kmemleak
> failures.

How about:

"EXPERIMENTAL kmemleak reported some memory leaks!  Due to the way kmemleak
works, the leak might be from an earlier test, or something totally unrelated.

"unreferenced object 0xffff8801768be3c0 (size 256):
  comm "softirq", pid 0, jiffies 4299031078 (age 14.234s)
..."

> > 
> > Or maybe a sleep 1 in between scans to see if that makes it more likely
> > to attribute a leak to the correct test?  I don't anticipate running
> > xfstests with kmemleak=on too terribly often, so the extra ~700s won't
> > bother me too much.
> 
> This doesn't improve anything for me: 7 of the first 8 tests still failed
> the kmemleak check after adding a 'sleep 1' between the two scans.

<nod>

--D

> Thanks,
> Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-01-05  4:54           ` Darrick J. Wong
@ 2018-01-06  1:46             ` Darrick J. Wong
  2018-01-09  7:09               ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-06  1:46 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Thu, Jan 04, 2018 at 08:54:10PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 05, 2018 at 12:35:49PM +0800, Eryu Guan wrote:
> > On Wed, Jan 03, 2018 at 09:12:11AM -0800, Darrick J. Wong wrote:
> > > On Wed, Jan 03, 2018 at 04:48:01PM +0800, Eryu Guan wrote:
> > > > On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > Mix it up a bit by reflinking and deduping data blocks when possible.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > This looks fine overall, but I noticed a soft lockup bug in generic/083
> > > > and generic/269 (both test exercise ENOSPC behavior), test config is
> > > > reflink+rmapbt XFS with 4k block size. Not sure if the soft lockup is
> > > > related to the clonerange/deduperange ops in fsstress yet, will confirm
> > > > without clone/dedupe ops.
> > 
> > More testings showed that this may have something to do with the
> > deduperange operations. (I was testing with Fedora rawhide with
> > v4.15-rc5 kernel, I didn't see hang nor soft lockup with my RHEL7 base
> > host, because there's no FIDEDUPERANGE defined there).
> > 
> > I reverted the whole clonerange/deduperange support and retested for two
> > rounds of full '-g auto' run without hitting any hang or soft lockup.
> > Then I commented out the deduperange ops and left clonerange ops there,
> > no hang/lockup either. At last I commented out the clonerange ops but
> > left deduperange ops there, I hit a different hang in generic/270 (still
> > a ENOSPC test). I pasted partial sysrq-w output here, if full output is
> > needed please let me know.
> > 
> > [79200.901901] 14266.fsstress. D12200 14533  14460 0x00000000
> > [79200.902419] Call Trace:
> > [79200.902655]  ? __schedule+0x2e3/0xb90
> > [79200.902969]  ? _raw_spin_unlock_irqrestore+0x32/0x60
> > [79200.903442]  schedule+0x2f/0x90   
> > [79200.903727]  schedule_timeout+0x1dd/0x540
> > [79200.904114]  ? __next_timer_interrupt+0xc0/0xc0
> > [79200.904535]  xfs_inode_ag_walk.isra.12+0x3cc/0x670 [xfs]
> > [79200.905009]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [79200.905563]  ? kvm_clock_read+0x21/0x30
> > [79200.905891]  ? sched_clock+0x5/0x10
> > [79200.906243]  ? sched_clock_local+0x12/0x80
> > [79200.906598]  ? kvm_clock_read+0x21/0x30
> > [79200.906920]  ? sched_clock+0x5/0x10
> > [79200.907273]  ? sched_clock_local+0x12/0x80
> > [79200.907636]  ? __lock_is_held+0x59/0xa0
> > [79200.907988]  ? xfs_inode_ag_iterator_tag+0x46/0xb0 [xfs]
> > [79200.908497]  ? rcu_read_lock_sched_held+0x6b/0x80
> > [79200.908926]  ? xfs_perag_get_tag+0x28b/0x2f0 [xfs]
> > [79200.909416]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [79200.909922]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > [79200.910446]  xfs_file_buffered_aio_write+0x348/0x370 [xfs]
> > [79200.910948]  xfs_file_write_iter+0x99/0x140 [xfs]
> > [79200.911400]  __vfs_write+0xfc/0x170
> > [79200.911726]  vfs_write+0xc1/0x1b0
> > [79200.912063]  SyS_write+0x55/0xc0
> > [79200.912347]  entry_SYSCALL_64_fastpath+0x1f/0x96
> > 
> > Seems other hanging fsstress processes were all waiting for I/O
> > completion of writeback (sleeping in wb_wait_for_completion).
> 
> Hmm, I'll badger it some more, though I did see:
> 
> [ 4349.832516] XFS: Assertion failed: xfs_is_reflink_inode(ip), file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_reflink.c, line: 651
> [ 4349.847730] WARNING: CPU: 3 PID: 3600 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
> [ 4349.849142] Modules linked in: xfs libcrc32c dm_snapshot dm_bufio dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
> [ 4349.850603] CPU: 3 PID: 3600 Comm: fsstress Not tainted 4.15.0-rc6-xfsx #9
> [ 4349.851417] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
> [ 4349.852594] RIP: 0010:assfail+0x2e/0x60 [xfs]
> [ 4349.853156] RSP: 0018:ffffc90002d97a80 EFLAGS: 00010246
> [ 4349.853785] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
> [ 4349.854621] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa0270585
> [ 4349.855457] RBP: ffff88001d41d100 R08: 0000000000000000 R09: 0000000000000000
> [ 4349.856296] R10: ffffc90002d97a28 R11: f000000000000000 R12: 0000000000000000
> [ 4349.857142] R13: ffffffffffffffff R14: 0000000000000000 R15: 0000000000000008
> [ 4349.857969] FS:  00007f0712dc8700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000
> [ 4349.858918] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4349.859596] CR2: 00007f0711e7e008 CR3: 0000000004265001 CR4: 00000000001606e0
> [ 4349.860462] Call Trace:
> [ 4349.860860]  xfs_reflink_cancel_cow_range+0x3f6/0x650 [xfs]
> [ 4349.861596]  ? down_write_nested+0x94/0xb0
> [ 4349.862165]  ? xfs_ilock+0x2ac/0x450 [xfs]
> [ 4349.862719]  xfs_inode_free_cowblocks+0x38e/0x620 [xfs]
> [ 4349.863376]  xfs_inode_ag_walk+0x327/0xc30 [xfs]
> [ 4349.863972]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
> [ 4349.864600]  ? try_to_wake_up+0x30/0x560
> [ 4349.865105]  ? _raw_spin_unlock_irqrestore+0x46/0x70
> [ 4349.865667]  ? try_to_wake_up+0x49/0x560
> [ 4349.866159]  ? radix_tree_gang_lookup_tag+0xf4/0x150
> [ 4349.866795]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
> [ 4349.867438]  ? xfs_perag_get_tag+0x205/0x470 [xfs]
> [ 4349.868042]  ? xfs_perag_put+0x15f/0x2e0 [xfs]
> [ 4349.868573]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
> [ 4349.869243]  xfs_inode_ag_iterator_tag+0x65/0xa0 [xfs]
> [ 4349.869876]  xfs_file_buffered_aio_write+0x203/0x5b0 [xfs]
> [ 4349.870575]  xfs_file_write_iter+0x298/0x4f0 [xfs]
> [ 4349.871164]  __vfs_write+0x130/0x1a0
> [ 4349.871585]  vfs_write+0xc8/0x1c0
> [ 4349.872001]  SyS_write+0x45/0xa0
> [ 4349.872394]  entry_SYSCALL_64_fastpath+0x1f/0x96
> [ 4349.872950] RIP: 0033:0x7f071259c4a0
> [ 4349.873377] RSP: 002b:00007ffea07c4588 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 4349.874226] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f071259c4a0
> [ 4349.875066] RDX: 000000000000aebc RSI: 000000000111b260 RDI: 0000000000000003
> [ 4349.875866] RBP: 0000000000000001 R08: 000000000000006e R09: 0000000000000004
> [ 4349.876658] R10: 00007f0712586b78 R11: 0000000000000246 R12: 00000000000007a9
> [ 4349.877433] R13: 0000000000000003 R14: 00000000001c6200 R15: 000000000001e000
> [ 4349.878273] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 18 3e 28 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 28 44 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 90 bd 3d a0 e8 3f 13 1c e1 eb e6 
> [ 4349.880399] ---[ end trace 1e05700f283b7cc1 ]---

Ok so I think this is just an assert that doesn't belong.

> So maybe I need to take a closer look at all this machinery tomorrow...

It would seem that writeback is wedging up when it tries to allocate blocks
to fill a delalloc(?) extent, but at that point the filesystem is totally
out of space (zero free blocks) and the whole thing dies.  Hm.  Will take
a further look next week.

--D

FWIW I also saw these oddballs fly by on one of the g/269 runs:

(I'm merely recording these here to leave a breadcrumb trail so I can pick
this up again on Monday.  I think these are all related to the fs being
totally out of space.)

[  350.205699] XFS: Assertion failed: type != XFS_IO_COW, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_aops.c, line: 393
[  350.207092] WARNING: CPU: 2 PID: 105 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
[  350.208224] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
[  350.209219] CPU: 2 PID: 105 Comm: kworker/u10:2 Not tainted 4.15.0-rc6-xfsx #5
[  350.209959] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
[  350.210902] Workqueue: writeback wb_workfn (flush-259:4)
[  350.211489] RIP: 0010:assfail+0x2e/0x60 [xfs]
[  350.211947] RSP: 0018:ffffc9000088f8b8 EFLAGS: 00010246
[  350.212512] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
[  350.213276] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
[  350.214072] RBP: 0000000000001000 R08: 0000000000000000 R09: 0000000000000000
[  350.214836] R10: 0000000000000000 R11: f000000000000000 R12: 00000000000b3000
[  350.215565] R13: 0000000000000004 R14: ffff88005d2ee540 R15: ffffc9000088fb28
[  350.216328] FS:  0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
[  350.217172] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  350.217849] CR2: 00007f6a9b021000 CR3: 0000000002011002 CR4: 00000000001606e0
[  350.218675] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  350.219528] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  350.220389] Call Trace:
[  350.220748]  xfs_map_blocks+0x479/0x8e0 [xfs]
[  350.221310]  xfs_do_writepage+0x2f8/0xe30 [xfs]
[  350.221870]  write_cache_pages+0x20c/0x530
[  350.222381]  ? xfs_add_to_ioend+0x6d0/0x6d0 [xfs]
[  350.222950]  xfs_vm_writepages+0x7f/0x170 [xfs]
[  350.223550]  do_writepages+0x17/0x70
[  350.224051]  __writeback_single_inode+0x59/0x7e0
[  350.224662]  writeback_sb_inodes+0x283/0x550
[  350.225172]  wb_writeback+0x112/0x5c0
[  350.225623]  ? wb_workfn+0x128/0x740
[  350.226053]  wb_workfn+0x128/0x740
[  350.226485]  ? lock_acquire+0xab/0x200
[  350.226941]  ? lock_acquire+0xab/0x200
[  350.227380]  ? process_one_work+0x17e/0x680
[  350.227875]  process_one_work+0x1fb/0x680
[  350.228409]  worker_thread+0x4d/0x3e0
[  350.228849]  kthread+0x103/0x140
[  350.229249]  ? process_one_work+0x680/0x680
[  350.229751]  ? kthread_delayed_work_timer_fn+0x90/0x90
[  350.230358]  ret_from_fork+0x24/0x30
[  350.230789] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
[  350.232932] ---[ end trace 7e48cbbf0d68bb48 ]---
[  351.713383] XFS (pmem4): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
[  359.565227] XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_trans.c, line: 353
[  359.585681] WARNING: CPU: 3 PID: 14462 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
[  359.591910] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
[  359.594521] CPU: 3 PID: 14462 Comm: fsstress Tainted: G        W        4.15.0-rc6-xfsx #5
[  359.595988] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
[  359.596946] RIP: 0010:assfail+0x2e/0x60 [xfs]
[  359.597385] RSP: 0018:ffffc9000280b708 EFLAGS: 00010246
[  359.597927] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
[  359.598661] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
[  359.599388] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[  359.600200] R10: 0000000000000000 R11: f000000000000000 R12: ffffffffffffffee
[  359.601005] R13: ffff8800390ec000 R14: ffffc9000280b840 R15: ffff8800390ec000
[  359.601777] FS:  00007f6a9b033700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000
[  359.602725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  359.603374] CR2: 0000000000815b88 CR3: 000000003d1e5001 CR4: 00000000001606e0
[  359.604154] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  359.604995] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  359.605814] Call Trace:
[  359.606081]  xfs_trans_mod_sb+0x44c/0x5e0 [xfs]
[  359.606656]  xfs_alloc_ag_vextent+0x169/0x580 [xfs]
[  359.607263]  xfs_alloc_vextent+0xb1b/0x19c0 [xfs]
[  359.607849]  ? xfs_bmap_longest_free_extent+0x6c/0x120 [xfs]
[  359.608710]  ? xfs_bmap_btalloc_nullfb+0x9e/0x190 [xfs]
[  359.609504]  xfs_bmap_btalloc+0x2c1/0xe60 [xfs]
[  359.610599]  xfs_bmapi_write+0x641/0x1d70 [xfs]
[  359.611993]  xfs_iomap_write_unwritten+0x246/0x690 [xfs]
[  359.612991]  iomap_dio_complete+0x43/0x100
[  359.613414]  iomap_dio_rw+0x358/0x380
[  359.613834]  ? xfs_file_dio_aio_write+0x184/0x7a0 [xfs]
[  359.614385]  xfs_file_dio_aio_write+0x184/0x7a0 [xfs]
[  359.614919]  ? lock_acquire+0xab/0x200
[  359.615342]  xfs_file_write_iter+0x16c/0x4f0 [xfs]
[  359.615842]  aio_write+0x129/0x1b0
[  359.616207]  ? lock_acquire+0xab/0x200
[  359.616615]  ? __might_fault+0x36/0x80
[  359.617020]  ? do_io_submit+0x40e/0x8c0
[  359.617437]  do_io_submit+0x40e/0x8c0
[  359.617841]  ? entry_SYSCALL_64_fastpath+0x1f/0x96
[  359.618354]  entry_SYSCALL_64_fastpath+0x1f/0x96
[  359.618861] RIP: 0033:0x7f6a9aa14697
[  359.619222] RSP: 002b:00007ffef5888f58 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
[  359.619936] RAX: ffffffffffffffda RBX: 0000000000000473 RCX: 00007f6a9aa14697
[  359.620550] RDX: 00007ffef5888f80 RSI: 0000000000000001 RDI: 00007f6a9b03b000
[  359.621289] RBP: 0000000000184000 R08: 00007f6a9a7f1bf8 R09: 0000000000000000
[  359.622057] R10: 00007f6a9a7f1b78 R11: 0000000000000246 R12: 0000000000000000
[  359.622832] R13: 7fffffffffffffff R14: 0000000000000003 R15: 0000000000000004
[  359.623664] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
[  359.625557] ---[ end trace 7e48cbbf0d68bb49 ]---
<snip>
[  361.250280] XFS: Assertion failed: pathlen == 0, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_symlink.c, line: 346
[  361.255275] WARNING: CPU: 2 PID: 14534 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
[  361.256421] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
[  361.257417] CPU: 2 PID: 14534 Comm: fsstress Tainted: G        W        4.15.0-rc6-xfsx #5
[  361.258269] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
[  361.259250] RIP: 0010:assfail+0x2e/0x60 [xfs]
[  361.259746] RSP: 0018:ffffc90002a4fd00 EFLAGS: 00010246
[  361.260481] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
[  361.261374] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
[  361.262111] RBP: ffff880055db5e80 R08: 0000000000000000 R09: 0000000000000000
[  361.262790] R10: ffffc90002a4fc40 R11: f000000000000000 R12: ffff8800390ec000
[  361.263453] R13: 00000000000001b1 R14: ffffc90002a4fea8 R15: 0000000000000024
[  361.264122] FS:  00007f6a9b033700(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
[  361.264868] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  361.265409] CR2: 00007f6a9b030000 CR3: 0000000022460001 CR4: 00000000001606e0
[  361.266074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  361.266734] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  361.267395] Call Trace:
[  361.267684]  xfs_symlink+0xd1b/0x11d0 [xfs]
[  361.268081]  ? lock_acquire+0xab/0x200
[  361.268466]  ? __d_rehash+0x82/0xd0
[  361.268814]  ? _raw_spin_unlock+0x2e/0x50
[  361.269221]  xfs_vn_symlink+0x9a/0x1f0 [xfs]
[  361.269636]  vfs_symlink+0x83/0xd0
[  361.269954]  SyS_symlink+0x7e/0xd0
[  361.270273]  entry_SYSCALL_64_fastpath+0x1f/0x96
[  361.270708] RIP: 0033:0x7f6a9a525997
[  361.271049] RSP: 002b:00007ffef5889338 EFLAGS: 00000206 ORIG_RAX: 0000000000000058
[  361.271729] RAX: ffffffffffffffda RBX: 0000000000000387 RCX: 00007f6a9a525997
[  361.272380] RDX: 0000000000000064 RSI: 00000000007cc160 RDI: 00000000007d04c0
[  361.273016] RBP: 0000000000000009 R08: 00007f6a9a7f2308 R09: 0000000000000003
[  361.273659] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000000e4000
[  361.274300] R13: 0000000000000003 R14: 000000000000f000 R15: 0000000000000016
[  361.274939] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
[  361.276610] ---[ end trace 7e48cbbf0d68bb59 ]---
[  361.280551] XFS: Assertion failed: error != -ENOSPC, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_inode.c, line: 1223
[  361.284186] WARNING: CPU: 0 PID: 14473 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
[  361.285483] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
[  361.286912] CPU: 0 PID: 14473 Comm: fsstress Tainted: G        W        4.15.0-rc6-xfsx #5
[  361.288140] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
[  361.289029] RIP: 0010:assfail+0x2e/0x60 [xfs]
[  361.289466] RSP: 0018:ffffc90002863d58 EFLAGS: 00010246
[  361.289959] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
[  361.290705] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
[  361.291432] RBP: 00000000ffffffe4 R08: 0000000000000000 R09: 0000000000000000
[  361.292198] R10: 0000000000000000 R11: f000000000000000 R12: ffff880030023600
[  361.292944] R13: ffffc90002863e98 R14: 0000000000000025 R15: 0000000000000000
[  361.293789] FS:  00007f6a9b033700(0000) GS:ffff88003ea00000(0000) knlGS:0000000000000000
[  361.294562] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  361.295118] CR2: 00007fcea800f0b8 CR3: 000000003d4a2004 CR4: 00000000001606f0
[  361.295797] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  361.296566] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  361.297253] Call Trace:
[  361.297609]  xfs_create+0x82a/0xd10 [xfs]
[  361.298014]  ? get_cached_acl+0xca/0x1e0
[  361.298437]  xfs_generic_create+0x220/0x360 [xfs]
[  361.298905]  vfs_mknod+0xa9/0x100
[  361.299245]  SyS_mknod+0x1a3/0x1f0
[  361.299584]  entry_SYSCALL_64_fastpath+0x1f/0x96
[  361.300047] RIP: 0033:0x7f6a9a523cad
[  361.300410] RSP: 002b:00007ffef5889338 EFLAGS: 00000246 ORIG_RAX: 0000000000000085
[  361.301117] RAX: ffffffffffffffda RBX: 00000000000003e6 RCX: 00007f6a9a523cad
[  361.301887] RDX: 0000000000000000 RSI: 0000000000002124 RDI: 00000000007db300
[  361.302569] RBP: 00000000000db000 R08: 00000000007db300 R09: 0000000000000002
[  361.303344] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000d000
[  361.304120] R13: 7fffffffffffffff R14: 0000000000000003 R15: 0000000000000004
[  361.304912] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
[  361.306797] ---[ end trace 7e48cbbf0d68bb5a ]---


> > 
> > > > 
> > > > [12968.100008] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [fsstress:6903]
> > > > [12968.100038] Modules linked in: loop dm_flakey xfs ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc 8139too 8139cp i2c_piix4 joydev mii pcspkr virtio_balloon virtio_pci serio_raw virtio_ring virtio floppy ata_generic pata_acpi
> > > > [12968.104043] irq event stamp: 23222196
> > > > [12968.104043] hardirqs last  enabled at (23222195): [<000000007d0c2e75>] restore_regs_and_return_to_kernel+0x0/0x2e
> > > > [12968.105111] hardirqs last disabled at (23222196): [<000000008f80dc57>] apic_timer_interrupt+0xa7/0xc0
> > > > [12968.105111] softirqs last  enabled at (877594): [<0000000034c53d5e>] __do_softirq+0x392/0x502
> > > > [12968.105111] softirqs last disabled at (877585): [<000000003f4d9e0b>] irq_exit+0x102/0x110
> > > > [12968.105111] CPU: 2 PID: 6903 Comm: fsstress Tainted: G        W    L   4.15.0-rc5 #10
> > > > [12968.105111] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> > > > [12968.108043] RIP: 0010:xfs_bmapi_update_map+0xc/0xc0 [xfs]
> > > 
> > > Hmmm, I haven't seen such a hang; I wonder if we're doing something
> > > we shouldn't be doing and looping in bmapi_write.  In any case it's
> > > a bug with xfs, not fsstress.
> > 
> > Agreed, I'm planning to pull this patch in this week's update, with the
> > following fix
> > 
> > - inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v1);
> > + inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v2);
> > 
> > Also I'd follow Dave's suggestion on xfs/068 fix, move the
> > FSSTRESS_AVOID handling to common/dump on commit. Please let me know if
> > you have a different plan now.
> 
> I was just gonna go back to amending only xfs/068 to turn off clone/dedupe.
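
(For the record, "turning off clone/dedupe" for a single test would look
roughly like the following fsstress frequency overrides -- the exact form
that lands in xfs/068 may differ:)

	# sketch: zero the frequency of the two new ops for this test only
	FSSTRESS_AVOID="-f clonerange=0 -f deduperange=0 $FSSTRESS_AVOID"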
> 
> --D
> 
> > Thanks,
> > Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 1/8] common/rc: report kmemleak errors
  2018-01-05 17:02         ` Darrick J. Wong
@ 2018-01-07 15:25           ` Eryu Guan
  0 siblings, 0 replies; 55+ messages in thread
From: Eryu Guan @ 2018-01-07 15:25 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Fri, Jan 05, 2018 at 09:02:27AM -0800, Darrick J. Wong wrote:
> On Fri, Jan 05, 2018 at 04:02:55PM +0800, Eryu Guan wrote:
> > On Thu, Dec 14, 2017 at 10:15:08AM -0800, Darrick J. Wong wrote:
> > > On Thu, Dec 14, 2017 at 05:37:18PM +0800, Eryu Guan wrote:
> > > > On Tue, Dec 12, 2017 at 10:03:18PM -0800, Darrick J. Wong wrote:
> > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > If kmemleak is enabled, scan and report memory leaks after every test.
> > > > > 
> > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > ---
> > > > >  check     |    2 ++
> > > > >  common/rc |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >  2 files changed, 54 insertions(+)
> > > > > 
> > > > > 
> > > > > diff --git a/check b/check
> > > > > index b2d251a..469188e 100755
> > > > > --- a/check
> > > > > +++ b/check
> > > > > @@ -497,6 +497,7 @@ _check_filesystems()
> > > > >  	fi
> > > > >  }
> > > > >  
> > > > > +_init_kmemleak
> > > > >  _prepare_test_list
> > > > >  
> > > > >  if $OPTIONS_HAVE_SECTIONS; then
> > > > > @@ -793,6 +794,7 @@ for section in $HOST_OPTIONS_SECTIONS; do
> > > > >  		    n_try=`expr $n_try + 1`
> > > > >  		    _check_filesystems
> > > > >  		    _check_dmesg || err=true
> > > > > +		    _check_kmemleak || err=true
> > > > >  		fi
> > > > >  
> > > > >  	    fi
> > > > > diff --git a/common/rc b/common/rc
> > > > > index cb83918..a2bed36 100644
> > > > > --- a/common/rc
> > > > > +++ b/common/rc
> > > > > @@ -3339,6 +3339,58 @@ _check_dmesg()
> > > > >  	fi
> > > > >  }
> > > > >  
> > > > > +# capture the kmemleak report
> > > > > +_capture_kmemleak()
> > > > > +{
> > > > > +	local _kern_knob="${DEBUGFS_MNT}/kmemleak"
> > > > > +	local _leak_file="$1"
> > > > > +
> > > > > +	# Tell the kernel to scan for memory leaks.  Apparently the write
> > > > > +	# returns before the scan is complete, so do it twice in the hopes
> > > > > +	# that twice is enough to capture all the leaks.
> > > > > +	echo "scan" > "${_kern_knob}"
> > > > > +	cat "${_kern_knob}" > /dev/null
> > > > > +	echo "scan" > "${_kern_knob}"
> > > > > +	cat "${_kern_knob}" > "${_leak_file}"
> > > > > +	echo "clear" > "${_kern_knob}"
> > > > 
> > > > Hmm, two scans seem not enough either, I could see false positive easily
> > > > in a 'quick' group run, because some leaks are not reported immediately
> > > > after the test but after next test or next few tests. e.g. I saw
> > > > generic/008 (tested on XFS) being reported as leaking memory, and from
> > > > 008.kmemleak I saw:
> > > > 
> > > > unreferenced object 0xffff880277679800 (size 512):
> > > >   comm "nametest", pid 25007, jiffies 4300176958 (age 9.854s)
> > > > ...
> > > > 
> > > > But "nametest" is only used in generic/007, the leak should be triggered
> > > > by generic/007 too, but 007 was reported as PASS in my case.
> > > > 
> > > > Not sure what's the best way to deal with these false positive, adding
> > > > more scans seem to work, but that's ugly and requires more test time..
> > > > What do you think?
> > > 
> > > I'm not sure either -- the brief scan I made of mm/kmemleak.c didn't
> > > reveal anything obvious that would explain the behavior we see.  It
> > > might just take a while to determine positively that an item isn't
> > > gray.
> > 
> > Seems so, I did read similar statements elsewhere, but I can't remember
> > now..
> > 
> > > 
> > > We could change the message to state that found leaks might have
> > > resulted from the previous test?  That's rather unsatisfying, but I
> > > don't know what else to do.
> > 
> > Seems like a reasonable way to go at this stage. I also noticed some
> > leaks probably were not from the test we ran nor fs-related, but other
> > processes on the system, e.g. 
> > 
> > unreferenced object 0xffff8801768be3c0 (size 256):
> >   comm "softirq", pid 0, jiffies 4299031078 (age 14.234s)
> >   hex dump (first 32 bytes):
> >     01 00 00 00 00 00 00 00 03 00 00 00 00 03 00 00  ................
> >     b7 fd 01 00 00 00 00 00 d8 f6 1f 79 02 88 ff ff  ...........y....
> >   backtrace:
> >     [<ffffffffa077cae8>] init_conntrack+0x4a8/0x4c0 [nf_conntrack]
> >     [<ffffffffa077d2c4>] nf_conntrack_in+0x494/0x510 [nf_conntrack]
> >     [<ffffffff815f32d7>] nf_hook_slow+0x37/0xb0
> >     [<ffffffff815fd6a0>] ip_rcv+0x2f0/0x3c0
> >     [<ffffffff815b5833>] __netif_receive_skb_core+0x3d3/0xaa0
> >     [<ffffffff815b8154>] netif_receive_skb_internal+0x34/0xc0
> >     [<ffffffffa0356654>] br_pass_frame_up+0xb4/0x140 [bridge]
> >     [<ffffffffa03568eb>] br_handle_frame_finish+0x20b/0x3f0 [bridge]
> >     [<ffffffffa0356c7b>] br_handle_frame+0x16b/0x2c0 [bridge]
> >     [<ffffffff815b5651>] __netif_receive_skb_core+0x1f1/0xaa0
> >     [<ffffffff815b8154>] netif_receive_skb_internal+0x34/0xc0
> >     [<ffffffff815b8dbc>] napi_gro_receive+0xbc/0xe0
> >     [<ffffffffa004f64c>] bnx2_poll_work+0x8fc/0x1190 [bnx2]
> >     [<ffffffffa004ff13>] bnx2_poll_msix+0x33/0xb0 [bnx2]
> >     [<ffffffff815b868e>] net_rx_action+0x26e/0x3a0
> >     [<ffffffff816e8778>] __do_softirq+0xc8/0x26c
> > 
> > Perhaps we can mark the kmemleak check as "experimental" or so? By
> > adding some kind of "disclaimer" in the beginning of $seqres.kmemleak
> > file? So people could have the right expectation on these kmemleak
> > failures.
> 
> How about:
> 
> "EXPERIMENTAL kmemleak reported some memory leaks!  Due to the way kmemleak
> works, the leak might be from an earlier test, or something totally unrelated.

Yeah, this looks good to me!

Thanks,
Eryu

> 
> "unreferenced object 0xffff8801768be3c0 (size 256):
>   comm "softirq", pid 0, jiffies 4299031078 (age 14.234s)
> ..."
> 
> > > 
> > > Or maybe a sleep 1 in between scans to see if that makes it more likely
> > > to attribute a leak to the correct test?  I don't anticipate running
> > > xfstests with kmemleak=on too terribly often, so the extra ~700s won't
> > > bother me too much.
> > 
> > This doesn't improve anything for me: 7 of the first 8 tests still failed
> > the kmemleak check after adding a 'sleep 1' between the two scans.
> 
> <nod>
> 
> --D
> 
> > Thanks,
> > Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 7/8] generic: run a long-soak write-only fsstress test
  2017-12-13  6:04 ` [PATCH 7/8] generic: run a long-soak write-only fsstress test Darrick J. Wong
@ 2018-01-07 15:34   ` Eryu Guan
  0 siblings, 0 replies; 55+ messages in thread
From: Eryu Guan @ 2018-01-07 15:34 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Tue, Dec 12, 2017 at 10:04:04PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Let a lot of writes soak in with multithreaded fsstress to look for bugs
> and other problems.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  tests/generic/933     |   64 +++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/933.out |    2 ++
>  tests/generic/group   |    1 +
>  3 files changed, 67 insertions(+)
>  create mode 100755 tests/generic/933
>  create mode 100644 tests/generic/933.out
> 
> 
> diff --git a/tests/generic/933 b/tests/generic/933
> new file mode 100755
> index 0000000..0cbd081
> --- /dev/null
> +++ b/tests/generic/933
> @@ -0,0 +1,64 @@
> +#! /bin/bash
> +# FS QA Test No. 933
> +#
> +# Run an all-writes fsstress run with multiple threads to shake out
> +# bugs in the write path.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017 Oracle, Inc.  All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1	# failure is the default!
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +	cd /
> +	rm -f $tmp.*
> +	$KILLALL_PROG -9 fsstress > /dev/null 2>&1
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +
> +# Modify as appropriate.
> +_supported_fs generic
> +_supported_os Linux
> +
> +_require_scratch
> +_require_command "$KILLALL_PROG" "killall"
> +
> +rm -f $seqres.full
> +
> +echo "Silence is golden."
> +
> +_scratch_mkfs > $seqres.full 2>&1
> +_scratch_mount >> $seqres.full 2>&1
> +
> +nr_cpus=$((LOAD_FACTOR * 4))
> +nr_ops=$((25000 * nr_cpus * TIME_FACTOR))
> +$FSSTRESS_PROG $FSSTRESS_AVOID -w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus -v >> $seqres.full

I removed the '-v' option on commit; otherwise the resulting $seqres.full
gets too large (almost 200M).
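
That is, the committed line should read (minus any other tweaks):

	$FSSTRESS_PROG $FSSTRESS_AVOID -w -d $SCRATCH_MNT -n $nr_ops -p $nr_cpus >> $seqres.full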

> +
> +# success, all done
> +status=0
> +exit
> diff --git a/tests/generic/933.out b/tests/generic/933.out
> new file mode 100644
> index 0000000..758765d
> --- /dev/null
> +++ b/tests/generic/933.out
> @@ -0,0 +1,2 @@
> +QA output created by 933
> +Silence is golden.
> diff --git a/tests/generic/group b/tests/generic/group
> index ff1ddc9..3c4fff5 100644
> --- a/tests/generic/group
> +++ b/tests/generic/group
> @@ -473,3 +473,4 @@
>  468 shutdown auto quick metadata
>  469 auto quick
>  932 shutdown auto log metadata
> +933 auto rw clone

I also removed the 'clone' group, since the test only exercises clone
operations implicitly via fsstress.
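
That is, the group entry should end up as just:

	933 auto rw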

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-01-06  1:46             ` Darrick J. Wong
@ 2018-01-09  7:09               ` Darrick J. Wong
  0 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-09  7:09 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Fri, Jan 05, 2018 at 05:46:54PM -0800, Darrick J. Wong wrote:
> On Thu, Jan 04, 2018 at 08:54:10PM -0800, Darrick J. Wong wrote:
> > On Fri, Jan 05, 2018 at 12:35:49PM +0800, Eryu Guan wrote:
> > > On Wed, Jan 03, 2018 at 09:12:11AM -0800, Darrick J. Wong wrote:
> > > > On Wed, Jan 03, 2018 at 04:48:01PM +0800, Eryu Guan wrote:
> > > > > On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > > 
> > > > > > Mix it up a bit by reflinking and deduping data blocks when possible.
> > > > > > 
> > > > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > 
> > > > > This looks fine overall, but I noticed a soft lockup bug in generic/083
> > > > > and generic/269 (both test exercise ENOSPC behavior), test config is
> > > > > reflink+rmapbt XFS with 4k block size. Not sure if the soft lockup is
> > > > > related to the clonerange/deduperange ops in fsstress yet, will confirm
> > > > > without clone/dedupe ops.
> > > 
> > > More testings showed that this may have something to do with the
> > > deduperange operations. (I was testing with Fedora rawhide with
> > > v4.15-rc5 kernel, I didn't see hang nor soft lockup with my RHEL7 base
> > > host, because there's no FIDEDUPERANGE defined there).
> > > 
> > > I reverted the whole clonerange/deduperange support and retested for two
> > > rounds of full '-g auto' run without hitting any hang or soft lockup.
> > > Then I commented out the deduperange ops and left clonerange ops there,
> > > no hang/lockup either. At last I commented out the clonerange ops but
> > > left deduperange ops there, I hit a different hang in generic/270 (still
> > > a ENOSPC test). I pasted partial sysrq-w output here, if full output is
> > > needed please let me know.
> > > 
> > > [79200.901901] 14266.fsstress. D12200 14533  14460 0x00000000
> > > [79200.902419] Call Trace:
> > > [79200.902655]  ? __schedule+0x2e3/0xb90
> > > [79200.902969]  ? _raw_spin_unlock_irqrestore+0x32/0x60
> > > [79200.903442]  schedule+0x2f/0x90   
> > > [79200.903727]  schedule_timeout+0x1dd/0x540
> > > [79200.904114]  ? __next_timer_interrupt+0xc0/0xc0
> > > [79200.904535]  xfs_inode_ag_walk.isra.12+0x3cc/0x670 [xfs]
> > > [79200.905009]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [79200.905563]  ? kvm_clock_read+0x21/0x30
> > > [79200.905891]  ? sched_clock+0x5/0x10
> > > [79200.906243]  ? sched_clock_local+0x12/0x80
> > > [79200.906598]  ? kvm_clock_read+0x21/0x30
> > > [79200.906920]  ? sched_clock+0x5/0x10
> > > [79200.907273]  ? sched_clock_local+0x12/0x80
> > > [79200.907636]  ? __lock_is_held+0x59/0xa0
> > > [79200.907988]  ? xfs_inode_ag_iterator_tag+0x46/0xb0 [xfs]
> > > [79200.908497]  ? rcu_read_lock_sched_held+0x6b/0x80
> > > [79200.908926]  ? xfs_perag_get_tag+0x28b/0x2f0 [xfs]
> > > [79200.909416]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [79200.909922]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > > [79200.910446]  xfs_file_buffered_aio_write+0x348/0x370 [xfs]
> > > [79200.910948]  xfs_file_write_iter+0x99/0x140 [xfs]
> > > [79200.911400]  __vfs_write+0xfc/0x170
> > > [79200.911726]  vfs_write+0xc1/0x1b0
> > > [79200.912063]  SyS_write+0x55/0xc0
> > > [79200.912347]  entry_SYSCALL_64_fastpath+0x1f/0x96
> > > 
> > > Seems other hanging fsstress processes were all waiting for I/O
> > > completion of writeback (sleeping in wb_wait_for_completion).
> > 
> > Hmm, I'll badger it some more, though I did see:
> > 
> > [ 4349.832516] XFS: Assertion failed: xfs_is_reflink_inode(ip), file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_reflink.c, line: 651
> > [ 4349.847730] WARNING: CPU: 3 PID: 3600 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
> > [ 4349.849142] Modules linked in: xfs libcrc32c dm_snapshot dm_bufio dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
> > [ 4349.850603] CPU: 3 PID: 3600 Comm: fsstress Not tainted 4.15.0-rc6-xfsx #9
> > [ 4349.851417] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
> > [ 4349.852594] RIP: 0010:assfail+0x2e/0x60 [xfs]
> > [ 4349.853156] RSP: 0018:ffffc90002d97a80 EFLAGS: 00010246
> > [ 4349.853785] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
> > [ 4349.854621] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa0270585
> > [ 4349.855457] RBP: ffff88001d41d100 R08: 0000000000000000 R09: 0000000000000000
> > [ 4349.856296] R10: ffffc90002d97a28 R11: f000000000000000 R12: 0000000000000000
> > [ 4349.857142] R13: ffffffffffffffff R14: 0000000000000000 R15: 0000000000000008
> > [ 4349.857969] FS:  00007f0712dc8700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000
> > [ 4349.858918] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 4349.859596] CR2: 00007f0711e7e008 CR3: 0000000004265001 CR4: 00000000001606e0
> > [ 4349.860462] Call Trace:
> > [ 4349.860860]  xfs_reflink_cancel_cow_range+0x3f6/0x650 [xfs]
> > [ 4349.861596]  ? down_write_nested+0x94/0xb0
> > [ 4349.862165]  ? xfs_ilock+0x2ac/0x450 [xfs]
> > [ 4349.862719]  xfs_inode_free_cowblocks+0x38e/0x620 [xfs]
> > [ 4349.863376]  xfs_inode_ag_walk+0x327/0xc30 [xfs]
> > [ 4349.863972]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
> > [ 4349.864600]  ? try_to_wake_up+0x30/0x560
> > [ 4349.865105]  ? _raw_spin_unlock_irqrestore+0x46/0x70
> > [ 4349.865667]  ? try_to_wake_up+0x49/0x560
> > [ 4349.866159]  ? radix_tree_gang_lookup_tag+0xf4/0x150
> > [ 4349.866795]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
> > [ 4349.867438]  ? xfs_perag_get_tag+0x205/0x470 [xfs]
> > [ 4349.868042]  ? xfs_perag_put+0x15f/0x2e0 [xfs]
> > [ 4349.868573]  ? xfs_inode_free_eofblocks+0x580/0x580 [xfs]
> > [ 4349.869243]  xfs_inode_ag_iterator_tag+0x65/0xa0 [xfs]
> > [ 4349.869876]  xfs_file_buffered_aio_write+0x203/0x5b0 [xfs]
> > [ 4349.870575]  xfs_file_write_iter+0x298/0x4f0 [xfs]
> > [ 4349.871164]  __vfs_write+0x130/0x1a0
> > [ 4349.871585]  vfs_write+0xc8/0x1c0
> > [ 4349.872001]  SyS_write+0x45/0xa0
> > [ 4349.872394]  entry_SYSCALL_64_fastpath+0x1f/0x96
> > [ 4349.872950] RIP: 0033:0x7f071259c4a0
> > [ 4349.873377] RSP: 002b:00007ffea07c4588 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > [ 4349.874226] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f071259c4a0
> > [ 4349.875066] RDX: 000000000000aebc RSI: 000000000111b260 RDI: 0000000000000003
> > [ 4349.875866] RBP: 0000000000000001 R08: 000000000000006e R09: 0000000000000004
> > [ 4349.876658] R10: 00007f0712586b78 R11: 0000000000000246 R12: 00000000000007a9
> > [ 4349.877433] R13: 0000000000000003 R14: 00000000001c6200 R15: 000000000001e000
> > [ 4349.878273] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 18 3e 28 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 28 44 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 90 bd 3d a0 e8 3f 13 1c e1 eb e6 
> > [ 4349.880399] ---[ end trace 1e05700f283b7cc1 ]---
> 
> Ok so I think this is just an assert that doesn't belong.
> 
> > So maybe I need to take a closer look at all this machinery tomorrow...
> 
> It would seem that writeback is wedging up when it tries to allocate blocks
> to fill a delalloc(?) extent, but at that point the filesystem is totally
> out of space (zero free blocks) and the whole thing dies.  Hm.  Will take
> a further look next week.

Nothing is ever simple in XFS, is it...

This is actually two problems -- the first is that the while (bno < end
&& n < *nmap) loop in xfs_bmapi_write somehow becomes an infinite loop
when ... I guess eof is true prior to entering the loop?  So we never
advance bno past end, and stuck become we.

The second problem is that for whatever reason the free blocks counter
can dip negative(!) and if it does this for too long(?) then transaction
allocation thinks that we have a large positive number of blocks(??) but
there's nothing in the bnobt to feed it and so kablooie?  Or maybe this
is just some xfs_ag_resv insanity(???) (bfoster was asking about that
earlier).

So, uh... yeah.  It's 11pm, I'm going to bed, will take it up in the
morning.

--D

> --D
> 
> FWIW I also saw these oddballs fly by on one of the g/269 runs:
> 
> (I'm merely recording these here to leave a breadcrumb trail so I can pick
> this up again on Monday.  I think these are all related to the fs being
> totally out of space.)
> 
> [  350.205699] XFS: Assertion failed: type != XFS_IO_COW, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_aops.c, line: 393
> [  350.207092] WARNING: CPU: 2 PID: 105 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
> [  350.208224] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
> [  350.209219] CPU: 2 PID: 105 Comm: kworker/u10:2 Not tainted 4.15.0-rc6-xfsx #5
> [  350.209959] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
> [  350.210902] Workqueue: writeback wb_workfn (flush-259:4)
> [  350.211489] RIP: 0010:assfail+0x2e/0x60 [xfs]
> [  350.211947] RSP: 0018:ffffc9000088f8b8 EFLAGS: 00010246
> [  350.212512] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
> [  350.213276] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
> [  350.214072] RBP: 0000000000001000 R08: 0000000000000000 R09: 0000000000000000
> [  350.214836] R10: 0000000000000000 R11: f000000000000000 R12: 00000000000b3000
> [  350.215565] R13: 0000000000000004 R14: ffff88005d2ee540 R15: ffffc9000088fb28
> [  350.216328] FS:  0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
> [  350.217172] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  350.217849] CR2: 00007f6a9b021000 CR3: 0000000002011002 CR4: 00000000001606e0
> [  350.218675] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  350.219528] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  350.220389] Call Trace:
> [  350.220748]  xfs_map_blocks+0x479/0x8e0 [xfs]
> [  350.221310]  xfs_do_writepage+0x2f8/0xe30 [xfs]
> [  350.221870]  write_cache_pages+0x20c/0x530
> [  350.222381]  ? xfs_add_to_ioend+0x6d0/0x6d0 [xfs]
> [  350.222950]  xfs_vm_writepages+0x7f/0x170 [xfs]
> [  350.223550]  do_writepages+0x17/0x70
> [  350.224051]  __writeback_single_inode+0x59/0x7e0
> [  350.224662]  writeback_sb_inodes+0x283/0x550
> [  350.225172]  wb_writeback+0x112/0x5c0
> [  350.225623]  ? wb_workfn+0x128/0x740
> [  350.226053]  wb_workfn+0x128/0x740
> [  350.226485]  ? lock_acquire+0xab/0x200
> [  350.226941]  ? lock_acquire+0xab/0x200
> [  350.227380]  ? process_one_work+0x17e/0x680
> [  350.227875]  process_one_work+0x1fb/0x680
> [  350.228409]  worker_thread+0x4d/0x3e0
> [  350.228849]  kthread+0x103/0x140
> [  350.229249]  ? process_one_work+0x680/0x680
> [  350.229751]  ? kthread_delayed_work_timer_fn+0x90/0x90
> [  350.230358]  ret_from_fork+0x24/0x30
> [  350.230789] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
> [  350.232932] ---[ end trace 7e48cbbf0d68bb48 ]---
> [  351.713383] XFS (pmem4): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
> [  359.565227] XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_trans.c, line: 353
> [  359.585681] WARNING: CPU: 3 PID: 14462 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
> [  359.591910] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
> [  359.594521] CPU: 3 PID: 14462 Comm: fsstress Tainted: G        W        4.15.0-rc6-xfsx #5
> [  359.595988] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
> [  359.596946] RIP: 0010:assfail+0x2e/0x60 [xfs]
> [  359.597385] RSP: 0018:ffffc9000280b708 EFLAGS: 00010246
> [  359.597927] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
> [  359.598661] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
> [  359.599388] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
> [  359.600200] R10: 0000000000000000 R11: f000000000000000 R12: ffffffffffffffee
> [  359.601005] R13: ffff8800390ec000 R14: ffffc9000280b840 R15: ffff8800390ec000
> [  359.601777] FS:  00007f6a9b033700(0000) GS:ffff88007f400000(0000) knlGS:0000000000000000
> [  359.602725] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  359.603374] CR2: 0000000000815b88 CR3: 000000003d1e5001 CR4: 00000000001606e0
> [  359.604154] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  359.604995] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  359.605814] Call Trace:
> [  359.606081]  xfs_trans_mod_sb+0x44c/0x5e0 [xfs]
> [  359.606656]  xfs_alloc_ag_vextent+0x169/0x580 [xfs]
> [  359.607263]  xfs_alloc_vextent+0xb1b/0x19c0 [xfs]
> [  359.607849]  ? xfs_bmap_longest_free_extent+0x6c/0x120 [xfs]
> [  359.608710]  ? xfs_bmap_btalloc_nullfb+0x9e/0x190 [xfs]
> [  359.609504]  xfs_bmap_btalloc+0x2c1/0xe60 [xfs]
> [  359.610599]  xfs_bmapi_write+0x641/0x1d70 [xfs]
> [  359.611993]  xfs_iomap_write_unwritten+0x246/0x690 [xfs]
> [  359.612991]  iomap_dio_complete+0x43/0x100
> [  359.613414]  iomap_dio_rw+0x358/0x380
> [  359.613834]  ? xfs_file_dio_aio_write+0x184/0x7a0 [xfs]
> [  359.614385]  xfs_file_dio_aio_write+0x184/0x7a0 [xfs]
> [  359.614919]  ? lock_acquire+0xab/0x200
> [  359.615342]  xfs_file_write_iter+0x16c/0x4f0 [xfs]
> [  359.615842]  aio_write+0x129/0x1b0
> [  359.616207]  ? lock_acquire+0xab/0x200
> [  359.616615]  ? __might_fault+0x36/0x80
> [  359.617020]  ? do_io_submit+0x40e/0x8c0
> [  359.617437]  do_io_submit+0x40e/0x8c0
> [  359.617841]  ? entry_SYSCALL_64_fastpath+0x1f/0x96
> [  359.618354]  entry_SYSCALL_64_fastpath+0x1f/0x96
> [  359.618861] RIP: 0033:0x7f6a9aa14697
> [  359.619222] RSP: 002b:00007ffef5888f58 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
> [  359.619936] RAX: ffffffffffffffda RBX: 0000000000000473 RCX: 00007f6a9aa14697
> [  359.620550] RDX: 00007ffef5888f80 RSI: 0000000000000001 RDI: 00007f6a9b03b000
> [  359.621289] RBP: 0000000000184000 R08: 00007f6a9a7f1bf8 R09: 0000000000000000
> [  359.622057] R10: 00007f6a9a7f1b78 R11: 0000000000000246 R12: 0000000000000000
> [  359.622832] R13: 7fffffffffffffff R14: 0000000000000003 R15: 0000000000000004
> [  359.623664] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
> [  359.625557] ---[ end trace 7e48cbbf0d68bb49 ]---
> <snip>
> [  361.250280] XFS: Assertion failed: pathlen == 0, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_symlink.c, line: 346
> [  361.255275] WARNING: CPU: 2 PID: 14534 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
> [  361.256421] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
> [  361.257417] CPU: 2 PID: 14534 Comm: fsstress Tainted: G        W        4.15.0-rc6-xfsx #5
> [  361.258269] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
> [  361.259250] RIP: 0010:assfail+0x2e/0x60 [xfs]
> [  361.259746] RSP: 0018:ffffc90002a4fd00 EFLAGS: 00010246
> [  361.260481] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
> [  361.261374] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
> [  361.262111] RBP: ffff880055db5e80 R08: 0000000000000000 R09: 0000000000000000
> [  361.262790] R10: ffffc90002a4fc40 R11: f000000000000000 R12: ffff8800390ec000
> [  361.263453] R13: 00000000000001b1 R14: ffffc90002a4fea8 R15: 0000000000000024
> [  361.264122] FS:  00007f6a9b033700(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
> [  361.264868] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  361.265409] CR2: 00007f6a9b030000 CR3: 0000000022460001 CR4: 00000000001606e0
> [  361.266074] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  361.266734] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  361.267395] Call Trace:
> [  361.267684]  xfs_symlink+0xd1b/0x11d0 [xfs]
> [  361.268081]  ? lock_acquire+0xab/0x200
> [  361.268466]  ? __d_rehash+0x82/0xd0
> [  361.268814]  ? _raw_spin_unlock+0x2e/0x50
> [  361.269221]  xfs_vn_symlink+0x9a/0x1f0 [xfs]
> [  361.269636]  vfs_symlink+0x83/0xd0
> [  361.269954]  SyS_symlink+0x7e/0xd0
> [  361.270273]  entry_SYSCALL_64_fastpath+0x1f/0x96
> [  361.270708] RIP: 0033:0x7f6a9a525997
> [  361.271049] RSP: 002b:00007ffef5889338 EFLAGS: 00000206 ORIG_RAX: 0000000000000058
> [  361.271729] RAX: ffffffffffffffda RBX: 0000000000000387 RCX: 00007f6a9a525997
> [  361.272380] RDX: 0000000000000064 RSI: 00000000007cc160 RDI: 00000000007d04c0
> [  361.273016] RBP: 0000000000000009 R08: 00007f6a9a7f2308 R09: 0000000000000003
> [  361.273659] R10: 0000000000000000 R11: 0000000000000206 R12: 00000000000e4000
> [  361.274300] R13: 0000000000000003 R14: 000000000000f000 R15: 0000000000000016
> [  361.274939] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
> [  361.276610] ---[ end trace 7e48cbbf0d68bb59 ]---
> [  361.280551] XFS: Assertion failed: error != -ENOSPC, file: /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_inode.c, line: 1223
> [  361.284186] WARNING: CPU: 0 PID: 14473 at /storage/home/djwong/cdev/work/linux-xfs/fs/xfs/xfs_message.c:116 assfail+0x2e/0x60 [xfs]
> [  361.285483] Modules linked in: xfs libcrc32c dax_pmem device_dax nd_pmem sch_fq_codel af_packet [last unloaded: xfs]
> [  361.286912] CPU: 0 PID: 14473 Comm: fsstress Tainted: G        W        4.15.0-rc6-xfsx #5
> [  361.288140] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1djwong0 04/01/2014
> [  361.289029] RIP: 0010:assfail+0x2e/0x60 [xfs]
> [  361.289466] RSP: 0018:ffffc90002863d58 EFLAGS: 00010246
> [  361.289959] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000001
> [  361.290705] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffa06155b5
> [  361.291432] RBP: 00000000ffffffe4 R08: 0000000000000000 R09: 0000000000000000
> [  361.292198] R10: 0000000000000000 R11: f000000000000000 R12: ffff880030023600
> [  361.292944] R13: ffffc90002863e98 R14: 0000000000000025 R15: 0000000000000000
> [  361.293789] FS:  00007f6a9b033700(0000) GS:ffff88003ea00000(0000) knlGS:0000000000000000
> [  361.294562] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  361.295118] CR2: 00007fcea800f0b8 CR3: 000000003d4a2004 CR4: 00000000001606f0
> [  361.295797] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  361.296566] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  361.297253] Call Trace:
> [  361.297609]  xfs_create+0x82a/0xd10 [xfs]
> [  361.298014]  ? get_cached_acl+0xca/0x1e0
> [  361.298437]  xfs_generic_create+0x220/0x360 [xfs]
> [  361.298905]  vfs_mknod+0xa9/0x100
> [  361.299245]  SyS_mknod+0x1a3/0x1f0
> [  361.299584]  entry_SYSCALL_64_fastpath+0x1f/0x96
> [  361.300047] RIP: 0033:0x7f6a9a523cad
> [  361.300410] RSP: 002b:00007ffef5889338 EFLAGS: 00000246 ORIG_RAX: 0000000000000085
> [  361.301117] RAX: ffffffffffffffda RBX: 00000000000003e6 RCX: 00007f6a9a523cad
> [  361.301887] RDX: 0000000000000000 RSI: 0000000000002124 RDI: 00000000007db300
> [  361.302569] RBP: 00000000000db000 R08: 00000000007db300 R09: 0000000000000002
> [  361.303344] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000d000
> [  361.304120] R13: 7fffffffffffffff R14: 0000000000000003 R15: 0000000000000004
> [  361.304912] Code: 00 00 53 48 89 f1 41 89 d0 48 c7 c6 d8 8e 62 a0 48 89 fa 31 ff e8 63 fa ff ff 0f b6 1d 88 50 29 00 80 fb 01 77 09 83 e3 01 75 15 <0f> ff 5b c3 0f b6 f3 48 c7 c7 70 0d 78 a0 e8 df ce e1 e0 eb e6 
> [  361.306797] ---[ end trace 7e48cbbf0d68bb5a ]---
> 
> 
> > > 
> > > > > 
> > > > > [12968.100008] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [fsstress:6903]
> > > > > [12968.100038] Modules linked in: loop dm_flakey xfs ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc 8139too 8139cp i2c_piix4 joydev mii pcspkr virtio_balloon virtio_pci serio_raw virtio_ring virtio floppy ata_generic pata_acpi
> > > > > [12968.104043] irq event stamp: 23222196
> > > > > [12968.104043] hardirqs last  enabled at (23222195): [<000000007d0c2e75>] restore_regs_and_return_to_kernel+0x0/0x2e
> > > > > [12968.105111] hardirqs last disabled at (23222196): [<000000008f80dc57>] apic_timer_interrupt+0xa7/0xc0
> > > > > [12968.105111] softirqs last  enabled at (877594): [<0000000034c53d5e>] __do_softirq+0x392/0x502
> > > > > [12968.105111] softirqs last disabled at (877585): [<000000003f4d9e0b>] irq_exit+0x102/0x110
> > > > > [12968.105111] CPU: 2 PID: 6903 Comm: fsstress Tainted: G        W    L   4.15.0-rc5 #10
> > > > > [12968.105111] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> > > > > [12968.108043] RIP: 0010:xfs_bmapi_update_map+0xc/0xc0 [xfs]
> > > > 
> > > > Hmmm, I haven't seen such a hang; I wonder if we're doing something
> > > > we shouldn't be doing and looping in bmapi_write.  In any case it's
> > > > a bug with xfs, not fsstress.
> > > 
> > > Agreed, I'm planning to pull this patch in this week's update, with the
> > > following fix
> > > 
> > > - inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v1);
> > > + inode_info(inoinfo2, sizeof(inoinfo2), &stat2, v2);
> > > 
> > > Also, I'd follow Dave's suggestion on the xfs/068 fix and move the
> > > FSSTRESS_AVOID handling to common/dump on commit. Please let me know if
> > > you have a different plan now.
> > 
> > I was just gonna go back to amending only xfs/068 to turn off clone/dedupe.
> > 
> > --D
> > 
> > > Thanks,
> > > Eryu

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 10/8] xfs: check that fs freeze minimizes required recovery
  2018-01-03 19:26 ` [PATCH 10/8] xfs: check that fs freeze minimizes required recovery Darrick J. Wong
@ 2018-01-09 11:33   ` Eryu Guan
  2018-01-10  0:03     ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Eryu Guan @ 2018-01-09 11:33 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, fstests

On Wed, Jan 03, 2018 at 11:26:26AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Make sure that a fs freeze operation cleans up as much of the filesystem as
> possible, so as to minimize the recovery required in a crash/remount scenario.  In
> particular we want to check that we don't leave CoW preallocations
> sitting around in the refcountbt, though this test looks for anything
> out of the ordinary on the frozen fs.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  tests/xfs/903     |  107 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/xfs/903.out |   10 +++++
>  tests/xfs/group   |    1 
>  3 files changed, 118 insertions(+)
>  create mode 100755 tests/xfs/903
>  create mode 100644 tests/xfs/903.out
> 
> diff --git a/tests/xfs/903 b/tests/xfs/903
> new file mode 100755
> index 0000000..1686356
> --- /dev/null
> +++ b/tests/xfs/903
> @@ -0,0 +1,107 @@
> +#! /bin/bash
> +# FS QA Test No. 903
> +#
> +# Test that frozen filesystems are relatively clean and not full of errors.
> +# Prior to freezing a filesystem, we want to minimize the amount of recovery
> +# that will have to happen if the system goes down while the fs is frozen.
> +# Therefore, start up fsstress and cycle through a few freeze/thaw cycles
> +# to ensure that nothing blows up when we try to do this.
> +#
> +# Unfortunately the log will probably still be dirty, so we can't do much
> +# about enforcing a clean repair -n run.
> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2000-2002 Silicon Graphics, Inc.  All Rights Reserved.
> +# Copyright (c) 2018 Oracle.  All Rights Reserved.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation.
> +#
> +# This program is distributed in the hope that it would be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write the Free Software Foundation,
> +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> +#
> +#-----------------------------------------------------------------------
> +#
> +
> +seq=`basename $0`
> +seqres=$RESULT_DIR/$seq
> +echo "QA output created by $seq"
> +
> +here=`pwd`
> +tmp=/tmp/$$
> +status=1
> +trap "_cleanup; rm -f $tmp.*; exit \$status" 0 1 2 3 15
> +
> +_cleanup()
> +{
> +	# Make sure we thaw the fs before we unmount or else we remove the
> +	# mount without actually deactivating the filesystem(!)
> +	$XFS_IO_PROG -x -c "thaw" $SCRATCH_MNT 2> /dev/null
> +	echo "*** unmount"
> +	_scratch_unmount 2>/dev/null
> +}
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +
> +# real QA test starts here
> +_supported_fs xfs
> +_supported_os Linux
> +
> +_require_scratch
> +
> +# xfs_db will OOM kill the machine if you don't have huge amounts of RAM, so
> +# don't run this on large filesystems.
> +_require_no_large_scratch_dev

Looks like this is copied from some other test, but it seems
_check_xfs_filesystem already skips _xfs_check if $LARGE_SCRATCH_DEV is
'yes', so we don't need this _require rule now.

> +
> +echo "*** init FS"
> +
> +rm -f $seqres.full
> +_scratch_unmount >/dev/null 2>&1

_require_scratch umounts it for you :)

> +echo "*** MKFS ***" >>$seqres.full
> +echo "" >>$seqres.full
> +_scratch_mkfs_xfs >>$seqres.full 2>&1 || _fail "mkfs failed"
> +_scratch_mount >>$seqres.full 2>&1 || _fail "mount failed"
> +
> +echo "*** test"
> +
> +for l in 0 1 2 3 4
> +do
> +	echo "    *** test $l"
> +	FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -n 1000 $FSSTRESS_AVOID`
> +	$FSSTRESS_PROG  $FSSTRESS_ARGS >>$seqres.full
> +
> +	$XFS_IO_PROG -x -c 'freeze' $SCRATCH_MNT
> +
> +	# Log will probably be dirty after the freeze, record state
> +	echo "" >>$seqres.full
> +	echo "*** xfs_logprint ***" >>$seqres.full
> +	echo "" >>$seqres.full
> +	log=clean
> +	_scratch_xfs_logprint -tb 2>&1 | tee -a $seqres.full \
> +		| head | grep -q "<CLEAN>" || log=dirty
> +
> +	# Fail if repair complains and the log is clean
> +	echo "" >>$seqres.full
> +	echo "*** XFS_REPAIR -n ***" >>$seqres.full
> +	echo "" >>$seqres.full
> +	_scratch_xfs_repair -f -n >> $seqres.full 2>&1
> +
> +	if [ $? -ne 0 ] && [ "$log" = "clean" ]; then
> +		_fail "xfs_repair failed"
> +	fi

Hmm, I enlarged the loop count to 100 and didn't see a single CLEAN log,
so I suspect this test is unlikely to fail...

Thanks,
Eryu

> +
> +	$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT
> +done
> +
> +echo "*** done"
> +status=0
> +exit 0
> diff --git a/tests/xfs/903.out b/tests/xfs/903.out
> new file mode 100644
> index 0000000..378f0cb
> --- /dev/null
> +++ b/tests/xfs/903.out
> @@ -0,0 +1,10 @@
> +QA output created by 903
> +*** init FS
> +*** test
> +    *** test 0
> +    *** test 1
> +    *** test 2
> +    *** test 3
> +    *** test 4
> +*** done
> +*** unmount
> diff --git a/tests/xfs/group b/tests/xfs/group
> index e1b1582..23c26c2 100644
> --- a/tests/xfs/group
> +++ b/tests/xfs/group
> @@ -435,3 +435,4 @@
>  435 auto quick clone
>  436 auto quick clone fsr
>  708 auto quick other
> +903 mount auto quick stress

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 10/8] xfs: check that fs freeze minimizes required recovery
  2018-01-09 11:33   ` Eryu Guan
@ 2018-01-10  0:03     ` Darrick J. Wong
  0 siblings, 0 replies; 55+ messages in thread
From: Darrick J. Wong @ 2018-01-10  0:03 UTC (permalink / raw)
  To: Eryu Guan; +Cc: linux-xfs, fstests

On Tue, Jan 09, 2018 at 07:33:16PM +0800, Eryu Guan wrote:
> On Wed, Jan 03, 2018 at 11:26:26AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Make sure that a fs freeze operation cleans up as much of the filesystem as
> > possible, so as to minimize the recovery required in a crash/remount scenario.  In
> > particular we want to check that we don't leave CoW preallocations
> > sitting around in the refcountbt, though this test looks for anything
> > out of the ordinary on the frozen fs.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  tests/xfs/903     |  107 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/xfs/903.out |   10 +++++
> >  tests/xfs/group   |    1 
> >  3 files changed, 118 insertions(+)
> >  create mode 100755 tests/xfs/903
> >  create mode 100644 tests/xfs/903.out
> > 
> > diff --git a/tests/xfs/903 b/tests/xfs/903
> > new file mode 100755
> > index 0000000..1686356
> > --- /dev/null
> > +++ b/tests/xfs/903
> > @@ -0,0 +1,107 @@
> > +#! /bin/bash
> > +# FS QA Test No. 903
> > +#
> > +# Test that frozen filesystems are relatively clean and not full of errors.
> > +# Prior to freezing a filesystem, we want to minimize the amount of recovery
> > +# that will have to happen if the system goes down while the fs is frozen.
> > +# Therefore, start up fsstress and cycle through a few freeze/thaw cycles
> > +# to ensure that nothing blows up when we try to do this.
> > +#
> > +# Unfortunately the log will probably still be dirty, so we can't do much
> > +# about enforcing a clean repair -n run.
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2000-2002 Silicon Graphics, Inc.  All Rights Reserved.
> > +# Copyright (c) 2018 Oracle.  All Rights Reserved.
> > +#
> > +# This program is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU General Public License as
> > +# published by the Free Software Foundation.
> > +#
> > +# This program is distributed in the hope that it would be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +# GNU General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU General Public License
> > +# along with this program; if not, write the Free Software Foundation,
> > +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
> > +#
> > +#-----------------------------------------------------------------------
> > +#
> > +
> > +seq=`basename $0`
> > +seqres=$RESULT_DIR/$seq
> > +echo "QA output created by $seq"
> > +
> > +here=`pwd`
> > +tmp=/tmp/$$
> > +status=1
> > +trap "_cleanup; rm -f $tmp.*; exit \$status" 0 1 2 3 15
> > +
> > +_cleanup()
> > +{
> > +	# Make sure we thaw the fs before we unmount or else we remove the
> > +	# mount without actually deactivating the filesystem(!)
> > +	$XFS_IO_PROG -x -c "thaw" $SCRATCH_MNT 2> /dev/null
> > +	echo "*** unmount"
> > +	_scratch_unmount 2>/dev/null
> > +}
> > +
> > +# get standard environment, filters and checks
> > +. ./common/rc
> > +. ./common/filter
> > +
> > +# real QA test starts here
> > +_supported_fs xfs
> > +_supported_os Linux
> > +
> > +_require_scratch
> > +
> > +# xfs_db will OOM kill the machine if you don't have huge amounts of RAM, so
> > +# don't run this on large filesystems.
> > +_require_no_large_scratch_dev
> 
> Looks like this is copied from some other test, but it seems
> _check_xfs_filesystem already skips _xfs_check if $LARGE_SCRATCH_DEV is
> 'yes', so we don't need this _require rule now.

Oops, this was just left over from debugging and isn't necessary either.

> > +
> > +echo "*** init FS"
> > +
> > +rm -f $seqres.full
> > +_scratch_unmount >/dev/null 2>&1
> 
> _require_scratch umounts it for you :)
> 
> > +echo "*** MKFS ***" >>$seqres.full
> > +echo "" >>$seqres.full
> > +_scratch_mkfs_xfs >>$seqres.full 2>&1 || _fail "mkfs failed"
> > +_scratch_mount >>$seqres.full 2>&1 || _fail "mount failed"
> > +
> > +echo "*** test"
> > +
> > +for l in 0 1 2 3 4
> > +do
> > +	echo "    *** test $l"
> > +	FSSTRESS_ARGS=`_scale_fsstress_args -d $SCRATCH_MNT -n 1000 $FSSTRESS_AVOID`
> > +	$FSSTRESS_PROG  $FSSTRESS_ARGS >>$seqres.full
> > +
> > +	$XFS_IO_PROG -x -c 'freeze' $SCRATCH_MNT
> > +
> > +	# Log will probably be dirty after the freeze, record state
> > +	echo "" >>$seqres.full
> > +	echo "*** xfs_logprint ***" >>$seqres.full
> > +	echo "" >>$seqres.full
> > +	log=clean
> > +	_scratch_xfs_logprint -tb 2>&1 | tee -a $seqres.full \
> > +		| head | grep -q "<CLEAN>" || log=dirty
> > +
> > +	# Fail if repair complains and the log is clean
> > +	echo "" >>$seqres.full
> > +	echo "*** XFS_REPAIR -n ***" >>$seqres.full
> > +	echo "" >>$seqres.full
> > +	_scratch_xfs_repair -f -n >> $seqres.full 2>&1
> > +
> > +	if [ $? -ne 0 ] && [ "$log" = "clean" ]; then
> > +		_fail "xfs_repair failed"
> > +	fi
> 
> Hmm, I enlarged the loop count to 100 and didn't see a single CLEAN log,
> so I suspect this test is unlikely to fail...

Hmmm, you're right, we're really looking for cow extents that haven't
been cleaned out of the refcount btrees.  I'll add a clause to make it
look for them directly.  That said, the cow extent cleanup depends on
"vfs/xfs: clean up cow mappings during fs data freeze", so there's
no hurry to get this in.

--D

> 
> Thanks,
> Eryu
> 
> > +
> > +	$XFS_IO_PROG -x -c 'thaw' $SCRATCH_MNT
> > +done
> > +
> > +echo "*** done"
> > +status=0
> > +exit 0
> > diff --git a/tests/xfs/903.out b/tests/xfs/903.out
> > new file mode 100644
> > index 0000000..378f0cb
> > --- /dev/null
> > +++ b/tests/xfs/903.out
> > @@ -0,0 +1,10 @@
> > +QA output created by 903
> > +*** init FS
> > +*** test
> > +    *** test 0
> > +    *** test 1
> > +    *** test 2
> > +    *** test 3
> > +    *** test 4
> > +*** done
> > +*** unmount
> > diff --git a/tests/xfs/group b/tests/xfs/group
> > index e1b1582..23c26c2 100644
> > --- a/tests/xfs/group
> > +++ b/tests/xfs/group
> > @@ -435,3 +435,4 @@
> >  435 auto quick clone
> >  436 auto quick clone fsr
> >  708 auto quick other
> > +903 mount auto quick stress

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2017-12-15  2:07   ` [PATCH v2 " Darrick J. Wong
  2018-01-03  8:48     ` Eryu Guan
@ 2018-02-22 16:06     ` Luis Henriques
  2018-02-22 17:27       ` Darrick J. Wong
  1 sibling, 1 reply; 55+ messages in thread
From: Luis Henriques @ 2018-02-22 16:06 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:

<snip>

> +void
> +clonerange_f(
> +	int			opno,
> +	long			r)
> +{

<snip>

> +	/* Calculate offsets */
> +	len = (random() % FILELEN_MAX) + 1;
> +	len &= ~(stat1.st_blksize - 1);
> +	if (len == 0)
> +		len = stat1.st_blksize;
> +	if (len > stat1.st_size)
> +		len = stat1.st_size;
> +
> +	lr = ((__int64_t)random() << 32) + random();
> +	if (stat1.st_size == len)
> +		off1 = 0;
> +	else
> +		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
> +	off1 %= maxfsize;
> +	off1 &= ~(stat1.st_blksize - 1);
> +
> +	/*
> +	 * If srcfile == destfile, randomly generate destination ranges
> +	 * until we find one that doesn't overlap the source range.
> +	 */
> +	do {
> +		lr = ((__int64_t)random() << 32) + random();
> +		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> +		off2 %= maxfsize;
> +		off2 &= ~(stat2.st_blksize - 1);
> +	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);

I started seeing hangs in generic/013 on cephfs.  After spending some
time looking, I found that this loops forever.  And the reason seems to
be that stat1.st_blksize is too big for this filesystem (4M) -- when
doing:

	off1 &= ~(stat1.st_blksize - 1);

off1 (and off2) will both end up with 0.  Does this make sense?  Would
something like:

-	off1 &= ~(stat1.st_blksize - 1);
+	if (stat1.st_blksize <= stat1.st_size)
+		off1 &= ~(stat1.st_blksize - 1);

be acceptable?  (and a similar change for off2, of course.)
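
To make the arithmetic concrete, here is a tiny standalone sketch (the 4M
block size and 1M file size are illustrative assumptions, not code lifted
from fsstress) showing that the masking pins both offsets to zero whenever
st_blksize exceeds st_size, so the overlap test above can never become
false for the srcfile == destfile case:

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	/* Hypothetical cephfs-like geometry: 4M blocks, 1M file. */
	uint64_t blksize = 4ULL << 20;
	uint64_t size = 1ULL << 20;

	uint64_t len = size;				/* len is clamped to st_size */
	uint64_t off1 = random() % size;
	uint64_t off2 = random() % (size + (1024 * 1024));

	/* The same rounding fsstress applies: */
	off1 &= ~(blksize - 1);		/* always 0, because off1 < 4M */
	off2 &= ~(blksize - 1);		/* always 0, because off2 < 4M too */

	/* ...so llabs(off2 - off1) < len holds on every iteration. */
	printf("off1=%" PRIu64 " off2=%" PRIu64 " overlapping=%d\n",
	       off1, off2, (uint64_t)llabs((long long)(off2 - off1)) < len);
	return 0;
}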

Cheers,
--
Luís

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-02-22 16:06     ` Luis Henriques
@ 2018-02-22 17:27       ` Darrick J. Wong
  2018-02-22 18:17         ` Luis Henriques
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2018-02-22 17:27 UTC (permalink / raw)
  To: Luis Henriques; +Cc: eguan, linux-xfs, fstests

On Thu, Feb 22, 2018 at 04:06:14PM +0000, Luis Henriques wrote:
> On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> 
> <snip>
> 
> > +void
> > +clonerange_f(
> > +	int			opno,
> > +	long			r)
> > +{
> 
> <snip>
> 
> > +	/* Calculate offsets */
> > +	len = (random() % FILELEN_MAX) + 1;
> > +	len &= ~(stat1.st_blksize - 1);
> > +	if (len == 0)
> > +		len = stat1.st_blksize;
> > +	if (len > stat1.st_size)
> > +		len = stat1.st_size;
> > +
> > +	lr = ((__int64_t)random() << 32) + random();
> > +	if (stat1.st_size == len)
> > +		off1 = 0;
> > +	else
> > +		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
> > +	off1 %= maxfsize;
> > +	off1 &= ~(stat1.st_blksize - 1);
> > +
> > +	/*
> > +	 * If srcfile == destfile, randomly generate destination ranges
> > +	 * until we find one that doesn't overlap the source range.
> > +	 */
> > +	do {
> > +		lr = ((__int64_t)random() << 32) + random();
> > +		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> > +		off2 %= maxfsize;
> > +		off2 &= ~(stat2.st_blksize - 1);
> > +	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
> 
> I started seeing hangs in generic/013 on cephfs.  After spending some
> time looking, I found that this loops forever.  And the reason seems to
> be that stat1.st_blksize is too big for this filesystem (4M) -- when
> doing:

"Too big for this filesystem"?

Uh... maybe you'd better start by giving me more stat buffer info --
what's st_size?

> 	off1 &= ~(stat1.st_blksize - 1);

These bits round the start offset down to block granularity, since clone
range implementations generally require that the ranges align to block
boundaries.
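
As a purely illustrative sketch of that requirement (the file names and the
4096-byte block size below are assumptions, not anything from this report),
a raw FICLONERANGE call is generally expected to fail with EINVAL unless
src_offset, dest_offset and src_length are all multiples of the filesystem
block size:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

int main(void)
{
	int src = open("srcfile", O_RDONLY);	/* hypothetical paths */
	int dst = open("destfile", O_RDWR);
	struct file_clone_range fcr = {
		.src_fd = src,
		.src_offset = 0,	/* must be block-aligned */
		.src_length = 4096,	/* block-aligned, or 0 for "to EOF" */
		.dest_offset = 4096,	/* must be block-aligned */
	};

	if (src < 0 || dst < 0)
		return 1;
	if (ioctl(dst, FICLONERANGE, &fcr) < 0)
		fprintf(stderr, "FICLONERANGE: %s\n", strerror(errno));
	return 0;
}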

(Though AFAICT ceph doesn't support clone range anyway...)

So reading between the lines, is the problem here that ceph advertises a
blocksize of 4M and fsstress calls clonerange_f with files that are
smaller than 4M in size, so the only possible offsets with a 4M
blocksize are zero and that's why we end up looping forever?

--D

> 
> off1 (and off2) will both end up with 0.  Does this make sense?  Would
> something like:
> 
> -	off1 &= ~(stat1.st_blksize - 1);
> +	if (stat1.st_blksize <= stat1.st_size)
> +		off1 &= ~(stat1.st_blksize - 1);
> 
> be acceptable?  (and a similar change for off2, of course.)

> Cheers,
> --
> Luís

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-02-22 17:27       ` Darrick J. Wong
@ 2018-02-22 18:17         ` Luis Henriques
  2018-02-22 18:34           ` Darrick J. Wong
  0 siblings, 1 reply; 55+ messages in thread
From: Luis Henriques @ 2018-02-22 18:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Thu, Feb 22, 2018 at 09:27:41AM -0800, Darrick J. Wong wrote:
> On Thu, Feb 22, 2018 at 04:06:14PM +0000, Luis Henriques wrote:
> > On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > 
> > <snip>
> > 
> > > +void
> > > +clonerange_f(
> > > +	int			opno,
> > > +	long			r)
> > > +{
> > 
> > <snip>
> > 
> > > +	/* Calculate offsets */
> > > +	len = (random() % FILELEN_MAX) + 1;
> > > +	len &= ~(stat1.st_blksize - 1);
> > > +	if (len == 0)
> > > +		len = stat1.st_blksize;
> > > +	if (len > stat1.st_size)
> > > +		len = stat1.st_size;
> > > +
> > > +	lr = ((__int64_t)random() << 32) + random();
> > > +	if (stat1.st_size == len)
> > > +		off1 = 0;
> > > +	else
> > > +		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
> > > +	off1 %= maxfsize;
> > > +	off1 &= ~(stat1.st_blksize - 1);
> > > +
> > > +	/*
> > > +	 * If srcfile == destfile, randomly generate destination ranges
> > > +	 * until we find one that doesn't overlap the source range.
> > > +	 */
> > > +	do {
> > > +		lr = ((__int64_t)random() << 32) + random();
> > > +		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> > > +		off2 %= maxfsize;
> > > +		off2 &= ~(stat2.st_blksize - 1);
> > > +	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
> > 
> > I started seeing hangs in generic/013 on cephfs.  After spending some
> > time looking, I found that this loops forever.  And the reason seems to
> > be that stat1.st_blksize is too big for this filesystem (4M) -- when
> > doing:
> 
> "Too big for this filesystem"?
> 
> Uh... maybe you'd better start by giving me more stat buffer info --
> what's st_size?
> 
> > 	off1 &= ~(stat1.st_blksize - 1);
> 
> These bits round the start offset down to block granularity, since clone
> range implementations generally require that the ranges align to block
> boundaries.
> 
> (Though AFAICT ceph doesn't support clone range anyway...)
> 
> So reading between the lines, is the problem here that ceph advertises a
> blocksize of 4M and fsstress calls clonerange_f with files that are
> smaller than 4M in size, so the only possible offsets with a 4M
> blocksize are zero and that's why we end up looping forever?

Brilliantly described!  That is *exactly* what I'm seeing and failed to
describe.  I guess I could use FSSTRESS_AVOID to work around this issue,
but there are probably better options.

Cheers,
--
Luís

> 
> --D
> 
> > 
> > off1 (and off2) will both end up with 0.  Does this make sense?  Would
> > something like:
> > 
> > -	off1 &= ~(stat1.st_blksize - 1);
> > +	if (stat1.st_blksize <= stat1.st_size)
> > +		off1 &= ~(stat1.st_blksize - 1);
> > 
> > be acceptable?  (and a similar change for off2, of course.)
> 
> > Cheers,
> > --
> > Luís

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-02-22 18:17         ` Luis Henriques
@ 2018-02-22 18:34           ` Darrick J. Wong
  2018-02-23 10:17             ` Luis Henriques
  0 siblings, 1 reply; 55+ messages in thread
From: Darrick J. Wong @ 2018-02-22 18:34 UTC (permalink / raw)
  To: Luis Henriques; +Cc: eguan, linux-xfs, fstests

On Thu, Feb 22, 2018 at 06:17:31PM +0000, Luis Henriques wrote:
> On Thu, Feb 22, 2018 at 09:27:41AM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 22, 2018 at 04:06:14PM +0000, Luis Henriques wrote:
> > > On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > > 
> > > <snip>
> > > 
> > > > +void
> > > > +clonerange_f(
> > > > +	int			opno,
> > > > +	long			r)
> > > > +{
> > > 
> > > <snip>
> > > 
> > > > +	/* Calculate offsets */
> > > > +	len = (random() % FILELEN_MAX) + 1;
> > > > +	len &= ~(stat1.st_blksize - 1);
> > > > +	if (len == 0)
> > > > +		len = stat1.st_blksize;
> > > > +	if (len > stat1.st_size)
> > > > +		len = stat1.st_size;
> > > > +
> > > > +	lr = ((__int64_t)random() << 32) + random();
> > > > +	if (stat1.st_size == len)
> > > > +		off1 = 0;
> > > > +	else
> > > > +		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
> > > > +	off1 %= maxfsize;
> > > > +	off1 &= ~(stat1.st_blksize - 1);
> > > > +
> > > > +	/*
> > > > +	 * If srcfile == destfile, randomly generate destination ranges
> > > > +	 * until we find one that doesn't overlap the source range.
> > > > +	 */
> > > > +	do {
> > > > +		lr = ((__int64_t)random() << 32) + random();
> > > > +		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> > > > +		off2 %= maxfsize;
> > > > +		off2 &= ~(stat2.st_blksize - 1);
> > > > +	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
> > > 
> > > I started seeing hangs in generic/013 on cephfs.  After spending some
> > > time looking, I found that this loops forever.  And the reason seems to
> > > be that stat1.st_blksize is too big for this filesystem (4M) -- when
> > > doing:
> > 
> > "Too big for this filesystem"?
> > 
> > Uh... maybe you'd better start by giving me more stat buffer info --
> > what's st_size?
> > 
> > > 	off1 &= ~(stat1.st_blksize - 1);
> > 
> > These bits round the start offset down to block granularity, since clone
> > range implementations generally require that the ranges align to block
> > boundaries.
> > 
> > (Though AFAICT ceph doesn't support clone range anyway...)
> > 
> > So reading between the lines, is the problem here that ceph advertises a
> > blocksize of 4M and fsstress calls clonerange_f with files that are
> > smaller than 4M in size, so the only possible offsets with a 4M
> > blocksize are zero and that's why we end up looping forever?
> 
> Brilliantly described!  That is *exactly* what I'm seeing and failed to
> describe.  I guess I could use FSSTRESS_AVOID to work around this issue,
> but there are probably better options.

Better to patch fsstress.c against this bug. :)

Does the following patch help?

--D

diff --git a/ltp/fsstress.c b/ltp/fsstress.c
index 935f5de..e107099 100644
--- a/ltp/fsstress.c
+++ b/ltp/fsstress.c
@@ -2222,6 +2222,7 @@ clonerange_f(
 	off64_t			lr;
 	off64_t			off1;
 	off64_t			off2;
+	off64_t			max_off2;
 	size_t			len;
 	int			v1;
 	int			v2;
@@ -2305,9 +2306,10 @@ clonerange_f(
 	 * If srcfile == destfile, randomly generate destination ranges
 	 * until we find one that doesn't overlap the source range.
 	 */
+	max_off2 = MIN(stat2.st_size + (1024ULL * stat2.st_blksize), MAXFSIZE);
 	do {
 		lr = ((int64_t)random() << 32) + random();
-		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
+		off2 = (off64_t)(lr % max_off2);
 		off2 %= maxfsize;
 		off2 &= ~(stat2.st_blksize - 1);
 	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
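
For what it's worth, a quick standalone arithmetic check (again assuming a
4M block size and a 1M destination file, not measured values) shows why
scaling the slack by st_blksize matters: the old fixed 1M of slack never
reaches a second aligned offset, while 1024 * st_blksize leaves on the
order of a thousand candidates for off2:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	/* Assumed geometry: 4M blocks, 1M destination file. */
	uint64_t blksize = 4ULL << 20;
	uint64_t size = 1ULL << 20;

	uint64_t old_max = size + (1024 * 1024);	/* old upper bound for off2 */
	uint64_t new_max = size + 1024 * blksize;	/* patched upper bound */

	/* Block-aligned offsets available in [0, max): ceil(max / blksize). */
	printf("old bound: %" PRIu64 " aligned offset(s)\n",
	       (old_max + blksize - 1) / blksize);
	printf("new bound: %" PRIu64 " aligned offset(s)\n",
	       (new_max + blksize - 1) / blksize);
	return 0;
}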

^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH v2 6/8] fsstress: implement the clonerange/deduperange ioctls
  2018-02-22 18:34           ` Darrick J. Wong
@ 2018-02-23 10:17             ` Luis Henriques
  0 siblings, 0 replies; 55+ messages in thread
From: Luis Henriques @ 2018-02-23 10:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: eguan, linux-xfs, fstests

On Thu, Feb 22, 2018 at 10:34:45AM -0800, Darrick J. Wong wrote:
> On Thu, Feb 22, 2018 at 06:17:31PM +0000, Luis Henriques wrote:
> > On Thu, Feb 22, 2018 at 09:27:41AM -0800, Darrick J. Wong wrote:
> > > On Thu, Feb 22, 2018 at 04:06:14PM +0000, Luis Henriques wrote:
> > > > On Thu, Dec 14, 2017 at 06:07:31PM -0800, Darrick J. Wong wrote:
> > > > 
> > > > <snip>
> > > > 
> > > > > +void
> > > > > +clonerange_f(
> > > > > +	int			opno,
> > > > > +	long			r)
> > > > > +{
> > > > 
> > > > <snip>
> > > > 
> > > > > +	/* Calculate offsets */
> > > > > +	len = (random() % FILELEN_MAX) + 1;
> > > > > +	len &= ~(stat1.st_blksize - 1);
> > > > > +	if (len == 0)
> > > > > +		len = stat1.st_blksize;
> > > > > +	if (len > stat1.st_size)
> > > > > +		len = stat1.st_size;
> > > > > +
> > > > > +	lr = ((__int64_t)random() << 32) + random();
> > > > > +	if (stat1.st_size == len)
> > > > > +		off1 = 0;
> > > > > +	else
> > > > > +		off1 = (off64_t)(lr % MIN(stat1.st_size - len, MAXFSIZE));
> > > > > +	off1 %= maxfsize;
> > > > > +	off1 &= ~(stat1.st_blksize - 1);
> > > > > +
> > > > > +	/*
> > > > > +	 * If srcfile == destfile, randomly generate destination ranges
> > > > > +	 * until we find one that doesn't overlap the source range.
> > > > > +	 */
> > > > > +	do {
> > > > > +		lr = ((__int64_t)random() << 32) + random();
> > > > > +		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> > > > > +		off2 %= maxfsize;
> > > > > +		off2 &= ~(stat2.st_blksize - 1);
> > > > > +	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
> > > > 
> > > > I started seeing hangs in generic/013 on cephfs.  After spending some
> > > > time looking, I found that this loops forever.  And the reason seems to
> > > > be that stat1.st_blksize is too big for this filesystem (4M) -- when
> > > > doing:
> > > 
> > > "Too big for this filesystem"?
> > > 
> > > Uh... maybe you'd better start by giving me more stat buffer info --
> > > what's st_size?
> > > 
> > > > 	off1 &= ~(stat1.st_blksize - 1);
> > > 
> > > These bits round the start offset down to block granularity, since clone
> > > range implementations generally require that the ranges align to block
> > > boundaries.
> > > 
> > > (Though AFAICT ceph doesn't support clone range anyway...)
> > > 
> > > So reading between the lines, is the problem here that ceph advertises a
> > > blocksize of 4M and fsstress calls clonerange_f with files that are
> > > smaller than 4M in size, so the only possible offsets with a 4M
> > > blocksize are zero and that's why we end up looping forever?
> > 
> > Brilliantly described!  That is *exactly* what I'm seeing and failed to
> > describe.  I guess I could use FSSTRESS_AVOID to work around this issue,
> > but there are probably better options.
> 
> Better to patch fsstress.c against this bug. :)
> 
> Does the following patch help?

Yes, it does.  Thanks!  Feel free to add my

Tested-by: Luis Henriques <lhenriques@suse.com>

Cheers,
--
Luís

> 
> --D
> 
> diff --git a/ltp/fsstress.c b/ltp/fsstress.c
> index 935f5de..e107099 100644
> --- a/ltp/fsstress.c
> +++ b/ltp/fsstress.c
> @@ -2222,6 +2222,7 @@ clonerange_f(
>  	off64_t			lr;
>  	off64_t			off1;
>  	off64_t			off2;
> +	off64_t			max_off2;
>  	size_t			len;
>  	int			v1;
>  	int			v2;
> @@ -2305,9 +2306,10 @@ clonerange_f(
>  	 * If srcfile == destfile, randomly generate destination ranges
>  	 * until we find one that doesn't overlap the source range.
>  	 */
> +	max_off2 = MIN(stat2.st_size + (1024ULL * stat2.st_blksize), MAXFSIZE);
>  	do {
>  		lr = ((int64_t)random() << 32) + random();
> -		off2 = (off64_t)(lr % MIN(stat2.st_size + (1024 * 1024), MAXFSIZE));
> +		off2 = (off64_t)(lr % max_off2);
>  		off2 %= maxfsize;
>  		off2 &= ~(stat2.st_blksize - 1);
>  	} while (stat1.st_ino == stat2.st_ino && llabs(off2 - off1) < len);
> 

^ permalink raw reply	[flat|nested] 55+ messages in thread

end of thread, other threads:[~2018-02-23 10:17 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-13  6:03 [PATCH 0/8] weekly fstests changes Darrick J. Wong
2017-12-13  6:03 ` [PATCH 1/8] common/rc: report kmemleak errors Darrick J. Wong
2017-12-14  9:37   ` Eryu Guan
2017-12-14 18:15     ` Darrick J. Wong
2018-01-05  8:02       ` Eryu Guan
2018-01-05 17:02         ` Darrick J. Wong
2018-01-07 15:25           ` Eryu Guan
2017-12-13  6:03 ` [PATCH 2/8] common/xfs: fix scrub support probing again Darrick J. Wong
2017-12-13  6:03 ` [PATCH 3/8] generic/45[34]: test line draw characters in file/attr names Darrick J. Wong
2017-12-13  6:03 ` [PATCH 4/8] xfs: fix tests to handle removal of no-alloc create nonfeature Darrick J. Wong
2017-12-13 22:12   ` Dave Chinner
2017-12-13 22:45   ` [PATCH v2 " Darrick J. Wong
2017-12-13  6:03 ` [PATCH 5/8] generic: test error shutdown while stressing filesystem Darrick J. Wong
2017-12-13  6:03 ` [PATCH 6/8] fsstress: implement the clonerange/deduperange ioctls Darrick J. Wong
2017-12-14  6:39   ` Amir Goldstein
2017-12-14  7:32     ` Eryu Guan
2017-12-14 20:20       ` Darrick J. Wong
2017-12-15  2:07   ` [PATCH v2 " Darrick J. Wong
2018-01-03  8:48     ` Eryu Guan
2018-01-03 17:12       ` Darrick J. Wong
2018-01-05  4:35         ` Eryu Guan
2018-01-05  4:54           ` Darrick J. Wong
2018-01-06  1:46             ` Darrick J. Wong
2018-01-09  7:09               ` Darrick J. Wong
2018-02-22 16:06     ` Luis Henriques
2018-02-22 17:27       ` Darrick J. Wong
2018-02-22 18:17         ` Luis Henriques
2018-02-22 18:34           ` Darrick J. Wong
2018-02-23 10:17             ` Luis Henriques
2017-12-13  6:04 ` [PATCH 7/8] generic: run a long-soak write-only fsstress test Darrick J. Wong
2018-01-07 15:34   ` Eryu Guan
2017-12-13  6:04 ` [PATCH 8/8] xfs/068: fix variability problems in file/dir count output Darrick J. Wong
2017-12-13 22:20   ` Dave Chinner
2017-12-13 22:23     ` Darrick J. Wong
2017-12-13 22:45       ` Dave Chinner
2017-12-13 23:17         ` Darrick J. Wong
2017-12-13 23:42           ` Dave Chinner
2017-12-13 23:28   ` [PATCH v2 8/8] xfs/068: fix clonerange " Darrick J. Wong
2017-12-13 23:44     ` Dave Chinner
2017-12-14  6:52       ` Amir Goldstein
2017-12-14  7:37         ` Amir Goldstein
2017-12-14  7:49         ` Eryu Guan
2017-12-14  8:15           ` Amir Goldstein
2017-12-14 21:35           ` Dave Chinner
2017-12-15  2:04             ` Darrick J. Wong
2017-12-15  4:37               ` Dave Chinner
2017-12-15  7:06                 ` Amir Goldstein
2017-12-15  2:08   ` [PATCH v3 8/8] xfs/068: fix variability " Darrick J. Wong
2017-12-15  2:16     ` Darrick J. Wong
2017-12-15  2:17   ` [PATCH v4 " Darrick J. Wong
2017-12-15  8:55 ` [PATCH 0/8] weekly fstests changes Eryu Guan
2018-01-03 19:22 ` [PATCH 9/8] xfs: find libxfs api violations Darrick J. Wong
2018-01-03 19:26 ` [PATCH 10/8] xfs: check that fs freeze minimizes required recovery Darrick J. Wong
2018-01-09 11:33   ` Eryu Guan
2018-01-10  0:03     ` Darrick J. Wong
