fstests.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/9] xfstests: random fixes
@ 2020-10-27 19:01 Darrick J. Wong
  2020-10-27 19:01 ` [PATCH 1/9] common: extract rt extent size for _get_file_block_size Darrick J. Wong
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:01 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

Hi all,

This series contains random fixes to fstests.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=random-fixes

xfsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=random-fixes

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=random-fixes
---
 check             |   21 ++++++++++++++++++++-
 common/populate   |    5 +++++
 common/rc         |   13 ++++++++++---
 common/repair     |    1 +
 common/xfs        |   20 ++++++++++++++++++++
 tests/xfs/030     |    1 +
 tests/xfs/272     |    3 +++
 tests/xfs/276     |    8 +++++++-
 tests/xfs/327     |   18 ++++++++++++++++--
 tests/xfs/327.out |   13 +++++++------
 tests/xfs/328     |    2 +-
 tests/xfs/341     |    8 +++++---
 tests/xfs/520     |    3 +++
 13 files changed, 99 insertions(+), 17 deletions(-)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/9] common: extract rt extent size for _get_file_block_size
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
@ 2020-10-27 19:01 ` Darrick J. Wong
  2020-10-28  7:41   ` Christoph Hellwig
  2020-10-27 19:01 ` [PATCH 2/9] xfs/520: disable external devices Darrick J. Wong
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:01 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

_get_file_block_size is intended to return the size (in bytes) of the
fundamental allocation unit for a file.  This is required for remapping
operations like fallocate and reflink, which can only operate on
allocation units.  Since the XFS realtime volume can be configure for
allocation units larger than 1 fs block, we need to factor that in here.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 common/rc  |   13 ++++++++++---
 common/xfs |   20 ++++++++++++++++++++
 2 files changed, 30 insertions(+), 3 deletions(-)


diff --git a/common/rc b/common/rc
index 27a27ea3..41f93047 100644
--- a/common/rc
+++ b/common/rc
@@ -3974,11 +3974,18 @@ _get_file_block_size()
 		echo "Missing mount point argument for _get_file_block_size"
 		exit 1
 	fi
-	if [ "$FSTYP" = "ocfs2" ]; then
+
+	case "$FSTYP" in
+	"ocfs2")
 		stat -c '%o' $1
-	else
+		;;
+	"xfs")
+		_xfs_get_file_block_size $1
+		;;
+	*)
 		_get_block_size $1
-	fi
+		;;
+	esac
 }
 
 # Get the minimum block size of an fs.
diff --git a/common/xfs b/common/xfs
index 79dab058..3f5c14ba 100644
--- a/common/xfs
+++ b/common/xfs
@@ -174,6 +174,26 @@ _scratch_mkfs_xfs()
 	return $mkfs_status
 }
 
+# Get the size of an allocation unit of a file.  Normally this is just the
+# block size of the file, but for realtime files, this is the realtime extent
+# size.
+_xfs_get_file_block_size()
+{
+	local path="$1"
+
+	if ! ($XFS_IO_PROG -c "stat -v" "$path" 2>&1 | egrep -q '(rt-inherit|realtime)'); then
+		_get_block_size "$path"
+		return
+	fi
+
+	# Otherwise, call xfs_info until we find a mount point or the root.
+	path="$(readlink -m "$path")"
+	while ! $XFS_INFO_PROG "$path" &>/dev/null && [ "$path" != "/" ]; do
+		path="$(dirname "$path")"
+	done
+	$XFS_INFO_PROG "$path" | grep realtime | sed -e 's/^.*extsz=\([0-9]*\).*$/\1/g'
+}
+
 # xfs_check script is planned to be deprecated. But, we want to
 # be able to invoke "xfs_check" behavior in xfstests in order to
 # maintain the current verification levels.


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/9] xfs/520: disable external devices
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
  2020-10-27 19:01 ` [PATCH 1/9] common: extract rt extent size for _get_file_block_size Darrick J. Wong
@ 2020-10-27 19:01 ` Darrick J. Wong
  2020-10-28  7:41   ` Christoph Hellwig
  2020-10-27 19:01 ` [PATCH 3/9] xfs/341: fix test when rextsize > 1 Darrick J. Wong
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:01 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

This is a regression test for a specific bug that requires a specific
configuration of the data device.  Realtime volumes and external logs
don't affect the efficacy of the test, but the test can fail mkfs if the
realtime device is very large.

Therefore, unset USE_EXTERNAL so that we always run this regression
test, even if the tester enabled realtime.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/520 |    3 +++
 1 file changed, 3 insertions(+)


diff --git a/tests/xfs/520 b/tests/xfs/520
index 51bc4100..b99b9c6e 100755
--- a/tests/xfs/520
+++ b/tests/xfs/520
@@ -41,6 +41,9 @@ _disable_dmesg_check
 _require_check_dmesg
 _require_scratch_nocheck
 
+# Don't let the rtbitmap fill up the data device and screw up this test
+unset USE_EXTERNAL
+
 force_crafted_metadata() {
 	_scratch_mkfs_xfs -f $fsdsopt "$4" >> $seqres.full 2>&1 || _fail "mkfs failed"
 	_scratch_xfs_set_metadata_field "$1" "$2" "$3" >> $seqres.full 2>&1


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/9] xfs/341: fix test when rextsize > 1
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
  2020-10-27 19:01 ` [PATCH 1/9] common: extract rt extent size for _get_file_block_size Darrick J. Wong
  2020-10-27 19:01 ` [PATCH 2/9] xfs/520: disable external devices Darrick J. Wong
@ 2020-10-27 19:01 ` Darrick J. Wong
  2020-10-28  7:41   ` Christoph Hellwig
  2020-10-27 19:01 ` [PATCH 4/9] various: replace _get_block_size with _get_file_block_size when needed Darrick J. Wong
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:01 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Fix this test so that it works when the rt extent size is larger than
single block.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/341 |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)


diff --git a/tests/xfs/341 b/tests/xfs/341
index 37ff2bd9..e1fbe588 100755
--- a/tests/xfs/341
+++ b/tests/xfs/341
@@ -41,6 +41,8 @@ cat $tmp.mkfs > "$seqres.full" 2>&1
 _scratch_mount
 blksz="$(_get_block_size $SCRATCH_MNT)"
 
+rtextsz_blks=$((rtextsz / blksz))
+
 # inode core size is at least 176 bytes; btree header is 56 bytes;
 # rtrmap record is 32 bytes; and rtrmap key/pointer are 56 bytes.
 i_ptrs=$(( (isize - 176) / 56 ))
@@ -52,14 +54,14 @@ len=$((blocks * rtextsz))
 echo "Create some files"
 $XFS_IO_PROG -f -R -c "falloc 0 $len" -c "pwrite -S 0x68 -b 1048576 0 $len" $SCRATCH_MNT/f1 >> $seqres.full
 $XFS_IO_PROG -f -R -c "falloc 0 $len" -c "pwrite -S 0x68 -b 1048576 0 $len" $SCRATCH_MNT/f2 >> $seqres.full
-$here/src/punch-alternating $SCRATCH_MNT/f1 >> "$seqres.full"
-$here/src/punch-alternating $SCRATCH_MNT/f2 >> "$seqres.full"
+$here/src/punch-alternating -i $((2 * rtextsz_blks)) -s $rtextsz_blks $SCRATCH_MNT/f1 >> "$seqres.full"
+$here/src/punch-alternating -i $((2 * rtextsz_blks)) -s $rtextsz_blks $SCRATCH_MNT/f2 >> "$seqres.full"
 echo garbage > $SCRATCH_MNT/f3
 ino=$(stat -c '%i' $SCRATCH_MNT/f3)
 _scratch_unmount
 
 echo "Corrupt fs"
-fsbno=$(_scratch_xfs_db -c "inode $ino" -c 'bmap' | \
+fsbno=$(_scratch_xfs_db -c "inode $ino" -c 'bmap' | grep 'flag 0' | head -n 1 | \
 	sed -e 's/^.*startblock \([0-9]*\) .*$/\1/g')
 
 _scratch_xfs_db -x -c 'sb 0' -c 'addr rrmapino' \


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 4/9] various: replace _get_block_size with _get_file_block_size when needed
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
                   ` (2 preceding siblings ...)
  2020-10-27 19:01 ` [PATCH 3/9] xfs/341: fix test when rextsize > 1 Darrick J. Wong
@ 2020-10-27 19:01 ` Darrick J. Wong
  2020-10-28  7:42   ` Christoph Hellwig
  2020-10-27 19:02 ` [PATCH 5/9] xfs/327: fix inode reflink flag checking Darrick J. Wong
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:01 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

The _get_file_block_size helper was added so that tests could find out
the size of a fundamental unit of allocation for a given file, which is
necessary for certain fallocate and clonerange tests.

On certain filesystem configurations (ocfs2 with clusters, xfs with a
large rt extent size), this is /not/ the same as the filesystem block
size, and these tests will fail.  Fix them to use the correct helper.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/328 |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/tests/xfs/328 b/tests/xfs/328
index 9b17b6a4..26325b40 100755
--- a/tests/xfs/328
+++ b/tests/xfs/328
@@ -48,7 +48,7 @@ mkdir "$testdir"
 # 2^10 blocks... that should be plenty for anyone.
 fnr=$((12 + LOAD_FACTOR))
 free_blocks=$(stat -f -c '%a' "$testdir")
-blksz=$(_get_block_size $testdir)
+blksz=$(_get_file_block_size $testdir)
 space_avail=$((free_blocks * blksz))
 calc_space()
 {


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 5/9] xfs/327: fix inode reflink flag checking
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
                   ` (3 preceding siblings ...)
  2020-10-27 19:01 ` [PATCH 4/9] various: replace _get_block_size with _get_file_block_size when needed Darrick J. Wong
@ 2020-10-27 19:02 ` Darrick J. Wong
  2020-10-28  7:42   ` Christoph Hellwig
  2020-10-27 19:02 ` [PATCH 6/9] xfs/27[26]: force realtime on or off as needed Darrick J. Wong
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:02 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

This is a regression test that tried to make sure that repair correctly
clears the XFS inode reflink flag when it detects files that do not
share any blocks.  However, it does this checking by looking at the
(online) lsattr output.  This worked fine during development when we
exposed the reflink state via the stat ioctls, but that has long since
been removed.  Now the only way to check is via xfs_db, so switch it to
use that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/327     |   18 ++++++++++++++++--
 tests/xfs/327.out |   13 +++++++------
 2 files changed, 23 insertions(+), 8 deletions(-)


diff --git a/tests/xfs/327 b/tests/xfs/327
index 017e2a84..7a14798f 100755
--- a/tests/xfs/327
+++ b/tests/xfs/327
@@ -49,10 +49,21 @@ seq 1 $nr | while read i; do
 done
 sync
 
+ino_0=$(stat -c '%i' $SCRATCH_MNT/file.0)
+ino_64=$(stat -c '%i' $SCRATCH_MNT/file.64)
+ino_128=$(stat -c '%i' $SCRATCH_MNT/file.128)
+
+echo "Check filesystem"
+_scratch_unmount
+_scratch_xfs_db -c "inode $ino_0" -c print \
+	-c "inode $ino_64" -c print \
+	-c "inode $ino_128" -c print | grep reflink | sed -e 's/^v[0-9]*/vX/g'
+_scratch_mount
+
 echo "Check files"
 for i in 0 $((nr / 2)) $nr; do
 	md5sum $SCRATCH_MNT/file.$i | _filter_scratch
-	$XFS_IO_PROG -c 'lsattr -v' $SCRATCH_MNT/file.$i | _filter_scratch
+	$XFS_IO_PROG -c 'lsattr -v' $SCRATCH_MNT/file.$i >> $seqres.full
 done
 
 echo "CoW all files"
@@ -63,12 +74,15 @@ done
 echo "Repair filesystem"
 _scratch_unmount
 _repair_scratch_fs >> $seqres.full
+_scratch_xfs_db -c "inode $ino_0" -c print \
+	-c "inode $ino_64" -c print \
+	-c "inode $ino_128" -c print | grep reflink | sed -e 's/^v[0-9]*/vX/g'
 _scratch_mount
 
 echo "Check files again"
 for i in 0 $((nr / 2)) $nr; do
 	md5sum $SCRATCH_MNT/file.$i | _filter_scratch
-	$XFS_IO_PROG -c 'lsattr -v' $SCRATCH_MNT/file.$i | _filter_scratch
+	$XFS_IO_PROG -c 'lsattr -v' $SCRATCH_MNT/file.$i >> $seqres.full
 done
 
 echo "Done"
diff --git a/tests/xfs/327.out b/tests/xfs/327.out
index 5b3cba21..0e204205 100644
--- a/tests/xfs/327.out
+++ b/tests/xfs/327.out
@@ -1,20 +1,21 @@
 QA output created by 327
 Format filesystem
 Create files
+Check filesystem
+vX.reflink = 1
+vX.reflink = 1
+vX.reflink = 1
 Check files
 8fa14cdd754f91cc6554c9e71929cce7  SCRATCH_MNT/file.0
-[] SCRATCH_MNT/file.0 
 8fa14cdd754f91cc6554c9e71929cce7  SCRATCH_MNT/file.64
-[] SCRATCH_MNT/file.64 
 8fa14cdd754f91cc6554c9e71929cce7  SCRATCH_MNT/file.128
-[] SCRATCH_MNT/file.128 
 CoW all files
 Repair filesystem
+vX.reflink = 0
+vX.reflink = 0
+vX.reflink = 0
 Check files again
 8fa14cdd754f91cc6554c9e71929cce7  SCRATCH_MNT/file.0
-[] SCRATCH_MNT/file.0 
 0f17fd72b7bbf5bda0ff433e6d1fc118  SCRATCH_MNT/file.64
-[] SCRATCH_MNT/file.64 
 0f17fd72b7bbf5bda0ff433e6d1fc118  SCRATCH_MNT/file.128
-[] SCRATCH_MNT/file.128 
 Done


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 6/9] xfs/27[26]: force realtime on or off as needed
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
                   ` (4 preceding siblings ...)
  2020-10-27 19:02 ` [PATCH 5/9] xfs/327: fix inode reflink flag checking Darrick J. Wong
@ 2020-10-27 19:02 ` Darrick J. Wong
  2020-10-28  7:43   ` Christoph Hellwig
  2020-10-27 19:02 ` [PATCH 7/9] xfs/030: hide the btree levels check errors Darrick J. Wong
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:02 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Certain tests have certain requirements where the realtime parameters
are concerned.  Fix them all.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/272 |    3 +++
 tests/xfs/276 |    8 +++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)


diff --git a/tests/xfs/272 b/tests/xfs/272
index a9baf2e7..6c0fede5 100755
--- a/tests/xfs/272
+++ b/tests/xfs/272
@@ -37,6 +37,9 @@ echo "Format and mount"
 _scratch_mkfs > "$seqres.full" 2>&1
 _scratch_mount
 
+# Make sure everything is on the data device
+$XFS_IO_PROG -c 'chattr -t' $SCRATCH_MNT
+
 _pwrite_byte 0x80 0 737373 $SCRATCH_MNT/urk >> $seqres.full
 sync
 $here/src/punch-alternating $SCRATCH_MNT/urk >> $seqres.full
diff --git a/tests/xfs/276 b/tests/xfs/276
index 8b7e1de5..6e2b2fb4 100755
--- a/tests/xfs/276
+++ b/tests/xfs/276
@@ -35,9 +35,15 @@ _require_test_program "punch-alternating"
 rm -f "$seqres.full"
 
 echo "Format and mount"
-_scratch_mkfs > "$seqres.full" 2>&1
+_scratch_mkfs | _filter_mkfs 2> "$tmp.mkfs"
+. $tmp.mkfs
+cat "$tmp.mkfs" > $seqres.full
 _scratch_mount
 
+# Don't let the rt extent size perturb the fsmap output with unwritten
+# extents in places we don't expect them
+test $rtextsz -eq $dbsize || _notrun "Skipping test due to rtextsize > 1 fsb"
+
 $XFS_IO_PROG -f -R -c 'pwrite -S 0x80 0 737373' $SCRATCH_MNT/urk >> $seqres.full
 sync
 $here/src/punch-alternating $SCRATCH_MNT/urk >> $seqres.full


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 7/9] xfs/030: hide the btree levels check errors
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
                   ` (5 preceding siblings ...)
  2020-10-27 19:02 ` [PATCH 6/9] xfs/27[26]: force realtime on or off as needed Darrick J. Wong
@ 2020-10-27 19:02 ` Darrick J. Wong
  2020-10-28  7:43   ` Christoph Hellwig
  2020-10-27 19:02 ` [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup Darrick J. Wong
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:02 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Newer versions of xfsprogs now complain if the rmap and refcount btree
levels are insane, so hide that error from the golden output.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 common/repair |    1 +
 tests/xfs/030 |    1 +
 2 files changed, 2 insertions(+)


diff --git a/common/repair b/common/repair
index 6668dd51..0ae7464e 100644
--- a/common/repair
+++ b/common/repair
@@ -45,6 +45,7 @@ s/(bad length -{0,1}\d+ for ag. 0, should be) (\d+)/\1 LENGTH/;
 s/(bad length # -{0,1}\d+ for ag. 0, should be) (\d+)/\1 LENGTH/;
 s/(bad agbno) (\d+)/\1 AGBNO/g;
 s/(max =) (\d+)/\1 MAX/g;
+s/(bad levels) (\d+) (for [a-z]* root, agno) (\d+)/\1 LEVELS \3 AGNO/;
 # for root inos
 s/(on inode) (\d+)/\1 INO/g;
 s/(imap claims a free inode) (\d+)/\1 INO/;
diff --git a/tests/xfs/030 b/tests/xfs/030
index b6590913..04440f9c 100755
--- a/tests/xfs/030
+++ b/tests/xfs/030
@@ -45,6 +45,7 @@ _check_ag()
 			    -e '/^agf has bad CRC/d' \
 			    -e '/^agi has bad CRC/d' \
 			    -e '/^Missing reverse-mapping record.*/d' \
+			    -e '/^bad levels LEVELS for [a-z]* root.*/d' \
 			    -e '/^unknown block state, ag AGNO, block.*/d'
 	done
 }


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
                   ` (6 preceding siblings ...)
  2020-10-27 19:02 ` [PATCH 7/9] xfs/030: hide the btree levels check errors Darrick J. Wong
@ 2020-10-27 19:02 ` Darrick J. Wong
  2020-10-28  7:44   ` Christoph Hellwig
  2020-11-02 21:37   ` Darrick J. Wong
  2020-10-27 19:02 ` [PATCH 9/9] common/populate: make sure _scratch_xfs_populate puts its files on the data device Darrick J. Wong
  2020-11-08 10:05 ` [PATCH 0/9] xfstests: random fixes Eryu Guan
  9 siblings, 2 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:02 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

If systemd is available, run each test in its own temporary systemd
scope.  This enables the test harness to forcibly clean up all of the
test's child processes (if it does not do so itself) so that we can move
into the post-test unmount and check cleanly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 check |   21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)


diff --git a/check b/check
index 5072dd82..47c72fa2 100755
--- a/check
+++ b/check
@@ -521,6 +521,11 @@ _expunge_test()
 	return 0
 }
 
+# Can we run systemd scopes?
+HAVE_SYSTEMD_SCOPES=
+systemd-run --quiet --unit "fstests-check" --scope bash -c "exit 77" &> /dev/null
+test $? -eq 77 && HAVE_SYSTEMD_SCOPES=yes
+
 # Make the check script unattractive to the OOM killer...
 OOM_SCORE_ADJ="/proc/self/oom_score_adj"
 test -w ${OOM_SCORE_ADJ} && echo -1000 > ${OOM_SCORE_ADJ}
@@ -528,8 +533,22 @@ test -w ${OOM_SCORE_ADJ} && echo -1000 > ${OOM_SCORE_ADJ}
 # ...and make the tests themselves somewhat more attractive to it, so that if
 # the system runs out of memory it'll be the test that gets killed and not the
 # test framework.
+#
+# If systemd is available, run the entire test script in a scope so that we can
+# kill all subprocesses of the test if it fails to clean up after itself.  This
+# is essential for ensuring that the post-test unmount succeeds.
 _run_seq() {
-	bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq"
+	local cmd=(bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq")
+
+	if [ -n "${HAVE_SYSTEMD_SCOPES}" ]; then
+		local unit="$(systemd-escape "fs$seq").scope"
+		systemd-run --quiet --unit "${unit}" --scope "${cmd[@]}"
+		res=$?
+		systemctl stop "${unit}" &> /dev/null
+		return "${res}"
+	else
+		"${cmd[@]}"
+	fi
 }
 
 _detect_kmemleak


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 9/9] common/populate: make sure _scratch_xfs_populate puts its files on the data device
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
                   ` (7 preceding siblings ...)
  2020-10-27 19:02 ` [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup Darrick J. Wong
@ 2020-10-27 19:02 ` Darrick J. Wong
  2020-10-28  7:44   ` Christoph Hellwig
  2020-11-08 10:05 ` [PATCH 0/9] xfstests: random fixes Eryu Guan
  9 siblings, 1 reply; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-27 19:02 UTC (permalink / raw)
  To: darrick.wong, guaneryu; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Make sure that _scratch_xfs_populate always installs its files on the
data device even if the test config selects rt by default.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 common/populate |    5 +++++
 1 file changed, 5 insertions(+)


diff --git a/common/populate b/common/populate
index 0b036051..75e6f499 100644
--- a/common/populate
+++ b/common/populate
@@ -154,6 +154,10 @@ _scratch_xfs_populate() {
 
 	_populate_xfs_qmount_option
 	_scratch_mount
+
+	# Make sure the new files always end up on the data device
+	$XFS_IO_PROG -c 'chattr -t' $SCRATCH_MNT
+
 	blksz="$(stat -f -c '%s' "${SCRATCH_MNT}")"
 	dblksz="$($XFS_INFO_PROG "${SCRATCH_MNT}" | grep naming.*bsize | sed -e 's/^.*bsize=//g' -e 's/\([0-9]*\).*$/\1/g')"
 	crc="$($XFS_INFO_PROG "${SCRATCH_MNT}" | grep crc= | sed -e 's/^.*crc=//g' -e 's/\([0-9]*\).*$/\1/g')"
@@ -306,6 +310,7 @@ _scratch_xfs_populate() {
 	if [ $is_rmapbt -gt 0 ] && [ $is_rt -gt 0 ]; then
 		echo "+ rtrmapbt btree"
 		nr="$((blksz * 2 / 32))"
+		$XFS_IO_PROG -R -f -c 'truncate 0' "${SCRATCH_MNT}/RTRMAPBT"
 		__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/RTRMAPBT"
 	fi
 


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/9] common: extract rt extent size for _get_file_block_size
  2020-10-27 19:01 ` [PATCH 1/9] common: extract rt extent size for _get_file_block_size Darrick J. Wong
@ 2020-10-28  7:41   ` Christoph Hellwig
  2020-10-28 22:24     ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:41 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests, linux-ext4

On Tue, Oct 27, 2020 at 12:01:35PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> _get_file_block_size is intended to return the size (in bytes) of the
> fundamental allocation unit for a file.  This is required for remapping
> operations like fallocate and reflink, which can only operate on
> allocation units.  Since the XFS realtime volume can be configure for
> allocation units larger than 1 fs block, we need to factor that in here.

Should this also cover the ext4 bigalloc clusters?  Or do they not
matter for fallocate?

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  common/rc  |   13 ++++++++++---
>  common/xfs |   20 ++++++++++++++++++++
>  2 files changed, 30 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/common/rc b/common/rc
> index 27a27ea3..41f93047 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -3974,11 +3974,18 @@ _get_file_block_size()
>  		echo "Missing mount point argument for _get_file_block_size"
>  		exit 1
>  	fi
> -	if [ "$FSTYP" = "ocfs2" ]; then
> +
> +	case "$FSTYP" in
> +	"ocfs2")
>  		stat -c '%o' $1
> -	else
> +		;;
> +	"xfs")
> +		_xfs_get_file_block_size $1
> +		;;
> +	*)
>  		_get_block_size $1
> -	fi
> +		;;
> +	esac
>  }
>  
>  # Get the minimum block size of an fs.
> diff --git a/common/xfs b/common/xfs
> index 79dab058..3f5c14ba 100644
> --- a/common/xfs
> +++ b/common/xfs
> @@ -174,6 +174,26 @@ _scratch_mkfs_xfs()
>  	return $mkfs_status
>  }
>  
> +# Get the size of an allocation unit of a file.  Normally this is just the
> +# block size of the file, but for realtime files, this is the realtime extent
> +# size.
> +_xfs_get_file_block_size()
> +{
> +	local path="$1"
> +
> +	if ! ($XFS_IO_PROG -c "stat -v" "$path" 2>&1 | egrep -q '(rt-inherit|realtime)'); then
> +		_get_block_size "$path"
> +		return
> +	fi
> +
> +	# Otherwise, call xfs_info until we find a mount point or the root.
> +	path="$(readlink -m "$path")"
> +	while ! $XFS_INFO_PROG "$path" &>/dev/null && [ "$path" != "/" ]; do
> +		path="$(dirname "$path")"
> +	done
> +	$XFS_INFO_PROG "$path" | grep realtime | sed -e 's/^.*extsz=\([0-9]*\).*$/\1/g'
> +}
> +
>  # xfs_check script is planned to be deprecated. But, we want to
>  # be able to invoke "xfs_check" behavior in xfstests in order to
>  # maintain the current verification levels.
> 
---end quoted text---

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 2/9] xfs/520: disable external devices
  2020-10-27 19:01 ` [PATCH 2/9] xfs/520: disable external devices Darrick J. Wong
@ 2020-10-28  7:41   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:41 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:01:42PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> This is a regression test for a specific bug that requires a specific
> configuration of the data device.  Realtime volumes and external logs
> don't affect the efficacy of the test, but the test can fail mkfs if the
> realtime device is very large.
> 
> Therefore, unset USE_EXTERNAL so that we always run this regression
> test, even if the tester enabled realtime.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/9] xfs/341: fix test when rextsize > 1
  2020-10-27 19:01 ` [PATCH 3/9] xfs/341: fix test when rextsize > 1 Darrick J. Wong
@ 2020-10-28  7:41   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:41 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:01:48PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Fix this test so that it works when the rt extent size is larger than
> single block.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 4/9] various: replace _get_block_size with _get_file_block_size when needed
  2020-10-27 19:01 ` [PATCH 4/9] various: replace _get_block_size with _get_file_block_size when needed Darrick J. Wong
@ 2020-10-28  7:42   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:42 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:01:56PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> The _get_file_block_size helper was added so that tests could find out
> the size of a fundamental unit of allocation for a given file, which is
> necessary for certain fallocate and clonerange tests.
> 
> On certain filesystem configurations (ocfs2 with clusters, xfs with a
> large rt extent size), this is /not/ the same as the filesystem block
> size, and these tests will fail.  Fix them to use the correct helper.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 5/9] xfs/327: fix inode reflink flag checking
  2020-10-27 19:02 ` [PATCH 5/9] xfs/327: fix inode reflink flag checking Darrick J. Wong
@ 2020-10-28  7:42   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:42 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:02:02PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> This is a regression test that tried to make sure that repair correctly
> clears the XFS inode reflink flag when it detects files that do not
> share any blocks.  However, it does this checking by looking at the
> (online) lsattr output.  This worked fine during development when we
> exposed the reflink state via the stat ioctls, but that has long since
> been removed.  Now the only way to check is via xfs_db, so switch it to
> use that.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 6/9] xfs/27[26]: force realtime on or off as needed
  2020-10-27 19:02 ` [PATCH 6/9] xfs/27[26]: force realtime on or off as needed Darrick J. Wong
@ 2020-10-28  7:43   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:02:08PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Certain tests have certain requirements where the realtime parameters
> are concerned.  Fix them all.

Certain tests sound like a lot for exactly two flags :)

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 7/9] xfs/030: hide the btree levels check errors
  2020-10-27 19:02 ` [PATCH 7/9] xfs/030: hide the btree levels check errors Darrick J. Wong
@ 2020-10-28  7:43   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:02:15PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Newer versions of xfsprogs now complain if the rmap and refcount btree
> levels are insane, so hide that error from the golden output.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup
  2020-10-27 19:02 ` [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup Darrick J. Wong
@ 2020-10-28  7:44   ` Christoph Hellwig
  2020-10-28 16:58     ` Darrick J. Wong
  2020-10-29  1:04     ` Darrick J. Wong
  2020-11-02 21:37   ` Darrick J. Wong
  1 sibling, 2 replies; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:02:21PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If systemd is available, run each test in its own temporary systemd
> scope.  This enables the test harness to forcibly clean up all of the
> test's child processes (if it does not do so itself) so that we can move
> into the post-test unmount and check cleanly.

Can you explain what this mean in more detail?  Most importantly what
problems it fixes.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 9/9] common/populate: make sure _scratch_xfs_populate puts its files on the data device
  2020-10-27 19:02 ` [PATCH 9/9] common/populate: make sure _scratch_xfs_populate puts its files on the data device Darrick J. Wong
@ 2020-10-28  7:44   ` Christoph Hellwig
  2020-10-28 16:27     ` Darrick J. Wong
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2020-10-28  7:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:02:27PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Make sure that _scratch_xfs_populate always installs its files on the
> data device even if the test config selects rt by default.

Can you explain why this is important (preferably also in a comment in
the source)?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 9/9] common/populate: make sure _scratch_xfs_populate puts its files on the data device
  2020-10-28  7:44   ` Christoph Hellwig
@ 2020-10-28 16:27     ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-28 16:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: guaneryu, linux-xfs, fstests

On Wed, Oct 28, 2020 at 07:44:46AM +0000, Christoph Hellwig wrote:
> On Tue, Oct 27, 2020 at 12:02:27PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Make sure that _scratch_xfs_populate always installs its files on the
> > data device even if the test config selects rt by default.
> 
> Can you explain why this is important (preferably also in a comment in
> the source)?

Ok.  I'll change it to this:

	# We cannot directly force the filesystem to create the metadata
	# structures we want; we can only achieve this indirectly by carefully
	# crafting files and a directory tree.  Therefore, we must have exact
	# control over the layout and device selection of all files created.
	# Clear the rtinherit flag on the root directory so that files are
	# always created on the data volume regardless of MKFS_OPTIONS.
	# We can set the realtime flag when needed.
	$XFS_IO_PROG -c 'chattr -t' $SCRATCH_MNT

--D

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup
  2020-10-28  7:44   ` Christoph Hellwig
@ 2020-10-28 16:58     ` Darrick J. Wong
  2020-10-29  1:04     ` Darrick J. Wong
  1 sibling, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-28 16:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: guaneryu, linux-xfs, fstests

On Wed, Oct 28, 2020 at 07:44:07AM +0000, Christoph Hellwig wrote:
> On Tue, Oct 27, 2020 at 12:02:21PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If systemd is available, run each test in its own temporary systemd
> > scope.  This enables the test harness to forcibly clean up all of the
> > test's child processes (if it does not do so itself) so that we can move
> > into the post-test unmount and check cleanly.
> 
> Can you explain what this mean in more detail?  Most importantly what
> problems it fixes.

I'll answer these in reverse order. :)

I frequently run fstests in "low" memory situations (2GB!) to force the
kernel to do interesting things.  There are a few tests like generic/224
and generic/561 that put processes in the background and occasionally
trigger the OOM killer.  Most of the time the OOM killer correctly
shoots down fsstress or duperemove, but once in a while it's stupid
enough to shoot down the test control process (i.e. tests/generic/224)
instead.  fsstress is still running in the background, and the one
process that knew about that is dead.

When the control process dies, ./check moves on to the post-test fsck,
which fails because fsstress is still running and we can't unmount.
After fsck fails, ./check moves on to the next test, which fails because
fsstress is /still/ writing to the filesystem and we can't unmount or
format.

The end result is that that one OOM kill causes cascading test failures,
and I have to re-start fstests to see if I get a clean(er) run.  This is
frustrating in the -rc1 days, where I more frequently observe problems
with memory reclaim and OOM kills.  (Note: those problems are usually
gone by -rc3.)

So, the solution I present in this patch is to teach ./check to try to
run the test script in a systemd scope.  If that succeeds, ./check will
tell systemd to kill the scope when the test script exits and returns
control to ./check.  Concretely, this means that systemd creates a new
cgroup, stuffs the processes in that cgroup, and when we kill the scope,
systemd kills all the processes in that cgroup and deletes the cgroup.

The end result is that fstests now has an easy way to ensure that /all/
child processes of a test are dead before we try to unmount the test and
scratch devices.  I've designed this to be optional, because not
everyone does or wants or likes to run systemd, but it makes QA easier.

Hmm, this might make a better commit log.  I'll excerpt this into the
patch message.

--D

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/9] common: extract rt extent size for _get_file_block_size
  2020-10-28  7:41   ` Christoph Hellwig
@ 2020-10-28 22:24     ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-28 22:24 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: guaneryu, linux-xfs, fstests, linux-ext4

On Wed, Oct 28, 2020 at 07:41:19AM +0000, Christoph Hellwig wrote:
> On Tue, Oct 27, 2020 at 12:01:35PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > _get_file_block_size is intended to return the size (in bytes) of the
> > fundamental allocation unit for a file.  This is required for remapping
> > operations like fallocate and reflink, which can only operate on
> > allocation units.  Since the XFS realtime volume can be configure for
> > allocation units larger than 1 fs block, we need to factor that in here.
> 
> Should this also cover the ext4 bigalloc clusters?  Or do they not
> matter for fallocate?

They don't matter for fallocate, because ext4 doesn't require clusters
to be fully allocated like ocfs2 and xfs do.

This means that all the bigalloc codepaths have this horrible "implied
cluster allocation" thing sprinkled everywhere where to map in a single
block you have to scan left and right in the extent map to see if anyone
already mapped something.  And even more strangely, extent tree blocks
don't do this, so it seems to waste the entire cluster past the first fs
block.

But I guess it /does/ mean that _get_file_block_size doesn't have to do
anything special for ext*.

--D

> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  common/rc  |   13 ++++++++++---
> >  common/xfs |   20 ++++++++++++++++++++
> >  2 files changed, 30 insertions(+), 3 deletions(-)
> > 
> > 
> > diff --git a/common/rc b/common/rc
> > index 27a27ea3..41f93047 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -3974,11 +3974,18 @@ _get_file_block_size()
> >  		echo "Missing mount point argument for _get_file_block_size"
> >  		exit 1
> >  	fi
> > -	if [ "$FSTYP" = "ocfs2" ]; then
> > +
> > +	case "$FSTYP" in
> > +	"ocfs2")
> >  		stat -c '%o' $1
> > -	else
> > +		;;
> > +	"xfs")
> > +		_xfs_get_file_block_size $1
> > +		;;
> > +	*)
> >  		_get_block_size $1
> > -	fi
> > +		;;
> > +	esac
> >  }
> >  
> >  # Get the minimum block size of an fs.
> > diff --git a/common/xfs b/common/xfs
> > index 79dab058..3f5c14ba 100644
> > --- a/common/xfs
> > +++ b/common/xfs
> > @@ -174,6 +174,26 @@ _scratch_mkfs_xfs()
> >  	return $mkfs_status
> >  }
> >  
> > +# Get the size of an allocation unit of a file.  Normally this is just the
> > +# block size of the file, but for realtime files, this is the realtime extent
> > +# size.
> > +_xfs_get_file_block_size()
> > +{
> > +	local path="$1"
> > +
> > +	if ! ($XFS_IO_PROG -c "stat -v" "$path" 2>&1 | egrep -q '(rt-inherit|realtime)'); then
> > +		_get_block_size "$path"
> > +		return
> > +	fi
> > +
> > +	# Otherwise, call xfs_info until we find a mount point or the root.
> > +	path="$(readlink -m "$path")"
> > +	while ! $XFS_INFO_PROG "$path" &>/dev/null && [ "$path" != "/" ]; do
> > +		path="$(dirname "$path")"
> > +	done
> > +	$XFS_INFO_PROG "$path" | grep realtime | sed -e 's/^.*extsz=\([0-9]*\).*$/\1/g'
> > +}
> > +
> >  # xfs_check script is planned to be deprecated. But, we want to
> >  # be able to invoke "xfs_check" behavior in xfstests in order to
> >  # maintain the current verification levels.
> > 
> ---end quoted text---

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup
  2020-10-28  7:44   ` Christoph Hellwig
  2020-10-28 16:58     ` Darrick J. Wong
@ 2020-10-29  1:04     ` Darrick J. Wong
  1 sibling, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-10-29  1:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: guaneryu, linux-xfs, fstests

[resend; I can't keep track of which messages actually got NOSEND
warnings and which ones just dropped off...]

On Wed, Oct 28, 2020 at 07:44:07AM +0000, Christoph Hellwig wrote:
> On Tue, Oct 27, 2020 at 12:02:21PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If systemd is available, run each test in its own temporary systemd
> > scope.  This enables the test harness to forcibly clean up all of the
> > test's child processes (if it does not do so itself) so that we can move
> > into the post-test unmount and check cleanly.
> 
> Can you explain what this mean in more detail?  Most importantly what
> problems it fixes.

I'll answer these in reverse order. :)

I frequently run fstests in "low" memory situations (2GB!) to force the
kernel to do interesting things.  There are a few tests like generic/224
and generic/561 that put processes in the background and occasionally
trigger the OOM killer.  Most of the time the OOM killer correctly
shoots down fsstress or duperemove, but once in a while it's stupid
enough to shoot down the test control process (i.e. tests/generic/224)
instead.  fsstress is still running in the background, and the one
process that knew about that is dead.

When the control process dies, ./check moves on to the post-test fsck,
which fails because fsstress is still running and we can't unmount.
After fsck fails, ./check moves on to the next test, which fails because
fsstress is /still/ writing to the filesystem and we can't unmount or
format.

The end result is that that one OOM kill causes cascading test failures,
and I have to re-start fstests to see if I get a clean(er) run.  This is
frustrating in the -rc1 days, where I more frequently observe problems
with memory reclaim and OOM kills.  (Note: those problems are usually
gone by -rc3.)

So, the solution I present in this patch is to teach ./check to try to
run the test script in a systemd scope.  If that succeeds, ./check will
tell systemd to kill the scope when the test script exits and returns
control to ./check.  Concretely, this means that systemd creates a new
cgroup, stuffs the processes in that cgroup, and when we kill the scope,
systemd kills all the processes in that cgroup and deletes the cgroup.

The end result is that fstests now has an easy way to ensure that /all/
child processes of a test are dead before we try to unmount the test and
scratch devices.  I've designed this to be optional, because not
everyone does or wants or likes to run systemd, but it makes QA easier.

Hmm, this might make a better commit log.  I'll excerpt this into the
patch message.

--D

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup
  2020-10-27 19:02 ` [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup Darrick J. Wong
  2020-10-28  7:44   ` Christoph Hellwig
@ 2020-11-02 21:37   ` Darrick J. Wong
  1 sibling, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-11-02 21:37 UTC (permalink / raw)
  To: guaneryu; +Cc: linux-xfs, fstests

On Tue, Oct 27, 2020 at 12:02:21PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If systemd is available, run each test in its own temporary systemd
> scope.  This enables the test harness to forcibly clean up all of the
> test's child processes (if it does not do so itself) so that we can move
> into the post-test unmount and check cleanly.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  check |   21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/check b/check
> index 5072dd82..47c72fa2 100755
> --- a/check
> +++ b/check
> @@ -521,6 +521,11 @@ _expunge_test()
>  	return 0
>  }
>  
> +# Can we run systemd scopes?
> +HAVE_SYSTEMD_SCOPES=
> +systemd-run --quiet --unit "fstests-check" --scope bash -c "exit 77" &> /dev/null
> +test $? -eq 77 && HAVE_SYSTEMD_SCOPES=yes
> +
>  # Make the check script unattractive to the OOM killer...
>  OOM_SCORE_ADJ="/proc/self/oom_score_adj"
>  test -w ${OOM_SCORE_ADJ} && echo -1000 > ${OOM_SCORE_ADJ}
> @@ -528,8 +533,22 @@ test -w ${OOM_SCORE_ADJ} && echo -1000 > ${OOM_SCORE_ADJ}
>  # ...and make the tests themselves somewhat more attractive to it, so that if
>  # the system runs out of memory it'll be the test that gets killed and not the
>  # test framework.
> +#
> +# If systemd is available, run the entire test script in a scope so that we can
> +# kill all subprocesses of the test if it fails to clean up after itself.  This
> +# is essential for ensuring that the post-test unmount succeeds.
>  _run_seq() {
> -	bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq"
> +	local cmd=(bash -c "test -w ${OOM_SCORE_ADJ} && echo 250 > ${OOM_SCORE_ADJ}; exec ./$seq")
> +
> +	if [ -n "${HAVE_SYSTEMD_SCOPES}" ]; then
> +		local unit="$(systemd-escape "fs$seq").scope"
> +		systemd-run --quiet --unit "${unit}" --scope "${cmd[@]}"

/me discovers that systemd preserves failed transient scopes (where
"failed" means the processes get killed, not that they return nonzero),
so this patch has to reset-failed the scope in case the user calls
fstests before rebooting.

Hence, self NAK.

--D

> +		res=$?
> +		systemctl stop "${unit}" &> /dev/null
> +		return "${res}"
> +	else
> +		"${cmd[@]}"
> +	fi
>  }
>  
>  _detect_kmemleak
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] xfstests: random fixes
  2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
                   ` (8 preceding siblings ...)
  2020-10-27 19:02 ` [PATCH 9/9] common/populate: make sure _scratch_xfs_populate puts its files on the data device Darrick J. Wong
@ 2020-11-08 10:05 ` Eryu Guan
  2020-11-08 17:23   ` Darrick J. Wong
  9 siblings, 1 reply; 26+ messages in thread
From: Eryu Guan @ 2020-11-08 10:05 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

Hi Darric,

On Tue, Oct 27, 2020 at 12:01:29PM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> This series contains random fixes to fstests.

I applied patch 2-7 in this patchset, which were reviewed by Christoph,
and seems other patches need rework.

And regarding to your other patchsets, I'm a bit lost, it seems some of
them need rework as well. So I'd wait for your refresh version :)

Thanks,
Eryu

> 
> If you're going to start using this mess, you probably ought to just
> pull from my git trees, which are linked below.
> 
> This is an extraordinary way to destroy everything.  Enjoy!
> Comments and questions are, as always, welcome.
> 
> --D
> 
> kernel git tree:
> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=random-fixes
> 
> xfsprogs git tree:
> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=random-fixes
> 
> fstests git tree:
> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=random-fixes
> ---
>  check             |   21 ++++++++++++++++++++-
>  common/populate   |    5 +++++
>  common/rc         |   13 ++++++++++---
>  common/repair     |    1 +
>  common/xfs        |   20 ++++++++++++++++++++
>  tests/xfs/030     |    1 +
>  tests/xfs/272     |    3 +++
>  tests/xfs/276     |    8 +++++++-
>  tests/xfs/327     |   18 ++++++++++++++++--
>  tests/xfs/327.out |   13 +++++++------
>  tests/xfs/328     |    2 +-
>  tests/xfs/341     |    8 +++++---
>  tests/xfs/520     |    3 +++
>  13 files changed, 99 insertions(+), 17 deletions(-)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 0/9] xfstests: random fixes
  2020-11-08 10:05 ` [PATCH 0/9] xfstests: random fixes Eryu Guan
@ 2020-11-08 17:23   ` Darrick J. Wong
  0 siblings, 0 replies; 26+ messages in thread
From: Darrick J. Wong @ 2020-11-08 17:23 UTC (permalink / raw)
  To: Eryu Guan; +Cc: guaneryu, linux-xfs, fstests

On Sun, Nov 08, 2020 at 06:05:32PM +0800, Eryu Guan wrote:
> Hi Darric,
> 
> On Tue, Oct 27, 2020 at 12:01:29PM -0700, Darrick J. Wong wrote:
> > Hi all,
> > 
> > This series contains random fixes to fstests.
> 
> I applied patch 2-7 in this patchset, which were reviewed by Christoph,
> and seems other patches need rework.
> 
> And regarding to your other patchsets, I'm a bit lost, it seems some of
> them need rework as well. So I'd wait for your refresh version :)

Yes.  I'll try to send a revised series this week, though I got bogged
down in a bug hunt.

--D

> Thanks,
> Eryu
> 
> > 
> > If you're going to start using this mess, you probably ought to just
> > pull from my git trees, which are linked below.
> > 
> > This is an extraordinary way to destroy everything.  Enjoy!
> > Comments and questions are, as always, welcome.
> > 
> > --D
> > 
> > kernel git tree:
> > https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=random-fixes
> > 
> > xfsprogs git tree:
> > https://git.kernel.org/cgit/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=random-fixes
> > 
> > fstests git tree:
> > https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=random-fixes
> > ---
> >  check             |   21 ++++++++++++++++++++-
> >  common/populate   |    5 +++++
> >  common/rc         |   13 ++++++++++---
> >  common/repair     |    1 +
> >  common/xfs        |   20 ++++++++++++++++++++
> >  tests/xfs/030     |    1 +
> >  tests/xfs/272     |    3 +++
> >  tests/xfs/276     |    8 +++++++-
> >  tests/xfs/327     |   18 ++++++++++++++++--
> >  tests/xfs/327.out |   13 +++++++------
> >  tests/xfs/328     |    2 +-
> >  tests/xfs/341     |    8 +++++---
> >  tests/xfs/520     |    3 +++
> >  13 files changed, 99 insertions(+), 17 deletions(-)

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2020-11-08 17:23 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-27 19:01 [PATCH 0/9] xfstests: random fixes Darrick J. Wong
2020-10-27 19:01 ` [PATCH 1/9] common: extract rt extent size for _get_file_block_size Darrick J. Wong
2020-10-28  7:41   ` Christoph Hellwig
2020-10-28 22:24     ` Darrick J. Wong
2020-10-27 19:01 ` [PATCH 2/9] xfs/520: disable external devices Darrick J. Wong
2020-10-28  7:41   ` Christoph Hellwig
2020-10-27 19:01 ` [PATCH 3/9] xfs/341: fix test when rextsize > 1 Darrick J. Wong
2020-10-28  7:41   ` Christoph Hellwig
2020-10-27 19:01 ` [PATCH 4/9] various: replace _get_block_size with _get_file_block_size when needed Darrick J. Wong
2020-10-28  7:42   ` Christoph Hellwig
2020-10-27 19:02 ` [PATCH 5/9] xfs/327: fix inode reflink flag checking Darrick J. Wong
2020-10-28  7:42   ` Christoph Hellwig
2020-10-27 19:02 ` [PATCH 6/9] xfs/27[26]: force realtime on or off as needed Darrick J. Wong
2020-10-28  7:43   ` Christoph Hellwig
2020-10-27 19:02 ` [PATCH 7/9] xfs/030: hide the btree levels check errors Darrick J. Wong
2020-10-28  7:43   ` Christoph Hellwig
2020-10-27 19:02 ` [PATCH 8/9] check: run tests in a systemd scope for mandatory test cleanup Darrick J. Wong
2020-10-28  7:44   ` Christoph Hellwig
2020-10-28 16:58     ` Darrick J. Wong
2020-10-29  1:04     ` Darrick J. Wong
2020-11-02 21:37   ` Darrick J. Wong
2020-10-27 19:02 ` [PATCH 9/9] common/populate: make sure _scratch_xfs_populate puts its files on the data device Darrick J. Wong
2020-10-28  7:44   ` Christoph Hellwig
2020-10-28 16:27     ` Darrick J. Wong
2020-11-08 10:05 ` [PATCH 0/9] xfstests: random fixes Eryu Guan
2020-11-08 17:23   ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).