fstests.vger.kernel.org archive mirror
* [PATCH 0/4] fstests: random fixes
@ 2019-10-08  1:03 Darrick J. Wong
  2019-10-08  1:03 ` [PATCH 1/4] xfs/196: check for delalloc blocks after pwrite Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Darrick J. Wong @ 2019-10-08  1:03 UTC (permalink / raw)
  To: guaneryu, darrick.wong; +Cc: linux-xfs, fstests

Hi all,

Fix various bugs in the test suite that cause test failures on XFS in DAX mode.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=random-fixes


* [PATCH 1/4] xfs/196: check for delalloc blocks after pwrite
  2019-10-08  1:03 [PATCH 0/4] fstests: random fixes Darrick J. Wong
@ 2019-10-08  1:03 ` Darrick J. Wong
  2019-10-08  7:01   ` Christoph Hellwig
  2019-10-08  1:03 ` [PATCH 2/4] xfs/{088, 089, 091}: redirect stderr when writing to corrupt fs Darrick J. Wong
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2019-10-08  1:03 UTC (permalink / raw)
  To: guaneryu, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

This test depends on the pwrite creating delalloc blocks, which doesn't
happen if the scratch fs is mounted in dax mode (or has an extent size
hint applied).  Therefore, check for delalloc blocks and _notrun if we
didn't get any.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/196 |    2 ++
 1 file changed, 2 insertions(+)


diff --git a/tests/xfs/196 b/tests/xfs/196
index 5dc28670..406146c5 100755
--- a/tests/xfs/196
+++ b/tests/xfs/196
@@ -50,6 +50,8 @@ bytes=$((64 * 1024))
 
 # create sequential delayed allocation
 $XFS_IO_PROG -f -c "pwrite 0 $bytes" $file >> $seqres.full 2>&1
+$XFS_IO_PROG -c "bmap -elpv" $file | grep -q delalloc || \
+	_notrun "Unable to create delayed allocations"
 
 # Enable write drops. All buffered writes are dropped from this point on.
 _scratch_inject_error "drop_writes" 1


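The check-then-skip pattern in the hunk above can be tried outside fstests
with a small sketch.  Everything here is an assumption made for
illustration: skip_test is a hypothetical stand-in for _notrun (which exits
the test in fstests), and the sample listings are placeholders rather than
real xfs_io output.

```bash
#!/bin/bash
notrun_reason=""

# Hypothetical stand-in for _notrun: record the reason instead of exiting.
skip_test() {
	notrun_reason="$*"
}

# $1: text of an extent listing (in the patch, the output of xfs_io's bmap).
# Returns 0 if a delalloc extent is mentioned, otherwise records a skip.
require_delalloc() {
	if printf '%s\n' "$1" | grep -q delalloc; then
		return 0
	fi
	skip_test "Unable to create delayed allocations"
	return 1
}

# Placeholder listings, not real xfs_io output.
if require_delalloc "placeholder extent: delalloc"; then
	first="run"
else
	first="skip"
fi
if require_delalloc "placeholder extent: allocated"; then
	second="run"
else
	second="skip"
fi
echo "first=$first second=$second reason=$notrun_reason"
```

The point of skipping instead of failing is that a missing delalloc extent
says something about the mount configuration, not about the code under test.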

* [PATCH 2/4] xfs/{088, 089, 091}: redirect stderr when writing to corrupt fs
  2019-10-08  1:03 [PATCH 0/4] fstests: random fixes Darrick J. Wong
  2019-10-08  1:03 ` [PATCH 1/4] xfs/196: check for delalloc blocks after pwrite Darrick J. Wong
@ 2019-10-08  1:03 ` Darrick J. Wong
  2019-10-08  7:01   ` Christoph Hellwig
  2019-10-08  1:03 ` [PATCH 3/4] xfs/263: use _scratch_mkfs_xfs instead of open-coded mkfs call Darrick J. Wong
  2019-10-08  1:03 ` [PATCH 4/4] populate: punch files after writing to fragment free space properly Darrick J. Wong
  3 siblings, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2019-10-08  1:03 UTC (permalink / raw)
  To: guaneryu, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

These tests primarily check that writes to a corrupt fs don't take down
the system, and that running repair will fix them.  Therefore, redirect
stderr to seqres.full so that we don't fail these tests in DAX mode.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/088 |    2 +-
 tests/xfs/089 |    2 +-
 tests/xfs/091 |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)


diff --git a/tests/xfs/088 b/tests/xfs/088
index 74b45163..d8ca877a 100755
--- a/tests/xfs/088
+++ b/tests/xfs/088
@@ -80,7 +80,7 @@ echo "+ mount image && modify files"
 if _try_scratch_mount >> $seqres.full 2>&1; then
 
 	for x in `seq 1 64`; do
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full
+		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full 2>> $seqres.full
 	done
 	umount "${SCRATCH_MNT}"
 fi
diff --git a/tests/xfs/089 b/tests/xfs/089
index bcbc6363..ad980769 100755
--- a/tests/xfs/089
+++ b/tests/xfs/089
@@ -80,7 +80,7 @@ echo "+ mount image && modify files"
 if _try_scratch_mount >> $seqres.full 2>&1; then
 
 	for x in `seq 1 64`; do
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full
+		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full 2>> $seqres.full
 	done
 	umount "${SCRATCH_MNT}"
 fi
diff --git a/tests/xfs/091 b/tests/xfs/091
index be56d8ae..37c07a52 100755
--- a/tests/xfs/091
+++ b/tests/xfs/091
@@ -80,7 +80,7 @@ echo "+ mount image && modify files"
 if _try_scratch_mount >> $seqres.full 2>&1; then
 
 	for x in `seq 1 64`; do
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full
+		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full 2>> $seqres.full
 	done
 	umount "${SCRATCH_MNT}"
 fi


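For readers unfamiliar with the redirection used in these hunks, here is a
minimal sketch showing that appending stdout and stderr to the same file
with two separate ">>" redirections captures both streams, just like the
"2>&1" form.  The noisy function and the temporary log file are made up for
the example.

```bash
#!/bin/bash
log=$(mktemp)

# Emit one line on each stream.
noisy() {
	echo "to stdout"
	echo "to stderr" >&2
}

# Form used by the patch: both streams appended to the same file.
noisy >> "$log" 2>> "$log"

# Equivalent shorthand: duplicate stderr onto the already-redirected stdout.
noisy >> "$log" 2>&1

captured=$(grep -c "to std" "$log")
echo "captured $captured lines"
rm -f "$log"
```

Either way nothing reaches the terminal, so unexpected error messages cannot
leak into the output that the test harness compares against its golden file.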

* [PATCH 3/4] xfs/263: use _scratch_mkfs_xfs instead of open-coded mkfs call
  2019-10-08  1:03 [PATCH 0/4] fstests: random fixes Darrick J. Wong
  2019-10-08  1:03 ` [PATCH 1/4] xfs/196: check for delalloc blocks after pwrite Darrick J. Wong
  2019-10-08  1:03 ` [PATCH 2/4] xfs/{088, 089, 091}: redirect stderr when writing to corrupt fs Darrick J. Wong
@ 2019-10-08  1:03 ` Darrick J. Wong
  2019-10-08  7:02   ` Christoph Hellwig
  2019-10-08  1:03 ` [PATCH 4/4] populate: punch files after writing to fragment free space properly Darrick J. Wong
  3 siblings, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2019-10-08  1:03 UTC (permalink / raw)
  To: guaneryu, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

Fix this test to use _scratch_mkfs_xfs instead of the open-coded mkfs
call.  This is needed to make the test succeed when XFS DAX is enabled
and mkfs enables reflink by default.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 tests/xfs/263 |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/tests/xfs/263 b/tests/xfs/263
index 75477937..578f9ee7 100755
--- a/tests/xfs/263
+++ b/tests/xfs/263
@@ -75,11 +75,11 @@ function test_all_state()
 
 echo "==== NO CRC ===="
 # Control size to control inode numbers
-$MKFS_XFS_PROG -f -m crc=0 -n ftype=0 -d size=512m $SCRATCH_DEV >>$seqres.full
+_scratch_mkfs_xfs "-m crc=0 -n ftype=0 -d size=512m" >> $seqres.full
 test_all_state
 
 echo "==== CRC ===="
-$MKFS_XFS_PROG -f -m crc=1 -d size=512m $SCRATCH_DEV >>$seqres.full
+_scratch_mkfs_xfs "-m crc=1 -d size=512m" >>$seqres.full
 test_all_state
 
 status=0


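One plausible reading of why the helper matters: an open-coded mkfs call
sees only the options written in the test, while a wrapper can merge in
options from the test configuration.  The sketch below is a hypothetical
wrapper, not the fstests implementation of _scratch_mkfs_xfs;
GLOBAL_MKFS_OPTS, build_mkfs_cmd and /dev/example are invented names, and
the function only prints the command line instead of running mkfs.

```bash
#!/bin/bash
# Assumption: options that the test configuration wants on every mkfs call.
GLOBAL_MKFS_OPTS="-m reflink=0"

# Print the mkfs command line a wrapper might build.
# $1: per-test options, $2: device
build_mkfs_cmd() {
	local extra="$1" dev="$2"
	local -a cmd=(mkfs.xfs -f)

	# Word splitting of the option strings is intentional here.
	cmd+=($GLOBAL_MKFS_OPTS $extra "$dev")
	echo "${cmd[*]}"
}

open_coded="mkfs.xfs -f -d size=512m /dev/example"
wrapped=$(build_mkfs_cmd "-d size=512m" /dev/example)

echo "open-coded: $open_coded"
echo "wrapped:    $wrapped"
```

With the wrapper, a configuration that must override an mkfs default only
has to be set in one place instead of in every test.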

* [PATCH 4/4] populate: punch files after writing to fragment free space properly
  2019-10-08  1:03 [PATCH 0/4] fstests: random fixes Darrick J. Wong
                   ` (2 preceding siblings ...)
  2019-10-08  1:03 ` [PATCH 3/4] xfs/263: use _scratch_mkfs_xfs instead of open-coded mkfs call Darrick J. Wong
@ 2019-10-08  1:03 ` Darrick J. Wong
  2019-10-09  7:03   ` Christoph Hellwig
  2019-10-09 18:18   ` [PATCH v2 " Darrick J. Wong
  3 siblings, 2 replies; 12+ messages in thread
From: Darrick J. Wong @ 2019-10-08  1:03 UTC (permalink / raw)
  To: guaneryu, darrick.wong; +Cc: linux-xfs, fstests

From: Darrick J. Wong <darrick.wong@oracle.com>

The filesystem population code frequently allocates a large file and
punches out every other block ("swiss-cheese files") in an attempt to
cause the creation of a lot of metadata to fill out the btrees.  This
pattern, however, has a subtle bug if the writes to the swiss-cheese
file are not allocated in batches and we're trying to fragment the free
space records in order to achieve a certain metadata btree shape.

This is exactly what happens on a DAX filesystem, since we no longer
have the page cache to stage delalloc writes.  Each xfs_io pwrite call
to the multi-megabyte swiss-cheese file turns into multiple 4k pwrites,
which means that file data blocks are allocated 4k at a time.  This can
be fatal to our goal of fragmenting the free space btrees because the
allocator sees a 4k allocation request and uses 4k blocks from the
fragmented parts of the free space to satisfy the "small" request.  When
this happens, the XFS populate function cannot fill out the free space
btree to sufficient height and tests fail.

(In regular delalloc mode we'd cache all those small write() calls in memory
and try for a single large allocation, which we'd generally get.)

To fix this, we need to force the filesystem to allocate all blocks
before freeing any blocks.  Split the creation of swiss-cheese files
into two parts: (a) writing data to the file to force allocation, and
(b) punching the holes to fragment free space.  It's a little hokey for
helpers to be modifying variables in the caller's scope, but there's not
really a better way to do that in bash.

This bug affects only XFS but we convert the one ext4 usage anyway.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 common/populate |   54 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 42 insertions(+), 12 deletions(-)


diff --git a/common/populate b/common/populate
index 7403dec3..4aab0274 100644
--- a/common/populate
+++ b/common/populate
@@ -120,6 +120,32 @@ _populate_xfs_qmount_option()
 	fi
 }
 
+# Set up a file that we'll later use to fragment metadata and free space.
+# Our strategy here is to force the fs to allocate large contiguous extents
+# to a file and then punch every other block to force the fs to suffer the
+# worst allocation outcome possible.
+#
+# NOTE: In order to prevent *subsequent* allocations from using the holes we
+# punch, we must store the relevant filenames for later.  This function
+# deliberately adds each file name to the @punch_files array, which must be
+# declared by the caller and will be picked up by __force_fragmentation.
+function __setup_fragmentation() {
+	local sz="$1"
+	local fname="$2"
+
+	$XFS_IO_PROG -f -c "pwrite -S 0x62 -W -b 1m 0 $sz" "${fname}"
+	punch_files+=("${fname}")
+}
+
+# Actually punch holes in the file.  Call this /after/ you're done calling
+# __setup_fragmentation.
+__force_fragmentation() {
+	for file in "${punch_files[@]}"; do
+		./src/punch-alternating "${file}"
+	done
+	punch_files=()
+}
+
 # Populate an XFS on the scratch device with (we hope) all known
 # types of metadata block
 _scratch_xfs_populate() {
@@ -132,6 +158,8 @@ _scratch_xfs_populate() {
 		esac
 	done
 
+	local punch_files=()
+
 	_populate_xfs_qmount_option
 	_scratch_mount
 	blksz="$(stat -f -c '%s' "${SCRATCH_MNT}")"
@@ -161,8 +189,7 @@ _scratch_xfs_populate() {
 	# - FMT_BTREE
 	echo "+ btree extents file"
 	nr="$((blksz * 2 / 16))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
-	./src/punch-alternating "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
+	__setup_fragmentation $((blksz * nr)) "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
 
 	# Directories
 	# - INLINE
@@ -257,8 +284,7 @@ _scratch_xfs_populate() {
 	# Free space btree
 	echo "+ freesp btree"
 	nr="$((blksz * 2 / 8))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/BNOBT"
-	./src/punch-alternating "${SCRATCH_MNT}/BNOBT"
+	__setup_fragmentation $((blksz * nr)) "${SCRATCH_MNT}/BNOBT"
 
 	# Inode btree
 	echo "+ inobt btree"
@@ -280,8 +306,7 @@ _scratch_xfs_populate() {
 	if [ $is_rmapbt -gt 0 ]; then
 		echo "+ rmapbt btree"
 		nr="$((blksz * 2 / 24))"
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/RMAPBT"
-		./src/punch-alternating "${SCRATCH_MNT}/RMAPBT"
+		__setup_fragmentation $((blksz * nr)) "${SCRATCH_MNT}/RMAPBT"
 	fi
 
 	# Realtime Reverse-mapping btree
@@ -289,8 +314,7 @@ _scratch_xfs_populate() {
 	if [ $is_rmapbt -gt 0 ] && [ $is_rt -gt 0 ]; then
 		echo "+ rtrmapbt btree"
 		nr="$((blksz * 2 / 32))"
-		$XFS_IO_PROG -f -R -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/RTRMAPBT"
-		./src/punch-alternating "${SCRATCH_MNT}/RTRMAPBT"
+		__setup_fragmentation $((blksz * nr)) "${SCRATCH_MNT}/RTRMAPBT"
 	fi
 
 	# Reference-count btree
@@ -298,15 +322,17 @@ _scratch_xfs_populate() {
 	if [ $is_reflink -gt 0 ]; then
 		echo "+ reflink btree"
 		nr="$((blksz * 2 / 12))"
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/REFCOUNTBT"
+		__setup_fragmentation $((blksz * nr)) "${SCRATCH_MNT}/REFCOUNTBT"
 		cp --reflink=always "${SCRATCH_MNT}/REFCOUNTBT" "${SCRATCH_MNT}/REFCOUNTBT2"
-		./src/punch-alternating "${SCRATCH_MNT}/REFCOUNTBT"
 	fi
 
 	# Copy some real files (xfs tests, I guess...)
 	echo "+ real files"
 	test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5
 
+	# Make sure we get all the fragmentation we asked for
+	__force_fragmentation
+
 	umount "${SCRATCH_MNT}"
 }
 
@@ -322,6 +348,8 @@ _scratch_ext4_populate() {
 		esac
 	done
 
+	local punch_files=()
+
 	_scratch_mount
 	blksz="$(stat -f -c '%s' "${SCRATCH_MNT}")"
 	dblksz="${blksz}"
@@ -342,8 +370,7 @@ _scratch_ext4_populate() {
 	# - FMT_ETREE
 	echo "+ extent tree file"
 	nr="$((blksz * 2 / 12))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
-	./src/punch-alternating "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
+	__setup_fragmentation $((blksz * nr)) "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
 
 	# Directories
 	# - INLINE
@@ -406,6 +433,9 @@ _scratch_ext4_populate() {
 	echo "+ real files"
 	test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5
 
+	# Make sure we get all the fragmentation we asked for
+	__force_fragmentation
+
 	umount "${SCRATCH_MNT}"
 }
 


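The allocation problem described in the commit message can be illustrated
with a toy model.  This is a sketch under loud assumptions: the "disk" is a
32-entry array, the allocator is plain first-fit handing out one block per
request (standing in for 4k-at-a-time allocation), and punching frees the
blocks at odd file offsets.  None of this is how XFS actually allocates; it
only shows why the order of writes and punches matters.

```bash
#!/bin/bash
NBLOCKS=32

# Mark every block free (empty owner string).
reset_disk() {
	local i
	disk=()
	for ((i = 0; i < NBLOCKS; i++)); do
		disk[i]=""
	done
}

# Give file $1 the $2 lowest free blocks, one block per request.
write_file() {
	local name="$1" want="$2" i
	for ((i = 0; i < NBLOCKS && want > 0; i++)); do
		if [ -z "${disk[i]}" ]; then
			disk[i]="$name"
			want=$((want - 1))
		fi
	done
}

# Free every other block of file $1 (odd file offsets).
punch_alternating() {
	local name="$1" i n=0
	for ((i = 0; i < NBLOCKS; i++)); do
		if [ "${disk[i]}" = "$name" ]; then
			if [ $((n % 2)) -eq 1 ]; then
				disk[i]=""
			fi
			n=$((n + 1))
		fi
	done
}

# Count runs of contiguous free blocks.
count_free_extents() {
	local i runs=0 prev_free=0
	for ((i = 0; i < NBLOCKS; i++)); do
		if [ -z "${disk[i]}" ]; then
			if [ "$prev_free" -eq 0 ]; then
				runs=$((runs + 1))
			fi
			prev_free=1
		else
			prev_free=0
		fi
	done
	echo "$runs"
}

# Old order: punch each file right after writing it.
reset_disk
write_file A 8; punch_alternating A
write_file B 8; punch_alternating B
interleaved=$(count_free_extents)

# New order: write everything first, punch at the end.
reset_disk
write_file A 8; write_file B 8
punch_alternating A; punch_alternating B
deferred=$(count_free_extents)

echo "interleaved: $interleaved free extents, deferred: $deferred free extents"
```

In the interleaved run, file B's single-block requests are satisfied from
the holes just punched in file A, so half of A's fragmentation is undone
before B is punched.  Deferring the punches keeps every hole.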

* Re: [PATCH 1/4] xfs/196: check for delalloc blocks after pwrite
  2019-10-08  1:03 ` [PATCH 1/4] xfs/196: check for delalloc blocks after pwrite Darrick J. Wong
@ 2019-10-08  7:01   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2019-10-08  7:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Mon, Oct 07, 2019 at 06:03:11PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> This test depends on the pwrite creating delalloc blocks, which doesn't
> happen if the scratch fs is mounted in dax mode (or has an extent size
> hint applied).  Therefore, check for delalloc blocks and _notrun if we
> didn't get any.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 2/4] xfs/{088, 089, 091}: redirect stderr when writing to corrupt fs
  2019-10-08  1:03 ` [PATCH 2/4] xfs/{088, 089, 091}: redirect stderr when writing to corrupt fs Darrick J. Wong
@ 2019-10-08  7:01   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2019-10-08  7:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Mon, Oct 07, 2019 at 06:03:17PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> These tests primarily check that writes to a corrupt fs don't take down
> the system, and that running repair will fix them.  Therefore, redirect
> stderr to seqres.full so that we don't fail these tests in DAX mode.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 3/4] xfs/263: use _scratch_mkfs_xfs instead of open-coded mkfs call
  2019-10-08  1:03 ` [PATCH 3/4] xfs/263: use _scratch_mkfs_xfs instead of open-coded mkfs call Darrick J. Wong
@ 2019-10-08  7:02   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2019-10-08  7:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Mon, Oct 07, 2019 at 06:03:23PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Fix this test to use _scratch_mkfs_xfs instead of the open-coded mkfs
> call.  This is needed to make the test succeed when XFS DAX is enabled
> and mkfs enables reflink by default.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 4/4] populate: punch files after writing to fragment free space properly
  2019-10-08  1:03 ` [PATCH 4/4] populate: punch files after writing to fragment free space properly Darrick J. Wong
@ 2019-10-09  7:03   ` Christoph Hellwig
  2019-10-09 18:02     ` Darrick J. Wong
  2019-10-09 18:18   ` [PATCH v2 " Darrick J. Wong
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2019-10-09  7:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests

On Mon, Oct 07, 2019 at 06:03:29PM -0700, Darrick J. Wong wrote:
> To fix this, we need to force the filesystem to allocate all blocks
> before freeing any blocks.  Split the creation of swiss-cheese files
> into two parts: (a) writing data to the file to force allocation, and
> (b) punching the holes to fragment free space.  It's a little hokey for
> helpers to be modifying variables in the caller's scope, but there's not
> really a better way to do that in bash.

Why can't we just split the operations into creating large contiguous
files and then fragmenting them?


create_large_file foo
create_large_file bar
create_large_file baz

fragment_large_file foo
fragment_large_file bar
fragment_large_file baz



* Re: [PATCH 4/4] populate: punch files after writing to fragment free space properly
  2019-10-09  7:03   ` Christoph Hellwig
@ 2019-10-09 18:02     ` Darrick J. Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2019-10-09 18:02 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: guaneryu, linux-xfs, fstests

On Wed, Oct 09, 2019 at 12:03:53AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 07, 2019 at 06:03:29PM -0700, Darrick J. Wong wrote:
> > To fix this, we need to force the filesystem to allocate all blocks
> > before freeing any blocks.  Split the creation of swiss-cheese files
> > into two parts: (a) writing data to the file to force allocation, and
> > (b) punching the holes to fragment free space.  It's a little hokey for
> > helpers to be modifying variables in the caller's scope, but there's not
> > really a better way to do that in bash.
> 
> Why can't we just split the operations into creating large contiguous
> files and then fragmenting them?
> 
> 
> create_large_file foo
> create_large_file bar
> create_large_file baz
> 
> fragment_large_file foo
> fragment_large_file bar
> fragment_large_file baz


Yeah, that would also work, and without the clumsy side effects.  I'll
do that instead.

--D


* [PATCH v2 4/4] populate: punch files after writing to fragment free space properly
  2019-10-08  1:03 ` [PATCH 4/4] populate: punch files after writing to fragment free space properly Darrick J. Wong
  2019-10-09  7:03   ` Christoph Hellwig
@ 2019-10-09 18:18   ` Darrick J. Wong
  2019-10-11  7:52     ` Christoph Hellwig
  1 sibling, 1 reply; 12+ messages in thread
From: Darrick J. Wong @ 2019-10-09 18:18 UTC (permalink / raw)
  To: guaneryu; +Cc: linux-xfs, fstests, hch

From: Darrick J. Wong <darrick.wong@oracle.com>

The filesystem population code frequently allocates a large file and
punches out every other block ("swiss-cheese files") in an attempt to
cause the creation of a lot of metadata to fill out the btrees.  This
pattern, however, has a subtle bug if the writes to the swiss-cheese
file are not allocated in batches and we're trying to fragment the free
space records in order to achieve a certain metadata btree shape.

This is exactly what happens on a DAX filesystem, since we no longer
have the page cache to stage delalloc writes.  Each xfs_io pwrite call
to the multi-megabyte swiss-cheese file turns into multiple 4k pwrites,
which means that file data blocks are allocated 4k at a time.  This can
be fatal to our goal of fragmenting the free space btrees because the
allocator sees a 4k allocation request and uses 4k blocks from the
fragmented parts of the free space to satisfy the "small" request.  When
this happens, the XFS populate function cannot fill out the free space
btree to sufficient height and tests fail.

(In regular delalloc mode we'd cache all those small write() calls in memory
and try for a single large allocation, which we'd generally get.)

To fix this, we need to force the filesystem to allocate all blocks
before freeing any blocks.  Split the creation of swiss-cheese files
into two parts: (a) writing data to the file to force allocation, and
(b) punching the holes to fragment free space.

This bug affects only XFS but we convert the one ext4 usage anyway.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: don't do the weird array side effect thing per hch suggestion
---
 common/populate |   54 +++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 39 insertions(+), 15 deletions(-)

diff --git a/common/populate b/common/populate
index 7403dec3..f2953a67 100644
--- a/common/populate
+++ b/common/populate
@@ -22,6 +22,26 @@ _require_xfs_db_blocktrash_z_command() {
 # Attempt to make files of "every" format for data, dirs, attrs etc.
 # (with apologies to Eric Sandeen for mutating xfser.sh)
 
+# Create a file of a given size.
+__populate_create_file() {
+	local sz="$1"
+	local fname="$2"
+
+	$XFS_IO_PROG -f -c "pwrite -S 0x62 -W -b 1m 0 $sz" "${fname}"
+}
+
+# Punch out every other hole in this file, if it exists.
+#
+# The goal here is to force the creation of a large number of metadata records
+# by creating a lot of tiny extent mappings in a file.  Callers should ensure
+# that fragmenting the file actually causes record creation.  Call this
+# function /after/ creating all other metadata structures.
+__populate_fragment_file() {
+	local fname="$1"
+
+	test -f "${fname}" && ./src/punch-alternating "${fname}"
+}
+
 # Create a large directory
 __populate_create_dir() {
 	name="$1"
@@ -156,13 +176,12 @@ _scratch_xfs_populate() {
 	# Regular files
 	# - FMT_EXTENTS
 	echo "+ extents file"
-	$XFS_IO_PROG -f -c "pwrite -S 0x61 0 ${blksz}" "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
+	__populate_create_file $blksz "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
 
 	# - FMT_BTREE
 	echo "+ btree extents file"
 	nr="$((blksz * 2 / 16))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
-	./src/punch-alternating "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
+	__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
 
 	# Directories
 	# - INLINE
@@ -257,8 +276,7 @@ _scratch_xfs_populate() {
 	# Free space btree
 	echo "+ freesp btree"
 	nr="$((blksz * 2 / 8))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/BNOBT"
-	./src/punch-alternating "${SCRATCH_MNT}/BNOBT"
+	__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/BNOBT"
 
 	# Inode btree
 	echo "+ inobt btree"
@@ -280,8 +298,7 @@ _scratch_xfs_populate() {
 	if [ $is_rmapbt -gt 0 ]; then
 		echo "+ rmapbt btree"
 		nr="$((blksz * 2 / 24))"
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/RMAPBT"
-		./src/punch-alternating "${SCRATCH_MNT}/RMAPBT"
+		__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/RMAPBT"
 	fi
 
 	# Realtime Reverse-mapping btree
@@ -289,8 +306,7 @@ _scratch_xfs_populate() {
 	if [ $is_rmapbt -gt 0 ] && [ $is_rt -gt 0 ]; then
 		echo "+ rtrmapbt btree"
 		nr="$((blksz * 2 / 32))"
-		$XFS_IO_PROG -f -R -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/RTRMAPBT"
-		./src/punch-alternating "${SCRATCH_MNT}/RTRMAPBT"
+		__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/RTRMAPBT"
 	fi
 
 	# Reference-count btree
@@ -298,15 +314,21 @@ _scratch_xfs_populate() {
 	if [ $is_reflink -gt 0 ]; then
 		echo "+ reflink btree"
 		nr="$((blksz * 2 / 12))"
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/REFCOUNTBT"
+		__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/REFCOUNTBT"
 		cp --reflink=always "${SCRATCH_MNT}/REFCOUNTBT" "${SCRATCH_MNT}/REFCOUNTBT2"
-		./src/punch-alternating "${SCRATCH_MNT}/REFCOUNTBT"
 	fi
 
 	# Copy some real files (xfs tests, I guess...)
 	echo "+ real files"
 	test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5
 
+	# Make sure we get all the fragmentation we asked for
+	__populate_fragment_file "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
+	__populate_fragment_file "${SCRATCH_MNT}/BNOBT"
+	__populate_fragment_file "${SCRATCH_MNT}/RMAPBT"
+	__populate_fragment_file "${SCRATCH_MNT}/RTRMAPBT"
+	__populate_fragment_file "${SCRATCH_MNT}/REFCOUNTBT"
+
 	umount "${SCRATCH_MNT}"
 }
 
@@ -333,17 +355,16 @@ _scratch_ext4_populate() {
 	# Regular files
 	# - FMT_INLINE
 	echo "+ inline file"
-	$XFS_IO_PROG -f -c "pwrite -S 0x61 0 1" "${SCRATCH_MNT}/S_IFREG.FMT_INLINE"
+	__populate_create_file 1 "${SCRATCH_MNT}/S_IFREG.FMT_INLINE"
 
 	# - FMT_EXTENTS
 	echo "+ extents file"
-	$XFS_IO_PROG -f -c "pwrite -S 0x61 0 ${blksz}" "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
+	__populate_create_file $blksz "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
 
 	# - FMT_ETREE
 	echo "+ extent tree file"
 	nr="$((blksz * 2 / 12))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
-	./src/punch-alternating "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
+	__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
 
 	# Directories
 	# - INLINE
@@ -406,6 +427,9 @@ _scratch_ext4_populate() {
 	echo "+ real files"
 	test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5
 
+	# Make sure we get all the fragmentation we asked for
+	__populate_fragment_file "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
+
 	umount "${SCRATCH_MNT}"
 }
 

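The create-then-fragment structure of v2, including the "if it exists"
guard, can be sketched without a scratch filesystem.  The helper names
follow the ones suggested in review, but the bodies are assumptions: files
are filled with head(1) instead of xfs_io, and the punch step only records
the file name as a stand-in for ./src/punch-alternating.  The guard is
presumably there because some files (RMAPBT in this sketch) are created only
when the matching feature is enabled.

```bash
#!/bin/bash
workdir=$(mktemp -d)
punched=()

# Phase 1: write a file of $1 bytes at path $2.
create_large_file() {
	local sz="$1" fname="$2"
	head -c "$sz" /dev/zero > "$fname"
}

# Phase 2: "punch" the file only if phase 1 created it.
fragment_large_file() {
	local fname="$1"
	if [ -f "$fname" ]; then
		# Stand-in for ./src/punch-alternating "$fname"
		punched+=("$(basename "$fname")")
	fi
}

create_large_file 65536 "$workdir/BNOBT"
create_large_file 65536 "$workdir/REFCOUNTBT"
# RMAPBT is deliberately not created, as if the feature were disabled.

for f in BNOBT RMAPBT REFCOUNTBT; do
	fragment_large_file "$workdir/$f"
done

echo "punched: ${punched[*]}"
rm -rf "$workdir"
```

Because the guard lives in the helper, the caller can list every candidate
file unconditionally at the end instead of repeating the feature checks.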

* Re: [PATCH v2 4/4] populate: punch files after writing to fragment free space properly
  2019-10-09 18:18   ` [PATCH v2 " Darrick J. Wong
@ 2019-10-11  7:52     ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2019-10-11  7:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: guaneryu, linux-xfs, fstests, hch

Thanks, this looks much better:

Reviewed-by: Christoph Hellwig <hch@lst.de>

