linux-xfs.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: guaneryu@gmail.com
Cc: linux-xfs@vger.kernel.org, fstests@vger.kernel.org, hch@infradead.org
Subject: [PATCH v2 4/4] populate: punch files after writing to fragment free space properly
Date: Wed, 9 Oct 2019 11:18:48 -0700
Message-ID: <20191009181848.GG13097@magnolia>
In-Reply-To: <157049660991.2397321.6295105033631507023.stgit@magnolia>

From: Darrick J. Wong <darrick.wong@oracle.com>

The filesystem population code frequently allocates a large file and
punches out every other block ("swiss-cheese files") in an attempt to
cause the creation of a lot of metadata to fill out the btrees.  This
pattern, however, has a subtle bug if the writes to the swiss-cheese
file are not allocated in batches and we're trying to fragment the free
space records in order to achieve a certain metadata btree shape.

This is exactly what happens on a DAX filesystem, since we no longer
have the page cache to stage delalloc writes.  Each xfs_io pwrite call
to the multi-megabyte swiss-cheese file turns into multiple 4k pwrites,
which means that file data blocks are allocated 4k at a time.  This can
be fatal to our goal of fragmenting the free space btrees because the
allocator sees a 4k allocation request and uses 4k blocks from the
fragmented parts of the free space to satisfy the "small" request.  When
this happens, the XFS populate function cannot fill out the free space
btree to sufficient height and tests fail.

(In regular delalloc mode we'd cache all those small write() calls in
memory and try for a single large allocation, which we'd generally get.)

To fix this, we need to force the filesystem to allocate all blocks
before freeing any blocks.  Split the creation of swiss-cheese files
into two parts: (a) writing data to the file to force allocation, and
(b) punching the holes to fragment free space.
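
The effect of the ordering can be sketched with a toy model.  This is only
an illustration under loud assumptions: the "disk" is a 64-entry bash array,
the allocator is a one-block-at-a-time first-fit search (standing in for the
4k-at-a-time requests described above, not for the real XFS allocator), and
the punch step frees the even-indexed blocks of a file.  All function names
here are hypothetical and do not exist in fstests.

```shell
#!/usr/bin/env bash
# Toy model of free space: disk[i] is 0 (free) or 1 (in use).
# ASSUMPTION: one-block first-fit allocation; not the real XFS allocator.

NBLOCKS=64
declare -a disk
last_map=""	# physical blocks handed out by the most recent write_file

reset_disk() {
	local i
	for ((i = 0; i < NBLOCKS; i++)); do
		disk[i]=0
	done
}

# Allocate $1 blocks, one block per request, lowest free block first.
write_file() {
	local want="$1" i
	last_map=""
	for ((i = 0; i < NBLOCKS && want > 0; i++)); do
		if ((disk[i] == 0)); then
			disk[i]=1
			last_map+="$i "
			want=$((want - 1))
		fi
	done
}

# Free every other block of a file, given its block list in file order.
punch_alternating() {
	local idx=0 blk
	for blk in $1; do
		if ((idx % 2 == 0)); then
			disk[blk]=0
		fi
		idx=$((idx + 1))
	done
}

# Count the runs of contiguous free blocks (free space "records").
count_free_extents() {
	local i n=0 prev=1
	for ((i = 0; i < NBLOCKS; i++)); do
		if ((disk[i] == 0 && prev == 1)); then
			n=$((n + 1))
		fi
		prev=${disk[i]}
	done
	echo "$n"
}

# Old pattern: punch each file right after writing it.
interleaved() {
	reset_disk
	write_file 8; punch_alternating "$last_map"
	write_file 8; punch_alternating "$last_map"
	count_free_extents
}

# New pattern: write everything first, punch everything afterwards.
deferred() {
	local a b
	reset_disk
	write_file 8; a="$last_map"
	write_file 8; b="$last_map"
	punch_alternating "$a"
	punch_alternating "$b"
	count_free_extents
}

echo "interleaved: $(interleaved) free extents"
echo "deferred:    $(deferred) free extents"
```

In this model the interleaved order ends with 5 free extents and the
deferred order with 9, because the second file's small allocations refill
the holes punched in the first file unless the punching is postponed.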

This bug affects only XFS but we convert the one ext4 usage anyway.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: don't do the weird array side effect thing per hch suggestion
---
 common/populate |   54 +++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 39 insertions(+), 15 deletions(-)

diff --git a/common/populate b/common/populate
index 7403dec3..f2953a67 100644
--- a/common/populate
+++ b/common/populate
@@ -22,6 +22,26 @@ _require_xfs_db_blocktrash_z_command() {
 # Attempt to make files of "every" format for data, dirs, attrs etc.
 # (with apologies to Eric Sandeen for mutating xfser.sh)
 
+# Create a file of a given size.
+__populate_create_file() {
+	local sz="$1"
+	local fname="$2"
+
+	$XFS_IO_PROG -f -c "pwrite -S 0x62 -W -b 1m 0 $sz" "${fname}"
+}
+
+# Punch out every other hole in this file, if it exists.
+#
+# The goal here is to force the creation of a large number of metadata records
+# by creating a lot of tiny extent mappings in a file.  Callers should ensure
+# that fragmenting the file actually causes record creation.  Call this
+# function /after/ creating all other metadata structures.
+__populate_fragment_file() {
+	local fname="$1"
+
+	test -f "${fname}" && ./src/punch-alternating "${fname}"
+}
+
 # Create a large directory
 __populate_create_dir() {
 	name="$1"
@@ -156,13 +176,12 @@ _scratch_xfs_populate() {
 	# Regular files
 	# - FMT_EXTENTS
 	echo "+ extents file"
-	$XFS_IO_PROG -f -c "pwrite -S 0x61 0 ${blksz}" "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
+	__populate_create_file $blksz "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
 
 	# - FMT_BTREE
 	echo "+ btree extents file"
 	nr="$((blksz * 2 / 16))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
-	./src/punch-alternating "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
+	__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
 
 	# Directories
 	# - INLINE
@@ -257,8 +276,7 @@ _scratch_xfs_populate() {
 	# Free space btree
 	echo "+ freesp btree"
 	nr="$((blksz * 2 / 8))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/BNOBT"
-	./src/punch-alternating "${SCRATCH_MNT}/BNOBT"
+	__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/BNOBT"
 
 	# Inode btree
 	echo "+ inobt btree"
@@ -280,8 +298,7 @@ _scratch_xfs_populate() {
 	if [ $is_rmapbt -gt 0 ]; then
 		echo "+ rmapbt btree"
 		nr="$((blksz * 2 / 24))"
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/RMAPBT"
-		./src/punch-alternating "${SCRATCH_MNT}/RMAPBT"
+		__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/RMAPBT"
 	fi
 
 	# Realtime Reverse-mapping btree
@@ -289,8 +306,7 @@ _scratch_xfs_populate() {
 	if [ $is_rmapbt -gt 0 ] && [ $is_rt -gt 0 ]; then
 		echo "+ rtrmapbt btree"
 		nr="$((blksz * 2 / 32))"
-		$XFS_IO_PROG -f -R -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/RTRMAPBT"
-		./src/punch-alternating "${SCRATCH_MNT}/RTRMAPBT"
+		__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/RTRMAPBT"
 	fi
 
 	# Reference-count btree
@@ -298,15 +314,21 @@ _scratch_xfs_populate() {
 	if [ $is_reflink -gt 0 ]; then
 		echo "+ reflink btree"
 		nr="$((blksz * 2 / 12))"
-		$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/REFCOUNTBT"
+		__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/REFCOUNTBT"
 		cp --reflink=always "${SCRATCH_MNT}/REFCOUNTBT" "${SCRATCH_MNT}/REFCOUNTBT2"
-		./src/punch-alternating "${SCRATCH_MNT}/REFCOUNTBT"
 	fi
 
 	# Copy some real files (xfs tests, I guess...)
 	echo "+ real files"
 	test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5
 
+	# Make sure we get all the fragmentation we asked for
+	__populate_fragment_file "${SCRATCH_MNT}/S_IFREG.FMT_BTREE"
+	__populate_fragment_file "${SCRATCH_MNT}/BNOBT"
+	__populate_fragment_file "${SCRATCH_MNT}/RMAPBT"
+	__populate_fragment_file "${SCRATCH_MNT}/RTRMAPBT"
+	__populate_fragment_file "${SCRATCH_MNT}/REFCOUNTBT"
+
 	umount "${SCRATCH_MNT}"
 }
 
@@ -333,17 +355,16 @@ _scratch_ext4_populate() {
 	# Regular files
 	# - FMT_INLINE
 	echo "+ inline file"
-	$XFS_IO_PROG -f -c "pwrite -S 0x61 0 1" "${SCRATCH_MNT}/S_IFREG.FMT_INLINE"
+	__populate_create_file 1 "${SCRATCH_MNT}/S_IFREG.FMT_INLINE"
 
 	# - FMT_EXTENTS
 	echo "+ extents file"
-	$XFS_IO_PROG -f -c "pwrite -S 0x61 0 ${blksz}" "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
+	__populate_create_file $blksz "${SCRATCH_MNT}/S_IFREG.FMT_EXTENTS"
 
 	# - FMT_ETREE
 	echo "+ extent tree file"
 	nr="$((blksz * 2 / 12))"
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 $((blksz * nr))" "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
-	./src/punch-alternating "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
+	__populate_create_file $((blksz * nr)) "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
 
 	# Directories
 	# - INLINE
@@ -406,6 +427,9 @@ _scratch_ext4_populate() {
 	echo "+ real files"
 	test $fill -ne 0 && __populate_fill_fs "${SCRATCH_MNT}" 5
 
+	# Make sure we get all the fragmentation we asked for
+	__populate_fragment_file "${SCRATCH_MNT}/S_IFREG.FMT_ETREE"
+
 	umount "${SCRATCH_MNT}"
 }
 
