From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 4/5] xfs: speed up directory bestfree block scanning
Date: Thu, 29 Aug 2019 20:47:09 +1000 [thread overview]
Message-ID: <20190829104710.28239-5-david@fromorbit.com> (raw)
In-Reply-To: <20190829104710.28239-1-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
When running a "create millions inodes in a directory" test
recently, I noticed we were spending a huge amount of time
converting freespace block headers from disk format to in-memory
format:
31.47% [kernel] [k] xfs_dir2_node_addname
17.86% [kernel] [k] xfs_dir3_free_hdr_from_disk
3.55% [kernel] [k] xfs_dir3_free_bests_p
We shouldn't be hitting the best free block scanning code so hard
when doing sequential directory creates, and it turns out there's
a highly suboptimal loop searching the the best free array in
the freespace block - it decodes the block header before checking
each entry inside a loop, instead of decoding the header once before
running the entry search loop.
This makes a massive difference to create rates. Profile now looks
like this:
13.15% [kernel] [k] xfs_dir2_node_addname
3.52% [kernel] [k] xfs_dir3_leaf_check_int
3.11% [kernel] [k] xfs_log_commit_cil
And the wall time/average file create rate differences are
just as stark:
create time(sec) / rate (files/s)
File count vanilla patched
10k 0.41 / 24.3k 0.42 / 23.8k
20k 0.74 / 27.0k 0.76 / 26.3k
100k 3.81 / 26.4k 3.47 / 28.8k
200k 8.58 / 23.3k 7.19 / 27.8k
1M 85.69 / 11.7k 48.53 / 20.6k
2M 280.31 / 7.1k 130.14 / 15.3k
The larger the directory, the bigger the performance improvement.
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/libxfs/xfs_dir2_node.c | 97 ++++++++++++-----------------------
1 file changed, 34 insertions(+), 63 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index 93254f45a5f9..a81f56d9e538 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -1750,8 +1750,8 @@ xfs_dir2_node_find_freeblk(
xfs_dir2_db_t dbno = -1;
xfs_dir2_db_t fbno = -1;
xfs_fileoff_t fo;
- __be16 *bests;
- int findex;
+ __be16 *bests = NULL;
+ int findex = 0;
int error;
/*
@@ -1781,14 +1781,14 @@ xfs_dir2_node_find_freeblk(
*/
ifbno = fblk->blkno;
fbno = ifbno;
+ xfs_trans_brelse(tp, fbp);
+ fbp = NULL;
+ fblk->bp = NULL;
}
- ASSERT(dbno == -1);
- findex = 0;
/*
* If we don't have a data block yet, we're going to scan the freespace
- * blocks looking for one. Figure out what the highest freespace block
- * number is.
+ * data for a data block with enough free space in it.
*/
error = xfs_bmap_last_offset(dp, &fo, XFS_DATA_FORK);
if (error)
@@ -1799,70 +1799,41 @@ xfs_dir2_node_find_freeblk(
if (fbno == -1)
fbno = xfs_dir2_byte_to_db(args->geo, XFS_DIR2_FREE_OFFSET);
- /*
- * While we haven't identified a data block, search the freeblock
- * data for a good data block. If we find a null freeblock entry,
- * indicating a hole in the data blocks, remember that.
- */
- while (dbno == -1) {
- /*
- * If we don't have a freeblock in hand, get the next one.
- */
- if (fbp == NULL) {
- /*
- * If it's ifbno we already looked at it.
- */
- if (++fbno == ifbno)
- fbno++;
- /*
- * If it's off the end we're done.
- */
- if (fbno >= lastfbno)
- break;
- /*
- * Read the block. There can be holes in the
- * freespace blocks, so this might not succeed.
- * This should be really rare, so there's no reason
- * to avoid it.
- */
- error = xfs_dir2_free_try_read(tp, dp,
- xfs_dir2_db_to_da(args->geo, fbno),
- &fbp);
- if (error)
- return error;
- if (!fbp)
- continue;
- free = fbp->b_addr;
- findex = 0;
- }
+ for ( ; fbno < lastfbno; fbno++) {
+ /* If it's ifbno we already looked at it. */
+ if (fbno == ifbno)
+ continue;
+
/*
- * Look at the current free entry. Is it good enough?
- *
- * The bests initialisation should be where the bufer is read in
- * the above branch. But gcc is too stupid to realise that bests
- * and the freehdr are actually initialised if they are placed
- * there, so we have to do it here to avoid warnings. Blech.
+ * Read the block. There can be holes in the freespace blocks,
+ * so this might not succeed. This should be really rare, so
+ * there's no reason to avoid it.
*/
+ error = xfs_dir2_free_try_read(tp, dp,
+ xfs_dir2_db_to_da(args->geo, fbno),
+ &fbp);
+ if (error)
+ return error;
+ if (!fbp)
+ continue;
+
+ free = fbp->b_addr;
bests = dp->d_ops->free_bests_p(free);
dp->d_ops->free_hdr_from_disk(&freehdr, free);
- if (be16_to_cpu(bests[findex]) != NULLDATAOFF &&
- be16_to_cpu(bests[findex]) >= length)
- dbno = freehdr.firstdb + findex;
- else {
- /*
- * Are we done with the freeblock?
- */
- if (++findex == freehdr.nvalid) {
- /*
- * Drop the block.
- */
- xfs_trans_brelse(tp, fbp);
- fbp = NULL;
- if (fblk && fblk->bp)
- fblk->bp = NULL;
+
+ /* Scan the free entry array for a large enough free space. */
+ for (findex = 0; findex < freehdr.nvalid; findex++) {
+ if (be16_to_cpu(bests[findex]) != NULLDATAOFF &&
+ be16_to_cpu(bests[findex]) >= length) {
+ dbno = freehdr.firstdb + findex;
+ goto found_block;
}
}
+
+ /* Didn't find free space, go on to next free block */
+ xfs_trans_brelse(tp, fbp);
}
+
found_block:
*dbnop = dbno;
*fbpp = fbp;
--
2.23.0.rc1
next prev parent reply other threads:[~2019-08-29 10:47 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-29 10:47 [PATCH v3 0/5] xfs: speed up large directory modifications Dave Chinner
2019-08-29 10:47 ` [PATCH 1/5] xfs: move xfs_dir2_addname() Dave Chinner
2019-08-29 20:52 ` Darrick J. Wong
2019-08-30 5:22 ` Christoph Hellwig
2019-08-29 10:47 ` [PATCH 2/5] xfs: factor data block addition from xfs_dir2_node_addname_int() Dave Chinner
2019-08-29 21:02 ` Darrick J. Wong
2019-08-31 1:00 ` Dave Chinner
2019-08-29 10:47 ` [PATCH 3/5] xfs: factor free block index lookup " Dave Chinner
2019-08-29 21:07 ` Darrick J. Wong
2019-08-29 10:47 ` Dave Chinner [this message]
2019-08-29 21:18 ` [PATCH 4/5] xfs: speed up directory bestfree block scanning Darrick J. Wong
2019-08-30 5:24 ` Christoph Hellwig
2019-08-29 10:47 ` [PATCH 5/5] xfs: reverse search directory freespace indexes Dave Chinner
2019-08-29 21:20 ` Darrick J. Wong
2019-08-30 5:24 ` Christoph Hellwig
-- strict thread matches above, loose matches on Subject: below --
2019-08-29 6:30 [PATCH V2 0/5] xfs: speed up large directory modifications Dave Chinner
2019-08-29 6:30 ` [PATCH 4/5] xfs: speed up directory bestfree block scanning Dave Chinner
2019-08-29 8:18 ` Christoph Hellwig
2019-08-29 8:45 ` Dave Chinner
2019-08-29 8:47 ` Christoph Hellwig
2019-08-29 8:55 ` Dave Chinner
2019-08-29 8:25 ` Christoph Hellwig
2019-08-29 9:31 ` Dave Chinner
2019-08-29 9:33 ` Christoph Hellwig
2018-10-24 22:57 [PATCH 0/5] xfs: speed up large directory modifications Dave Chinner
2018-10-24 22:57 ` [PATCH 4/5] xfs: speed up directory bestfree block scanning Dave Chinner
2018-10-26 10:24 ` Christoph Hellwig
2018-10-26 10:58 ` Dave Chinner
2018-10-26 11:56 ` Christoph Hellwig
2018-10-26 12:12 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190829104710.28239-5-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).