All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com
Subject: [PATCH 07/15] xfs: calculate inode walk prefetch more carefully
Date: Wed, 26 Jun 2019 13:44:40 -0700	[thread overview]
Message-ID: <156158188075.495087.14228436478786857410.stgit@magnolia> (raw)
In-Reply-To: <156158183697.495087.5371839759804528321.stgit@magnolia>

From: Darrick J. Wong <darrick.wong@oracle.com>

The existing inode walk prefetch is based on the old bulkstat code,
which simply allocated 4 pages worth of memory and prefetched that many
inobt records, regardless of however many inodes the caller requested.
65536 inodes is a lot to prefetch (~32M on x64, ~512M on arm64) so let's
scale things down a little more intelligently based on the number of
inodes requested, etc.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_iwalk.c |   46 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c
index 304c41e6ed1d..3e67d7702e16 100644
--- a/fs/xfs/xfs_iwalk.c
+++ b/fs/xfs/xfs_iwalk.c
@@ -333,16 +333,58 @@ xfs_iwalk_ag(
 	return error;
 }
 
+/*
+ * We experimentally determined that the reduction in ioctl call overhead
+ * diminishes when userspace asks for more than 2048 inodes, so we'll cap
+ * prefetch at this point.
+ */
+#define MAX_IWALK_PREFETCH	(2048U)
+
 /*
  * Given the number of inodes to prefetch, set the number of inobt records that
  * we cache in memory, which controls the number of inodes we try to read
- * ahead.
+ * ahead.  Set the maximum if @inode_records == 0.
  */
 static inline unsigned int
 xfs_iwalk_prefetch(
 	unsigned int		inode_records)
 {
-	return PAGE_SIZE * 4 / sizeof(struct xfs_inobt_rec_incore);
+	unsigned int		inobt_records;
+
+	/*
+	 * If the caller didn't tell us the number of inodes they wanted,
+	 * assume the maximum prefetch possible for best performance.
+	 * Otherwise, cap prefetch at that maximum so that we don't start an
+	 * absurd amount of prefetch.
+	 */
+	if (inode_records == 0)
+		inode_records = MAX_IWALK_PREFETCH;
+	inode_records = min(inode_records, MAX_IWALK_PREFETCH);
+
+	/* Round the inode count up to a full chunk. */
+	inode_records = round_up(inode_records, XFS_INODES_PER_CHUNK);
+
+	/*
+	 * In order to convert the number of inodes to prefetch into an
+	 * estimate of the number of inobt records to cache, we require a
+	 * conversion factor that reflects our expectations of the average
+	 * loading factor of an inode chunk.  Based on data gathered, most
+	 * (but not all) filesystems manage to keep the inode chunks totally
+	 * full, so we'll underestimate slightly so that our readahead will
+	 * still deliver the performance we want on aging filesystems:
+	 *
+	 * inobt = inodes / (INODES_PER_CHUNK * (4 / 5));
+	 *
+	 * The funny math is to avoid division.
+	 */
+	inobt_records = (inode_records * 5) / (4 * XFS_INODES_PER_CHUNK);
+
+	/*
+	 * Allocate enough space to prefetch at least two inobt records so that
+	 * we can cache both the record where the iwalk started and the next
+	 * record.  This simplifies the AG inode walk loop setup code.
+	 */
+	return max(inobt_records, 2U);
 }
 
 /*

  parent reply	other threads:[~2019-06-26 20:44 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-26 20:43 [PATCH v6 00/15] xfs: refactor and improve inode iteration Darrick J. Wong
2019-06-26 20:44 ` [PATCH 01/15] xfs: create iterator error codes Darrick J. Wong
2019-06-26 20:44 ` [PATCH 02/15] xfs: create simplified inode walk function Darrick J. Wong
2019-07-02 14:23   ` Brian Foster
2019-06-26 20:44 ` [PATCH 03/15] xfs: convert quotacheck to use the new iwalk functions Darrick J. Wong
2019-06-26 20:44 ` [PATCH 04/15] xfs: bulkstat should copy lastip whenever userspace supplies one Darrick J. Wong
2019-06-26 20:44 ` [PATCH 05/15] xfs: remove unnecessary includes of xfs_itable.h Darrick J. Wong
2019-06-26 20:44 ` [PATCH 06/15] xfs: convert bulkstat to new iwalk infrastructure Darrick J. Wong
2019-06-26 20:44 ` Darrick J. Wong [this message]
2019-07-02 14:24   ` [PATCH 07/15] xfs: calculate inode walk prefetch more carefully Brian Foster
2019-07-02 14:49     ` Darrick J. Wong
2019-06-26 20:44 ` [PATCH 08/15] xfs: move bulkstat ichunk helpers to iwalk code Darrick J. Wong
2019-06-26 20:44 ` [PATCH 09/15] xfs: change xfs_iwalk_grab_ichunk to use startino, not lastino Darrick J. Wong
2019-06-26 20:44 ` [PATCH 10/15] xfs: clean up long conditionals in xfs_iwalk_ichunk_ra Darrick J. Wong
2019-06-26 20:45 ` [PATCH 11/15] xfs: refactor xfs_iwalk_grab_ichunk Darrick J. Wong
2019-06-26 20:45 ` [PATCH 12/15] xfs: refactor iwalk code to handle walking inobt records Darrick J. Wong
2019-06-26 20:45 ` [PATCH 13/15] xfs: refactor INUMBERS to use iwalk functions Darrick J. Wong
2019-06-26 20:45 ` [PATCH 14/15] xfs: multithreaded iwalk implementation Darrick J. Wong
2019-07-02 14:33   ` Brian Foster
2019-07-02 15:51     ` Darrick J. Wong
2019-07-02 16:24       ` Brian Foster
2019-07-02 16:53   ` [PATCH v2 " Darrick J. Wong
2019-07-02 17:40     ` Brian Foster
2019-06-26 20:45 ` [PATCH 15/15] xfs: poll waiting for quotacheck Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=156158188075.495087.14228436478786857410.stgit@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.