From: "Darrick J. Wong" <djwong@kernel.org>
To: sandeen@sandeen.net, djwong@kernel.org
Cc: Dave Chinner <dchinner@redhat.com>, linux-xfs@vger.kernel.org
Subject: [PATCH 38/61] xfs: collapse AG selection for inode allocation
Date: Wed, 15 Sep 2021 16:10:02 -0700
Message-ID: <163174740233.350433.16550095400244762904.stgit@magnolia>
In-Reply-To: <163174719429.350433.8562606396437219220.stgit@magnolia>

From: Dave Chinner <dchinner@redhat.com>

Source kernel commit: 89b1f55a2951bb89b7ae9f8cb3fd11513ff3f219

xfs_dialloc_select_ag() does a lot of repetitive work. It first
calls xfs_ialloc_ag_select() to select the AG to start allocation
attempts in, which can do up to two entire loops across the perags
that inodes can be allocated in. This is simply checking if there is
space available to allocate inodes in an AG, and it returns when it
finds the first candidate AG.

xfs_dialloc_select_ag() then does its own iterative walk across
all the perags locking the AGIs and trying to allocate inodes from
the locked AG. It also doesn't limit the search to mp->m_maxagi,
so it will walk all AGs whether they can allocate inodes or not.

Hence if we are really low on inodes, we could do almost 3 entire
walks across the whole perag range before we find an allocation
group we can allocate inodes in or report ENOSPC.

Because xfs_ialloc_ag_select() returns on the first candidate AG it
finds, we can simply do these checks directly in
xfs_dialloc_select_ag() before we lock and try to allocate inodes.
This reduces the inode allocation pass down to 2 perag sweeps at
most - one for aligned inode cluster allocation and if we can't
allocate full, aligned inode clusters anywhere we'll do another pass
trying to do sparse inode cluster allocation.

This also removes a big chunk of duplicate code.
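
As a rough illustration of the collapsed walk described above, here is a
minimal userspace sketch. It is not the libxfs code: struct ag_model,
select_ag(), NO_AG and the visits counter are hypothetical names made up
for this example, and the AGI locking, the okalloc ceiling check and the
shutdown check are left out. The second pass here simply drops the
alignment slack, as the code comment in the patch describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-AG state the walk examines. */
struct ag_model {
	bool		inode_ok;	/* AG may hold inodes (cf. pagi_inodeok) */
	unsigned int	free_inodes;	/* cf. pagi_freecount */
	unsigned int	longest_free;	/* cf. pagf_longest */
};

#define NO_AG	((size_t)-1)	/* hypothetical "no candidate" marker */

/*
 * Walk the AGs starting at 'start', wrapping at 'maxagi'.  The first pass
 * demands room for an inode chunk plus alignment slack, the second pass
 * demands room for the chunk only.  '*visits' counts the AGs examined, so
 * the worst case is 2 * maxagi.
 */
static size_t
select_ag(const struct ag_model *ags, size_t maxagi, size_t start,
	  unsigned int chunk_blks, unsigned int align, size_t *visits)
{
	size_t	agno = start;
	bool	first_pass = true;

	assert(maxagi > 0 && start < maxagi);
	*visits = 0;
	for (;;) {
		const struct ag_model	*ag = &ags[agno];
		unsigned int		need = chunk_blks;

		(*visits)++;
		if (first_pass)
			need += align;

		/* Free inodes win outright; otherwise a new chunk must fit. */
		if (ag->inode_ok &&
		    (ag->free_inodes > 0 || ag->longest_free >= need))
			return agno;

		if (++agno == maxagi)
			agno = 0;
		if (agno == start) {
			if (!first_pass)
				return NO_AG;	/* both sweeps came up empty */
			first_pass = false;
		}
	}
}
```

With two AGs where only the unaligned check can succeed, the sketch
examines both AGs once with slack and then returns on the first AG of
the second sweep. If no AG qualifies, it stops after exactly two sweeps.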

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 libxfs/xfs_ialloc.c |  225 ++++++++++++++++++---------------------------------
 1 file changed, 78 insertions(+), 147 deletions(-)


diff --git a/libxfs/xfs_ialloc.c b/libxfs/xfs_ialloc.c
index 15785417..573a7804 100644
--- a/libxfs/xfs_ialloc.c
+++ b/libxfs/xfs_ialloc.c
@@ -894,139 +894,6 @@ xfs_ialloc_ag_alloc(
 	return 0;
 }
 
-STATIC xfs_agnumber_t
-xfs_ialloc_next_ag(
-	xfs_mount_t	*mp)
-{
-	xfs_agnumber_t	agno;
-
-	spin_lock(&mp->m_agirotor_lock);
-	agno = mp->m_agirotor;
-	if (++mp->m_agirotor >= mp->m_maxagi)
-		mp->m_agirotor = 0;
-	spin_unlock(&mp->m_agirotor_lock);
-
-	return agno;
-}
-
-/*
- * Select an allocation group to look for a free inode in, based on the parent
- * inode and the mode.  Return the allocation group buffer.
- */
-STATIC xfs_agnumber_t
-xfs_ialloc_ag_select(
-	xfs_trans_t	*tp,		/* transaction pointer */
-	xfs_ino_t	parent,		/* parent directory inode number */
-	umode_t		mode)		/* bits set to indicate file type */
-{
-	xfs_agnumber_t	agcount;	/* number of ag's in the filesystem */
-	xfs_agnumber_t	agno;		/* current ag number */
-	int		flags;		/* alloc buffer locking flags */
-	xfs_extlen_t	ineed;		/* blocks needed for inode allocation */
-	xfs_extlen_t	longest = 0;	/* longest extent available */
-	xfs_mount_t	*mp;		/* mount point structure */
-	int		needspace;	/* file mode implies space allocated */
-	xfs_perag_t	*pag;		/* per allocation group data */
-	xfs_agnumber_t	pagno;		/* parent (starting) ag number */
-	int		error;
-
-	/*
-	 * Files of these types need at least one block if length > 0
-	 * (and they won't fit in the inode, but that's hard to figure out).
-	 */
-	needspace = S_ISDIR(mode) || S_ISREG(mode) || S_ISLNK(mode);
-	mp = tp->t_mountp;
-	agcount = mp->m_maxagi;
-	if (S_ISDIR(mode))
-		pagno = xfs_ialloc_next_ag(mp);
-	else {
-		pagno = XFS_INO_TO_AGNO(mp, parent);
-		if (pagno >= agcount)
-			pagno = 0;
-	}
-
-	ASSERT(pagno < agcount);
-
-	/*
-	 * Loop through allocation groups, looking for one with a little
-	 * free space in it.  Note we don't look for free inodes, exactly.
-	 * Instead, we include whether there is a need to allocate inodes
-	 * to mean that blocks must be allocated for them,
-	 * if none are currently free.
-	 */
-	agno = pagno;
-	flags = XFS_ALLOC_FLAG_TRYLOCK;
-	for (;;) {
-		pag = xfs_perag_get(mp, agno);
-		if (!pag->pagi_inodeok) {
-			xfs_ialloc_next_ag(mp);
-			goto nextag;
-		}
-
-		if (!pag->pagi_init) {
-			error = xfs_ialloc_pagi_init(mp, tp, agno);
-			if (error)
-				goto nextag;
-		}
-
-		if (pag->pagi_freecount) {
-			xfs_perag_put(pag);
-			return agno;
-		}
-
-		if (!pag->pagf_init) {
-			error = xfs_alloc_pagf_init(mp, tp, agno, flags);
-			if (error)
-				goto nextag;
-		}
-
-		/*
-		 * Check that there is enough free space for the file plus a
-		 * chunk of inodes if we need to allocate some. If this is the
-		 * first pass across the AGs, take into account the potential
-		 * space needed for alignment of inode chunks when checking the
-		 * longest contiguous free space in the AG - this prevents us
-		 * from getting ENOSPC because we have free space larger than
-		 * ialloc_blks but alignment constraints prevent us from using
-		 * it.
-		 *
-		 * If we can't find an AG with space for full alignment slack to
-		 * be taken into account, we must be near ENOSPC in all AGs.
-		 * Hence we don't include alignment for the second pass and so
-		 * if we fail allocation due to alignment issues then it is most
-		 * likely a real ENOSPC condition.
-		 */
-		ineed = M_IGEO(mp)->ialloc_min_blks;
-		if (flags && ineed > 1)
-			ineed += M_IGEO(mp)->cluster_align;
-		longest = pag->pagf_longest;
-		if (!longest)
-			longest = pag->pagf_flcount > 0;
-
-		if (pag->pagf_freeblks >= needspace + ineed &&
-		    longest >= ineed) {
-			xfs_perag_put(pag);
-			return agno;
-		}
-nextag:
-		xfs_perag_put(pag);
-		/*
-		 * No point in iterating over the rest, if we're shutting
-		 * down.
-		 */
-		if (XFS_FORCED_SHUTDOWN(mp))
-			return NULLAGNUMBER;
-		agno++;
-		if (agno >= agcount)
-			agno = 0;
-		if (agno == pagno) {
-			if (flags == 0)
-				return NULLAGNUMBER;
-			flags = 0;
-		}
-	}
-}
-
 /*
  * Try to retrieve the next record to the left/right from the current one.
  */
@@ -1703,6 +1570,21 @@ xfs_dialloc_roll(
 	return 0;
 }
 
+STATIC xfs_agnumber_t
+xfs_ialloc_next_ag(
+	xfs_mount_t	*mp)
+{
+	xfs_agnumber_t	agno;
+
+	spin_lock(&mp->m_agirotor_lock);
+	agno = mp->m_agirotor;
+	if (++mp->m_agirotor >= mp->m_maxagi)
+		mp->m_agirotor = 0;
+	spin_unlock(&mp->m_agirotor_lock);
+
+	return agno;
+}
+
 /*
  * Select and prepare an AG for inode allocation.
  *
@@ -1729,16 +1611,24 @@ xfs_dialloc_select_ag(
 	struct xfs_perag	*pag;
 	struct xfs_ino_geometry	*igeo = M_IGEO(mp);
 	bool			okalloc = true;
+	int			needspace;
+	int			flags;
 
 	*IO_agbp = NULL;
 
 	/*
-	 * We do not have an agbp, so select an initial allocation
-	 * group for inode allocation.
+	 * Directories, symlinks, and regular files frequently allocate at least
+	 * one block, so factor that potential expansion when we examine whether
+	 * an AG has enough space for file creation.
 	 */
-	start_agno = xfs_ialloc_ag_select(*tpp, parent, mode);
-	if (start_agno == NULLAGNUMBER)
-		return -ENOSPC;
+	needspace = S_ISDIR(mode) || S_ISREG(mode) || S_ISLNK(mode);
+	if (S_ISDIR(mode))
+		start_agno = xfs_ialloc_next_ag(mp);
+	else {
+		start_agno = XFS_INO_TO_AGNO(mp, parent);
+		if (start_agno >= mp->m_maxagi)
+			start_agno = 0;
+	}
 
 	/*
 	 * If we have already hit the ceiling of inode blocks then clear
@@ -1760,12 +1650,14 @@ xfs_dialloc_select_ag(
 	 * allocation groups upward, wrapping at the end.
 	 */
 	agno = start_agno;
+	flags = XFS_ALLOC_FLAG_TRYLOCK;
 	for (;;) {
+		xfs_extlen_t	ineed;
+		xfs_extlen_t	longest = 0;
+
 		pag = xfs_perag_get(mp, agno);
-		if (!pag->pagi_inodeok) {
-			xfs_ialloc_next_ag(mp);
+		if (!pag->pagi_inodeok)
 			goto nextag;
-		}
 
 		if (!pag->pagi_init) {
 			error = xfs_ialloc_pagi_init(mp, *tpp, agno);
@@ -1773,12 +1665,44 @@ xfs_dialloc_select_ag(
 				break;
 		}
 
-		/*
-		 * Do a first racy fast path check if this AG is usable.
-		 */
 		if (!pag->pagi_freecount && !okalloc)
 			goto nextag;
 
+		if (!pag->pagf_init) {
+			error = xfs_alloc_pagf_init(mp, *tpp, agno, flags);
+			if (error)
+				goto nextag;
+		}
+
+		/*
+		 * Check that there is enough free space for the file plus a
+		 * chunk of inodes if we need to allocate some. If this is the
+		 * first pass across the AGs, take into account the potential
+		 * space needed for alignment of inode chunks when checking the
+		 * longest contiguous free space in the AG - this prevents us
+		 * from getting ENOSPC because we have free space larger than
+		 * ialloc_blks but alignment constraints prevent us from using
+		 * it.
+		 *
+		 * If we can't find an AG with space for full alignment slack to
+		 * be taken into account, we must be near ENOSPC in all AGs.
+		 * Hence we don't include alignment for the second pass and so
+		 * if we fail allocation due to alignment issues then it is most
+		 * likely a real ENOSPC condition.
+		 */
+		if (!pag->pagi_freecount) {
+			ineed = M_IGEO(mp)->ialloc_min_blks;
+			if (flags && ineed > 1)
+				ineed += M_IGEO(mp)->cluster_align;
+			longest = pag->pagf_longest;
+			if (!longest)
+				longest = pag->pagf_flcount > 0;
+
+			if (pag->pagf_freeblks < needspace + ineed ||
+			    longest < ineed)
+				goto nextag;
+		}
+
 		/*
 		 * Then read in the AGI buffer and recheck with the AGI buffer
 		 * lock held.
@@ -1818,10 +1742,17 @@ xfs_dialloc_select_ag(
 nextag_relse_buffer:
 		xfs_trans_brelse(*tpp, agbp);
 nextag:
-		if (++agno == mp->m_sb.sb_agcount)
-			agno = 0;
-		if (agno == start_agno)
+		if (XFS_FORCED_SHUTDOWN(mp)) {
+			error = -EFSCORRUPTED;
 			break;
+		}
+		if (++agno == mp->m_maxagi)
+			agno = 0;
+		if (agno == start_agno) {
+			if (!flags)
+				break;
+			flags = 0;
+		}
 		xfs_perag_put(pag);
 	}
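
For readers following the free space check that this patch moves into
the loop, the predicate can be modelled on its own. This is a hedged
sketch, not the libxfs function: struct ag_space and ag_has_room() are
hypothetical names, and first_pass stands in for "flags is still
XFS_ALLOC_FLAG_TRYLOCK" in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the AGF counters the check reads. */
struct ag_space {
	unsigned int	freeblks;	/* cf. pagf_freeblks */
	unsigned int	longest;	/* cf. pagf_longest */
	unsigned int	flcount;	/* cf. pagf_flcount */
};

/*
 * Return true if the AG looks able to hold a new inode chunk.  On the
 * first pass, multi-block chunks also need alignment slack.  If no free
 * extent is recorded, a non-empty free list counts as one usable block.
 */
static bool
ag_has_room(const struct ag_space *ag, unsigned int min_blks,
	    unsigned int cluster_align, bool needspace, bool first_pass)
{
	unsigned int	ineed = min_blks;
	unsigned int	longest = ag->longest;

	assert(min_blks > 0);
	if (first_pass && ineed > 1)
		ineed += cluster_align;
	if (!longest)
		longest = ag->flcount > 0;

	return ag->freeblks >= (unsigned int)needspace + ineed &&
	       longest >= ineed;
}
```

For example, with 20 free blocks, a longest extent of 10, min_blks of 8
and cluster_align of 4, the first pass needs 12 contiguous blocks and
rejects the AG, while the second pass needs only 8 and accepts it.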
 

