From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags
Date: Sat, 18 Mar 2023 11:38:42 +1100 [thread overview]
Message-ID: <ZBUIElGrUQnvxprN@destitution> (raw)
In-Reply-To: <20230316164721.GK11376@frogsfrogsfrogs>
On Thu, Mar 16, 2023 at 09:47:21AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag
> want us to perform a non-blocking scan of the AGs for free space. There
> are no ordering constraints for non-blocking AGF lock acquisition, so
> the scan can freely start over at AG 0 even when minimum_agno > 0.
>
> This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent
> pointer patchset applied and the realtime volume enabled. I observed
> the following sequence as part of an xfs_dir_createname call:
>
> 0. Fragment the free space, then allocate nearly all the free space in
> all AGs except AG 0.
>
> 1. Create a directory in AG 2 and let it grow for a while.
>
> 2. Try to allocate 2 blocks to expand the dirent part of a directory.
> The space will be allocated out of AG 0, but the allocation will not
> be contiguous. This (I think) activates the LOWMODE allocator.
>
> 3. The bmapi call decides to convert from extents to bmbt format and
> tries to allocate 1 block. This allocation request calls
> xfs_alloc_vextent_start_ag with the inode number, which starts the
> scan at AG 2. We ignore AG 0 (with all its free space) and instead
> scrape AG 2 and 3 for more space. We find one block, but this now
> kicks t_highest_agno to 3.
>
> 4. The createname call decides it needs to split the dabtree. It tries
> to allocate even more space with xfs_alloc_vextent_start_ag, but now
> we're constrained to AG 3, and we don't find the space. The
> createname returns ENOSPC and the filesystem shuts down.
>
> This change fixes the problem by making the trylock scan wrap around to
> AG 0 if it doesn't like the AGs that it finds. Since the current
> transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and
> we take space from the AG that has plenty.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> fs/xfs/libxfs/xfs_alloc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 8999e38e1bed..bd7112d430b6 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -3326,11 +3326,14 @@ xfs_alloc_vextent_iterate_ags(
> uint32_t flags)
> {
> struct xfs_mount *mp = args->mp;
> + xfs_agnumber_t restart_agno = minimum_agno;
> xfs_agnumber_t agno;
> int error = 0;
>
> + if (flags & XFS_ALLOC_FLAG_TRYLOCK)
> + restart_agno = 0;
> restart:
> - for_each_perag_wrap_range(mp, start_agno, minimum_agno,
> + for_each_perag_wrap_range(mp, start_agno, restart_agno,
> mp->m_sb.sb_agcount, agno, args->pag) {
> args->agno = agno;
> error = xfs_alloc_vextent_prepare_ag(args);
> @@ -3369,6 +3372,7 @@ xfs_alloc_vextent_iterate_ags(
> */
> if (flags) {
> flags = 0;
> + restart_agno = minimum_agno;
> goto restart;
> }
Looks good. Pretty much what we talked about on #xfs.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
prev parent reply other threads:[~2023-03-18 0:38 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-16 16:47 [PATCH] xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags Darrick J. Wong
2023-03-18 0:38 ` Dave Chinner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZBUIElGrUQnvxprN@destitution \
--to=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).