From: Leah Rumancik <leah.rumancik@gmail.com>
To: linux-xfs@vger.kernel.org, djwong@kernel.org
Cc: Brian Foster <bfoster@redhat.com>,
	Dave Chinner <dchinner@redhat.com>,
	Leah Rumancik <leah.rumancik@gmail.com>
Subject: [PATCH 5.15 13/15] xfs: don't include bnobt blocks when reserving free block pool
Date: Fri,  3 Jun 2022 11:57:19 -0700
Message-ID: <20220603185721.3121645-13-leah.rumancik@gmail.com>
In-Reply-To: <20220603185721.3121645-1-leah.rumancik@gmail.com>

From: "Darrick J. Wong" <djwong@kernel.org>

[ Upstream commit c8c568259772751a14e969b7230990508de73d9d ]

xfs_reserve_blocks controls the size of the user-visible free space
reserve pool.  Given the difference between the current and requested
pool sizes, it will try to reserve free space from fdblocks.  However,
the amount requested from fdblocks is also constrained by the amount of
space that we think xfs_mod_fdblocks will give us.  If we forget to
subtract m_allocbt_blks before calling xfs_mod_fdblocks, it will return
ENOSPC and we'll hang the kernel at mount due to the infinite loop.

In commit fd43cf600cf6, we decided that xfs_mod_fdblocks should not hand
out the "free space" used by the free space btrees, because some portion
of the free space btrees hold in reserve space for future btree
expansion.  Unfortunately, xfs_reserve_blocks' estimation of the number
of blocks that it could request from xfs_mod_fdblocks was not updated to
include m_allocbt_blks, so if space is extremely low, the caller hangs.

Fix this by creating a function to estimate the number of blocks that
can be reserved from fdblocks, which needs to exclude the set-aside and
m_allocbt_blks.
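
The failure mode and the fix can be sketched in a small userspace model.
This is not the kernel code: struct model_mount, model_mod_fdblocks,
model_reserve and MODEL_MAX_ITERS are hypothetical names invented for
illustration, plain integers stand in for the percpu and atomic
counters, and the iteration cap exists only so the model can report a
hang instead of actually spinning.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical cap so the model reports a hang rather than looping forever. */
#define MODEL_MAX_ITERS	1000

/* Hypothetical stand-in for the few struct xfs_mount fields involved. */
struct model_mount {
	int64_t	fdblocks;	/* models the m_fdblocks counter */
	int64_t	set_aside;	/* models m_alloc_set_aside */
	int64_t	allocbt_blks;	/* models m_allocbt_blks */
};

/* Same sum as the xfs_fdblocks_unavailable() helper added by this patch. */
static int64_t
model_unavailable(const struct model_mount *mp)
{
	return mp->set_aside + mp->allocbt_blks;
}

/*
 * Models the part of xfs_mod_fdblocks that matters here: apply the delta,
 * and undo it with -ENOSPC if the counter would drop below the blocks
 * that must not be handed out.
 */
static int
model_mod_fdblocks(struct model_mount *mp, int64_t delta)
{
	mp->fdblocks += delta;
	if (mp->fdblocks >= model_unavailable(mp))
		return 0;
	mp->fdblocks -= delta;
	return -ENOSPC;
}

/*
 * Models the retry loop in xfs_reserve_blocks.  With use_fix == 0 the
 * estimate subtracts only set_aside (the old behaviour); with use_fix != 0
 * it subtracts everything model_mod_fdblocks refuses to hand out.
 * Returns the number of attempts made, or -1 if the cap was hit.
 */
static int
model_reserve(struct model_mount *mp, int64_t request, int use_fix,
	      int64_t *reserved)
{
	int	error = -ENOSPC;
	int	iters = 0;

	assert(request >= 0);
	*reserved = 0;
	do {
		int64_t	free = mp->fdblocks -
			(use_fix ? model_unavailable(mp) : mp->set_aside);
		int64_t	delta;

		if (free <= 0)
			break;
		if (++iters > MODEL_MAX_ITERS)
			return -1;	/* would spin forever in the kernel */

		/* Take what was asked for, or whatever we think is free. */
		delta = request < free ? request : free;
		error = model_mod_fdblocks(mp, -delta);
		if (!error)
			*reserved = delta;
	} while (error == -ENOSPC);

	return iters;
}
```

In this model, a mount with fdblocks = 100, set_aside = 8 and
allocbt_blks = 95 hits the cap with the old estimate, because the loop
sees 92 "free" blocks that model_mod_fdblocks will never grant.  With
the combined estimate the loop sees no free blocks and exits at once
with nothing reserved.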

Found by running xfs/306 (which formats a single-AG 20MB filesystem)
with an fstests configuration that specifies a 1k blocksize and a
specially crafted log size that will consume 7/8 of the space (17920
blocks, specifically) in that AG.
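
The 17920-block figure follows from the geometry described above.  The
sketch below only redoes that arithmetic; the helper names
model_fs_blocks and model_fraction are hypothetical and do not come
from xfs or fstests.

```c
#include <stdint.h>

/* Number of filesystem blocks in a device of the given size. */
static uint64_t
model_fs_blocks(uint64_t fs_bytes, uint32_t blocksize)
{
	return fs_bytes / blocksize;
}

/* The num/den share of a block count, rounded down. */
static uint64_t
model_fraction(uint64_t blocks, uint64_t num, uint64_t den)
{
	return blocks * num / den;
}
```

A 20MB filesystem with 1k blocks has 20 * 1024 * 1024 / 1024 = 20480
blocks, and 7/8 of 20480 is 17920, which matches the log size quoted
above.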

Cc: Brian Foster <bfoster@redhat.com>
Fixes: fd43cf600cf6 ("xfs: set aside allocation btree blocks from block reservation")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Leah Rumancik <leah.rumancik@gmail.com>
---
 fs/xfs/xfs_fsops.c |  2 +-
 fs/xfs/xfs_mount.c |  2 +-
 fs/xfs/xfs_mount.h | 15 +++++++++++++++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 33e26690a8c4..710e857bb825 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -434,7 +434,7 @@ xfs_reserve_blocks(
 	error = -ENOSPC;
 	do {
 		free = percpu_counter_sum(&mp->m_fdblocks) -
-						mp->m_alloc_set_aside;
+						xfs_fdblocks_unavailable(mp);
 		if (free <= 0)
 			break;
 
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 62f3c153d4b2..76056de83971 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1132,7 +1132,7 @@ xfs_mod_fdblocks(
 	 * problems (i.e. transaction abort, pagecache discards, etc.) than
 	 * slightly premature -ENOSPC.
 	 */
-	set_aside = mp->m_alloc_set_aside + atomic64_read(&mp->m_allocbt_blks);
+	set_aside = xfs_fdblocks_unavailable(mp);
 	percpu_counter_add_batch(&mp->m_fdblocks, delta, batch);
 	if (__percpu_counter_compare(&mp->m_fdblocks, set_aside,
 				     XFS_FDBLOCKS_BATCH) >= 0) {
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index e091f3b3fa15..86564295fce6 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -478,6 +478,21 @@ extern void	xfs_unmountfs(xfs_mount_t *);
  */
 #define XFS_FDBLOCKS_BATCH	1024
 
+/*
+ * Estimate the amount of free space that is not available to userspace and is
+ * not explicitly reserved from the incore fdblocks.  This includes:
+ *
+ * - The minimum number of blocks needed to support splitting a bmap btree
+ * - The blocks currently in use by the freespace btrees because they record
+ *   the actual blocks that will fill per-AG metadata space reservations
+ */
+static inline uint64_t
+xfs_fdblocks_unavailable(
+	struct xfs_mount	*mp)
+{
+	return mp->m_alloc_set_aside + atomic64_read(&mp->m_allocbt_blks);
+}
+
 extern int	xfs_mod_fdblocks(struct xfs_mount *mp, int64_t delta,
 				 bool reserved);
 extern int	xfs_mod_frextents(struct xfs_mount *mp, int64_t delta);
-- 
2.36.1.255.ge46751e96f-goog


