All of lore.kernel.org
 help / color / mirror / Atom feed
* RFC: optimize COW end I/O remapping
@ 2024-03-28  7:02 Christoph Hellwig
  2024-03-28  7:02 ` [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent Christoph Hellwig
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-28  7:02 UTC (permalink / raw)
  To: Chandan Babu R, Darrick J. Wong; +Cc: linux-xfs

Hi all,

this series optimizes how we handle unmapping the old data fork
extents when finishing COW I/O.  To get there I also did a bunch
of cleanups for helpers called by this code.

Note that this conflicts with both Darrick's repair work and my
RT delalloc series, so it will need a rebase after those are merged.

Diffstat:
 libxfs/xfs_attr.c       |    5 -
 libxfs/xfs_bmap.c       |   21 ++-----
 libxfs/xfs_bmap.h       |    2 
 libxfs/xfs_inode_fork.c |   62 +++++++++-----------
 libxfs/xfs_inode_fork.h |    4 -
 xfs_aops.c              |    6 --
 xfs_bmap_item.c         |    4 -
 xfs_bmap_util.c         |   33 ++---------
 xfs_bmap_util.h         |    2 
 xfs_dquot.c             |    5 -
 xfs_iomap.c             |   13 +---
 xfs_quota.h             |   21 ++-----
 xfs_reflink.c           |  144 ++++++++++++++++++++++++------------------------
 xfs_rtalloc.c           |    5 -
 14 files changed, 137 insertions(+), 190 deletions(-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent
  2024-03-28  7:02 RFC: optimize COW end I/O remapping Christoph Hellwig
@ 2024-03-28  7:02 ` Christoph Hellwig
  2024-03-29 16:14   ` Darrick J. Wong
  2024-03-28  7:02 ` [PATCH 2/6] xfs: consolidate the xfs_quota_reserve_blkres defintions Christoph Hellwig
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-28  7:02 UTC (permalink / raw)
  To: Chandan Babu R, Darrick J. Wong; +Cc: linux-xfs

Accessing if_bytes without the ilock is racy.  Move the check a little
further down into the ilock critical section.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_reflink.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 7da0e8f961d351..df632790a0a51c 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -731,12 +731,6 @@ xfs_reflink_end_cow_extent(
 	int			nmaps;
 	int			error;
 
-	/* No COW extents?  That's easy! */
-	if (ifp->if_bytes == 0) {
-		*offset_fsb = end_fsb;
-		return 0;
-	}
-
 	resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0,
 			XFS_TRANS_RESERVE, &tp);
@@ -751,6 +745,12 @@ xfs_reflink_end_cow_extent(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
+	/* No COW extents?  That's easy! */
+	if (ifp->if_bytes == 0) {
+		*offset_fsb = end_fsb;
+		goto out_cancel;
+	}
+
 	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
 			XFS_IEXT_REFLINK_END_COW_CNT);
 	if (error == -EFBIG)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/6] xfs: consolidate the xfs_quota_reserve_blkres defintions
  2024-03-28  7:02 RFC: optimize COW end I/O remapping Christoph Hellwig
  2024-03-28  7:02 ` [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent Christoph Hellwig
@ 2024-03-28  7:02 ` Christoph Hellwig
  2024-03-29 16:16   ` Darrick J. Wong
  2024-03-28  7:02 ` [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail Christoph Hellwig
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-28  7:02 UTC (permalink / raw)
  To: Chandan Babu R, Darrick J. Wong; +Cc: linux-xfs

xfs_trans_reserve_quota_nblks is already stubbed out if quota support
is disabled, no need for an extra xfs_quota_reserve_blkres stub.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_quota.h | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index 85a4ae1a17f672..621ea9d7cf06d9 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -123,12 +123,6 @@ extern int xfs_qm_newmount(struct xfs_mount *, uint *, uint *);
 extern void xfs_qm_mount_quotas(struct xfs_mount *);
 extern void xfs_qm_unmount(struct xfs_mount *);
 extern void xfs_qm_unmount_quotas(struct xfs_mount *);
-
-static inline int
-xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
-{
-	return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false);
-}
 bool xfs_inode_near_dquot_enforcement(struct xfs_inode *ip, xfs_dqtype_t type);
 
 # ifdef CONFIG_XFS_LIVE_HOOKS
@@ -187,12 +181,6 @@ static inline int xfs_trans_reserve_quota_bydquots(struct xfs_trans *tp,
 	return 0;
 }
 
-static inline int
-xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
-{
-	return 0;
-}
-
 static inline int
 xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_dquot *udqp,
 		struct xfs_dquot *gdqp, struct xfs_dquot *pdqp, int64_t dblocks)
@@ -221,6 +209,12 @@ xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_dquot *udqp,
 
 #endif /* CONFIG_XFS_QUOTA */
 
+static inline int
+xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
+{
+	return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false);
+}
+
 static inline int
 xfs_quota_unreserve_blkres(struct xfs_inode *ip, int64_t blocks)
 {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail
  2024-03-28  7:02 RFC: optimize COW end I/O remapping Christoph Hellwig
  2024-03-28  7:02 ` [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent Christoph Hellwig
  2024-03-28  7:02 ` [PATCH 2/6] xfs: consolidate the xfs_quota_reserve_blkres defintions Christoph Hellwig
@ 2024-03-28  7:02 ` Christoph Hellwig
  2024-03-29 16:21   ` Darrick J. Wong
  2024-03-28  7:02 ` [PATCH 4/6] xfs: simplify iext overflow checking and upgrade Christoph Hellwig
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-28  7:02 UTC (permalink / raw)
  To: Chandan Babu R, Darrick J. Wong; +Cc: linux-xfs

Unreserving quotas can't fail due to quota limits, and we'll pіck up
a shutdown file system a bit later in all the callers anyway.  Return
void and remove the error checking and propagation in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c | 16 +++++-----------
 fs/xfs/libxfs/xfs_bmap.h |  2 +-
 fs/xfs/xfs_aops.c        |  6 +-----
 fs/xfs/xfs_bmap_util.c   |  9 +++------
 fs/xfs/xfs_bmap_util.h   |  2 +-
 fs/xfs/xfs_iomap.c       |  4 ++--
 fs/xfs/xfs_quota.h       |  5 +++--
 fs/xfs/xfs_reflink.c     | 11 +++--------
 8 files changed, 19 insertions(+), 36 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 656c95a22f2e6d..6c752e55a52724 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4895,7 +4895,7 @@ xfs_bmap_split_indlen(
 	return stolen;
 }
 
-int
+void
 xfs_bmap_del_extent_delay(
 	struct xfs_inode	*ip,
 	int			whichfork,
@@ -4910,7 +4910,6 @@ xfs_bmap_del_extent_delay(
 	xfs_fileoff_t		del_endoff, got_endoff;
 	xfs_filblks_t		got_indlen, new_indlen, stolen;
 	uint32_t		state = xfs_bmap_fork_to_state(whichfork);
-	int			error = 0;
 	bool			isrt;
 
 	XFS_STATS_INC(mp, xs_del_exlist);
@@ -4934,9 +4933,7 @@ xfs_bmap_del_extent_delay(
 	 * indirect block accounting.
 	 */
 	ASSERT(!isrt);
-	error = xfs_quota_unreserve_blkres(ip, del->br_blockcount);
-	if (error)
-		return error;
+	xfs_quota_unreserve_blkres(ip, del->br_blockcount);
 	ip->i_delayed_blks -= del->br_blockcount;
 
 	if (got->br_startoff == del->br_startoff)
@@ -5016,7 +5013,6 @@ xfs_bmap_del_extent_delay(
 		xfs_mod_fdblocks(mp, da_diff, false);
 		xfs_mod_delalloc(mp, -da_diff);
 	}
-	return error;
 }
 
 void
@@ -5584,18 +5580,16 @@ __xfs_bunmapi(
 
 delete:
 		if (wasdel) {
-			error = xfs_bmap_del_extent_delay(ip, whichfork, &icur,
-					&got, &del);
+			xfs_bmap_del_extent_delay(ip, whichfork, &icur, &got, &del);
 		} else {
 			error = xfs_bmap_del_extent_real(ip, tp, &icur, cur,
 					&del, &tmp_logflags, whichfork,
 					flags);
 			logflags |= tmp_logflags;
+			if (error)
+				goto error0;
 		}
 
-		if (error)
-			goto error0;
-
 		end = del.br_startoff - 1;
 nodelete:
 		/*
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index f7662595309d86..0144835117c610 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -195,7 +195,7 @@ int	xfs_bmapi_write(struct xfs_trans *tp, struct xfs_inode *ip,
 int	xfs_bunmapi(struct xfs_trans *tp, struct xfs_inode *ip,
 		xfs_fileoff_t bno, xfs_filblks_t len, uint32_t flags,
 		xfs_extnum_t nexts, int *done);
-int	xfs_bmap_del_extent_delay(struct xfs_inode *ip, int whichfork,
+void	xfs_bmap_del_extent_delay(struct xfs_inode *ip, int whichfork,
 		struct xfs_iext_cursor *cur, struct xfs_bmbt_irec *got,
 		struct xfs_bmbt_irec *del);
 void	xfs_bmap_del_extent_cow(struct xfs_inode *ip,
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 3f428620ebf2a3..c51bc17f5cfa03 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -469,7 +469,6 @@ xfs_discard_folio(
 {
 	struct xfs_inode	*ip = XFS_I(folio->mapping->host);
 	struct xfs_mount	*mp = ip->i_mount;
-	int			error;
 
 	if (xfs_is_shutdown(mp))
 		return;
@@ -483,11 +482,8 @@ xfs_discard_folio(
 	 * byte of the next folio. Hence the end offset is only dependent on the
 	 * folio itself and not the start offset that is passed in.
 	 */
-	error = xfs_bmap_punch_delalloc_range(ip, pos,
+	xfs_bmap_punch_delalloc_range(ip, pos,
 				folio_pos(folio) + folio_size(folio));
-
-	if (error && !xfs_is_shutdown(mp))
-		xfs_alert(mp, "page discard unable to remove delalloc mapping.");
 }
 
 static const struct iomap_writeback_ops xfs_writeback_ops = {
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 19e11d1da66074..2be52d8265db91 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -440,7 +440,7 @@ xfs_getbmap(
  * if the ranges only partially overlap them, so it is up to the caller to
  * ensure that partial blocks are not passed in.
  */
-int
+void
 xfs_bmap_punch_delalloc_range(
 	struct xfs_inode	*ip,
 	xfs_off_t		start_byte,
@@ -452,7 +452,6 @@ xfs_bmap_punch_delalloc_range(
 	xfs_fileoff_t		end_fsb = XFS_B_TO_FSB(mp, end_byte);
 	struct xfs_bmbt_irec	got, del;
 	struct xfs_iext_cursor	icur;
-	int			error = 0;
 
 	ASSERT(!xfs_need_iread_extents(ifp));
 
@@ -476,15 +475,13 @@ xfs_bmap_punch_delalloc_range(
 			continue;
 		}
 
-		error = xfs_bmap_del_extent_delay(ip, XFS_DATA_FORK, &icur,
-						  &got, &del);
-		if (error || !xfs_iext_get_extent(ifp, &icur, &got))
+		xfs_bmap_del_extent_delay(ip, XFS_DATA_FORK, &icur, &got, &del);
+		if (!xfs_iext_get_extent(ifp, &icur, &got))
 			break;
 	}
 
 out_unlock:
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-	return error;
 }
 
 /*
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index 77ecbb753ef207..51f84d8ff372fa 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -30,7 +30,7 @@ xfs_bmap_rtalloc(struct xfs_bmalloca *ap)
 }
 #endif /* CONFIG_XFS_RT */
 
-int	xfs_bmap_punch_delalloc_range(struct xfs_inode *ip,
+void	xfs_bmap_punch_delalloc_range(struct xfs_inode *ip,
 		xfs_off_t start_byte, xfs_off_t end_byte);
 
 struct kgetbmap {
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 4087af7f3c9f3f..ba359bee2c2256 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1194,8 +1194,8 @@ xfs_buffered_write_delalloc_punch(
 	loff_t			offset,
 	loff_t			length)
 {
-	return xfs_bmap_punch_delalloc_range(XFS_I(inode), offset,
-			offset + length);
+	xfs_bmap_punch_delalloc_range(XFS_I(inode), offset, offset + length);
+	return 0;
 }
 
 static int
diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index 621ea9d7cf06d9..078b2b88206b2c 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -215,10 +215,11 @@ xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
 	return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false);
 }
 
-static inline int
+static inline void
 xfs_quota_unreserve_blkres(struct xfs_inode *ip, int64_t blocks)
 {
-	return xfs_quota_reserve_blkres(ip, -blocks);
+	/* don't return an error as unreserving quotas can't fail */
+	xfs_quota_reserve_blkres(ip, -blocks);
 }
 
 extern int xfs_mount_reset_sbqflags(struct xfs_mount *);
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index df632790a0a51c..83f243cfa40571 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -606,10 +606,8 @@ xfs_reflink_cancel_cow_blocks(
 		trace_xfs_reflink_cancel_cow(ip, &del);
 
 		if (isnullstartblock(del.br_startblock)) {
-			error = xfs_bmap_del_extent_delay(ip, XFS_COW_FORK,
-					&icur, &got, &del);
-			if (error)
-				break;
+			xfs_bmap_del_extent_delay(ip, XFS_COW_FORK, &icur, &got,
+					&del);
 		} else if (del.br_state == XFS_EXT_UNWRITTEN || cancel_real) {
 			ASSERT((*tpp)->t_highest_agno == NULLAGNUMBER);
 
@@ -632,10 +630,7 @@ xfs_reflink_cancel_cow_blocks(
 			xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
 
 			/* Remove the quota reservation */
-			error = xfs_quota_unreserve_blkres(ip,
-					del.br_blockcount);
-			if (error)
-				break;
+			xfs_quota_unreserve_blkres(ip, del.br_blockcount);
 		} else {
 			/* Didn't do anything, push cursor back. */
 			xfs_iext_prev(ifp, &icur);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/6] xfs: simplify iext overflow checking and upgrade
  2024-03-28  7:02 RFC: optimize COW end I/O remapping Christoph Hellwig
                   ` (2 preceding siblings ...)
  2024-03-28  7:02 ` [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail Christoph Hellwig
@ 2024-03-28  7:02 ` Christoph Hellwig
  2024-03-28 22:04   ` Dave Chinner
  2024-03-28  7:02 ` [PATCH 5/6] xfs: optimize extent remapping in xfs_reflink_end_cow_extent Christoph Hellwig
  2024-03-28  7:02 ` [PATCH 6/6] xfs: rename the del variable " Christoph Hellwig
  5 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-28  7:02 UTC (permalink / raw)
  To: Chandan Babu R, Darrick J. Wong; +Cc: linux-xfs

Currently the calls to xfs_iext_count_may_overflow and
xfs_iext_count_upgrade are always paired.  Merge them into a single
function to simplify the callers and the actual check and upgrade
logic itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_attr.c       |  5 +--
 fs/xfs/libxfs/xfs_bmap.c       |  5 +--
 fs/xfs/libxfs/xfs_inode_fork.c | 62 ++++++++++++++++------------------
 fs/xfs/libxfs/xfs_inode_fork.h |  4 +--
 fs/xfs/xfs_bmap_item.c         |  4 +--
 fs/xfs/xfs_bmap_util.c         | 24 +++----------
 fs/xfs/xfs_dquot.c             |  5 +--
 fs/xfs/xfs_iomap.c             |  9 ++---
 fs/xfs/xfs_reflink.c           |  9 ++---
 fs/xfs/xfs_rtalloc.c           |  5 +--
 10 files changed, 44 insertions(+), 88 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 673a4b6d2e8d1e..6fd0b0a08bfee8 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -985,11 +985,8 @@ xfs_attr_set(
 		return error;
 
 	if (args->value || xfs_inode_hasattr(dp)) {
-		error = xfs_iext_count_may_overflow(dp, XFS_ATTR_FORK,
+		error = xfs_iext_count_upgrade(args->trans, dp, XFS_ATTR_FORK,
 				XFS_IEXT_ATTR_MANIP_CNT(rmt_blks));
-		if (error == -EFBIG)
-			error = xfs_iext_count_upgrade(args->trans, dp,
-					XFS_IEXT_ATTR_MANIP_CNT(rmt_blks));
 		if (error)
 			goto out_trans_cancel;
 	}
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 6c752e55a52724..89e171d59965bc 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4630,11 +4630,8 @@ xfs_bmapi_convert_delalloc(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
-	error = xfs_iext_count_may_overflow(ip, whichfork,
+	error = xfs_iext_count_upgrade(tp, ip, whichfork,
 			XFS_IEXT_ADD_NOSPLIT_CNT);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, ip,
-				XFS_IEXT_ADD_NOSPLIT_CNT);
 	if (error)
 		goto out_trans_cancel;
 
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 7d660a9739090a..235c41eca5edd7 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -765,53 +765,49 @@ xfs_ifork_verify_local_attr(
 	return 0;
 }
 
+/*
+ * Check if the inode fork supports adding nr_to_add more extents.
+ *
+ * If it doesn't but we can upgrade it to large extent counters, do the upgrade.
+ * If we can't upgrade or are already using big counters but still can't fit the
+ * additional extents, return -EFBIG.
+ */
 int
-xfs_iext_count_may_overflow(
+xfs_iext_count_upgrade(
+	struct xfs_trans	*tp,
 	struct xfs_inode	*ip,
 	int			whichfork,
-	int			nr_to_add)
+	uint			nr_to_add)
 {
+	struct xfs_mount	*mp = ip->i_mount;
+	bool			has_large =
+		xfs_inode_has_large_extent_counts(ip);
 	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, whichfork);
 	uint64_t		max_exts;
 	uint64_t		nr_exts;
 
+	ASSERT(nr_to_add <= XFS_MAX_EXTCNT_UPGRADE_NR);
+
 	if (whichfork == XFS_COW_FORK)
 		return 0;
 
-	max_exts = xfs_iext_max_nextents(xfs_inode_has_large_extent_counts(ip),
-				whichfork);
-
-	if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
-		max_exts = 10;
-
 	nr_exts = ifp->if_nextents + nr_to_add;
-	if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
+	if (nr_exts < ifp->if_nextents) {
+		/* no point in upgrading if if_nextents overflows */
 		return -EFBIG;
+	}
 
-	return 0;
-}
-
-/*
- * Upgrade this inode's extent counter fields to be able to handle a potential
- * increase in the extent count by nr_to_add.  Normally this is the same
- * quantity that caused xfs_iext_count_may_overflow() to return -EFBIG.
- */
-int
-xfs_iext_count_upgrade(
-	struct xfs_trans	*tp,
-	struct xfs_inode	*ip,
-	uint			nr_to_add)
-{
-	ASSERT(nr_to_add <= XFS_MAX_EXTCNT_UPGRADE_NR);
-
-	if (!xfs_has_large_extent_counts(ip->i_mount) ||
-	    xfs_inode_has_large_extent_counts(ip) ||
-	    XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
-		return -EFBIG;
-
-	ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
-	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-
+	if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
+		max_exts = 10;
+	else
+		max_exts = xfs_iext_max_nextents(has_large, whichfork);
+	if (nr_exts > max_exts) {
+		if (has_large || !xfs_has_large_extent_counts(mp) ||
+		    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
+			return -EFBIG;
+		ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
+		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
+	}
 	return 0;
 }
 
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index bd53eb951b6515..61e35fa239b46f 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -256,10 +256,8 @@ extern void xfs_ifork_init_cow(struct xfs_inode *ip);
 
 int xfs_ifork_verify_local_data(struct xfs_inode *ip);
 int xfs_ifork_verify_local_attr(struct xfs_inode *ip);
-int xfs_iext_count_may_overflow(struct xfs_inode *ip, int whichfork,
-		int nr_to_add);
 int xfs_iext_count_upgrade(struct xfs_trans *tp, struct xfs_inode *ip,
-		uint nr_to_add);
+		int whichfork, uint nr_to_add);
 bool xfs_ifork_is_realtime(struct xfs_inode *ip, int whichfork);
 
 /* returns true if the fork has extents but they are not read in yet. */
diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c
index d27859a684aa69..4f467f9389e674 100644
--- a/fs/xfs/xfs_bmap_item.c
+++ b/fs/xfs/xfs_bmap_item.c
@@ -524,9 +524,7 @@ xfs_bmap_recover_work(
 	else
 		iext_delta = XFS_IEXT_PUNCH_HOLE_CNT;
 
-	error = xfs_iext_count_may_overflow(ip, work->bi_whichfork, iext_delta);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, ip, iext_delta);
+	error = xfs_iext_count_upgrade(tp, ip, work->bi_whichfork, iext_delta);
 	if (error)
 		goto err_cancel;
 
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 2be52d8265db91..035e4419e73200 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -710,11 +710,8 @@ xfs_alloc_file_space(
 		if (error)
 			break;
 
-		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+		error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK,
 				XFS_IEXT_ADD_NOSPLIT_CNT);
-		if (error == -EFBIG)
-			error = xfs_iext_count_upgrade(tp, ip,
-					XFS_IEXT_ADD_NOSPLIT_CNT);
 		if (error)
 			goto error;
 
@@ -772,10 +769,8 @@ xfs_unmap_extent(
 	if (error)
 		return error;
 
-	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+	error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK,
 			XFS_IEXT_PUNCH_HOLE_CNT);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, ip, XFS_IEXT_PUNCH_HOLE_CNT);
 	if (error)
 		goto out_trans_cancel;
 
@@ -1051,10 +1046,8 @@ xfs_insert_file_space(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 	xfs_trans_ijoin(tp, ip, 0);
 
-	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+	error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK,
 			XFS_IEXT_PUNCH_HOLE_CNT);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, ip, XFS_IEXT_PUNCH_HOLE_CNT);
 	if (error)
 		goto out_trans_cancel;
 
@@ -1280,23 +1273,16 @@ xfs_swap_extent_rmap(
 			trace_xfs_swap_extent_rmap_remap_piece(tip, &uirec);
 
 			if (xfs_bmap_is_real_extent(&uirec)) {
-				error = xfs_iext_count_may_overflow(ip,
-						XFS_DATA_FORK,
+				error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK,
 						XFS_IEXT_SWAP_RMAP_CNT);
-				if (error == -EFBIG)
-					error = xfs_iext_count_upgrade(tp, ip,
-							XFS_IEXT_SWAP_RMAP_CNT);
 				if (error)
 					goto out;
 			}
 
 			if (xfs_bmap_is_real_extent(&irec)) {
-				error = xfs_iext_count_may_overflow(tip,
+				error = xfs_iext_count_upgrade(tp, ip,
 						XFS_DATA_FORK,
 						XFS_IEXT_SWAP_RMAP_CNT);
-				if (error == -EFBIG)
-					error = xfs_iext_count_upgrade(tp, ip,
-							XFS_IEXT_SWAP_RMAP_CNT);
 				if (error)
 					goto out;
 			}
diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index c98cb468c35780..82ac6ad9d1596a 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -341,11 +341,8 @@ xfs_dquot_disk_alloc(
 		goto err_cancel;
 	}
 
-	error = xfs_iext_count_may_overflow(quotip, XFS_DATA_FORK,
+	error = xfs_iext_count_upgrade(tp, quotip, XFS_DATA_FORK,
 			XFS_IEXT_ADD_NOSPLIT_CNT);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, quotip,
-				XFS_IEXT_ADD_NOSPLIT_CNT);
 	if (error)
 		goto err_cancel;
 
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ba359bee2c2256..d4a85a934b967e 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -298,9 +298,7 @@ xfs_iomap_write_direct(
 	if (error)
 		return error;
 
-	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK, nr_exts);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, ip, nr_exts);
+	error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK, nr_exts);
 	if (error)
 		goto out_trans_cancel;
 
@@ -606,11 +604,8 @@ xfs_iomap_write_unwritten(
 		if (error)
 			return error;
 
-		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+		error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK,
 				XFS_IEXT_WRITE_UNWRITTEN_CNT);
-		if (error == -EFBIG)
-			error = xfs_iext_count_upgrade(tp, ip,
-					XFS_IEXT_WRITE_UNWRITTEN_CNT);
 		if (error)
 			goto error_on_bmapi_transaction;
 
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 83f243cfa40571..3c35cd3b2dec5d 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -746,11 +746,8 @@ xfs_reflink_end_cow_extent(
 		goto out_cancel;
 	}
 
-	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+	error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK,
 			XFS_IEXT_REFLINK_END_COW_CNT);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, ip,
-				XFS_IEXT_REFLINK_END_COW_CNT);
 	if (error)
 		goto out_cancel;
 
@@ -1278,9 +1275,7 @@ xfs_reflink_remap_extent(
 	if (dmap_written)
 		++iext_delta;
 
-	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK, iext_delta);
-	if (error == -EFBIG)
-		error = xfs_iext_count_upgrade(tp, ip, iext_delta);
+	error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK, iext_delta);
 	if (error)
 		goto out_cancel;
 
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index e66f9bd5de5cff..655e42cca7e4d8 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -695,11 +695,8 @@ xfs_growfs_rt_alloc(
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
 
-		error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
+		error = xfs_iext_count_upgrade(tp, ip, XFS_DATA_FORK,
 				XFS_IEXT_ADD_NOSPLIT_CNT);
-		if (error == -EFBIG)
-			error = xfs_iext_count_upgrade(tp, ip,
-					XFS_IEXT_ADD_NOSPLIT_CNT);
 		if (error)
 			goto out_trans_cancel;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/6] xfs: optimize extent remapping in xfs_reflink_end_cow_extent
  2024-03-28  7:02 RFC: optimize COW end I/O remapping Christoph Hellwig
                   ` (3 preceding siblings ...)
  2024-03-28  7:02 ` [PATCH 4/6] xfs: simplify iext overflow checking and upgrade Christoph Hellwig
@ 2024-03-28  7:02 ` Christoph Hellwig
  2024-03-29 16:29   ` Darrick J. Wong
  2024-03-28  7:02 ` [PATCH 6/6] xfs: rename the del variable " Christoph Hellwig
  5 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-28  7:02 UTC (permalink / raw)
  To: Chandan Babu R, Darrick J. Wong; +Cc: linux-xfs

xfs_reflink_end_cow_extent currently caps the range it works on to
fit both the existing extent (or hole) in the data fork and the
new COW range.  For overwrites of fragmented regions that is highly
inefficient, as we need to split the new region at every boundary,
just for it to be merge back in the next pass.

Switch to unmapping the old data using a chain of deferred bmap
and extent free ops ops first, and then handle remapping the new
data in one single transaction instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_reflink.c | 98 +++++++++++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 42 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 3c35cd3b2dec5d..a7ee868d79bf02 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -701,6 +701,52 @@ xfs_reflink_cancel_cow_range(
 	return error;
 }
 
+/*
+ * Unmap any old data covering the COW target.
+ */
+static void
+xfs_reflink_unmap_old_data(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	xfs_fileoff_t		offset_fsb,
+	xfs_fileoff_t		end_fsb)
+{
+	struct xfs_ifork	*ifp = &ip->i_df;
+	struct xfs_bmbt_irec	got, del;
+	struct xfs_iext_cursor	icur;
+
+	ASSERT(!xfs_need_iread_extents(ifp));
+
+	if (!xfs_iext_lookup_extent_before(ip, ifp, &end_fsb, &icur, &got))
+		return;
+
+	while (got.br_startoff + got.br_blockcount > offset_fsb) {
+		del = got;
+		xfs_trim_extent(&del, offset_fsb, end_fsb - offset_fsb);
+
+		/* Extent delete may have bumped us forward */
+		if (!del.br_blockcount)
+			goto prev_extent;
+
+		trace_xfs_reflink_cow_remap_to(ip, &del);
+		if (isnullstartblock(del.br_startblock)) {
+			xfs_bmap_del_extent_delay(ip, XFS_DATA_FORK, &icur,
+					&got, &del);
+			goto refresh;
+		}
+
+		xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &del);
+		xfs_refcount_decrease_extent(tp, &del);
+		xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT,
+				-del.br_blockcount);
+prev_extent:
+		xfs_iext_prev(ifp, &icur);
+refresh:
+		if (!xfs_iext_get_extent(ifp, &icur, &got))
+			break;
+	}
+}
+
 /*
  * Remap part of the CoW fork into the data fork.
  *
@@ -718,12 +764,11 @@ xfs_reflink_end_cow_extent(
 	xfs_fileoff_t		end_fsb)
 {
 	struct xfs_iext_cursor	icur;
-	struct xfs_bmbt_irec	got, del, data;
+	struct xfs_bmbt_irec	got, del;
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
 	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, XFS_COW_FORK);
 	unsigned int		resblks;
-	int			nmaps;
 	int			error;
 
 	resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
@@ -765,9 +810,7 @@ xfs_reflink_end_cow_extent(
 	/*
 	 * Only remap real extents that contain data.  With AIO, speculative
 	 * preallocations can leak into the range we are called upon, and we
-	 * need to skip them.  Preserve @got for the eventual CoW fork
-	 * deletion; from now on @del represents the mapping that we're
-	 * actually remapping.
+	 * need to skip them.
 	 */
 	while (!xfs_bmap_is_written_extent(&got)) {
 		if (!xfs_iext_next_extent(ifp, &icur, &got) ||
@@ -776,47 +819,18 @@ xfs_reflink_end_cow_extent(
 			goto out_cancel;
 		}
 	}
+
+	/*
+	 * Preserve @got for the eventual CoW fork deletion; from now on @del
+	 * represents the mapping that we're actually remapping.
+	 */
 	del = got;
 	xfs_trim_extent(&del, *offset_fsb, end_fsb - *offset_fsb);
-
-	/* Grab the corresponding mapping in the data fork. */
-	nmaps = 1;
-	error = xfs_bmapi_read(ip, del.br_startoff, del.br_blockcount, &data,
-			&nmaps, 0);
-	if (error)
-		goto out_cancel;
-
-	/* We can only remap the smaller of the two extent sizes. */
-	data.br_blockcount = min(data.br_blockcount, del.br_blockcount);
-	del.br_blockcount = data.br_blockcount;
-
 	trace_xfs_reflink_cow_remap_from(ip, &del);
-	trace_xfs_reflink_cow_remap_to(ip, &data);
-
-	if (xfs_bmap_is_real_extent(&data)) {
-		/*
-		 * If the extent we're remapping is backed by storage (written
-		 * or not), unmap the extent and drop its refcount.
-		 */
-		xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &data);
-		xfs_refcount_decrease_extent(tp, &data);
-		xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT,
-				-data.br_blockcount);
-	} else if (data.br_startblock == DELAYSTARTBLOCK) {
-		int		done;
 
-		/*
-		 * If the extent we're remapping is a delalloc reservation,
-		 * we can use the regular bunmapi function to release the
-		 * incore state.  Dropping the delalloc reservation takes care
-		 * of the quota reservation for us.
-		 */
-		error = xfs_bunmapi(NULL, ip, data.br_startoff,
-				data.br_blockcount, 0, 1, &done);
-		if (error)
-			goto out_cancel;
-		ASSERT(done);
-	}
+	/* Unmap the old data. */
+	xfs_reflink_unmap_old_data(tp, ip, del.br_startoff,
+			del.br_startoff + del.br_blockcount);
 
 	/* Free the CoW orphan record. */
 	xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/6] xfs: rename the del variable in xfs_reflink_end_cow_extent
  2024-03-28  7:02 RFC: optimize COW end I/O remapping Christoph Hellwig
                   ` (4 preceding siblings ...)
  2024-03-28  7:02 ` [PATCH 5/6] xfs: optimize extent remapping in xfs_reflink_end_cow_extent Christoph Hellwig
@ 2024-03-28  7:02 ` Christoph Hellwig
  2024-03-29 16:31   ` Darrick J. Wong
  5 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-28  7:02 UTC (permalink / raw)
  To: Chandan Babu R, Darrick J. Wong; +Cc: linux-xfs

del contains the new extent that we are remapping.  Give it a somewhat
less confusing name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_reflink.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index a7ee868d79bf02..15c723396cfdab 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -764,7 +764,7 @@ xfs_reflink_end_cow_extent(
 	xfs_fileoff_t		end_fsb)
 {
 	struct xfs_iext_cursor	icur;
-	struct xfs_bmbt_irec	got, del;
+	struct xfs_bmbt_irec	got, new;
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
 	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, XFS_COW_FORK);
@@ -821,29 +821,29 @@ xfs_reflink_end_cow_extent(
 	}
 
 	/*
-	 * Preserve @got for the eventual CoW fork deletion; from now on @del
+	 * Preserve @got for the eventual CoW fork deletion; from now on @new
 	 * represents the mapping that we're actually remapping.
 	 */
-	del = got;
-	xfs_trim_extent(&del, *offset_fsb, end_fsb - *offset_fsb);
-	trace_xfs_reflink_cow_remap_from(ip, &del);
+	new = got;
+	xfs_trim_extent(&new, *offset_fsb, end_fsb - *offset_fsb);
+	trace_xfs_reflink_cow_remap_from(ip, &new);
 
 	/* Unmap the old data. */
-	xfs_reflink_unmap_old_data(tp, ip, del.br_startoff,
-			del.br_startoff + del.br_blockcount);
+	xfs_reflink_unmap_old_data(tp, ip, new.br_startoff,
+			new.br_startoff + new.br_blockcount);
 
 	/* Free the CoW orphan record. */
-	xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount);
+	xfs_refcount_free_cow_extent(tp, new.br_startblock, new.br_blockcount);
 
 	/* Map the new blocks into the data fork. */
-	xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &del);
+	xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &new);
 
 	/* Charge this new data fork mapping to the on-disk quota. */
 	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
-			(long)del.br_blockcount);
+			(long)new.br_blockcount);
 
 	/* Remove the mapping from the CoW fork. */
-	xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
+	xfs_bmap_del_extent_cow(ip, &icur, &got, &new);
 
 	error = xfs_trans_commit(tp);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
@@ -851,7 +851,7 @@ xfs_reflink_end_cow_extent(
 		return error;
 
 	/* Update the caller about how much progress we made. */
-	*offset_fsb = del.br_startoff + del.br_blockcount;
+	*offset_fsb = new.br_startoff + new.br_blockcount;
 	return 0;
 
 out_cancel:
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/6] xfs: simplify iext overflow checking and upgrade
  2024-03-28  7:02 ` [PATCH 4/6] xfs: simplify iext overflow checking and upgrade Christoph Hellwig
@ 2024-03-28 22:04   ` Dave Chinner
  2024-03-29  4:10     ` Christoph Hellwig
  2024-03-29 16:24     ` Darrick J. Wong
  0 siblings, 2 replies; 20+ messages in thread
From: Dave Chinner @ 2024-03-28 22:04 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, Darrick J. Wong, linux-xfs

On Thu, Mar 28, 2024 at 08:02:54AM +0100, Christoph Hellwig wrote:
> Currently the calls to xfs_iext_count_may_overflow and
> xfs_iext_count_upgrade are always paired.  Merge them into a single
> function to simplify the callers and the actual check and upgrade
> logic itself.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_attr.c       |  5 +--
>  fs/xfs/libxfs/xfs_bmap.c       |  5 +--
>  fs/xfs/libxfs/xfs_inode_fork.c | 62 ++++++++++++++++------------------
>  fs/xfs/libxfs/xfs_inode_fork.h |  4 +--
>  fs/xfs/xfs_bmap_item.c         |  4 +--
>  fs/xfs/xfs_bmap_util.c         | 24 +++----------
>  fs/xfs/xfs_dquot.c             |  5 +--
>  fs/xfs/xfs_iomap.c             |  9 ++---
>  fs/xfs/xfs_reflink.c           |  9 ++---
>  fs/xfs/xfs_rtalloc.c           |  5 +--
>  10 files changed, 44 insertions(+), 88 deletions(-)

....

> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> index 7d660a9739090a..235c41eca5edd7 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> @@ -765,53 +765,49 @@ xfs_ifork_verify_local_attr(
>  	return 0;
>  }
>  
> +/*
> + * Check if the inode fork supports adding nr_to_add more extents.
> + *
> + * If it doesn't but we can upgrade it to large extent counters, do the upgrade.
> + * If we can't upgrade or are already using big counters but still can't fit the
> + * additional extents, return -EFBIG.
> + */
>  int
> -xfs_iext_count_may_overflow(
> +xfs_iext_count_upgrade(
> +	struct xfs_trans	*tp,
>  	struct xfs_inode	*ip,
>  	int			whichfork,
> -	int			nr_to_add)
> +	uint			nr_to_add)
>  {
> +	struct xfs_mount	*mp = ip->i_mount;
> +	bool			has_large =
> +		xfs_inode_has_large_extent_counts(ip);
>  	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, whichfork);
>  	uint64_t		max_exts;
>  	uint64_t		nr_exts;
>  
> +	ASSERT(nr_to_add <= XFS_MAX_EXTCNT_UPGRADE_NR);
> +
>  	if (whichfork == XFS_COW_FORK)
>  		return 0;
>  
> -	max_exts = xfs_iext_max_nextents(xfs_inode_has_large_extent_counts(ip),
> -				whichfork);
> -
> -	if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> -		max_exts = 10;
> -
>  	nr_exts = ifp->if_nextents + nr_to_add;
> -	if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
> +	if (nr_exts < ifp->if_nextents) {
> +		/* no point in upgrading if if_nextents overflows */
>  		return -EFBIG;
> +	}
>  
> -	return 0;
> -}
> -
> -/*
> - * Upgrade this inode's extent counter fields to be able to handle a potential
> - * increase in the extent count by nr_to_add.  Normally this is the same
> - * quantity that caused xfs_iext_count_may_overflow() to return -EFBIG.
> - */
> -int
> -xfs_iext_count_upgrade(
> -	struct xfs_trans	*tp,
> -	struct xfs_inode	*ip,
> -	uint			nr_to_add)
> -{
> -	ASSERT(nr_to_add <= XFS_MAX_EXTCNT_UPGRADE_NR);
> -
> -	if (!xfs_has_large_extent_counts(ip->i_mount) ||
> -	    xfs_inode_has_large_extent_counts(ip) ||
> -	    XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> -		return -EFBIG;
> -
> -	ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
> -	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> -
> +	if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> +		max_exts = 10;
> +	else
> +		max_exts = xfs_iext_max_nextents(has_large, whichfork);
> +	if (nr_exts > max_exts) {
> +		if (has_large || !xfs_has_large_extent_counts(mp) ||
> +		    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> +			return -EFBIG;
> +		ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
> +		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> +	}

IIUC, testing the error tag twice won't always give the same result.
I think this will be more reliable, and it self-documents the error
injection case better:

	if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS) &&
	    nr_exts > 10))
		return -EFBIG;

	if (nr_exts > xfs_iext_max_nextents(has_large, whichfork)) {
		if (has_large || !xfs_has_large_extent_counts(mp))
			return -EFBIG;
		ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
	}
	return 0;

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/6] xfs: simplify iext overflow checking and upgrade
  2024-03-28 22:04   ` Dave Chinner
@ 2024-03-29  4:10     ` Christoph Hellwig
  2024-03-29 16:24     ` Darrick J. Wong
  1 sibling, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-29  4:10 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, Chandan Babu R, Darrick J. Wong, linux-xfs

On Fri, Mar 29, 2024 at 09:04:43AM +1100, Dave Chinner wrote:
> IIUC, testing the error tag twice won't always give the same result.

Yeah.

> I think this will be more reliable, and it self-documents the error
> injection case better:

Yes, now that we have the opportunity it probably makes sense to only test
it once.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent
  2024-03-28  7:02 ` [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent Christoph Hellwig
@ 2024-03-29 16:14   ` Darrick J. Wong
  2024-03-30  5:56     ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2024-03-29 16:14 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Thu, Mar 28, 2024 at 08:02:51AM +0100, Christoph Hellwig wrote:
> Accessing if_bytes without the ilock is racy.  Move the check a little
> further down into the ilock critical section.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_reflink.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 7da0e8f961d351..df632790a0a51c 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -731,12 +731,6 @@ xfs_reflink_end_cow_extent(
>  	int			nmaps;
>  	int			error;
>  
> -	/* No COW extents?  That's easy! */
> -	if (ifp->if_bytes == 0) {
> -		*offset_fsb = end_fsb;
> -		return 0;
> -	}

This unlocked access was supposed to short-circuit the case where
there's absolutely nothing in the cow fork at all, so that we don't have
to wait for a transaction and the ILOCK.  Is the unlocked access
causing problems?

> -
>  	resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0,
>  			XFS_TRANS_RESERVE, &tp);
> @@ -751,6 +745,12 @@ xfs_reflink_end_cow_extent(
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> +	/* No COW extents?  That's easy! */
> +	if (ifp->if_bytes == 0) {
> +		*offset_fsb = end_fsb;
> +		goto out_cancel;
> +	}

This is already taken care of by the clause that comes below
the end of this diff:

	/*
	 * In case of racing, overlapping AIO writes no COW extents might be
	 * left by the time I/O completes for the loser of the race.  In that
	 * case we are done.
	 */
	if (!xfs_iext_lookup_extent(ip, ifp, *offset_fsb, &icur, &got) ||
	    got.br_startoff >= end_fsb) {
		*offset_fsb = end_fsb;
		goto out_cancel;
	}

Since xfs_iext_lookup_extent will return false if the cow fork tree is
empty.

That said, I think the xfs_iext_count_may_overflow stuff is misplaced --
we should be querying the cow fork extent and bouncing out early before
we bother with checking/upgrading the nextents width.  If
xfs_iext_count_upgrade dirtied the transaction, the early bailout will
cause a shutdown.

(The iext upgrade only needs to happen after the bmapi_read.)

--D

> +
>  	error = xfs_iext_count_may_overflow(ip, XFS_DATA_FORK,
>  			XFS_IEXT_REFLINK_END_COW_CNT);
>  	if (error == -EFBIG)
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/6] xfs: consolidate the xfs_quota_reserve_blkres defintions
  2024-03-28  7:02 ` [PATCH 2/6] xfs: consolidate the xfs_quota_reserve_blkres defintions Christoph Hellwig
@ 2024-03-29 16:16   ` Darrick J. Wong
  0 siblings, 0 replies; 20+ messages in thread
From: Darrick J. Wong @ 2024-03-29 16:16 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Thu, Mar 28, 2024 at 08:02:52AM +0100, Christoph Hellwig wrote:
> xfs_trans_reserve_quota_nblks is already stubbed out if quota support
> is disabled, no need for an extra xfs_quota_reserve_blkres stub.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  fs/xfs/xfs_quota.h | 18 ++++++------------
>  1 file changed, 6 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
> index 85a4ae1a17f672..621ea9d7cf06d9 100644
> --- a/fs/xfs/xfs_quota.h
> +++ b/fs/xfs/xfs_quota.h
> @@ -123,12 +123,6 @@ extern int xfs_qm_newmount(struct xfs_mount *, uint *, uint *);
>  extern void xfs_qm_mount_quotas(struct xfs_mount *);
>  extern void xfs_qm_unmount(struct xfs_mount *);
>  extern void xfs_qm_unmount_quotas(struct xfs_mount *);
> -
> -static inline int
> -xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
> -{
> -	return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false);
> -}
>  bool xfs_inode_near_dquot_enforcement(struct xfs_inode *ip, xfs_dqtype_t type);
>  
>  # ifdef CONFIG_XFS_LIVE_HOOKS
> @@ -187,12 +181,6 @@ static inline int xfs_trans_reserve_quota_bydquots(struct xfs_trans *tp,
>  	return 0;
>  }
>  
> -static inline int
> -xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
> -{
> -	return 0;
> -}
> -
>  static inline int
>  xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_dquot *udqp,
>  		struct xfs_dquot *gdqp, struct xfs_dquot *pdqp, int64_t dblocks)
> @@ -221,6 +209,12 @@ xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_dquot *udqp,
>  
>  #endif /* CONFIG_XFS_QUOTA */
>  
> +static inline int
> +xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
> +{
> +	return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false);
> +}
> +
>  static inline int
>  xfs_quota_unreserve_blkres(struct xfs_inode *ip, int64_t blocks)
>  {
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail
  2024-03-28  7:02 ` [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail Christoph Hellwig
@ 2024-03-29 16:21   ` Darrick J. Wong
  2024-03-30  5:57     ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2024-03-29 16:21 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Thu, Mar 28, 2024 at 08:02:53AM +0100, Christoph Hellwig wrote:
> Unreserving quotas can't fail due to quota limits, and we'll pіck up
> a shutdown file system a bit later in all the callers anyway.  Return
> void and remove the error checking and propagation in the callers.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 16 +++++-----------
>  fs/xfs/libxfs/xfs_bmap.h |  2 +-
>  fs/xfs/xfs_aops.c        |  6 +-----
>  fs/xfs/xfs_bmap_util.c   |  9 +++------
>  fs/xfs/xfs_bmap_util.h   |  2 +-
>  fs/xfs/xfs_iomap.c       |  4 ++--
>  fs/xfs/xfs_quota.h       |  5 +++--
>  fs/xfs/xfs_reflink.c     | 11 +++--------
>  8 files changed, 19 insertions(+), 36 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 656c95a22f2e6d..6c752e55a52724 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -4895,7 +4895,7 @@ xfs_bmap_split_indlen(
>  	return stolen;
>  }
>  
> -int
> +void
>  xfs_bmap_del_extent_delay(
>  	struct xfs_inode	*ip,
>  	int			whichfork,
> @@ -4910,7 +4910,6 @@ xfs_bmap_del_extent_delay(
>  	xfs_fileoff_t		del_endoff, got_endoff;
>  	xfs_filblks_t		got_indlen, new_indlen, stolen;
>  	uint32_t		state = xfs_bmap_fork_to_state(whichfork);
> -	int			error = 0;
>  	bool			isrt;
>  
>  	XFS_STATS_INC(mp, xs_del_exlist);
> @@ -4934,9 +4933,7 @@ xfs_bmap_del_extent_delay(
>  	 * indirect block accounting.
>  	 */
>  	ASSERT(!isrt);
> -	error = xfs_quota_unreserve_blkres(ip, del->br_blockcount);
> -	if (error)
> -		return error;
> +	xfs_quota_unreserve_blkres(ip, del->br_blockcount);
>  	ip->i_delayed_blks -= del->br_blockcount;
>  
>  	if (got->br_startoff == del->br_startoff)
> @@ -5016,7 +5013,6 @@ xfs_bmap_del_extent_delay(
>  		xfs_mod_fdblocks(mp, da_diff, false);
>  		xfs_mod_delalloc(mp, -da_diff);
>  	}
> -	return error;
>  }
>  
>  void
> @@ -5584,18 +5580,16 @@ __xfs_bunmapi(
>  
>  delete:
>  		if (wasdel) {
> -			error = xfs_bmap_del_extent_delay(ip, whichfork, &icur,
> -					&got, &del);
> +			xfs_bmap_del_extent_delay(ip, whichfork, &icur, &got, &del);
>  		} else {
>  			error = xfs_bmap_del_extent_real(ip, tp, &icur, cur,
>  					&del, &tmp_logflags, whichfork,
>  					flags);
>  			logflags |= tmp_logflags;
> +			if (error)
> +				goto error0;
>  		}
>  
> -		if (error)
> -			goto error0;
> -
>  		end = del.br_startoff - 1;
>  nodelete:
>  		/*
> diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
> index f7662595309d86..0144835117c610 100644
> --- a/fs/xfs/libxfs/xfs_bmap.h
> +++ b/fs/xfs/libxfs/xfs_bmap.h
> @@ -195,7 +195,7 @@ int	xfs_bmapi_write(struct xfs_trans *tp, struct xfs_inode *ip,
>  int	xfs_bunmapi(struct xfs_trans *tp, struct xfs_inode *ip,
>  		xfs_fileoff_t bno, xfs_filblks_t len, uint32_t flags,
>  		xfs_extnum_t nexts, int *done);
> -int	xfs_bmap_del_extent_delay(struct xfs_inode *ip, int whichfork,
> +void	xfs_bmap_del_extent_delay(struct xfs_inode *ip, int whichfork,
>  		struct xfs_iext_cursor *cur, struct xfs_bmbt_irec *got,
>  		struct xfs_bmbt_irec *del);
>  void	xfs_bmap_del_extent_cow(struct xfs_inode *ip,
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index 3f428620ebf2a3..c51bc17f5cfa03 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -469,7 +469,6 @@ xfs_discard_folio(
>  {
>  	struct xfs_inode	*ip = XFS_I(folio->mapping->host);
>  	struct xfs_mount	*mp = ip->i_mount;
> -	int			error;
>  
>  	if (xfs_is_shutdown(mp))
>  		return;
> @@ -483,11 +482,8 @@ xfs_discard_folio(
>  	 * byte of the next folio. Hence the end offset is only dependent on the
>  	 * folio itself and not the start offset that is passed in.
>  	 */
> -	error = xfs_bmap_punch_delalloc_range(ip, pos,
> +	xfs_bmap_punch_delalloc_range(ip, pos,
>  				folio_pos(folio) + folio_size(folio));
> -
> -	if (error && !xfs_is_shutdown(mp))
> -		xfs_alert(mp, "page discard unable to remove delalloc mapping.");
>  }
>  
>  static const struct iomap_writeback_ops xfs_writeback_ops = {
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 19e11d1da66074..2be52d8265db91 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -440,7 +440,7 @@ xfs_getbmap(
>   * if the ranges only partially overlap them, so it is up to the caller to
>   * ensure that partial blocks are not passed in.
>   */
> -int
> +void
>  xfs_bmap_punch_delalloc_range(
>  	struct xfs_inode	*ip,
>  	xfs_off_t		start_byte,
> @@ -452,7 +452,6 @@ xfs_bmap_punch_delalloc_range(
>  	xfs_fileoff_t		end_fsb = XFS_B_TO_FSB(mp, end_byte);
>  	struct xfs_bmbt_irec	got, del;
>  	struct xfs_iext_cursor	icur;
> -	int			error = 0;
>  
>  	ASSERT(!xfs_need_iread_extents(ifp));
>  
> @@ -476,15 +475,13 @@ xfs_bmap_punch_delalloc_range(
>  			continue;
>  		}
>  
> -		error = xfs_bmap_del_extent_delay(ip, XFS_DATA_FORK, &icur,
> -						  &got, &del);
> -		if (error || !xfs_iext_get_extent(ifp, &icur, &got))
> +		xfs_bmap_del_extent_delay(ip, XFS_DATA_FORK, &icur, &got, &del);
> +		if (!xfs_iext_get_extent(ifp, &icur, &got))
>  			break;
>  	}
>  
>  out_unlock:
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> -	return error;
>  }
>  
>  /*
> diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
> index 77ecbb753ef207..51f84d8ff372fa 100644
> --- a/fs/xfs/xfs_bmap_util.h
> +++ b/fs/xfs/xfs_bmap_util.h
> @@ -30,7 +30,7 @@ xfs_bmap_rtalloc(struct xfs_bmalloca *ap)
>  }
>  #endif /* CONFIG_XFS_RT */
>  
> -int	xfs_bmap_punch_delalloc_range(struct xfs_inode *ip,
> +void	xfs_bmap_punch_delalloc_range(struct xfs_inode *ip,
>  		xfs_off_t start_byte, xfs_off_t end_byte);
>  
>  struct kgetbmap {
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 4087af7f3c9f3f..ba359bee2c2256 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1194,8 +1194,8 @@ xfs_buffered_write_delalloc_punch(
>  	loff_t			offset,
>  	loff_t			length)
>  {
> -	return xfs_bmap_punch_delalloc_range(XFS_I(inode), offset,
> -			offset + length);
> +	xfs_bmap_punch_delalloc_range(XFS_I(inode), offset, offset + length);
> +	return 0;
>  }
>  
>  static int
> diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
> index 621ea9d7cf06d9..078b2b88206b2c 100644
> --- a/fs/xfs/xfs_quota.h
> +++ b/fs/xfs/xfs_quota.h
> @@ -215,10 +215,11 @@ xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t blocks)
>  	return xfs_trans_reserve_quota_nblks(NULL, ip, blocks, 0, false);
>  }
>  
> -static inline int
> +static inline void
>  xfs_quota_unreserve_blkres(struct xfs_inode *ip, int64_t blocks)
>  {
> -	return xfs_quota_reserve_blkres(ip, -blocks);
> +	/* don't return an error as unreserving quotas can't fail */
> +	xfs_quota_reserve_blkres(ip, -blocks);

xfs_quota_reserve_blkres only doesn't fail if the nblks argument is
actually negative.  Can we have an ASSERT(blocks >= 0) here to guard
against someone accidentally passing in a negative @blocks here?

Everything else in this patch looks good to me.

--D

>  }
>  
>  extern int xfs_mount_reset_sbqflags(struct xfs_mount *);
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index df632790a0a51c..83f243cfa40571 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -606,10 +606,8 @@ xfs_reflink_cancel_cow_blocks(
>  		trace_xfs_reflink_cancel_cow(ip, &del);
>  
>  		if (isnullstartblock(del.br_startblock)) {
> -			error = xfs_bmap_del_extent_delay(ip, XFS_COW_FORK,
> -					&icur, &got, &del);
> -			if (error)
> -				break;
> +			xfs_bmap_del_extent_delay(ip, XFS_COW_FORK, &icur, &got,
> +					&del);
>  		} else if (del.br_state == XFS_EXT_UNWRITTEN || cancel_real) {
>  			ASSERT((*tpp)->t_highest_agno == NULLAGNUMBER);
>  
> @@ -632,10 +630,7 @@ xfs_reflink_cancel_cow_blocks(
>  			xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
>  
>  			/* Remove the quota reservation */
> -			error = xfs_quota_unreserve_blkres(ip,
> -					del.br_blockcount);
> -			if (error)
> -				break;
> +			xfs_quota_unreserve_blkres(ip, del.br_blockcount);
>  		} else {
>  			/* Didn't do anything, push cursor back. */
>  			xfs_iext_prev(ifp, &icur);
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/6] xfs: simplify iext overflow checking and upgrade
  2024-03-28 22:04   ` Dave Chinner
  2024-03-29  4:10     ` Christoph Hellwig
@ 2024-03-29 16:24     ` Darrick J. Wong
  1 sibling, 0 replies; 20+ messages in thread
From: Darrick J. Wong @ 2024-03-29 16:24 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs

On Fri, Mar 29, 2024 at 09:04:43AM +1100, Dave Chinner wrote:
> On Thu, Mar 28, 2024 at 08:02:54AM +0100, Christoph Hellwig wrote:
> > Currently the calls to xfs_iext_count_may_overflow and
> > xfs_iext_count_upgrade are always paired.  Merge them into a single
> > function to simplify the callers and the actual check and upgrade
> > logic itself.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/xfs/libxfs/xfs_attr.c       |  5 +--
> >  fs/xfs/libxfs/xfs_bmap.c       |  5 +--
> >  fs/xfs/libxfs/xfs_inode_fork.c | 62 ++++++++++++++++------------------
> >  fs/xfs/libxfs/xfs_inode_fork.h |  4 +--
> >  fs/xfs/xfs_bmap_item.c         |  4 +--
> >  fs/xfs/xfs_bmap_util.c         | 24 +++----------
> >  fs/xfs/xfs_dquot.c             |  5 +--
> >  fs/xfs/xfs_iomap.c             |  9 ++---
> >  fs/xfs/xfs_reflink.c           |  9 ++---
> >  fs/xfs/xfs_rtalloc.c           |  5 +--
> >  10 files changed, 44 insertions(+), 88 deletions(-)
> 
> ....
> 
> > diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> > index 7d660a9739090a..235c41eca5edd7 100644
> > --- a/fs/xfs/libxfs/xfs_inode_fork.c
> > +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> > @@ -765,53 +765,49 @@ xfs_ifork_verify_local_attr(
> >  	return 0;
> >  }
> >  
> > +/*
> > + * Check if the inode fork supports adding nr_to_add more extents.
> > + *
> > + * If it doesn't but we can upgrade it to large extent counters, do the upgrade.
> > + * If we can't upgrade or are already using big counters but still can't fit the
> > + * additional extents, return -EFBIG.
> > + */
> >  int
> > -xfs_iext_count_may_overflow(
> > +xfs_iext_count_upgrade(
> > +	struct xfs_trans	*tp,
> >  	struct xfs_inode	*ip,
> >  	int			whichfork,
> > -	int			nr_to_add)
> > +	uint			nr_to_add)
> >  {
> > +	struct xfs_mount	*mp = ip->i_mount;
> > +	bool			has_large =
> > +		xfs_inode_has_large_extent_counts(ip);
> >  	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, whichfork);
> >  	uint64_t		max_exts;
> >  	uint64_t		nr_exts;
> >  
> > +	ASSERT(nr_to_add <= XFS_MAX_EXTCNT_UPGRADE_NR);
> > +
> >  	if (whichfork == XFS_COW_FORK)
> >  		return 0;
> >  
> > -	max_exts = xfs_iext_max_nextents(xfs_inode_has_large_extent_counts(ip),
> > -				whichfork);
> > -
> > -	if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> > -		max_exts = 10;
> > -
> >  	nr_exts = ifp->if_nextents + nr_to_add;
> > -	if (nr_exts < ifp->if_nextents || nr_exts > max_exts)
> > +	if (nr_exts < ifp->if_nextents) {
> > +		/* no point in upgrading if if_nextents overflows */
> >  		return -EFBIG;
> > +	}
> >  
> > -	return 0;
> > -}
> > -
> > -/*
> > - * Upgrade this inode's extent counter fields to be able to handle a potential
> > - * increase in the extent count by nr_to_add.  Normally this is the same
> > - * quantity that caused xfs_iext_count_may_overflow() to return -EFBIG.
> > - */
> > -int
> > -xfs_iext_count_upgrade(
> > -	struct xfs_trans	*tp,
> > -	struct xfs_inode	*ip,
> > -	uint			nr_to_add)
> > -{
> > -	ASSERT(nr_to_add <= XFS_MAX_EXTCNT_UPGRADE_NR);
> > -
> > -	if (!xfs_has_large_extent_counts(ip->i_mount) ||
> > -	    xfs_inode_has_large_extent_counts(ip) ||
> > -	    XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> > -		return -EFBIG;
> > -
> > -	ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
> > -	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> > -
> > +	if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> > +		max_exts = 10;
> > +	else
> > +		max_exts = xfs_iext_max_nextents(has_large, whichfork);
> > +	if (nr_exts > max_exts) {
> > +		if (has_large || !xfs_has_large_extent_counts(mp) ||
> > +		    XFS_TEST_ERROR(false, mp, XFS_ERRTAG_REDUCE_MAX_IEXTENTS))
> > +			return -EFBIG;
> > +		ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
> > +		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> > +	}
> 
> IIUC, testing the error tag twice won't always give the same result.
> I think this will be more reliable, and it self-documents the error
> injection case better:
> 
> 	if (XFS_TEST_ERROR(false, ip->i_mount, XFS_ERRTAG_REDUCE_MAX_IEXTENTS) &&
> 	    nr_exts > 10))
> 		return -EFBIG;
> 
> 	if (nr_exts > xfs_iext_max_nextents(has_large, whichfork)) {
> 		if (has_large || !xfs_has_large_extent_counts(mp))
> 			return -EFBIG;
> 		ip->i_diflags2 |= XFS_DIFLAG2_NREXT64;
> 		xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> 	}
> 	return 0;

Agreed, that looks better to me than sampling the errtag twice.

--D

> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/6] xfs: optimize extent remapping in xfs_reflink_end_cow_extent
  2024-03-28  7:02 ` [PATCH 5/6] xfs: optimize extent remapping in xfs_reflink_end_cow_extent Christoph Hellwig
@ 2024-03-29 16:29   ` Darrick J. Wong
  2024-03-30  6:00     ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2024-03-29 16:29 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Thu, Mar 28, 2024 at 08:02:55AM +0100, Christoph Hellwig wrote:
> xfs_reflink_end_cow_extent currently caps the range it works on to
> fit both the existing extent (or hole) in the data fork and the
> new COW range.  For overwrites of fragmented regions that is highly
> inefficient, as we need to split the new region at every boundary,
> just for it to be merge back in the next pass.
> 
> Switch to unmapping the old data using a chain of deferred bmap
> and extent free ops ops first, and then handle remapping the new
> data in one single transaction instead.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_reflink.c | 98 +++++++++++++++++++++++++-------------------
>  1 file changed, 56 insertions(+), 42 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 3c35cd3b2dec5d..a7ee868d79bf02 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -701,6 +701,52 @@ xfs_reflink_cancel_cow_range(
>  	return error;
>  }
>  
> +/*
> + * Unmap any old data covering the COW target.
> + */
> +static void
> +xfs_reflink_unmap_old_data(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*ip,
> +	xfs_fileoff_t		offset_fsb,
> +	xfs_fileoff_t		end_fsb)
> +{
> +	struct xfs_ifork	*ifp = &ip->i_df;
> +	struct xfs_bmbt_irec	got, del;
> +	struct xfs_iext_cursor	icur;
> +
> +	ASSERT(!xfs_need_iread_extents(ifp));
> +
> +	if (!xfs_iext_lookup_extent_before(ip, ifp, &end_fsb, &icur, &got))
> +		return;
> +
> +	while (got.br_startoff + got.br_blockcount > offset_fsb) {

How many bmap and refcount log intent items can we attach to a single
transaction?  It's roughly t_log_res / (32 + 32) though iirc in repair
I simply picked an upper limit of 128 extent mappings before I'd go back
for a fresh transaction.

--D

> +		del = got;
> +		xfs_trim_extent(&del, offset_fsb, end_fsb - offset_fsb);
> +
> +		/* Extent delete may have bumped us forward */
> +		if (!del.br_blockcount)
> +			goto prev_extent;
> +
> +		trace_xfs_reflink_cow_remap_to(ip, &del);
> +		if (isnullstartblock(del.br_startblock)) {
> +			xfs_bmap_del_extent_delay(ip, XFS_DATA_FORK, &icur,
> +					&got, &del);
> +			goto refresh;
> +		}
> +
> +		xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &del);
> +		xfs_refcount_decrease_extent(tp, &del);
> +		xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT,
> +				-del.br_blockcount);
> +prev_extent:
> +		xfs_iext_prev(ifp, &icur);
> +refresh:
> +		if (!xfs_iext_get_extent(ifp, &icur, &got))
> +			break;
> +	}
> +}
> +
>  /*
>   * Remap part of the CoW fork into the data fork.
>   *
> @@ -718,12 +764,11 @@ xfs_reflink_end_cow_extent(
>  	xfs_fileoff_t		end_fsb)
>  {
>  	struct xfs_iext_cursor	icur;
> -	struct xfs_bmbt_irec	got, del, data;
> +	struct xfs_bmbt_irec	got, del;
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_trans	*tp;
>  	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, XFS_COW_FORK);
>  	unsigned int		resblks;
> -	int			nmaps;
>  	int			error;
>  
>  	resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
> @@ -765,9 +810,7 @@ xfs_reflink_end_cow_extent(
>  	/*
>  	 * Only remap real extents that contain data.  With AIO, speculative
>  	 * preallocations can leak into the range we are called upon, and we
> -	 * need to skip them.  Preserve @got for the eventual CoW fork
> -	 * deletion; from now on @del represents the mapping that we're
> -	 * actually remapping.
> +	 * need to skip them.
>  	 */
>  	while (!xfs_bmap_is_written_extent(&got)) {
>  		if (!xfs_iext_next_extent(ifp, &icur, &got) ||
> @@ -776,47 +819,18 @@ xfs_reflink_end_cow_extent(
>  			goto out_cancel;
>  		}
>  	}
> +
> +	/*
> +	 * Preserve @got for the eventual CoW fork deletion; from now on @del
> +	 * represents the mapping that we're actually remapping.
> +	 */
>  	del = got;
>  	xfs_trim_extent(&del, *offset_fsb, end_fsb - *offset_fsb);
> -
> -	/* Grab the corresponding mapping in the data fork. */
> -	nmaps = 1;
> -	error = xfs_bmapi_read(ip, del.br_startoff, del.br_blockcount, &data,
> -			&nmaps, 0);
> -	if (error)
> -		goto out_cancel;
> -
> -	/* We can only remap the smaller of the two extent sizes. */
> -	data.br_blockcount = min(data.br_blockcount, del.br_blockcount);
> -	del.br_blockcount = data.br_blockcount;
> -
>  	trace_xfs_reflink_cow_remap_from(ip, &del);
> -	trace_xfs_reflink_cow_remap_to(ip, &data);
> -
> -	if (xfs_bmap_is_real_extent(&data)) {
> -		/*
> -		 * If the extent we're remapping is backed by storage (written
> -		 * or not), unmap the extent and drop its refcount.
> -		 */
> -		xfs_bmap_unmap_extent(tp, ip, XFS_DATA_FORK, &data);
> -		xfs_refcount_decrease_extent(tp, &data);
> -		xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT,
> -				-data.br_blockcount);
> -	} else if (data.br_startblock == DELAYSTARTBLOCK) {
> -		int		done;
>  
> -		/*
> -		 * If the extent we're remapping is a delalloc reservation,
> -		 * we can use the regular bunmapi function to release the
> -		 * incore state.  Dropping the delalloc reservation takes care
> -		 * of the quota reservation for us.
> -		 */
> -		error = xfs_bunmapi(NULL, ip, data.br_startoff,
> -				data.br_blockcount, 0, 1, &done);
> -		if (error)
> -			goto out_cancel;
> -		ASSERT(done);
> -	}
> +	/* Unmap the old data. */
> +	xfs_reflink_unmap_old_data(tp, ip, del.br_startoff,
> +			del.br_startoff + del.br_blockcount);
>  
>  	/* Free the CoW orphan record. */
>  	xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount);
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/6] xfs: rename the del variable in xfs_reflink_end_cow_extent
  2024-03-28  7:02 ` [PATCH 6/6] xfs: rename the del variable " Christoph Hellwig
@ 2024-03-29 16:31   ` Darrick J. Wong
  2024-03-30  5:59     ` Christoph Hellwig
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2024-03-29 16:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Thu, Mar 28, 2024 at 08:02:56AM +0100, Christoph Hellwig wrote:
> del contains the new extent that we are remapping.  Give it a somewhat
> less confusing name.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_reflink.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index a7ee868d79bf02..15c723396cfdab 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -764,7 +764,7 @@ xfs_reflink_end_cow_extent(
>  	xfs_fileoff_t		end_fsb)
>  {
>  	struct xfs_iext_cursor	icur;
> -	struct xfs_bmbt_irec	got, del;
> +	struct xfs_bmbt_irec	got, new;
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_trans	*tp;
>  	struct xfs_ifork	*ifp = xfs_ifork_ptr(ip, XFS_COW_FORK);
> @@ -821,29 +821,29 @@ xfs_reflink_end_cow_extent(
>  	}
>  
>  	/*
> -	 * Preserve @got for the eventual CoW fork deletion; from now on @del
> +	 * Preserve @got for the eventual CoW fork deletion; from now on @new
>  	 * represents the mapping that we're actually remapping.

I'd have called it 'remap' because that's what the comment says.

--D

>  	 */
> -	del = got;
> -	xfs_trim_extent(&del, *offset_fsb, end_fsb - *offset_fsb);
> -	trace_xfs_reflink_cow_remap_from(ip, &del);
> +	new = got;
> +	xfs_trim_extent(&new, *offset_fsb, end_fsb - *offset_fsb);
> +	trace_xfs_reflink_cow_remap_from(ip, &new);
>  
>  	/* Unmap the old data. */
> -	xfs_reflink_unmap_old_data(tp, ip, del.br_startoff,
> -			del.br_startoff + del.br_blockcount);
> +	xfs_reflink_unmap_old_data(tp, ip, new.br_startoff,
> +			new.br_startoff + new.br_blockcount);
>  
>  	/* Free the CoW orphan record. */
> -	xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount);
> +	xfs_refcount_free_cow_extent(tp, new.br_startblock, new.br_blockcount);
>  
>  	/* Map the new blocks into the data fork. */
> -	xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &del);
> +	xfs_bmap_map_extent(tp, ip, XFS_DATA_FORK, &new);
>  
>  	/* Charge this new data fork mapping to the on-disk quota. */
>  	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
> -			(long)del.br_blockcount);
> +			(long)new.br_blockcount);
>  
>  	/* Remove the mapping from the CoW fork. */
> -	xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
> +	xfs_bmap_del_extent_cow(ip, &icur, &got, &new);
>  
>  	error = xfs_trans_commit(tp);
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> @@ -851,7 +851,7 @@ xfs_reflink_end_cow_extent(
>  		return error;
>  
>  	/* Update the caller about how much progress we made. */
> -	*offset_fsb = del.br_startoff + del.br_blockcount;
> +	*offset_fsb = new.br_startoff + new.br_blockcount;
>  	return 0;
>  
>  out_cancel:
> -- 
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent
  2024-03-29 16:14   ` Darrick J. Wong
@ 2024-03-30  5:56     ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-30  5:56 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs

On Fri, Mar 29, 2024 at 09:14:29AM -0700, Darrick J. Wong wrote:
> This unlocked access was supposed to short-circuit the case where
> there's absolutely nothing in the cow fork at all, so that we don't have
> to wait for a transaction and the ILOCK.  Is the unlocked access
> causing problems?

I've not observeved problems.  But I can't see how this access can
be race free.  Note that this case can only happen if we have racy
direct I/O writes, so I'm not sure trying to optimize performance for
it makes much sense.

> > +	/* No COW extents?  That's easy! */
> > +	if (ifp->if_bytes == 0) {
> > +		*offset_fsb = end_fsb;
> > +		goto out_cancel;
> > +	}
> 
> This is already taken care of by the clause that comes below
> the end of this diff:


> Since xfs_iext_lookup_extent will return false if the cow fork tree is
> empty.
> That said, I think the xfs_iext_count_may_overflow stuff is misplaced --
> we should be querying the cow fork extent and bouncing out early before
> we bother with checking/upgrading the nextents width.  If
> xfs_iext_count_upgrade dirtied the transaction, the early bailout will
> cause a shutdown.
> 
> (The iext upgrade only needs to happen after the bmapi_read.)

Yes, I guess the right thing is to move the upgrade later and then
just let xfs_iext_lookup_extent handle the no extents case.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail
  2024-03-29 16:21   ` Darrick J. Wong
@ 2024-03-30  5:57     ` Christoph Hellwig
  2024-04-02  1:41       ` Darrick J. Wong
  0 siblings, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-30  5:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs

On Fri, Mar 29, 2024 at 09:21:23AM -0700, Darrick J. Wong wrote:
> > +static inline void
> >  xfs_quota_unreserve_blkres(struct xfs_inode *ip, int64_t blocks)
> >  {
> > -	return xfs_quota_reserve_blkres(ip, -blocks);
> > +	/* don't return an error as unreserving quotas can't fail */
> > +	xfs_quota_reserve_blkres(ip, -blocks);
> 
> xfs_quota_reserve_blkres only doesn't fail if the nblks argument is
> actually negative.  Can we have an ASSERT(blocks >= 0) here to guard
> against someone accidentally passing in a negative @blocks here?

Sure.  Or even better just mark blocks as unsigned?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/6] xfs: rename the del variable in xfs_reflink_end_cow_extent
  2024-03-29 16:31   ` Darrick J. Wong
@ 2024-03-30  5:59     ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-30  5:59 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs

On Fri, Mar 29, 2024 at 09:31:08AM -0700, Darrick J. Wong wrote:
> >  	/*
> > -	 * Preserve @got for the eventual CoW fork deletion; from now on @del
> > +	 * Preserve @got for the eventual CoW fork deletion; from now on @new
> >  	 * represents the mapping that we're actually remapping.
> 
> I'd have called it 'remap' because that's what the comment says.

Fine with me.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/6] xfs: optimize extent remapping in xfs_reflink_end_cow_extent
  2024-03-29 16:29   ` Darrick J. Wong
@ 2024-03-30  6:00     ` Christoph Hellwig
  0 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2024-03-30  6:00 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Chandan Babu R, linux-xfs

On Fri, Mar 29, 2024 at 09:29:36AM -0700, Darrick J. Wong wrote:
> On Thu, Mar 28, 2024 at 08:02:55AM +0100, Christoph Hellwig wrote:
> > xfs_reflink_end_cow_extent currently caps the range it works on to
> > fit both the existing extent (or hole) in the data fork and the
> > new COW range.  For overwrites of fragmented regions that is highly
> > inefficient, as we need to split the new region at every boundary,
> > just for it to be merge back in the next pass.
> > 
> > Switch to unmapping the old data using a chain of deferred bmap
> > and extent free ops ops first, and then handle remapping the new
> > data in one single transaction instead.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/xfs/xfs_reflink.c | 98 +++++++++++++++++++++++++-------------------
> >  1 file changed, 56 insertions(+), 42 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> > index 3c35cd3b2dec5d..a7ee868d79bf02 100644
> > --- a/fs/xfs/xfs_reflink.c
> > +++ b/fs/xfs/xfs_reflink.c
> > @@ -701,6 +701,52 @@ xfs_reflink_cancel_cow_range(
> >  	return error;
> >  }
> >  
> > +/*
> > + * Unmap any old data covering the COW target.
> > + */
> > +static void
> > +xfs_reflink_unmap_old_data(
> > +	struct xfs_trans	*tp,
> > +	struct xfs_inode	*ip,
> > +	xfs_fileoff_t		offset_fsb,
> > +	xfs_fileoff_t		end_fsb)
> > +{
> > +	struct xfs_ifork	*ifp = &ip->i_df;
> > +	struct xfs_bmbt_irec	got, del;
> > +	struct xfs_iext_cursor	icur;
> > +
> > +	ASSERT(!xfs_need_iread_extents(ifp));
> > +
> > +	if (!xfs_iext_lookup_extent_before(ip, ifp, &end_fsb, &icur, &got))
> > +		return;
> > +
> > +	while (got.br_startoff + got.br_blockcount > offset_fsb) {
> 
> How many bmap and refcount log intent items can we attach to a single
> transaction?  It's roughly t_log_res / (32 + 32) though iirc in repair
> I simply picked an upper limit of 128 extent mappings before I'd go back
> for a fresh transaction.

Ah, I didn't even think of a limit, but the log reservation obiously
caps it.  I'll look into simply reusing and documenting the repair
limit.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail
  2024-03-30  5:57     ` Christoph Hellwig
@ 2024-04-02  1:41       ` Darrick J. Wong
  0 siblings, 0 replies; 20+ messages in thread
From: Darrick J. Wong @ 2024-04-02  1:41 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Chandan Babu R, linux-xfs

On Sat, Mar 30, 2024 at 06:57:55AM +0100, Christoph Hellwig wrote:
> On Fri, Mar 29, 2024 at 09:21:23AM -0700, Darrick J. Wong wrote:
> > > +static inline void
> > >  xfs_quota_unreserve_blkres(struct xfs_inode *ip, int64_t blocks)
> > >  {
> > > -	return xfs_quota_reserve_blkres(ip, -blocks);
> > > +	/* don't return an error as unreserving quotas can't fail */
> > > +	xfs_quota_reserve_blkres(ip, -blocks);
> > 
> > xfs_quota_reserve_blkres only doesn't fail if the nblks argument is
> > actually negative.  Can we have an ASSERT(blocks >= 0) here to guard
> > against someone accidentally passing in a negative @blocks here?
> 
> Sure.  Or even better just mark blocks as unsigned?

<shrug> I guess that works.

--D

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-04-02  1:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-28  7:02 RFC: optimize COW end I/O remapping Christoph Hellwig
2024-03-28  7:02 ` [PATCH 1/6] xfs: check if_bytes under the ilock in xfs_reflink_end_cow_extent Christoph Hellwig
2024-03-29 16:14   ` Darrick J. Wong
2024-03-30  5:56     ` Christoph Hellwig
2024-03-28  7:02 ` [PATCH 2/6] xfs: consolidate the xfs_quota_reserve_blkres defintions Christoph Hellwig
2024-03-29 16:16   ` Darrick J. Wong
2024-03-28  7:02 ` [PATCH 3/6] xfs: xfs_quota_unreserve_blkres can't fail Christoph Hellwig
2024-03-29 16:21   ` Darrick J. Wong
2024-03-30  5:57     ` Christoph Hellwig
2024-04-02  1:41       ` Darrick J. Wong
2024-03-28  7:02 ` [PATCH 4/6] xfs: simplify iext overflow checking and upgrade Christoph Hellwig
2024-03-28 22:04   ` Dave Chinner
2024-03-29  4:10     ` Christoph Hellwig
2024-03-29 16:24     ` Darrick J. Wong
2024-03-28  7:02 ` [PATCH 5/6] xfs: optimize extent remapping in xfs_reflink_end_cow_extent Christoph Hellwig
2024-03-29 16:29   ` Darrick J. Wong
2024-03-30  6:00     ` Christoph Hellwig
2024-03-28  7:02 ` [PATCH 6/6] xfs: rename the del variable " Christoph Hellwig
2024-03-29 16:31   ` Darrick J. Wong
2024-03-30  5:59     ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.