linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] xfs: some end COW remapping optimization
@ 2022-02-09  7:36 Gao Xiang
  2022-02-09  7:36 ` [PATCH 1/3] xfs: get rid of LEFT, RIGHT, PREV in xfs_bmap_add_extent_unwritten_real() Gao Xiang
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Gao Xiang @ 2022-02-09  7:36 UTC (permalink / raw)
  To: xfs; +Cc: LKML, Gao Xiang

Hi folks,

Currently, xfs_reflink_end_cow_extent() will unconditionally unmap an
extent from DATA fork and then remap an extent from COW fork. It seems
somewhat ineffective since for many cases we could update real bmbt
records directly by sightly enhancing old
xfs_bmap_add_extent_unwritten_real() implementation, thus reduce some
measurable extra metadata overhead.

It's important to us since, actually, we're planing to use a modified
alway-cow like atomic write approach internally for database
applications, therefore it'd be nice to do some optimization over
simple end COW approach. Also I think it's still generic and can
benefit other reflink use cases as well.

I did some tests with ramdisk in order to measure metadata overhead:

echo 1 > /sys/fs/xfs/debug/always_cow
mkfs.xfs -f -mreflink=1 /dev/ram0
mount /dev/ram0 testdir
fio -filename=testdir/1 -size=1G -ioengine=psync -bs=4k -rw=randwrite -overwrite=1 -direct=1 -end_fsync=1 -name=job1

Test results as below:
Vanilla:
(1)   iops        : min= 7986, max=16434, avg=12505.76, stdev=2400.05, samples=41
(2)   iops        : min= 7636, max=16376, avg=12474.19, stdev=2258.18, samples=42
(3)   iops        : min= 8346, max=16439, avg=12227.95, stdev=2432.12, samples=42
(4)   iops        : min= 8580, max=16496, avg=12779.41, stdev=2297.42, samples=41
(5)   iops        : min= 8286, max=16556, avg=12500.76, stdev=2123.90, samples=41

Patched:
(1)   iops        : min= 7086, max=17132, avg=12931.20, stdev=2729.10, samples=40
(2)   iops        : min= 7704, max=17508, avg=13204.62, stdev=2507.70, samples=39
(3)   iops        : min= 8736, max=17634, avg=13253.08, stdev=2545.18, samples=39
(4)   iops        : min= 7188, max=17550, avg=12928.40, stdev=2633.64, samples=40
(5)   iops        : min= 8268, max=17446, avg=12837.55, stdev=2717.98, samples=40

It's still immature so far yet xfstests seems survived.
Comments are much welcomed and thanks for your time!

Thanks,
Gao Xiang

Gao Xiang (3):
  xfs: get rid of LEFT, RIGHT, PREV in
    xfs_bmap_add_extent_unwritten_real()
  xfs: introduce xfs_bmap_update_extent_real()
  xfs: introduce xfs_bremapi_from_cowfork()

 fs/xfs/libxfs/xfs_bmap.c | 378 +++++++++++++++++++++++++--------------
 fs/xfs/libxfs/xfs_bmap.h |   7 +-
 fs/xfs/xfs_reflink.c     |  24 +--
 3 files changed, 253 insertions(+), 156 deletions(-)

-- 
2.24.4


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] xfs: get rid of LEFT, RIGHT, PREV in xfs_bmap_add_extent_unwritten_real()
  2022-02-09  7:36 [PATCH 0/3] xfs: some end COW remapping optimization Gao Xiang
@ 2022-02-09  7:36 ` Gao Xiang
  2022-02-09  7:36 ` [PATCH 2/3] xfs: introduce xfs_bmap_update_extent_real() Gao Xiang
  2022-02-09  7:36 ` [PATCH 3/3] xfs: introduce xfs_bremapi_from_cowfork() Gao Xiang
  2 siblings, 0 replies; 6+ messages in thread
From: Gao Xiang @ 2022-02-09  7:36 UTC (permalink / raw)
  To: xfs; +Cc: LKML, Gao Xiang

It doesn't seems that such macros are easier to read. Also, they
could polluate the identifier namespace.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 200 +++++++++++++++++++--------------------
 1 file changed, 96 insertions(+), 104 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 74198dd82b03..14d1a806ba15 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1935,11 +1935,11 @@ xfs_bmap_add_extent_delay_real(
 int					/* error */
 xfs_bmap_add_extent_unwritten_real(
 	struct xfs_trans	*tp,
-	xfs_inode_t		*ip,	/* incore inode pointer */
+	struct xfs_inode	*ip,	/* incore inode pointer */
 	int			whichfork,
 	struct xfs_iext_cursor	*icur,
 	struct xfs_btree_cur	**curp,	/* if *curp is null, not a btree */
-	xfs_bmbt_irec_t		*new,	/* new data to add to file extents */
+	struct xfs_bmbt_irec	*new,	/* new data to add to file extents */
 	int			*logflagsp) /* inode logging flags */
 {
 	struct xfs_btree_cur	*cur;	/* btree cursor */
@@ -1947,8 +1947,8 @@ xfs_bmap_add_extent_unwritten_real(
 	int			i;	/* temp state */
 	struct xfs_ifork	*ifp;	/* inode fork pointer */
 	xfs_fileoff_t		new_endoff;	/* end offset of new entry */
-	xfs_bmbt_irec_t		r[3];	/* neighbor extent entries */
-					/* left is 0, right is 1, prev is 2 */
+	struct xfs_bmbt_irec	left, right;	/* neighbor extent entries */
+	struct xfs_bmbt_irec	prev;		/* previous old extent */
 	int			rval=0;	/* return value (logging flags) */
 	int			state = xfs_bmap_fork_to_state(whichfork);
 	struct xfs_mount	*mp = ip->i_mount;
@@ -1963,44 +1963,40 @@ xfs_bmap_add_extent_unwritten_real(
 
 	XFS_STATS_INC(mp, xs_add_exlist);
 
-#define	LEFT		r[0]
-#define	RIGHT		r[1]
-#define	PREV		r[2]
-
 	/*
 	 * Set up a bunch of variables to make the tests simpler.
 	 */
 	error = 0;
-	xfs_iext_get_extent(ifp, icur, &PREV);
-	ASSERT(new->br_state != PREV.br_state);
+	xfs_iext_get_extent(ifp, icur, &prev);
+	ASSERT(new->br_state != prev.br_state);
 	new_endoff = new->br_startoff + new->br_blockcount;
-	ASSERT(PREV.br_startoff <= new->br_startoff);
-	ASSERT(PREV.br_startoff + PREV.br_blockcount >= new_endoff);
+	ASSERT(prev.br_startoff <= new->br_startoff);
+	ASSERT(prev.br_startoff + prev.br_blockcount >= new_endoff);
 
 	/*
 	 * Set flags determining what part of the previous oldext allocation
 	 * extent is being replaced by a newext allocation.
 	 */
-	if (PREV.br_startoff == new->br_startoff)
+	if (prev.br_startoff == new->br_startoff)
 		state |= BMAP_LEFT_FILLING;
-	if (PREV.br_startoff + PREV.br_blockcount == new_endoff)
+	if (prev.br_startoff + prev.br_blockcount == new_endoff)
 		state |= BMAP_RIGHT_FILLING;
 
 	/*
 	 * Check and set flags if this segment has a left neighbor.
 	 * Don't set contiguous if the combined extent would be too large.
 	 */
-	if (xfs_iext_peek_prev_extent(ifp, icur, &LEFT)) {
+	if (xfs_iext_peek_prev_extent(ifp, icur, &left)) {
 		state |= BMAP_LEFT_VALID;
-		if (isnullstartblock(LEFT.br_startblock))
+		if (isnullstartblock(left.br_startblock))
 			state |= BMAP_LEFT_DELAY;
 	}
 
 	if ((state & BMAP_LEFT_VALID) && !(state & BMAP_LEFT_DELAY) &&
-	    LEFT.br_startoff + LEFT.br_blockcount == new->br_startoff &&
-	    LEFT.br_startblock + LEFT.br_blockcount == new->br_startblock &&
-	    LEFT.br_state == new->br_state &&
-	    LEFT.br_blockcount + new->br_blockcount <= MAXEXTLEN)
+	    left.br_startoff + left.br_blockcount == new->br_startoff &&
+	    left.br_startblock + left.br_blockcount == new->br_startblock &&
+	    left.br_state == new->br_state &&
+	    left.br_blockcount + new->br_blockcount <= MAXEXTLEN)
 		state |= BMAP_LEFT_CONTIG;
 
 	/*
@@ -2008,22 +2004,22 @@ xfs_bmap_add_extent_unwritten_real(
 	 * Don't set contiguous if the combined extent would be too large.
 	 * Also check for all-three-contiguous being too large.
 	 */
-	if (xfs_iext_peek_next_extent(ifp, icur, &RIGHT)) {
+	if (xfs_iext_peek_next_extent(ifp, icur, &right)) {
 		state |= BMAP_RIGHT_VALID;
-		if (isnullstartblock(RIGHT.br_startblock))
+		if (isnullstartblock(right.br_startblock))
 			state |= BMAP_RIGHT_DELAY;
 	}
 
 	if ((state & BMAP_RIGHT_VALID) && !(state & BMAP_RIGHT_DELAY) &&
-	    new_endoff == RIGHT.br_startoff &&
-	    new->br_startblock + new->br_blockcount == RIGHT.br_startblock &&
-	    new->br_state == RIGHT.br_state &&
-	    new->br_blockcount + RIGHT.br_blockcount <= MAXEXTLEN &&
+	    new_endoff == right.br_startoff &&
+	    new->br_startblock + new->br_blockcount == right.br_startblock &&
+	    new->br_state == right.br_state &&
+	    new->br_blockcount + right.br_blockcount <= MAXEXTLEN &&
 	    ((state & (BMAP_LEFT_CONTIG | BMAP_LEFT_FILLING |
 		       BMAP_RIGHT_FILLING)) !=
 		      (BMAP_LEFT_CONTIG | BMAP_LEFT_FILLING |
 		       BMAP_RIGHT_FILLING) ||
-	     LEFT.br_blockcount + new->br_blockcount + RIGHT.br_blockcount
+	     left.br_blockcount + new->br_blockcount + right.br_blockcount
 			<= MAXEXTLEN))
 		state |= BMAP_RIGHT_CONTIG;
 
@@ -2038,18 +2034,18 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Setting all of a previous oldext extent to newext.
 		 * The left and right neighbors are both contiguous with new.
 		 */
-		LEFT.br_blockcount += PREV.br_blockcount + RIGHT.br_blockcount;
+		left.br_blockcount += prev.br_blockcount + right.br_blockcount;
 
 		xfs_iext_remove(ip, icur, state);
 		xfs_iext_remove(ip, icur, state);
 		xfs_iext_prev(ifp, icur);
-		xfs_iext_update_extent(ip, state, icur, &LEFT);
+		xfs_iext_update_extent(ip, state, icur, &left);
 		ifp->if_nextents -= 2;
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = XFS_ILOG_CORE;
-			error = xfs_bmbt_lookup_eq(cur, &RIGHT, &i);
+			error = xfs_bmbt_lookup_eq(cur, &right, &i);
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
@@ -2080,7 +2076,7 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &LEFT);
+			error = xfs_bmbt_update(cur, &left);
 			if (error)
 				goto done;
 		}
@@ -2091,17 +2087,17 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Setting all of a previous oldext extent to newext.
 		 * The left neighbor is contiguous, the right is not.
 		 */
-		LEFT.br_blockcount += PREV.br_blockcount;
+		left.br_blockcount += prev.br_blockcount;
 
 		xfs_iext_remove(ip, icur, state);
 		xfs_iext_prev(ifp, icur);
-		xfs_iext_update_extent(ip, state, icur, &LEFT);
+		xfs_iext_update_extent(ip, state, icur, &left);
 		ifp->if_nextents--;
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = XFS_ILOG_CORE;
-			error = xfs_bmbt_lookup_eq(cur, &PREV, &i);
+			error = xfs_bmbt_lookup_eq(cur, &prev, &i);
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
@@ -2120,7 +2116,7 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &LEFT);
+			error = xfs_bmbt_update(cur, &left);
 			if (error)
 				goto done;
 		}
@@ -2131,20 +2127,20 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Setting all of a previous oldext extent to newext.
 		 * The right neighbor is contiguous, the left is not.
 		 */
-		PREV.br_blockcount += RIGHT.br_blockcount;
-		PREV.br_state = new->br_state;
+		prev.br_blockcount += right.br_blockcount;
+		prev.br_state = new->br_state;
 
 		xfs_iext_next(ifp, icur);
 		xfs_iext_remove(ip, icur, state);
 		xfs_iext_prev(ifp, icur);
-		xfs_iext_update_extent(ip, state, icur, &PREV);
+		xfs_iext_update_extent(ip, state, icur, &prev);
 		ifp->if_nextents--;
 
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = XFS_ILOG_CORE;
-			error = xfs_bmbt_lookup_eq(cur, &RIGHT, &i);
+			error = xfs_bmbt_lookup_eq(cur, &right, &i);
 			if (error)
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
@@ -2163,7 +2159,7 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &PREV);
+			error = xfs_bmbt_update(cur, &prev);
 			if (error)
 				goto done;
 		}
@@ -2175,12 +2171,12 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Neither the left nor right neighbors are contiguous with
 		 * the new one.
 		 */
-		PREV.br_state = new->br_state;
-		xfs_iext_update_extent(ip, state, icur, &PREV);
+		prev.br_state = new->br_state;
+		xfs_iext_update_extent(ip, state, icur, &prev);
 
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = 0;
 			error = xfs_bmbt_lookup_eq(cur, new, &i);
 			if (error)
@@ -2189,7 +2185,7 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &PREV);
+			error = xfs_bmbt_update(cur, &prev);
 			if (error)
 				goto done;
 		}
@@ -2200,20 +2196,20 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Setting the first part of a previous oldext extent to newext.
 		 * The left neighbor is contiguous.
 		 */
-		LEFT.br_blockcount += new->br_blockcount;
+		left.br_blockcount += new->br_blockcount;
 
-		old = PREV;
-		PREV.br_startoff += new->br_blockcount;
-		PREV.br_startblock += new->br_blockcount;
-		PREV.br_blockcount -= new->br_blockcount;
+		old = prev;
+		prev.br_startoff += new->br_blockcount;
+		prev.br_startblock += new->br_blockcount;
+		prev.br_blockcount -= new->br_blockcount;
 
-		xfs_iext_update_extent(ip, state, icur, &PREV);
+		xfs_iext_update_extent(ip, state, icur, &prev);
 		xfs_iext_prev(ifp, icur);
-		xfs_iext_update_extent(ip, state, icur, &LEFT);
+		xfs_iext_update_extent(ip, state, icur, &left);
 
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = 0;
 			error = xfs_bmbt_lookup_eq(cur, &old, &i);
 			if (error)
@@ -2222,13 +2218,13 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &PREV);
+			error = xfs_bmbt_update(cur, &prev);
 			if (error)
 				goto done;
 			error = xfs_btree_decrement(cur, 0, &i);
 			if (error)
 				goto done;
-			error = xfs_bmbt_update(cur, &LEFT);
+			error = xfs_bmbt_update(cur, &left);
 			if (error)
 				goto done;
 		}
@@ -2239,18 +2235,18 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Setting the first part of a previous oldext extent to newext.
 		 * The left neighbor is not contiguous.
 		 */
-		old = PREV;
-		PREV.br_startoff += new->br_blockcount;
-		PREV.br_startblock += new->br_blockcount;
-		PREV.br_blockcount -= new->br_blockcount;
+		old = prev;
+		prev.br_startoff += new->br_blockcount;
+		prev.br_startblock += new->br_blockcount;
+		prev.br_blockcount -= new->br_blockcount;
 
-		xfs_iext_update_extent(ip, state, icur, &PREV);
+		xfs_iext_update_extent(ip, state, icur, &prev);
 		xfs_iext_insert(ip, icur, new, state);
 		ifp->if_nextents++;
 
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = XFS_ILOG_CORE;
 			error = xfs_bmbt_lookup_eq(cur, &old, &i);
 			if (error)
@@ -2259,7 +2255,7 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &PREV);
+			error = xfs_bmbt_update(cur, &prev);
 			if (error)
 				goto done;
 			cur->bc_rec.b = *new;
@@ -2277,20 +2273,20 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Setting the last part of a previous oldext extent to newext.
 		 * The right neighbor is contiguous with the new allocation.
 		 */
-		old = PREV;
-		PREV.br_blockcount -= new->br_blockcount;
+		old = prev;
+		prev.br_blockcount -= new->br_blockcount;
 
-		RIGHT.br_startoff = new->br_startoff;
-		RIGHT.br_startblock = new->br_startblock;
-		RIGHT.br_blockcount += new->br_blockcount;
+		right.br_startoff = new->br_startoff;
+		right.br_startblock = new->br_startblock;
+		right.br_blockcount += new->br_blockcount;
 
-		xfs_iext_update_extent(ip, state, icur, &PREV);
+		xfs_iext_update_extent(ip, state, icur, &prev);
 		xfs_iext_next(ifp, icur);
-		xfs_iext_update_extent(ip, state, icur, &RIGHT);
+		xfs_iext_update_extent(ip, state, icur, &right);
 
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = 0;
 			error = xfs_bmbt_lookup_eq(cur, &old, &i);
 			if (error)
@@ -2299,13 +2295,13 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &PREV);
+			error = xfs_bmbt_update(cur, &prev);
 			if (error)
 				goto done;
 			error = xfs_btree_increment(cur, 0, &i);
 			if (error)
 				goto done;
-			error = xfs_bmbt_update(cur, &RIGHT);
+			error = xfs_bmbt_update(cur, &right);
 			if (error)
 				goto done;
 		}
@@ -2316,17 +2312,17 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Setting the last part of a previous oldext extent to newext.
 		 * The right neighbor is not contiguous.
 		 */
-		old = PREV;
-		PREV.br_blockcount -= new->br_blockcount;
+		old = prev;
+		prev.br_blockcount -= new->br_blockcount;
 
-		xfs_iext_update_extent(ip, state, icur, &PREV);
+		xfs_iext_update_extent(ip, state, icur, &prev);
 		xfs_iext_next(ifp, icur);
 		xfs_iext_insert(ip, icur, new, state);
 		ifp->if_nextents++;
 
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = XFS_ILOG_CORE;
 			error = xfs_bmbt_lookup_eq(cur, &old, &i);
 			if (error)
@@ -2335,7 +2331,7 @@ xfs_bmap_add_extent_unwritten_real(
 				error = -EFSCORRUPTED;
 				goto done;
 			}
-			error = xfs_bmbt_update(cur, &PREV);
+			error = xfs_bmbt_update(cur, &prev);
 			if (error)
 				goto done;
 			error = xfs_bmbt_lookup_eq(cur, new, &i);
@@ -2360,25 +2356,24 @@ xfs_bmap_add_extent_unwritten_real(
 		 * newext.  Contiguity is impossible here.
 		 * One extent becomes three extents.
 		 */
-		old = PREV;
-		PREV.br_blockcount = new->br_startoff - PREV.br_startoff;
+		old = prev;
+		prev.br_blockcount = new->br_startoff - prev.br_startoff;
 
-		r[0] = *new;
-		r[1].br_startoff = new_endoff;
-		r[1].br_blockcount =
+		right.br_startoff = new_endoff;
+		right.br_blockcount =
 			old.br_startoff + old.br_blockcount - new_endoff;
-		r[1].br_startblock = new->br_startblock + new->br_blockcount;
-		r[1].br_state = PREV.br_state;
+		right.br_startblock = new->br_startblock + new->br_blockcount;
+		right.br_state = prev.br_state;
 
-		xfs_iext_update_extent(ip, state, icur, &PREV);
+		xfs_iext_update_extent(ip, state, icur, &prev);
 		xfs_iext_next(ifp, icur);
-		xfs_iext_insert(ip, icur, &r[1], state);
-		xfs_iext_insert(ip, icur, &r[0], state);
+		xfs_iext_insert(ip, icur, &right, state);
+		xfs_iext_insert(ip, icur, new, state);
 		ifp->if_nextents += 2;
 
-		if (cur == NULL)
+		if (cur == NULL) {
 			rval = XFS_ILOG_CORE | XFS_ILOG_DEXT;
-		else {
+		} else {
 			rval = XFS_ILOG_CORE;
 			error = xfs_bmbt_lookup_eq(cur, &old, &i);
 			if (error)
@@ -2388,11 +2383,11 @@ xfs_bmap_add_extent_unwritten_real(
 				goto done;
 			}
 			/* new right extent - oldext */
-			error = xfs_bmbt_update(cur, &r[1]);
+			error = xfs_bmbt_update(cur, &right);
 			if (error)
 				goto done;
 			/* new left extent - oldext */
-			cur->bc_rec.b = PREV;
+			cur->bc_rec.b = prev;
 			if ((error = xfs_btree_insert(cur, &i)))
 				goto done;
 			if (XFS_IS_CORRUPT(mp, i != 1)) {
@@ -2459,9 +2454,6 @@ xfs_bmap_add_extent_unwritten_real(
 done:
 	*logflagsp |= rval;
 	return error;
-#undef	LEFT
-#undef	RIGHT
-#undef	PREV
 }
 
 /*
-- 
2.24.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] xfs: introduce xfs_bmap_update_extent_real()
  2022-02-09  7:36 [PATCH 0/3] xfs: some end COW remapping optimization Gao Xiang
  2022-02-09  7:36 ` [PATCH 1/3] xfs: get rid of LEFT, RIGHT, PREV in xfs_bmap_add_extent_unwritten_real() Gao Xiang
@ 2022-02-09  7:36 ` Gao Xiang
  2022-02-09  7:36 ` [PATCH 3/3] xfs: introduce xfs_bremapi_from_cowfork() Gao Xiang
  2 siblings, 0 replies; 6+ messages in thread
From: Gao Xiang @ 2022-02-09  7:36 UTC (permalink / raw)
  To: xfs; +Cc: LKML, Gao Xiang

Previously, xfs_bmap_add_extent_unwritten_real() was just used for
unwritten conversion. However, the code could be sightly modified
to update a real allocated extent. It can be then used to avoid
unnecessary bmbt unmap due to end COW.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 65 ++++++++++++++++++++++++++--------------
 fs/xfs/libxfs/xfs_bmap.h |  4 +--
 fs/xfs/xfs_reflink.c     |  5 ++--
 3 files changed, 47 insertions(+), 27 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 14d1a806ba15..a10476dee701 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1930,22 +1930,26 @@ xfs_bmap_add_extent_delay_real(
 }
 
 /*
- * Convert an unwritten allocation to a real allocation or vice versa.
+ * Update a real allocated extent (including converting an unwritten
+ * allocation to a real allocation or vice versa.)
  */
 int					/* error */
-xfs_bmap_add_extent_unwritten_real(
+xfs_bmap_update_extent_real(
 	struct xfs_trans	*tp,
 	struct xfs_inode	*ip,	/* incore inode pointer */
 	int			whichfork,
 	struct xfs_iext_cursor	*icur,
 	struct xfs_btree_cur	**curp,	/* if *curp is null, not a btree */
 	struct xfs_bmbt_irec	*new,	/* new data to add to file extents */
-	int			*logflagsp) /* inode logging flags */
+	int			*logflagsp, /* inode logging flags */
+	bool			convert)
 {
 	struct xfs_btree_cur	*cur;	/* btree cursor */
 	int			error;	/* error return value */
 	int			i;	/* temp state */
 	struct xfs_ifork	*ifp;	/* inode fork pointer */
+	xfs_fileoff_t		del_startoff;	/* start offset of del entry */
+	xfs_exntst_t		del_state;
 	xfs_fileoff_t		new_endoff;	/* end offset of new entry */
 	struct xfs_bmbt_irec	left, right;	/* neighbor extent entries */
 	struct xfs_bmbt_irec	prev;		/* previous old extent */
@@ -1953,6 +1957,7 @@ xfs_bmap_add_extent_unwritten_real(
 	int			state = xfs_bmap_fork_to_state(whichfork);
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_bmbt_irec	old;
+	int			tmp_logflags;	/* partial log flag return val */
 
 	*logflagsp = 0;
 
@@ -1968,8 +1973,11 @@ xfs_bmap_add_extent_unwritten_real(
 	 */
 	error = 0;
 	xfs_iext_get_extent(ifp, icur, &prev);
-	ASSERT(new->br_state != prev.br_state);
+	ASSERT(!convert || new->br_state != prev.br_state);
 	new_endoff = new->br_startoff + new->br_blockcount;
+	del_startoff = prev.br_startblock +
+		new->br_startoff - prev.br_startoff;
+	del_state = prev.br_state;
 	ASSERT(prev.br_startoff <= new->br_startoff);
 	ASSERT(prev.br_startoff + prev.br_blockcount >= new_endoff);
 
@@ -2129,6 +2137,7 @@ xfs_bmap_add_extent_unwritten_real(
 		 */
 		prev.br_blockcount += right.br_blockcount;
 		prev.br_state = new->br_state;
+		prev.br_startblock = new->br_startblock;
 
 		xfs_iext_next(ifp, icur);
 		xfs_iext_remove(ip, icur, state);
@@ -2171,6 +2180,7 @@ xfs_bmap_add_extent_unwritten_real(
 		 * Neither the left nor right neighbors are contiguous with
 		 * the new one.
 		 */
+		prev.br_startblock = new->br_startblock;
 		prev.br_state = new->br_state;
 		xfs_iext_update_extent(ip, state, icur, &prev);
 
@@ -2362,7 +2372,8 @@ xfs_bmap_add_extent_unwritten_real(
 		right.br_startoff = new_endoff;
 		right.br_blockcount =
 			old.br_startoff + old.br_blockcount - new_endoff;
-		right.br_startblock = new->br_startblock + new->br_blockcount;
+		right.br_startblock = old.br_startblock + prev.br_blockcount +
+			new->br_blockcount;
 		right.br_state = prev.br_state;
 
 		xfs_iext_update_extent(ip, state, icur, &prev);
@@ -2430,20 +2441,30 @@ xfs_bmap_add_extent_unwritten_real(
 	}
 
 	/* update reverse mappings */
-	xfs_rmap_convert_extent(mp, tp, ip, whichfork, new);
+	if (!convert) {
+		old = *new;
+		old.br_startblock = del_startoff;
+		old.br_state = del_state;
+		xfs_rmap_unmap_extent(tp, ip, whichfork, &old);
+		xfs_rmap_map_extent(tp, ip, whichfork, new);
+	} else {
+		xfs_rmap_convert_extent(mp, tp, ip, whichfork, new);
+	}
 
-	/* convert to a btree if necessary */
+	/* convert to a btree or extents if necessary */
 	if (xfs_bmap_needs_btree(ip, whichfork)) {
-		int	tmp_logflags;	/* partial log flag return val */
-
 		ASSERT(cur == NULL);
 		error = xfs_bmap_extents_to_btree(tp, ip, &cur, 0,
 				&tmp_logflags, whichfork);
-		*logflagsp |= tmp_logflags;
-		if (error)
-			goto done;
+	} else if (!convert) {
+		error = xfs_bmap_btree_to_extents(tp, ip, cur,
+				&tmp_logflags, whichfork);
 	}
 
+	*logflagsp |= tmp_logflags;
+	if (error)
+		goto done;
+
 	/* clear out the allocated field, done with it now in any case. */
 	if (cur) {
 		cur->bc_ino.allocated = 0;
@@ -4216,8 +4237,8 @@ xfs_bmapi_convert_unwritten(
 			return error;
 	}
 
-	error = xfs_bmap_add_extent_unwritten_real(bma->tp, bma->ip, whichfork,
-			&bma->icur, &bma->cur, mval, &tmp_logflags);
+	error = xfs_bmap_update_extent_real(bma->tp, bma->ip, whichfork,
+			&bma->icur, &bma->cur, mval, &tmp_logflags, true);
 	/*
 	 * Log the inode core unconditionally in the unwritten extent conversion
 	 * path because the conversion might not have done so (e.g., if the
@@ -5444,9 +5465,9 @@ __xfs_bunmapi(
 				del.br_blockcount = mod;
 			}
 			del.br_state = XFS_EXT_UNWRITTEN;
-			error = xfs_bmap_add_extent_unwritten_real(tp, ip,
+			error = xfs_bmap_update_extent_real(tp, ip,
 					whichfork, &icur, &cur, &del,
-					&logflags);
+					&logflags, true);
 			if (error)
 				goto error0;
 			goto nodelete;
@@ -5503,18 +5524,18 @@ __xfs_bunmapi(
 				prev.br_startblock += mod;
 				prev.br_blockcount -= mod;
 				prev.br_state = XFS_EXT_UNWRITTEN;
-				error = xfs_bmap_add_extent_unwritten_real(tp,
-						ip, whichfork, &icur, &cur,
-						&prev, &logflags);
+				error = xfs_bmap_update_extent_real(tp, ip,
+						whichfork, &icur, &cur, &prev,
+						&logflags, true);
 				if (error)
 					goto error0;
 				goto nodelete;
 			} else {
 				ASSERT(del.br_state == XFS_EXT_NORM);
 				del.br_state = XFS_EXT_UNWRITTEN;
-				error = xfs_bmap_add_extent_unwritten_real(tp,
-						ip, whichfork, &icur, &cur,
-						&del, &logflags);
+				error = xfs_bmap_update_extent_real(tp, ip,
+						whichfork, &icur, &cur, &del,
+						&logflags, true);
 				if (error)
 					goto error0;
 				goto nodelete;
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 03d9aaf87413..c52ff94786e2 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -216,10 +216,10 @@ int	xfs_bmapi_reserve_delalloc(struct xfs_inode *ip, int whichfork,
 		int eof);
 int	xfs_bmapi_convert_delalloc(struct xfs_inode *ip, int whichfork,
 		xfs_off_t offset, struct iomap *iomap, unsigned int *seq);
-int	xfs_bmap_add_extent_unwritten_real(struct xfs_trans *tp,
+int	xfs_bmap_update_extent_real(struct xfs_trans *tp,
 		struct xfs_inode *ip, int whichfork,
 		struct xfs_iext_cursor *icur, struct xfs_btree_cur **curp,
-		struct xfs_bmbt_irec *new, int *logflagsp);
+		struct xfs_bmbt_irec *new, int *logflagsp, bool convert);
 
 enum xfs_bmap_intent_type {
 	XFS_BMAP_MAP = 1,
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index db70060e7bf6..276387a6a85d 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -266,9 +266,8 @@ xfs_reflink_convert_cow_locked(
 			continue;
 
 		got.br_state = XFS_EXT_NORM;
-		error = xfs_bmap_add_extent_unwritten_real(NULL, ip,
-				XFS_COW_FORK, &icur, &dummy_cur, &got,
-				&dummy_logflags);
+		error = xfs_bmap_update_extent_real(NULL, ip, XFS_COW_FORK,
+				&icur, &dummy_cur, &got, &dummy_logflags, true);
 		if (error)
 			return error;
 	} while (xfs_iext_next_extent(ip->i_cowfp, &icur, &got));
-- 
2.24.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] xfs: introduce xfs_bremapi_from_cowfork()
  2022-02-09  7:36 [PATCH 0/3] xfs: some end COW remapping optimization Gao Xiang
  2022-02-09  7:36 ` [PATCH 1/3] xfs: get rid of LEFT, RIGHT, PREV in xfs_bmap_add_extent_unwritten_real() Gao Xiang
  2022-02-09  7:36 ` [PATCH 2/3] xfs: introduce xfs_bmap_update_extent_real() Gao Xiang
@ 2022-02-09  7:36 ` Gao Xiang
  2022-02-16  1:24   ` Darrick J. Wong
  2 siblings, 1 reply; 6+ messages in thread
From: Gao Xiang @ 2022-02-09  7:36 UTC (permalink / raw)
  To: xfs; +Cc: LKML, Gao Xiang

Previously, xfs_reflink_end_cow_extent() will unconditionally unmap
the corresponding old extent and remap an extent from COW fork.
However, it seems somewhat ineffective since the old bmbt records can
be directly updated for many cases instead.

This patch uses introduced xfs_bmap_update_extent_real() in the
previous patch for most extent inclusive cases or it will fall back
to the old way if such replacement is not possible.

Actually, we're planing to use a modified alway-cow like atomic write
approach internally, therefore it'd be nice to do some optimization
to reduce some metadata overhead.

Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 117 ++++++++++++++++++++++++++++++++++++---
 fs/xfs/libxfs/xfs_bmap.h |   3 +
 fs/xfs/xfs_reflink.c     |  19 +------
 3 files changed, 112 insertions(+), 27 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index a10476dee701..0e132f811f7a 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5880,6 +5880,114 @@ xfs_bmap_collapse_extents(
 	return error;
 }
 
+/* Deferred mapping is only for real extents in the data fork. */
+static bool
+xfs_bmap_is_update_needed(
+	struct xfs_bmbt_irec	*bmap)
+{
+	return  bmap->br_startblock != HOLESTARTBLOCK &&
+		bmap->br_startblock != DELAYSTARTBLOCK;
+}
+
+/* del is an extent from COW fork */
+int
+xfs_bremapi_from_cowfork(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	struct xfs_bmbt_irec	*icow)
+{
+	int			error;
+	xfs_filblks_t		rlen;
+
+	/* Use the old (unmap-remap) way for real-time inodes instead */
+	if (!XFS_IS_REALTIME_INODE(ip) && xfs_bmap_is_update_needed(icow)) {
+		xfs_fileoff_t		start, end, max_len;
+		struct xfs_bmbt_irec	got;
+		struct xfs_iext_cursor	icur;
+		struct xfs_btree_cur	*cur = NULL;
+		struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+		int			logflags = 0;
+
+		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
+		if (error)
+			return error;
+
+		max_len = xfs_refcount_max_unmap(tp->t_log_res);
+		if (max_len < icow->br_blockcount) {
+			icow->br_startoff += icow->br_blockcount - max_len;
+			icow->br_startblock += icow->br_blockcount - max_len;
+			icow->br_blockcount = max_len;
+		}
+
+		end = icow->br_startoff + icow->br_blockcount;
+		if (!xfs_iext_count(ifp) || !xfs_iext_lookup_extent_before(ip,
+				ifp, &end, &icur, &got) ||
+		    isnullstartblock(got.br_startblock) ||
+		    icow->br_startoff + icow->br_blockcount > got.br_startoff +
+				got.br_blockcount) {
+			error = -EAGAIN;
+		} else {
+			end = icow->br_startoff + icow->br_blockcount;
+			start = XFS_FILEOFF_MAX(icow->br_startoff,
+						got.br_startoff);
+			ASSERT(start < end);
+
+			/* Trim the extent to what we need */
+			xfs_trim_extent(icow, start, end - start);
+			xfs_trim_extent(&got, start, end - start);
+
+			if (ifp->if_format == XFS_DINODE_FMT_BTREE) {
+				cur = xfs_bmbt_init_cursor(tp->t_mountp, tp, ip,
+							   XFS_DATA_FORK);
+				cur->bc_ino.flags = 0;
+			}
+
+			/*
+			 * Free the CoW orphan record (it should be done here
+			 * before updating extent due to rmapbt update)
+			 */
+			xfs_refcount_free_cow_extent(tp, icow->br_startblock,
+						     icow->br_blockcount);
+
+			xfs_bmap_update_extent_real(tp, ip, XFS_DATA_FORK,
+					&icur, &cur, icow, &logflags, false);
+
+			/* Free previous referenced space */
+			xfs_refcount_decrease_extent(tp, &got);
+
+			trace_xfs_reflink_cow_remap(ip, icow);
+			error = 0;
+		}
+		if (cur)
+			xfs_btree_del_cursor(cur, 0);
+		if (logflags)
+			xfs_trans_log_inode(tp, ip, logflags);
+		if (!error)
+			return 0;
+	}
+
+	rlen = icow->br_blockcount;
+	error = __xfs_bunmapi(tp, ip, icow->br_startoff, &rlen, 0, 1);
+	if (error)
+		return error;
+
+	/* Trim the extent to whatever got unmapped. */
+	xfs_trim_extent(icow, icow->br_startoff + rlen,
+			icow->br_blockcount - rlen);
+	/* Free the CoW orphan record. */
+	xfs_refcount_free_cow_extent(tp, icow->br_startblock,
+				     icow->br_blockcount);
+
+	/* Map the new blocks into the data fork. */
+	xfs_bmap_map_extent(tp, ip, icow);
+
+	/* Charge this new data fork mapping to the on-disk quota. */
+	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
+			(long)icow->br_blockcount);
+	trace_xfs_reflink_cow_remap(ip, icow);
+	return 0;
+}
+
 /* Make sure we won't be right-shifting an extent past the maximum bound. */
 int
 xfs_bmap_can_insert_extents(
@@ -6123,15 +6231,6 @@ xfs_bmap_split_extent(
 	return error;
 }
 
-/* Deferred mapping is only for real extents in the data fork. */
-static bool
-xfs_bmap_is_update_needed(
-	struct xfs_bmbt_irec	*bmap)
-{
-	return  bmap->br_startblock != HOLESTARTBLOCK &&
-		bmap->br_startblock != DELAYSTARTBLOCK;
-}
-
 /* Record a bmap intent. */
 static int
 __xfs_bmap_add(
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index c52ff94786e2..9da1cff41c1c 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -220,6 +220,9 @@ int	xfs_bmap_update_extent_real(struct xfs_trans *tp,
 		struct xfs_inode *ip, int whichfork,
 		struct xfs_iext_cursor *icur, struct xfs_btree_cur **curp,
 		struct xfs_bmbt_irec *new, int *logflagsp, bool convert);
+int
+xfs_bremapi_from_cowfork(struct xfs_trans *tp, struct xfs_inode *ip,
+		struct xfs_bmbt_irec *icow);
 
 enum xfs_bmap_intent_type {
 	XFS_BMAP_MAP = 1,
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 276387a6a85d..75bd2e03cd5b 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -590,7 +590,6 @@ xfs_reflink_end_cow_extent(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
 	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
-	xfs_filblks_t		rlen;
 	unsigned int		resblks;
 	int			error;
 
@@ -651,26 +650,10 @@ xfs_reflink_end_cow_extent(
 		goto out_cancel;
 	}
 
-	/* Unmap the old blocks in the data fork. */
-	rlen = del.br_blockcount;
-	error = __xfs_bunmapi(tp, ip, del.br_startoff, &rlen, 0, 1);
+	error = xfs_bremapi_from_cowfork(tp, ip, &del);
 	if (error)
 		goto out_cancel;
 
-	/* Trim the extent to whatever got unmapped. */
-	xfs_trim_extent(&del, del.br_startoff + rlen, del.br_blockcount - rlen);
-	trace_xfs_reflink_cow_remap(ip, &del);
-
-	/* Free the CoW orphan record. */
-	xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount);
-
-	/* Map the new blocks into the data fork. */
-	xfs_bmap_map_extent(tp, ip, &del);
-
-	/* Charge this new data fork mapping to the on-disk quota. */
-	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
-			(long)del.br_blockcount);
-
 	/* Remove the mapping from the CoW fork. */
 	xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
 
-- 
2.24.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 3/3] xfs: introduce xfs_bremapi_from_cowfork()
  2022-02-09  7:36 ` [PATCH 3/3] xfs: introduce xfs_bremapi_from_cowfork() Gao Xiang
@ 2022-02-16  1:24   ` Darrick J. Wong
  2022-02-16  2:03     ` Gao Xiang
  0 siblings, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2022-02-16  1:24 UTC (permalink / raw)
  To: Gao Xiang; +Cc: xfs, LKML

On Wed, Feb 09, 2022 at 03:36:55PM +0800, Gao Xiang wrote:
> Previously, xfs_reflink_end_cow_extent() will unconditionally unmap
> the corresponding old extent and remap an extent from COW fork.
> However, it seems somewhat ineffective since the old bmbt records can
> be directly updated for many cases instead.
> 
> This patch uses introduced xfs_bmap_update_extent_real() in the
> previous patch for most extent inclusive cases or it will fall back
> to the old way if such replacement is not possible.
> 
> Actually, we're planing to use a modified alway-cow like atomic write
> approach internally, therefore it'd be nice to do some optimization
> to reduce some metadata overhead.
> 
> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 117 ++++++++++++++++++++++++++++++++++++---
>  fs/xfs/libxfs/xfs_bmap.h |   3 +
>  fs/xfs/xfs_reflink.c     |  19 +------
>  3 files changed, 112 insertions(+), 27 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index a10476dee701..0e132f811f7a 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -5880,6 +5880,114 @@ xfs_bmap_collapse_extents(
>  	return error;
>  }
>  
> +/* Deferred mapping is only for real extents in the data fork. */
> +static bool
> +xfs_bmap_is_update_needed(
> +	struct xfs_bmbt_irec	*bmap)
> +{
> +	return  bmap->br_startblock != HOLESTARTBLOCK &&
> +		bmap->br_startblock != DELAYSTARTBLOCK;
> +}
> +
> +/* del is an extent from COW fork */
> +int
> +xfs_bremapi_from_cowfork(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*ip,
> +	struct xfs_bmbt_irec	*icow)
> +{
> +	int			error;
> +	xfs_filblks_t		rlen;
> +
> +	/* Use the old (unmap-remap) way for real-time inodes instead */
> +	if (!XFS_IS_REALTIME_INODE(ip) && xfs_bmap_is_update_needed(icow)) {

When would be be remapping a realtime file with a COW fork?

> +		xfs_fileoff_t		start, end, max_len;
> +		struct xfs_bmbt_irec	got;
> +		struct xfs_iext_cursor	icur;
> +		struct xfs_btree_cur	*cur = NULL;
> +		struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +		int			logflags = 0;
> +
> +		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
> +		if (error)
> +			return error;
> +
> +		max_len = xfs_refcount_max_unmap(tp->t_log_res);
> +		if (max_len < icow->br_blockcount) {
> +			icow->br_startoff += icow->br_blockcount - max_len;
> +			icow->br_startblock += icow->br_blockcount - max_len;
> +			icow->br_blockcount = max_len;
> +		}
> +
> +		end = icow->br_startoff + icow->br_blockcount;
> +		if (!xfs_iext_count(ifp) || !xfs_iext_lookup_extent_before(ip,
> +				ifp, &end, &icur, &got) ||
> +		    isnullstartblock(got.br_startblock) ||
> +		    icow->br_startoff + icow->br_blockcount > got.br_startoff +
> +				got.br_blockcount) {
> +			error = -EAGAIN;
> +		} else {
> +			end = icow->br_startoff + icow->br_blockcount;
> +			start = XFS_FILEOFF_MAX(icow->br_startoff,
> +						got.br_startoff);
> +			ASSERT(start < end);
> +
> +			/* Trim the extent to what we need */
> +			xfs_trim_extent(icow, start, end - start);
> +			xfs_trim_extent(&got, start, end - start);
> +
> +			if (ifp->if_format == XFS_DINODE_FMT_BTREE) {
> +				cur = xfs_bmbt_init_cursor(tp->t_mountp, tp, ip,
> +							   XFS_DATA_FORK);
> +				cur->bc_ino.flags = 0;
> +			}
> +
> +			/*
> +			 * Free the CoW orphan record (it should be done here
> +			 * before updating extent due to rmapbt update)
> +			 */
> +			xfs_refcount_free_cow_extent(tp, icow->br_startblock,
> +						     icow->br_blockcount);
> +
> +			xfs_bmap_update_extent_real(tp, ip, XFS_DATA_FORK,
> +					&icur, &cur, icow, &logflags, false);

Hmm... are you directly updating the data fork mapping record from the
COW fork mapping record?  Is the performance advantage you mentioned
earlier a result of this code no longer logging a bmap map intent item
and reducing the transaction roll count by one?

--D

> +
> +			/* Free previous referenced space */
> +			xfs_refcount_decrease_extent(tp, &got);
> +
> +			trace_xfs_reflink_cow_remap(ip, icow);
> +			error = 0;
> +		}
> +		if (cur)
> +			xfs_btree_del_cursor(cur, 0);
> +		if (logflags)
> +			xfs_trans_log_inode(tp, ip, logflags);
> +		if (!error)
> +			return 0;
> +	}
> +
> +	rlen = icow->br_blockcount;
> +	error = __xfs_bunmapi(tp, ip, icow->br_startoff, &rlen, 0, 1);
> +	if (error)
> +		return error;
> +
> +	/* Trim the extent to whatever got unmapped. */
> +	xfs_trim_extent(icow, icow->br_startoff + rlen,
> +			icow->br_blockcount - rlen);
> +	/* Free the CoW orphan record. */
> +	xfs_refcount_free_cow_extent(tp, icow->br_startblock,
> +				     icow->br_blockcount);
> +
> +	/* Map the new blocks into the data fork. */
> +	xfs_bmap_map_extent(tp, ip, icow);
> +
> +	/* Charge this new data fork mapping to the on-disk quota. */
> +	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
> +			(long)icow->br_blockcount);
> +	trace_xfs_reflink_cow_remap(ip, icow);
> +	return 0;
> +}
> +
>  /* Make sure we won't be right-shifting an extent past the maximum bound. */
>  int
>  xfs_bmap_can_insert_extents(
> @@ -6123,15 +6231,6 @@ xfs_bmap_split_extent(
>  	return error;
>  }
>  
> -/* Deferred mapping is only for real extents in the data fork. */
> -static bool
> -xfs_bmap_is_update_needed(
> -	struct xfs_bmbt_irec	*bmap)
> -{
> -	return  bmap->br_startblock != HOLESTARTBLOCK &&
> -		bmap->br_startblock != DELAYSTARTBLOCK;
> -}
> -
>  /* Record a bmap intent. */
>  static int
>  __xfs_bmap_add(
> diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
> index c52ff94786e2..9da1cff41c1c 100644
> --- a/fs/xfs/libxfs/xfs_bmap.h
> +++ b/fs/xfs/libxfs/xfs_bmap.h
> @@ -220,6 +220,9 @@ int	xfs_bmap_update_extent_real(struct xfs_trans *tp,
>  		struct xfs_inode *ip, int whichfork,
>  		struct xfs_iext_cursor *icur, struct xfs_btree_cur **curp,
>  		struct xfs_bmbt_irec *new, int *logflagsp, bool convert);
> +int
> +xfs_bremapi_from_cowfork(struct xfs_trans *tp, struct xfs_inode *ip,
> +		struct xfs_bmbt_irec *icow);
>  
>  enum xfs_bmap_intent_type {
>  	XFS_BMAP_MAP = 1,
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 276387a6a85d..75bd2e03cd5b 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -590,7 +590,6 @@ xfs_reflink_end_cow_extent(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_trans	*tp;
>  	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
> -	xfs_filblks_t		rlen;
>  	unsigned int		resblks;
>  	int			error;
>  
> @@ -651,26 +650,10 @@ xfs_reflink_end_cow_extent(
>  		goto out_cancel;
>  	}
>  
> -	/* Unmap the old blocks in the data fork. */
> -	rlen = del.br_blockcount;
> -	error = __xfs_bunmapi(tp, ip, del.br_startoff, &rlen, 0, 1);
> +	error = xfs_bremapi_from_cowfork(tp, ip, &del);
>  	if (error)
>  		goto out_cancel;
>  
> -	/* Trim the extent to whatever got unmapped. */
> -	xfs_trim_extent(&del, del.br_startoff + rlen, del.br_blockcount - rlen);
> -	trace_xfs_reflink_cow_remap(ip, &del);
> -
> -	/* Free the CoW orphan record. */
> -	xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount);
> -
> -	/* Map the new blocks into the data fork. */
> -	xfs_bmap_map_extent(tp, ip, &del);
> -
> -	/* Charge this new data fork mapping to the on-disk quota. */
> -	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
> -			(long)del.br_blockcount);
> -
>  	/* Remove the mapping from the CoW fork. */
>  	xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
>  
> -- 
> 2.24.4
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 3/3] xfs: introduce xfs_bremapi_from_cowfork()
  2022-02-16  1:24   ` Darrick J. Wong
@ 2022-02-16  2:03     ` Gao Xiang
  0 siblings, 0 replies; 6+ messages in thread
From: Gao Xiang @ 2022-02-16  2:03 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs, LKML

Hi Darrick,

On Tue, Feb 15, 2022 at 05:24:33PM -0800, Darrick J. Wong wrote:
> On Wed, Feb 09, 2022 at 03:36:55PM +0800, Gao Xiang wrote:
> > Previously, xfs_reflink_end_cow_extent() will unconditionally unmap
> > the corresponding old extent and remap an extent from COW fork.
> > However, it seems somewhat ineffective since the old bmbt records can
> > be directly updated for many cases instead.
> > 
> > This patch uses introduced xfs_bmap_update_extent_real() in the
> > previous patch for most extent inclusive cases or it will fall back
> > to the old way if such replacement is not possible.
> > 
> > Actually, we're planing to use a modified alway-cow like atomic write
> > approach internally, therefore it'd be nice to do some optimization
> > to reduce some metadata overhead.
> > 
> > Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
> > ---
> >  fs/xfs/libxfs/xfs_bmap.c | 117 ++++++++++++++++++++++++++++++++++++---
> >  fs/xfs/libxfs/xfs_bmap.h |   3 +
> >  fs/xfs/xfs_reflink.c     |  19 +------
> >  3 files changed, 112 insertions(+), 27 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index a10476dee701..0e132f811f7a 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -5880,6 +5880,114 @@ xfs_bmap_collapse_extents(
> >  	return error;
> >  }
> >  
> > +/* Deferred mapping is only for real extents in the data fork. */
> > +static bool
> > +xfs_bmap_is_update_needed(
> > +	struct xfs_bmbt_irec	*bmap)
> > +{
> > +	return  bmap->br_startblock != HOLESTARTBLOCK &&
> > +		bmap->br_startblock != DELAYSTARTBLOCK;
> > +}
> > +
> > +/* del is an extent from COW fork */
> > +int
> > +xfs_bremapi_from_cowfork(
> > +	struct xfs_trans	*tp,
> > +	struct xfs_inode	*ip,
> > +	struct xfs_bmbt_irec	*icow)
> > +{
> > +	int			error;
> > +	xfs_filblks_t		rlen;
> > +
> > +	/* Use the old (unmap-remap) way for real-time inodes instead */
> > +	if (!XFS_IS_REALTIME_INODE(ip) && xfs_bmap_is_update_needed(icow)) {
> 
> When would be be remapping a realtime file with a COW fork?

Sorry, I missed this part, this is a necessary check.
Thanks for pointing out this.

> 
> > +		xfs_fileoff_t		start, end, max_len;
> > +		struct xfs_bmbt_irec	got;
> > +		struct xfs_iext_cursor	icur;
> > +		struct xfs_btree_cur	*cur = NULL;
> > +		struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> > +		int			logflags = 0;
> > +
> > +		error = xfs_iread_extents(tp, ip, XFS_DATA_FORK);
> > +		if (error)
> > +			return error;
> > +
> > +		max_len = xfs_refcount_max_unmap(tp->t_log_res);
> > +		if (max_len < icow->br_blockcount) {
> > +			icow->br_startoff += icow->br_blockcount - max_len;
> > +			icow->br_startblock += icow->br_blockcount - max_len;
> > +			icow->br_blockcount = max_len;
> > +		}
> > +
> > +		end = icow->br_startoff + icow->br_blockcount;
> > +		if (!xfs_iext_count(ifp) || !xfs_iext_lookup_extent_before(ip,
> > +				ifp, &end, &icur, &got) ||
> > +		    isnullstartblock(got.br_startblock) ||
> > +		    icow->br_startoff + icow->br_blockcount > got.br_startoff +
> > +				got.br_blockcount) {
> > +			error = -EAGAIN;
> > +		} else {
> > +			end = icow->br_startoff + icow->br_blockcount;
> > +			start = XFS_FILEOFF_MAX(icow->br_startoff,
> > +						got.br_startoff);
> > +			ASSERT(start < end);
> > +
> > +			/* Trim the extent to what we need */
> > +			xfs_trim_extent(icow, start, end - start);
> > +			xfs_trim_extent(&got, start, end - start);
> > +
> > +			if (ifp->if_format == XFS_DINODE_FMT_BTREE) {
> > +				cur = xfs_bmbt_init_cursor(tp->t_mountp, tp, ip,
> > +							   XFS_DATA_FORK);
> > +				cur->bc_ino.flags = 0;
> > +			}
> > +
> > +			/*
> > +			 * Free the CoW orphan record (it should be done here
> > +			 * before updating extent due to rmapbt update)
> > +			 */
> > +			xfs_refcount_free_cow_extent(tp, icow->br_startblock,
> > +						     icow->br_blockcount);
> > +
> > +			xfs_bmap_update_extent_real(tp, ip, XFS_DATA_FORK,
> > +					&icur, &cur, icow, &logflags, false);
> 
> Hmm... are you directly updating the data fork mapping record from the
> COW fork mapping record?  Is the performance advantage you mentioned
> earlier a result of this code no longer logging a bmap map intent item
> and reducing the transaction roll count by one?

Yes, roughly I think it reduces bmap again overhead.

Thanks,
Gao Xiang

> 
> --D
> 
> > +
> > +			/* Free previous referenced space */
> > +			xfs_refcount_decrease_extent(tp, &got);
> > +
> > +			trace_xfs_reflink_cow_remap(ip, icow);
> > +			error = 0;
> > +		}
> > +		if (cur)
> > +			xfs_btree_del_cursor(cur, 0);
> > +		if (logflags)
> > +			xfs_trans_log_inode(tp, ip, logflags);
> > +		if (!error)
> > +			return 0;
> > +	}
> > +
> > +	rlen = icow->br_blockcount;
> > +	error = __xfs_bunmapi(tp, ip, icow->br_startoff, &rlen, 0, 1);
> > +	if (error)
> > +		return error;
> > +
> > +	/* Trim the extent to whatever got unmapped. */
> > +	xfs_trim_extent(icow, icow->br_startoff + rlen,
> > +			icow->br_blockcount - rlen);
> > +	/* Free the CoW orphan record. */
> > +	xfs_refcount_free_cow_extent(tp, icow->br_startblock,
> > +				     icow->br_blockcount);
> > +
> > +	/* Map the new blocks into the data fork. */
> > +	xfs_bmap_map_extent(tp, ip, icow);
> > +
> > +	/* Charge this new data fork mapping to the on-disk quota. */
> > +	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
> > +			(long)icow->br_blockcount);
> > +	trace_xfs_reflink_cow_remap(ip, icow);
> > +	return 0;
> > +}
> > +
> >  /* Make sure we won't be right-shifting an extent past the maximum bound. */
> >  int
> >  xfs_bmap_can_insert_extents(
> > @@ -6123,15 +6231,6 @@ xfs_bmap_split_extent(
> >  	return error;
> >  }
> >  
> > -/* Deferred mapping is only for real extents in the data fork. */
> > -static bool
> > -xfs_bmap_is_update_needed(
> > -	struct xfs_bmbt_irec	*bmap)
> > -{
> > -	return  bmap->br_startblock != HOLESTARTBLOCK &&
> > -		bmap->br_startblock != DELAYSTARTBLOCK;
> > -}
> > -
> >  /* Record a bmap intent. */
> >  static int
> >  __xfs_bmap_add(
> > diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
> > index c52ff94786e2..9da1cff41c1c 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.h
> > +++ b/fs/xfs/libxfs/xfs_bmap.h
> > @@ -220,6 +220,9 @@ int	xfs_bmap_update_extent_real(struct xfs_trans *tp,
> >  		struct xfs_inode *ip, int whichfork,
> >  		struct xfs_iext_cursor *icur, struct xfs_btree_cur **curp,
> >  		struct xfs_bmbt_irec *new, int *logflagsp, bool convert);
> > +int
> > +xfs_bremapi_from_cowfork(struct xfs_trans *tp, struct xfs_inode *ip,
> > +		struct xfs_bmbt_irec *icow);
> >  
> >  enum xfs_bmap_intent_type {
> >  	XFS_BMAP_MAP = 1,
> > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> > index 276387a6a85d..75bd2e03cd5b 100644
> > --- a/fs/xfs/xfs_reflink.c
> > +++ b/fs/xfs/xfs_reflink.c
> > @@ -590,7 +590,6 @@ xfs_reflink_end_cow_extent(
> >  	struct xfs_mount	*mp = ip->i_mount;
> >  	struct xfs_trans	*tp;
> >  	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK);
> > -	xfs_filblks_t		rlen;
> >  	unsigned int		resblks;
> >  	int			error;
> >  
> > @@ -651,26 +650,10 @@ xfs_reflink_end_cow_extent(
> >  		goto out_cancel;
> >  	}
> >  
> > -	/* Unmap the old blocks in the data fork. */
> > -	rlen = del.br_blockcount;
> > -	error = __xfs_bunmapi(tp, ip, del.br_startoff, &rlen, 0, 1);
> > +	error = xfs_bremapi_from_cowfork(tp, ip, &del);
> >  	if (error)
> >  		goto out_cancel;
> >  
> > -	/* Trim the extent to whatever got unmapped. */
> > -	xfs_trim_extent(&del, del.br_startoff + rlen, del.br_blockcount - rlen);
> > -	trace_xfs_reflink_cow_remap(ip, &del);
> > -
> > -	/* Free the CoW orphan record. */
> > -	xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount);
> > -
> > -	/* Map the new blocks into the data fork. */
> > -	xfs_bmap_map_extent(tp, ip, &del);
> > -
> > -	/* Charge this new data fork mapping to the on-disk quota. */
> > -	xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_DELBCOUNT,
> > -			(long)del.br_blockcount);
> > -
> >  	/* Remove the mapping from the CoW fork. */
> >  	xfs_bmap_del_extent_cow(ip, &icur, &got, &del);
> >  
> > -- 
> > 2.24.4
> > 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-02-16  2:03 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-09  7:36 [PATCH 0/3] xfs: some end COW remapping optimization Gao Xiang
2022-02-09  7:36 ` [PATCH 1/3] xfs: get rid of LEFT, RIGHT, PREV in xfs_bmap_add_extent_unwritten_real() Gao Xiang
2022-02-09  7:36 ` [PATCH 2/3] xfs: introduce xfs_bmap_update_extent_real() Gao Xiang
2022-02-09  7:36 ` [PATCH 3/3] xfs: introduce xfs_bremapi_from_cowfork() Gao Xiang
2022-02-16  1:24   ` Darrick J. Wong
2022-02-16  2:03     ` Gao Xiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).