All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof
@ 2024-03-20 11:05 Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 1/9] xfs: match lock mode in xfs_buffered_write_iomap_begin() Zhang Yi
                   ` (10 more replies)
  0 siblings, 11 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

Changes since v3:
 - Improve some git message comments and do some minor code cleanup, no
   logic changes.

Changes since v2:
 - Merge the patch for dropping of xfs_convert_blocks() and the patch
   for modifying xfs_bmapi_convert_delalloc().
 - Reword the commit message of the second patch.

Changes since v1:
 - Make xfs_bmapi_convert_delalloc() to allocate the target offset and
   drop the writeback helper xfs_convert_blocks().
 - Don't use xfs_iomap_write_direct() to convert delalloc blocks for
   zeroing posteof case, use xfs_bmapi_convert_delalloc() instead.
 - Fix two off-by-one issues when converting delalloc blocks.
 - Add a separate patch to drop the buffered write failure handle in
   zeroing and unsharing.
 - Add a comments do emphasize updating i_size should under folio lock.
 - Make iomap_write_end() to return a boolean, and do some cleanups in
   buffered write begin path.

This patch series fix a problem of exposing zeroed data on xfs since the
non-atomic clone operation. This problem was found while I was
developing ext4 buffered IO iomap conversion (ext4 is relying on this
fix [1]), the root cause of this problem and the discussion about the
solution please see [2]. After fix the problem, iomap_zero_range()
doesn't need to update i_size so that ext4 can use it to zero partial
block, e.g. truncate eof block [3].

[1] https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/
[2] https://lore.kernel.org/linux-ext4/9b0040ef-3d9d-6246-4bdd-82b9a8f55fa2@huaweicloud.com/
[3] https://lore.kernel.org/linux-ext4/9c9f1831-a772-299b-072b-1c8116c3fb35@huaweicloud.com/

Thanks,
Yi.

Zhang Yi (9):
  xfs: match lock mode in xfs_buffered_write_iomap_begin()
  xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
  xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
  xfs: convert delayed extents to unwritten when zeroing post eof blocks
  iomap: drop the write failure handles when unsharing and zeroing
  iomap: don't increase i_size if it's not a write operation
  iomap: use a new variable to handle the written bytes in
    iomap_write_iter()
  iomap: make iomap_write_end() return a boolean
  iomap: do some small logical cleanup in buffered write

 fs/iomap/buffered-io.c   | 105 ++++++++++++++++++++++-----------------
 fs/xfs/libxfs/xfs_bmap.c |  40 +++++++++++++--
 fs/xfs/xfs_aops.c        |  54 ++++++--------------
 fs/xfs/xfs_iomap.c       |  39 +++++++++++++--
 4 files changed, 144 insertions(+), 94 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 1/9] xfs: match lock mode in xfs_buffered_write_iomap_begin()
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 2/9] xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional Zhang Yi
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Commit 1aa91d9c9933 ("xfs: Add async buffered write support") replace
xfs_ilock(XFS_ILOCK_EXCL) with xfs_ilock_for_iomap() when locking the
writing inode, and a new variable lockmode is used to indicate the lock
mode. Although the lockmode should always be XFS_ILOCK_EXCL, it's still
better to use this variable instead of useing XFS_ILOCK_EXCL directly
when unlocking the inode.

Fixes: 1aa91d9c9933 ("xfs: Add async buffered write support")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_iomap.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 18c8f168b153..ccf83e72d8ca 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1149,13 +1149,13 @@ xfs_buffered_write_iomap_begin(
 	 * them out if the write happens to fail.
 	 */
 	seq = xfs_iomap_inode_sequence(ip, IOMAP_F_NEW);
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	xfs_iunlock(ip, lockmode);
 	trace_xfs_iomap_alloc(ip, offset, count, allocfork, &imap);
 	return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, IOMAP_F_NEW, seq);
 
 found_imap:
 	seq = xfs_iomap_inode_sequence(ip, 0);
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	xfs_iunlock(ip, lockmode);
 	return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);
 
 found_cow:
@@ -1165,17 +1165,17 @@ xfs_buffered_write_iomap_begin(
 		if (error)
 			goto out_unlock;
 		seq = xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED);
-		xfs_iunlock(ip, XFS_ILOCK_EXCL);
+		xfs_iunlock(ip, lockmode);
 		return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
 					 IOMAP_F_SHARED, seq);
 	}
 
 	xfs_trim_extent(&cmap, offset_fsb, imap.br_startoff - offset_fsb);
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	xfs_iunlock(ip, lockmode);
 	return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags, 0, seq);
 
 out_unlock:
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	xfs_iunlock(ip, lockmode);
 	return error;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 2/9] xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 1/9] xfs: match lock mode in xfs_buffered_write_iomap_begin() Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 3/9] xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset Zhang Yi
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Allow callers to pass a NULLL seq argument if they don't care about
the fork sequence number.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index f362345467fa..07dc35de8ce5 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4574,7 +4574,8 @@ xfs_bmapi_convert_delalloc(
 	if (!isnullstartblock(bma.got.br_startblock)) {
 		xfs_bmbt_to_iomap(ip, iomap, &bma.got, 0, flags,
 				xfs_iomap_inode_sequence(ip, flags));
-		*seq = READ_ONCE(ifp->if_seq);
+		if (seq)
+			*seq = READ_ONCE(ifp->if_seq);
 		goto out_trans_cancel;
 	}
 
@@ -4623,7 +4624,8 @@ xfs_bmapi_convert_delalloc(
 	ASSERT(!isnullstartblock(bma.got.br_startblock));
 	xfs_bmbt_to_iomap(ip, iomap, &bma.got, 0, flags,
 				xfs_iomap_inode_sequence(ip, flags));
-	*seq = READ_ONCE(ifp->if_seq);
+	if (seq)
+		*seq = READ_ONCE(ifp->if_seq);
 
 	if (whichfork == XFS_COW_FORK)
 		xfs_refcount_alloc_cow_extent(tp, bma.blkno, bma.length);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 3/9] xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 1/9] xfs: match lock mode in xfs_buffered_write_iomap_begin() Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 2/9] xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks Zhang Yi
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Since xfs_bmapi_convert_delalloc() only attempts to allocate the entire
delalloc extent and require multiple invocations to allocate the target
offset. So xfs_convert_blocks() add a loop to do this job and we call it
in the write back path, but xfs_convert_blocks() isn't a common helper.
Let's do it in xfs_bmapi_convert_delalloc() and drop
xfs_convert_blocks(), preparing for the post EOF delalloc blocks
converting in the buffered write begin path.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_bmap.c | 34 +++++++++++++++++++++++--
 fs/xfs/xfs_aops.c        | 54 +++++++++++-----------------------------
 2 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 07dc35de8ce5..e44b9bbe0c4a 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4516,8 +4516,8 @@ xfs_bmapi_write(
  * invocations to allocate the target offset if a large enough physical extent
  * is not available.
  */
-int
-xfs_bmapi_convert_delalloc(
+static int
+xfs_bmapi_convert_one_delalloc(
 	struct xfs_inode	*ip,
 	int			whichfork,
 	xfs_off_t		offset,
@@ -4648,6 +4648,36 @@ xfs_bmapi_convert_delalloc(
 	return error;
 }
 
+/*
+ * Pass in a dellalloc extent and convert it to real extents, return the real
+ * extent that maps offset_fsb in iomap.
+ */
+int
+xfs_bmapi_convert_delalloc(
+	struct xfs_inode	*ip,
+	int			whichfork,
+	loff_t			offset,
+	struct iomap		*iomap,
+	unsigned int		*seq)
+{
+	int			error;
+
+	/*
+	 * Attempt to allocate whatever delalloc extent currently backs offset
+	 * and put the result into iomap.  Allocate in a loop because it may
+	 * take several attempts to allocate real blocks for a contiguous
+	 * delalloc extent if free space is sufficiently fragmented.
+	 */
+	do {
+		error = xfs_bmapi_convert_one_delalloc(ip, whichfork, offset,
+					iomap, seq);
+		if (error)
+			return error;
+	} while (iomap->offset + iomap->length <= offset);
+
+	return 0;
+}
+
 int
 xfs_bmapi_remap(
 	struct xfs_trans	*tp,
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 813f85156b0c..6479e0dac69d 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -233,45 +233,6 @@ xfs_imap_valid(
 	return true;
 }
 
-/*
- * Pass in a dellalloc extent and convert it to real extents, return the real
- * extent that maps offset_fsb in wpc->iomap.
- *
- * The current page is held locked so nothing could have removed the block
- * backing offset_fsb, although it could have moved from the COW to the data
- * fork by another thread.
- */
-static int
-xfs_convert_blocks(
-	struct iomap_writepage_ctx *wpc,
-	struct xfs_inode	*ip,
-	int			whichfork,
-	loff_t			offset)
-{
-	int			error;
-	unsigned		*seq;
-
-	if (whichfork == XFS_COW_FORK)
-		seq = &XFS_WPC(wpc)->cow_seq;
-	else
-		seq = &XFS_WPC(wpc)->data_seq;
-
-	/*
-	 * Attempt to allocate whatever delalloc extent currently backs offset
-	 * and put the result into wpc->iomap.  Allocate in a loop because it
-	 * may take several attempts to allocate real blocks for a contiguous
-	 * delalloc extent if free space is sufficiently fragmented.
-	 */
-	do {
-		error = xfs_bmapi_convert_delalloc(ip, whichfork, offset,
-				&wpc->iomap, seq);
-		if (error)
-			return error;
-	} while (wpc->iomap.offset + wpc->iomap.length <= offset);
-
-	return 0;
-}
-
 static int
 xfs_map_blocks(
 	struct iomap_writepage_ctx *wpc,
@@ -289,6 +250,7 @@ xfs_map_blocks(
 	struct xfs_iext_cursor	icur;
 	int			retries = 0;
 	int			error = 0;
+	unsigned int		*seq;
 
 	if (xfs_is_shutdown(mp))
 		return -EIO;
@@ -386,7 +348,19 @@ xfs_map_blocks(
 	trace_xfs_map_blocks_found(ip, offset, count, whichfork, &imap);
 	return 0;
 allocate_blocks:
-	error = xfs_convert_blocks(wpc, ip, whichfork, offset);
+	/*
+	 * Convert a dellalloc extent to a real one. The current page is held
+	 * locked so nothing could have removed the block backing offset_fsb,
+	 * although it could have moved from the COW to the data fork by another
+	 * thread.
+	 */
+	if (whichfork == XFS_COW_FORK)
+		seq = &XFS_WPC(wpc)->cow_seq;
+	else
+		seq = &XFS_WPC(wpc)->data_seq;
+
+	error = xfs_bmapi_convert_delalloc(ip, whichfork, offset,
+				&wpc->iomap, seq);
 	if (error) {
 		/*
 		 * If we failed to find the extent in the COW fork we might have
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (2 preceding siblings ...)
  2024-03-20 11:05 ` [PATCH v4 3/9] xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-04-23 11:17   ` [PATCH v5 " Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 5/9] iomap: drop the write failure handles when unsharing and zeroing Zhang Yi
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Current clone operation could be non-atomic if the destination of a file
is beyond EOF, user could get a file with corrupted (zeroed) data on
crash.

The problem is about preallocations. If you write some data into a file:

	[A...B)

and XFS decides to preallocate some post-eof blocks, then it can create
a delayed allocation reservation:

	[A.........D)

The writeback path tries to convert delayed extents to real ones by
allocating blocks. If there aren't enough contiguous free space, we can
end up with two extents, the first real and the second still delalloc:

	[A....C)[C.D)

After that, both the in-memory and the on-disk file sizes are still B.
If we clone into the range [E...F) from another file:

	[A....C)[C.D)      [E...F)

then xfs_reflink_zero_posteof() calls iomap_zero_range() to zero out the
range [B, E) beyond EOF and flush it. Since [C, D) is still a delalloc
extent, its pagecache will be zeroed and both the in-memory and on-disk
size will be updated to D after flushing but before cloning. This is
wrong, because the user can see the size change and read the zeroes
while the clone operation is ongoing.

We need to keep the in-memory and on-disk size before the clone
operation starts, so instead of writing zeroes through the page cache
for delayed ranges beyond EOF, we convert these ranges to unwritten and
invalidate any cached data over that range beyond EOF.

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_iomap.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ccf83e72d8ca..334860f780ff 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1035,6 +1035,24 @@ xfs_buffered_write_iomap_begin(
 	}
 
 	if (imap.br_startoff <= offset_fsb) {
+		/*
+		 * Trim a delalloc extent that extends beyond the EOF block.
+		 * If it starts beyond the EOF block, convert it to an unwritten
+		 * extent.
+		 */
+		if ((flags & IOMAP_ZERO) &&
+		    isnullstartblock(imap.br_startblock)) {
+			xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
+
+			if (offset_fsb >= eof_fsb)
+				goto convert_delay;
+			if (end_fsb > eof_fsb) {
+				end_fsb = eof_fsb;
+				xfs_trim_extent(&imap, offset_fsb,
+						end_fsb - offset_fsb);
+			}
+		}
+
 		/*
 		 * For reflink files we may need a delalloc reservation when
 		 * overwriting shared extents.   This includes zeroing of
@@ -1158,6 +1176,17 @@ xfs_buffered_write_iomap_begin(
 	xfs_iunlock(ip, lockmode);
 	return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);
 
+convert_delay:
+	xfs_iunlock(ip, lockmode);
+	truncate_pagecache(inode, offset);
+	error = xfs_bmapi_convert_delalloc(ip, XFS_DATA_FORK, offset,
+					   iomap, NULL);
+	if (error)
+		return error;
+
+	trace_xfs_iomap_alloc(ip, offset, count, XFS_DATA_FORK, &imap);
+	return 0;
+
 found_cow:
 	seq = xfs_iomap_inode_sequence(ip, 0);
 	if (imap.br_startoff <= offset_fsb) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 5/9] iomap: drop the write failure handles when unsharing and zeroing
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (3 preceding siblings ...)
  2024-03-20 11:05 ` [PATCH v4 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation Zhang Yi
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Unsharing and zeroing can only happen within EOF, so there is never a
need to perform posteof pagecache truncation if write begin fails, also
partial write could never theoretically happened from iomap_write_end(),
so remove both of them.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 093c4515b22a..7e32a204650b 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -786,7 +786,6 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
 
 out_unlock:
 	__iomap_put_folio(iter, pos, 0, folio);
-	iomap_write_failed(iter->inode, pos, len);
 
 	return status;
 }
@@ -863,8 +862,6 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 
 	if (old_size < pos)
 		pagecache_isize_extended(iter->inode, old_size, pos);
-	if (ret < len)
-		iomap_write_failed(iter->inode, pos + ret, len - ret);
 	return ret;
 }
 
@@ -912,8 +909,10 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 		}
 
 		status = iomap_write_begin(iter, pos, bytes, &folio);
-		if (unlikely(status))
+		if (unlikely(status)) {
+			iomap_write_failed(iter->inode, pos, bytes);
 			break;
+		}
 		if (iter->iomap.flags & IOMAP_F_STALE)
 			break;
 
@@ -927,6 +926,9 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 		copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
 		status = iomap_write_end(iter, pos, bytes, copied, folio);
 
+		if (status < bytes)
+			iomap_write_failed(iter->inode, pos + status,
+					   bytes - status);
 		if (unlikely(copied != status))
 			iov_iter_revert(i, copied - status);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (4 preceding siblings ...)
  2024-03-20 11:05 ` [PATCH v4 5/9] iomap: drop the write failure handles when unsharing and zeroing Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-04-19  6:07   ` Chandan Babu R
  2024-03-20 11:05 ` [PATCH v4 7/9] iomap: use a new variable to handle the written bytes in iomap_write_iter() Zhang Yi
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
needed, the caller should handle it. Especially, when truncate partial
block, we should not increase i_size beyond the new EOF here. It doesn't
affect xfs and gfs2 now because they set the new file size after zero
out, it doesn't matter that a transient increase in i_size, but it will
affect ext4 because it set file size before truncate. So move the i_size
updating logic to iomap_write_iter().

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 50 +++++++++++++++++++++---------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 7e32a204650b..e9112dc78d15 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -837,32 +837,13 @@ static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
-	loff_t old_size = iter->inode->i_size;
-	size_t ret;
-
-	if (srcmap->type == IOMAP_INLINE) {
-		ret = iomap_write_end_inline(iter, folio, pos, copied);
-	} else if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
-		ret = block_write_end(NULL, iter->inode->i_mapping, pos, len,
-				copied, &folio->page, NULL);
-	} else {
-		ret = __iomap_write_end(iter->inode, pos, len, copied, folio);
-	}
-
-	/*
-	 * Update the in-memory inode size after copying the data into the page
-	 * cache.  It's up to the file system to write the updated size to disk,
-	 * preferably after I/O completion so that no stale data is exposed.
-	 */
-	if (pos + ret > old_size) {
-		i_size_write(iter->inode, pos + ret);
-		iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
-	}
-	__iomap_put_folio(iter, pos, ret, folio);
 
-	if (old_size < pos)
-		pagecache_isize_extended(iter->inode, old_size, pos);
-	return ret;
+	if (srcmap->type == IOMAP_INLINE)
+		return iomap_write_end_inline(iter, folio, pos, copied);
+	if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
+		return block_write_end(NULL, iter->inode->i_mapping, pos, len,
+				       copied, &folio->page, NULL);
+	return __iomap_write_end(iter->inode, pos, len, copied, folio);
 }
 
 static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
@@ -877,6 +858,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 
 	do {
 		struct folio *folio;
+		loff_t old_size;
 		size_t offset;		/* Offset into folio */
 		size_t bytes;		/* Bytes to write to folio */
 		size_t copied;		/* Bytes copied from user */
@@ -926,6 +908,22 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 		copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
 		status = iomap_write_end(iter, pos, bytes, copied, folio);
 
+		/*
+		 * Update the in-memory inode size after copying the data into
+		 * the page cache.  It's up to the file system to write the
+		 * updated size to disk, preferably after I/O completion so that
+		 * no stale data is exposed.  Only once that's done can we
+		 * unlock and release the folio.
+		 */
+		old_size = iter->inode->i_size;
+		if (pos + status > old_size) {
+			i_size_write(iter->inode, pos + status);
+			iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
+		}
+		__iomap_put_folio(iter, pos, status, folio);
+
+		if (old_size < pos)
+			pagecache_isize_extended(iter->inode, old_size, pos);
 		if (status < bytes)
 			iomap_write_failed(iter->inode, pos + status,
 					   bytes - status);
@@ -1298,6 +1296,7 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 			bytes = folio_size(folio) - offset;
 
 		bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+		__iomap_put_folio(iter, pos, bytes, folio);
 		if (WARN_ON_ONCE(bytes == 0))
 			return -EIO;
 
@@ -1362,6 +1361,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		folio_mark_accessed(folio);
 
 		bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+		__iomap_put_folio(iter, pos, bytes, folio);
 		if (WARN_ON_ONCE(bytes == 0))
 			return -EIO;
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 7/9] iomap: use a new variable to handle the written bytes in iomap_write_iter()
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (5 preceding siblings ...)
  2024-03-20 11:05 ` [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 8/9] iomap: make iomap_write_end() return a boolean Zhang Yi
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

In iomap_write_iter(), the status variable used to receive the return
value from iomap_write_end() is confusing, replace it with a new written
variable to represent the written bytes in each cycle, no logic changes.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e9112dc78d15..291648c61a32 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -851,7 +851,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 	loff_t length = iomap_length(iter);
 	size_t chunk = PAGE_SIZE << MAX_PAGECACHE_ORDER;
 	loff_t pos = iter->pos;
-	ssize_t written = 0;
+	ssize_t total_written = 0;
 	long status = 0;
 	struct address_space *mapping = iter->inode->i_mapping;
 	unsigned int bdp_flags = (iter->flags & IOMAP_NOWAIT) ? BDP_ASYNC : 0;
@@ -862,6 +862,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 		size_t offset;		/* Offset into folio */
 		size_t bytes;		/* Bytes to write to folio */
 		size_t copied;		/* Bytes copied from user */
+		size_t written;		/* Bytes have been written */
 
 		bytes = iov_iter_count(i);
 retry:
@@ -906,7 +907,7 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 			flush_dcache_folio(folio);
 
 		copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
-		status = iomap_write_end(iter, pos, bytes, copied, folio);
+		written = iomap_write_end(iter, pos, bytes, copied, folio);
 
 		/*
 		 * Update the in-memory inode size after copying the data into
@@ -916,22 +917,22 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 		 * unlock and release the folio.
 		 */
 		old_size = iter->inode->i_size;
-		if (pos + status > old_size) {
-			i_size_write(iter->inode, pos + status);
+		if (pos + written > old_size) {
+			i_size_write(iter->inode, pos + written);
 			iter->iomap.flags |= IOMAP_F_SIZE_CHANGED;
 		}
-		__iomap_put_folio(iter, pos, status, folio);
+		__iomap_put_folio(iter, pos, written, folio);
 
 		if (old_size < pos)
 			pagecache_isize_extended(iter->inode, old_size, pos);
-		if (status < bytes)
-			iomap_write_failed(iter->inode, pos + status,
-					   bytes - status);
-		if (unlikely(copied != status))
-			iov_iter_revert(i, copied - status);
+		if (written < bytes)
+			iomap_write_failed(iter->inode, pos + written,
+					   bytes - written);
+		if (unlikely(copied != written))
+			iov_iter_revert(i, copied - written);
 
 		cond_resched();
-		if (unlikely(status == 0)) {
+		if (unlikely(written == 0)) {
 			/*
 			 * A short copy made iomap_write_end() reject the
 			 * thing entirely.  Might be memory poisoning
@@ -945,17 +946,17 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 				goto retry;
 			}
 		} else {
-			pos += status;
-			written += status;
-			length -= status;
+			pos += written;
+			total_written += written;
+			length -= written;
 		}
 	} while (iov_iter_count(i) && length);
 
 	if (status == -EAGAIN) {
-		iov_iter_revert(i, written);
+		iov_iter_revert(i, total_written);
 		return -EAGAIN;
 	}
-	return written ? written : status;
+	return total_written ? total_written : status;
 }
 
 ssize_t
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 8/9] iomap: make iomap_write_end() return a boolean
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (6 preceding siblings ...)
  2024-03-20 11:05 ` [PATCH v4 7/9] iomap: use a new variable to handle the written bytes in iomap_write_iter() Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-03-20 11:05 ` [PATCH v4 9/9] iomap: do some small logical cleanup in buffered write Zhang Yi
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

For now, we can make sure iomap_write_end() always return 0 or copied
bytes, so instead of return written bytes, convert to return a boolean
to indicate the copied bytes have been written to the pagecache.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 48 +++++++++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 291648c61a32..dc863a76c72a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -790,7 +790,7 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
 	return status;
 }
 
-static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
+static bool __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
 	flush_dcache_folio(folio);
@@ -807,14 +807,14 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 * redo the whole thing.
 	 */
 	if (unlikely(copied < len && !folio_test_uptodate(folio)))
-		return 0;
+		return false;
 	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
 	iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
 	filemap_dirty_folio(inode->i_mapping, folio);
-	return copied;
+	return true;
 }
 
-static size_t iomap_write_end_inline(const struct iomap_iter *iter,
+static void iomap_write_end_inline(const struct iomap_iter *iter,
 		struct folio *folio, loff_t pos, size_t copied)
 {
 	const struct iomap *iomap = &iter->iomap;
@@ -829,20 +829,31 @@ static size_t iomap_write_end_inline(const struct iomap_iter *iter,
 	kunmap_local(addr);
 
 	mark_inode_dirty(iter->inode);
-	return copied;
 }
 
-/* Returns the number of bytes copied.  May be 0.  Cannot be an errno. */
-static size_t iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
+/*
+ * Returns true if all copied bytes have been written to the pagecache,
+ * otherwise return false.
+ */
+static bool iomap_write_end(struct iomap_iter *iter, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
 
-	if (srcmap->type == IOMAP_INLINE)
-		return iomap_write_end_inline(iter, folio, pos, copied);
-	if (srcmap->flags & IOMAP_F_BUFFER_HEAD)
-		return block_write_end(NULL, iter->inode->i_mapping, pos, len,
-				       copied, &folio->page, NULL);
+	if (srcmap->type == IOMAP_INLINE) {
+		iomap_write_end_inline(iter, folio, pos, copied);
+		return true;
+	}
+
+	if (srcmap->flags & IOMAP_F_BUFFER_HEAD) {
+		size_t bh_written;
+
+		bh_written = block_write_end(NULL, iter->inode->i_mapping, pos,
+					len, copied, &folio->page, NULL);
+		WARN_ON_ONCE(bh_written != copied && bh_written != 0);
+		return bh_written == copied;
+	}
+
 	return __iomap_write_end(iter->inode, pos, len, copied, folio);
 }
 
@@ -907,7 +918,8 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 			flush_dcache_folio(folio);
 
 		copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);
-		written = iomap_write_end(iter, pos, bytes, copied, folio);
+		written = iomap_write_end(iter, pos, bytes, copied, folio) ?
+			  copied : 0;
 
 		/*
 		 * Update the in-memory inode size after copying the data into
@@ -1285,6 +1297,7 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 		int status;
 		size_t offset;
 		size_t bytes = min_t(u64, SIZE_MAX, length);
+		bool ret;
 
 		status = iomap_write_begin(iter, pos, bytes, &folio);
 		if (unlikely(status))
@@ -1296,9 +1309,9 @@ static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 		if (bytes > folio_size(folio) - offset)
 			bytes = folio_size(folio) - offset;
 
-		bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+		ret = iomap_write_end(iter, pos, bytes, bytes, folio);
 		__iomap_put_folio(iter, pos, bytes, folio);
-		if (WARN_ON_ONCE(bytes == 0))
+		if (WARN_ON_ONCE(!ret))
 			return -EIO;
 
 		cond_resched();
@@ -1347,6 +1360,7 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		int status;
 		size_t offset;
 		size_t bytes = min_t(u64, SIZE_MAX, length);
+		bool ret;
 
 		status = iomap_write_begin(iter, pos, bytes, &folio);
 		if (status)
@@ -1361,9 +1375,9 @@ static loff_t iomap_zero_iter(struct iomap_iter *iter, bool *did_zero)
 		folio_zero_range(folio, offset, bytes);
 		folio_mark_accessed(folio);
 
-		bytes = iomap_write_end(iter, pos, bytes, bytes, folio);
+		ret = iomap_write_end(iter, pos, bytes, bytes, folio);
 		__iomap_put_folio(iter, pos, bytes, folio);
-		if (WARN_ON_ONCE(bytes == 0))
+		if (WARN_ON_ONCE(!ret))
 			return -EIO;
 
 		pos += bytes;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 9/9] iomap: do some small logical cleanup in buffered write
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (7 preceding siblings ...)
  2024-03-20 11:05 ` [PATCH v4 8/9] iomap: make iomap_write_end() return a boolean Zhang Yi
@ 2024-03-20 11:05 ` Zhang Yi
  2024-04-17  4:42 ` [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Chandan Babu R
  2024-04-25 12:25 ` (subset) " Christian Brauner
  10 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-03-20 11:05 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, tytso, jack, yi.zhang,
	yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Since iomap_write_end() can never return a partial write length, the
comparison between written, copied and bytes becomes useless, just
merge them with the unwritten branch.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/iomap/buffered-io.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index dc863a76c72a..a111e8b816df 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -937,11 +937,6 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 
 		if (old_size < pos)
 			pagecache_isize_extended(iter->inode, old_size, pos);
-		if (written < bytes)
-			iomap_write_failed(iter->inode, pos + written,
-					   bytes - written);
-		if (unlikely(copied != written))
-			iov_iter_revert(i, copied - written);
 
 		cond_resched();
 		if (unlikely(written == 0)) {
@@ -951,6 +946,9 @@ static loff_t iomap_write_iter(struct iomap_iter *iter, struct iov_iter *i)
 			 * halfway through, might be a race with munmap,
 			 * might be severe memory pressure.
 			 */
+			iomap_write_failed(iter->inode, pos, bytes);
+			iov_iter_revert(i, copied);
+
 			if (chunk > PAGE_SIZE)
 				chunk /= 2;
 			if (copied) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (8 preceding siblings ...)
  2024-03-20 11:05 ` [PATCH v4 9/9] iomap: do some small logical cleanup in buffered write Zhang Yi
@ 2024-04-17  4:42 ` Chandan Babu R
  2024-04-17 11:40   ` Christian Brauner
  2024-04-25 12:25 ` (subset) " Christian Brauner
  10 siblings, 1 reply; 21+ messages in thread
From: Chandan Babu R @ 2024-04-17  4:42 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, hch, brauner,
	david, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On Wed, Mar 20, 2024 at 07:05:39 PM +0800, Zhang Yi wrote:
> Changes since v3:
>  - Improve some git message comments and do some minor code cleanup, no
>    logic changes.
>
> Changes since v2:
>  - Merge the patch for dropping of xfs_convert_blocks() and the patch
>    for modifying xfs_bmapi_convert_delalloc().
>  - Reword the commit message of the second patch.
>
> Changes since v1:
>  - Make xfs_bmapi_convert_delalloc() to allocate the target offset and
>    drop the writeback helper xfs_convert_blocks().
>  - Don't use xfs_iomap_write_direct() to convert delalloc blocks for
>    zeroing posteof case, use xfs_bmapi_convert_delalloc() instead.
>  - Fix two off-by-one issues when converting delalloc blocks.
>  - Add a separate patch to drop the buffered write failure handle in
>    zeroing and unsharing.
>  - Add a comments do emphasize updating i_size should under folio lock.
>  - Make iomap_write_end() to return a boolean, and do some cleanups in
>    buffered write begin path.
>
> This patch series fix a problem of exposing zeroed data on xfs since the
> non-atomic clone operation. This problem was found while I was
> developing ext4 buffered IO iomap conversion (ext4 is relying on this
> fix [1]), the root cause of this problem and the discussion about the
> solution please see [2]. After fix the problem, iomap_zero_range()
> doesn't need to update i_size so that ext4 can use it to zero partial
> block, e.g. truncate eof block [3].
>
> [1] https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/
> [2] https://lore.kernel.org/linux-ext4/9b0040ef-3d9d-6246-4bdd-82b9a8f55fa2@huaweicloud.com/
> [3] https://lore.kernel.org/linux-ext4/9c9f1831-a772-299b-072b-1c8116c3fb35@huaweicloud.com/
>
> Thanks,
> Yi.
>
> Zhang Yi (9):
>   xfs: match lock mode in xfs_buffered_write_iomap_begin()
>   xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
>   xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
>   xfs: convert delayed extents to unwritten when zeroing post eof blocks
>   iomap: drop the write failure handles when unsharing and zeroing
>   iomap: don't increase i_size if it's not a write operation
>   iomap: use a new variable to handle the written bytes in
>     iomap_write_iter()
>   iomap: make iomap_write_end() return a boolean
>   iomap: do some small logical cleanup in buffered write
>

Hi all,

I have picked up this patchset for inclusion into XFS' 6.10-rc1 patch
queue. Please let me know if there are any objections.

-- 
Chandan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof
  2024-04-17  4:42 ` [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Chandan Babu R
@ 2024-04-17 11:40   ` Christian Brauner
  2024-04-18  9:30     ` Chandan Babu R
  0 siblings, 1 reply; 21+ messages in thread
From: Christian Brauner @ 2024-04-17 11:40 UTC (permalink / raw)
  To: Chandan Babu R
  Cc: Zhang Yi, linux-xfs, linux-fsdevel, linux-kernel, djwong, hch,
	david, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On Wed, Apr 17, 2024 at 10:12:10AM +0530, Chandan Babu R wrote:
> On Wed, Mar 20, 2024 at 07:05:39 PM +0800, Zhang Yi wrote:
> > Changes since v3:
> >  - Improve some git message comments and do some minor code cleanup, no
> >    logic changes.
> >
> > Changes since v2:
> >  - Merge the patch for dropping of xfs_convert_blocks() and the patch
> >    for modifying xfs_bmapi_convert_delalloc().
> >  - Reword the commit message of the second patch.
> >
> > Changes since v1:
> >  - Make xfs_bmapi_convert_delalloc() to allocate the target offset and
> >    drop the writeback helper xfs_convert_blocks().
> >  - Don't use xfs_iomap_write_direct() to convert delalloc blocks for
> >    zeroing posteof case, use xfs_bmapi_convert_delalloc() instead.
> >  - Fix two off-by-one issues when converting delalloc blocks.
> >  - Add a separate patch to drop the buffered write failure handle in
> >    zeroing and unsharing.
> >  - Add a comments do emphasize updating i_size should under folio lock.
> >  - Make iomap_write_end() to return a boolean, and do some cleanups in
> >    buffered write begin path.
> >
> > This patch series fix a problem of exposing zeroed data on xfs since the
> > non-atomic clone operation. This problem was found while I was
> > developing ext4 buffered IO iomap conversion (ext4 is relying on this
> > fix [1]), the root cause of this problem and the discussion about the
> > solution please see [2]. After fix the problem, iomap_zero_range()
> > doesn't need to update i_size so that ext4 can use it to zero partial
> > block, e.g. truncate eof block [3].
> >
> > [1] https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/
> > [2] https://lore.kernel.org/linux-ext4/9b0040ef-3d9d-6246-4bdd-82b9a8f55fa2@huaweicloud.com/
> > [3] https://lore.kernel.org/linux-ext4/9c9f1831-a772-299b-072b-1c8116c3fb35@huaweicloud.com/
> >
> > Thanks,
> > Yi.
> >
> > Zhang Yi (9):
> >   xfs: match lock mode in xfs_buffered_write_iomap_begin()
> >   xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
> >   xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
> >   xfs: convert delayed extents to unwritten when zeroing post eof blocks
> >   iomap: drop the write failure handles when unsharing and zeroing
> >   iomap: don't increase i_size if it's not a write operation
> >   iomap: use a new variable to handle the written bytes in
> >     iomap_write_iter()
> >   iomap: make iomap_write_end() return a boolean
> >   iomap: do some small logical cleanup in buffered write
> >
> 
> Hi all,
> 
> I have picked up this patchset for inclusion into XFS' 6.10-rc1 patch
> queue. Please let me know if there are any objections.

It'd be nice if I could take the iomap patches into the vfs.iomap tree
that you can then pull from if you depend on it.. There's already some
cleanups in there. Sounds ok?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof
  2024-04-17 11:40   ` Christian Brauner
@ 2024-04-18  9:30     ` Chandan Babu R
  0 siblings, 0 replies; 21+ messages in thread
From: Chandan Babu R @ 2024-04-18  9:30 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Zhang Yi, linux-xfs, linux-fsdevel, linux-kernel, djwong, hch,
	david, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On Wed, Apr 17, 2024 at 01:40:36 PM +0200, Christian Brauner wrote:
> On Wed, Apr 17, 2024 at 10:12:10AM +0530, Chandan Babu R wrote:
>> On Wed, Mar 20, 2024 at 07:05:39 PM +0800, Zhang Yi wrote:
>> > Changes since v3:
>> >  - Improve some git message comments and do some minor code cleanup, no
>> >    logic changes.
>> >
>> > Changes since v2:
>> >  - Merge the patch for dropping of xfs_convert_blocks() and the patch
>> >    for modifying xfs_bmapi_convert_delalloc().
>> >  - Reword the commit message of the second patch.
>> >
>> > Changes since v1:
>> >  - Make xfs_bmapi_convert_delalloc() to allocate the target offset and
>> >    drop the writeback helper xfs_convert_blocks().
>> >  - Don't use xfs_iomap_write_direct() to convert delalloc blocks for
>> >    zeroing posteof case, use xfs_bmapi_convert_delalloc() instead.
>> >  - Fix two off-by-one issues when converting delalloc blocks.
>> >  - Add a separate patch to drop the buffered write failure handle in
>> >    zeroing and unsharing.
>> >  - Add a comments do emphasize updating i_size should under folio lock.
>> >  - Make iomap_write_end() to return a boolean, and do some cleanups in
>> >    buffered write begin path.
>> >
>> > This patch series fix a problem of exposing zeroed data on xfs since the
>> > non-atomic clone operation. This problem was found while I was
>> > developing ext4 buffered IO iomap conversion (ext4 is relying on this
>> > fix [1]), the root cause of this problem and the discussion about the
>> > solution please see [2]. After fix the problem, iomap_zero_range()
>> > doesn't need to update i_size so that ext4 can use it to zero partial
>> > block, e.g. truncate eof block [3].
>> >
>> > [1] https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/
>> > [2] https://lore.kernel.org/linux-ext4/9b0040ef-3d9d-6246-4bdd-82b9a8f55fa2@huaweicloud.com/
>> > [3] https://lore.kernel.org/linux-ext4/9c9f1831-a772-299b-072b-1c8116c3fb35@huaweicloud.com/
>> >
>> > Thanks,
>> > Yi.
>> >
>> > Zhang Yi (9):
>> >   xfs: match lock mode in xfs_buffered_write_iomap_begin()
>> >   xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional
>> >   xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset
>> >   xfs: convert delayed extents to unwritten when zeroing post eof blocks
>> >   iomap: drop the write failure handles when unsharing and zeroing
>> >   iomap: don't increase i_size if it's not a write operation
>> >   iomap: use a new variable to handle the written bytes in
>> >     iomap_write_iter()
>> >   iomap: make iomap_write_end() return a boolean
>> >   iomap: do some small logical cleanup in buffered write
>> >
>> 
>> Hi all,
>> 
>> I have picked up this patchset for inclusion into XFS' 6.10-rc1 patch
>> queue. Please let me know if there are any objections.
>
> It'd be nice if I could take the iomap patches into the vfs.iomap tree
> that you can then pull from if you depend on it.. There's already some
> cleanups in there. Sounds ok?

Yes, that works for me. I will pick only those patches that are specific to
XFS i.e. patches numbered 1 to 4.

-- 
Chandan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation
  2024-03-20 11:05 ` [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation Zhang Yi
@ 2024-04-19  6:07   ` Chandan Babu R
  2024-04-19  8:14     ` Zhang Yi
  2024-04-23 11:30     ` Zhang Yi
  0 siblings, 2 replies; 21+ messages in thread
From: Chandan Babu R @ 2024-04-19  6:07 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, hch, brauner,
	david, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
> needed, the caller should handle it. Especially, when truncate partial
> block, we should not increase i_size beyond the new EOF here. It doesn't
> affect xfs and gfs2 now because they set the new file size after zero
> out, it doesn't matter that a transient increase in i_size, but it will
> affect ext4 because it set file size before truncate. So move the i_size
> updating logic to iomap_write_iter().
>

This patch causes generic/522 to consistently fail when using the following
fstest configuration,

TEST_DEV=/dev/loop16
TEST_LOGDEV=/dev/loop13
SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12"
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g'
MOUNT_OPTIONS='-o usrquota,grpquota,prjquota'
TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota"
TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13"
SCRATCH_LOGDEV=/dev/loop15
USE_EXTERNAL=yes
LOGWRITES_DEV=/dev/loop15

-- 
Chandan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation
  2024-04-19  6:07   ` Chandan Babu R
@ 2024-04-19  8:14     ` Zhang Yi
  2024-04-23 11:30     ` Zhang Yi
  1 sibling, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-04-19  8:14 UTC (permalink / raw)
  To: Chandan Babu R
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, hch, brauner,
	david, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On 2024/4/19 14:07, Chandan Babu R wrote:
> On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
>> needed, the caller should handle it. Especially, when truncate partial
>> block, we should not increase i_size beyond the new EOF here. It doesn't
>> affect xfs and gfs2 now because they set the new file size after zero
>> out, it doesn't matter that a transient increase in i_size, but it will
>> affect ext4 because it set file size before truncate. So move the i_size
>> updating logic to iomap_write_iter().
>>
> 
> This patch causes generic/522 to consistently fail when using the following
> fstest configuration,
> 
> TEST_DEV=/dev/loop16
> TEST_LOGDEV=/dev/loop13
> SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12"
> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g'
> MOUNT_OPTIONS='-o usrquota,grpquota,prjquota'
> TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota"
> TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13"
> SCRATCH_LOGDEV=/dev/loop15
> USE_EXTERNAL=yes
> LOGWRITES_DEV=/dev/loop15
> 

Sorry for the regression, I didn't notice this issue last time I ran
this test. It looks like the 004 patch doesn't work for some cases.
I can reproduce it on my machine now, and I'll take a look at this
and fix it soon.

Thanks,
Yi.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v5 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks
  2024-03-20 11:05 ` [PATCH v4 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks Zhang Yi
@ 2024-04-23 11:17   ` Zhang Yi
  2024-04-25 12:22     ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: Zhang Yi @ 2024-04-23 11:17 UTC (permalink / raw)
  To: linux-xfs, linux-fsdevel
  Cc: linux-kernel, djwong, hch, brauner, david, chandanbabu, tytso,
	jack, yi.zhang, yi.zhang, chengzhihao1, yukuai3

From: Zhang Yi <yi.zhang@huawei.com>

Current clone operation could be non-atomic if the destination of a file
is beyond EOF, user could get a file with corrupted (zeroed) data on
crash.

The problem is about preallocations. If you write some data into a file:

	[A...B)

and XFS decides to preallocate some post-eof blocks, then it can create
a delayed allocation reservation:

	[A.........D)

The writeback path tries to convert delayed extents to real ones by
allocating blocks. If there aren't enough contiguous free space, we can
end up with two extents, the first real and the second still delalloc:

	[A....C)[C.D)

After that, both the in-memory and the on-disk file sizes are still B.
If we clone into the range [E...F) from another file:

	[A....C)[C.D)      [E...F)

then xfs_reflink_zero_posteof() calls iomap_zero_range() to zero out the
range [B, E) beyond EOF and flush it. Since [C, D) is still a delalloc
extent, its pagecache will be zeroed and both the in-memory and on-disk
size will be updated to D after flushing but before cloning. This is
wrong, because the user can see the size change and read the zeroes
while the clone operation is ongoing.

We need to keep the in-memory and on-disk size before the clone
operation starts, so instead of writing zeroes through the page cache
for delayed ranges beyond EOF, we convert these ranges to unwritten and
invalidate any cached data over that range beyond EOF.

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
Changes since v4:

Move the delalloc converting hunk before searching the COW fork. Because
if the file has been reflinked and copied on write,
xfs_bmap_extsize_align() aligned the range of COW delalloc extent, after
the writeback, there might be some unwritten extents left over in the
COW fork that overlaps the delalloc extent we found in data fork.

  data fork  ...wwww|dddddddddd...
  cow fork          |uuuuuuuuuu...
                    ^
                  i_size

In my v4, we search the COW fork before checking the delalloc extent,
goto found_cow tag and return unconverted delalloc srcmap in the above
case, so the delayed extent in the data fork will have no chance to
convert to unwritten, it will lead to delalloc extent residue and break
generic/522 after merging patch 6.

 fs/xfs/xfs_iomap.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 236ee78aa75b..ab398cb3680a 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1022,6 +1022,23 @@ xfs_buffered_write_iomap_begin(
 		goto out_unlock;
 	}
 
+	/*
+	 * For zeroing, trim a delalloc extent that extends beyond the EOF
+	 * block.  If it starts beyond the EOF block, convert it to an
+	 * unwritten extent.
+	 */
+	if ((flags & IOMAP_ZERO) && imap.br_startoff <= offset_fsb &&
+	    isnullstartblock(imap.br_startblock)) {
+		xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
+
+		if (offset_fsb >= eof_fsb)
+			goto convert_delay;
+		if (end_fsb > eof_fsb) {
+			end_fsb = eof_fsb;
+			xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
+		}
+	}
+
 	/*
 	 * Search the COW fork extent list even if we did not find a data fork
 	 * extent.  This serves two purposes: first this implements the
@@ -1167,6 +1184,17 @@ xfs_buffered_write_iomap_begin(
 	xfs_iunlock(ip, lockmode);
 	return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0, seq);
 
+convert_delay:
+	xfs_iunlock(ip, lockmode);
+	truncate_pagecache(inode, offset);
+	error = xfs_bmapi_convert_delalloc(ip, XFS_DATA_FORK, offset,
+					   iomap, NULL);
+	if (error)
+		return error;
+
+	trace_xfs_iomap_alloc(ip, offset, count, XFS_DATA_FORK, &imap);
+	return 0;
+
 found_cow:
 	seq = xfs_iomap_inode_sequence(ip, 0);
 	if (imap.br_startoff <= offset_fsb) {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation
  2024-04-19  6:07   ` Chandan Babu R
  2024-04-19  8:14     ` Zhang Yi
@ 2024-04-23 11:30     ` Zhang Yi
  1 sibling, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-04-23 11:30 UTC (permalink / raw)
  To: Chandan Babu R
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, hch, brauner,
	david, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On 2024/4/19 14:07, Chandan Babu R wrote:
> On Wed, Mar 20, 2024 at 07:05:45 PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> Increase i_size in iomap_zero_range() and iomap_unshare_iter() is not
>> needed, the caller should handle it. Especially, when truncate partial
>> block, we should not increase i_size beyond the new EOF here. It doesn't
>> affect xfs and gfs2 now because they set the new file size after zero
>> out, it doesn't matter that a transient increase in i_size, but it will
>> affect ext4 because it set file size before truncate. So move the i_size
>> updating logic to iomap_write_iter().
>>
> 
> This patch causes generic/522 to consistently fail when using the following
> fstest configuration,
> 
> TEST_DEV=/dev/loop16
> TEST_LOGDEV=/dev/loop13
> SCRATCH_DEV_POOL="/dev/loop5 /dev/loop6 /dev/loop7 /dev/loop8 /dev/loop9 /dev/loop10 /dev/loop11 /dev/loop12"
> MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -lsize=1g'
> MOUNT_OPTIONS='-o usrquota,grpquota,prjquota'
> TEST_FS_MOUNT_OPTS="$TEST_FS_MOUNT_OPTS -o usrquota,grpquota,prjquota"
> TEST_FS_MOUNT_OPTS="-o logdev=/dev/loop13"
> SCRATCH_LOGDEV=/dev/loop15
> USE_EXTERNAL=yes
> LOGWRITES_DEV=/dev/loop15
> 

Hello!

The root cause of the problem is caused by patch 4/9, this patch is fine,
I've revised the patch 4/9 and send it out separately in reply to v4. I've
tested that on my machine for over 100 rounds on generic/522 and it hasn't
failed again. Please take a look at the v5 patch for details and test it
on your machine.

https://lore.kernel.org/linux-xfs/20240423111735.1298851-1-yi.zhang@huaweicloud.com/T/#m2da33e643b642071aa20077321e2c431f5a64e38

Thanks,
Yi.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks
  2024-04-23 11:17   ` [PATCH v5 " Zhang Yi
@ 2024-04-25 12:22     ` Christoph Hellwig
  2024-04-25 12:32       ` Zhang Yi
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2024-04-25 12:22 UTC (permalink / raw)
  To: Zhang Yi
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, hch, brauner,
	david, chandanbabu, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On Tue, Apr 23, 2024 at 07:17:35PM +0800, Zhang Yi wrote:
> +	if ((flags & IOMAP_ZERO) && imap.br_startoff <= offset_fsb &&
> +	    isnullstartblock(imap.br_startblock)) {
> +		xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
> +
> +		if (offset_fsb >= eof_fsb)
> +			goto convert_delay;
> +		if (end_fsb > eof_fsb) {
> +			end_fsb = eof_fsb;
> +			xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);

Nit: overly long line here.

I've also tried to to a more comprehensive review, but this depends on
the rest of the series, which isn't in my linux-xfs folder for April.

I've your're not doing and instant revision it's usually much easier to
just review the whole series.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: (subset) [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof
  2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
                   ` (9 preceding siblings ...)
  2024-04-17  4:42 ` [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Chandan Babu R
@ 2024-04-25 12:25 ` Christian Brauner
  2024-04-29 11:48   ` Chandan Babu R
  10 siblings, 1 reply; 21+ messages in thread
From: Christian Brauner @ 2024-04-25 12:25 UTC (permalink / raw)
  To: Zhang Yi, Chandan Babu R
  Cc: Christian Brauner, linux-kernel, djwong, hch, david, tytso, jack,
	yi.zhang, chengzhihao1, yukuai3, linux-xfs, linux-fsdevel

On Wed, 20 Mar 2024 19:05:39 +0800, Zhang Yi wrote:
> Changes since v3:
>  - Improve some git message comments and do some minor code cleanup, no
>    logic changes.
> 
> Changes since v2:
>  - Merge the patch for dropping of xfs_convert_blocks() and the patch
>    for modifying xfs_bmapi_convert_delalloc().
>  - Reword the commit message of the second patch.
> 
> [...]

@Chandan, since the bug has been determined to be in the xfs specific changes
for this I've picked up the cleanup patches into vfs.iomap. If you need to rely
on that branch I can keep it stable.

---

Applied to the vfs.iomap branch of the vfs/vfs.git tree.
Patches in the vfs.iomap branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.iomap

[5/9] iomap: drop the write failure handles when unsharing and zeroing
      https://git.kernel.org/vfs/vfs/c/89c6c1d91ab2
[6/9] iomap: don't increase i_size if it's not a write operation
      https://git.kernel.org/vfs/vfs/c/943bc0882ceb
[7/9] iomap: use a new variable to handle the written bytes in iomap_write_iter()
      https://git.kernel.org/vfs/vfs/c/1a61d74932d4
[8/9] iomap: make iomap_write_end() return a boolean
      https://git.kernel.org/vfs/vfs/c/815f4b633ba1
[9/9] iomap: do some small logical cleanup in buffered write
      https://git.kernel.org/vfs/vfs/c/e1f453d4336d

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v5 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks
  2024-04-25 12:22     ` Christoph Hellwig
@ 2024-04-25 12:32       ` Zhang Yi
  0 siblings, 0 replies; 21+ messages in thread
From: Zhang Yi @ 2024-04-25 12:32 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-xfs, linux-fsdevel, linux-kernel, djwong, brauner, david,
	chandanbabu, tytso, jack, yi.zhang, chengzhihao1, yukuai3

On 2024/4/25 20:22, Christoph Hellwig wrote:
> On Tue, Apr 23, 2024 at 07:17:35PM +0800, Zhang Yi wrote:
>> +	if ((flags & IOMAP_ZERO) && imap.br_startoff <= offset_fsb &&
>> +	    isnullstartblock(imap.br_startblock)) {
>> +		xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
>> +
>> +		if (offset_fsb >= eof_fsb)
>> +			goto convert_delay;
>> +		if (end_fsb > eof_fsb) {
>> +			end_fsb = eof_fsb;
>> +			xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
> 
> Nit: overly long line here.
> 
> I've also tried to to a more comprehensive review, but this depends on
> the rest of the series, which isn't in my linux-xfs folder for April.
> 
> I've your're not doing and instant revision it's usually much easier to
> just review the whole series.
> .
>

Sure, I will resend the whole series.

Thanks,
Yi.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: (subset) [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof
  2024-04-25 12:25 ` (subset) " Christian Brauner
@ 2024-04-29 11:48   ` Chandan Babu R
  0 siblings, 0 replies; 21+ messages in thread
From: Chandan Babu R @ 2024-04-29 11:48 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Zhang Yi, linux-kernel, djwong, hch, david, tytso, jack,
	yi.zhang, chengzhihao1, yukuai3, linux-xfs, linux-fsdevel

On Thu, Apr 25, 2024 at 02:25:47 PM +0200, Christian Brauner wrote:
> On Wed, 20 Mar 2024 19:05:39 +0800, Zhang Yi wrote:
>> Changes since v3:
>>  - Improve some git message comments and do some minor code cleanup, no
>>    logic changes.
>> 
>> Changes since v2:
>>  - Merge the patch for dropping of xfs_convert_blocks() and the patch
>>    for modifying xfs_bmapi_convert_delalloc().
>>  - Reword the commit message of the second patch.
>> 
>> [...]
>
> @Chandan, since the bug has been determined to be in the xfs specific changes
> for this I've picked up the cleanup patches into vfs.iomap. If you need to rely
> on that branch I can keep it stable.

I am sorry about the late reply. I somehow missed this mail.

I will pick up the XFS specific patches now.

-- 
Chandan

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-04-29 11:49 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-20 11:05 [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Zhang Yi
2024-03-20 11:05 ` [PATCH v4 1/9] xfs: match lock mode in xfs_buffered_write_iomap_begin() Zhang Yi
2024-03-20 11:05 ` [PATCH v4 2/9] xfs: make the seq argument to xfs_bmapi_convert_delalloc() optional Zhang Yi
2024-03-20 11:05 ` [PATCH v4 3/9] xfs: make xfs_bmapi_convert_delalloc() to allocate the target offset Zhang Yi
2024-03-20 11:05 ` [PATCH v4 4/9] xfs: convert delayed extents to unwritten when zeroing post eof blocks Zhang Yi
2024-04-23 11:17   ` [PATCH v5 " Zhang Yi
2024-04-25 12:22     ` Christoph Hellwig
2024-04-25 12:32       ` Zhang Yi
2024-03-20 11:05 ` [PATCH v4 5/9] iomap: drop the write failure handles when unsharing and zeroing Zhang Yi
2024-03-20 11:05 ` [PATCH v4 6/9] iomap: don't increase i_size if it's not a write operation Zhang Yi
2024-04-19  6:07   ` Chandan Babu R
2024-04-19  8:14     ` Zhang Yi
2024-04-23 11:30     ` Zhang Yi
2024-03-20 11:05 ` [PATCH v4 7/9] iomap: use a new variable to handle the written bytes in iomap_write_iter() Zhang Yi
2024-03-20 11:05 ` [PATCH v4 8/9] iomap: make iomap_write_end() return a boolean Zhang Yi
2024-03-20 11:05 ` [PATCH v4 9/9] iomap: do some small logical cleanup in buffered write Zhang Yi
2024-04-17  4:42 ` [PATCH v4 0/9] xfs/iomap: fix non-atomic clone operation and don't update size when zeroing range post eof Chandan Babu R
2024-04-17 11:40   ` Christian Brauner
2024-04-18  9:30     ` Chandan Babu R
2024-04-25 12:25 ` (subset) " Christian Brauner
2024-04-29 11:48   ` Chandan Babu R

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.