linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Subject: [PATCH 2/2] xfs: use iomap_valid method to detect stale cached iomaps
Date: Wed, 21 Sep 2022 18:29:59 +1000	[thread overview]
Message-ID: <20220921082959.1411675-3-david@fromorbit.com> (raw)
In-Reply-To: <20220921082959.1411675-1-david@fromorbit.com>

From: Dave Chinner <dchinner@redhat.com>

Now that iomap supports a mechanism to validate cached iomaps for
buffered write operations, hook it up to the XFS buffered write ops
so that we can avoid data corruptions that result from stale cached
iomaps. See:

https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/

or the ->iomap_valid() introduction commit for exact details of the
corruption vector.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_iomap.c | 53 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 49 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 07da03976ec1..2e77ae817e6b 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -91,6 +91,12 @@ xfs_bmbt_to_iomap(
 	if (xfs_ipincount(ip) &&
 	    (ip->i_itemp->ili_fsync_fields & ~XFS_ILOG_TIMESTAMP))
 		iomap->flags |= IOMAP_F_DIRTY;
+
+	/*
+	 * Sample the extent tree sequence so that we can detect if the tree
+	 * changes while the iomap is still being used.
+	 */
+	*((int *)&iomap->private) = READ_ONCE(ip->i_df.if_seq);
 	return 0;
 }
 
@@ -915,6 +921,7 @@ xfs_buffered_write_iomap_begin(
 	int			allocfork = XFS_DATA_FORK;
 	int			error = 0;
 	unsigned int		lockmode = XFS_ILOCK_EXCL;
+	u16			remap_flags = 0;
 
 	if (xfs_is_shutdown(mp))
 		return -EIO;
@@ -926,6 +933,20 @@ xfs_buffered_write_iomap_begin(
 
 	ASSERT(!XFS_IS_REALTIME_INODE(ip));
 
+	/*
+	 * If we are remapping a stale iomap, preserve the IOMAP_F_NEW flag
+	 * if it is passed to us. This will only be set if we are remapping a
+	 * range that we just allocated and hence had set IOMAP_F_NEW on. We
+	 * need to set it again here so any further writes over this newly
+	 * allocated region we are remapping are preserved.
+	 *
+	 * This pairs with the code in xfs_buffered_write_iomap_end() that skips
+	 * punching newly allocated delalloc regions that have iomaps marked as
+	 * stale.
+	 */
+	if (iomap->flags & IOMAP_F_STALE)
+		remap_flags = iomap->flags & IOMAP_F_NEW;
+
 	error = xfs_ilock_for_iomap(ip, flags, &lockmode);
 	if (error)
 		return error;
@@ -1100,7 +1121,7 @@ xfs_buffered_write_iomap_begin(
 
 found_imap:
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
-	return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, 0);
+	return xfs_bmbt_to_iomap(ip, iomap, &imap, flags, remap_flags);
 
 found_cow:
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
@@ -1160,13 +1181,20 @@ xfs_buffered_write_iomap_end(
 
 	/*
 	 * Trim delalloc blocks if they were allocated by this write and we
-	 * didn't manage to write the whole range.
+	 * didn't manage to write the whole range. If the iomap was marked stale
+	 * because it is no longer valid, we are going to remap this range
+	 * immediately, so don't punch it out.
 	 *
-	 * We don't need to care about racing delalloc as we hold i_mutex
+	 * XXX (dgc): This next comment and assumption is totally bogus because
+	 * iomap_page_mkwrite() runs through here and it doesn't hold the
+	 * i_rwsem. Hence this whole error handling path may be badly broken.
+	 *
+	 * We don't need to care about racing delalloc as we hold i_rwsem
 	 * across the reserve/allocate/unreserve calls. If there are delalloc
 	 * blocks in the range, they are ours.
 	 */
-	if ((iomap->flags & IOMAP_F_NEW) && start_fsb < end_fsb) {
+	if (((iomap->flags & (IOMAP_F_NEW | IOMAP_F_STALE)) == IOMAP_F_NEW) &&
+	    start_fsb < end_fsb) {
 		truncate_pagecache_range(VFS_I(ip), XFS_FSB_TO_B(mp, start_fsb),
 					 XFS_FSB_TO_B(mp, end_fsb) - 1);
 
@@ -1182,9 +1210,26 @@ xfs_buffered_write_iomap_end(
 	return 0;
 }
 
+/*
+ * Check that the iomap passed to us is still valid for the given offset and
+ * length.
+ */
+static bool
+xfs_buffered_write_iomap_valid(
+	struct inode		*inode,
+	const struct iomap	*iomap)
+{
+	int			seq = *((int *)&iomap->private);
+
+	if (seq != READ_ONCE(XFS_I(inode)->i_df.if_seq))
+		return false;
+	return true;
+}
+
 const struct iomap_ops xfs_buffered_write_iomap_ops = {
 	.iomap_begin		= xfs_buffered_write_iomap_begin,
 	.iomap_end		= xfs_buffered_write_iomap_end,
+	.iomap_valid		= xfs_buffered_write_iomap_valid,
 };
 
 static int
-- 
2.37.2


  parent reply	other threads:[~2022-09-21  8:30 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-21  8:29 [RFC PATCH 0/2] iomap/xfs: fix data corruption due to stale cached iomaps Dave Chinner
2022-09-21  8:29 ` [PATCH 1/2] iomap: write iomap validity checks Dave Chinner
2022-09-22  3:40   ` Darrick J. Wong
2022-09-22 23:16     ` Dave Chinner
2022-09-28  4:48       ` Darrick J. Wong
2022-09-21  8:29 ` Dave Chinner [this message]
2022-09-22  3:44   ` [PATCH 2/2] xfs: use iomap_valid method to detect stale cached iomaps Darrick J. Wong
2022-09-23  0:04     ` Dave Chinner
2022-09-28  4:54       ` Darrick J. Wong
2022-09-29  1:45         ` Dave Chinner
2022-10-04 23:34           ` Frank Sorenson
2022-10-05  1:34             ` Darrick J. Wong
2022-09-22  4:25 ` [RFC PATCH 0/2] iomap/xfs: fix data corruption due to " Darrick J. Wong
2022-09-22 22:59   ` Dave Chinner
2022-09-28  5:16     ` Darrick J. Wong
2022-09-29  2:11       ` Dave Chinner
2022-09-29  2:15         ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220921082959.1411675-3-david@fromorbit.com \
    --to=david@fromorbit.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).