All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: fix reflink source file racing with directio writes
@ 2019-08-15 16:50 Darrick J. Wong
  2019-08-15 21:37 ` Dave Chinner
  2019-08-16 16:11 ` [PATCH v2] " Darrick J. Wong
  0 siblings, 2 replies; 5+ messages in thread
From: Darrick J. Wong @ 2019-08-15 16:50 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

While trawling through the dedupe file comparison code trying to fix
page deadlocking problems, Dave Chinner noticed that the reflink code
only takes shared IOLOCK/MMAPLOCKs on the source file.  Because
page_mkwrite and directio writes do not take the EXCL versions of those
locks, this means that reflink can race with writer processes.

For pure remapping this can lead to undefined behavior and file
corruption; for dedupe this means that we cannot be sure that the
contents are identical when we decide to go ahead with the remapping.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_reflink.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index c4ec7afd1170..9fd417de15c9 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1205,7 +1205,7 @@ xfs_iolock_two_inodes_and_break_layout(
 
 retry:
 	if (src < dest) {
-		inode_lock_shared(src);
+		inode_lock(src);
 		inode_lock_nested(dest, I_MUTEX_NONDIR2);
 	} else {
 		/* src >= dest */
@@ -1216,7 +1216,7 @@ xfs_iolock_two_inodes_and_break_layout(
 	if (error == -EWOULDBLOCK) {
 		inode_unlock(dest);
 		if (src < dest)
-			inode_unlock_shared(src);
+			inode_unlock(src);
 		error = break_layout(dest, true);
 		if (error)
 			return error;
@@ -1225,11 +1225,11 @@ xfs_iolock_two_inodes_and_break_layout(
 	if (error) {
 		inode_unlock(dest);
 		if (src < dest)
-			inode_unlock_shared(src);
+			inode_unlock(src);
 		return error;
 	}
 	if (src > dest)
-		inode_lock_shared_nested(src, I_MUTEX_NONDIR2);
+		inode_lock_nested(src, I_MUTEX_NONDIR2);
 	return 0;
 }
 
@@ -1247,10 +1247,10 @@ xfs_reflink_remap_unlock(
 
 	xfs_iunlock(dest, XFS_MMAPLOCK_EXCL);
 	if (!same_inode)
-		xfs_iunlock(src, XFS_MMAPLOCK_SHARED);
+		xfs_iunlock(src, XFS_MMAPLOCK_EXCL);
 	inode_unlock(inode_out);
 	if (!same_inode)
-		inode_unlock_shared(inode_in);
+		inode_unlock(inode_in);
 }
 
 /*
@@ -1325,7 +1325,7 @@ xfs_reflink_remap_prep(
 	if (same_inode)
 		xfs_ilock(src, XFS_MMAPLOCK_EXCL);
 	else
-		xfs_lock_two_inodes(src, XFS_MMAPLOCK_SHARED, dest,
+		xfs_lock_two_inodes(src, XFS_MMAPLOCK_EXCL, dest,
 				XFS_MMAPLOCK_EXCL);
 
 	/* Check file eligibility and prepare for block sharing. */

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] xfs: fix reflink source file racing with directio writes
  2019-08-15 16:50 [PATCH] xfs: fix reflink source file racing with directio writes Darrick J. Wong
@ 2019-08-15 21:37 ` Dave Chinner
  2019-08-16  6:42   ` Christoph Hellwig
  2019-08-16 16:11 ` [PATCH v2] " Darrick J. Wong
  1 sibling, 1 reply; 5+ messages in thread
From: Dave Chinner @ 2019-08-15 21:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs

On Thu, Aug 15, 2019 at 09:50:43AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> While trawling through the dedupe file comparison code trying to fix
> page deadlocking problems, Dave Chinner noticed that the reflink code
> only takes shared IOLOCK/MMAPLOCKs on the source file.  Because
> page_mkwrite and directio writes do not take the EXCL versions of those
> locks, this means that reflink can race with writer processes.
> 
> For pure remapping this can lead to undefined behavior and file
> corruption; for dedupe this means that we cannot be sure that the
> contents are identical when we decide to go ahead with the remapping.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

I don't think this is quite right yet. The source inode locking
change is good, but it doesn't break the layout on the source inode
and so there is still they possibility that something has physical
access and is directly modifying the source file.

And with that, I suspect the locking algorithm changes
substantially:

	order inodes
restart:
	lock first inode
	break layout on first inode
	lock second inode
	break layout on second inode
	fail then unlock both inodes, goto restart

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] xfs: fix reflink source file racing with directio writes
  2019-08-15 21:37 ` Dave Chinner
@ 2019-08-16  6:42   ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2019-08-16  6:42 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Darrick J. Wong, xfs

On Fri, Aug 16, 2019 at 07:37:17AM +1000, Dave Chinner wrote:
> I don't think this is quite right yet. The source inode locking
> change is good, but it doesn't break the layout on the source inode
> and so there is still they possibility that something has physical
> access and is directly modifying the source file.
> 
> And with that, I suspect the locking algorithm changes
> substantially:
> 
> 	order inodes
> restart:
> 	lock first inode
> 	break layout on first inode
> 	lock second inode
> 	break layout on second inode
> 	fail then unlock both inodes, goto restart

Agreed, that is the locking scheme I'd expect.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2] xfs: fix reflink source file racing with directio writes
  2019-08-15 16:50 [PATCH] xfs: fix reflink source file racing with directio writes Darrick J. Wong
  2019-08-15 21:37 ` Dave Chinner
@ 2019-08-16 16:11 ` Darrick J. Wong
  2019-08-18  8:38   ` Christoph Hellwig
  1 sibling, 1 reply; 5+ messages in thread
From: Darrick J. Wong @ 2019-08-16 16:11 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs, Christoph Hellwig

From: Darrick J. Wong <darrick.wong@oracle.com>

While trawling through the dedupe file comparison code trying to fix
page deadlocking problems, Dave Chinner noticed that the reflink code
only takes shared IOLOCK/MMAPLOCKs on the source file.  Because
page_mkwrite and directio writes do not take the EXCL versions of those
locks, this means that reflink can race with writer processes.

For pure remapping this can lead to undefined behavior and file
corruption; for dedupe this means that we cannot be sure that the
contents are identical when we decide to go ahead with the remapping.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: break both layouts
---
 fs/xfs/xfs_reflink.c |   63 +++++++++++++++++++++++++++++---------------------
 1 file changed, 37 insertions(+), 26 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index c4ec7afd1170..edbe37b7f636 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1190,11 +1190,11 @@ xfs_reflink_remap_blocks(
 }
 
 /*
- * Grab the exclusive iolock for a data copy from src to dest, making
- * sure to abide vfs locking order (lowest pointer value goes first) and
- * breaking the pnfs layout leases on dest before proceeding.  The loop
- * is needed because we cannot call the blocking break_layout() with the
- * src iolock held, and therefore have to back out both locks.
+ * Grab the exclusive iolock for a data copy from src to dest, making sure to
+ * abide vfs locking order (lowest pointer value goes first) and breaking the
+ * layout leases before proceeding.  The loop is needed because we cannot call
+ * the blocking break_layout() with the iolocks held, and therefore have to
+ * back out both locks.
  */
 static int
 xfs_iolock_two_inodes_and_break_layout(
@@ -1203,33 +1203,44 @@ xfs_iolock_two_inodes_and_break_layout(
 {
 	int			error;
 
-retry:
-	if (src < dest) {
-		inode_lock_shared(src);
-		inode_lock_nested(dest, I_MUTEX_NONDIR2);
-	} else {
-		/* src >= dest */
-		inode_lock(dest);
-	}
+	if (src > dest)
+		swap(src, dest);
 
-	error = break_layout(dest, false);
-	if (error == -EWOULDBLOCK) {
-		inode_unlock(dest);
-		if (src < dest)
-			inode_unlock_shared(src);
+retry:
+	/* Wait to break both inodes' layouts before we start locking. */
+	error = break_layout(src, true);
+	if (error)
+		return error;
+	if (src != dest) {
 		error = break_layout(dest, true);
 		if (error)
 			return error;
-		goto retry;
 	}
+
+	/* Lock one inode and make sure nobody got in and leased it. */
+	inode_lock(src);
+	error = break_layout(src, false);
 	if (error) {
+		inode_unlock(src);
+		if (error == -EWOULDBLOCK)
+			goto retry;
+		return error;
+	}
+
+	if (src == dest)
+		return 0;
+
+	/* Lock the other inode and make sure nobody got in and leased it. */
+	inode_lock_nested(dest, I_MUTEX_NONDIR2);
+	error = break_layout(dest, false);
+	if (error) {
+		inode_unlock(src);
 		inode_unlock(dest);
-		if (src < dest)
-			inode_unlock_shared(src);
+		if (error == -EWOULDBLOCK)
+			goto retry;
 		return error;
 	}
-	if (src > dest)
-		inode_lock_shared_nested(src, I_MUTEX_NONDIR2);
+
 	return 0;
 }
 
@@ -1247,10 +1258,10 @@ xfs_reflink_remap_unlock(
 
 	xfs_iunlock(dest, XFS_MMAPLOCK_EXCL);
 	if (!same_inode)
-		xfs_iunlock(src, XFS_MMAPLOCK_SHARED);
+		xfs_iunlock(src, XFS_MMAPLOCK_EXCL);
 	inode_unlock(inode_out);
 	if (!same_inode)
-		inode_unlock_shared(inode_in);
+		inode_unlock(inode_in);
 }
 
 /*
@@ -1325,7 +1336,7 @@ xfs_reflink_remap_prep(
 	if (same_inode)
 		xfs_ilock(src, XFS_MMAPLOCK_EXCL);
 	else
-		xfs_lock_two_inodes(src, XFS_MMAPLOCK_SHARED, dest,
+		xfs_lock_two_inodes(src, XFS_MMAPLOCK_EXCL, dest,
 				XFS_MMAPLOCK_EXCL);
 
 	/* Check file eligibility and prepare for block sharing. */

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] xfs: fix reflink source file racing with directio writes
  2019-08-16 16:11 ` [PATCH v2] " Darrick J. Wong
@ 2019-08-18  8:38   ` Christoph Hellwig
  0 siblings, 0 replies; 5+ messages in thread
From: Christoph Hellwig @ 2019-08-18  8:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Dave Chinner, xfs, Christoph Hellwig

On Fri, Aug 16, 2019 at 09:11:34AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> While trawling through the dedupe file comparison code trying to fix
> page deadlocking problems, Dave Chinner noticed that the reflink code
> only takes shared IOLOCK/MMAPLOCKs on the source file.  Because
> page_mkwrite and directio writes do not take the EXCL versions of those
> locks, this means that reflink can race with writer processes.
> 
> For pure remapping this can lead to undefined behavior and file
> corruption; for dedupe this means that we cannot be sure that the
> contents are identical when we decide to go ahead with the remapping.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-08-18  8:38 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-15 16:50 [PATCH] xfs: fix reflink source file racing with directio writes Darrick J. Wong
2019-08-15 21:37 ` Dave Chinner
2019-08-16  6:42   ` Christoph Hellwig
2019-08-16 16:11 ` [PATCH v2] " Darrick J. Wong
2019-08-18  8:38   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.