[PATCH] xfs: fix an ABBA deadlock in xfs_rename

* [PATCH] xfs: fix an ABBA deadlock in xfs_rename
@ 2021-01-04 19:44 Darrick J. Wong
  2021-01-04 19:51 ` Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Darrick J. Wong @ 2021-01-04 19:44 UTC (permalink / raw)
  To: wenli xie; +Cc: xfs, chiluk, Brian Foster

From: Darrick J. Wong <darrick.wong@oracle.com>

When overlayfs is running on top of xfs and the user unlinks a file in
the overlay, overlayfs will create a whiteout inode and ask xfs to
"rename" the whiteout file atop the one being unlinked.  If the file
being unlinked loses its one nlink, we then have to put the inode on the
unlinked list.

This requires us to grab the AGI buffer of the whiteout inode to take it
off the unlinked list (which is where whiteouts are created) and to grab
the AGI buffer of the file being deleted.  If the whiteout was created
in a higher numbered AG than the file being deleted, we'll lock the AGIs
in the wrong order and deadlock.

Therefore, grab all the AGI locks we think we'll need ahead of time, and
in the correct order.

Reported-by: wenli xie <wlxie7296@gmail.com>
Tested-by: wenli xie <wlxie7296@gmail.com>
Fixes: 93597ae8dac0 ("xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_inode.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index b7352bc4c815..dd419a1bc6ba 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3000,6 +3000,48 @@ xfs_rename_alloc_whiteout(
 	return 0;
 }
 
+/*
+ * For the general case of renaming files, lock all the AGI buffers we need to
+ * handle bumping the nlink of the whiteout inode off the unlinked list and to
+ * handle dropping the nlink of the target inode.  We have to do this in
+ * increasing AG order to avoid deadlocks.
+ */
+static int
+xfs_rename_lock_agis(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*wip,
+	struct xfs_inode	*target_ip)
+{
+	struct xfs_mount	*mp = tp->t_mountp;
+	struct xfs_buf		*bp;
+	xfs_agnumber_t		agi_locks[2] = { NULLAGNUMBER, NULLAGNUMBER };
+	int			error;
+
+	if (wip)
+		agi_locks[0] = XFS_INO_TO_AGNO(mp, wip->i_ino);
+
+	if (target_ip && VFS_I(target_ip)->i_nlink == 1)
+		agi_locks[1] = XFS_INO_TO_AGNO(mp, target_ip->i_ino);
+
+	if (agi_locks[0] != NULLAGNUMBER && agi_locks[1] != NULLAGNUMBER &&
+	    agi_locks[0] > agi_locks[1])
+		swap(agi_locks[0], agi_locks[1]);
+
+	if (agi_locks[0] != NULLAGNUMBER) {
+		error = xfs_read_agi(mp, tp, agi_locks[0], &bp);
+		if (error)
+			return error;
+	}
+
+	if (agi_locks[1] != NULLAGNUMBER) {
+		error = xfs_read_agi(mp, tp, agi_locks[1], &bp);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
 /*
  * xfs_rename
  */
@@ -3130,6 +3172,10 @@ xfs_rename(
 		}
 	}
 
+	error = xfs_rename_lock_agis(tp, wip, target_ip);
+	if (error)
+		return error;
+
 	/*
 	 * Directory entry creation below may acquire the AGF. Remove
 	 * the whiteout from the unlinked list first to preserve correct

^ permalink raw reply related	[flat|nested] 8+ messages in thread