[PATCH V2] xfs: Fix agi&agf ABBA deadlock when performing rename with RENAME_WHITEOUT flag

* [PATCH V2] xfs: Fix agi&agf ABBA deadlock when performing rename with RENAME_WHITEOUT flag
@ 2019-08-19 13:06 kaixuxia
  2019-08-19 15:13 ` Brian Foster
  0 siblings, 1 reply; 11+ messages in thread
From: kaixuxia @ 2019-08-19 13:06 UTC (permalink / raw)
  To: linux-xfs
  Cc: Darrick J. Wong, Brian Foster, Dave Chinner, newtongao,
	jasperwang, xiakaixu1987

When performing rename operation with RENAME_WHITEOUT flag, we will
hold AGF lock to allocate or free extents in manipulating the dirents
firstly, and then doing the xfs_iunlink_remove() call last to hold
AGI lock to modify the tmpfile info, so we the lock order AGI->AGF.

The big problem here is that we have an ordering constraint on AGF
and AGI locking - inode allocation locks the AGI, then can allocate
a new extent for new inodes, locking the AGF after the AGI. Hence
the ordering that is imposed by other parts of the code is AGI before
AGF. So we get the ABBA agi&agf deadlock here.

Process A:
Call trace:
  ? __schedule+0x2bd/0x620
  schedule+0x33/0x90
  schedule_timeout+0x17d/0x290
  __down_common+0xef/0x125
  ? xfs_buf_find+0x215/0x6c0 [xfs]
  down+0x3b/0x50
  xfs_buf_lock+0x34/0xf0 [xfs]
  xfs_buf_find+0x215/0x6c0 [xfs]
  xfs_buf_get_map+0x37/0x230 [xfs]
  xfs_buf_read_map+0x29/0x190 [xfs]
  xfs_trans_read_buf_map+0x13d/0x520 [xfs]
  xfs_read_agf+0xa6/0x180 [xfs]
  ? schedule_timeout+0x17d/0x290
  xfs_alloc_read_agf+0x52/0x1f0 [xfs]
  xfs_alloc_fix_freelist+0x432/0x590 [xfs]
  ? down+0x3b/0x50
  ? xfs_buf_lock+0x34/0xf0 [xfs]
  ? xfs_buf_find+0x215/0x6c0 [xfs]
  xfs_alloc_vextent+0x301/0x6c0 [xfs]
  xfs_ialloc_ag_alloc+0x182/0x700 [xfs]
  ? _xfs_trans_bjoin+0x72/0xf0 [xfs]
  xfs_dialloc+0x116/0x290 [xfs]
  xfs_ialloc+0x6d/0x5e0 [xfs]
  ? xfs_log_reserve+0x165/0x280 [xfs]
  xfs_dir_ialloc+0x8c/0x240 [xfs]
  xfs_create+0x35a/0x610 [xfs]
  xfs_generic_create+0x1f1/0x2f0 [xfs]
  ...

Process B:
Call trace:
  ? __schedule+0x2bd/0x620
  ? xfs_bmapi_allocate+0x245/0x380 [xfs]
  schedule+0x33/0x90
  schedule_timeout+0x17d/0x290
  ? xfs_buf_find+0x1fd/0x6c0 [xfs]
  __down_common+0xef/0x125
  ? xfs_buf_get_map+0x37/0x230 [xfs]
  ? xfs_buf_find+0x215/0x6c0 [xfs]
  down+0x3b/0x50
  xfs_buf_lock+0x34/0xf0 [xfs]
  xfs_buf_find+0x215/0x6c0 [xfs]
  xfs_buf_get_map+0x37/0x230 [xfs]
  xfs_buf_read_map+0x29/0x190 [xfs]
  xfs_trans_read_buf_map+0x13d/0x520 [xfs]
  xfs_read_agi+0xa8/0x160 [xfs]
  xfs_iunlink_remove+0x6f/0x2a0 [xfs]
  ? current_time+0x46/0x80
  ? xfs_trans_ichgtime+0x39/0xb0 [xfs]
  xfs_rename+0x57a/0xae0 [xfs]
  xfs_vn_rename+0xe4/0x150 [xfs]
  ...

In this patch we move the xfs_iunlink_remove() call to between
xfs_dir_canenter() and xfs_dir_createname(). By doing xfs_iunlink
_remove() firstly, we remove the AGI/AGF lock inversion problem.

Signed-off-by: kaixuxia <kaixuxia@tencent.com>
---
  fs/xfs/xfs_inode.c | 20 +++++++++++++++++---
  1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 6467d5e..48691f2 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3294,6 +3294,18 @@ struct xfs_iunlink {
  			if (error)
  				goto out_trans_cancel;
  		}
+
+		/*
+		 * Handle the whiteout inode and acquire the AGI lock, so
+		 * fix the AGI/AGF lock inversion problem.
+		 */
+		if (wip) {
+			ASSERT(VFS_I(wip)->i_nlink == 0);
+			error = xfs_iunlink_remove(tp, wip);
+			if (error)
+				goto out_trans_cancel;
+		}
+
  		/*
  		 * If target does not exist and the rename crosses
  		 * directories, adjust the target directory link count
@@ -3428,9 +3440,11 @@ struct xfs_iunlink {
  	if (wip) {
  		ASSERT(VFS_I(wip)->i_nlink == 0);
  		xfs_bumplink(tp, wip);
-		error = xfs_iunlink_remove(tp, wip);
-		if (error)
-			goto out_trans_cancel;
+		if (target_ip != NULL) {
+			error = xfs_iunlink_remove(tp, wip);
+			if (error)
+				goto out_trans_cancel;
+		}
  		xfs_trans_log_inode(tp, wip, XFS_ILOG_CORE);

  		/*
-- 
1.8.3.1

-- 
kaixuxia

^ permalink raw reply related	[flat|nested] 11+ messages in thread