* [PATCH v2] xfs: fix incorrect i_nlink caused by inode racing
@ 2022-11-17 2:58 Long Li
2022-11-17 3:13 ` Darrick J. Wong
0 siblings, 1 reply; 2+ messages in thread
From: Long Li @ 2022-11-17 2:58 UTC
To: djwong; +Cc: david, linux-xfs, houtao1, yi.zhang, guoxuenan

The following assertion failed during an fsstress test:

XFS: Assertion failed: VFS_I(ip)->i_nlink >= 2, file: fs/xfs/xfs_inode.c, line: 2452

The problem is that an inode race causes an incorrect i_nlink to be
written to disk, which is later read back into memory. Consider the
following two call paths: for an inode that is marked as both XFS_IFLUSHING
and XFS_IRECLAIMABLE, i_nlink is reset to 1 and then restored to its
original value in xfs_reinit_inode(). If the flush runs inside that window,
the on-disk i_nlink of a directory may be set to 1.

xfsaild
    xfs_inode_item_push
        xfs_iflush_cluster
            xfs_iflush
                xfs_inode_to_disk

xfs_iget
    xfs_iget_cache_hit
        xfs_iget_recycle
            xfs_reinit_inode
                inode_init_always

xfs_reinit_inode() needs to hold the ILOCK_EXCL because it changes internal
inode state and can race with other RCU-protected inode lookups. On the
read side, xfs_iflush_cluster() grabs the ILOCK_SHARED while under rcu +
ip->i_flags_lock, so xfs_iflush/xfs_inode_to_disk() are protected by that
lock from racing inode updates (during transactions).

Signed-off-by: Long Li <leo.lilong@huawei.com>
---
v2:
- Modify the assertion error code line number
- Use ILOCK_EXCL to prevent inode racing
fs/xfs/xfs_icache.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index eae7427062cf..5a1650e769e7 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -329,7 +329,7 @@ xfs_reinit_inode(
/*
* Carefully nudge an inode whose VFS state has been torn down back into a
- * usable state. Drops the i_flags_lock and the rcu read lock.
+ * usable state. Drops the i_flags_lock, rcu read lock and XFS_ILOCK_EXCL.
*/
static int
xfs_iget_recycle(
@@ -355,6 +355,7 @@ xfs_iget_recycle(
ASSERT(!rwsem_is_locked(&inode->i_rwsem));
error = xfs_reinit_inode(mp, inode);
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
if (error) {
/*
* Re-initializing the inode failed, and we are in deep
@@ -516,7 +517,10 @@ xfs_iget_cache_hit(
/* The inode fits the selection criteria; process it. */
if (ip->i_flags & XFS_IRECLAIMABLE) {
- /* Drops i_flags_lock and RCU read lock. */
+ if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL))
+ goto out_skip;
+
+ /* Drops i_flags_lock, RCU read lock and XFS_ILOCK_EXCL. */
error = xfs_iget_recycle(pag, ip);
if (error)
return error;
--
2.31.1
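
The race described in the commit message can be illustrated with a small
single-threaded userspace model. This is only a sketch under stated
assumptions, not kernel code: struct model_inode and the model_* helpers are
hypothetical names, the lock is a pair of plain fields rather than a real
rwsem, and both lock attempts are modeled as trylocks. The recycle path is
split into two steps so that a flush can be placed inside the window where
nlink is temporarily 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the real struct xfs_inode. */
struct model_inode {
	int		readers;	/* holders of the shared lock */
	bool		writer;		/* exclusive lock held */
	unsigned int	nlink;		/* in-memory link count */
	unsigned int	saved_nlink;	/* value to restore after reinit */
};

/* Flusher side: copy nlink to "disk" only under the shared lock. */
static bool model_flush(struct model_inode *ip, unsigned int *disk_nlink)
{
	if (ip->writer)
		return false;		/* shared trylock fails; skip this inode */
	ip->readers++;
	*disk_nlink = ip->nlink;
	ip->readers--;
	return true;
}

/* Recycle side, step 1: optionally take the exclusive lock, then reset. */
static bool model_reinit_begin(struct model_inode *ip, bool take_lock)
{
	if (take_lock) {
		if (ip->writer || ip->readers > 0)
			return false;	/* exclusive trylock fails */
		ip->writer = true;
	}
	ip->saved_nlink = ip->nlink;
	ip->nlink = 1;			/* transient value during reinit */
	return true;
}

/* Recycle side, step 2: restore the link count and drop the lock if held. */
static void model_reinit_end(struct model_inode *ip)
{
	ip->nlink = ip->saved_nlink;
	ip->writer = false;
}
```

Calling model_reinit_begin() with take_lock set to false and then
model_flush() copies the transient value 1 to "disk", which is the failure
mode reported above. With take_lock set to true, the same model_flush() call
is refused until model_reinit_end() has restored the original count.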
* Re: [PATCH v2] xfs: fix incorrect i_nlink caused by inode racing
2022-11-17 2:58 [PATCH v2] xfs: fix incorrect i_nlink caused by inode racing Long Li
@ 2022-11-17 3:13 ` Darrick J. Wong
0 siblings, 0 replies; 2+ messages in thread
From: Darrick J. Wong @ 2022-11-17 3:13 UTC
To: Long Li; +Cc: david, linux-xfs, houtao1, yi.zhang, guoxuenan
On Thu, Nov 17, 2022 at 10:58:29AM +0800, Long Li wrote:
> The following error occurred during the fsstress test:
>
> XFS: Assertion failed: VFS_I(ip)->i_nlink >= 2, file: fs/xfs/xfs_inode.c, line: 2452
>
> The problem was that inode race condition causes incorrect i_nlink to be
> written to disk, and then it is read into memory. Consider the following
> call graph, inodes that are marked as both XFS_IFLUSHING and
> XFS_IRECLAIMABLE, i_nlink will be reset to 1 and then restored to original
> value in xfs_reinit_inode(). Therefore, the i_nlink of directory on disk
> may be set to 1.
>
> xfsaild
> xfs_inode_item_push
> xfs_iflush_cluster
> xfs_iflush
> xfs_inode_to_disk
>
> xfs_iget
> xfs_iget_cache_hit
> xfs_iget_recycle
> xfs_reinit_inode
> inode_init_always
>
> xfs_reinit_inode() needs to hold the ILOCK_EXCL as it is changing internal
> inode state and can race with other RCU protected inode lookups. On the
> read side, xfs_iflush_cluster() grabs the ILOCK_SHARED while under rcu +
> ip->i_flags_lock, and so xfs_iflush/xfs_inode_to_disk() are protected from
> racing inode updates (during transactions) by that lock.
>
> Signed-off-by: Long Li <leo.lilong@huawei.com>
> ---
> v2:
> - Modify the assertion error code line number
> - Use ILOCK_EXCL to prevent inode racing
>
> fs/xfs/xfs_icache.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index eae7427062cf..5a1650e769e7 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -329,7 +329,7 @@ xfs_reinit_inode(
>
> /*
> * Carefully nudge an inode whose VFS state has been torn down back into a
> - * usable state. Drops the i_flags_lock and the rcu read lock.
> + * usable state. Drops the i_flags_lock, rcu read lock and XFS_ILOCK_EXCL.
> */
> static int
> xfs_iget_recycle(
> @@ -355,6 +355,7 @@ xfs_iget_recycle(
>
> ASSERT(!rwsem_is_locked(&inode->i_rwsem));
> error = xfs_reinit_inode(mp, inode);
> + xfs_iunlock(ip, XFS_ILOCK_EXCL);

Ugh, please don't take a lock in one function and drop it in a different
function.  If the trylock is really necessary for this operation, have
xfs_iget_recycle return EAGAIN and then make xfs_iget_cache_hit goto
out_skip if recycling returns EAGAIN.

--D

> if (error) {
> /*
> * Re-initializing the inode failed, and we are in deep
> @@ -516,7 +517,10 @@ xfs_iget_cache_hit(
>
> /* The inode fits the selection criteria; process it. */
> if (ip->i_flags & XFS_IRECLAIMABLE) {
> - /* Drops i_flags_lock and RCU read lock. */
> + if (!xfs_ilock_nowait(ip, XFS_ILOCK_EXCL))
> + goto out_skip;
> +
> + /* Drops i_flags_lock, RCU read lock and XFS_ILOCK_EXCL. */
> error = xfs_iget_recycle(pag, ip);
> if (error)
> return error;
> --
> 2.31.1
>
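
The restructuring suggested in the review comment above (take and drop the
lock inside the recycle function, and report contention to the caller as
EAGAIN) might look roughly like the following userspace sketch. It is not
the eventual kernel patch: struct toy_inode, the toy_* helpers and the
skipped counter are hypothetical stand-ins, and the lock is a plain boolean
so the example stays single-threaded and deterministic.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode; not the real struct xfs_inode. */
struct toy_inode {
	bool		ilock_excl_held;	/* models XFS_ILOCK_EXCL */
	bool		reclaimable;		/* models XFS_IRECLAIMABLE */
	unsigned int	nlink;
};

/* Models a trylock: succeeds only if nobody holds the lock. */
static bool toy_ilock_nowait(struct toy_inode *ip)
{
	if (ip->ilock_excl_held)
		return false;
	ip->ilock_excl_held = true;
	return true;
}

static void toy_iunlock(struct toy_inode *ip)
{
	assert(ip->ilock_excl_held);
	ip->ilock_excl_held = false;
}

/* Takes and drops the lock itself; returns -EAGAIN if it is contended. */
static int toy_recycle(struct toy_inode *ip)
{
	unsigned int saved_nlink = ip->nlink;

	if (!toy_ilock_nowait(ip))
		return -EAGAIN;

	ip->nlink = 1;			/* transient reset during reinit */
	ip->nlink = saved_nlink;	/* restored before the lock is dropped */
	ip->reclaimable = false;

	toy_iunlock(ip);
	return 0;
}

/* Caller: never touches the lock, only reacts to -EAGAIN. */
static int toy_cache_hit(struct toy_inode *ip, unsigned int *skipped)
{
	int error = 0;

	if (ip->reclaimable) {
		error = toy_recycle(ip);
		if (error == -EAGAIN)
			(*skipped)++;	/* stands in for "goto out_skip" */
	}
	return error;
}
```

With this shape the lock and unlock calls are paired in one function, so the
comment on the caller no longer has to document a lock that the callee
drops on its behalf.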