linux-xfs.vger.kernel.org archive mirror
* [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim
@ 2020-07-07 19:16 Waiman Long
  2020-07-09 13:39 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Waiman Long @ 2020-07-07 19:16 UTC
  To: Darrick J. Wong
  Cc: linux-xfs, linux-kernel, Dave Chinner, Qian Cai, Eric Sandeen,
	Waiman Long

Depending on the workloads, the following circular locking dependency
warning between sb_internal (a percpu rwsem) and fs_reclaim (a pseudo
lock) may show up:

======================================================
WARNING: possible circular locking dependency detected
5.0.0-rc1+ #60 Tainted: G        W
------------------------------------------------------
fsfreeze/4346 is trying to acquire lock:
0000000026f1d784 (fs_reclaim){+.+.}, at:
fs_reclaim_acquire.part.19+0x5/0x30

but task is already holding lock:
0000000072bfc54b (sb_internal){++++}, at: percpu_down_write+0xb4/0x650

which lock already depends on the new lock.
  :
 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(sb_internal);
                               lock(fs_reclaim);
                               lock(sb_internal);
  lock(fs_reclaim);

 *** DEADLOCK ***

4 locks held by fsfreeze/4346:
 #0: 00000000b478ef56 (sb_writers#8){++++}, at: percpu_down_write+0xb4/0x650
 #1: 000000001ec487a9 (&type->s_umount_key#28){++++}, at: freeze_super+0xda/0x290
 #2: 000000003edbd5a0 (sb_pagefaults){++++}, at: percpu_down_write+0xb4/0x650
 #3: 0000000072bfc54b (sb_internal){++++}, at: percpu_down_write+0xb4/0x650

stack backtrace:
Call Trace:
 dump_stack+0xe0/0x19a
 print_circular_bug.isra.10.cold.34+0x2f4/0x435
 check_prev_add.constprop.19+0xca1/0x15f0
 validate_chain.isra.14+0x11af/0x3b50
 __lock_acquire+0x728/0x1200
 lock_acquire+0x269/0x5a0
 fs_reclaim_acquire.part.19+0x29/0x30
 fs_reclaim_acquire+0x19/0x20
 kmem_cache_alloc+0x3e/0x3f0
 kmem_zone_alloc+0x79/0x150
 xfs_trans_alloc+0xfa/0x9d0
 xfs_sync_sb+0x86/0x170
 xfs_log_sbcount+0x10f/0x140
 xfs_quiesce_attr+0x134/0x270
 xfs_fs_freeze+0x4a/0x70
 freeze_super+0x1af/0x290
 do_vfs_ioctl+0xedc/0x16c0
 ksys_ioctl+0x41/0x80
 __x64_sys_ioctl+0x73/0xa9
 do_syscall_64+0x18f/0xd23
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

This is a false positive as all the dirty pages are flushed out before
the filesystem can be frozen.

One way to avoid this splat is to make the affected allocations GFP_NOFS
by wrapping them in a memalloc_nofs_save()/memalloc_nofs_restore() pair.
This shouldn't matter unless the system is really running out of memory.
In that case, a filesystem freeze operation that succeeded previously
may now fail.
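
To illustrate why a scoped NOFS context removes the dependency, here is a
minimal userspace model of the save/restore pattern. It is only a sketch:
the model_* names and MODEL_* bit values are made up for illustration and
are not the kernel's definitions. The assumption is that
memalloc_nofs_save() sets a per-task flag and returns its previous state,
that allocations made while the flag is set behave as if __GFP_FS were
cleared, and that fs_reclaim is only acquired for allocations that still
carry __GFP_FS.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's gfp bits and task flag. */
#define MODEL_GFP_IO		0x1u
#define MODEL_GFP_FS		0x2u
#define MODEL_GFP_KERNEL	(MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_PF_MEMALLOC_NOFS	0x4u

/* Models the flags word of a single task. */
static unsigned int model_task_flags;

/* Enter a NOFS scope; return the previous state of the NOFS bit. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOFS;

	model_task_flags |= MODEL_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope, restoring whatever state was saved on entry. */
static void model_nofs_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOFS) | old;
}

/* Effective allocation mask: the FS bit is dropped inside a NOFS scope. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_MEMALLOC_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}

/* True if an allocation with this mask may recurse into the filesystem. */
static bool model_may_enter_fs(unsigned int gfp)
{
	return (model_effective_gfp(gfp) & MODEL_GFP_FS) != 0;
}

/* Same shape as the patched freeze path: save, allocate, restore. */
static void model_freeze_path(void)
{
	unsigned int flags = model_nofs_save();

	/* A GFP_KERNEL-style allocation in here has lost the FS bit. */
	assert(!model_may_enter_fs(MODEL_GFP_KERNEL));
	model_nofs_restore(flags);
}
```

Returning the old state from save, rather than clearing the flag
unconditionally in restore, is what lets the pair nest safely when a
caller is already inside a NOFS scope.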

Without this patch, the command sequence below will show that the lock
dependency chain sb_internal -> fs_reclaim exists.

 # fsfreeze -f /home
 # fsfreeze --unfreeze /home
 # grep -i fs_reclaim -C 3 /proc/lockdep_chains | grep -C 5 sb_internal

After applying the patch, the sb_internal -> fs_reclaim lock dependency
chain can no longer be found, so the locking dependency warning is no
longer shown.
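
The grep pipeline above can also be expressed as a small checker. The
sketch below assumes that /proc/lockdep_chains prints one chain per
blank-line-separated block, with one lock class per line in acquisition
order; that layout is an assumption here and may differ between kernel
versions. The model_* helpers are hypothetical names, and the function
works on text already read into memory rather than opening the proc file
itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the first `len` bytes of `line` contain `needle`. */
static bool model_line_has(const char *line, size_t len, const char *needle)
{
	size_t n = strlen(needle);

	for (size_t i = 0; n > 0 && i + n <= len; i++)
		if (memcmp(line + i, needle, n) == 0)
			return true;
	return false;
}

/*
 * True if any blank-line-delimited block in `text` has a line containing
 * `first` followed, later in the same block, by a line containing `second`.
 */
static bool model_chain_present(const char *text, const char *first,
				const char *second)
{
	bool seen_first = false;
	const char *line = text;

	while (*line) {
		const char *eol = strchr(line, '\n');
		size_t len = eol ? (size_t)(eol - line) : strlen(line);

		if (len == 0) {
			/* Blank line: a new chain starts, forget the old one. */
			seen_first = false;
		} else if (seen_first && model_line_has(line, len, second)) {
			return true;
		} else if (model_line_has(line, len, first)) {
			seen_first = true;
		}
		line += len + (eol ? 1 : 0);
	}
	return false;
}

/* Tiny self-check on made-up input in the assumed format. */
static void model_chain_selfcheck(void)
{
	assert(model_chain_present("[x] sb_internal\n[y] fs_reclaim\n",
				   "sb_internal", "fs_reclaim"));
	assert(!model_chain_present("[x] sb_internal\n\n[y] fs_reclaim\n",
				    "sb_internal", "fs_reclaim"));
}
```

Unlike the context-based grep, this only reports a match when both lock
classes appear in the same chain and in that order, which is the
dependency lockdep complains about.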

Suggested-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 fs/xfs/xfs_super.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 379cbff438bc..0797a96b83d6 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -913,11 +913,21 @@ xfs_fs_freeze(
 	struct super_block	*sb)
 {
 	struct xfs_mount	*mp = XFS_M(sb);
+	unsigned int		flags;
+	int			ret;
 
+	/*
+	 * The filesystem is now frozen far enough that memory reclaim
+	 * cannot safely operate on the filesystem. Hence we need to
+	 * set a GFP_NOFS context here to avoid recursion deadlocks.
+	 */
+	flags = memalloc_nofs_save();
 	xfs_stop_block_reaping(mp);
 	xfs_save_resvblks(mp);
 	xfs_quiesce_attr(mp);
-	return xfs_sync_sb(mp, true);
+	ret = xfs_sync_sb(mp, true);
+	memalloc_nofs_restore(flags);
+	return ret;
 }
 
 STATIC int
-- 
2.18.1



* Re: [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim
  2020-07-07 19:16 [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim Waiman Long
@ 2020-07-09 13:39 ` Christoph Hellwig
  2020-07-09 22:38 ` Dave Chinner
  2020-07-13 16:41 ` Darrick J. Wong
  2 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2020-07-09 13:39 UTC
  To: Waiman Long
  Cc: Darrick J. Wong, linux-xfs, linux-kernel, Dave Chinner, Qian Cai,
	Eric Sandeen

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim
  2020-07-07 19:16 [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim Waiman Long
  2020-07-09 13:39 ` Christoph Hellwig
@ 2020-07-09 22:38 ` Dave Chinner
  2020-07-13 16:41 ` Darrick J. Wong
  2 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2020-07-09 22:38 UTC
  To: Waiman Long
  Cc: Darrick J. Wong, linux-xfs, linux-kernel, Qian Cai, Eric Sandeen

On Tue, Jul 07, 2020 at 03:16:29PM -0400, Waiman Long wrote:
> One way to avoid this splat is to make the affected allocations GFP_NOFS
> by wrapping them in a memalloc_nofs_save()/memalloc_nofs_restore() pair.
> This shouldn't matter unless the system is really running out of memory.
> In that case, a filesystem freeze operation that succeeded previously
> may now fail.
> 
> Without this patch, the command sequence below will show that the lock
> dependency chain sb_internal -> fs_reclaim exists.
> 
>  # fsfreeze -f /home
>  # fsfreeze --unfreeze /home
>  # grep -i fs_reclaim -C 3 /proc/lockdep_chains | grep -C 5 sb_internal
> 
> After applying the patch, the sb_internal -> fs_reclaim lock dependency
> chain can no longer be found, so the locking dependency warning is no
> longer shown.
> 
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Waiman Long <longman@redhat.com>

Looks good. Thanks for working through this, Waiman.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim
  2020-07-07 19:16 [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim Waiman Long
  2020-07-09 13:39 ` Christoph Hellwig
  2020-07-09 22:38 ` Dave Chinner
@ 2020-07-13 16:41 ` Darrick J. Wong
  2020-07-20 15:32   ` Waiman Long
  2 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2020-07-13 16:41 UTC
  To: Waiman Long; +Cc: linux-xfs, linux-kernel, Dave Chinner, Qian Cai, Eric Sandeen

On Tue, Jul 07, 2020 at 03:16:29PM -0400, Waiman Long wrote:
> [...]
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Waiman Long <longman@redhat.com>

Looks good to me,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

* Re: [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim
  2020-07-13 16:41 ` Darrick J. Wong
@ 2020-07-20 15:32   ` Waiman Long
  2020-07-20 15:40     ` Darrick J. Wong
  0 siblings, 1 reply; 7+ messages in thread
From: Waiman Long @ 2020-07-20 15:32 UTC
  To: Darrick J. Wong
  Cc: linux-xfs, linux-kernel, Dave Chinner, Qian Cai, Eric Sandeen



----- Original Message -----
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: "Waiman Long" <longman@redhat.com>
Cc: linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, "Dave Chinner" <david@fromorbit.com>, "Qian Cai" <cai@lca.pw>, "Eric Sandeen" <sandeen@redhat.com>
Sent: Monday, July 13, 2020 12:41:12 PM
Subject: Re: [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim

On Tue, Jul 07, 2020 at 03:16:29PM -0400, Waiman Long wrote:
> [...]
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Signed-off-by: Waiman Long <longman@redhat.com>

Looks good to me,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

Will this patch be merged into the xfs tree soon?

Thanks,
Longman



* Re: [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim
  2020-07-20 15:32   ` Waiman Long
@ 2020-07-20 15:40     ` Darrick J. Wong
  2020-07-20 15:46       ` Waiman Long
  0 siblings, 1 reply; 7+ messages in thread
From: Darrick J. Wong @ 2020-07-20 15:40 UTC
  To: Waiman Long; +Cc: linux-xfs, linux-kernel, Dave Chinner, Qian Cai, Eric Sandeen

On Mon, Jul 20, 2020 at 11:32:03AM -0400, Waiman Long wrote:
> [...]
> Will this patch be merged into the xfs tree soon?

It should appear in for-next in the next day or so.  I am trying to push
there only every other couple of weeks to reduce the amount of developer
tree rebasing that has to go on when people are trying to land a complex
series.

--D


* Re: [PATCH v6] xfs: Fix false positive lockdep warning with sb_internal & fs_reclaim
  2020-07-20 15:40     ` Darrick J. Wong
@ 2020-07-20 15:46       ` Waiman Long
  0 siblings, 0 replies; 7+ messages in thread
From: Waiman Long @ 2020-07-20 15:46 UTC
  To: Darrick J. Wong
  Cc: linux-xfs, linux-kernel, Dave Chinner, Qian Cai, Eric Sandeen

On 7/20/20 11:40 AM, Darrick J. Wong wrote:
> On Mon, Jul 20, 2020 at 11:32:03AM -0400, Waiman Long wrote:
>> [...]
>> Will this patch be merged into the xfs tree soon?
> It should appear in for-next in the next day or so.  I am trying to push
> there only every other couple of weeks to reduce the amount of developer
> tree rebasing that has to go on when people are trying to land a complex
> series.
>
> --D

Thanks for the clarification.

Cheers,
Longman

