linux-xfs.vger.kernel.org archive mirror
* [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
@ 2020-04-01 19:02 bugzilla-daemon
  2020-04-02  0:15 ` Dave Chinner
                   ` (12 more replies)
  0 siblings, 13 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-01 19:02 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

            Bug ID: 207053
           Summary: fsfreeze deadlock on XFS (the FIFREEZE ioctl and
                    subsequent FITHAW hang indefinitely)
           Product: File System
           Version: 2.5
    Kernel Version: 4.19.75, 5.4.20
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@kernel-bugs.kernel.org
          Reporter: paulfurtado91@gmail.com
        Regression: No

When we upgraded from kernel 4.14.146 to kernel 4.19.75, we began to experience
frequent deadlocks from our cronjobs that freeze the filesystem for
snapshotting.

The fsfreeze stack shows:
# cat /proc/33256/stack 
[<0>] __flush_work+0x177/0x1b0
[<0>] __cancel_work_timer+0x12b/0x1b0
[<0>] xfs_stop_block_reaping+0x15/0x30 [xfs]
[<0>] xfs_fs_freeze+0x15/0x40 [xfs]
[<0>] freeze_super+0xc8/0x190
[<0>] do_vfs_ioctl+0x510/0x630
[<0>] ksys_ioctl+0x70/0x80
[<0>] __x64_sys_ioctl+0x16/0x20
[<0>] do_syscall_64+0x4e/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9


The fsfreeze -u stack shows:
# cat /proc/37753/stack 
[<0>] rwsem_down_write_slowpath+0x257/0x510
[<0>] thaw_super+0x12/0x20
[<0>] do_vfs_ioctl+0x609/0x630
[<0>] ksys_ioctl+0x70/0x80
[<0>] __x64_sys_ioctl+0x16/0x20
[<0>] do_syscall_64+0x4e/0x100
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9


Echoing "j" into /proc/sysrq-trigger to emergency thaw all filesystems doesn't
solve this either. We're hitting this bug many times per week, so if there's
any more debug information you need that we could turn on, let us know. Thanks!

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
@ 2020-04-02  0:15 ` Dave Chinner
  2020-04-02  0:15 ` [Bug 207053] " bugzilla-daemon
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Dave Chinner @ 2020-04-02  0:15 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-xfs

On Wed, Apr 01, 2020 at 07:02:57PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> When we upgraded from kernel 4.14.146 to kernel 4.19.75, we began to experience
> frequent deadlocks from our cronjobs that freeze the filesystem for
> snapshotting.

Probably commit d6b636ebb1c9 ("xfs: halt auto-reclamation activities
while rebuilding rmap") in 4.18, but that fixes a bug that allowed
reaping functions to attempt to modify the filesystem while it was
frozen...

> The fsfreeze stack shows:
> # cat /proc/33256/stack 
> [<0>] __flush_work+0x177/0x1b0
> [<0>] __cancel_work_timer+0x12b/0x1b0
> [<0>] xfs_stop_block_reaping+0x15/0x30 [xfs]
> [<0>] xfs_fs_freeze+0x15/0x40 [xfs]
> [<0>] freeze_super+0xc8/0x190
> [<0>] do_vfs_ioctl+0x510/0x630
> [<0>] ksys_ioctl+0x70/0x80
> [<0>] __x64_sys_ioctl+0x16/0x20
> [<0>] do_syscall_64+0x4e/0x100
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

This indicates that the reaping worker is still busy doing work. It
needs to finish before the freeze will continue to make progress.
So either the system is still doing work, or the kworker has blocked
somewhere.

What is the dmesg output of 'echo w > /proc/sysrq-trigger'?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
  2020-04-02  0:15 ` Dave Chinner
@ 2020-04-02  0:15 ` bugzilla-daemon
  2020-04-07  6:41 ` bugzilla-daemon
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-02  0:15 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #1 from Dave Chinner (david@fromorbit.com) ---
On Wed, Apr 01, 2020 at 07:02:57PM +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> When we upgraded from kernel 4.14.146 to kernel 4.19.75, we began to
> experience
> frequent deadlocks from our cronjobs that freeze the filesystem for
> snapshotting.

Probably commit d6b636ebb1c9 ("xfs: halt auto-reclamation activities
while rebuilding rmap") in 4.18, but that fixes a bug that allowed
reaping functions to attempt to modify the filesystem while it was
frozen...

> The fsfreeze stack shows:
> # cat /proc/33256/stack 
> [<0>] __flush_work+0x177/0x1b0
> [<0>] __cancel_work_timer+0x12b/0x1b0
> [<0>] xfs_stop_block_reaping+0x15/0x30 [xfs]
> [<0>] xfs_fs_freeze+0x15/0x40 [xfs]
> [<0>] freeze_super+0xc8/0x190
> [<0>] do_vfs_ioctl+0x510/0x630
> [<0>] ksys_ioctl+0x70/0x80
> [<0>] __x64_sys_ioctl+0x16/0x20
> [<0>] do_syscall_64+0x4e/0x100
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

This indicates that the reaping worker is still busy doing work. It
needs to finish before the freeze will continue to make progress.
So either the system is still doing work, or the kworker has blocked
somewhere.

What is the dmesg output of 'echo w > /proc/sysrq-trigger'?

Cheers,

Dave.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
  2020-04-02  0:15 ` Dave Chinner
  2020-04-02  0:15 ` [Bug 207053] " bugzilla-daemon
@ 2020-04-07  6:41 ` bugzilla-daemon
  2020-04-07 13:18   ` Brian Foster
  2020-04-07 13:18 ` bugzilla-daemon
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-07  6:41 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
Hi Dave,

Just had another case of this crop up and I was able to get the blocked tasks
output before automation killed the server. Because the log was too large to
attach, I've pasted the output into a github gist here:
https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log


Thanks,
Paul

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-07  6:41 ` bugzilla-daemon
@ 2020-04-07 13:18   ` Brian Foster
  2020-04-07 15:17     ` Darrick J. Wong
  0 siblings, 1 reply; 20+ messages in thread
From: Brian Foster @ 2020-04-07 13:18 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-xfs

On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=207053
> 
> --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> Hi Dave,
> 
> Just had another case of this crop up and I was able to get the blocked tasks
> output before automation killed the server. Because the log was too large to
> attach, I've pasted the output into a github gist here:
> https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> 

Hm, so it looks like this is stuck between freeze:

[377279.630957] fsfreeze        D    0 46819  46337 0x00004084
[377279.634910] Call Trace:
[377279.637594]  ? __schedule+0x292/0x6f0
[377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
[377279.644287]  schedule+0x2f/0xa0
[377279.647286]  schedule_timeout+0x1dd/0x300
[377279.650661]  wait_for_completion+0x126/0x190
[377279.654154]  ? wake_up_q+0x80/0x80
[377279.657277]  ? work_busy+0x80/0x80
[377279.660375]  __flush_work+0x177/0x1b0
[377279.663604]  ? worker_attach_to_pool+0x90/0x90
[377279.667121]  __cancel_work_timer+0x12b/0x1b0
[377279.670571]  ? rcu_sync_enter+0x8b/0xd0
[377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
[377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
[377279.680950]  freeze_super+0xc8/0x190
[377279.684086]  do_vfs_ioctl+0x510/0x630
...

... and the eofblocks scanner:

[377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
[377279.426971] Call Trace:
[377279.429662]  ? __schedule+0x292/0x6f0
[377279.432839]  schedule+0x2f/0xa0
[377279.435794]  rwsem_down_read_slowpath+0x196/0x530
[377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
[377279.442834]  ? __percpu_down_read+0x49/0x60
[377279.446242]  __percpu_down_read+0x49/0x60
[377279.449586]  __sb_start_write+0x5b/0x60
[377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
[377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
[377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
[377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
[377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
[377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[377279.475620]  ? kvm_sched_clock_read+0xd/0x20
[377279.479059]  ? sched_clock+0x5/0x10
[377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
[377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
[377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
[377279.501484]  process_one_work+0x195/0x380
...

The immediate issue is likely that the eofblocks transaction is not
NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
doing that is these helpers are called from other contexts outside of
the background scanners.
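
For illustration (untested, reservation name from memory), the NOWRITECOUNT
variant would be a one-liner in xfs_free_eofblocks():

	/*
	 * Hypothetical: XFS_TRANS_NO_WRITECOUNT skips sb_start_write(), so
	 * the background scan can no longer block behind a freeze. But the
	 * eofblocks ioctl and buffered write ENOSPC retry callers of
	 * xfs_free_eofblocks() would then be free to dirty a frozen fs.
	 */
	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0,
				XFS_TRANS_NO_WRITECOUNT, &tp);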

Perhaps what we need to do here is let these background scanners acquire
a superblock write reference, similar to what Darrick recently added to
scrub..? We'd have to do that from the scanner workqueue task, so it
would probably need to be a trylock so we don't end up in a similar
situation as above. I.e., we'd either get the reference and cause freeze
to wait until it's dropped or bail out if freeze has already stopped the
transaction subsystem. Thoughts?

Brian

> 
> Thanks,
> Paul
> 
> -- 
> You are receiving this mail because:
> You are watching the assignee of the bug.
> 



* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (2 preceding siblings ...)
  2020-04-07  6:41 ` bugzilla-daemon
@ 2020-04-07 13:18 ` bugzilla-daemon
  2020-04-07 15:17 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-07 13:18 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #3 from bfoster@redhat.com ---
On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=207053
> 
> --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> Hi Dave,
> 
> Just had another case of this crop up and I was able to get the blocked tasks
> output before automation killed the server. Because the log was too large to
> attach, I've pasted the output into a github gist here:
>
> https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> 

Hm, so it looks like this is stuck between freeze:

[377279.630957] fsfreeze        D    0 46819  46337 0x00004084
[377279.634910] Call Trace:
[377279.637594]  ? __schedule+0x292/0x6f0
[377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
[377279.644287]  schedule+0x2f/0xa0
[377279.647286]  schedule_timeout+0x1dd/0x300
[377279.650661]  wait_for_completion+0x126/0x190
[377279.654154]  ? wake_up_q+0x80/0x80
[377279.657277]  ? work_busy+0x80/0x80
[377279.660375]  __flush_work+0x177/0x1b0
[377279.663604]  ? worker_attach_to_pool+0x90/0x90
[377279.667121]  __cancel_work_timer+0x12b/0x1b0
[377279.670571]  ? rcu_sync_enter+0x8b/0xd0
[377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
[377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
[377279.680950]  freeze_super+0xc8/0x190
[377279.684086]  do_vfs_ioctl+0x510/0x630
...

... and the eofblocks scanner:

[377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
[377279.426971] Call Trace:
[377279.429662]  ? __schedule+0x292/0x6f0
[377279.432839]  schedule+0x2f/0xa0
[377279.435794]  rwsem_down_read_slowpath+0x196/0x530
[377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
[377279.442834]  ? __percpu_down_read+0x49/0x60
[377279.446242]  __percpu_down_read+0x49/0x60
[377279.449586]  __sb_start_write+0x5b/0x60
[377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
[377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
[377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
[377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
[377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
[377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[377279.475620]  ? kvm_sched_clock_read+0xd/0x20
[377279.479059]  ? sched_clock+0x5/0x10
[377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
[377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
[377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
[377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
[377279.501484]  process_one_work+0x195/0x380
...

The immediate issue is likely that the eofblocks transaction is not
NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
doing that is these helpers are called from other contexts outside of
the background scanners.

Perhaps what we need to do here is let these background scanners acquire
a superblock write reference, similar to what Darrick recently added to
scrub..? We'd have to do that from the scanner workqueue task, so it
would probably need to be a trylock so we don't end up in a similar
situation as above. I.e., we'd either get the reference and cause freeze
to wait until it's dropped or bail out if freeze has already stopped the
transaction subsystem. Thoughts?

Brian

> 
> Thanks,
> Paul
> 
> -- 
> You are receiving this mail because:
> You are watching the assignee of the bug.
>

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-07 13:18   ` Brian Foster
@ 2020-04-07 15:17     ` Darrick J. Wong
  2020-04-07 16:37       ` Brian Foster
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2020-04-07 15:17 UTC (permalink / raw)
  To: Brian Foster; +Cc: bugzilla-daemon, linux-xfs

On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > 
> > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > Hi Dave,
> > 
> > Just had another case of this crop up and I was able to get the blocked tasks
> > output before automation killed the server. Because the log was too large to
> > attach, I've pasted the output into a github gist here:
> > https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > 
> 
> Hm, so it looks like this is stuck between freeze:
> 
> [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> [377279.634910] Call Trace:
> [377279.637594]  ? __schedule+0x292/0x6f0
> [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> [377279.644287]  schedule+0x2f/0xa0
> [377279.647286]  schedule_timeout+0x1dd/0x300
> [377279.650661]  wait_for_completion+0x126/0x190
> [377279.654154]  ? wake_up_q+0x80/0x80
> [377279.657277]  ? work_busy+0x80/0x80
> [377279.660375]  __flush_work+0x177/0x1b0
> [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> [377279.680950]  freeze_super+0xc8/0x190
> [377279.684086]  do_vfs_ioctl+0x510/0x630
> ...
> 
> ... and the eofblocks scanner:
> 
> [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
> [377279.426971] Call Trace:
> [377279.429662]  ? __schedule+0x292/0x6f0
> [377279.432839]  schedule+0x2f/0xa0
> [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> [377279.442834]  ? __percpu_down_read+0x49/0x60
> [377279.446242]  __percpu_down_read+0x49/0x60
> [377279.449586]  __sb_start_write+0x5b/0x60
> [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> [377279.479059]  ? sched_clock+0x5/0x10
> [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> [377279.501484]  process_one_work+0x195/0x380
> ...
> 
> The immediate issue is likely that the eofblocks transaction is not
> NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> doing that is these helpers are called from other contexts outside of
> the background scanners.
> 
> Perhaps what we need to do here is let these background scanners acquire
> a superblock write reference, similar to what Darrick recently added to
> scrub..? We'd have to do that from the scanner workqueue task, so it
> would probably need to be a trylock so we don't end up in a similar
> situation as above. I.e., we'd either get the reference and cause freeze
> to wait until it's dropped or bail out if freeze has already stopped the
> transaction subsystem. Thoughts?

Hmm, I had a whole gigantic series to refactor all the speculative
preallocation gc work into a single thread + radix tree tag; I'll see if
that series actually fixed this problem too.

But yes, all background threads that run transactions need to have
freezer protection.

--D

> Brian
> 
> > 
> > Thanks,
> > Paul
> > 
> > -- 
> > You are receiving this mail because:
> > You are watching the assignee of the bug.
> > 
> 


* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (3 preceding siblings ...)
  2020-04-07 13:18 ` bugzilla-daemon
@ 2020-04-07 15:17 ` bugzilla-daemon
  2020-04-07 16:37 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-07 15:17 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #4 from darrick.wong@oracle.com ---
On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org
> wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > 
> > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > Hi Dave,
> > 
> > Just had another case of this crop up and I was able to get the blocked
> tasks
> > output before automation killed the server. Because the log was too large
> to
> > attach, I've pasted the output into a github gist here:
> >
> https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > 
> 
> Hm, so it looks like this is stuck between freeze:
> 
> [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> [377279.634910] Call Trace:
> [377279.637594]  ? __schedule+0x292/0x6f0
> [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> [377279.644287]  schedule+0x2f/0xa0
> [377279.647286]  schedule_timeout+0x1dd/0x300
> [377279.650661]  wait_for_completion+0x126/0x190
> [377279.654154]  ? wake_up_q+0x80/0x80
> [377279.657277]  ? work_busy+0x80/0x80
> [377279.660375]  __flush_work+0x177/0x1b0
> [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> [377279.680950]  freeze_super+0xc8/0x190
> [377279.684086]  do_vfs_ioctl+0x510/0x630
> ...
> 
> ... and the eofblocks scanner:
> 
> [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
> [377279.426971] Call Trace:
> [377279.429662]  ? __schedule+0x292/0x6f0
> [377279.432839]  schedule+0x2f/0xa0
> [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> [377279.442834]  ? __percpu_down_read+0x49/0x60
> [377279.446242]  __percpu_down_read+0x49/0x60
> [377279.449586]  __sb_start_write+0x5b/0x60
> [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> [377279.479059]  ? sched_clock+0x5/0x10
> [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> [377279.501484]  process_one_work+0x195/0x380
> ...
> 
> The immediate issue is likely that the eofblocks transaction is not
> NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> doing that is these helpers are called from other contexts outside of
> the background scanners.
> 
> Perhaps what we need to do here is let these background scanners acquire
> a superblock write reference, similar to what Darrick recently added to
> scrub..? We'd have to do that from the scanner workqueue task, so it
> would probably need to be a trylock so we don't end up in a similar
> situation as above. I.e., we'd either get the reference and cause freeze
> to wait until it's dropped or bail out if freeze has already stopped the
> transaction subsystem. Thoughts?

Hmm, I had a whole gigantic series to refactor all the speculative
preallocation gc work into a single thread + radix tree tag; I'll see if
that series actually fixed this problem too.

But yes, all background threads that run transactions need to have
freezer protection.

--D

> Brian
> 
> > 
> > Thanks,
> > Paul
> > 
> > -- 
> > You are receiving this mail because:
> > You are watching the assignee of the bug.
> > 
>

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-07 15:17     ` Darrick J. Wong
@ 2020-04-07 16:37       ` Brian Foster
  2020-04-07 16:49         ` Darrick J. Wong
  0 siblings, 1 reply; 20+ messages in thread
From: Brian Foster @ 2020-04-07 16:37 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: bugzilla-daemon, linux-xfs

On Tue, Apr 07, 2020 at 08:17:38AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> > On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > > 
> > > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > > Hi Dave,
> > > 
> > > Just had another case of this crop up and I was able to get the blocked tasks
> > > output before automation killed the server. Because the log was too large to
> > > attach, I've pasted the output into a github gist here:
> > > https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > > 
> > 
> > Hm, so it looks like this is stuck between freeze:
> > 
> > [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> > [377279.634910] Call Trace:
> > [377279.637594]  ? __schedule+0x292/0x6f0
> > [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> > [377279.644287]  schedule+0x2f/0xa0
> > [377279.647286]  schedule_timeout+0x1dd/0x300
> > [377279.650661]  wait_for_completion+0x126/0x190
> > [377279.654154]  ? wake_up_q+0x80/0x80
> > [377279.657277]  ? work_busy+0x80/0x80
> > [377279.660375]  __flush_work+0x177/0x1b0
> > [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> > [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> > [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> > [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> > [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> > [377279.680950]  freeze_super+0xc8/0x190
> > [377279.684086]  do_vfs_ioctl+0x510/0x630
> > ...
> > 
> > ... and the eofblocks scanner:
> > 
> > [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
> > [377279.426971] Call Trace:
> > [377279.429662]  ? __schedule+0x292/0x6f0
> > [377279.432839]  schedule+0x2f/0xa0
> > [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> > [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> > [377279.442834]  ? __percpu_down_read+0x49/0x60
> > [377279.446242]  __percpu_down_read+0x49/0x60
> > [377279.449586]  __sb_start_write+0x5b/0x60
> > [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> > [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> > [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> > [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> > [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> > [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> > [377279.479059]  ? sched_clock+0x5/0x10
> > [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> > [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> > [377279.501484]  process_one_work+0x195/0x380
> > ...
> > 
> > The immediate issue is likely that the eofblocks transaction is not
> > NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> > doing that is these helpers are called from other contexts outside of
> > the background scanners.
> > 
> > Perhaps what we need to do here is let these background scanners acquire
> > a superblock write reference, similar to what Darrick recently added to
> > scrub..? We'd have to do that from the scanner workqueue task, so it
> > would probably need to be a trylock so we don't end up in a similar
> > situation as above. I.e., we'd either get the reference and cause freeze
> > to wait until it's dropped or bail out if freeze has already stopped the
> > transaction subsystem. Thoughts?
> 
> Hmm, I had a whole gigantic series to refactor all the speculative
> preallocation gc work into a single thread + radix tree tag; I'll see if
> that series actually fixed this problem too.
> 
> But yes, all background threads that run transactions need to have
> freezer protection.
> 

So something like the following in the meantime, assuming we want a
backportable fix..? I think this means we could return -EAGAIN from the
eofblocks ioctl, but afaict if something functionally conflicts with an
active scan across freeze then perhaps that's preferred.

Brian

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index a7be7a9e5c1a..0f14d58e5bb0 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1515,13 +1515,24 @@ __xfs_icache_free_eofblocks(
 					   void *args),
 	int			tag)
 {
-	int flags = SYNC_TRYLOCK;
+	int			flags = SYNC_TRYLOCK;
+	int			error;
 
 	if (eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC))
 		flags = SYNC_WAIT;
 
-	return xfs_inode_ag_iterator_tag(mp, execute, flags,
-					 eofb, tag);
+	/*
+	 * freeze waits on background scanner jobs to complete so we cannot
+	 * block on write protection here. Bail if the transaction subsystem is
+	 * already freezing, returning -EAGAIN to notify other callers.
+	 */
+	if (!sb_start_write_trylock(mp->m_super))
+		return -EAGAIN;
+
+	error = xfs_inode_ag_iterator_tag(mp, execute, flags, eofb, tag);
+	sb_end_write(mp->m_super);
+
+	return error;
 }
 
 int

> --D
> 
> > Brian
> > 
> > > 
> > > Thanks,
> > > Paul
> > > 
> > > -- 
> > > You are receiving this mail because:
> > > You are watching the assignee of the bug.
> > > 
> > 
> 



* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (4 preceding siblings ...)
  2020-04-07 15:17 ` bugzilla-daemon
@ 2020-04-07 16:37 ` bugzilla-daemon
  2020-04-07 16:49 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-07 16:37 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #5 from bfoster@redhat.com ---
On Tue, Apr 07, 2020 at 08:17:38AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> > On Tue, Apr 07, 2020 at 06:41:31AM +0000,
> bugzilla-daemon@bugzilla.kernel.org wrote:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > > 
> > > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > > Hi Dave,
> > > 
> > > Just had another case of this crop up and I was able to get the blocked
> tasks
> > > output before automation killed the server. Because the log was too large
> to
> > > attach, I've pasted the output into a github gist here:
> > >
> https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > > 
> > 
> > Hm, so it looks like this is stuck between freeze:
> > 
> > [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> > [377279.634910] Call Trace:
> > [377279.637594]  ? __schedule+0x292/0x6f0
> > [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> > [377279.644287]  schedule+0x2f/0xa0
> > [377279.647286]  schedule_timeout+0x1dd/0x300
> > [377279.650661]  wait_for_completion+0x126/0x190
> > [377279.654154]  ? wake_up_q+0x80/0x80
> > [377279.657277]  ? work_busy+0x80/0x80
> > [377279.660375]  __flush_work+0x177/0x1b0
> > [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> > [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> > [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> > [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> > [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> > [377279.680950]  freeze_super+0xc8/0x190
> > [377279.684086]  do_vfs_ioctl+0x510/0x630
> > ...
> > 
> > ... and the eofblocks scanner:
> > 
> > [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker
> [xfs]
> > [377279.426971] Call Trace:
> > [377279.429662]  ? __schedule+0x292/0x6f0
> > [377279.432839]  schedule+0x2f/0xa0
> > [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> > [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> > [377279.442834]  ? __percpu_down_read+0x49/0x60
> > [377279.446242]  __percpu_down_read+0x49/0x60
> > [377279.449586]  __sb_start_write+0x5b/0x60
> > [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> > [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> > [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> > [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> > [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> > [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> > [377279.479059]  ? sched_clock+0x5/0x10
> > [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> > [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> > [377279.501484]  process_one_work+0x195/0x380
> > ...
> > 
> > The immediate issue is likely that the eofblocks transaction is not
> > NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> > doing that is these helpers are called from other contexts outside of
> > the background scanners.
> > 
> > Perhaps what we need to do here is let these background scanners acquire
> > a superblock write reference, similar to what Darrick recently added to
> > scrub..? We'd have to do that from the scanner workqueue task, so it
> > would probably need to be a trylock so we don't end up in a similar
> > situation as above. I.e., we'd either get the reference and cause freeze
> > to wait until it's dropped or bail out if freeze has already stopped the
> > transaction subsystem. Thoughts?
> 
> Hmm, I had a whole gigantic series to refactor all the speculative
> preallocation gc work into a single thread + radix tree tag; I'll see if
> that series actually fixed this problem too.
> 
> But yes, all background threads that run transactions need to have
> freezer protection.
> 

So something like the following in the meantime, assuming we want a
backportable fix..? I think this means we could return -EAGAIN from the
eofblocks ioctl, but afaict if something functionally conflicts with an
active scan across freeze then perhaps that's preferred.

Brian

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index a7be7a9e5c1a..0f14d58e5bb0 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1515,13 +1515,24 @@ __xfs_icache_free_eofblocks(
                                           void *args),
        int                     tag)
 {
-       int flags = SYNC_TRYLOCK;
+       int                     flags = SYNC_TRYLOCK;
+       int                     error;

        if (eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC))
                flags = SYNC_WAIT;

-       return xfs_inode_ag_iterator_tag(mp, execute, flags,
-                                        eofb, tag);
+       /*
+        * freeze waits on background scanner jobs to complete so we cannot
+        * block on write protection here. Bail if the transaction subsystem is
+        * already freezing, returning -EAGAIN to notify other callers.
+        */
+       if (!sb_start_write_trylock(mp->m_super))
+               return -EAGAIN;
+
+       error = xfs_inode_ag_iterator_tag(mp, execute, flags, eofb, tag);
+       sb_end_write(mp->m_super);
+
+       return error;
 }

 int

> --D
> 
> > Brian
> > 
> > > 
> > > Thanks,
> > > Paul
> > > 
> > > -- 
> > > You are receiving this mail because:
> > > You are watching the assignee of the bug.
> > > 
> > 
>

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-07 16:37       ` Brian Foster
@ 2020-04-07 16:49         ` Darrick J. Wong
  2020-04-07 17:02           ` Brian Foster
  0 siblings, 1 reply; 20+ messages in thread
From: Darrick J. Wong @ 2020-04-07 16:49 UTC (permalink / raw)
  To: Brian Foster; +Cc: bugzilla-daemon, linux-xfs

On Tue, Apr 07, 2020 at 12:37:39PM -0400, Brian Foster wrote:
> On Tue, Apr 07, 2020 at 08:17:38AM -0700, Darrick J. Wong wrote:
> > On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> > > On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > > > 
> > > > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > > > Hi Dave,
> > > > 
> > > > Just had another case of this crop up and I was able to get the blocked tasks
> > > > output before automation killed the server. Because the log was too large to
> > > > attach, I've pasted the output into a github gist here:
> > > > https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > > > 
> > > 
> > > Hm, so it looks like this is stuck between freeze:
> > > 
> > > [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> > > [377279.634910] Call Trace:
> > > [377279.637594]  ? __schedule+0x292/0x6f0
> > > [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> > > [377279.644287]  schedule+0x2f/0xa0
> > > [377279.647286]  schedule_timeout+0x1dd/0x300
> > > [377279.650661]  wait_for_completion+0x126/0x190
> > > [377279.654154]  ? wake_up_q+0x80/0x80
> > > [377279.657277]  ? work_busy+0x80/0x80
> > > [377279.660375]  __flush_work+0x177/0x1b0
> > > [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> > > [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> > > [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> > > [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> > > [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> > > [377279.680950]  freeze_super+0xc8/0x190
> > > [377279.684086]  do_vfs_ioctl+0x510/0x630
> > > ...
> > > 
> > > ... and the eofblocks scanner:
> > > 
> > > [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
> > > [377279.426971] Call Trace:
> > > [377279.429662]  ? __schedule+0x292/0x6f0
> > > [377279.432839]  schedule+0x2f/0xa0
> > > [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> > > [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> > > [377279.442834]  ? __percpu_down_read+0x49/0x60
> > > [377279.446242]  __percpu_down_read+0x49/0x60
> > > [377279.449586]  __sb_start_write+0x5b/0x60
> > > [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> > > [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> > > [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> > > [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> > > [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> > > [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> > > [377279.479059]  ? sched_clock+0x5/0x10
> > > [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> > > [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > > [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> > > [377279.501484]  process_one_work+0x195/0x380
> > > ...
> > > 
> > > The immediate issue is likely that the eofblocks transaction is not
> > > NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> > > doing that is these helpers are called from other contexts outside of
> > > the background scanners.
> > > 
> > > Perhaps what we need to do here is let these background scanners acquire
> > > a superblock write reference, similar to what Darrick recently added to
> > > scrub..? We'd have to do that from the scanner workqueue task, so it
> > > would probably need to be a trylock so we don't end up in a similar
> > > situation as above. I.e., we'd either get the reference and cause freeze
> > > to wait until it's dropped or bail out if freeze has already stopped the
> > > transaction subsystem. Thoughts?
> > 
> > Hmm, I had a whole gigantic series to refactor all the speculative
> > preallocation gc work into a single thread + radix tree tag; I'll see if
> > that series actually fixed this problem too.
> > 
> > But yes, all background threads that run transactions need to have
> > freezer protection.
> > 
> 
> So something like the following in the meantime, assuming we want a
> backportable fix..? I think this means we could return -EAGAIN from the
> eofblocks ioctl, but afaict if something functionally conflicts with an
> active scan across freeze then perhaps that's preferred.

Apparently I don't have a patch that fixes the speculative gc code.  The
deferred inactivation worker does it, so perhaps I got mixed up. :/

I think a better fix would be to annotate xfs_icache_free_eofblocks and
xfs_icache_free_cowblocks to note that the caller must obtain freeze
protection before calling those functions.  Then we can play whackamole
with the existing callers:

1. xfs_eofblocks_worker and xfs_cowblocks_worker can try to
sb_start_write and just go back to sleep if the fs is frozen.  The
flush_workqueue will then cancel the delayed work and the freeze can
proceed.

2. The buffered write ENOSPC scour-and-retry loops already have freeze
protection because they're file writes, so they don't have to change.

3. XFS_IOC_FREE_EOFBLOCKS can sb_start_write, which means that callers
will sleep on the frozen fs.
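
A rough sketch of (1), untested and with names recalled from memory, assuming
the worker keeps its current free-and-requeue shape:

void
xfs_eofblocks_worker(
	struct work_struct	*work)
{
	struct xfs_mount	*mp = container_of(to_delayed_work(work),
				struct xfs_mount, m_eofblocks_work);

	/*
	 * Back off if a freeze is already underway; the
	 * cancel_delayed_work_sync() in xfs_stop_block_reaping() then has
	 * nothing left to wait on and the freeze can proceed.
	 */
	if (!sb_start_write_trylock(mp->m_super))
		goto out_requeue;

	xfs_icache_free_eofblocks(mp, NULL);
	sb_end_write(mp->m_super);

out_requeue:
	xfs_queue_eofblocks(mp);
}

xfs_cowblocks_worker would get the same treatment.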

--D

> Brian
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index a7be7a9e5c1a..0f14d58e5bb0 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -1515,13 +1515,24 @@ __xfs_icache_free_eofblocks(
>  					   void *args),
>  	int			tag)
>  {
> -	int flags = SYNC_TRYLOCK;
> +	int			flags = SYNC_TRYLOCK;
> +	int			error;
>  
>  	if (eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC))
>  		flags = SYNC_WAIT;
>  
> -	return xfs_inode_ag_iterator_tag(mp, execute, flags,
> -					 eofb, tag);
> +	/*
> +	 * freeze waits on background scanner jobs to complete so we cannot
> +	 * block on write protection here. Bail if the transaction subsystem is
> +	 * already freezing, returning -EAGAIN to notify other callers.
> +	 */
> +	if (!sb_start_write_trylock(mp->m_super))
> +		return -EAGAIN;
> +
> +	error = xfs_inode_ag_iterator_tag(mp, execute, flags, eofb, tag);
> +	sb_end_write(mp->m_super);
> +
> +	return error;
>  }
>  
>  int
> 
> > --D
> > 
> > > Brian
> > > 
> > > > 
> > > > Thanks,
> > > > Paul
> > > > 
> > > > -- 
> > > > You are receiving this mail because:
> > > > You are watching the assignee of the bug.
> > > > 
> > > 
> > 
> 


* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (5 preceding siblings ...)
  2020-04-07 16:37 ` bugzilla-daemon
@ 2020-04-07 16:49 ` bugzilla-daemon
  2020-04-07 17:02 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-07 16:49 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #6 from darrick.wong@oracle.com ---
On Tue, Apr 07, 2020 at 12:37:39PM -0400, Brian Foster wrote:
> On Tue, Apr 07, 2020 at 08:17:38AM -0700, Darrick J. Wong wrote:
> > On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> > > On Tue, Apr 07, 2020 at 06:41:31AM +0000,
> bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > > > 
> > > > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > > > Hi Dave,
> > > > 
> > > > Just had another case of this crop up and I was able to get the blocked
> tasks
> > > > output before automation killed the server. Because the log was too
> large to
> > > > attach, I've pasted the output into a github gist here:
> > > >
> https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > > > 
> > > 
> > > Hm, so it looks like this is stuck between freeze:
> > > 
> > > [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> > > [377279.634910] Call Trace:
> > > [377279.637594]  ? __schedule+0x292/0x6f0
> > > [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> > > [377279.644287]  schedule+0x2f/0xa0
> > > [377279.647286]  schedule_timeout+0x1dd/0x300
> > > [377279.650661]  wait_for_completion+0x126/0x190
> > > [377279.654154]  ? wake_up_q+0x80/0x80
> > > [377279.657277]  ? work_busy+0x80/0x80
> > > [377279.660375]  __flush_work+0x177/0x1b0
> > > [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> > > [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> > > [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> > > [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> > > [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> > > [377279.680950]  freeze_super+0xc8/0x190
> > > [377279.684086]  do_vfs_ioctl+0x510/0x630
> > > ...
> > > 
> > > ... and the eofblocks scanner:
> > > 
> > > [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker
> [xfs]
> > > [377279.426971] Call Trace:
> > > [377279.429662]  ? __schedule+0x292/0x6f0
> > > [377279.432839]  schedule+0x2f/0xa0
> > > [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> > > [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> > > [377279.442834]  ? __percpu_down_read+0x49/0x60
> > > [377279.446242]  __percpu_down_read+0x49/0x60
> > > [377279.449586]  __sb_start_write+0x5b/0x60
> > > [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> > > [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> > > [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> > > [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> > > [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> > > [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> > > [377279.479059]  ? sched_clock+0x5/0x10
> > > [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> > > [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > > [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> > > [377279.501484]  process_one_work+0x195/0x380
> > > ...
> > > 
> > > The immediate issue is likely that the eofblocks transaction is not
> > > NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> > > doing that is these helpers are called from other contexts outside of
> > > the background scanners.
> > > 
> > > Perhaps what we need to do here is let these background scanners acquire
> > > a superblock write reference, similar to what Darrick recently added to
> > > scrub..? We'd have to do that from the scanner workqueue task, so it
> > > would probably need to be a trylock so we don't end up in a similar
> > > situation as above. I.e., we'd either get the reference and cause freeze
> > > to wait until it's dropped or bail out if freeze has already stopped the
> > > transaction subsystem. Thoughts?
> > 
> > Hmm, I had a whole gigantic series to refactor all the speculative
> > preallocation gc work into a single thread + radix tree tag; I'll see if
> > that series actually fixed this problem too.
> > 
> > But yes, all background threads that run transactions need to have
> > freezer protection.
> > 
> 
> So something like the following in the meantime, assuming we want a
> backportable fix..? I think this means we could return -EAGAIN from the
> eofblocks ioctl, but afaict if something functionally conflicts with an
> active scan across freeze then perhaps that's preferred.

Apparently I don't have a patch that fixes the speculative gc code.  The
deferred inactivation worker does it, so perhaps I got mixed up. :/

I think a better fix would be to annotate xfs_icache_free_eofblocks and
xfs_icache_free_cowblocks to note that the caller must obtain freeze
protection before calling those functions.  Then we can play whackamole
with the existing callers:

1. xfs_eofblocks_worker and xfs_cowblocks_worker can try to
sb_start_write and just go back to sleep if the fs is frozen.  The
flush_workqueue will then cancel the delayed work and the freeze can
proceed.

2. The buffered write ENOSPC scour-and-retry loops already have freeze
protection because they're file writes, so they don't have to change.

3. XFS_IOC_FREE_EOFBLOCKS can sb_start_write, which means that callers
will sleep on the frozen fs.

--D

> Brian
> 
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index a7be7a9e5c1a..0f14d58e5bb0 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -1515,13 +1515,24 @@ __xfs_icache_free_eofblocks(
>                                          void *args),
>       int                     tag)
>  {
> -     int flags = SYNC_TRYLOCK;
> +     int                     flags = SYNC_TRYLOCK;
> +     int                     error;
>  
>       if (eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC))
>               flags = SYNC_WAIT;
>  
> -     return xfs_inode_ag_iterator_tag(mp, execute, flags,
> -                                      eofb, tag);
> +     /*
> +      * freeze waits on background scanner jobs to complete so we cannot
> +      * block on write protection here. Bail if the transaction subsystem is
> +      * already freezing, returning -EAGAIN to notify other callers.
> +      */
> +     if (!sb_start_write_trylock(mp->m_super))
> +             return -EAGAIN;
> +
> +     error = xfs_inode_ag_iterator_tag(mp, execute, flags, eofb, tag);
> +     sb_end_write(mp->m_super);
> +
> +     return error;
>  }
>  
>  int
> 
> > --D
> > 
> > > Brian
> > > 
> > > > 
> > > > Thanks,
> > > > Paul
> > > > 
> > > > -- 
> > > > You are receiving this mail because:
> > > > You are watching the assignee of the bug.
> > > > 
> > > 
> > 
>

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* Re: [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-07 16:49         ` Darrick J. Wong
@ 2020-04-07 17:02           ` Brian Foster
  0 siblings, 0 replies; 20+ messages in thread
From: Brian Foster @ 2020-04-07 17:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: bugzilla-daemon, linux-xfs

On Tue, Apr 07, 2020 at 09:49:36AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 07, 2020 at 12:37:39PM -0400, Brian Foster wrote:
> > On Tue, Apr 07, 2020 at 08:17:38AM -0700, Darrick J. Wong wrote:
> > > On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> > > > On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > > > > 
> > > > > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > > > > Hi Dave,
> > > > > 
> > > > > Just had another case of this crop up and I was able to get the blocked tasks
> > > > > output before automation killed the server. Because the log was too large to
> > > > > attach, I've pasted the output into a github gist here:
> > > > > https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > > > > 
> > > > 
> > > > Hm, so it looks like this is stuck between freeze:
> > > > 
> > > > [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> > > > [377279.634910] Call Trace:
> > > > [377279.637594]  ? __schedule+0x292/0x6f0
> > > > [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> > > > [377279.644287]  schedule+0x2f/0xa0
> > > > [377279.647286]  schedule_timeout+0x1dd/0x300
> > > > [377279.650661]  wait_for_completion+0x126/0x190
> > > > [377279.654154]  ? wake_up_q+0x80/0x80
> > > > [377279.657277]  ? work_busy+0x80/0x80
> > > > [377279.660375]  __flush_work+0x177/0x1b0
> > > > [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> > > > [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> > > > [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> > > > [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> > > > [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> > > > [377279.680950]  freeze_super+0xc8/0x190
> > > > [377279.684086]  do_vfs_ioctl+0x510/0x630
> > > > ...
> > > > 
> > > > ... and the eofblocks scanner:
> > > > 
> > > > [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
> > > > [377279.426971] Call Trace:
> > > > [377279.429662]  ? __schedule+0x292/0x6f0
> > > > [377279.432839]  schedule+0x2f/0xa0
> > > > [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> > > > [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> > > > [377279.442834]  ? __percpu_down_read+0x49/0x60
> > > > [377279.446242]  __percpu_down_read+0x49/0x60
> > > > [377279.449586]  __sb_start_write+0x5b/0x60
> > > > [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> > > > [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> > > > [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> > > > [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> > > > [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> > > > [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > > [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> > > > [377279.479059]  ? sched_clock+0x5/0x10
> > > > [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > > [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> > > > [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > > [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > > > [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> > > > [377279.501484]  process_one_work+0x195/0x380
> > > > ...
> > > > 
> > > > The immediate issue is likely that the eofblocks transaction is not
> > > > NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> > > > doing that is these helpers are called from other contexts outside of
> > > > the background scanners.
> > > > 
> > > > Perhaps what we need to do here is let these background scanners acquire
> > > > a superblock write reference, similar to what Darrick recently added to
> > > > scrub..? We'd have to do that from the scanner workqueue task, so it
> > > > would probably need to be a trylock so we don't end up in a similar
> > > > situation as above. I.e., we'd either get the reference and cause freeze
> > > > to wait until it's dropped or bail out if freeze has already stopped the
> > > > transaction subsystem. Thoughts?
> > > 
> > > Hmm, I had a whole gigantic series to refactor all the speculative
> > > preallocation gc work into a single thread + radix tree tag; I'll see if
> > > that series actually fixed this problem too.
> > > 
> > > But yes, all background threads that run transactions need to have
> > > freezer protection.
> > > 
> > 
> > So something like the following in the meantime, assuming we want a
> > backportable fix..? I think this means we could return -EAGAIN from the
> > eofblocks ioctl, but afaict if something functionally conflicts with an
> > active scan across freeze then perhaps that's preferred.
> 
> Apparently I don't have a patch that fixes the speculative gc code.  The
> deferred inactivation worker does it, so perhaps I got mixed up. :/
> 

Ok.

> I think a better fix would be to annotate xfs_icache_free_eofblocks and
> xfs_icache_free_cowblocks to note that the caller must obtain freeze
> protection before calling those functions.  Then we can play whackamole
> with the existing callers:
> 
> 1. xfs_eofblocks_worker and xfs_cowblocks_worker can try to
> sb_start_write and just go back to sleep if the fs is frozen.  The
> flush_workqueue will then cancel the delayed work and the freeze can
> proceed.
> 
> 2. The buffered write ENOSPC scour-and-retry loops already have freeze
> protection because they're file writes, so they don't have to change.
> 
> 3. XFS_IOC_FREE_EOFBLOCKS can sb_start_write, which means that callers
> will sleep on the frozen fs.
> 

Sure, that works for me. It can be condensed later if it ends up in a
single thread.
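
For concreteness, a minimal sketch of what (1) could look like at the
worker level, assuming the v5.4-era xfs_eofblocks_worker() and that
sb_start_write_trylock() is usable from the workqueue task (untested,
illustration only; the cowblocks worker would get the same treatment, and
the ioctl path in (3) would use a blocking sb_start_write()/sb_end_write()
pair instead):

/*
 * Bail out of the background scan entirely if a freeze is in progress or
 * the fs is already frozen.  The freeze-side cancel/flush of this work
 * item then finds it idle instead of deadlocking on sb_start_write()
 * inside xfs_trans_alloc().
 */
void
xfs_eofblocks_worker(
	struct work_struct	*work)
{
	struct xfs_mount	*mp = container_of(to_delayed_work(work),
				struct xfs_mount, m_eofblocks_work);

	if (!sb_start_write_trylock(mp->m_super))
		return;
	xfs_icache_free_eofblocks(mp, NULL);
	sb_end_write(mp->m_super);

	xfs_queue_eofblocks(mp);
}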

Brian

> --D
> 
> > Brian
> > 
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index a7be7a9e5c1a..0f14d58e5bb0 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -1515,13 +1515,24 @@ __xfs_icache_free_eofblocks(
> >  					   void *args),
> >  	int			tag)
> >  {
> > -	int flags = SYNC_TRYLOCK;
> > +	int			flags = SYNC_TRYLOCK;
> > +	int			error;
> >  
> >  	if (eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC))
> >  		flags = SYNC_WAIT;
> >  
> > -	return xfs_inode_ag_iterator_tag(mp, execute, flags,
> > -					 eofb, tag);
> > +	/*
> > +	 * freeze waits on background scanner jobs to complete so we cannot
> > +	 * block on write protection here. Bail if the transaction subsystem is
> > +	 * already freezing, returning -EAGAIN to notify other callers.
> > +	 */
> > +	if (!sb_start_write_trylock(mp->m_super))
> > +		return -EAGAIN;
> > +
> > +	error = xfs_inode_ag_iterator_tag(mp, execute, flags, eofb, tag);
> > +	sb_end_write(mp->m_super);
> > +
> > +	return error;
> >  }
> >  
> >  int
> > 
> > > --D
> > > 
> > > > Brian
> > > > 
> > > > > 
> > > > > Thanks,
> > > > > Paul
> > > > > 
> > > > > -- 
> > > > > You are receiving this mail because:
> > > > > You are watching the assignee of the bug.
> > > > > 
> > > > 
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (6 preceding siblings ...)
  2020-04-07 16:49 ` bugzilla-daemon
@ 2020-04-07 17:02 ` bugzilla-daemon
  2020-05-28  6:00 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-04-07 17:02 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #7 from bfoster@redhat.com ---
On Tue, Apr 07, 2020 at 09:49:36AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 07, 2020 at 12:37:39PM -0400, Brian Foster wrote:
> > On Tue, Apr 07, 2020 at 08:17:38AM -0700, Darrick J. Wong wrote:
> > > On Tue, Apr 07, 2020 at 09:18:12AM -0400, Brian Foster wrote:
> > > > > On Tue, Apr 07, 2020 at 06:41:31AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=207053
> > > > > 
> > > > > --- Comment #2 from Paul Furtado (paulfurtado91@gmail.com) ---
> > > > > Hi Dave,
> > > > > 
> > > > > Just had another case of this crop up and I was able to get the blocked tasks
> > > > > output before automation killed the server. Because the log was too large to
> > > > > attach, I've pasted the output into a github gist here:
> > > > > 
> > > > > https://gist.githubusercontent.com/PaulFurtado/c9bade038b8a5c7ddb53a6e10def058f/raw/ee43926c96c0d6a9ec81a648754c1af599ef0bdd/sysrq_w.log
> > > > > 
> > > > 
> > > > Hm, so it looks like this is stuck between freeze:
> > > > 
> > > > [377279.630957] fsfreeze        D    0 46819  46337 0x00004084
> > > > [377279.634910] Call Trace:
> > > > [377279.637594]  ? __schedule+0x292/0x6f0
> > > > [377279.640833]  ? xfs_xattr_get+0x51/0x80 [xfs]
> > > > [377279.644287]  schedule+0x2f/0xa0
> > > > [377279.647286]  schedule_timeout+0x1dd/0x300
> > > > [377279.650661]  wait_for_completion+0x126/0x190
> > > > [377279.654154]  ? wake_up_q+0x80/0x80
> > > > [377279.657277]  ? work_busy+0x80/0x80
> > > > [377279.660375]  __flush_work+0x177/0x1b0
> > > > [377279.663604]  ? worker_attach_to_pool+0x90/0x90
> > > > [377279.667121]  __cancel_work_timer+0x12b/0x1b0
> > > > [377279.670571]  ? rcu_sync_enter+0x8b/0xd0
> > > > [377279.673864]  xfs_stop_block_reaping+0x15/0x30 [xfs]
> > > > [377279.677585]  xfs_fs_freeze+0x15/0x40 [xfs]
> > > > [377279.680950]  freeze_super+0xc8/0x190
> > > > [377279.684086]  do_vfs_ioctl+0x510/0x630
> > > > ...
> > > > 
> > > > ... and the eofblocks scanner:
> > > > 
> > > > [377279.422496] Workqueue: xfs-eofblocks/nvme13n1 xfs_eofblocks_worker [xfs]
> > > > [377279.426971] Call Trace:
> > > > [377279.429662]  ? __schedule+0x292/0x6f0
> > > > [377279.432839]  schedule+0x2f/0xa0
> > > > [377279.435794]  rwsem_down_read_slowpath+0x196/0x530
> > > > [377279.439435]  ? kmem_cache_alloc+0x152/0x1f0
> > > > [377279.442834]  ? __percpu_down_read+0x49/0x60
> > > > [377279.446242]  __percpu_down_read+0x49/0x60
> > > > [377279.449586]  __sb_start_write+0x5b/0x60
> > > > [377279.452869]  xfs_trans_alloc+0x152/0x160 [xfs]
> > > > [377279.456372]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
> > > > [377279.460014]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
> > > > [377279.463903]  ? xfs_inode_ag_walk_grab+0x5f/0x90 [xfs]
> > > > [377279.467680]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
> > > > [377279.471567]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > > [377279.475620]  ? kvm_sched_clock_read+0xd/0x20
> > > > [377279.479059]  ? sched_clock+0x5/0x10
> > > > [377279.482184]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > > [377279.486234]  ? radix_tree_gang_lookup_tag+0xa8/0x100
> > > > [377279.489974]  ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> > > > [377279.494041]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> > > > [377279.497859]  xfs_eofblocks_worker+0x29/0x40 [xfs]
> > > > [377279.501484]  process_one_work+0x195/0x380
> > > > ...
> > > > 
> > > > The immediate issue is likely that the eofblocks transaction is not
> > > > NOWRITECOUNT (same for the cowblocks scanner, btw), but the problem with
> > > > doing that is these helpers are called from other contexts outside of
> > > > the background scanners.
> > > > 
> > > > Perhaps what we need to do here is let these background scanners acquire
> > > > a superblock write reference, similar to what Darrick recently added to
> > > > scrub..? We'd have to do that from the scanner workqueue task, so it
> > > > would probably need to be a trylock so we don't end up in a similar
> > > > situation as above. I.e., we'd either get the reference and cause freeze
> > > > to wait until it's dropped or bail out if freeze has already stopped the
> > > > transaction subsystem. Thoughts?
> > > 
> > > Hmm, I had a whole gigantic series to refactor all the speculative
> > > preallocation gc work into a single thread + radix tree tag; I'll see if
> > > that series actually fixed this problem too.
> > > 
> > > But yes, all background threads that run transactions need to have
> > > freezer protection.
> > > 
> > 
> > So something like the following in the meantime, assuming we want a
> > backportable fix..? I think this means we could return -EAGAIN from the
> > eofblocks ioctl, but afaict if something functionally conflicts with an
> > active scan across freeze then perhaps that's preferred.
> 
> Apparently I don't have a patch that fixes the speculative gc code.  The
> deferred inactivation worker does it, so perhaps I got mixed up. :/
> 

Ok.

> I think a better fix would be to annotate xfs_icache_free_eofblocks and
> xfs_icache_free_cowblocks to note that the caller must obtain freeze
> protection before calling those functions.  Then we can play whackamole
> with the existing callers:
> 
> 1. xfs_eofblocks_worker and xfs_cowblocks_worker can try to
> sb_start_write and just go back to sleep if the fs is frozen.  The
> flush_workqueue will then cancel the delayed work and the freeze can
> proceed.
> 
> 2. The buffered write ENOSPC scour-and-retry loops already have freeze
> protection because they're file writes, so they don't have to change.
> 
> 3. XFS_IOC_FREE_EOFBLOCKS can sb_start_write, which means that callers
> will sleep on the frozen fs.
> 

Sure, that works for me. It can be condensed later if it ends up in a
single thread.

Brian

> --D
> 
> > Brian
> > 
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index a7be7a9e5c1a..0f14d58e5bb0 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -1515,13 +1515,24 @@ __xfs_icache_free_eofblocks(
> >                                        void *args),
> >     int                     tag)
> >  {
> > -   int flags = SYNC_TRYLOCK;
> > +   int                     flags = SYNC_TRYLOCK;
> > +   int                     error;
> >  
> >     if (eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC))
> >             flags = SYNC_WAIT;
> >  
> > -   return xfs_inode_ag_iterator_tag(mp, execute, flags,
> > -                                    eofb, tag);
> > +   /*
> > +    * freeze waits on background scanner jobs to complete so we cannot
> > +    * block on write protection here. Bail if the transaction subsystem is
> > +    * already freezing, returning -EAGAIN to notify other callers.
> > +    */
> > +   if (!sb_start_write_trylock(mp->m_super))
> > +           return -EAGAIN;
> > +
> > +   error = xfs_inode_ag_iterator_tag(mp, execute, flags, eofb, tag);
> > +   sb_end_write(mp->m_super);
> > +
> > +   return error;
> >  }
> >  
> >  int
> > 
> > > --D
> > > 
> > > > Brian
> > > > 
> > > > > 
> > > > > Thanks,
> > > > > Paul
> > > > > 
> > > > > -- 
> > > > > You are receiving this mail because:
> > > > > You are watching the assignee of the bug.
> > > > > 
> > > > 
> > > 
> > 
>

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (7 preceding siblings ...)
  2020-04-07 17:02 ` bugzilla-daemon
@ 2020-05-28  6:00 ` bugzilla-daemon
  2020-05-28 10:47   ` Brian Foster
  2020-05-28 10:47 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 20+ messages in thread
From: bugzilla-daemon @ 2020-05-28  6:00 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #8 from Paul Furtado (paulfurtado91@gmail.com) ---
The patches that came from this issue have given us many weeks of stability now,
and we were ready to declare this totally fixed. However, we hit another
instance of this issue this week, which I'm assuming is on a slightly
different and much rarer code path.

Here's a link to the blocked tasks log (beware that it's 2MB due to endless
processes getting hung inside the container once the filesystem was frozen):
https://gist.githubusercontent.com/PaulFurtado/48253a6978763671f70dc94d933df851/raw/6bad12023ac56e9b6cb3dde771fcb5b15f0bd679/patched_kernel_fsfreeze_sys_w.log

Thanks,
Paul

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-05-28  6:00 ` bugzilla-daemon
@ 2020-05-28 10:47   ` Brian Foster
  0 siblings, 0 replies; 20+ messages in thread
From: Brian Foster @ 2020-05-28 10:47 UTC (permalink / raw)
  To: bugzilla-daemon; +Cc: linux-xfs

On Thu, May 28, 2020 at 06:00:38AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=207053
> 
> --- Comment #8 from Paul Furtado (paulfurtado91@gmail.com) ---
> The patches that came from this issue have given us many weeks of stability now
> and we were ready to declare this as totally fixed, however, we hit another
> instance of this issue this week which I'm assuming is probably on a slightly
> different and much rarer code path.
> 
> Here's a link to the blocked tasks log (beware that it's 2MB due to endless
> processes getting hung inside the container once the filesystem was frozen):
> https://gist.githubusercontent.com/PaulFurtado/48253a6978763671f70dc94d933df851/raw/6bad12023ac56e9b6cb3dde771fcb5b15f0bd679/patched_kernel_fsfreeze_sys_w.log
> 

This shows the eofblocks scan in the following (massaged) trace:

[1259466.349224] Workqueue: xfs-eofblocks/nvme4n1 xfs_eofblocks_worker [xfs]
[1259466.353550] Call Trace:
[1259466.359370]  schedule+0x2f/0xa0
[1259466.362297]  rwsem_down_read_slowpath+0x196/0x530
[1259466.372467]  __percpu_down_read+0x49/0x60
[1259466.375778]  __sb_start_write+0x5b/0x60
[1259466.379139]  xfs_trans_alloc+0x152/0x160 [xfs]
[1259466.382715]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
[1259466.386407]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
[1259466.394058]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
[1259466.536551]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
[1259466.540235]  xfs_eofblocks_worker+0x29/0x40 [xfs]
[1259466.543748]  process_one_work+0x195/0x380
[1259466.546996]  worker_thread+0x30/0x390
[1259466.553449]  kthread+0x113/0x130
[1259466.559579]  ret_from_fork+0x1f/0x40

This should be addressed by upstream commit 4b674b9ac8529 ("xfs: acquire
superblock freeze protection on eofblocks scans"), which causes
xfs_eofblocks_worker() to bail unless it acquires freeze write
protection. What exact kernel is this seen on?

Brian

> Thanks,
> Paul
> 
> -- 
> You are receiving this mail because:
> You are watching the assignee of the bug.
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (8 preceding siblings ...)
  2020-05-28  6:00 ` bugzilla-daemon
@ 2020-05-28 10:47 ` bugzilla-daemon
  2020-05-28 16:16 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-05-28 10:47 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #9 from bfoster@redhat.com ---
On Thu, May 28, 2020 at 06:00:38AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=207053
> 
> --- Comment #8 from Paul Furtado (paulfurtado91@gmail.com) ---
> The patches that came from this issue have given us many weeks of stability now
> and we were ready to declare this as totally fixed, however, we hit another
> instance of this issue this week which I'm assuming is probably on a slightly
> different and much rarer code path.
> 
> Here's a link to the blocked tasks log (beware that it's 2MB due to endless
> processes getting hung inside the container once the filesystem was frozen):
>
> https://gist.githubusercontent.com/PaulFurtado/48253a6978763671f70dc94d933df851/raw/6bad12023ac56e9b6cb3dde771fcb5b15f0bd679/patched_kernel_fsfreeze_sys_w.log
> 

This shows the eofblocks scan in the following (massaged) trace:

[1259466.349224] Workqueue: xfs-eofblocks/nvme4n1 xfs_eofblocks_worker [xfs]
[1259466.353550] Call Trace:
[1259466.359370]  schedule+0x2f/0xa0
[1259466.362297]  rwsem_down_read_slowpath+0x196/0x530
[1259466.372467]  __percpu_down_read+0x49/0x60
[1259466.375778]  __sb_start_write+0x5b/0x60
[1259466.379139]  xfs_trans_alloc+0x152/0x160 [xfs]
[1259466.382715]  xfs_free_eofblocks+0x12d/0x1f0 [xfs]
[1259466.386407]  xfs_inode_free_eofblocks+0x128/0x1a0 [xfs]
[1259466.394058]  xfs_inode_ag_walk.isra.17+0x1a7/0x410 [xfs]
[1259466.536551]  xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
[1259466.540235]  xfs_eofblocks_worker+0x29/0x40 [xfs]
[1259466.543748]  process_one_work+0x195/0x380
[1259466.546996]  worker_thread+0x30/0x390
[1259466.553449]  kthread+0x113/0x130
[1259466.559579]  ret_from_fork+0x1f/0x40

This should be addressed by upstream commit 4b674b9ac8529 ("xfs: acquire
superblock freeze protection on eofblocks scans"), which causes
xfs_eofblocks_worker() to bail unless it acquires freeze write
protection. What exact kernel is this seen on?

Brian

> Thanks,
> Paul
> 
> -- 
> You are receiving this mail because:
> You are watching the assignee of the bug.
>

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (9 preceding siblings ...)
  2020-05-28 10:47 ` bugzilla-daemon
@ 2020-05-28 16:16 ` bugzilla-daemon
  2023-03-16 18:30 ` bugzilla-daemon
  2023-03-16 21:02 ` bugzilla-daemon
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2020-05-28 16:16 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

--- Comment #10 from Paul Furtado (paulfurtado91@gmail.com) ---
This is my mistake, actually: it turns out this host had slipped through the
cracks in our configuration management and never got updated to the kernel
build with that commit; I had thought we'd covered every host.
Looks like we're actually in the clear.

Thanks,
Paul

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (10 preceding siblings ...)
  2020-05-28 16:16 ` bugzilla-daemon
@ 2023-03-16 18:30 ` bugzilla-daemon
  2023-03-16 21:02 ` bugzilla-daemon
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2023-03-16 18:30 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

bobbysmith013@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bobbysmith013@gmail.com

--- Comment #11 from bobbysmith013@gmail.com ---
I'm trying to figure out which kernel version this bug is fixed in. I see
references to a patch above, but I'm having trouble tracking down when it was
merged into the mainline kernel.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug 207053] fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely)
  2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
                   ` (11 preceding siblings ...)
  2023-03-16 18:30 ` bugzilla-daemon
@ 2023-03-16 21:02 ` bugzilla-daemon
  12 siblings, 0 replies; 20+ messages in thread
From: bugzilla-daemon @ 2023-03-16 21:02 UTC (permalink / raw)
  To: linux-xfs

https://bugzilla.kernel.org/show_bug.cgi?id=207053

Eric Sandeen (sandeen@redhat.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandeen@redhat.com

--- Comment #12 from Eric Sandeen (sandeen@redhat.com) ---
commit 4b674b9ac8529 should be in upstream kernel v5.7

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2023-03-16 21:02 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-01 19:02 [Bug 207053] New: fsfreeze deadlock on XFS (the FIFREEZE ioctl and subsequent FITHAW hang indefinitely) bugzilla-daemon
2020-04-02  0:15 ` Dave Chinner
2020-04-02  0:15 ` [Bug 207053] " bugzilla-daemon
2020-04-07  6:41 ` bugzilla-daemon
2020-04-07 13:18   ` Brian Foster
2020-04-07 15:17     ` Darrick J. Wong
2020-04-07 16:37       ` Brian Foster
2020-04-07 16:49         ` Darrick J. Wong
2020-04-07 17:02           ` Brian Foster
2020-04-07 13:18 ` bugzilla-daemon
2020-04-07 15:17 ` bugzilla-daemon
2020-04-07 16:37 ` bugzilla-daemon
2020-04-07 16:49 ` bugzilla-daemon
2020-04-07 17:02 ` bugzilla-daemon
2020-05-28  6:00 ` bugzilla-daemon
2020-05-28 10:47   ` Brian Foster
2020-05-28 10:47 ` bugzilla-daemon
2020-05-28 16:16 ` bugzilla-daemon
2023-03-16 18:30 ` bugzilla-daemon
2023-03-16 21:02 ` bugzilla-daemon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).