* [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount
@ 2020-12-14 10:10 fdmanana
  2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

There are some cases where we can leak a transaction and crash during unmount
after remounting the filesystem in RO mode or mounting it RO. These issues were
being hit by openQA automated tests for openSUSE Tumbleweed
(bugzilla https://bugzilla.suse.com/show_bug.cgi?id=1164503).

Filipe Manana (5):
  btrfs: fix transaction leak and crash after RO remount caused by
    qgroup rescan
  btrfs: fix transaction leak and crash after cleaning up orphans on RO
    mount
  btrfs: fix race between RO remount and the cleaner task
  btrfs: add assertion for empty list of transactions at late stage of
    umount
  btrfs: run delayed iputs when remounting RO to avoid leaking them

 fs/btrfs/ctree.h   | 20 +++++++++++++++++++-
 fs/btrfs/disk-io.c | 13 ++++++++-----
 fs/btrfs/qgroup.c  | 13 ++++++++++---
 fs/btrfs/super.c   | 40 +++++++++++++++++++++++++++++++++++++---
 fs/btrfs/volumes.c |  4 ++--
 5 files changed, 76 insertions(+), 14 deletions(-)

-- 
2.28.0



* [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan
  2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
@ 2020-12-14 10:10 ` fdmanana
  2020-12-17 17:44   ` David Sterba
  2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

If we remount a filesystem in RO mode while the qgroup rescan worker is
running, the worker can still be running after the remount is done, and
at unmount time we may have an open transaction that never gets
committed. If that happens we get several memory leaks and can crash
when hardware acceleration is unavailable for crc32c. It can possibly
lead to other nasty surprises too, due to use-after-free issues.

The following steps explain how the problem happens.

1) We have a filesystem mounted in RW mode and the qgroup rescan worker is
   running;

2) We remount the filesystem in RO mode, and never stop/pause the rescan
   worker, so after the remount the rescan worker is still running. The
   important detail here is that the rescan task is still running after
   the remount operation committed any ongoing transaction through its
   call to btrfs_commit_super();

3) The rescan is still running, and after the remount completes, the
   rescan worker starts a transaction, once it has finished iterating all
   leaves of the extent tree, to update the qgroup status item in the
   quotas tree. It does not commit the transaction, it only releases its
   handle on the transaction;

4) A filesystem unmount operation starts shortly after;

5) The unmount task, at close_ctree(), stops the transaction kthread,
   which had not had a chance to commit the open transaction since it was
   sleeping and the commit interval (default of 30 seconds) has not yet
   elapsed since the last time it committed a transaction;

6) So after stopping the transaction kthread we still have the transaction
   used to update the qgroup status item open. At close_ctree(), when the
   filesystem is in RO mode and neither a transaction abort happened nor
   the filesystem is in an error state, we do not expect to have any
   transaction open, so we do not call btrfs_commit_super();

7) We then proceed to destroy the work queues, free the roots and block
   groups, etc. After that we drop the last reference on the btree inode
   by calling iput() on it. Since there are dirty pages for the btree
   inode, corresponding to the COWed extent buffer for the quotas btree,
   btree_write_cache_pages() is invoked to flush those dirty pages. This
   results in creating a bio and submitting it, which makes us end up at
   btrfs_submit_metadata_bio();

8) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
   that calls btrfs_wq_submit_bio(), because check_async_write() returned
   a value of 1. This value of 1 is because we did not have hardware
   acceleration available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not
   set in fs_info->flags;

9) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
   workqueue at fs_info->workers, which was already freed before by the
   call to btrfs_stop_all_workers() at close_ctree(). This results in an
   invalid memory access due to a use-after-free, leading to a crash.

When this happens, before the crash there are several warnings triggered,
since we have reserved metadata space in a block group, the delayed refs
reservation, etc:

  ------------[ cut here ]------------
  WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  CPU: 4 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
  Code: f0 01 00 00 48 39 c2 75 (...)
  RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
  RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
  RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
  RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
  R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
  FS:  00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
   close_ctree+0x2ba/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f15ee221ee7
  Code: ff 0b 00 f7 d8 64 89 01 48 (...)
  RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
  RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
  R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  ---[ end trace dd74718fef1ed5c6 ]---
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  CPU: 2 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
  Code: 48 83 bb b0 03 00 00 00 (...)
  RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
  RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
  RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
  R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
  FS:  00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
   close_ctree+0x2ba/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f15ee221ee7
  Code: ff 0b 00 f7 d8 64 89 01 (...)
  RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
  RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
  R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  ---[ end trace dd74718fef1ed5c7 ]---
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  CPU: 5 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
  Code: ad de 49 be 22 01 00 (...)
  RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
  RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
  RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
  R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
  FS:  00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   close_ctree+0x2ba/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f15ee221ee7
  Code: ff 0b 00 f7 d8 64 89 (...)
  RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
  RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
  R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  ---[ end trace dd74718fef1ed5c8 ]---
  BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
  BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
  BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0

And the crash, which only happens when we do not have crc32c hardware
acceleration, produces the following trace immediately after those
warnings:

  stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  CPU: 2 PID: 1749129 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
  Code: 54 55 53 48 89 f3 (...)
  RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
  RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
  RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
  R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
  FS:  00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
   btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
   submit_one_bio+0x61/0x70 [btrfs]
   btree_write_cache_pages+0x414/0x450 [btrfs]
   ? kobject_put+0x9a/0x1d0
   ? trace_hardirqs_on+0x1b/0xf0
   ? _raw_spin_unlock_irqrestore+0x3c/0x60
   ? free_debug_processing+0x1e1/0x2b0
   do_writepages+0x43/0xe0
   ? lock_acquired+0x199/0x490
   __writeback_single_inode+0x59/0x650
   writeback_single_inode+0xaf/0x120
   write_inode_now+0x94/0xd0
   iput+0x187/0x2b0
   close_ctree+0x2c6/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f3cfebabee7
  Code: ff 0b 00 f7 d8 64 89 01 (...)
  RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
  RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
  R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  ---[ end trace dd74718fef1ed5cc ]---

Finally, when we remove the btrfs module (rmmod btrfs), there are several
warnings about objects that were allocated from our slabs but were never
freed, a consequence of the transaction that was never committed and got
leaked:

  =============================================================================
  BUG btrfs_delayed_ref_head (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------

  INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   slab_err+0xb7/0xdc
   ? lock_acquired+0x199/0x490
   __kmem_cache_shutdown+0x1ac/0x3c0
   ? lock_release+0x20e/0x4c0
   kmem_cache_destroy+0x55/0x120
   btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  INFO: Object 0x0000000050cbdd61 @offset=12104
  INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
	__slab_alloc.isra.0+0x109/0x1c0
	kmem_cache_alloc+0x7bb/0x830
	btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
	btrfs_free_tree_block+0x128/0x360 [btrfs]
	__btrfs_cow_block+0x489/0x5f0 [btrfs]
	btrfs_cow_block+0xf7/0x220 [btrfs]
	btrfs_search_slot+0x62a/0xc40 [btrfs]
	btrfs_del_orphan_item+0x65/0xd0 [btrfs]
	btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
	open_ctree+0x125a/0x18a0 [btrfs]
	btrfs_mount_root.cold+0x13/0xed [btrfs]
	legacy_get_tree+0x30/0x60
	vfs_get_tree+0x28/0xe0
	fc_mount+0xe/0x40
	vfs_kern_mount.part.0+0x71/0x90
	btrfs_mount+0x13b/0x3e0 [btrfs]
  INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
	kmem_cache_free+0x34c/0x3c0
	__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
	btrfs_run_delayed_refs+0x81/0x210 [btrfs]
	commit_cowonly_roots+0xfb/0x300 [btrfs]
	btrfs_commit_transaction+0x367/0xc40 [btrfs]
	sync_filesystem+0x74/0x90
	generic_shutdown_super+0x22/0x100
	kill_anon_super+0x14/0x30
	btrfs_kill_super+0x12/0x20 [btrfs]
	deactivate_locked_super+0x31/0x70
	cleanup_mnt+0x100/0x160
	task_work_run+0x68/0xb0
	exit_to_user_mode_prepare+0x1bb/0x1c0
	syscall_exit_to_user_mode+0x4b/0x260
	entry_SYSCALL_64_after_hwframe+0x44/0xa9
  INFO: Object 0x0000000086e9b0ff @offset=12776
  INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
	__slab_alloc.isra.0+0x109/0x1c0
	kmem_cache_alloc+0x7bb/0x830
	btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
	btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
	alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
	__btrfs_cow_block+0x12d/0x5f0 [btrfs]
	btrfs_cow_block+0xf7/0x220 [btrfs]
	btrfs_search_slot+0x62a/0xc40 [btrfs]
	btrfs_del_orphan_item+0x65/0xd0 [btrfs]
	btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
	open_ctree+0x125a/0x18a0 [btrfs]
	btrfs_mount_root.cold+0x13/0xed [btrfs]
	legacy_get_tree+0x30/0x60
	vfs_get_tree+0x28/0xe0
	fc_mount+0xe/0x40
	vfs_kern_mount.part.0+0x71/0x90
  INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
	kmem_cache_free+0x34c/0x3c0
	__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
	btrfs_run_delayed_refs+0x81/0x210 [btrfs]
	btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
	commit_cowonly_roots+0x248/0x300 [btrfs]
	btrfs_commit_transaction+0x367/0xc40 [btrfs]
	close_ctree+0x113/0x2fa [btrfs]
	generic_shutdown_super+0x6c/0x100
	kill_anon_super+0x14/0x30
	btrfs_kill_super+0x12/0x20 [btrfs]
	deactivate_locked_super+0x31/0x70
	cleanup_mnt+0x100/0x160
	task_work_run+0x68/0xb0
	exit_to_user_mode_prepare+0x1bb/0x1c0
	syscall_exit_to_user_mode+0x4b/0x260
	entry_SYSCALL_64_after_hwframe+0x44/0xa9
  kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   kmem_cache_destroy+0x119/0x120
   btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  =============================================================================
  BUG btrfs_delayed_tree_ref (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------

  INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
  CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   slab_err+0xb7/0xdc
   ? lock_acquired+0x199/0x490
   __kmem_cache_shutdown+0x1ac/0x3c0
   ? lock_release+0x20e/0x4c0
   kmem_cache_destroy+0x55/0x120
   btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  INFO: Object 0x000000001a340018 @offset=4408
  INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
	__slab_alloc.isra.0+0x109/0x1c0
	kmem_cache_alloc+0x7bb/0x830
	btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
	btrfs_free_tree_block+0x128/0x360 [btrfs]
	__btrfs_cow_block+0x489/0x5f0 [btrfs]
	btrfs_cow_block+0xf7/0x220 [btrfs]
	btrfs_search_slot+0x62a/0xc40 [btrfs]
	btrfs_del_orphan_item+0x65/0xd0 [btrfs]
	btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
	open_ctree+0x125a/0x18a0 [btrfs]
	btrfs_mount_root.cold+0x13/0xed [btrfs]
	legacy_get_tree+0x30/0x60
	vfs_get_tree+0x28/0xe0
	fc_mount+0xe/0x40
	vfs_kern_mount.part.0+0x71/0x90
	btrfs_mount+0x13b/0x3e0 [btrfs]
  INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
	kmem_cache_free+0x34c/0x3c0
	__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
	btrfs_run_delayed_refs+0x81/0x210 [btrfs]
	btrfs_commit_transaction+0x60/0xc40 [btrfs]
	create_subvol+0x56a/0x990 [btrfs]
	btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
	__btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
	btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
	btrfs_ioctl+0x1a92/0x36f0 [btrfs]
	__x64_sys_ioctl+0x83/0xb0
	do_syscall_64+0x33/0x80
	entry_SYSCALL_64_after_hwframe+0x44/0xa9
  INFO: Object 0x000000002b46292a @offset=13648
  INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
	__slab_alloc.isra.0+0x109/0x1c0
	kmem_cache_alloc+0x7bb/0x830
	btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
	btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
	alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
	__btrfs_cow_block+0x12d/0x5f0 [btrfs]
	btrfs_cow_block+0xf7/0x220 [btrfs]
	btrfs_search_slot+0x62a/0xc40 [btrfs]
	btrfs_del_orphan_item+0x65/0xd0 [btrfs]
	btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
	open_ctree+0x125a/0x18a0 [btrfs]
	btrfs_mount_root.cold+0x13/0xed [btrfs]
	legacy_get_tree+0x30/0x60
	vfs_get_tree+0x28/0xe0
	fc_mount+0xe/0x40
	vfs_kern_mount.part.0+0x71/0x90
  INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
	kmem_cache_free+0x34c/0x3c0
	__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
	btrfs_run_delayed_refs+0x81/0x210 [btrfs]
	commit_cowonly_roots+0xfb/0x300 [btrfs]
	btrfs_commit_transaction+0x367/0xc40 [btrfs]
	close_ctree+0x113/0x2fa [btrfs]
	generic_shutdown_super+0x6c/0x100
	kill_anon_super+0x14/0x30
	btrfs_kill_super+0x12/0x20 [btrfs]
	deactivate_locked_super+0x31/0x70
	cleanup_mnt+0x100/0x160
	task_work_run+0x68/0xb0
	exit_to_user_mode_prepare+0x1bb/0x1c0
	syscall_exit_to_user_mode+0x4b/0x260
	entry_SYSCALL_64_after_hwframe+0x44/0xa9
  kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   kmem_cache_destroy+0x119/0x120
   btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  =============================================================================
  BUG btrfs_delayed_extent_op (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------

  INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   slab_err+0xb7/0xdc
   ? lock_acquired+0x199/0x490
   __kmem_cache_shutdown+0x1ac/0x3c0
   ? __mutex_unlock_slowpath+0x45/0x2a0
   kmem_cache_destroy+0x55/0x120
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  INFO: Object 0x000000004cf95ea8 @offset=6264
  INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
	__slab_alloc.isra.0+0x109/0x1c0
	kmem_cache_alloc+0x7bb/0x830
	btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
	alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
	__btrfs_cow_block+0x12d/0x5f0 [btrfs]
	btrfs_cow_block+0xf7/0x220 [btrfs]
	btrfs_search_slot+0x62a/0xc40 [btrfs]
	btrfs_del_orphan_item+0x65/0xd0 [btrfs]
	btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
	open_ctree+0x125a/0x18a0 [btrfs]
	btrfs_mount_root.cold+0x13/0xed [btrfs]
	legacy_get_tree+0x30/0x60
	vfs_get_tree+0x28/0xe0
	fc_mount+0xe/0x40
	vfs_kern_mount.part.0+0x71/0x90
	btrfs_mount+0x13b/0x3e0 [btrfs]
  INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
	kmem_cache_free+0x34c/0x3c0
	__btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
	btrfs_run_delayed_refs+0x81/0x210 [btrfs]
	commit_cowonly_roots+0xfb/0x300 [btrfs]
	btrfs_commit_transaction+0x367/0xc40 [btrfs]
	close_ctree+0x113/0x2fa [btrfs]
	generic_shutdown_super+0x6c/0x100
	kill_anon_super+0x14/0x30
	btrfs_kill_super+0x12/0x20 [btrfs]
	deactivate_locked_super+0x31/0x70
	cleanup_mnt+0x100/0x160
	task_work_run+0x68/0xb0
	exit_to_user_mode_prepare+0x1bb/0x1c0
	syscall_exit_to_user_mode+0x4b/0x260
	entry_SYSCALL_64_after_hwframe+0x44/0xa9
  kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
  CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   kmem_cache_destroy+0x119/0x120
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1

Fix this issue by having the remount path stop the qgroup rescan worker
when we are remounting RO and teach the rescan worker to stop when a
remount is in progress. If later a remount in RW mode happens, we are
already resuming the qgroup rescan worker through the call to
btrfs_qgroup_rescan_resume(), so we do not need to worry about that.
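
The worker side of the fix can be pictured with a minimal userspace model.
It is a sketch under assumptions, not the kernel implementation: struct
fs_model, model_rescan_should_stop() and model_rescan_worker() are
hypothetical names, the two booleans stand in for btrfs_fs_closing() and
the BTRFS_FS_STATE_REMOUNTING bit, and the remount_at parameter exists only
so the example can flip the remount state in the middle of a run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant bits of struct btrfs_fs_info. */
struct fs_model {
	bool closing;    /* models btrfs_fs_closing() returning true */
	bool remounting; /* models the BTRFS_FS_STATE_REMOUNTING bit */
};

/* Stop when the filesystem is closing or a remount is in progress. */
static bool model_rescan_should_stop(const struct fs_model *fs)
{
	return fs->closing || fs->remounting;
}

/*
 * Process up to nr_leaves leaves and return how many were processed.
 * A remount is simulated right before leaf number remount_at; pass a
 * negative value to never simulate one. *stopped tells the caller
 * whether the loop ended because of the stop predicate (paused) or
 * because all leaves were processed (completed).
 */
static int model_rescan_worker(struct fs_model *fs, int nr_leaves,
			       int remount_at, bool *stopped)
{
	int done = 0;

	assert(fs != NULL && stopped != NULL);
	*stopped = false;

	while (done < nr_leaves) {
		if (done == remount_at)
			fs->remounting = true;

		/* Same check on every iteration, result is remembered. */
		*stopped = model_rescan_should_stop(fs);
		if (*stopped)
			break;

		done++;
	}

	return done;
}
```

Remembering the predicate's result in a flag, rather than evaluating it
again after the loop, keeps the later "paused" versus "completed" decision
consistent with the reason the loop actually ended.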

Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/qgroup.c | 13 ++++++++++---
 fs/btrfs/super.c  |  8 ++++++++
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 47f27658eac1..808370ada888 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3190,6 +3190,12 @@ static int qgroup_rescan_leaf(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
+{
+	return btrfs_fs_closing(fs_info) ||
+		test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
+}
+
 static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 {
 	struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
@@ -3198,6 +3204,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	struct btrfs_trans_handle *trans = NULL;
 	int err = -ENOMEM;
 	int ret = 0;
+	bool stopped = false;
 
 	path = btrfs_alloc_path();
 	if (!path)
@@ -3210,7 +3217,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	path->skip_locking = 1;
 
 	err = 0;
-	while (!err && !btrfs_fs_closing(fs_info)) {
+	while (!err && !(stopped = rescan_should_stop(fs_info))) {
 		trans = btrfs_start_transaction(fs_info->fs_root, 0);
 		if (IS_ERR(trans)) {
 			err = PTR_ERR(trans);
@@ -3253,7 +3260,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	}
 
 	mutex_lock(&fs_info->qgroup_rescan_lock);
-	if (!btrfs_fs_closing(fs_info))
+	if (!stopped)
 		fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
 	if (trans) {
 		ret = update_qgroup_status_item(trans);
@@ -3272,7 +3279,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 
 	btrfs_end_transaction(trans);
 
-	if (btrfs_fs_closing(fs_info)) {
+	if (stopped) {
 		btrfs_info(fs_info, "qgroup scan paused");
 	} else if (err >= 0) {
 		btrfs_info(fs_info, "qgroup scan completed%s",
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 022f20810089..b24fa62375e0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1968,6 +1968,14 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		btrfs_scrub_cancel(fs_info);
 		btrfs_pause_balance(fs_info);
 
+		/*
+		 * Pause the qgroup rescan worker if it is running. We don't want
+		 * it to be still running after we are in RO mode, as after that,
+		 * by the time we unmount, it might have left a transaction open,
+		 * so we would leak the transaction and/or crash.
+		 */
+		btrfs_qgroup_wait_for_completion(fs_info, false);
+
 		ret = btrfs_commit_super(fs_info);
 		if (ret)
 			goto restore;
-- 
2.28.0



* [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
  2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
  2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
@ 2020-12-14 10:10 ` fdmanana
  2021-03-16  6:44   ` robbieko
  2020-12-14 10:10 ` [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task fdmanana
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When we delete a root (subvolume or snapshot), at the very end of the
operation, we attempt to remove the root's orphan item from the root tree,
at btrfs_drop_snapshot(), by calling btrfs_del_orphan_item(). We ignore any
error from btrfs_del_orphan_item() since it is not a serious problem and
the next time the filesystem is mounted we remove such stray orphan items
at btrfs_find_orphan_roots().

However, if the filesystem is mounted RO and we have stray orphan items for
any previously deleted root, we can end up leaking a transaction and other
data structures when unmounting the filesystem, as well as crashing if we
do not have hardware acceleration for crc32c available.

The steps that lead to the transaction leak are the following:

1) The filesystem is mounted in RW mode;

2) A subvolume is deleted;

3) When the cleaner kthread runs btrfs_drop_snapshot() to delete the root,
   it gets a failure at btrfs_del_orphan_item(), for example -ENOMEM when
   allocating a path, and the failure is ignored. So the orphan item for
   the root remains in the root tree;

4) The filesystem is unmounted;

5) The filesystem is mounted RO (-o ro). During the mount path we call
   btrfs_find_orphan_roots(), which iterates the root tree searching for
   orphan items. It finds the orphan item for our deleted root, and since
   it cannot find the root, it starts a transaction to delete the orphan
   item (by calling btrfs_del_orphan_item());

6) The RO mount completes;

7) Before the transaction kthread commits the transaction created for
   deleting the orphan item (i.e. less than 30 seconds elapsed since the
   mount, the default commit interval), a filesystem unmount operation is
   started;

8) At close_ctree(), we stop the transaction kthread, but we still have a
   transaction open with at least one dirty extent buffer, a leaf for the
   tree root which was COWed when deleting the orphan item;

9) We then proceed to destroy the work queues, free the roots and block
   groups, etc. After that we drop the last reference on the btree inode by
   calling iput() on it. Since there are dirty pages for the btree inode,
   corresponding to the COWed extent buffer, btree_write_cache_pages() is
   invoked to flush those dirty pages. This results in creating a bio and
   submitting it, which makes us end up at btrfs_submit_metadata_bio();

10) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
    that calls btrfs_wq_submit_bio(), because check_async_write() returned
    1. It returned 1 because we did not have hardware acceleration
    available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not set in
    fs_info->flags;

11) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
    workqueue at fs_info->workers, which was already freed before by the
    call to btrfs_stop_all_workers() at close_ctree(). This results in an
    invalid memory access due to a use-after-free, leading to a crash.
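
The ordering problem in steps 8 to 11 can be sketched with a small
user-space model. It is an illustration under simplifying assumptions, not
the kernel code: struct model_fs, enum model_submit and the model_*
functions are hypothetical names, and the model reduces check_async_write()
to the single BTRFS_FS_CSUM_IMPL_FAST test described in step 10.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the state that matters at unmount. */
struct model_fs {
	bool workers_alive;     /* false once the workqueues are destroyed */
	bool csum_impl_fast;    /* models BTRFS_FS_CSUM_IMPL_FAST */
	int dirty_btree_pages;  /* left behind by the uncommitted transaction */
};

enum model_submit {
	MODEL_NOTHING_TO_WRITE,  /* no dirty pages, iput() writes nothing */
	MODEL_SYNC_SUBMIT,       /* bio submitted directly, no workqueue */
	MODEL_ASYNC_SUBMIT,      /* bio queued on a live workqueue */
	MODEL_USE_AFTER_FREE,    /* bio queued on a freed workqueue */
};

/* Models the final iput() on the btree inode (steps 9 to 11). */
static enum model_submit model_final_iput(const struct model_fs *fs)
{
	assert(fs->dirty_btree_pages >= 0);

	if (fs->dirty_btree_pages == 0)
		return MODEL_NOTHING_TO_WRITE;
	if (fs->csum_impl_fast)
		return MODEL_SYNC_SUBMIT;
	return fs->workers_alive ? MODEL_ASYNC_SUBMIT : MODEL_USE_AFTER_FREE;
}

/* Models the tail of close_ctree(): workers go away before the iput(). */
static enum model_submit model_close_ctree(struct model_fs *fs)
{
	fs->workers_alive = false;  /* models btrfs_stop_all_workers() */
	return model_final_iput(fs);
}
```

In the model, as in the steps above, the crash needs both ingredients:
dirty btree pages left by the leaked transaction, and the asynchronous
submission path taken when no fast crc32c implementation is available.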

When this happens, several warnings are triggered before the crash, since
we still have reserved metadata space in a block group, in the delayed refs
reservation, etc:

 ------------[ cut here ]------------
 WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
 Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
 CPU: 4 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
 Code: f0 01 00 00 48 39 c2 75 (...)
 RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
 RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
 RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
 RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
 R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
 FS:  00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
  close_ctree+0x2ba/0x2fa [btrfs]
  generic_shutdown_super+0x6c/0x100
  kill_anon_super+0x14/0x30
  btrfs_kill_super+0x12/0x20 [btrfs]
  deactivate_locked_super+0x31/0x70
  cleanup_mnt+0x100/0x160
  task_work_run+0x68/0xb0
  exit_to_user_mode_prepare+0x1bb/0x1c0
  syscall_exit_to_user_mode+0x4b/0x260
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f15ee221ee7
 Code: ff 0b 00 f7 d8 64 89 01 48 (...)
 RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
 RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
 RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
 RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
 R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
 R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>] 0x0
 hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
 softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
 softirqs last disabled at (0): [<0000000000000000>] 0x0
 ---[ end trace dd74718fef1ed5c6 ]---
 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
 Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
 CPU: 2 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
 Code: 48 83 bb b0 03 00 00 00 (...)
 RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
 RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
 RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
 R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
 FS:  00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
  close_ctree+0x2ba/0x2fa [btrfs]
  generic_shutdown_super+0x6c/0x100
  kill_anon_super+0x14/0x30
  btrfs_kill_super+0x12/0x20 [btrfs]
  deactivate_locked_super+0x31/0x70
  cleanup_mnt+0x100/0x160
  task_work_run+0x68/0xb0
  exit_to_user_mode_prepare+0x1bb/0x1c0
  syscall_exit_to_user_mode+0x4b/0x260
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f15ee221ee7
 Code: ff 0b 00 f7 d8 64 89 01 (...)
 RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
 RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
 RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
 RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
 R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
 R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>] 0x0
 hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
 softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
 softirqs last disabled at (0): [<0000000000000000>] 0x0
 ---[ end trace dd74718fef1ed5c7 ]---
 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
 Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
 CPU: 5 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
 Code: ad de 49 be 22 01 00 (...)
 RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
 RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
 RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
 R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
 FS:  00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  close_ctree+0x2ba/0x2fa [btrfs]
  generic_shutdown_super+0x6c/0x100
  kill_anon_super+0x14/0x30
  btrfs_kill_super+0x12/0x20 [btrfs]
  deactivate_locked_super+0x31/0x70
  cleanup_mnt+0x100/0x160
  task_work_run+0x68/0xb0
  exit_to_user_mode_prepare+0x1bb/0x1c0
  syscall_exit_to_user_mode+0x4b/0x260
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f15ee221ee7
 Code: ff 0b 00 f7 d8 64 89 (...)
 RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
 RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
 RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
 RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
 R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
 R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
 irq event stamp: 0
 hardirqs last  enabled at (0): [<0000000000000000>] 0x0
 hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
 softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
 softirqs last disabled at (0): [<0000000000000000>] 0x0
 ---[ end trace dd74718fef1ed5c8 ]---
 BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
 BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
 BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
 BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
 BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
 BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
 BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0

And the crash, which only happens when we do not have crc32c hardware
acceleration, produces the following trace immediately after those
warnings:

 stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
 CPU: 2 PID: 1749129 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
 Code: 54 55 53 48 89 f3 (...)
 RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
 RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
 RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
 R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
 FS:  00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
  btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
  submit_one_bio+0x61/0x70 [btrfs]
  btree_write_cache_pages+0x414/0x450 [btrfs]
  ? kobject_put+0x9a/0x1d0
  ? trace_hardirqs_on+0x1b/0xf0
  ? _raw_spin_unlock_irqrestore+0x3c/0x60
  ? free_debug_processing+0x1e1/0x2b0
  do_writepages+0x43/0xe0
  ? lock_acquired+0x199/0x490
  __writeback_single_inode+0x59/0x650
  writeback_single_inode+0xaf/0x120
  write_inode_now+0x94/0xd0
  iput+0x187/0x2b0
  close_ctree+0x2c6/0x2fa [btrfs]
  generic_shutdown_super+0x6c/0x100
  kill_anon_super+0x14/0x30
  btrfs_kill_super+0x12/0x20 [btrfs]
  deactivate_locked_super+0x31/0x70
  cleanup_mnt+0x100/0x160
  task_work_run+0x68/0xb0
  exit_to_user_mode_prepare+0x1bb/0x1c0
  syscall_exit_to_user_mode+0x4b/0x260
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f3cfebabee7
 Code: ff 0b 00 f7 d8 64 89 01 (...)
 RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
 RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
 RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
 RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
 R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
 R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
 Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
 ---[ end trace dd74718fef1ed5cc ]---

Finally, when we remove the btrfs module (rmmod btrfs), there are several
warnings about objects that were allocated from our slabs but were never
freed, a consequence of the transaction that was never committed and got
leaked:

 =============================================================================
 BUG btrfs_delayed_ref_head (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
 -----------------------------------------------------------------------------

 INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
 CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack+0x8d/0xb5
  slab_err+0xb7/0xdc
  ? lock_acquired+0x199/0x490
  __kmem_cache_shutdown+0x1ac/0x3c0
  ? lock_release+0x20e/0x4c0
  kmem_cache_destroy+0x55/0x120
  btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
  exit_btrfs_fs+0xa/0x59 [btrfs]
  __x64_sys_delete_module+0x194/0x260
  ? fpregs_assert_state_consistent+0x1e/0x40
  ? exit_to_user_mode_prepare+0x55/0x1c0
  ? trace_hardirqs_on+0x1b/0xf0
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f693e305897
 Code: 73 01 c3 48 8b 0d f9 f5 (...)
 RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
 RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
 R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
 R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
 INFO: Object 0x0000000050cbdd61 @offset=12104
 INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
        btrfs_free_tree_block+0x128/0x360 [btrfs]
        __btrfs_cow_block+0x489/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
        btrfs_mount+0x13b/0x3e0 [btrfs]
 INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        commit_cowonly_roots+0xfb/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        sync_filesystem+0x74/0x90
        generic_shutdown_super+0x22/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
 INFO: Object 0x0000000086e9b0ff @offset=12776
 INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
        btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
        alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
        __btrfs_cow_block+0x12d/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
 INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
        commit_cowonly_roots+0x248/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        close_ctree+0x113/0x2fa [btrfs]
        generic_shutdown_super+0x6c/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
 kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
 CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack+0x8d/0xb5
  kmem_cache_destroy+0x119/0x120
  btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
  exit_btrfs_fs+0xa/0x59 [btrfs]
  __x64_sys_delete_module+0x194/0x260
  ? fpregs_assert_state_consistent+0x1e/0x40
  ? exit_to_user_mode_prepare+0x55/0x1c0
  ? trace_hardirqs_on+0x1b/0xf0
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f693e305897
 Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
 RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
 RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
 R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
 R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
 =============================================================================
 BUG btrfs_delayed_tree_ref (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
 -----------------------------------------------------------------------------

 INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
 CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack+0x8d/0xb5
  slab_err+0xb7/0xdc
  ? lock_acquired+0x199/0x490
  __kmem_cache_shutdown+0x1ac/0x3c0
  ? lock_release+0x20e/0x4c0
  kmem_cache_destroy+0x55/0x120
  btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
  exit_btrfs_fs+0xa/0x59 [btrfs]
  __x64_sys_delete_module+0x194/0x260
  ? fpregs_assert_state_consistent+0x1e/0x40
  ? exit_to_user_mode_prepare+0x55/0x1c0
  ? trace_hardirqs_on+0x1b/0xf0
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f693e305897
 Code: 73 01 c3 48 8b 0d f9 f5 (...)
 RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
 RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
 R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
 R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
 INFO: Object 0x000000001a340018 @offset=4408
 INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
        btrfs_free_tree_block+0x128/0x360 [btrfs]
        __btrfs_cow_block+0x489/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
        btrfs_mount+0x13b/0x3e0 [btrfs]
 INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        btrfs_commit_transaction+0x60/0xc40 [btrfs]
        create_subvol+0x56a/0x990 [btrfs]
        btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
        __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
        btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
        btrfs_ioctl+0x1a92/0x36f0 [btrfs]
        __x64_sys_ioctl+0x83/0xb0
        do_syscall_64+0x33/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
 INFO: Object 0x000000002b46292a @offset=13648
 INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
        btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
        alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
        __btrfs_cow_block+0x12d/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
 INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        commit_cowonly_roots+0xfb/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        close_ctree+0x113/0x2fa [btrfs]
        generic_shutdown_super+0x6c/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
 kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
 CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack+0x8d/0xb5
  kmem_cache_destroy+0x119/0x120
  btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
  exit_btrfs_fs+0xa/0x59 [btrfs]
  __x64_sys_delete_module+0x194/0x260
  ? fpregs_assert_state_consistent+0x1e/0x40
  ? exit_to_user_mode_prepare+0x55/0x1c0
  ? trace_hardirqs_on+0x1b/0xf0
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f693e305897
 Code: 73 01 c3 48 8b 0d f9 f5 (...)
 RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
 RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
 R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
 R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
 =============================================================================
 BUG btrfs_delayed_extent_op (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
 -----------------------------------------------------------------------------

 INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
 CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack+0x8d/0xb5
  slab_err+0xb7/0xdc
  ? lock_acquired+0x199/0x490
  __kmem_cache_shutdown+0x1ac/0x3c0
  ? __mutex_unlock_slowpath+0x45/0x2a0
  kmem_cache_destroy+0x55/0x120
  exit_btrfs_fs+0xa/0x59 [btrfs]
  __x64_sys_delete_module+0x194/0x260
  ? fpregs_assert_state_consistent+0x1e/0x40
  ? exit_to_user_mode_prepare+0x55/0x1c0
  ? trace_hardirqs_on+0x1b/0xf0
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f693e305897
 Code: 73 01 c3 48 8b 0d f9 f5 (...)
 RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
 RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
 R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
 R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
 INFO: Object 0x000000004cf95ea8 @offset=6264
 INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
        alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
        __btrfs_cow_block+0x12d/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
        btrfs_mount+0x13b/0x3e0 [btrfs]
 INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        commit_cowonly_roots+0xfb/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        close_ctree+0x113/0x2fa [btrfs]
        generic_shutdown_super+0x6c/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
 kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
 CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
  dump_stack+0x8d/0xb5
  kmem_cache_destroy+0x119/0x120
  exit_btrfs_fs+0xa/0x59 [btrfs]
  __x64_sys_delete_module+0x194/0x260
  ? fpregs_assert_state_consistent+0x1e/0x40
  ? exit_to_user_mode_prepare+0x55/0x1c0
  ? trace_hardirqs_on+0x1b/0xf0
  do_syscall_64+0x33/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f693e305897
 Code: 73 01 c3 48 8b 0d f9 (...)
 RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
 RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
 RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
 R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
 R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
 BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1

So fix this by calling btrfs_find_orphan_roots() in the mount path only if
we are mounting the filesystem in RW mode. Calling it for RO mounts is
pointless anyway: although it adds any deleted roots to the list of dead
roots, those roots are never deleted until the filesystem is remounted in
RW mode. The cleaner kthread does nothing while we are mounted RO, since
btrfs_need_cleaner_sleep() always returns true and the cleaner spends all
its time sleeping, never cleaning dead roots.

This is accomplished by moving the call to btrfs_find_orphan_roots() from
open_ctree() to btrfs_start_pre_rw_mount(), which also guarantees that
if later the filesystem is remounted RW, we populate the list of dead
roots and have the cleaner task delete the dead roots.
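
As a rough userspace illustration of the control flow described above (not
the kernel code), the sketch below models a mount path that runs the orphan
root scan only from the pre-RW step. The toy_* names and struct toy_fs are
hypothetical stand-ins for btrfs_fs_info, open_ctree(),
btrfs_start_pre_rw_mount() and btrfs_find_orphan_roots(); the ordering
inside the remount helper is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct btrfs_fs_info; not the kernel type. */
struct toy_fs {
	bool read_only;    /* current mount mode */
	int orphan_scans;  /* how many times the orphan root scan ran */
};

/* Models btrfs_find_orphan_roots(): may modify metadata, so RW only. */
static int toy_find_orphan_roots(struct toy_fs *fs)
{
	assert(!fs->read_only);
	fs->orphan_scans++;
	return 0;
}

/* Models btrfs_start_pre_rw_mount(): work done before going RW. */
static int toy_start_pre_rw_mount(struct toy_fs *fs)
{
	return toy_find_orphan_roots(fs);
}

/* Models the mount path: the RW-only step is skipped for RO mounts. */
static int toy_mount(struct toy_fs *fs, bool read_only)
{
	fs->read_only = read_only;
	fs->orphan_scans = 0;
	if (!read_only)
		return toy_start_pre_rw_mount(fs);
	return 0;
}

/* Models a later RO -> RW remount, which runs the same pre-RW step. */
static int toy_remount_rw(struct toy_fs *fs)
{
	fs->read_only = false;
	return toy_start_pre_rw_mount(fs);
}
```

In this model an RO mount leaves orphan_scans at 0, and the scan runs once
when toy_remount_rw() is called later.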

Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/disk-io.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 765deefda92b..e941cbae3991 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2969,6 +2969,7 @@ int btrfs_start_pre_rw_mount(struct btrfs_fs_info *fs_info)
 		}
 	}
 
+	ret = btrfs_find_orphan_roots(fs_info);
 out:
 	return ret;
 }
@@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		}
 	}
 
-	ret = btrfs_find_orphan_roots(fs_info);
-	if (ret)
-		goto fail_qgroup;
-
 	fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
 	if (IS_ERR(fs_info->fs_root)) {
 		err = PTR_ERR(fs_info->fs_root);
-- 
2.28.0



* [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task
  2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
  2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
  2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
@ 2020-12-14 10:10 ` fdmanana
  2020-12-14 10:10 ` [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount fdmanana
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When we are remounting a filesystem in RO mode we can race with the cleaner
task and end up leaking a transaction if the filesystem is unmounted
shortly after, before the transaction kthread had a chance to commit that
transaction. That also results in a crash during unmount, due to a
use-after-free, if hardware acceleration is not available for crc32c.

The following sequence of steps explains how the race happens.

1) The filesystem is mounted in RW mode and the cleaner task is running.
   This means that currently BTRFS_FS_CLEANER_RUNNING is set at
   fs_info->flags;

2) The cleaner task is currently running delayed iputs, for example;

3) A filesystem RO remount operation starts;

4) The RO remount task calls btrfs_commit_super(), which commits any
   currently open transaction, and it finishes;

5) At this point the cleaner task is still running and it creates a new
   transaction by doing one of the following things:

   * When running the delayed iput() for an inode with a 0 link count,
     in which case at btrfs_evict_inode() we start a transaction through
     the call to evict_refill_and_join(), use it and then release its
     handle through btrfs_end_transaction();

   * When deleting a dead root through btrfs_clean_one_deleted_snapshot(),
     a transaction is started at btrfs_drop_snapshot() and then its handle
     is released through a call to btrfs_end_transaction_throttle();

   * When the remount task was still running, and before the remount task
     called btrfs_delete_unused_bgs(), the cleaner task also called
     btrfs_delete_unused_bgs() and it picked and removed one block group
     from the list of unused block groups. Before the cleaner task started
     a transaction, through btrfs_start_trans_remove_block_group() at
     btrfs_delete_unused_bgs(), the remount task had already called
     btrfs_commit_super();

6) So at this point the filesystem is in RO mode and we have an open
   transaction that was started by the cleaner task;

7) Shortly after, a filesystem unmount operation starts. At close_ctree()
   we stop the transaction kthread before it had a chance to commit the
   transaction, since less than 30 seconds (the default commit interval)
   have elapsed since the last transaction was committed;

8) We end up calling iput() against the btree inode at close_ctree() while
   there is an open transaction, and since that transaction was used to
   update btrees by the cleaner, we have dirty pages in the btree inode
   due to COW operations on metadata extents, and therefore writeback is
   triggered for the btree inode.

   So btree_write_cache_pages() is invoked to flush those dirty pages
   during the final iput() on the btree inode. This results in creating a
   bio and submitting it, which makes us end up at
   btrfs_submit_metadata_bio();

9) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
   that calls btrfs_wq_submit_bio(), because check_async_write() returned
   a value of 1. This value of 1 is because we did not have hardware
   acceleration available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not
   set in fs_info->flags;

10) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
    workqueue at fs_info->workers, which was already freed before by the
    call to btrfs_stop_all_workers() at close_ctree(). This results in an
    invalid memory access due to a use-after-free, leading to a crash.
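
The ordering in steps 4 to 8 can be replayed with a small single-threaded
model. This is only a sketch under stated assumptions: struct toy_fs and the
toy_* helpers are hypothetical, time is a plain integer, and the only kernel
facts used are the ones listed above (btrfs_commit_super() closes the open
transaction, the cleaner can open a new one afterwards, and the transaction
kthread does not commit before the 30 second default interval has elapsed).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_COMMIT_INTERVAL 30 /* seconds; the default named in step 7 */

/* Hypothetical model of the state that matters for the race. */
struct toy_fs {
	bool read_only;
	bool trans_open;        /* is there an uncommitted transaction? */
	int secs_since_commit;  /* time since the last commit */
};

/* Step 4: the remount task commits whatever transaction is open. */
static void toy_commit_super(struct toy_fs *fs)
{
	fs->trans_open = false;
	fs->secs_since_commit = 0;
}

/* Step 5: cleaner work already in flight starts a new transaction. */
static void toy_cleaner_work(struct toy_fs *fs)
{
	fs->trans_open = true;
}

/* Step 7: the transaction kthread only commits after the interval. */
static void toy_transaction_kthread(struct toy_fs *fs)
{
	if (fs->trans_open && fs->secs_since_commit >= TOY_COMMIT_INTERVAL)
		toy_commit_super(fs);
}

/* Returns true if unmount finds a transaction that nobody committed. */
static bool toy_race(int secs_before_umount)
{
	struct toy_fs fs = { .read_only = false, .trans_open = true };

	assert(secs_before_umount >= 0);
	fs.read_only = true;            /* step 3: RO remount starts */
	toy_commit_super(&fs);          /* step 4 */
	toy_cleaner_work(&fs);          /* steps 5 and 6 */
	fs.secs_since_commit = secs_before_umount;
	toy_transaction_kthread(&fs);   /* last chance before unmount */
	return fs.trans_open;           /* steps 7 and 8: leaked if true */
}
```

With this model, toy_race(5) reports a leaked transaction, while
toy_race(30) does not, which matches the observation that the problem only
shows up when the unmount follows the RO remount quickly.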

When this happens, before the crash there are several warnings triggered,
since we have reserved metadata space in a block group, the delayed refs
reservation, etc:

  ------------[ cut here ]------------
  WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  CPU: 4 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
  Code: f0 01 00 00 48 39 c2 75 (...)
  RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
  RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
  RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
  RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
  R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
  FS:  00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
   close_ctree+0x2ba/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f15ee221ee7
  Code: ff 0b 00 f7 d8 64 89 01 48 (...)
  RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
  RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
  R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  ---[ end trace dd74718fef1ed5c6 ]---
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  CPU: 2 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
  Code: 48 83 bb b0 03 00 00 00 (...)
  RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
  RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
  RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
  R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
  FS:  00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
   close_ctree+0x2ba/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f15ee221ee7
  Code: ff 0b 00 f7 d8 64 89 01 (...)
  RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
  RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
  R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  ---[ end trace dd74718fef1ed5c7 ]---
  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  CPU: 5 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
  Code: ad de 49 be 22 01 00 (...)
  RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
  RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
  RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
  R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
  FS:  00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   close_ctree+0x2ba/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f15ee221ee7
  Code: ff 0b 00 f7 d8 64 89 (...)
  RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
  RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
  R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
  irq event stamp: 0
  hardirqs last  enabled at (0): [<0000000000000000>] 0x0
  hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  ---[ end trace dd74718fef1ed5c8 ]---
  BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
  BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
  BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
  BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0

And the crash, which only happens when we do not have crc32c hardware
acceleration, produces the following trace immediately after those
warnings:

  stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  CPU: 2 PID: 1749129 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
  Code: 54 55 53 48 89 f3 (...)
  RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
  RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
  RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
  R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
  FS:  00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
   btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
   submit_one_bio+0x61/0x70 [btrfs]
   btree_write_cache_pages+0x414/0x450 [btrfs]
   ? kobject_put+0x9a/0x1d0
   ? trace_hardirqs_on+0x1b/0xf0
   ? _raw_spin_unlock_irqrestore+0x3c/0x60
   ? free_debug_processing+0x1e1/0x2b0
   do_writepages+0x43/0xe0
   ? lock_acquired+0x199/0x490
   __writeback_single_inode+0x59/0x650
   writeback_single_inode+0xaf/0x120
   write_inode_now+0x94/0xd0
   iput+0x187/0x2b0
   close_ctree+0x2c6/0x2fa [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x68/0xb0
   exit_to_user_mode_prepare+0x1bb/0x1c0
   syscall_exit_to_user_mode+0x4b/0x260
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f3cfebabee7
  Code: ff 0b 00 f7 d8 64 89 01 (...)
  RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
  RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
  RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
  R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
  R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
  Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
  ---[ end trace dd74718fef1ed5cc ]---

Finally, when we remove the btrfs module (rmmod btrfs), there are several
warnings about objects that were allocated from our slabs but were never
freed, a consequence of the transaction that was never committed and got
leaked:

  =============================================================================
  BUG btrfs_delayed_ref_head (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------

  INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   slab_err+0xb7/0xdc
   ? lock_acquired+0x199/0x490
   __kmem_cache_shutdown+0x1ac/0x3c0
   ? lock_release+0x20e/0x4c0
   kmem_cache_destroy+0x55/0x120
   btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  INFO: Object 0x0000000050cbdd61 @offset=12104
  INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
        btrfs_free_tree_block+0x128/0x360 [btrfs]
        __btrfs_cow_block+0x489/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
        btrfs_mount+0x13b/0x3e0 [btrfs]
  INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        commit_cowonly_roots+0xfb/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        sync_filesystem+0x74/0x90
        generic_shutdown_super+0x22/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
  INFO: Object 0x0000000086e9b0ff @offset=12776
  INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
        btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
        alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
        __btrfs_cow_block+0x12d/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
  INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
        commit_cowonly_roots+0x248/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        close_ctree+0x113/0x2fa [btrfs]
        generic_shutdown_super+0x6c/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
  kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   kmem_cache_destroy+0x119/0x120
   btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  =============================================================================
  BUG btrfs_delayed_tree_ref (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------

  INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
  CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   slab_err+0xb7/0xdc
   ? lock_acquired+0x199/0x490
   __kmem_cache_shutdown+0x1ac/0x3c0
   ? lock_release+0x20e/0x4c0
   kmem_cache_destroy+0x55/0x120
   btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  INFO: Object 0x000000001a340018 @offset=4408
  INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
        btrfs_free_tree_block+0x128/0x360 [btrfs]
        __btrfs_cow_block+0x489/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
        btrfs_mount+0x13b/0x3e0 [btrfs]
  INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        btrfs_commit_transaction+0x60/0xc40 [btrfs]
        create_subvol+0x56a/0x990 [btrfs]
        btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
        __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
        btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
        btrfs_ioctl+0x1a92/0x36f0 [btrfs]
        __x64_sys_ioctl+0x83/0xb0
        do_syscall_64+0x33/0x80
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
  INFO: Object 0x000000002b46292a @offset=13648
  INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
        btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
        alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
        __btrfs_cow_block+0x12d/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
  INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        commit_cowonly_roots+0xfb/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        close_ctree+0x113/0x2fa [btrfs]
        generic_shutdown_super+0x6c/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
  kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   kmem_cache_destroy+0x119/0x120
   btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  =============================================================================
  BUG btrfs_delayed_extent_op (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------
  INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
  CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   slab_err+0xb7/0xdc
   ? lock_acquired+0x199/0x490
   __kmem_cache_shutdown+0x1ac/0x3c0
   ? __mutex_unlock_slowpath+0x45/0x2a0
   kmem_cache_destroy+0x55/0x120
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 f5 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  INFO: Object 0x000000004cf95ea8 @offset=6264
  INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
        __slab_alloc.isra.0+0x109/0x1c0
        kmem_cache_alloc+0x7bb/0x830
        btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
        alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
        __btrfs_cow_block+0x12d/0x5f0 [btrfs]
        btrfs_cow_block+0xf7/0x220 [btrfs]
        btrfs_search_slot+0x62a/0xc40 [btrfs]
        btrfs_del_orphan_item+0x65/0xd0 [btrfs]
        btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
        open_ctree+0x125a/0x18a0 [btrfs]
        btrfs_mount_root.cold+0x13/0xed [btrfs]
        legacy_get_tree+0x30/0x60
        vfs_get_tree+0x28/0xe0
        fc_mount+0xe/0x40
        vfs_kern_mount.part.0+0x71/0x90
        btrfs_mount+0x13b/0x3e0 [btrfs]
  INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
        kmem_cache_free+0x34c/0x3c0
        __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
        btrfs_run_delayed_refs+0x81/0x210 [btrfs]
        commit_cowonly_roots+0xfb/0x300 [btrfs]
        btrfs_commit_transaction+0x367/0xc40 [btrfs]
        close_ctree+0x113/0x2fa [btrfs]
        generic_shutdown_super+0x6c/0x100
        kill_anon_super+0x14/0x30
        btrfs_kill_super+0x12/0x20 [btrfs]
        deactivate_locked_super+0x31/0x70
        cleanup_mnt+0x100/0x160
        task_work_run+0x68/0xb0
        exit_to_user_mode_prepare+0x1bb/0x1c0
        syscall_exit_to_user_mode+0x4b/0x260
        entry_SYSCALL_64_after_hwframe+0x44/0xa9
  kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
  CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   dump_stack+0x8d/0xb5
   kmem_cache_destroy+0x119/0x120
   exit_btrfs_fs+0xa/0x59 [btrfs]
   __x64_sys_delete_module+0x194/0x260
   ? fpregs_assert_state_consistent+0x1e/0x40
   ? exit_to_user_mode_prepare+0x55/0x1c0
   ? trace_hardirqs_on+0x1b/0xf0
   do_syscall_64+0x33/0x80
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f693e305897
  Code: 73 01 c3 48 8b 0d f9 (...)
  RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
  RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
  RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
  R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
  R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
  BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1

So fix this by making the remount path wait for the cleaner task before
calling btrfs_commit_super(). The remount path now waits for the bit
BTRFS_FS_CLEANER_RUNNING to be cleared from fs_info->flags before calling
btrfs_commit_super() and this ensures the cleaner can not start a
transaction after that, because it sleeps when the filesystem is in RO
mode and we have already flagged the filesystem as RO before waiting for
BTRFS_FS_CLEANER_RUNNING to be cleared.

This also introduces a new flag BTRFS_FS_STATE_RO to be used for
fs_info->fs_state when the filesystem is in RO mode. This is because we
were doing the RO check using the flags of the superblock and setting the
RO mode simply by ORing into the superblock's flags - those operations are
not atomic and could result in the cleaner not seeing the update from the
remount task after it clears BTRFS_FS_CLEANER_RUNNING.
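
As a rough user-space illustration only (this is not kernel code), the
idea of checking an atomically updated state bit instead of the
superblock flags can be sketched with C11 atomics. All names below
(fs_state_model, MODEL_STATE_RO, MODEL_CLEANER_RUNNING and the model_*
helpers) are hypothetical stand-ins, and the sketch is single-threaded,
so it does not model the actual wait done by wait_on_bit():

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers standing in for BTRFS_FS_STATE_RO and
 * BTRFS_FS_CLEANER_RUNNING. */
enum { MODEL_STATE_RO = 0, MODEL_CLEANER_RUNNING = 1 };

struct fs_state_model {
	atomic_ulong state;	/* models fs_info->fs_state and ->flags */
	bool closing;		/* models btrfs_fs_closing() */
};

static void model_set_bit(struct fs_state_model *fs, int bit)
{
	atomic_fetch_or(&fs->state, 1UL << bit);
}

static void model_clear_bit(struct fs_state_model *fs, int bit)
{
	atomic_fetch_and(&fs->state, ~(1UL << bit));
}

static bool model_test_bit(struct fs_state_model *fs, int bit)
{
	return (atomic_load(&fs->state) >> bit) & 1UL;
}

/* Same shape as btrfs_need_cleaner_sleep() after this patch. */
static bool model_need_cleaner_sleep(struct fs_state_model *fs)
{
	return model_test_bit(fs, MODEL_STATE_RO) || fs->closing;
}

/*
 * One simplified cleaner iteration. Returns true if the cleaner was
 * allowed to do work, i.e. it could have started a transaction.
 */
static bool model_cleaner_iteration(struct fs_state_model *fs)
{
	bool worked = false;

	model_set_bit(fs, MODEL_CLEANER_RUNNING);
	if (!model_need_cleaner_sleep(fs))
		worked = true;	/* real cleaner work would happen here */
	model_clear_bit(fs, MODEL_CLEANER_RUNNING);
	return worked;
}
```

In this model, once MODEL_STATE_RO is set, every later call to
model_cleaner_iteration() returns false and leaves the running bit
cleared, which is the property the remount path relies on after its
wait completes.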

Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/ctree.h   | 20 +++++++++++++++++++-
 fs/btrfs/disk-io.c |  5 ++++-
 fs/btrfs/super.c   | 22 +++++++++++++++++++---
 fs/btrfs/volumes.c |  4 ++--
 4 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3935d297d198..0225c5208f44 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -132,6 +132,8 @@ enum {
 	 * defrag
 	 */
 	BTRFS_FS_STATE_REMOUNTING,
+	/* Filesystem in RO mode */
+	BTRFS_FS_STATE_RO,
 	/* Track if a transaction abort has been reported on this filesystem */
 	BTRFS_FS_STATE_TRANS_ABORTED,
 	/*
@@ -2892,10 +2894,26 @@ static inline int btrfs_fs_closing(struct btrfs_fs_info *fs_info)
  * If we remount the fs to be R/O or umount the fs, the cleaner needn't do
  * anything except sleeping. This function is used to check the status of
  * the fs.
+ * We check for BTRFS_FS_STATE_RO to avoid races with a concurrent remount,
+ * since setting and checking for SB_RDONLY in the superblock's flags is not
+ * atomic.
  */
 static inline int btrfs_need_cleaner_sleep(struct btrfs_fs_info *fs_info)
 {
-	return fs_info->sb->s_flags & SB_RDONLY || btrfs_fs_closing(fs_info);
+	return test_bit(BTRFS_FS_STATE_RO, &fs_info->fs_state) ||
+		btrfs_fs_closing(fs_info);
+}
+
+static inline void btrfs_set_sb_rdonly(struct super_block *sb)
+{
+	sb->s_flags |= SB_RDONLY;
+	set_bit(BTRFS_FS_STATE_RO, &btrfs_sb(sb)->fs_state);
+}
+
+static inline void btrfs_clear_sb_rdonly(struct super_block *sb)
+{
+	sb->s_flags &= ~SB_RDONLY;
+	clear_bit(BTRFS_FS_STATE_RO, &btrfs_sb(sb)->fs_state);
 }
 
 /* tree mod log functions from ctree.c */
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e941cbae3991..e7bcbd0b93ef 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1729,7 +1729,7 @@ static int cleaner_kthread(void *arg)
 		 */
 		btrfs_delete_unused_bgs(fs_info);
 sleep:
-		clear_bit(BTRFS_FS_CLEANER_RUNNING, &fs_info->flags);
+		clear_and_wake_up_bit(BTRFS_FS_CLEANER_RUNNING, &fs_info->flags);
 		if (kthread_should_park())
 			kthread_parkme();
 		if (kthread_should_stop())
@@ -2830,6 +2830,9 @@ static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block
 		return -ENOMEM;
 	btrfs_init_delayed_root(fs_info->delayed_root);
 
+	if (sb_rdonly(sb))
+		set_bit(BTRFS_FS_STATE_RO, &fs_info->fs_state);
+
 	return btrfs_alloc_stripe_hash_table(fs_info);
 }
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b24fa62375e0..38740cc2919f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -175,7 +175,7 @@ void __btrfs_handle_fs_error(struct btrfs_fs_info *fs_info, const char *function
 	btrfs_discard_stop(fs_info);
 
 	/* btrfs handle error by forcing the filesystem readonly */
-	sb->s_flags |= SB_RDONLY;
+	btrfs_set_sb_rdonly(sb);
 	btrfs_info(fs_info, "forced readonly");
 	/*
 	 * Note that a running device replace operation is not canceled here
@@ -1953,7 +1953,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		/* avoid complains from lockdep et al. */
 		up(&fs_info->uuid_tree_rescan_sem);
 
-		sb->s_flags |= SB_RDONLY;
+		btrfs_set_sb_rdonly(sb);
 
 		/*
 		 * Setting SB_RDONLY will put the cleaner thread to
@@ -1964,6 +1964,20 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		 */
 		btrfs_delete_unused_bgs(fs_info);
 
+		/*
+		 * The cleaner task could be already running before we set the
+		 * flag BTRFS_FS_STATE_RO (and SB_RDONLY in the superblock).
+		 * We must make sure that after we finish the remount, i.e. after
+		 * we call btrfs_commit_super(), the cleaner can no longer start
+		 * a transaction - either because it was dropping a dead root,
+		 * running delayed iputs or deleting an unused block group (the
+		 * cleaner picked a block group from the list of unused block
+		 * groups before we were able to in the previous call to
+		 * btrfs_delete_unused_bgs()).
+		 */
+		wait_on_bit(&fs_info->flags, BTRFS_FS_CLEANER_RUNNING,
+			    TASK_UNINTERRUPTIBLE);
+
 		btrfs_dev_replace_suspend_for_unmount(fs_info);
 		btrfs_scrub_cancel(fs_info);
 		btrfs_pause_balance(fs_info);
@@ -2014,7 +2028,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		if (ret)
 			goto restore;
 
-		sb->s_flags &= ~SB_RDONLY;
+		btrfs_clear_sb_rdonly(sb);
 
 		set_bit(BTRFS_FS_OPEN, &fs_info->flags);
 	}
@@ -2036,6 +2050,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 	/* We've hit an error - don't reset SB_RDONLY */
 	if (sb_rdonly(sb))
 		old_flags |= SB_RDONLY;
+	if (!(old_flags & SB_RDONLY))
+		clear_bit(BTRFS_FS_STATE_RO, &fs_info->fs_state);
 	sb->s_flags = old_flags;
 	fs_info->mount_opt = old_opts;
 	fs_info->compress_type = old_compress_type;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7930e1c78c45..2c0aa03b6437 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2593,7 +2593,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
 
 	if (seeding_dev) {
-		sb->s_flags &= ~SB_RDONLY;
+		btrfs_clear_sb_rdonly(sb);
 		ret = btrfs_prepare_sprout(fs_info);
 		if (ret) {
 			btrfs_abort_transaction(trans, ret);
@@ -2729,7 +2729,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 error_trans:
 	if (seeding_dev)
-		sb->s_flags |= SB_RDONLY;
+		btrfs_set_sb_rdonly(sb);
 	if (trans)
 		btrfs_end_transaction(trans);
 error_free_zone:
-- 
2.28.0



* [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount
  2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
                   ` (2 preceding siblings ...)
  2020-12-14 10:10 ` [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task fdmanana
@ 2020-12-14 10:10 ` fdmanana
  2020-12-14 10:10 ` [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them fdmanana
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

Add an assertion to close_ctree(), after destroying all the work queues,
to verify we do not have any transaction still open or committing at
that point. If we have any, it means something is seriously wrong and
that can cause memory leaks and use-after-free problems. This is motivated
by the previous patches that fixed bugs where we ended up leaking an open
transaction after unmounting the filesystem.
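
As a minimal user-space sketch of what such a check amounts to (not the
kernel implementation), the list of transactions can be modelled with a
small circular doubly linked list in the style of the kernel's
list_head. The types and helpers below (list_node, fs_trans_model,
model_can_finish_close) are hypothetical names used only for
illustration:

```c
#include <stdbool.h>

/* Minimal circular doubly linked list, modelled on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_node_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_node_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_node_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_node_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n;
	n->prev = n;
}

/* Hypothetical stand-ins for a transaction and for fs_info. */
struct trans_model {
	struct list_node list;
};

struct fs_trans_model {
	struct list_node trans_list;
};

/* True when no transaction is left, i.e. the assertion would pass. */
static bool model_can_finish_close(const struct fs_trans_model *fs)
{
	return list_node_empty(&fs->trans_list);
}
```

A leaked transaction in this model is simply a trans_model that is
still linked into trans_list when model_can_finish_close() is called.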

Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/disk-io.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e7bcbd0b93ef..a567d578d0c8 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4181,6 +4181,9 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
 	invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
 	btrfs_stop_all_workers(fs_info);
 
+	/* We shouldn't have any transaction open at this point. */
+	ASSERT(list_empty(&fs_info->trans_list));
+
 	clear_bit(BTRFS_FS_OPEN, &fs_info->flags);
 	free_root_pointers(fs_info, true);
 	btrfs_free_fs_roots(fs_info);
-- 
2.28.0



* [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them
  2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
                   ` (3 preceding siblings ...)
  2020-12-14 10:10 ` [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount fdmanana
@ 2020-12-14 10:10 ` fdmanana
  2020-12-17 16:26 ` [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount Josef Bacik
  2020-12-17 18:08 ` David Sterba
  6 siblings, 0 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

When remounting RO, after setting the superblock with the RO flag, the
cleaner task will start sleeping and do nothing, since the call to
btrfs_need_cleaner_sleep() keeps returning 'true'. However, when the
cleaner task goes to sleep, the list of delayed iputs may not be empty.

As long as we are in RO mode, the cleaner task will keep sleeping and
never run the delayed iputs. This means that if a filesystem unmount
is started, we get into close_ctree() with a non-empty list of delayed
iputs, and because the filesystem is in RO mode and is not in an error
state (or a transaction aborted), btrfs_error_commit_super() and
btrfs_commit_super(), which run the delayed iputs, are never called,
and later we fail the assertion that checks if the delayed iputs list
is empty:

  assertion failed: list_empty(&fs_info->delayed_iputs), in fs/btrfs/disk-io.c:4049
  ------------[ cut here ]------------
  kernel BUG at fs/btrfs/ctree.h:3153!
  invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
  CPU: 1 PID: 3780621 Comm: umount Tainted: G             L    5.6.0-rc2-btrfs-next-73 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
  RIP: 0010:assertfail.constprop.0+0x18/0x26 [btrfs]
  Code: 8b 7b 58 48 85 ff 74 (...)
  RSP: 0018:ffffb748c89bbdf8 EFLAGS: 00010246
  RAX: 0000000000000051 RBX: ffff9608f2584000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffffff91998988 RDI: 00000000ffffffff
  RBP: ffff9608f25870d8 R08: 0000000000000000 R09: 0000000000000001
  R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc0cbc500
  R13: ffffffff92411750 R14: 0000000000000000 R15: ffff9608f2aab250
  FS:  00007fcbfaa66c80(0000) GS:ffff960936c80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fffc2c2dd38 CR3: 0000000235e54002 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   close_ctree+0x1a2/0x2e6 [btrfs]
   generic_shutdown_super+0x6c/0x100
   kill_anon_super+0x14/0x30
   btrfs_kill_super+0x12/0x20 [btrfs]
   deactivate_locked_super+0x31/0x70
   cleanup_mnt+0x100/0x160
   task_work_run+0x93/0xc0
   exit_to_usermode_loop+0xf9/0x100
   do_syscall_64+0x20d/0x260
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7fcbfaca6307
  Code: eb 0b 00 f7 d8 64 89 (...)
  RSP: 002b:00007fffc2c2ed68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 0000558203b559b0 RCX: 00007fcbfaca6307
  RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000558203b55bc0
  RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fffc2c2dad0
  R10: 0000558203b55bf0 R11: 0000000000000246 R12: 0000558203b55bc0
  R13: 00007fcbfadcc204 R14: 0000558203b55aa8 R15: 0000000000000000
  Modules linked in: btrfs dm_flakey dm_log_writes (...)
  ---[ end trace d44d303790049ef6 ]---

So fix this by making the remount RO path run any remaining delayed iputs
after waiting for the cleaner to become inactive.
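
The sequence described above can be modelled in user space roughly as
follows. This is only a sketch under simplifying assumptions: the
delayed iputs list is reduced to a counter, and every name in it
(fs_iput_model, model_remount_ro, the 'drain' parameter, and so on) is
hypothetical rather than taken from btrfs:

```c
#include <stdbool.h>
#include <stddef.h>

struct fs_iput_model {
	size_t nr_delayed_iputs;	/* models fs_info->delayed_iputs */
	bool read_only;			/* models the RO state */
};

static void model_run_delayed_iputs(struct fs_iput_model *fs)
{
	fs->nr_delayed_iputs = 0;
}

/* Models the cleaner: it only drains delayed iputs while RW. */
static void model_cleaner_run(struct fs_iput_model *fs)
{
	if (fs->read_only)
		return;	/* cleaner sleeps, the list is left untouched */
	model_run_delayed_iputs(fs);
}

/*
 * Models the remount RO path. 'drain' selects the behaviour with
 * (true) or without (false) the extra run of the delayed iputs.
 */
static void model_remount_ro(struct fs_iput_model *fs, bool drain)
{
	fs->read_only = true;
	if (drain)
		model_run_delayed_iputs(fs);
}

/* Models the check at unmount: true means the assertion passes. */
static bool model_close_ok(const struct fs_iput_model *fs)
{
	return fs->nr_delayed_iputs == 0;
}
```

With drain set to false, pending iputs survive the remount and no later
call to model_cleaner_run() removes them, so model_close_ok() returns
false. With drain set to true the list is already empty by the time the
filesystem is unmounted.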

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/super.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 38740cc2919f..12d7d3be7cd4 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1978,6 +1978,16 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		wait_on_bit(&fs_info->flags, BTRFS_FS_CLEANER_RUNNING,
 			    TASK_UNINTERRUPTIBLE);
 
+		/*
+		 * We've set the superblock to RO mode, so we might have made
+		 * the cleaner task sleep without running all pending delayed
+		 * iputs. Go through all the delayed iputs here, so that if an
+		 * unmount happens without remounting RW we don't end up at
+		 * finishing close_ctree() with a non-empty list of delayed
+		 * iputs.
+		 */
+		btrfs_run_delayed_iputs(fs_info);
+
 		btrfs_dev_replace_suspend_for_unmount(fs_info);
 		btrfs_scrub_cancel(fs_info);
 		btrfs_pause_balance(fs_info);
-- 
2.28.0



* Re: [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount
  2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
                   ` (4 preceding siblings ...)
  2020-12-14 10:10 ` [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them fdmanana
@ 2020-12-17 16:26 ` Josef Bacik
  2020-12-17 18:08 ` David Sterba
  6 siblings, 0 replies; 13+ messages in thread
From: Josef Bacik @ 2020-12-17 16:26 UTC (permalink / raw)
  To: fdmanana, linux-btrfs

On 12/14/20 5:10 AM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> There are some cases where we can leak a transaction and crash during unmount
> after remounting the filesystem in RO mode or mounting RO. These issues were
> actually being hit by automated tests from the openQA for openSUSE Tumbleweed
> (bugzilla https://bugzilla.suse.com/show_bug.cgi?id=1164503).
> 
> Filipe Manana (5):
>    btrfs: fix transaction leak and crash after RO remount caused by
>      qgroup rescan
>    btrfs: fix transaction leak and crash after cleaning up orphans on RO
>      mount
>    btrfs: fix race between RO remount and the cleaner task
>    btrfs: add assertion for empty list of transactions at late stage of
>      umount
>    btrfs: run delayed iputs when remounting RO to avoid leaking them
> 

You can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef


* Re: [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan
  2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
@ 2020-12-17 17:44   ` David Sterba
  2020-12-17 18:21     ` Filipe Manana
  0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2020-12-17 17:44 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Mon, Dec 14, 2020 at 10:10:45AM +0000, fdmanana@kernel.org wrote:
> +static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
> +{
> +	return btrfs_fs_closing(fs_info) ||
> +		test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
> +}
> +
>  static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>  {
>  	struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
> @@ -3198,6 +3204,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>  	struct btrfs_trans_handle *trans = NULL;
>  	int err = -ENOMEM;
>  	int ret = 0;
> +	bool stopped = false;
>  
>  	path = btrfs_alloc_path();
>  	if (!path)
> @@ -3210,7 +3217,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>  	path->skip_locking = 1;
>  
>  	err = 0;
> -	while (!err && !btrfs_fs_closing(fs_info)) {
> +	while (!err && !(stopped = rescan_should_stop(fs_info))) {
>  		trans = btrfs_start_transaction(fs_info->fs_root, 0);
>  		if (IS_ERR(trans)) {
>  			err = PTR_ERR(trans);
> @@ -3253,7 +3260,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>  	}
>  
>  	mutex_lock(&fs_info->qgroup_rescan_lock);
> -	if (!btrfs_fs_closing(fs_info))
> +	if (!stopped)
>  		fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
>  	if (trans) {
>  		ret = update_qgroup_status_item(trans);
> @@ -3272,7 +3279,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>  
>  	btrfs_end_transaction(trans);
>  
> -	if (btrfs_fs_closing(fs_info)) {
> +	if (stopped) {

Thinking aloud, this is slightly different as it uses the cached status
of fs_closing but there is mutex lock/unlock or transaction start/end
between the checks so the status could change.

But as the flow goes, we want to get fresh status in the while loop.
Once it stops because of the fs_closing or remount request, the
following code does the qgroup status update, wakeups, even though this
means one more transaction. Remount needs to sync anyway and this should
be no problem.


* Re: [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount
  2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
                   ` (5 preceding siblings ...)
  2020-12-17 16:26 ` [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount Josef Bacik
@ 2020-12-17 18:08 ` David Sterba
  6 siblings, 0 replies; 13+ messages in thread
From: David Sterba @ 2020-12-17 18:08 UTC (permalink / raw)
  To: fdmanana; +Cc: linux-btrfs

On Mon, Dec 14, 2020 at 10:10:44AM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> There are some cases where we can leak a transaction and crash during unmount
> after remounting the filesystem in RO mode or mounting RO. These issues were
> actually being hit by automated tests from the openQA for openSUSE Tumbleweed
> (bugzilla https://bugzilla.suse.com/show_bug.cgi?id=1164503).
> 
> Filipe Manana (5):
>   btrfs: fix transaction leak and crash after RO remount caused by
>     qgroup rescan
>   btrfs: fix transaction leak and crash after cleaning up orphans on RO
>     mount
>   btrfs: fix race between RO remount and the cleaner task
>   btrfs: add assertion for empty list of transactions at late stage of
>     umount
>   btrfs: run delayed iputs when remounting RO to avoid leaking them

Added to misc-next, thanks.


* Re: [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan
  2020-12-17 17:44   ` David Sterba
@ 2020-12-17 18:21     ` Filipe Manana
  0 siblings, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2020-12-17 18:21 UTC (permalink / raw)
  To: dsterba, Filipe Manana, linux-btrfs

On Thu, Dec 17, 2020 at 5:45 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Mon, Dec 14, 2020 at 10:10:45AM +0000, fdmanana@kernel.org wrote:
> > +static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
> > +{
> > +     return btrfs_fs_closing(fs_info) ||
> > +             test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
> > +}
> > +
> >  static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >  {
> >       struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
> > @@ -3198,6 +3204,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >       struct btrfs_trans_handle *trans = NULL;
> >       int err = -ENOMEM;
> >       int ret = 0;
> > +     bool stopped = false;
> >
> >       path = btrfs_alloc_path();
> >       if (!path)
> > @@ -3210,7 +3217,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >       path->skip_locking = 1;
> >
> >       err = 0;
> > -     while (!err && !btrfs_fs_closing(fs_info)) {
> > +     while (!err && !(stopped = rescan_should_stop(fs_info))) {
> >               trans = btrfs_start_transaction(fs_info->fs_root, 0);
> >               if (IS_ERR(trans)) {
> >                       err = PTR_ERR(trans);
> > @@ -3253,7 +3260,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >       }
> >
> >       mutex_lock(&fs_info->qgroup_rescan_lock);
> > -     if (!btrfs_fs_closing(fs_info))
> > +     if (!stopped)
> >               fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
> >       if (trans) {
> >               ret = update_qgroup_status_item(trans);
> > @@ -3272,7 +3279,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >
> >       btrfs_end_transaction(trans);
> >
> > -     if (btrfs_fs_closing(fs_info)) {
> > +     if (stopped) {
>
> Thinking aloud, this is slightly different as it uses the cached status
> of fs_closing but there is mutex lock/unlock or transaction start/end
> between the checks so the status could change.
>
> But as the flow goes, we want to get fresh status in the while loop.
> Once it stops because of the fs_closing or remount request, the
> following code does the qgroup status update, wakeups, even though this
> means one more transaction. Remount needs to sync anyway and this should
> be no problem.

Yes, that and the fact that the rescan calls
complete_all(&fs_info->qgroup_rescan_completion) before it logs the
reason why it finished.

So it would be possible for remount to stop it, then remount
completes, and then the rescan worker logs that an error happened
instead of logging that it was stopped - it's a very big stretch for
that to happen, but an error message would be confusing from a user's
perspective at least.
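
To illustrate the point about remembering why the loop ended, here is a
small user-space sketch. It is a simplification, not the rescan worker
itself: the work is reduced to a counter, errors are not modelled, and
rescan_model, model_rescan(), the after_loop hook and
model_clear_remounting() are hypothetical names:

```c
#include <stdbool.h>
#include <stddef.h>

struct rescan_model {
	bool closing;		/* models btrfs_fs_closing() */
	bool remounting;	/* models BTRFS_FS_STATE_REMOUNTING */
	int work_left;		/* hypothetical: remaining rescan work */
};

enum rescan_result { RESCAN_COMPLETED, RESCAN_STOPPED };

/* Same shape as rescan_should_stop() in the patch. */
static bool model_should_stop(const struct rescan_model *m)
{
	return m->closing || m->remounting;
}

/* Example hook: the remount finishes right after the loop ends. */
static void model_clear_remounting(struct rescan_model *m)
{
	m->remounting = false;
}

/*
 * 'after_loop' is a hypothetical hook that lets a caller change the
 * flags between the loop and the final report, as a concurrent
 * remount could.
 */
static enum rescan_result model_rescan(struct rescan_model *m,
				       void (*after_loop)(struct rescan_model *))
{
	bool stopped = false;

	/* Fresh status is read on every iteration of the loop. */
	while (m->work_left > 0 && !(stopped = model_should_stop(m)))
		m->work_left--;

	if (after_loop)
		after_loop(m);

	/* Report why the loop ended, not what the flags say now. */
	return stopped ? RESCAN_STOPPED : RESCAN_COMPLETED;
}
```

If the final report re-read the flags instead of using the saved value,
the first case (remount clears its flag right after the loop) would be
reported as something other than a stop, which is the confusing
message mentioned above.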


* Re: [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
  2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
@ 2021-03-16  6:44   ` robbieko
  2021-03-16 11:43     ` Filipe Manana
  0 siblings, 1 reply; 13+ messages in thread
From: robbieko @ 2021-03-16  6:44 UTC (permalink / raw)
  To: fdmanana, linux-btrfs

Hi All,

The patch delays the call to btrfs_find_orphan_roots(), moving it to
after the orphan cleanup with tree_root.
I think this will cause all orphan items to be deleted
during the orphan cleanup with tree_root.
Afterwards, btrfs_find_orphan_roots() cannot find
the subvolume being deleted.

Is my suspicion correct?

Thanks.

Robbie Ko


On 2020/12/14 at 6:10 PM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When we delete a root (subvolume or snapshot), at the very end of the
> operation, we attempt to remove the root's orphan item from the root tree,
> at btrfs_drop_snapshot(), by calling btrfs_del_orphan_item(). We ignore any
> error from btrfs_del_orphan_item() since it is not a serious problem and
> the next time the filesystem is mounted we remove such stray orphan items
> at btrfs_find_orphan_roots().
>
> However if the filesystem is mounted RO and we have stray orphan items for
> any previously deleted root, we can end up leaking a transaction and other
> data structures when unmounting the filesystem, as well as crashing if we
> do not have hardware acceleration for crc32c available.
>
> The steps that lead to the transaction leak are the following:
>
> 1) The filesystem is mounted in RW mode;
>
> 2) A subvolume is deleted;
>
> 3) When the cleaner kthread runs btrfs_drop_snapshot() to delete the root,
>     it gets a failure at btrfs_del_orphan_item(), which is ignored, due to
>     a -ENOMEM when allocating a path for example. So the orphan item for
>     the root remains in the root tree;
>
> 4) The filesystem is unmounted;
>
> 5) The filesystem is mounted RO (-o ro). During the mount path we call
>     btrfs_find_orphan_roots(), which iterates the root tree searching for
>     orphan items. It finds the orphan item for our deleted root, and since
>     it can not find the root, it starts a transaction to delete the orphan
>     item (by calling btrfs_del_orphan_item());
>
> 6) The RO mount completes;
>
> 7) Before the transaction kthread commits the transaction created for
>     deleting the orphan item (i.e. less than 30 seconds elapsed since the
>     mount, the default commit interval), a filesystem unmount operation is
>     started;
>
> 8) At close_ctree(), we stop the transaction kthread, but we still have a
>     transaction open with at least one dirty extent buffer, a leaf for the
>     tree root which was COWed when deleting the orphan item;
>
> 9) We then proceed to destroy the work queues, free the roots and block
>     groups, etc. After that we drop the last reference on the btree inode by
>     calling iput() on it. Since there are dirty pages for the btree inode,
>     corresponding to the COWed extent buffer, btree_write_cache_pages() is
>     invoked to flush those dirty pages. This results in creating a bio and
>     submitting it, which makes us end up at btrfs_submit_metadata_bio();
>
> 10) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
>      that calls btrfs_wq_submit_bio(), because check_async_write() returned
>      a value of 1. This value of 1 is because we did not have hardware
>      acceleration available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not
>      set in fs_info->flags;
>
> 11) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
>      workqueue at fs_info->workers, which was already freed before by the
>      call to btrfs_stop_all_workers() at close_ctree(). This results in an
>      invalid memory access due to a use-after-free, leading to a crash.
>
> When this happens, before the crash there are several warnings triggered,
> since we have reserved metadata space in a block group, the delayed refs
> reservation, etc:
>
>   ------------[ cut here ]------------
>   WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
>   Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
>   CPU: 4 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
>   Code: f0 01 00 00 48 39 c2 75 (...)
>   RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
>   RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
>   RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
>   RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
>   R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
>   FS:  00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>    btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
>    close_ctree+0x2ba/0x2fa [btrfs]
>    generic_shutdown_super+0x6c/0x100
>    kill_anon_super+0x14/0x30
>    btrfs_kill_super+0x12/0x20 [btrfs]
>    deactivate_locked_super+0x31/0x70
>    cleanup_mnt+0x100/0x160
>    task_work_run+0x68/0xb0
>    exit_to_user_mode_prepare+0x1bb/0x1c0
>    syscall_exit_to_user_mode+0x4b/0x260
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f15ee221ee7
>   Code: ff 0b 00 f7 d8 64 89 01 48 (...)
>   RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>   RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
>   RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
>   RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
>   R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
>   R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
>   irq event stamp: 0
>   hardirqs last  enabled at (0): [<0000000000000000>] 0x0
>   hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
>   softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
>   softirqs last disabled at (0): [<0000000000000000>] 0x0
>   ---[ end trace dd74718fef1ed5c6 ]---
>   ------------[ cut here ]------------
>   WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
>   Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
>   CPU: 2 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
>   Code: 48 83 bb b0 03 00 00 00 (...)
>   RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
>   RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
>   RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
>   RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
>   R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
>   FS:  00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>    btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
>    close_ctree+0x2ba/0x2fa [btrfs]
>    generic_shutdown_super+0x6c/0x100
>    kill_anon_super+0x14/0x30
>    btrfs_kill_super+0x12/0x20 [btrfs]
>    deactivate_locked_super+0x31/0x70
>    cleanup_mnt+0x100/0x160
>    task_work_run+0x68/0xb0
>    exit_to_user_mode_prepare+0x1bb/0x1c0
>    syscall_exit_to_user_mode+0x4b/0x260
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f15ee221ee7
>   Code: ff 0b 00 f7 d8 64 89 01 (...)
>   RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>   RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
>   RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
>   RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
>   R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
>   R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
>   irq event stamp: 0
>   hardirqs last  enabled at (0): [<0000000000000000>] 0x0
>   hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
>   softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
>   softirqs last disabled at (0): [<0000000000000000>] 0x0
>   ---[ end trace dd74718fef1ed5c7 ]---
>   ------------[ cut here ]------------
>   WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
>   Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
>   CPU: 5 PID: 1729896 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
>   Code: ad de 49 be 22 01 00 (...)
>   RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
>   RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
>   RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
>   RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
>   R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
>   FS:  00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>    close_ctree+0x2ba/0x2fa [btrfs]
>    generic_shutdown_super+0x6c/0x100
>    kill_anon_super+0x14/0x30
>    btrfs_kill_super+0x12/0x20 [btrfs]
>    deactivate_locked_super+0x31/0x70
>    cleanup_mnt+0x100/0x160
>    task_work_run+0x68/0xb0
>    exit_to_user_mode_prepare+0x1bb/0x1c0
>    syscall_exit_to_user_mode+0x4b/0x260
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f15ee221ee7
>   Code: ff 0b 00 f7 d8 64 89 (...)
>   RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>   RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
>   RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
>   RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
>   R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
>   R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
>   irq event stamp: 0
>   hardirqs last  enabled at (0): [<0000000000000000>] 0x0
>   hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
>   softirqs last  enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
>   softirqs last disabled at (0): [<0000000000000000>] 0x0
>   ---[ end trace dd74718fef1ed5c8 ]---
>   BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
>   BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
>   BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
>   BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
>   BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
>   BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
>   BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0
>
> And the crash, which only happens when we do not have crc32c hardware
> acceleration, produces the following trace immediately after those
> warnings:
>
>   stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
>   CPU: 2 PID: 1749129 Comm: umount Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
>   Code: 54 55 53 48 89 f3 (...)
>   RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
>   RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
>   RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
>   RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
>   R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
>   FS:  00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   Call Trace:
>    btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
>    btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
>    submit_one_bio+0x61/0x70 [btrfs]
>    btree_write_cache_pages+0x414/0x450 [btrfs]
>    ? kobject_put+0x9a/0x1d0
>    ? trace_hardirqs_on+0x1b/0xf0
>    ? _raw_spin_unlock_irqrestore+0x3c/0x60
>    ? free_debug_processing+0x1e1/0x2b0
>    do_writepages+0x43/0xe0
>    ? lock_acquired+0x199/0x490
>    __writeback_single_inode+0x59/0x650
>    writeback_single_inode+0xaf/0x120
>    write_inode_now+0x94/0xd0
>    iput+0x187/0x2b0
>    close_ctree+0x2c6/0x2fa [btrfs]
>    generic_shutdown_super+0x6c/0x100
>    kill_anon_super+0x14/0x30
>    btrfs_kill_super+0x12/0x20 [btrfs]
>    deactivate_locked_super+0x31/0x70
>    cleanup_mnt+0x100/0x160
>    task_work_run+0x68/0xb0
>    exit_to_user_mode_prepare+0x1bb/0x1c0
>    syscall_exit_to_user_mode+0x4b/0x260
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f3cfebabee7
>   Code: ff 0b 00 f7 d8 64 89 01 (...)
>   RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
>   RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
>   RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
>   RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
>   R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
>   R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
>   Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
>   ---[ end trace dd74718fef1ed5cc ]---
>
> Finally, when we remove the btrfs module (rmmod btrfs), there are several
> warnings about objects that were allocated from our slabs but were never
> freed, a consequence of the transaction that was never committed and got
> leaked:
>   =============================================================================
>   BUG btrfs_delayed_ref_head (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
>   -----------------------------------------------------------------------------
>
>   INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
>   CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x8d/0xb5
>    slab_err+0xb7/0xdc
>    ? lock_acquired+0x199/0x490
>    __kmem_cache_shutdown+0x1ac/0x3c0
>    ? lock_release+0x20e/0x4c0
>    kmem_cache_destroy+0x55/0x120
>    btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
>    exit_btrfs_fs+0xa/0x59 [btrfs]
>    __x64_sys_delete_module+0x194/0x260
>    ? fpregs_assert_state_consistent+0x1e/0x40
>    ? exit_to_user_mode_prepare+0x55/0x1c0
>    ? trace_hardirqs_on+0x1b/0xf0
>    do_syscall_64+0x33/0x80
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f693e305897
>   Code: 73 01 c3 48 8b 0d f9 f5 (...)
>   RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>   RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
>   RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
>   R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
>   INFO: Object 0x0000000050cbdd61 @offset=12104
>   INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
>          __slab_alloc.isra.0+0x109/0x1c0
>          kmem_cache_alloc+0x7bb/0x830
>          btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
>          btrfs_free_tree_block+0x128/0x360 [btrfs]
>          __btrfs_cow_block+0x489/0x5f0 [btrfs]
>          btrfs_cow_block+0xf7/0x220 [btrfs]
>          btrfs_search_slot+0x62a/0xc40 [btrfs]
>          btrfs_del_orphan_item+0x65/0xd0 [btrfs]
>          btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
>          open_ctree+0x125a/0x18a0 [btrfs]
>          btrfs_mount_root.cold+0x13/0xed [btrfs]
>          legacy_get_tree+0x30/0x60
>          vfs_get_tree+0x28/0xe0
>          fc_mount+0xe/0x40
>          vfs_kern_mount.part.0+0x71/0x90
>          btrfs_mount+0x13b/0x3e0 [btrfs]
>   INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
>          kmem_cache_free+0x34c/0x3c0
>          __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
>          btrfs_run_delayed_refs+0x81/0x210 [btrfs]
>          commit_cowonly_roots+0xfb/0x300 [btrfs]
>          btrfs_commit_transaction+0x367/0xc40 [btrfs]
>          sync_filesystem+0x74/0x90
>          generic_shutdown_super+0x22/0x100
>          kill_anon_super+0x14/0x30
>          btrfs_kill_super+0x12/0x20 [btrfs]
>          deactivate_locked_super+0x31/0x70
>          cleanup_mnt+0x100/0x160
>          task_work_run+0x68/0xb0
>          exit_to_user_mode_prepare+0x1bb/0x1c0
>          syscall_exit_to_user_mode+0x4b/0x260
>          entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   INFO: Object 0x0000000086e9b0ff @offset=12776
>   INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
>          __slab_alloc.isra.0+0x109/0x1c0
>          kmem_cache_alloc+0x7bb/0x830
>          btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
>          btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
>          alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
>          __btrfs_cow_block+0x12d/0x5f0 [btrfs]
>          btrfs_cow_block+0xf7/0x220 [btrfs]
>          btrfs_search_slot+0x62a/0xc40 [btrfs]
>          btrfs_del_orphan_item+0x65/0xd0 [btrfs]
>          btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
>          open_ctree+0x125a/0x18a0 [btrfs]
>          btrfs_mount_root.cold+0x13/0xed [btrfs]
>          legacy_get_tree+0x30/0x60
>          vfs_get_tree+0x28/0xe0
>          fc_mount+0xe/0x40
>          vfs_kern_mount.part.0+0x71/0x90
>   INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
>          kmem_cache_free+0x34c/0x3c0
>          __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
>          btrfs_run_delayed_refs+0x81/0x210 [btrfs]
>          btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
>          commit_cowonly_roots+0x248/0x300 [btrfs]
>          btrfs_commit_transaction+0x367/0xc40 [btrfs]
>          close_ctree+0x113/0x2fa [btrfs]
>          generic_shutdown_super+0x6c/0x100
>          kill_anon_super+0x14/0x30
>          btrfs_kill_super+0x12/0x20 [btrfs]
>          deactivate_locked_super+0x31/0x70
>          cleanup_mnt+0x100/0x160
>          task_work_run+0x68/0xb0
>          exit_to_user_mode_prepare+0x1bb/0x1c0
>          syscall_exit_to_user_mode+0x4b/0x260
>          entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
>   CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x8d/0xb5
>    kmem_cache_destroy+0x119/0x120
>    btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
>    exit_btrfs_fs+0xa/0x59 [btrfs]
>    __x64_sys_delete_module+0x194/0x260
>    ? fpregs_assert_state_consistent+0x1e/0x40
>    ? exit_to_user_mode_prepare+0x55/0x1c0
>    ? trace_hardirqs_on+0x1b/0xf0
>    do_syscall_64+0x33/0x80
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f693e305897
>   Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
>   RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>   RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
>   RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
>   R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
>   =============================================================================
>   BUG btrfs_delayed_tree_ref (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
>   -----------------------------------------------------------------------------
>
>   INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
>   CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x8d/0xb5
>    slab_err+0xb7/0xdc
>    ? lock_acquired+0x199/0x490
>    __kmem_cache_shutdown+0x1ac/0x3c0
>    ? lock_release+0x20e/0x4c0
>    kmem_cache_destroy+0x55/0x120
>    btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
>    exit_btrfs_fs+0xa/0x59 [btrfs]
>    __x64_sys_delete_module+0x194/0x260
>    ? fpregs_assert_state_consistent+0x1e/0x40
>    ? exit_to_user_mode_prepare+0x55/0x1c0
>    ? trace_hardirqs_on+0x1b/0xf0
>    do_syscall_64+0x33/0x80
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f693e305897
>   Code: 73 01 c3 48 8b 0d f9 f5 (...)
>   RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>   RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
>   RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
>   R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
>   INFO: Object 0x000000001a340018 @offset=4408
>   INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
>          __slab_alloc.isra.0+0x109/0x1c0
>          kmem_cache_alloc+0x7bb/0x830
>          btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
>          btrfs_free_tree_block+0x128/0x360 [btrfs]
>          __btrfs_cow_block+0x489/0x5f0 [btrfs]
>          btrfs_cow_block+0xf7/0x220 [btrfs]
>          btrfs_search_slot+0x62a/0xc40 [btrfs]
>          btrfs_del_orphan_item+0x65/0xd0 [btrfs]
>          btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
>          open_ctree+0x125a/0x18a0 [btrfs]
>          btrfs_mount_root.cold+0x13/0xed [btrfs]
>          legacy_get_tree+0x30/0x60
>          vfs_get_tree+0x28/0xe0
>          fc_mount+0xe/0x40
>          vfs_kern_mount.part.0+0x71/0x90
>          btrfs_mount+0x13b/0x3e0 [btrfs]
>   INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
>          kmem_cache_free+0x34c/0x3c0
>          __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
>          btrfs_run_delayed_refs+0x81/0x210 [btrfs]
>          btrfs_commit_transaction+0x60/0xc40 [btrfs]
>          create_subvol+0x56a/0x990 [btrfs]
>          btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
>          __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
>          btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
>          btrfs_ioctl+0x1a92/0x36f0 [btrfs]
>          __x64_sys_ioctl+0x83/0xb0
>          do_syscall_64+0x33/0x80
>          entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   INFO: Object 0x000000002b46292a @offset=13648
>   INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
>          __slab_alloc.isra.0+0x109/0x1c0
>          kmem_cache_alloc+0x7bb/0x830
>          btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
>          btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
>          alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
>          __btrfs_cow_block+0x12d/0x5f0 [btrfs]
>          btrfs_cow_block+0xf7/0x220 [btrfs]
>          btrfs_search_slot+0x62a/0xc40 [btrfs]
>          btrfs_del_orphan_item+0x65/0xd0 [btrfs]
>          btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
>          open_ctree+0x125a/0x18a0 [btrfs]
>          btrfs_mount_root.cold+0x13/0xed [btrfs]
>          legacy_get_tree+0x30/0x60
>          vfs_get_tree+0x28/0xe0
>          fc_mount+0xe/0x40
>          vfs_kern_mount.part.0+0x71/0x90
>   INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
>          kmem_cache_free+0x34c/0x3c0
>          __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
>          btrfs_run_delayed_refs+0x81/0x210 [btrfs]
>          commit_cowonly_roots+0xfb/0x300 [btrfs]
>          btrfs_commit_transaction+0x367/0xc40 [btrfs]
>          close_ctree+0x113/0x2fa [btrfs]
>          generic_shutdown_super+0x6c/0x100
>          kill_anon_super+0x14/0x30
>          btrfs_kill_super+0x12/0x20 [btrfs]
>          deactivate_locked_super+0x31/0x70
>          cleanup_mnt+0x100/0x160
>          task_work_run+0x68/0xb0
>          exit_to_user_mode_prepare+0x1bb/0x1c0
>          syscall_exit_to_user_mode+0x4b/0x260
>          entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
>   CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x8d/0xb5
>    kmem_cache_destroy+0x119/0x120
>    btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
>    exit_btrfs_fs+0xa/0x59 [btrfs]
>    __x64_sys_delete_module+0x194/0x260
>    ? fpregs_assert_state_consistent+0x1e/0x40
>    ? exit_to_user_mode_prepare+0x55/0x1c0
>    ? trace_hardirqs_on+0x1b/0xf0
>    do_syscall_64+0x33/0x80
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f693e305897
>   Code: 73 01 c3 48 8b 0d f9 f5 (...)
>   RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>   RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
>   RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
>   R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
>   =============================================================================
>   BUG btrfs_delayed_extent_op (Tainted: G    B   W        ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
>   -----------------------------------------------------------------------------
>
>   INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
>   CPU: 5 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x8d/0xb5
>    slab_err+0xb7/0xdc
>    ? lock_acquired+0x199/0x490
>    __kmem_cache_shutdown+0x1ac/0x3c0
>    ? __mutex_unlock_slowpath+0x45/0x2a0
>    kmem_cache_destroy+0x55/0x120
>    exit_btrfs_fs+0xa/0x59 [btrfs]
>    __x64_sys_delete_module+0x194/0x260
>    ? fpregs_assert_state_consistent+0x1e/0x40
>    ? exit_to_user_mode_prepare+0x55/0x1c0
>    ? trace_hardirqs_on+0x1b/0xf0
>    do_syscall_64+0x33/0x80
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f693e305897
>   Code: 73 01 c3 48 8b 0d f9 f5 (...)
>   RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>   RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
>   RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
>   R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
>   INFO: Object 0x000000004cf95ea8 @offset=6264
>   INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
>          __slab_alloc.isra.0+0x109/0x1c0
>          kmem_cache_alloc+0x7bb/0x830
>          btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
>          alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
>          __btrfs_cow_block+0x12d/0x5f0 [btrfs]
>          btrfs_cow_block+0xf7/0x220 [btrfs]
>          btrfs_search_slot+0x62a/0xc40 [btrfs]
>          btrfs_del_orphan_item+0x65/0xd0 [btrfs]
>          btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
>          open_ctree+0x125a/0x18a0 [btrfs]
>          btrfs_mount_root.cold+0x13/0xed [btrfs]
>          legacy_get_tree+0x30/0x60
>          vfs_get_tree+0x28/0xe0
>          fc_mount+0xe/0x40
>          vfs_kern_mount.part.0+0x71/0x90
>          btrfs_mount+0x13b/0x3e0 [btrfs]
>   INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
>          kmem_cache_free+0x34c/0x3c0
>          __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
>          btrfs_run_delayed_refs+0x81/0x210 [btrfs]
>          commit_cowonly_roots+0xfb/0x300 [btrfs]
>          btrfs_commit_transaction+0x367/0xc40 [btrfs]
>          close_ctree+0x113/0x2fa [btrfs]
>          generic_shutdown_super+0x6c/0x100
>          kill_anon_super+0x14/0x30
>          btrfs_kill_super+0x12/0x20 [btrfs]
>          deactivate_locked_super+0x31/0x70
>          cleanup_mnt+0x100/0x160
>          task_work_run+0x68/0xb0
>          exit_to_user_mode_prepare+0x1bb/0x1c0
>          syscall_exit_to_user_mode+0x4b/0x260
>          entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
>   CPU: 3 PID: 1729921 Comm: rmmod Tainted: G    B   W         5.10.0-rc4-btrfs-next-73 #1
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    dump_stack+0x8d/0xb5
>    kmem_cache_destroy+0x119/0x120
>    exit_btrfs_fs+0xa/0x59 [btrfs]
>    __x64_sys_delete_module+0x194/0x260
>    ? fpregs_assert_state_consistent+0x1e/0x40
>    ? exit_to_user_mode_prepare+0x55/0x1c0
>    ? trace_hardirqs_on+0x1b/0xf0
>    do_syscall_64+0x33/0x80
>    entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   RIP: 0033:0x7f693e305897
>   Code: 73 01 c3 48 8b 0d f9 (...)
>   RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>   RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
>   RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
>   RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
>   R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
>   R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
>   BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1
>
> So fix this by calling btrfs_find_orphan_roots() in the mount path only if
> we are mounting the filesystem in RW mode. It's pointless to have it called
> for RO mounts anyway, since despite adding any deleted roots to the list of
> dead roots, we will never have the roots deleted until the filesystem is
> remounted in RW mode, as the cleaner kthread does nothing when we are
> mounted in RO - btrfs_need_cleaner_sleep() always returns true and the
> cleaner spends all time sleeping, never cleaning dead roots.
>
> This is accomplished by moving the call to btrfs_find_orphan_roots() from
> open_ctree() to btrfs_start_pre_rw_mount(), which also guarantees that
> if later the filesystem is remounted RW, we populate the list of dead
> roots and have the cleaner task delete the dead roots.
>
> Tested-by: Fabian Vogt <fvogt@suse.com>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>   fs/btrfs/disk-io.c | 5 +----
>   1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 765deefda92b..e941cbae3991 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2969,6 +2969,7 @@ int btrfs_start_pre_rw_mount(struct btrfs_fs_info *fs_info)
>   		}
>   	}
>   
> +	ret = btrfs_find_orphan_roots(fs_info);
>   out:
>   	return ret;
>   }
> @@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>   		}
>   	}
>   
> -	ret = btrfs_find_orphan_roots(fs_info);
> -	if (ret)
> -		goto fail_qgroup;
> -
>   	fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
>   	if (IS_ERR(fs_info->fs_root)) {
>   		err = PTR_ERR(fs_info->fs_root);


* Re: [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
  2021-03-16  6:44   ` robbieko
@ 2021-03-16 11:43     ` Filipe Manana
  2021-03-16 16:56       ` Filipe Manana
  0 siblings, 1 reply; 13+ messages in thread
From: Filipe Manana @ 2021-03-16 11:43 UTC (permalink / raw)
  To: robbieko; +Cc: linux-btrfs

On Tue, Mar 16, 2021 at 6:49 AM robbieko <robbieko@synology.com> wrote:
>
> Hi All,
>
> The patch delayed find orphan roots.
> Move to after orphan cleanup with tree_root.
> I think this will cause all orphan items to be deleted
> when orphan cleanup with tree_root.
> Afterwards, find orphan roots cannot find
> the subvolume being deleted.

Not entirely able to parse what you are trying to say.

I suppose your concern is that the call to:

btrfs_orphan_cleanup(fs_info->tree_root)

which now happens before calling btrfs_find_orphan_roots(), results in
the orphans for roots being accidentally deleted and therefore cause
no root deletions to happen later?
If that's your concern, then it does not happen because
btrfs_orphan_cleanup() skips deletion of orphan items for deleted
roots.

I've just created a test case to verify it's correct, for RW mounts,
RO mounts and remounts from RO to RW:

https://pastebin.com/raw/zSZjgn48

I couldn't find any regression.

Thanks.


> >   out:
> >       return ret;
> >   }
> > @@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
> >               }
> >       }
> >
> > -     ret = btrfs_find_orphan_roots(fs_info);
> > -     if (ret)
> > -             goto fail_qgroup;
> > -
> >       fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
> >       if (IS_ERR(fs_info->fs_root)) {
> >               err = PTR_ERR(fs_info->fs_root);


* Re: [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
  2021-03-16 11:43     ` Filipe Manana
@ 2021-03-16 16:56       ` Filipe Manana
  0 siblings, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2021-03-16 16:56 UTC (permalink / raw)
  To: robbieko; +Cc: linux-btrfs

On Tue, Mar 16, 2021 at 11:43 AM Filipe Manana <fdmanana@kernel.org> wrote:
>
> On Tue, Mar 16, 2021 at 6:49 AM robbieko <robbieko@synology.com> wrote:
> >
> > Hi All,
> >
> > The patch delayed find orphan roots.
> > Move to after orphan cleanup with tree_root.
> > I think this will cause all orphan items to be deleted
> > when orphan cleanup with tree_root.
> > Afterwards, find orphan roots cannot find
> > the subvolume being deleted.
>
> Not entirely able to parse what you are trying to say.
>
> I suppose your concern is that the call to:
>
> btrfs_orphan_cleanup(fs_info->tree_root)
>
> which now happens before calling btrfs_find_orphan_roots(), results in
> the orphans for roots being accidentally deleted and therefore cause
> no root deletions to happen later?
> If that's your concern, then it does not happen because
> btrfs_orphan_cleanup() skips deletion of orphan items for deleted
> roots.
>
> I've just created a test case to verify it's correct, for RW mounts,
> RO mounts and remounts from RO to RW:
>
> https://pastebin.com/raw/zSZjgn48
>
> I couldn't find any regression.

Ok, I figured out what you meant: the test was only checking that the
orphan items were deleted, not that the btree was deleted.
I just sent a fix and an updated test case.

Thanks for the report.


>
> Thanks.
>
>
> > >   out:
> > >       return ret;
> > >   }
> > > @@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
> > >               }
> > >       }
> > >
> > > -     ret = btrfs_find_orphan_roots(fs_info);
> > > -     if (ret)
> > > -             goto fail_qgroup;
> > > -
> > >       fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
> > >       if (IS_ERR(fs_info->fs_root)) {
> > >               err = PTR_ERR(fs_info->fs_root);


end of thread, other threads:[~2021-03-16 16:57 UTC | newest]

Thread overview: 13+ messages
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
2020-12-17 17:44   ` David Sterba
2020-12-17 18:21     ` Filipe Manana
2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
2021-03-16  6:44   ` robbieko
2021-03-16 11:43     ` Filipe Manana
2021-03-16 16:56       ` Filipe Manana
2020-12-14 10:10 ` [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task fdmanana
2020-12-14 10:10 ` [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount fdmanana
2020-12-14 10:10 ` [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them fdmanana
2020-12-17 16:26 ` [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount Josef Bacik
2020-12-17 18:08 ` David Sterba
