* [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount
@ 2020-12-14 10:10 fdmanana
2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
` (6 more replies)
0 siblings, 7 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
There are some cases where we can leak a transaction and crash during unmount
after remounting the filesystem in RO mode or after mounting it RO. These
issues were being hit by the openQA automated tests for openSUSE Tumbleweed
(bugzilla https://bugzilla.suse.com/show_bug.cgi?id=1164503).
Filipe Manana (5):
  btrfs: fix transaction leak and crash after RO remount caused by
    qgroup rescan
  btrfs: fix transaction leak and crash after cleaning up orphans on RO
    mount
  btrfs: fix race between RO remount and the cleaner task
  btrfs: add assertion for empty list of transactions at late stage of
    umount
  btrfs: run delayed iputs when remounting RO to avoid leaking them
fs/btrfs/ctree.h | 20 +++++++++++++++++++-
fs/btrfs/disk-io.c | 13 ++++++++-----
fs/btrfs/qgroup.c | 13 ++++++++++---
fs/btrfs/super.c | 40 +++++++++++++++++++++++++++++++++++++---
fs/btrfs/volumes.c | 4 ++--
5 files changed, 76 insertions(+), 14 deletions(-)
--
2.28.0
* [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
@ 2020-12-14 10:10 ` fdmanana
2020-12-17 17:44 ` David Sterba
2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
` (5 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
If we remount a filesystem in RO mode while the qgroup rescan worker is
running, the worker can still be running after the remount is done, and
at unmount time we may be left with an open transaction that never gets
committed. If that happens we get several memory leaks and can crash
when hardware acceleration is unavailable for crc32c. It can possibly
lead to other nasty surprises too, due to use-after-free issues.
The following steps explain how the problem happens.
1) We have a filesystem mounted in RW mode and the qgroup rescan worker is
running;
2) We remount the filesystem in RO mode, and never stop/pause the rescan
worker, so after the remount the rescan worker is still running. The
important detail here is that the rescan task is still running after
the remount operation committed any ongoing transaction through its
call to btrfs_commit_super();
3) The rescan worker keeps running after the remount completes. Once it
finishes iterating over all leaves of the extent tree, it starts a
transaction to update the qgroup status item in the quotas tree. It
does not commit the transaction, it only releases its handle on the
transaction;
4) A filesystem unmount operation starts shortly after;
5) The unmount task, at close_ctree(), stops the transaction kthread,
which has not had a chance to commit the open transaction because it was
sleeping and the commit interval (30 seconds by default) had not yet
elapsed since the last time it committed a transaction;
6) So after stopping the transaction kthread we still have the transaction
used to update the qgroup status item open. At close_ctree(), when the
filesystem is in RO mode and no transaction abort happened (or the
filesystem is in error mode), we do not expect to have any transaction
open, so we do not call btrfs_commit_super();
7) We then proceed to destroy the work queues, free the roots and block
groups, etc. After that we drop the last reference on the btree inode
by calling iput() on it. Since there are dirty pages for the btree
inode, corresponding to the COWed extent buffer for the quotas btree,
btree_write_cache_pages() is invoked to flush those dirty pages. This
results in creating a bio and submitting it, which makes us end up at
btrfs_submit_metadata_bio();
8) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
that calls btrfs_wq_submit_bio(), because check_async_write() returned
a value of 1. This value of 1 is because we did not have hardware
acceleration available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not
set in fs_info->flags;
9) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
workqueue at fs_info->workers, which was already freed before by the
call to btrfs_stop_all_workers() at close_ctree(). This results in an
invalid memory access due to a use-after-free, leading to a crash.
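The sequence above can be modelled in user space as a small state machine. This is only a sketch for following the steps, not kernel code: `struct toy_fs`, `enum toy_result` and every `toy_*` function are hypothetical names, and each one collapses a whole kernel code path into a single flag update.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the steps above, not kernel code. */
struct toy_fs {
	bool read_only;		/* filesystem is in RO mode */
	bool rescan_running;	/* qgroup rescan worker is still active */
	bool trans_open;	/* a transaction exists that was not committed */
	bool workers_freed;	/* work queues were already destroyed */
	bool csum_impl_fast;	/* stands in for BTRFS_FS_CSUM_IMPL_FAST */
};

enum toy_result { TOY_CLEAN, TOY_LEAK, TOY_LEAK_AND_UAF };

/* Step 2: the RO remount commits the running transaction (as
 * btrfs_commit_super() does) but leaves the rescan worker alone. */
static void toy_remount_ro(struct toy_fs *fs)
{
	fs->trans_open = false;
	fs->read_only = true;
}

/* Step 3: the worker finishes, starts a transaction to update the
 * status item, and only releases its handle without committing. */
static void toy_rescan_finish(struct toy_fs *fs)
{
	if (fs->rescan_running) {
		fs->trans_open = true;
		fs->rescan_running = false;
	}
}

/* Steps 5 to 9: unmount. */
static enum toy_result toy_unmount(struct toy_fs *fs)
{
	/* Step 6: in RO mode no transaction is expected, so no commit. */
	if (!fs->read_only)
		fs->trans_open = false;

	/* Step 7: work queues are destroyed before the final iput(). */
	fs->workers_freed = true;

	if (!fs->trans_open)
		return TOY_CLEAN;

	/* Steps 8 and 9: without fast crc32c the dirty btree pages are
	 * queued on a work queue that no longer exists. */
	if (!fs->csum_impl_fast && fs->workers_freed)
		return TOY_LEAK_AND_UAF;

	return TOY_LEAK;
}
```

Calling `toy_remount_ro()`, then `toy_rescan_finish()`, then `toy_unmount()` on a model with `rescan_running` set reproduces the leak, and additionally the use-after-free outcome when `csum_impl_fast` is false.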
When this happens, several warnings are triggered before the crash,
because we still have metadata space reserved in a block group, in the
delayed refs reservation, etc:
------------[ cut here ]------------
WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 4 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
Code: f0 01 00 00 48 39 c2 75 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 48 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c6 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 2 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Code: 48 83 bb b0 03 00 00 00 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c7 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 5 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Code: ad de 49 be 22 01 00 (...)
RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c8 ]---
BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0
And the crash, which only happens when we do not have crc32c hardware
acceleration, produces the following trace immediately after those
warnings:
stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
CPU: 2 PID: 1749129 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
Code: 54 55 53 48 89 f3 (...)
RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
FS: 00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
submit_one_bio+0x61/0x70 [btrfs]
btree_write_cache_pages+0x414/0x450 [btrfs]
? kobject_put+0x9a/0x1d0
? trace_hardirqs_on+0x1b/0xf0
? _raw_spin_unlock_irqrestore+0x3c/0x60
? free_debug_processing+0x1e1/0x2b0
do_writepages+0x43/0xe0
? lock_acquired+0x199/0x490
__writeback_single_inode+0x59/0x650
writeback_single_inode+0xaf/0x120
write_inode_now+0x94/0xd0
iput+0x187/0x2b0
close_ctree+0x2c6/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3cfebabee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
---[ end trace dd74718fef1ed5cc ]---
Finally, when we remove the btrfs module (rmmod btrfs), there are several
warnings about objects that were allocated from our slabs but were never
freed, a consequence of the transaction that was never committed and got
leaked:
=============================================================================
BUG btrfs_delayed_ref_head (Tainted: G B W ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x0000000050cbdd61 @offset=12104
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
sync_filesystem+0x74/0x90
generic_shutdown_super+0x22/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x0000000086e9b0ff @offset=12776
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
commit_cowonly_roots+0x248/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_tree_ref (Tainted: G B W ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000001a340018 @offset=4408
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_commit_transaction+0x60/0xc40 [btrfs]
create_subvol+0x56a/0x990 [btrfs]
btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
__btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
btrfs_ioctl+0x1a92/0x36f0 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x000000002b46292a @offset=13648
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_extent_op (Tainted: G B W ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? __mutex_unlock_slowpath+0x45/0x2a0
kmem_cache_destroy+0x55/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000004cf95ea8 @offset=6264
INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1
Fix this issue by having the remount path stop the qgroup rescan worker
when we are remounting RO and teach the rescan worker to stop when a
remount is in progress. If later a remount in RW mode happens, we are
already resuming the qgroup rescan worker through the call to
btrfs_qgroup_rescan_resume(), so we do not need to worry about that.
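The worker side of the fix can be illustrated with a user-space sketch. It assumes a simplified model: `struct toy_fs_info`, the `TOY_FS_*` bits and the `toy_*` functions are hypothetical stand-ins, and the loop body only counts down leaves instead of starting transactions. What it shows is that the stop condition is evaluated in the loop header and remembered in `stopped`, so the later decisions use the reason the loop exited rather than re-testing the state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state bits, standing in for the "closing" and
 * "remounting" conditions checked by the patch. */
#define TOY_FS_CLOSING		(1u << 0)
#define TOY_FS_REMOUNTING	(1u << 1)

struct toy_fs_info {
	unsigned int state;	/* combination of TOY_FS_* bits */
	int leaves_left;	/* work remaining for the rescan */
	bool rescan_flag;	/* models BTRFS_QGROUP_STATUS_FLAG_RESCAN */
};

/* Stop when the filesystem is closing or a remount is in progress. */
static bool toy_rescan_should_stop(const struct toy_fs_info *fs)
{
	return (fs->state & TOY_FS_CLOSING) ||
	       (fs->state & TOY_FS_REMOUNTING);
}

/* Returns true if the rescan was paused, false if it ran to completion. */
static bool toy_rescan_worker(struct toy_fs_info *fs)
{
	bool stopped = false;

	while (fs->leaves_left > 0 &&
	       !(stopped = toy_rescan_should_stop(fs)))
		fs->leaves_left--;

	/* Only a rescan that was not interrupted clears the flag, so a
	 * paused rescan can be resumed later. */
	if (!stopped)
		fs->rescan_flag = false;

	return stopped;
}
```

With `state` set to `TOY_FS_REMOUNTING` the worker returns immediately and leaves `rescan_flag` set, which is the condition a later resume would look for in this model.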
Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/qgroup.c | 13 ++++++++++---
fs/btrfs/super.c | 8 ++++++++
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 47f27658eac1..808370ada888 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3190,6 +3190,12 @@ static int qgroup_rescan_leaf(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
+static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
+{
+	return btrfs_fs_closing(fs_info) ||
+		test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
+}
+
 static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 {
 	struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
@@ -3198,6 +3204,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	struct btrfs_trans_handle *trans = NULL;
 	int err = -ENOMEM;
 	int ret = 0;
+	bool stopped = false;
 
 	path = btrfs_alloc_path();
 	if (!path)
@@ -3210,7 +3217,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	path->skip_locking = 1;
 
 	err = 0;
-	while (!err && !btrfs_fs_closing(fs_info)) {
+	while (!err && !(stopped = rescan_should_stop(fs_info))) {
 		trans = btrfs_start_transaction(fs_info->fs_root, 0);
 		if (IS_ERR(trans)) {
 			err = PTR_ERR(trans);
@@ -3253,7 +3260,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 	}
 
 	mutex_lock(&fs_info->qgroup_rescan_lock);
-	if (!btrfs_fs_closing(fs_info))
+	if (!stopped)
 		fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
 	if (trans) {
 		ret = update_qgroup_status_item(trans);
@@ -3272,7 +3279,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
 
 	btrfs_end_transaction(trans);
 
-	if (btrfs_fs_closing(fs_info)) {
+	if (stopped) {
 		btrfs_info(fs_info, "qgroup scan paused");
 	} else if (err >= 0) {
 		btrfs_info(fs_info, "qgroup scan completed%s",
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 022f20810089..b24fa62375e0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1968,6 +1968,14 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		btrfs_scrub_cancel(fs_info);
 		btrfs_pause_balance(fs_info);
 
+		/*
+		 * Pause the qgroup rescan worker if it is running. We don't want
+		 * it to be still running after we are in RO mode, as after that,
+		 * by the time we unmount, it might have left a transaction open,
+		 * so we would leak the transaction and/or crash.
+		 */
+		btrfs_qgroup_wait_for_completion(fs_info, false);
+
 		ret = btrfs_commit_super(fs_info);
 		if (ret)
 			goto restore;
--
2.28.0
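
The diff above evaluates the stop condition in the loop header and keeps the result in `stopped`, so the code after the loop acts on the value that ended the loop instead of re-reading filesystem state. One plausible reason is that the state can change in between. A minimal userspace sketch of that pattern follows. It assumes a hypothetical `struct fake_fs` with two plain flags standing in for btrfs_fs_closing() and BTRFS_FS_STATE_REMOUNTING; `fake_fs`, `should_stop()` and `run_worker()` are illustrative names, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state tested by rescan_should_stop(). */
struct fake_fs {
	bool closing;      /* models btrfs_fs_closing() */
	bool remounting;   /* models BTRFS_FS_STATE_REMOUNTING */
	int leaves_left;   /* work remaining for the worker */
};

static bool should_stop(const struct fake_fs *fs)
{
	return fs->closing || fs->remounting;
}

/*
 * Process leaves until none are left or a stop is requested. A remount
 * request arrives after `remount_after` leaves (0 means never).
 * Returns true if the worker was paused, false if it ran to completion.
 */
static bool run_worker(struct fake_fs *fs, int remount_after)
{
	bool stopped = false;
	int done = 0;

	assert(fs->leaves_left >= 0);

	/* Same shape as the patched loop: latch the decision in `stopped`. */
	while (fs->leaves_left > 0 && !(stopped = should_stop(fs))) {
		fs->leaves_left--;
		if (++done == remount_after)
			fs->remounting = true;
	}

	/* In this model the remount ends and clears its flag before we report. */
	fs->remounting = false;

	/* Re-checking should_stop() here would now say "not stopped". */
	return stopped;
}
```

With five leaves and a remount request after two of them, `run_worker()` reports a pause and leaves three leaves unprocessed, even though `should_stop()` returns false again by the time the worker reports.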
* [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
@ 2020-12-14 10:10 ` fdmanana
2021-03-16 6:44 ` robbieko
2020-12-14 10:10 ` [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task fdmanana
` (4 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
When we delete a root (subvolume or snapshot), at the very end of the
operation, we attempt to remove the root's orphan item from the root tree,
at btrfs_drop_snapshot(), by calling btrfs_del_orphan_item(). We ignore any
error from btrfs_del_orphan_item() since it is not a serious problem and
the next time the filesystem is mounted we remove such stray orphan items
at btrfs_find_orphan_roots().
However, if the filesystem is mounted RO and we have stray orphan items for
any previously deleted root, we can end up leaking a transaction and other
data structures when unmounting the filesystem, as well as crashing if we
do not have hardware acceleration for crc32c available.
The steps that lead to the transaction leak are the following:
1) The filesystem is mounted in RW mode;
2) A subvolume is deleted;
3) When the cleaner kthread runs btrfs_drop_snapshot() to delete the root,
btrfs_del_orphan_item() fails, for example with -ENOMEM when allocating
a path, and the failure is ignored. So the orphan item for the root
remains in the root tree;
4) The filesystem is unmounted;
5) The filesystem is mounted RO (-o ro). During the mount path we call
btrfs_find_orphan_roots(), which iterates the root tree searching for
orphan items. It finds the orphan item for our deleted root, and since
it cannot find the root, it starts a transaction to delete the orphan
item (by calling btrfs_del_orphan_item());
6) The RO mount completes;
7) Before the transaction kthread commits the transaction created for
deleting the orphan item (i.e. less than 30 seconds elapsed since the
mount, the default commit interval), a filesystem unmount operation is
started;
8) At close_ctree(), we stop the transaction kthread, but we still have a
transaction open with at least one dirty extent buffer, a leaf for the
tree root which was COWed when deleting the orphan item;
9) We then proceed to destroy the work queues, free the roots and block
groups, etc. After that we drop the last reference on the btree inode by
calling iput() on it. Since there are dirty pages for the btree inode,
corresponding to the COWed extent buffer, btree_write_cache_pages() is
invoked to flush those dirty pages. This results in creating a bio and
submitting it, which makes us end up at btrfs_submit_metadata_bio();
10) At btrfs_submit_metadata_bio() we take the if-then-else branch
that calls btrfs_wq_submit_bio(), because check_async_write() returned
1. It returned 1 because we did not have hardware acceleration
available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not set in
fs_info->flags;
11) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
workqueue at fs_info->workers, which was already freed before by the
call to btrfs_stop_all_workers() at close_ctree(). This results in an
invalid memory access due to a use-after-free, leading to a crash.
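
Steps 8 to 11 above can be condensed into a small userspace model of the unmount ordering. This is only a sketch under stated assumptions: `struct fake_fs`, `wants_async_write()`, `final_iput()` and `model_unmount()` are hypothetical names, and the decision in `wants_async_write()` keeps only the BTRFS_FS_CSUM_IMPL_FAST test described above (the real check_async_write() may consider more than this flag).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state that matters in steps 8 to 11. */
struct fake_fs {
	bool csum_impl_fast;    /* models BTRFS_FS_CSUM_IMPL_FAST */
	bool workers_alive;     /* false once the work queues are destroyed */
	int dirty_btree_pages;  /* dirty pages left by the open transaction */
};

enum flush_result {
	FLUSH_NOTHING,          /* no dirty pages, nothing is submitted */
	FLUSH_SYNC,             /* bio submitted without the work queue */
	FLUSH_ASYNC,            /* bio queued on a live work queue */
	FLUSH_USE_AFTER_FREE    /* bio queued on an already freed work queue */
};

/* Simplified model: async submission is chosen only without fast crc32c. */
static bool wants_async_write(const struct fake_fs *fs)
{
	return !fs->csum_impl_fast;
}

/* Models the final iput() on the btree inode. */
static enum flush_result final_iput(const struct fake_fs *fs)
{
	assert(fs->dirty_btree_pages >= 0);

	if (fs->dirty_btree_pages == 0)
		return FLUSH_NOTHING;
	if (!wants_async_write(fs))
		return FLUSH_SYNC;
	return fs->workers_alive ? FLUSH_ASYNC : FLUSH_USE_AFTER_FREE;
}

/* Models the ordering at unmount: work queues go away before the iput(). */
static enum flush_result model_unmount(struct fake_fs *fs,
				       bool transaction_committed)
{
	if (transaction_committed)
		fs->dirty_btree_pages = 0;
	fs->workers_alive = false;
	return final_iput(fs);
}
```

In this model the use-after-free is reached only when a transaction is still open at unmount and fast crc32c is unavailable. With fast crc32c the write bypasses the work queue, which matches the description above of a leak without a crash.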
When this happens, before the crash there are several warnings triggered,
since we have reserved metadata space in a block group, the delayed refs
reservation, etc:
------------[ cut here ]------------
WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 4 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
Code: f0 01 00 00 48 39 c2 75 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 48 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c6 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 2 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Code: 48 83 bb b0 03 00 00 00 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c7 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 5 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Code: ad de 49 be 22 01 00 (...)
RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c8 ]---
BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0
And the crash, which only happens when we do not have crc32c hardware
acceleration, produces the following trace immediately after those
warnings:
stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
CPU: 2 PID: 1749129 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
Code: 54 55 53 48 89 f3 (...)
RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
FS: 00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
submit_one_bio+0x61/0x70 [btrfs]
btree_write_cache_pages+0x414/0x450 [btrfs]
? kobject_put+0x9a/0x1d0
? trace_hardirqs_on+0x1b/0xf0
? _raw_spin_unlock_irqrestore+0x3c/0x60
? free_debug_processing+0x1e1/0x2b0
do_writepages+0x43/0xe0
? lock_acquired+0x199/0x490
__writeback_single_inode+0x59/0x650
writeback_single_inode+0xaf/0x120
write_inode_now+0x94/0xd0
iput+0x187/0x2b0
close_ctree+0x2c6/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3cfebabee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
---[ end trace dd74718fef1ed5cc ]---
Finally, when we remove the btrfs module (rmmod btrfs), there are several
warnings about objects that were allocated from our slabs but were never
freed, a consequence of the transaction that was never committed and got
leaked:
=============================================================================
BUG btrfs_delayed_ref_head (Tainted: G B W ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x0000000050cbdd61 @offset=12104
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
sync_filesystem+0x74/0x90
generic_shutdown_super+0x22/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x0000000086e9b0ff @offset=12776
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
commit_cowonly_roots+0x248/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_tree_ref (Tainted: G B W ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000001a340018 @offset=4408
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_commit_transaction+0x60/0xc40 [btrfs]
create_subvol+0x56a/0x990 [btrfs]
btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
__btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
btrfs_ioctl+0x1a92/0x36f0 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x000000002b46292a @offset=13648
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_extent_op (Tainted: G B W ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? __mutex_unlock_slowpath+0x45/0x2a0
kmem_cache_destroy+0x55/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000004cf95ea8 @offset=6264
INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1
So fix this by calling btrfs_find_orphan_roots() in the mount path only if
we are mounting the filesystem in RW mode. It's pointless to have it called
for RO mounts anyway, since despite adding any deleted roots to the list of
dead roots, we will never have the roots deleted until the filesystem is
remounted in RW mode, as the cleaner kthread does nothing when we are
mounted in RO mode: btrfs_need_cleaner_sleep() always returns true and the
cleaner spends all its time sleeping, never cleaning dead roots.
This is accomplished by moving the call to btrfs_find_orphan_roots() from
open_ctree() to btrfs_start_pre_rw_mount(), which also guarantees that
if later the filesystem is remounted RW, we populate the list of dead
roots and have the cleaner task delete the dead roots.
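
The effect of moving the call can be illustrated with a small userspace model. It is a sketch only: `struct fake_fs`, `model_mount()` and `model_remount_rw()` are hypothetical, and the two functions named after kernel functions reproduce just the behaviour described in this changelog (a stray orphan item causes a transaction to be started).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state relevant to this fix. */
struct fake_fs {
	bool read_only;
	int stray_orphan_items;  /* left behind by an earlier failed deletion */
	int open_transactions;   /* started but not yet committed */
};

/* Models btrfs_find_orphan_roots(): deleting a stray item needs a transaction. */
static void find_orphan_roots(struct fake_fs *fs)
{
	if (fs->stray_orphan_items > 0) {
		fs->open_transactions++;
		fs->stray_orphan_items = 0;
	}
}

/* Models the part of btrfs_start_pre_rw_mount() that matters here. */
static void start_pre_rw_mount(struct fake_fs *fs)
{
	find_orphan_roots(fs);
}

/* After the fix, the mount path reaches the orphan scan only for RW mounts. */
static void model_mount(struct fake_fs *fs, bool read_only)
{
	fs->read_only = read_only;
	if (!read_only)
		start_pre_rw_mount(fs);
}

/* A later RW remount goes through the same pre-RW step. */
static void model_remount_rw(struct fake_fs *fs)
{
	assert(fs->read_only);
	fs->read_only = false;
	start_pre_rw_mount(fs);
}
```

In the model, an RO mount with one stray orphan item leaves no open transaction and keeps the item in place; the item is only dealt with once the filesystem is remounted RW.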
Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/disk-io.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 765deefda92b..e941cbae3991 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2969,6 +2969,7 @@ int btrfs_start_pre_rw_mount(struct btrfs_fs_info *fs_info)
 		}
 	}
 
+	ret = btrfs_find_orphan_roots(fs_info);
 out:
 	return ret;
 }
@@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		}
 	}
 
-	ret = btrfs_find_orphan_roots(fs_info);
-	if (ret)
-		goto fail_qgroup;
-
 	fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
 	if (IS_ERR(fs_info->fs_root)) {
 		err = PTR_ERR(fs_info->fs_root);
--
2.28.0
* [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
@ 2020-12-14 10:10 ` fdmanana
2020-12-14 10:10 ` [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount fdmanana
` (3 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
When we are remounting a filesystem in RO mode we can race with the cleaner
task and end up leaking a transaction if the filesystem is unmounted
shortly after, before the transaction kthread had a chance to commit that
transaction. That also results in a crash during unmount, due to a
use-after-free, if hardware acceleration is not available for crc32c.
The following sequence of steps explains how the race happens.
1) The filesystem is mounted in RW mode and the cleaner task is running.
This means that currently BTRFS_FS_CLEANER_RUNNING is set at
fs_info->flags;
2) The cleaner task is currently running delayed iputs, for example;
3) A filesystem RO remount operation starts;
4) The RO remount task calls btrfs_commit_super(), which commits any
currently open transaction, and it finishes;
5) At this point the cleaner task is still running and it creates a new
transaction by doing one of the following things:
* When running the delayed iput() for an inode with a 0 link count,
in which case at btrfs_evict_inode() we start a transaction through
the call to evict_refill_and_join(), use it and then release its
handle through btrfs_end_transaction();
* When deleting a dead root through btrfs_clean_one_deleted_snapshot(),
a transaction is started at btrfs_drop_snapshot() and then its handle
is released through a call to btrfs_end_transaction_throttle();
* When the remount task was still running, and before the remount task
called btrfs_delete_unused_bgs(), the cleaner task also called
btrfs_delete_unused_bgs() and it picked and removed one block group
from the list of unused block groups. Before the cleaner task started
a transaction, through btrfs_start_trans_remove_block_group() at
btrfs_delete_unused_bgs(), the remount task had already called
btrfs_commit_super();
6) So at this point the filesystem is in RO mode and we have an open
transaction that was started by the cleaner task;
7) Shortly after, a filesystem unmount operation starts. At close_ctree()
we stop the transaction kthread before it had a chance to commit the
transaction, since less than 30 seconds (the default commit interval)
have elapsed since the last transaction was committed;
8) We end up calling iput() against the btree inode at close_ctree() while
there is an open transaction, and since that transaction was used to
update btrees by the cleaner, we have dirty pages in the btree inode
due to COW operations on metadata extents, and therefore writeback is
triggered for the btree inode.
So btree_write_cache_pages() is invoked to flush those dirty pages
during the final iput() on the btree inode. This results in creating a
bio and submitting it, which makes us end up at
btrfs_submit_metadata_bio();
9) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
that calls btrfs_wq_submit_bio(), because check_async_write() returned
a value of 1. This value of 1 is because we did not have hardware
acceleration available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not
set in fs_info->flags;
10) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
workqueue at fs_info->workers, which was already freed before by the
call to btrfs_stop_all_workers() at close_ctree(). This results in an
invalid memory access due to a use-after-free, leading to a crash.
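The ordering problem in steps 3) to 7) can be illustrated with a small
userspace model. This is only a sketch under simplifying assumptions: the
struct and the model_* helpers are hypothetical names that do not exist in
btrfs, the cleaner is reduced to a single function call, and the transaction
kthread is reduced to a comparison against the commit interval.

```c
#include <stdbool.h>

/* Toy userspace model of the state involved in the race; not kernel code. */
struct fs_model {
    bool read_only;        /* filesystem already flagged RO */
    bool cleaner_running;  /* stands in for BTRFS_FS_CLEANER_RUNNING */
    int open_transactions; /* transactions started but not yet committed */
};

/* Stands in for btrfs_commit_super(): commits whatever is open right now. */
static void model_commit_super(struct fs_model *fs)
{
    fs->open_transactions = 0;
}

/* Step 5: a cleaner pass that was already running starts a transaction. */
static void model_cleaner_finish_pass(struct fs_model *fs)
{
    if (fs->cleaner_running) {
        fs->open_transactions++;
        fs->cleaner_running = false;
    }
}

/* Steps 3 to 6: RO remount that commits without waiting for the cleaner. */
static void model_remount_ro_unfixed(struct fs_model *fs)
{
    fs->read_only = true;
    model_commit_super(fs);        /* step 4 */
    model_cleaner_finish_pass(fs); /* step 5, which lands after the commit */
}

/*
 * Step 7: in this model the transaction kthread commits only once the
 * commit interval has elapsed, and unmount stops it before that.
 * Returns the number of transactions left open (leaked).
 */
static int model_unmount(struct fs_model *fs, int secs_since_commit,
                         int commit_interval_secs)
{
    if (secs_since_commit >= commit_interval_secs)
        model_commit_super(fs);
    return fs->open_transactions;
}
```

In this model, an unmount 5 seconds after the remount with a 30 second
commit interval leaves one transaction open, which corresponds to the leak
described in step 6).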
When this happens, several warnings are triggered before the crash, since
we still have metadata space reserved in a block group, in the delayed refs
reservation, etc:
------------[ cut here ]------------
WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 4 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
Code: f0 01 00 00 48 39 c2 75 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 48 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c6 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 2 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
Code: 48 83 bb b0 03 00 00 00 (...)
RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c7 ]---
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
CPU: 5 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
Code: ad de 49 be 22 01 00 (...)
RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
FS: 00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
close_ctree+0x2ba/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15ee221ee7
Code: ff 0b 00 f7 d8 64 89 (...)
RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
softirqs last disabled at (0): [<0000000000000000>] 0x0
---[ end trace dd74718fef1ed5c8 ]---
BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0
And the crash, which only happens when we do not have crc32c hardware
acceleration, produces the following trace immediately after those
warnings:
stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
CPU: 2 PID: 1749129 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
Code: 54 55 53 48 89 f3 (...)
RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
FS: 00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
submit_one_bio+0x61/0x70 [btrfs]
btree_write_cache_pages+0x414/0x450 [btrfs]
? kobject_put+0x9a/0x1d0
? trace_hardirqs_on+0x1b/0xf0
? _raw_spin_unlock_irqrestore+0x3c/0x60
? free_debug_processing+0x1e1/0x2b0
do_writepages+0x43/0xe0
? lock_acquired+0x199/0x490
__writeback_single_inode+0x59/0x650
writeback_single_inode+0xaf/0x120
write_inode_now+0x94/0xd0
iput+0x187/0x2b0
close_ctree+0x2c6/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f3cfebabee7
Code: ff 0b 00 f7 d8 64 89 01 (...)
RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
---[ end trace dd74718fef1ed5cc ]---
Finally, when we remove the btrfs module (rmmod btrfs), there are several
warnings about objects that were allocated from our slabs but were never
freed, a consequence of the transaction that was never committed and got
leaked:
=============================================================================
BUG btrfs_delayed_ref_head (Tainted: G B W ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x0000000050cbdd61 @offset=12104
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
sync_filesystem+0x74/0x90
generic_shutdown_super+0x22/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x0000000086e9b0ff @offset=12776
INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
commit_cowonly_roots+0x248/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_tree_ref (Tainted: G B W ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? lock_release+0x20e/0x4c0
kmem_cache_destroy+0x55/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000001a340018 @offset=4408
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_free_tree_block+0x128/0x360 [btrfs]
__btrfs_cow_block+0x489/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
btrfs_commit_transaction+0x60/0xc40 [btrfs]
create_subvol+0x56a/0x990 [btrfs]
btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
__btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
btrfs_ioctl+0x1a92/0x36f0 [btrfs]
__x64_sys_ioctl+0x83/0xb0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Object 0x000000002b46292a @offset=13648
INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
=============================================================================
BUG btrfs_delayed_extent_op (Tainted: G B W ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
-----------------------------------------------------------------------------
INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
slab_err+0xb7/0xdc
? lock_acquired+0x199/0x490
__kmem_cache_shutdown+0x1ac/0x3c0
? __mutex_unlock_slowpath+0x45/0x2a0
kmem_cache_destroy+0x55/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 f5 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
INFO: Object 0x000000004cf95ea8 @offset=6264
INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
__slab_alloc.isra.0+0x109/0x1c0
kmem_cache_alloc+0x7bb/0x830
btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
__btrfs_cow_block+0x12d/0x5f0 [btrfs]
btrfs_cow_block+0xf7/0x220 [btrfs]
btrfs_search_slot+0x62a/0xc40 [btrfs]
btrfs_del_orphan_item+0x65/0xd0 [btrfs]
btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
open_ctree+0x125a/0x18a0 [btrfs]
btrfs_mount_root.cold+0x13/0xed [btrfs]
legacy_get_tree+0x30/0x60
vfs_get_tree+0x28/0xe0
fc_mount+0xe/0x40
vfs_kern_mount.part.0+0x71/0x90
btrfs_mount+0x13b/0x3e0 [btrfs]
INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
kmem_cache_free+0x34c/0x3c0
__btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
btrfs_run_delayed_refs+0x81/0x210 [btrfs]
commit_cowonly_roots+0xfb/0x300 [btrfs]
btrfs_commit_transaction+0x367/0xc40 [btrfs]
close_ctree+0x113/0x2fa [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x68/0xb0
exit_to_user_mode_prepare+0x1bb/0x1c0
syscall_exit_to_user_mode+0x4b/0x260
entry_SYSCALL_64_after_hwframe+0x44/0xa9
kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xb5
kmem_cache_destroy+0x119/0x120
exit_btrfs_fs+0xa/0x59 [btrfs]
__x64_sys_delete_module+0x194/0x260
? fpregs_assert_state_consistent+0x1e/0x40
? exit_to_user_mode_prepare+0x55/0x1c0
? trace_hardirqs_on+0x1b/0xf0
do_syscall_64+0x33/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f693e305897
Code: 73 01 c3 48 8b 0d f9 (...)
RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1
So fix this by making the remount path wait for the cleaner task before
calling btrfs_commit_super(). The remount path now waits for the bit
BTRFS_FS_CLEANER_RUNNING to be cleared from fs_info->flags before calling
btrfs_commit_super(). This ensures the cleaner cannot start a transaction
after that, because it sleeps when the filesystem is in RO mode and we have
already flagged the filesystem as RO before waiting for
BTRFS_FS_CLEANER_RUNNING to be cleared.
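The "flag RO first, then wait for the cleaner, then commit" ordering can be
sketched in userspace with POSIX threads. This is an analogue and not the
kernel implementation: a mutex and condition variable stand in for
wait_on_bit() and clear_and_wake_up_bit(), and the struct and function names
are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct fs_state {
    pthread_mutex_t lock;
    pthread_cond_t cleaner_idle;  /* signalled when cleaner_running clears */
    bool cleaner_running;         /* analogue of BTRFS_FS_CLEANER_RUNNING */
    atomic_bool read_only;        /* analogue of BTRFS_FS_STATE_RO */
    atomic_int open_transactions; /* started but not yet committed */
};

static void fs_state_init(struct fs_state *fs)
{
    pthread_mutex_init(&fs->lock, NULL);
    pthread_cond_init(&fs->cleaner_idle, NULL);
    fs->cleaner_running = false;
    atomic_init(&fs->read_only, false);
    atomic_init(&fs->open_transactions, 0);
}

/* One cleaner pass; the caller sets cleaner_running before starting it. */
static void *cleaner_pass(void *arg)
{
    struct fs_state *fs = arg;

    /* Like btrfs_need_cleaner_sleep(): do no work once the fs is RO. */
    if (!atomic_load(&fs->read_only))
        atomic_fetch_add(&fs->open_transactions, 1);

    /* Analogue of clear_and_wake_up_bit(): clear the flag, wake waiters. */
    pthread_mutex_lock(&fs->lock);
    fs->cleaner_running = false;
    pthread_cond_broadcast(&fs->cleaner_idle);
    pthread_mutex_unlock(&fs->lock);
    return NULL;
}

static void remount_ro(struct fs_state *fs)
{
    /* 1. Flag RO first, so a later cleaner pass goes straight to sleep. */
    atomic_store(&fs->read_only, true);

    /* 2. Analogue of wait_on_bit(): wait for an in-flight pass to end. */
    pthread_mutex_lock(&fs->lock);
    while (fs->cleaner_running)
        pthread_cond_wait(&fs->cleaner_idle, &fs->lock);
    pthread_mutex_unlock(&fs->lock);

    /* 3. Analogue of btrfs_commit_super(): nothing can start after this. */
    atomic_store(&fs->open_transactions, 0);
}
```

Whichever way the two threads interleave, the cleaner pass either saw the
RO flag and started nothing, or it started its transaction before clearing
cleaner_running, in which case the commit in step 3 covers it.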
This also introduces a new flag BTRFS_FS_STATE_RO to be used for
fs_info->fs_state when the filesystem is in RO mode. This is because we
were doing the RO check using the flags of the superblock and setting the
RO mode simply by ORing into the superblock's flags - those operations are
not atomic and could result in the cleaner not seeing the update from the
remount task after it clears BTRFS_FS_CLEANER_RUNNING.
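The difference between a plain OR into a flags word and an atomic bit
operation can be sketched in userspace with C11 atomics. This is an
illustration only: the model_* names and the bit values are hypothetical,
and atomic_fetch_or()/atomic_fetch_and() merely stand in for the kernel's
set_bit()/clear_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_SB_RDONLY   (1UL << 0) /* hypothetical value for SB_RDONLY */
#define MODEL_FS_STATE_RO 3          /* hypothetical bit number */

struct model_sb {
    unsigned long s_flags; /* plain word, updated with non-atomic |= and &= */
    atomic_ulong fs_state; /* bitmap, updated only with atomic RMW ops */
};

/* Atomic read-modify-write: a concurrent update to another bit is kept. */
static void model_set_bit(int nr, atomic_ulong *word)
{
    atomic_fetch_or(word, 1UL << nr);
}

static void model_clear_bit(int nr, atomic_ulong *word)
{
    atomic_fetch_and(word, ~(1UL << nr));
}

static bool model_test_bit(int nr, atomic_ulong *word)
{
    return (atomic_load(word) >> nr) & 1UL;
}

/* Same shape as the new helpers: update both places together. */
static void model_set_sb_rdonly(struct model_sb *sb)
{
    sb->s_flags |= MODEL_SB_RDONLY;
    model_set_bit(MODEL_FS_STATE_RO, &sb->fs_state);
}

static void model_clear_sb_rdonly(struct model_sb *sb)
{
    sb->s_flags &= ~MODEL_SB_RDONLY;
    model_clear_bit(MODEL_FS_STATE_RO, &sb->fs_state);
}

/* The cleaner's check reads only the atomically updated bitmap. */
static bool model_need_cleaner_sleep(struct model_sb *sb, bool closing)
{
    return model_test_bit(MODEL_FS_STATE_RO, &sb->fs_state) || closing;
}
```

Keeping the RO state in a word that is only modified through atomic bit
operations means the cleaner's check does not depend on how other code
updates the superblock's flags.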
Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/ctree.h | 20 +++++++++++++++++++-
fs/btrfs/disk-io.c | 5 ++++-
fs/btrfs/super.c | 22 +++++++++++++++++++---
fs/btrfs/volumes.c | 4 ++--
4 files changed, 44 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 3935d297d198..0225c5208f44 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -132,6 +132,8 @@ enum {
* defrag
*/
BTRFS_FS_STATE_REMOUNTING,
+ /* Filesystem in RO mode */
+ BTRFS_FS_STATE_RO,
/* Track if a transaction abort has been reported on this filesystem */
BTRFS_FS_STATE_TRANS_ABORTED,
/*
@@ -2892,10 +2894,26 @@ static inline int btrfs_fs_closing(struct btrfs_fs_info *fs_info)
* If we remount the fs to be R/O or umount the fs, the cleaner needn't do
* anything except sleeping. This function is used to check the status of
* the fs.
+ * We check for BTRFS_FS_STATE_RO to avoid races with a concurrent remount,
+ * since setting and checking for SB_RDONLY in the superblock's flags is not
+ * atomic.
*/
static inline int btrfs_need_cleaner_sleep(struct btrfs_fs_info *fs_info)
{
- return fs_info->sb->s_flags & SB_RDONLY || btrfs_fs_closing(fs_info);
+ return test_bit(BTRFS_FS_STATE_RO, &fs_info->fs_state) ||
+ btrfs_fs_closing(fs_info);
+}
+
+static inline void btrfs_set_sb_rdonly(struct super_block *sb)
+{
+ sb->s_flags |= SB_RDONLY;
+ set_bit(BTRFS_FS_STATE_RO, &btrfs_sb(sb)->fs_state);
+}
+
+static inline void btrfs_clear_sb_rdonly(struct super_block *sb)
+{
+ sb->s_flags &= ~SB_RDONLY;
+ clear_bit(BTRFS_FS_STATE_RO, &btrfs_sb(sb)->fs_state);
}
/* tree mod log functions from ctree.c */
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e941cbae3991..e7bcbd0b93ef 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1729,7 +1729,7 @@ static int cleaner_kthread(void *arg)
*/
btrfs_delete_unused_bgs(fs_info);
sleep:
- clear_bit(BTRFS_FS_CLEANER_RUNNING, &fs_info->flags);
+ clear_and_wake_up_bit(BTRFS_FS_CLEANER_RUNNING, &fs_info->flags);
if (kthread_should_park())
kthread_parkme();
if (kthread_should_stop())
@@ -2830,6 +2830,9 @@ static int init_mount_fs_info(struct btrfs_fs_info *fs_info, struct super_block
return -ENOMEM;
btrfs_init_delayed_root(fs_info->delayed_root);
+ if (sb_rdonly(sb))
+ set_bit(BTRFS_FS_STATE_RO, &fs_info->fs_state);
+
return btrfs_alloc_stripe_hash_table(fs_info);
}
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b24fa62375e0..38740cc2919f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -175,7 +175,7 @@ void __btrfs_handle_fs_error(struct btrfs_fs_info *fs_info, const char *function
btrfs_discard_stop(fs_info);
/* btrfs handle error by forcing the filesystem readonly */
- sb->s_flags |= SB_RDONLY;
+ btrfs_set_sb_rdonly(sb);
btrfs_info(fs_info, "forced readonly");
/*
* Note that a running device replace operation is not canceled here
@@ -1953,7 +1953,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
/* avoid complains from lockdep et al. */
up(&fs_info->uuid_tree_rescan_sem);
- sb->s_flags |= SB_RDONLY;
+ btrfs_set_sb_rdonly(sb);
/*
* Setting SB_RDONLY will put the cleaner thread to
@@ -1964,6 +1964,20 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
*/
btrfs_delete_unused_bgs(fs_info);
+ /*
+ * The cleaner task could be already running before we set the
+ * flag BTRFS_FS_STATE_RO (and SB_RDONLY in the superblock).
+ * We must make sure that after we finish the remount, i.e. after
+ * we call btrfs_commit_super(), the cleaner can no longer start
+ * a transaction - either because it was dropping a dead root,
+ * running delayed iputs or deleting an unused block group (the
+ * cleaner picked a block group from the list of unused block
+ * groups before we were able to in the previous call to
+ * btrfs_delete_unused_bgs()).
+ */
+ wait_on_bit(&fs_info->flags, BTRFS_FS_CLEANER_RUNNING,
+ TASK_UNINTERRUPTIBLE);
+
btrfs_dev_replace_suspend_for_unmount(fs_info);
btrfs_scrub_cancel(fs_info);
btrfs_pause_balance(fs_info);
@@ -2014,7 +2028,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
if (ret)
goto restore;
- sb->s_flags &= ~SB_RDONLY;
+ btrfs_clear_sb_rdonly(sb);
set_bit(BTRFS_FS_OPEN, &fs_info->flags);
}
@@ -2036,6 +2050,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
/* We've hit an error - don't reset SB_RDONLY */
if (sb_rdonly(sb))
old_flags |= SB_RDONLY;
+ if (!(old_flags & SB_RDONLY))
+ clear_bit(BTRFS_FS_STATE_RO, &fs_info->fs_state);
sb->s_flags = old_flags;
fs_info->mount_opt = old_opts;
fs_info->compress_type = old_compress_type;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 7930e1c78c45..2c0aa03b6437 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2593,7 +2593,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
if (seeding_dev) {
- sb->s_flags &= ~SB_RDONLY;
+ btrfs_clear_sb_rdonly(sb);
ret = btrfs_prepare_sprout(fs_info);
if (ret) {
btrfs_abort_transaction(trans, ret);
@@ -2729,7 +2729,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
mutex_unlock(&fs_info->fs_devices->device_list_mutex);
error_trans:
if (seeding_dev)
- sb->s_flags |= SB_RDONLY;
+ btrfs_set_sb_rdonly(sb);
if (trans)
btrfs_end_transaction(trans);
error_free_zone:
--
2.28.0
* [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
` (2 preceding siblings ...)
2020-12-14 10:10 ` [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task fdmanana
@ 2020-12-14 10:10 ` fdmanana
2020-12-14 10:10 ` [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them fdmanana
` (2 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
Add an assertion to close_ctree(), after destroying all the work queues,
to verify we do not have any transaction still open or committing at that
point. If we have any, it means something is seriously wrong and that can
cause memory leaks and use-after-free problems. This is motivated by the
previous patches that fixed bugs where we ended up leaking an open
transaction after unmounting the filesystem.
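As a rough illustration of the condition being asserted, here is a minimal
userspace sketch in C. It is not the kernel code: the list helpers and the
model_* types are hypothetical stand-ins for the kernel's list_head,
btrfs_fs_info and transaction structures, and the only thing modeled is the
check list_empty(&fs_info->trans_list).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's circular doubly linked list head. */
struct list_node {
	struct list_node *next;
	struct list_node *prev;
};

void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

void list_add_tail_node(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

void list_del_node(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node;
	node->prev = node;
}

/* Hypothetical, heavily reduced model of struct btrfs_fs_info. */
struct model_fs_info {
	struct list_node trans_list;
};

/* Hypothetical model of a transaction linked into trans_list. */
struct model_transaction {
	struct list_node list;
};

/* Models the condition the patch asserts late in close_ctree(). */
bool no_transaction_left(const struct model_fs_info *fs_info)
{
	return list_is_empty(&fs_info->trans_list);
}

/*
 * Scenario helper (assumption, for illustration only): one transaction is
 * started and either properly removed from the list or leaked. Returns
 * true when the late umount check would pass.
 */
bool model_umount_check(bool leak_transaction)
{
	struct model_fs_info fs_info;
	struct model_transaction trans;

	list_init(&fs_info.trans_list);
	list_add_tail_node(&trans.list, &fs_info.trans_list);
	if (!leak_transaction)
		list_del_node(&trans.list);
	return no_transaction_left(&fs_info);
}
```

In this model, model_umount_check(true) corresponds to the leaked
transaction cases fixed earlier in the series, where the assertion would
fire.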
Tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/disk-io.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e7bcbd0b93ef..a567d578d0c8 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4181,6 +4181,9 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
invalidate_inode_pages2(fs_info->btree_inode->i_mapping);
btrfs_stop_all_workers(fs_info);
+ /* We shouldn't have any transaction open at this point. */
+ ASSERT(list_empty(&fs_info->trans_list));
+
clear_bit(BTRFS_FS_OPEN, &fs_info->flags);
free_root_pointers(fs_info, true);
btrfs_free_fs_roots(fs_info);
--
2.28.0
* [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
` (3 preceding siblings ...)
2020-12-14 10:10 ` [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount fdmanana
@ 2020-12-14 10:10 ` fdmanana
2020-12-17 16:26 ` [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount Josef Bacik
2020-12-17 18:08 ` David Sterba
6 siblings, 0 replies; 13+ messages in thread
From: fdmanana @ 2020-12-14 10:10 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
When remounting RO, after setting the superblock with the RO flag, the
cleaner task will start sleeping and do nothing, since the call to
btrfs_need_cleaner_sleep() keeps returning 'true'. However, when the
cleaner task goes to sleep, the list of delayed iputs may not be empty.
As long as we are in RO mode, the cleaner task will keep sleeping and
never run the delayed iputs. This means that if a filesystem unmount
is started, we get into close_ctree() with a non-empty list of delayed
iputs, and because the filesystem is in RO mode and is not in an error
state (or a transaction aborted), btrfs_error_commit_super() and
btrfs_commit_super(), which run the delayed iputs, are never called,
and later we fail the assertion that checks if the delayed iputs list
is empty:
assertion failed: list_empty(&fs_info->delayed_iputs), in fs/btrfs/disk-io.c:4049
------------[ cut here ]------------
kernel BUG at fs/btrfs/ctree.h:3153!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
CPU: 1 PID: 3780621 Comm: umount Tainted: G L 5.6.0-rc2-btrfs-next-73 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
RIP: 0010:assertfail.constprop.0+0x18/0x26 [btrfs]
Code: 8b 7b 58 48 85 ff 74 (...)
RSP: 0018:ffffb748c89bbdf8 EFLAGS: 00010246
RAX: 0000000000000051 RBX: ffff9608f2584000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff91998988 RDI: 00000000ffffffff
RBP: ffff9608f25870d8 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc0cbc500
R13: ffffffff92411750 R14: 0000000000000000 R15: ffff9608f2aab250
FS: 00007fcbfaa66c80(0000) GS:ffff960936c80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffc2c2dd38 CR3: 0000000235e54002 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
close_ctree+0x1a2/0x2e6 [btrfs]
generic_shutdown_super+0x6c/0x100
kill_anon_super+0x14/0x30
btrfs_kill_super+0x12/0x20 [btrfs]
deactivate_locked_super+0x31/0x70
cleanup_mnt+0x100/0x160
task_work_run+0x93/0xc0
exit_to_usermode_loop+0xf9/0x100
do_syscall_64+0x20d/0x260
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fcbfaca6307
Code: eb 0b 00 f7 d8 64 89 (...)
RSP: 002b:00007fffc2c2ed68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000558203b559b0 RCX: 00007fcbfaca6307
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000558203b55bc0
RBP: 0000000000000000 R08: 0000000000000001 R09: 00007fffc2c2dad0
R10: 0000558203b55bf0 R11: 0000000000000246 R12: 0000558203b55bc0
R13: 00007fcbfadcc204 R14: 0000558203b55aa8 R15: 0000000000000000
Modules linked in: btrfs dm_flakey dm_log_writes (...)
---[ end trace d44d303790049ef6 ]---
So fix this by making the remount RO path run any remaining delayed iputs
after waiting for the cleaner to become inactive.
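The sequence described above can be sketched with a small userspace model
in C. Everything here is a simplifying assumption: the model_* names are
hypothetical, the delayed iputs list is reduced to a counter, and the
cleaner is reduced to a function that does nothing while the filesystem is
RO. The sketch only shows why draining the list in the remount path leaves
nothing behind for unmount.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced model of the state discussed in the changelog. */
struct model_fs {
	bool read_only;     /* stands in for SB_RDONLY */
	int pending_iputs;  /* stands in for the length of the delayed iputs list */
};

/* One pass of the modeled cleaner: while RO it sleeps and does nothing. */
void model_cleaner_pass(struct model_fs *fs)
{
	if (fs->read_only)
		return;
	fs->pending_iputs = 0;
}

/*
 * Modeled RO remount. With run_iputs set, the remount path drains the
 * pending iputs itself, which is the behaviour this patch adds.
 */
void model_remount_ro(struct model_fs *fs, bool run_iputs)
{
	fs->read_only = true;
	if (run_iputs)
		fs->pending_iputs = 0;
}

/* Modeled unmount check: true when no delayed iputs are left. */
bool model_umount_list_empty(struct model_fs *fs)
{
	/* The cleaner may still be scheduled, but it changes nothing while RO. */
	model_cleaner_pass(fs);
	return fs->pending_iputs == 0;
}

/* Scenario: iputs are queued, the fs is remounted RO, then unmounted. */
bool model_scenario(int queued, bool with_fix)
{
	struct model_fs fs = { .read_only = false, .pending_iputs = queued };

	model_remount_ro(&fs, with_fix);
	return model_umount_list_empty(&fs);
}
```

Here model_scenario(3, false) models the failing case (the list is still
non-empty at unmount), while model_scenario(3, true) models the remount
path with the fix applied.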
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
fs/btrfs/super.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 38740cc2919f..12d7d3be7cd4 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1978,6 +1978,16 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
wait_on_bit(&fs_info->flags, BTRFS_FS_CLEANER_RUNNING,
TASK_UNINTERRUPTIBLE);
+ /*
+ * We've set the superblock to RO mode, so we might have made
+ * the cleaner task sleep without running all pending delayed
+ * iputs. Go through all the delayed iputs here, so that if an
+ * unmount happens without remounting RW we don't end up at
+ * finishing close_ctree() with a non-empty list of delayed
+ * iputs.
+ */
+ btrfs_run_delayed_iputs(fs_info);
+
btrfs_dev_replace_suspend_for_unmount(fs_info);
btrfs_scrub_cancel(fs_info);
btrfs_pause_balance(fs_info);
--
2.28.0
* Re: [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
` (4 preceding siblings ...)
2020-12-14 10:10 ` [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them fdmanana
@ 2020-12-17 16:26 ` Josef Bacik
2020-12-17 18:08 ` David Sterba
6 siblings, 0 replies; 13+ messages in thread
From: Josef Bacik @ 2020-12-17 16:26 UTC (permalink / raw)
To: fdmanana, linux-btrfs
On 12/14/20 5:10 AM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> There are some cases where we can leak a transaction and crash during unmount
> after remounting the filesystem in RO mode or mounting RO. These issues were
> actually being hit by automated tests from the openQA for openSUSE Tumbleweed
> (bugzilla https://bugzilla.suse.com/show_bug.cgi?id=1164503).
>
> Filipe Manana (5):
> btrfs: fix transaction leak and crash after RO remount caused by
> qgroup rescan
> btrfs: fix transaction leak and crash after cleaning up orphans on RO
> mount
> btrfs: fix race between RO remount and the cleaner task
> btrfs: add assertion for empty list of transactions at late stage of
> umount
> btrfs: run delayed iputs when remounting RO to avoid leaking them
>
You can add
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Thanks,
Josef
* Re: [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan
2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
@ 2020-12-17 17:44 ` David Sterba
2020-12-17 18:21 ` Filipe Manana
0 siblings, 1 reply; 13+ messages in thread
From: David Sterba @ 2020-12-17 17:44 UTC (permalink / raw)
To: fdmanana; +Cc: linux-btrfs
On Mon, Dec 14, 2020 at 10:10:45AM +0000, fdmanana@kernel.org wrote:
> +static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
> +{
> + return btrfs_fs_closing(fs_info) ||
> + test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
> +}
> +
> static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> {
> struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
> @@ -3198,6 +3204,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> struct btrfs_trans_handle *trans = NULL;
> int err = -ENOMEM;
> int ret = 0;
> + bool stopped = false;
>
> path = btrfs_alloc_path();
> if (!path)
> @@ -3210,7 +3217,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> path->skip_locking = 1;
>
> err = 0;
> - while (!err && !btrfs_fs_closing(fs_info)) {
> + while (!err && !(stopped = rescan_should_stop(fs_info))) {
> trans = btrfs_start_transaction(fs_info->fs_root, 0);
> if (IS_ERR(trans)) {
> err = PTR_ERR(trans);
> @@ -3253,7 +3260,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> }
>
> mutex_lock(&fs_info->qgroup_rescan_lock);
> - if (!btrfs_fs_closing(fs_info))
> + if (!stopped)
> fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
> if (trans) {
> ret = update_qgroup_status_item(trans);
> @@ -3272,7 +3279,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
>
> btrfs_end_transaction(trans);
>
> - if (btrfs_fs_closing(fs_info)) {
> + if (stopped) {
Thinking aloud, this is slightly different as it uses the cached status
of fs_closing but there is mutex lock/unlock or transaction start/end
between the checks so the status could change.
But as the flow goes, we want to get fresh status in the while loop.
Once it stops because of the fs_closing or remount request, the
following code does the qgroup status update, wakeups, even though this
means one more transaction. Remount needs to sync anyway and this should
be no problem.
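To make the difference between the cached and the fresh status concrete,
here is a minimal userspace sketch in C. It is only a model under stated
assumptions: the model_* names are hypothetical, the two flags stand in
for btrfs_fs_closing() and BTRFS_FS_STATE_REMOUNTING, and the remount is
assumed to finish right after the loop exits. The loop uses the same
assignment-in-condition idiom as the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two conditions the worker checks. */
struct model_state {
	bool closing;
	bool remounting;
};

/* Fresh check, analogous in spirit to rescan_should_stop(). */
bool model_should_stop(const struct model_state *st)
{
	return st->closing || st->remounting;
}

struct model_result {
	int iterations;       /* units of work completed */
	bool stopped_cached;  /* value captured when the loop exited */
	bool stopped_fresh;   /* value re-read after the loop */
};

/*
 * Run up to total_work iterations. A remount request arrives once
 * request_at iterations have completed (0 means it never arrives) and is
 * assumed to be over by the time the worker re-reads the state.
 */
struct model_result model_rescan(int total_work, int request_at)
{
	struct model_state st = { false, false };
	struct model_result res = { 0, false, false };
	bool stopped = false;
	int done = 0;

	while (done < total_work && !(stopped = model_should_stop(&st))) {
		done++;
		if (done == request_at)
			st.remounting = true;
	}

	/* Assumption: the remount completes while the worker wraps up. */
	st.remounting = false;

	res.iterations = done;
	res.stopped_cached = stopped;
	res.stopped_fresh = model_should_stop(&st);
	return res;
}
```

With model_rescan(10, 3), the cached flag still records that the loop was
stopped, while a fresh check made afterwards no longer sees the request.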
* Re: [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
` (5 preceding siblings ...)
2020-12-17 16:26 ` [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount Josef Bacik
@ 2020-12-17 18:08 ` David Sterba
6 siblings, 0 replies; 13+ messages in thread
From: David Sterba @ 2020-12-17 18:08 UTC (permalink / raw)
To: fdmanana; +Cc: linux-btrfs
On Mon, Dec 14, 2020 at 10:10:44AM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> There are some cases where we can leak a transaction and crash during unmount
> after remounting the filesystem in RO mode or mounting RO. These issues were
> actually being hit by automated tests from the openQA for openSUSE Tumbleweed
> (bugzilla https://bugzilla.suse.com/show_bug.cgi?id=1164503).
>
> Filipe Manana (5):
> btrfs: fix transaction leak and crash after RO remount caused by
> qgroup rescan
> btrfs: fix transaction leak and crash after cleaning up orphans on RO
> mount
> btrfs: fix race between RO remount and the cleaner task
> btrfs: add assertion for empty list of transactions at late stage of
> umount
> btrfs: run delayed iputs when remounting RO to avoid leaking them
Added to misc-next, thanks.
* Re: [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan
2020-12-17 17:44 ` David Sterba
@ 2020-12-17 18:21 ` Filipe Manana
0 siblings, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2020-12-17 18:21 UTC (permalink / raw)
To: dsterba, Filipe Manana, linux-btrfs
On Thu, Dec 17, 2020 at 5:45 PM David Sterba <dsterba@suse.cz> wrote:
>
> On Mon, Dec 14, 2020 at 10:10:45AM +0000, fdmanana@kernel.org wrote:
> > +static bool rescan_should_stop(struct btrfs_fs_info *fs_info)
> > +{
> > + return btrfs_fs_closing(fs_info) ||
> > + test_bit(BTRFS_FS_STATE_REMOUNTING, &fs_info->fs_state);
> > +}
> > +
> > static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> > {
> > struct btrfs_fs_info *fs_info = container_of(work, struct btrfs_fs_info,
> > @@ -3198,6 +3204,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> > struct btrfs_trans_handle *trans = NULL;
> > int err = -ENOMEM;
> > int ret = 0;
> > + bool stopped = false;
> >
> > path = btrfs_alloc_path();
> > if (!path)
> > @@ -3210,7 +3217,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> > path->skip_locking = 1;
> >
> > err = 0;
> > - while (!err && !btrfs_fs_closing(fs_info)) {
> > + while (!err && !(stopped = rescan_should_stop(fs_info))) {
> > trans = btrfs_start_transaction(fs_info->fs_root, 0);
> > if (IS_ERR(trans)) {
> > err = PTR_ERR(trans);
> > @@ -3253,7 +3260,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> > }
> >
> > mutex_lock(&fs_info->qgroup_rescan_lock);
> > - if (!btrfs_fs_closing(fs_info))
> > + if (!stopped)
> > fs_info->qgroup_flags &= ~BTRFS_QGROUP_STATUS_FLAG_RESCAN;
> > if (trans) {
> > ret = update_qgroup_status_item(trans);
> > @@ -3272,7 +3279,7 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> >
> > btrfs_end_transaction(trans);
> >
> > - if (btrfs_fs_closing(fs_info)) {
> > + if (stopped) {
>
> Thinking aloud, this is slightly different as it uses the cached status
> of fs_closing but there is mutex lock/unlock or transaction start/end
> between the checks so the status could change.
>
> But as the flow goes, we want to get fresh status in the while loop.
> Once it stops because of the fs_closing or remount request, the
> following code does the qgroup status update, wakeups, even though this
> means one more transaction. Remount needs to sync anyway and this should
> be no problem.
Yes, that and the fact that the rescan calls
complete_all(&fs_info->qgroup_rescan_completion) before it logs the
reason why it finished.
So it would be possible for remount to stop it, then remount
completes, and then the rescan worker logs that an error happened
instead of logging that it was stopped - it's a very big stretch for
that to happen, but an error message would be confusing from a user's
perspective at least.
* Re: [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
@ 2021-03-16 6:44 ` robbieko
2021-03-16 11:43 ` Filipe Manana
0 siblings, 1 reply; 13+ messages in thread
From: robbieko @ 2021-03-16 6:44 UTC (permalink / raw)
To: fdmanana, linux-btrfs
Hi All,
The patch delays the call to btrfs_find_orphan_roots(), moving it to
after the orphan cleanup of tree_root.
I think this will cause all orphan items to be deleted during the
orphan cleanup of tree_root.
Afterwards, btrfs_find_orphan_roots() can no longer find
the subvolume being deleted.
Is my suspicion correct?
Thanks.
Robbie Ko
On 2020/12/14 6:10 PM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When we delete a root (subvolume or snapshot), at the very end of the
> operation, we attempt to remove the root's orphan item from the root tree,
> at btrfs_drop_snapshot(), by calling btrfs_del_orphan_item(). We ignore any
> error from btrfs_del_orphan_item() since it is not a serious problem and
> the next time the filesystem is mounted we remove such stray orphan items
> at btrfs_find_orphan_roots().
>
> However if the filesystem is mounted RO and we have stray orphan items for
> any previously deleted root, we can end up leaking a transaction and other
> data structures when unmounting the filesystem, as well as crashing if we
> do not have hardware acceleration for crc32c available.
>
> The steps that lead to the transaction leak are the following:
>
> 1) The filesystem is mounted in RW mode;
>
> 2) A subvolume is deleted;
>
> 3) When the cleaner kthread runs btrfs_drop_snapshot() to delete the root,
> it gets a failure at btrfs_del_orphan_item(), which is ignored, due to
> a -ENOMEM when allocating a path for example. So the orphan item for
> the root remains in the root tree;
>
> 4) The filesystem is unmounted;
>
> 5) The filesystem is mounted RO (-o ro). During the mount path we call
> btrfs_find_orphan_roots(), which iterates the root tree searching for
> orphan items. It finds the orphan item for our deleted root, and since
> it can not find the root, it starts a transaction to delete the orphan
> item (by calling btrfs_del_orphan_item());
>
> 6) The RO mount completes;
>
> 7) Before the transaction kthread commits the transaction created for
> deleting the orphan item (i.e. less than 30 seconds elapsed since the
> mount, the default commit interval), a filesystem unmount operation is
> started;
>
> 8) At close_ctree(), we stop the transaction kthread, but we still have a
> transaction open with at least one dirty extent buffer, a leaf for the
> tree root which was COWed when deleting the orphan item;
>
> 9) We then proceed to destroy the work queues, free the roots and block
> groups, etc. After that we drop the last reference on the btree inode by
> calling iput() on it. Since there are dirty pages for the btree inode,
> corresponding to the COWed extent buffer, btree_write_cache_pages() is
> invoked to flush those dirty pages. This results in creating a bio and
> submitting it, which makes us end up at btrfs_submit_metadata_bio();
>
> 10) At btrfs_submit_metadata_bio() we end up at the if-then-else branch
> that calls btrfs_wq_submit_bio(), because check_async_write() returned
> a value of 1. This value of 1 is because we did not have hardware
> acceleration available for crc32c, so BTRFS_FS_CSUM_IMPL_FAST was not
> set in fs_info->flags;
>
> 11) Then at btrfs_wq_submit_bio() we call btrfs_queue_work() against the
> workqueue at fs_info->workers, which was already freed before by the
> call to btrfs_stop_all_workers() at close_ctree(). This results in an
> invalid memory access due to a use-after-free, leading to a crash.
>
> When this happens, before the crash there are several warnings triggered,
> since we have reserved metadata space in a block group, the delayed refs
> reservation, etc:
>
> ------------[ cut here ]------------
> WARNING: CPU: 4 PID: 1729896 at fs/btrfs/block-group.c:125 btrfs_put_block_group+0x63/0xa0 [btrfs]
> Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
> CPU: 4 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> RIP: 0010:btrfs_put_block_group+0x63/0xa0 [btrfs]
> Code: f0 01 00 00 48 39 c2 75 (...)
> RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
> RAX: 0000000000000001 RBX: ffff947ed73e4000 RCX: ffff947ebc8b29c8
> RDX: 0000000000000001 RSI: ffffffffc0b150a0 RDI: ffff947ebc8b2800
> RBP: ffff947ebc8b2800 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
> R13: ffff947ed73e4160 R14: ffff947ebc8b2988 R15: dead000000000100
> FS: 00007f15edfea840(0000) GS:ffff9481ad600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f37e2893320 CR3: 0000000138f68001 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> btrfs_free_block_groups+0x17f/0x2f0 [btrfs]
> close_ctree+0x2ba/0x2fa [btrfs]
> generic_shutdown_super+0x6c/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f15ee221ee7
> Code: ff 0b 00 f7 d8 64 89 01 48 (...)
> RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
> RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
> RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
> R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
> R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
> irq event stamp: 0
> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
> softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
> softirqs last disabled at (0): [<0000000000000000>] 0x0
> ---[ end trace dd74718fef1ed5c6 ]---
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-rsv.c:459 btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
> Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
> CPU: 2 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> RIP: 0010:btrfs_release_global_block_rsv+0x70/0xc0 [btrfs]
> Code: 48 83 bb b0 03 00 00 00 (...)
> RSP: 0018:ffffb270826bbdd8 EFLAGS: 00010206
> RAX: 000000000033c000 RBX: ffff947ed73e4000 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffffffffc0b0d8c1 RDI: 00000000ffffffff
> RBP: ffff947ebc8b7000 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ed73e4110
> R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
> FS: 00007f15edfea840(0000) GS:ffff9481aca00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000561a79f76e20 CR3: 0000000138f68006 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> btrfs_free_block_groups+0x24c/0x2f0 [btrfs]
> close_ctree+0x2ba/0x2fa [btrfs]
> generic_shutdown_super+0x6c/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f15ee221ee7
> Code: ff 0b 00 f7 d8 64 89 01 (...)
> RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
> RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
> RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
> R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
> R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
> irq event stamp: 0
> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
> softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
> softirqs last disabled at (0): [<0000000000000000>] 0x0
> ---[ end trace dd74718fef1ed5c7 ]---
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 1729896 at fs/btrfs/block-group.c:3377 btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
> Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
> CPU: 5 PID: 1729896 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> RIP: 0010:btrfs_free_block_groups+0x25d/0x2f0 [btrfs]
> Code: ad de 49 be 22 01 00 (...)
> RSP: 0018:ffffb270826bbde8 EFLAGS: 00010206
> RAX: ffff947ebeae1d08 RBX: ffff947ed73e4000 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffff947e9d823ae8 RDI: 0000000000000246
> RBP: ffff947ebeae1d08 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000001 R12: ffff947ebeae1c00
> R13: ffff947ed73e5278 R14: dead000000000122 R15: dead000000000100
> FS: 00007f15edfea840(0000) GS:ffff9481ad200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f1475d98ea8 CR3: 0000000138f68005 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> close_ctree+0x2ba/0x2fa [btrfs]
> generic_shutdown_super+0x6c/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f15ee221ee7
> Code: ff 0b 00 f7 d8 64 89 (...)
> RSP: 002b:00007ffe9470f0f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> RAX: 0000000000000000 RBX: 00007f15ee347264 RCX: 00007f15ee221ee7
> RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 000056169701d000
> RBP: 0000561697018a30 R08: 0000000000000000 R09: 00007f15ee2e2be0
> R10: 000056169701efe0 R11: 0000000000000246 R12: 0000000000000000
> R13: 000056169701d000 R14: 0000561697018b40 R15: 0000561697018c60
> irq event stamp: 0
> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> hardirqs last disabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
> softirqs last enabled at (0): [<ffffffff8bcae560>] copy_process+0x8a0/0x1d70
> softirqs last disabled at (0): [<0000000000000000>] 0x0
> ---[ end trace dd74718fef1ed5c8 ]---
> BTRFS info (device sdc): space_info 4 has 268238848 free, is not full
> BTRFS info (device sdc): space_info total=268435456, used=114688, pinned=0, reserved=16384, may_use=0, readonly=65536
> BTRFS info (device sdc): global_block_rsv: size 0 reserved 0
> BTRFS info (device sdc): trans_block_rsv: size 0 reserved 0
> BTRFS info (device sdc): chunk_block_rsv: size 0 reserved 0
> BTRFS info (device sdc): delayed_block_rsv: size 0 reserved 0
> BTRFS info (device sdc): delayed_refs_rsv: size 524288 reserved 0
>
> And the crash, which only happens when we do not have crc32c hardware
> acceleration, produces the following trace immediately after those
> warnings:
>
> stack segment: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
> CPU: 2 PID: 1749129 Comm: umount Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> RIP: 0010:btrfs_queue_work+0x36/0x190 [btrfs]
> Code: 54 55 53 48 89 f3 (...)
> RSP: 0018:ffffb27082443ae8 EFLAGS: 00010282
> RAX: 0000000000000004 RBX: ffff94810ee9ad90 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffff94810ee9ad90 RDI: ffff947ed8ee75a0
> RBP: a56b6b6b6b6b6b6b R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000007 R11: 0000000000000001 R12: ffff947fa9b435a8
> R13: ffff94810ee9ad90 R14: 0000000000000000 R15: ffff947e93dc0000
> FS: 00007f3cfe974840(0000) GS:ffff9481ac600000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f1b42995a70 CR3: 0000000127638003 CR4: 00000000003706e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> btrfs_wq_submit_bio+0xb3/0xd0 [btrfs]
> btrfs_submit_metadata_bio+0x44/0xc0 [btrfs]
> submit_one_bio+0x61/0x70 [btrfs]
> btree_write_cache_pages+0x414/0x450 [btrfs]
> ? kobject_put+0x9a/0x1d0
> ? trace_hardirqs_on+0x1b/0xf0
> ? _raw_spin_unlock_irqrestore+0x3c/0x60
> ? free_debug_processing+0x1e1/0x2b0
> do_writepages+0x43/0xe0
> ? lock_acquired+0x199/0x490
> __writeback_single_inode+0x59/0x650
> writeback_single_inode+0xaf/0x120
> write_inode_now+0x94/0xd0
> iput+0x187/0x2b0
> close_ctree+0x2c6/0x2fa [btrfs]
> generic_shutdown_super+0x6c/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f3cfebabee7
> Code: ff 0b 00 f7 d8 64 89 01 (...)
> RSP: 002b:00007ffc9c9a05f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
> RAX: 0000000000000000 RBX: 00007f3cfecd1264 RCX: 00007f3cfebabee7
> RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000562b6b478000
> RBP: 0000562b6b473a30 R08: 0000000000000000 R09: 00007f3cfec6cbe0
> R10: 0000562b6b479fe0 R11: 0000000000000246 R12: 0000000000000000
> R13: 0000562b6b478000 R14: 0000562b6b473b40 R15: 0000562b6b473c60
> Modules linked in: btrfs dm_snapshot dm_thin_pool (...)
> ---[ end trace dd74718fef1ed5cc ]---
>
> Finally when we remove the btrfs module (rmmod btrfs), there are several
> warnings about objects that were allocated from our slabs but were never
> freed, consequence of the transaction that was never committed and got
> leaked:
> =============================================================================
> BUG btrfs_delayed_ref_head (Tainted: G B W ): Objects remaining in btrfs_delayed_ref_head on __kmem_cache_shutdown()
> -----------------------------------------------------------------------------
>
> INFO: Slab 0x0000000094c2ae56 objects=24 used=2 fp=0x000000002bfa2521 flags=0x17fffc000010200
> CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> Call Trace:
> dump_stack+0x8d/0xb5
> slab_err+0xb7/0xdc
> ? lock_acquired+0x199/0x490
> __kmem_cache_shutdown+0x1ac/0x3c0
> ? lock_release+0x20e/0x4c0
> kmem_cache_destroy+0x55/0x120
> btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
> exit_btrfs_fs+0xa/0x59 [btrfs]
> __x64_sys_delete_module+0x194/0x260
> ? fpregs_assert_state_consistent+0x1e/0x40
> ? exit_to_user_mode_prepare+0x55/0x1c0
> ? trace_hardirqs_on+0x1b/0xf0
> do_syscall_64+0x33/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f693e305897
> Code: 73 01 c3 48 8b 0d f9 f5 (...)
> RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
> RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
> RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
> R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
> R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
> INFO: Object 0x0000000050cbdd61 @offset=12104
> INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1894 cpu=6 pid=1729873
> __slab_alloc.isra.0+0x109/0x1c0
> kmem_cache_alloc+0x7bb/0x830
> btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
> btrfs_free_tree_block+0x128/0x360 [btrfs]
> __btrfs_cow_block+0x489/0x5f0 [btrfs]
> btrfs_cow_block+0xf7/0x220 [btrfs]
> btrfs_search_slot+0x62a/0xc40 [btrfs]
> btrfs_del_orphan_item+0x65/0xd0 [btrfs]
> btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
> open_ctree+0x125a/0x18a0 [btrfs]
> btrfs_mount_root.cold+0x13/0xed [btrfs]
> legacy_get_tree+0x30/0x60
> vfs_get_tree+0x28/0xe0
> fc_mount+0xe/0x40
> vfs_kern_mount.part.0+0x71/0x90
> btrfs_mount+0x13b/0x3e0 [btrfs]
> INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=4292 cpu=2 pid=1729526
> kmem_cache_free+0x34c/0x3c0
> __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
> btrfs_run_delayed_refs+0x81/0x210 [btrfs]
> commit_cowonly_roots+0xfb/0x300 [btrfs]
> btrfs_commit_transaction+0x367/0xc40 [btrfs]
> sync_filesystem+0x74/0x90
> generic_shutdown_super+0x22/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> INFO: Object 0x0000000086e9b0ff @offset=12776
> INFO: Allocated in btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs] age=1900 cpu=6 pid=1729873
> __slab_alloc.isra.0+0x109/0x1c0
> kmem_cache_alloc+0x7bb/0x830
> btrfs_add_delayed_tree_ref+0xbb/0x480 [btrfs]
> btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
> alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
> __btrfs_cow_block+0x12d/0x5f0 [btrfs]
> btrfs_cow_block+0xf7/0x220 [btrfs]
> btrfs_search_slot+0x62a/0xc40 [btrfs]
> btrfs_del_orphan_item+0x65/0xd0 [btrfs]
> btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
> open_ctree+0x125a/0x18a0 [btrfs]
> btrfs_mount_root.cold+0x13/0xed [btrfs]
> legacy_get_tree+0x30/0x60
> vfs_get_tree+0x28/0xe0
> fc_mount+0xe/0x40
> vfs_kern_mount.part.0+0x71/0x90
> INFO: Freed in __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs] age=3141 cpu=6 pid=1729803
> kmem_cache_free+0x34c/0x3c0
> __btrfs_run_delayed_refs+0x1117/0x1290 [btrfs]
> btrfs_run_delayed_refs+0x81/0x210 [btrfs]
> btrfs_write_dirty_block_groups+0x17d/0x3d0 [btrfs]
> commit_cowonly_roots+0x248/0x300 [btrfs]
> btrfs_commit_transaction+0x367/0xc40 [btrfs]
> close_ctree+0x113/0x2fa [btrfs]
> generic_shutdown_super+0x6c/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> kmem_cache_destroy btrfs_delayed_ref_head: Slab cache still has objects
> CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> Call Trace:
> dump_stack+0x8d/0xb5
> kmem_cache_destroy+0x119/0x120
> btrfs_delayed_ref_exit+0x11/0x35 [btrfs]
> exit_btrfs_fs+0xa/0x59 [btrfs]
> __x64_sys_delete_module+0x194/0x260
> ? fpregs_assert_state_consistent+0x1e/0x40
> ? exit_to_user_mode_prepare+0x55/0x1c0
> ? trace_hardirqs_on+0x1b/0xf0
> do_syscall_64+0x33/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f693e305897
> Code: 73 01 c3 48 8b 0d f9 f5 0b (...)
> RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
> RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
> RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
> R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
> R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
> =============================================================================
> BUG btrfs_delayed_tree_ref (Tainted: G B W ): Objects remaining in btrfs_delayed_tree_ref on __kmem_cache_shutdown()
> -----------------------------------------------------------------------------
>
> INFO: Slab 0x0000000011f78dc0 objects=37 used=2 fp=0x0000000032d55d91 flags=0x17fffc000010200
> CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> Call Trace:
> dump_stack+0x8d/0xb5
> slab_err+0xb7/0xdc
> ? lock_acquired+0x199/0x490
> __kmem_cache_shutdown+0x1ac/0x3c0
> ? lock_release+0x20e/0x4c0
> kmem_cache_destroy+0x55/0x120
> btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
> exit_btrfs_fs+0xa/0x59 [btrfs]
> __x64_sys_delete_module+0x194/0x260
> ? fpregs_assert_state_consistent+0x1e/0x40
> ? exit_to_user_mode_prepare+0x55/0x1c0
> ? trace_hardirqs_on+0x1b/0xf0
> do_syscall_64+0x33/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f693e305897
> Code: 73 01 c3 48 8b 0d f9 f5 (...)
> RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
> RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
> RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
> R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
> R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
> INFO: Object 0x000000001a340018 @offset=4408
> INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1917 cpu=6 pid=1729873
> __slab_alloc.isra.0+0x109/0x1c0
> kmem_cache_alloc+0x7bb/0x830
> btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
> btrfs_free_tree_block+0x128/0x360 [btrfs]
> __btrfs_cow_block+0x489/0x5f0 [btrfs]
> btrfs_cow_block+0xf7/0x220 [btrfs]
> btrfs_search_slot+0x62a/0xc40 [btrfs]
> btrfs_del_orphan_item+0x65/0xd0 [btrfs]
> btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
> open_ctree+0x125a/0x18a0 [btrfs]
> btrfs_mount_root.cold+0x13/0xed [btrfs]
> legacy_get_tree+0x30/0x60
> vfs_get_tree+0x28/0xe0
> fc_mount+0xe/0x40
> vfs_kern_mount.part.0+0x71/0x90
> btrfs_mount+0x13b/0x3e0 [btrfs]
> INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=4167 cpu=4 pid=1729795
> kmem_cache_free+0x34c/0x3c0
> __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
> btrfs_run_delayed_refs+0x81/0x210 [btrfs]
> btrfs_commit_transaction+0x60/0xc40 [btrfs]
> create_subvol+0x56a/0x990 [btrfs]
> btrfs_mksubvol+0x3fb/0x4a0 [btrfs]
> __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs]
> btrfs_ioctl_snap_create+0x58/0x80 [btrfs]
> btrfs_ioctl+0x1a92/0x36f0 [btrfs]
> __x64_sys_ioctl+0x83/0xb0
> do_syscall_64+0x33/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> INFO: Object 0x000000002b46292a @offset=13648
> INFO: Allocated in btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs] age=1923 cpu=6 pid=1729873
> __slab_alloc.isra.0+0x109/0x1c0
> kmem_cache_alloc+0x7bb/0x830
> btrfs_add_delayed_tree_ref+0x9e/0x480 [btrfs]
> btrfs_alloc_tree_block+0x2bf/0x360 [btrfs]
> alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
> __btrfs_cow_block+0x12d/0x5f0 [btrfs]
> btrfs_cow_block+0xf7/0x220 [btrfs]
> btrfs_search_slot+0x62a/0xc40 [btrfs]
> btrfs_del_orphan_item+0x65/0xd0 [btrfs]
> btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
> open_ctree+0x125a/0x18a0 [btrfs]
> btrfs_mount_root.cold+0x13/0xed [btrfs]
> legacy_get_tree+0x30/0x60
> vfs_get_tree+0x28/0xe0
> fc_mount+0xe/0x40
> vfs_kern_mount.part.0+0x71/0x90
> INFO: Freed in __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs] age=3164 cpu=6 pid=1729803
> kmem_cache_free+0x34c/0x3c0
> __btrfs_run_delayed_refs+0x63d/0x1290 [btrfs]
> btrfs_run_delayed_refs+0x81/0x210 [btrfs]
> commit_cowonly_roots+0xfb/0x300 [btrfs]
> btrfs_commit_transaction+0x367/0xc40 [btrfs]
> close_ctree+0x113/0x2fa [btrfs]
> generic_shutdown_super+0x6c/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> kmem_cache_destroy btrfs_delayed_tree_ref: Slab cache still has objects
> CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> Call Trace:
> dump_stack+0x8d/0xb5
> kmem_cache_destroy+0x119/0x120
> btrfs_delayed_ref_exit+0x1d/0x35 [btrfs]
> exit_btrfs_fs+0xa/0x59 [btrfs]
> __x64_sys_delete_module+0x194/0x260
> ? fpregs_assert_state_consistent+0x1e/0x40
> ? exit_to_user_mode_prepare+0x55/0x1c0
> ? trace_hardirqs_on+0x1b/0xf0
> do_syscall_64+0x33/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f693e305897
> Code: 73 01 c3 48 8b 0d f9 f5 (...)
> RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
> RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
> RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
> R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
> R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
> =============================================================================
> BUG btrfs_delayed_extent_op (Tainted: G B W ): Objects remaining in btrfs_delayed_extent_op on __kmem_cache_shutdown()
> -----------------------------------------------------------------------------
>
> INFO: Slab 0x00000000f145ce2f objects=22 used=1 fp=0x00000000af0f92cf flags=0x17fffc000010200
> CPU: 5 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> Call Trace:
> dump_stack+0x8d/0xb5
> slab_err+0xb7/0xdc
> ? lock_acquired+0x199/0x490
> __kmem_cache_shutdown+0x1ac/0x3c0
> ? __mutex_unlock_slowpath+0x45/0x2a0
> kmem_cache_destroy+0x55/0x120
> exit_btrfs_fs+0xa/0x59 [btrfs]
> __x64_sys_delete_module+0x194/0x260
> ? fpregs_assert_state_consistent+0x1e/0x40
> ? exit_to_user_mode_prepare+0x55/0x1c0
> ? trace_hardirqs_on+0x1b/0xf0
> do_syscall_64+0x33/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f693e305897
> Code: 73 01 c3 48 8b 0d f9 f5 (...)
> RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
> RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
> RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
> R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
> R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
> INFO: Object 0x000000004cf95ea8 @offset=6264
> INFO: Allocated in btrfs_alloc_tree_block+0x1e0/0x360 [btrfs] age=1931 cpu=6 pid=1729873
> __slab_alloc.isra.0+0x109/0x1c0
> kmem_cache_alloc+0x7bb/0x830
> btrfs_alloc_tree_block+0x1e0/0x360 [btrfs]
> alloc_tree_block_no_bg_flush+0x4f/0x60 [btrfs]
> __btrfs_cow_block+0x12d/0x5f0 [btrfs]
> btrfs_cow_block+0xf7/0x220 [btrfs]
> btrfs_search_slot+0x62a/0xc40 [btrfs]
> btrfs_del_orphan_item+0x65/0xd0 [btrfs]
> btrfs_find_orphan_roots+0x1bf/0x200 [btrfs]
> open_ctree+0x125a/0x18a0 [btrfs]
> btrfs_mount_root.cold+0x13/0xed [btrfs]
> legacy_get_tree+0x30/0x60
> vfs_get_tree+0x28/0xe0
> fc_mount+0xe/0x40
> vfs_kern_mount.part.0+0x71/0x90
> btrfs_mount+0x13b/0x3e0 [btrfs]
> INFO: Freed in __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs] age=3173 cpu=6 pid=1729803
> kmem_cache_free+0x34c/0x3c0
> __btrfs_run_delayed_refs+0xabd/0x1290 [btrfs]
> btrfs_run_delayed_refs+0x81/0x210 [btrfs]
> commit_cowonly_roots+0xfb/0x300 [btrfs]
> btrfs_commit_transaction+0x367/0xc40 [btrfs]
> close_ctree+0x113/0x2fa [btrfs]
> generic_shutdown_super+0x6c/0x100
> kill_anon_super+0x14/0x30
> btrfs_kill_super+0x12/0x20 [btrfs]
> deactivate_locked_super+0x31/0x70
> cleanup_mnt+0x100/0x160
> task_work_run+0x68/0xb0
> exit_to_user_mode_prepare+0x1bb/0x1c0
> syscall_exit_to_user_mode+0x4b/0x260
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> kmem_cache_destroy btrfs_delayed_extent_op: Slab cache still has objects
> CPU: 3 PID: 1729921 Comm: rmmod Tainted: G B W 5.10.0-rc4-btrfs-next-73 #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> Call Trace:
> dump_stack+0x8d/0xb5
> kmem_cache_destroy+0x119/0x120
> exit_btrfs_fs+0xa/0x59 [btrfs]
> __x64_sys_delete_module+0x194/0x260
> ? fpregs_assert_state_consistent+0x1e/0x40
> ? exit_to_user_mode_prepare+0x55/0x1c0
> ? trace_hardirqs_on+0x1b/0xf0
> do_syscall_64+0x33/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f693e305897
> Code: 73 01 c3 48 8b 0d f9 (...)
> RSP: 002b:00007ffcf73eb508 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 0000559df504f760 RCX: 00007f693e305897
> RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559df504f7c8
> RBP: 00007ffcf73eb568 R08: 0000000000000000 R09: 0000000000000000
> R10: 00007f693e378ac0 R11: 0000000000000206 R12: 00007ffcf73eb740
> R13: 00007ffcf73ec5a6 R14: 0000559df504f2a0 R15: 0000559df504f760
> BTRFS: state leak: start 30408704 end 30425087 state 1 in tree 1 refs 1
>
> So fix this by calling btrfs_find_orphan_roots() in the mount path only if
> we are mounting the filesystem in RW mode. It's pointless to have it called
> for RO mounts anyway, since despite adding any deleted roots to the list of
> dead roots, we will never have the roots deleted until the filesystem is
> remounted in RW mode, as the cleaner kthread does nothing when we are
> mounted in RO - btrfs_need_cleaner_sleep() always returns true and the
> cleaner spends all time sleeping, never cleaning dead roots.
>
> This is accomplished by moving the call to btrfs_find_orphan_roots() from
> open_ctree() to btrfs_start_pre_rw_mount(), which also guarantees that
> if later the filesystem is remounted RW, we populate the list of dead
> roots and have the cleaner task delete the dead roots.
>
> Tested-by: Fabian Vogt <fvogt@suse.com>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
> fs/btrfs/disk-io.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 765deefda92b..e941cbae3991 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2969,6 +2969,7 @@ int btrfs_start_pre_rw_mount(struct btrfs_fs_info *fs_info)
> }
> }
>
> + ret = btrfs_find_orphan_roots(fs_info);
> out:
> return ret;
> }
> @@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
> }
> }
>
> - ret = btrfs_find_orphan_roots(fs_info);
> - if (ret)
> - goto fail_qgroup;
> -
> fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
> if (IS_ERR(fs_info->fs_root)) {
> err = PTR_ERR(fs_info->fs_root);
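
As an illustration of the reasoning in the commit message above, here is a toy userspace model in C. It is not kernel code: every toy_ name is hypothetical, and two rules are simplifying assumptions drawn from the description and the call traces. The first is that deleting a stale orphan item starts a transaction. The second is that a read-only filesystem never commits an open transaction at unmount. The model only sketches why deferring the orphan root scan to the pre-RW path avoids the leak on RO mounts while still queueing dead roots once the filesystem becomes writable.

```c
/* Toy model only; none of these names are btrfs kernel APIs. */
#include <stdbool.h>

struct toy_fs {
	bool read_only;          /* current mount mode */
	int stale_orphan_items;  /* orphan items whose root no longer exists */
	int deleted_roots;       /* deleted roots not yet queued for cleaning */
	int dead_roots;          /* roots queued in memory for the cleaner */
	int open_transactions;   /* transactions started but not committed */
};

/*
 * Stand-in for btrfs_find_orphan_roots(): queue deleted roots as dead
 * roots and drop stale orphan items. Dropping an item modifies a tree,
 * so (assumption) it leaves a transaction open.
 */
static int toy_find_orphan_roots(struct toy_fs *fs)
{
	fs->dead_roots += fs->deleted_roots;
	fs->deleted_roots = 0;
	if (fs->stale_orphan_items > 0) {
		fs->stale_orphan_items = 0;
		fs->open_transactions++;
	}
	return 0;
}

/* Stand-in for btrfs_start_pre_rw_mount() after the patch. */
static int toy_start_pre_rw_mount(struct toy_fs *fs)
{
	return toy_find_orphan_roots(fs);
}

/*
 * patched == false: the scan always runs at mount time (old open_ctree()).
 * patched == true:  the scan runs only on the way to a RW mount.
 */
static int toy_mount(struct toy_fs *fs, bool read_only, bool patched)
{
	fs->read_only = read_only;
	if (!patched)
		return toy_find_orphan_roots(fs);
	if (!read_only)
		return toy_start_pre_rw_mount(fs);
	return 0;
}

/* Remount RO -> RW with the patch applied: the scan happens here. */
static int toy_remount_rw(struct toy_fs *fs)
{
	fs->read_only = false;
	return toy_start_pre_rw_mount(fs);
}

/*
 * Unmount: a RW filesystem commits pending transactions first, a RO one
 * (assumption) does not. Returns the number of leaked transactions.
 */
static int toy_umount(struct toy_fs *fs)
{
	if (!fs->read_only)
		fs->open_transactions = 0;
	return fs->open_transactions;
}
```

In this model, an unpatched RO mount with one stale orphan item makes toy_umount() report one leaked transaction. A patched RO mount reports none, and a patched RO mount followed by toy_remount_rw() queues the dead roots and still unmounts cleanly.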
* Re: [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
2021-03-16 6:44 ` robbieko
@ 2021-03-16 11:43 ` Filipe Manana
2021-03-16 16:56 ` Filipe Manana
0 siblings, 1 reply; 13+ messages in thread
From: Filipe Manana @ 2021-03-16 11:43 UTC (permalink / raw)
To: robbieko; +Cc: linux-btrfs
On Tue, Mar 16, 2021 at 6:49 AM robbieko <robbieko@synology.com> wrote:
>
> Hi All,
>
> The patch delayed find orphan roots.
> Move to after orphan cleanup with tree_root.
> I think this will cause all orphan items to be deleted
> when orphan cleanup with tree_root.
> Afterwards, find orphan roots cannot find
> the subvolume being deleted.
Not entirely able to parse what you are trying to say.
I suppose your concern is that the call to:
btrfs_orphan_cleanup(fs_info->tree_root)
which now happens before calling btrfs_find_orphan_roots(), results in
the orphans for roots being accidentally deleted and therefore cause
no root deletions to happen later?
If that's your concern, then it does not happen because
btrfs_orphan_cleanup() skips deletion of orphan items for deleted
roots.
I've just created a test case to verify it's correct, for RW mounts,
RO mounts and remounts from RO to RW:
https://pastebin.com/raw/zSZjgn48
I couldn't find any regression.
Thanks.
> > out:
> > return ret;
> > }
> > @@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
> > }
> > }
> >
> > - ret = btrfs_find_orphan_roots(fs_info);
> > - if (ret)
> > - goto fail_qgroup;
> > -
> > fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
> > if (IS_ERR(fs_info->fs_root)) {
> > err = PTR_ERR(fs_info->fs_root);
* Re: [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount
2021-03-16 11:43 ` Filipe Manana
@ 2021-03-16 16:56 ` Filipe Manana
0 siblings, 0 replies; 13+ messages in thread
From: Filipe Manana @ 2021-03-16 16:56 UTC (permalink / raw)
To: robbieko; +Cc: linux-btrfs
On Tue, Mar 16, 2021 at 11:43 AM Filipe Manana <fdmanana@kernel.org> wrote:
>
> On Tue, Mar 16, 2021 at 6:49 AM robbieko <robbieko@synology.com> wrote:
> >
> > Hi All,
> >
> > The patch delayed find orphan roots.
> > Move to after orphan cleanup with tree_root.
> > I think this will cause all orphan items to be deleted
> > when orphan cleanup with tree_root.
> > Afterwards, find orphan roots cannot find
> > the subvolume being deleted.
>
> Not entirely able to parse what you are trying to say.
>
> I suppose your concern is that the call to:
>
> btrfs_orphan_cleanup(fs_info->tree_root)
>
> which now happens before calling btrfs_find_orphan_roots(), results in
> the orphans for roots being accidentally deleted and therefore cause
> no root deletions to happen later?
> If that's your concern, then it does not happen because
> btrfs_orphan_cleanup() skips deletion of orphan items for deleted
> roots.
>
> I've just created a test case to verify it's correct, for RW mounts,
> RO mounts and remounts from RO to RW:
>
> https://pastebin.com/raw/zSZjgn48
>
> I couldn't find any regression.
Ok, I figured out what you meant, and the test was not checking the
btree was deleted, only the orphan items.
I just sent a fix and an updated test case.
Thanks for the report.
>
> Thanks.
>
>
> > > out:
> > > return ret;
> > > }
> > > @@ -3383,10 +3384,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
> > > }
> > > }
> > >
> > > - ret = btrfs_find_orphan_roots(fs_info);
> > > - if (ret)
> > > - goto fail_qgroup;
> > > -
> > > fs_info->fs_root = btrfs_get_fs_root(fs_info, BTRFS_FS_TREE_OBJECTID, true);
> > > if (IS_ERR(fs_info->fs_root)) {
> > > err = PTR_ERR(fs_info->fs_root);
Thread overview: 13+ messages
2020-12-14 10:10 [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount fdmanana
2020-12-14 10:10 ` [PATCH 1/5] btrfs: fix transaction leak and crash after RO remount caused by qgroup rescan fdmanana
2020-12-17 17:44 ` David Sterba
2020-12-17 18:21 ` Filipe Manana
2020-12-14 10:10 ` [PATCH 2/5] btrfs: fix transaction leak and crash after cleaning up orphans on RO mount fdmanana
2021-03-16 6:44 ` robbieko
2021-03-16 11:43 ` Filipe Manana
2021-03-16 16:56 ` Filipe Manana
2020-12-14 10:10 ` [PATCH 3/5] btrfs: fix race between RO remount and the cleaner task fdmanana
2020-12-14 10:10 ` [PATCH 4/5] btrfs: add assertion for empty list of transactions at late stage of umount fdmanana
2020-12-14 10:10 ` [PATCH 5/5] btrfs: run delayed iputs when remounting RO to avoid leaking them fdmanana
2020-12-17 16:26 ` [PATCH 0/5] btrfs: fix transaction leaks and crashes during unmount Josef Bacik
2020-12-17 18:08 ` David Sterba