linux-btrfs.vger.kernel.org archive mirror
* [PATCH] btrfs: fix use-after-free of new block group that became unused
@ 2023-06-28 16:13 fdmanana
  2023-06-29 18:00 ` David Sterba
  0 siblings, 1 reply; 2+ messages in thread
From: fdmanana @ 2023-06-28 16:13 UTC
  To: linux-btrfs

From: Filipe Manana <fdmanana@suse.com>

If a task creates a new block group and that block group becomes unused
before we finish its creation, at btrfs_create_pending_block_groups(),
then when btrfs_mark_bg_unused() is called against the block group, we
assume that the block group is currently in the list of block groups to
reclaim, and we move it out of the list of new block groups and into the
list of unused block groups. This has two consequences:

1) We move it out of the list of new block groups associated with the
   current transaction. So the block group creation is not finished and
   if we attempt to delete the bg because it's unused, we will not find
   the block group item in the extent tree (or the new block group tree),
   its device extent items in the device tree, etc., causing the
   deletion to fail due to the missing items;

2) We don't increment the reference count on the block group when we
   move it to the list of unused block groups, because we assumed the
   block group was on the list of block groups to reclaim, and in that
   case it already has the correct reference count. However, the block
   group was on the list of new block groups, in which case no extra
   reference was taken because it's local to the current task. This
   later results in an extra reference count decrement when removing
   the block group from the unused list, eventually dropping the
   reference count to 0.
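
The reference count imbalance described in 2) can be illustrated with a
small userspace model. This is only a sketch under stated assumptions:
struct model_bg, enum model_list and the model_*() helpers are
hypothetical names invented for illustration, plain ints stand in for
refcount_t, and no locking is modelled. It is not the kernel code.

```c
#include <assert.h>

/* Hypothetical model: which list the bg_list member is linked into. */
enum model_list {
	MODEL_LIST_NONE,
	MODEL_LIST_NEW,      /* new block groups of the current transaction */
	MODEL_LIST_RECLAIM,  /* block groups to reclaim */
	MODEL_LIST_UNUSED,   /* unused block groups */
};

/* Hypothetical stand-in for the block group; not struct btrfs_block_group. */
struct model_bg {
	int refs;
	enum model_list list;
};

/* Models the pre-fix logic: a non-empty bg_list is assumed to be reclaim. */
static void model_mark_unused_old(struct model_bg *bg)
{
	if (bg->list == MODEL_LIST_NONE) {
		bg->refs++;	/* the unused list takes its own reference */
		bg->list = MODEL_LIST_UNUSED;
	} else {
		/* No increment: only correct if the list already held a ref. */
		bg->list = MODEL_LIST_UNUSED;
	}
}

/* Removing from the unused list always drops the reference of that list. */
static void model_remove_from_unused(struct model_bg *bg)
{
	bg->list = MODEL_LIST_NONE;
	bg->refs--;
}

/* Runs one mark-unused / remove cycle and returns the resulting refcount. */
static int model_refs_after_cycle(enum model_list start, int start_refs)
{
	struct model_bg bg = { .refs = start_refs, .list = start };

	model_mark_unused_old(&bg);
	model_remove_from_unused(&bg);
	return bg.refs;
}
```

In this model, starting from the reclaim list with 2 references (one of
them held by the list) ends the cycle with 1 reference, while starting
from the new list with 1 reference ends the cycle with 0, which is the
imbalance described above.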

This second case was caught when running generic/297 from fstests, which
produced the following assertion failure and stack trace:

   [457589.559668] assertion failed: refcount_read(&block_group->refs) == 1, in fs/btrfs/block-group.c:4299
   [457589.559931] ------------[ cut here ]------------
   [457589.559932] kernel BUG at fs/btrfs/block-group.c:4299!
   [457589.560168] invalid opcode: 0000 [#1] PREEMPT SMP PTI
   [457589.560381] CPU: 8 PID: 2819134 Comm: umount Tainted: G        W          6.4.0-rc6-btrfs-next-134+ #1
   [457589.560630] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
   [457589.560871] RIP: 0010:btrfs_free_block_groups+0x449/0x4a0 [btrfs]
   [457589.561181] Code: 68 62 da c0 (...)
   [457589.561711] RSP: 0018:ffffa55a8c3b3d98 EFLAGS: 00010246
   [457589.561957] RAX: 0000000000000058 RBX: ffff8f030d7f2000 RCX: 0000000000000000
   [457589.562202] RDX: 0000000000000000 RSI: ffffffff953f0878 RDI: 00000000ffffffff
   [457589.562442] RBP: ffff8f030d7f2088 R08: 0000000000000000 R09: ffffa55a8c3b3c50
   [457589.562680] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8f05850b4c00
   [457589.562921] R13: ffff8f030d7f2090 R14: ffff8f05850b4cd8 R15: dead000000000100
   [457589.563167] FS:  00007f497fd2e840(0000) GS:ffff8f09dfc00000(0000) knlGS:0000000000000000
   [457589.563419] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   [457589.563673] CR2: 00007f497ff8ec10 CR3: 0000000271472006 CR4: 0000000000370ee0
   [457589.563934] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
   [457589.564196] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
   [457589.564460] Call Trace:
   [457589.564771]  <TASK>
   [457589.565032]  ? __die_body+0x1b/0x60
   [457589.565290]  ? die+0x39/0x60
   [457589.565571]  ? do_trap+0xeb/0x110
   [457589.565818]  ? btrfs_free_block_groups+0x449/0x4a0 [btrfs]
   [457589.566109]  ? do_error_trap+0x6a/0x90
   [457589.566347]  ? btrfs_free_block_groups+0x449/0x4a0 [btrfs]
   [457589.566623]  ? exc_invalid_op+0x4e/0x70
   [457589.566854]  ? btrfs_free_block_groups+0x449/0x4a0 [btrfs]
   [457589.567122]  ? asm_exc_invalid_op+0x16/0x20
   [457589.567352]  ? btrfs_free_block_groups+0x449/0x4a0 [btrfs]
   [457589.567624]  ? btrfs_free_block_groups+0x449/0x4a0 [btrfs]
   [457589.567895]  close_ctree+0x35d/0x560 [btrfs]
   [457589.568164]  ? fsnotify_sb_delete+0x13e/0x1d0
   [457589.568401]  ? dispose_list+0x3a/0x50
   [457589.568671]  ? evict_inodes+0x151/0x1a0
   [457589.568897]  generic_shutdown_super+0x73/0x1a0
   [457589.569128]  kill_anon_super+0x14/0x30
   [457589.569358]  btrfs_kill_super+0x12/0x20 [btrfs]
   [457589.569673]  deactivate_locked_super+0x2e/0x70
   [457589.569901]  cleanup_mnt+0x104/0x160
   [457589.570156]  task_work_run+0x56/0x90
   [457589.570500]  exit_to_user_mode_prepare+0x160/0x170
   [457589.570750]  syscall_exit_to_user_mode+0x22/0x50
   [457589.570971]  ? __x64_sys_umount+0x12/0x20
   [457589.571190]  do_syscall_64+0x48/0x90
   [457589.571412]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
   [457589.571639] RIP: 0033:0x7f497ff0a567
   [457589.571865] Code: af 98 0e (...)
   [457589.572348] RSP: 002b:00007ffc98347358 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
   [457589.572641] RAX: 0000000000000000 RBX: 00007f49800b8264 RCX: 00007f497ff0a567
   [457589.572883] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000557f558abfa0
   [457589.573124] RBP: 0000557f558a6ba0 R08: 0000000000000000 R09: 00007ffc98346100
   [457589.573359] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
   [457589.573628] R13: 0000557f558abfa0 R14: 0000557f558a6cb0 R15: 0000557f558a6dd0
   [457589.573853]  </TASK>
   [457589.574064] Modules linked in: dm_snapshot dm_thin_pool (...)
   [457589.576327] ---[ end trace 0000000000000000 ]---

Fix this by adding a runtime flag to the block group that indicates it
is still in the list of new block groups. While the flag is set,
btrfs_mark_bg_unused() must not move the block group to the list of
unused block groups. The flag is cleared when we finish the creation of
the block group at btrfs_create_pending_block_groups().
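
The effect of the flag can be sketched with a simplified userspace
model. The names here (struct sketch_bg, enum sketch_list and the
sketch_*() helpers) are hypothetical and exist only for illustration; a
bool stands in for the BLOCK_GROUP_FLAG_NEW bit and no locking is
modelled. The real change is the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: which list the bg_list member is linked into. */
enum sketch_list {
	SKETCH_LIST_NONE,
	SKETCH_LIST_NEW,
	SKETCH_LIST_RECLAIM,
	SKETCH_LIST_UNUSED,
};

/* Hypothetical stand-in for the block group; not struct btrfs_block_group. */
struct sketch_bg {
	int refs;
	enum sketch_list list;
	bool is_new;	/* stands in for BLOCK_GROUP_FLAG_NEW */
};

/* Models creation: the flag is set before the group is put on any list. */
static void sketch_make_block_group(struct sketch_bg *bg)
{
	bg->refs = 1;
	bg->is_new = true;
	bg->list = SKETCH_LIST_NEW;
}

/* Models the fixed logic: a group still flagged as new is left alone. */
static void sketch_mark_unused(struct sketch_bg *bg)
{
	if (bg->list == SKETCH_LIST_NONE) {
		bg->refs++;	/* the unused list takes its own reference */
		bg->list = SKETCH_LIST_UNUSED;
	} else if (!bg->is_new) {
		/* On the reclaim list, which already holds a reference. */
		bg->list = SKETCH_LIST_UNUSED;
	}
	/* Otherwise the group is on the new list and must stay there. */
}

/* Models the end of creation: unlink from the new list, clear the flag. */
static void sketch_finish_creation(struct sketch_bg *bg)
{
	bg->list = SKETCH_LIST_NONE;
	bg->is_new = false;
}
```

In this model, a call to sketch_mark_unused() made before
sketch_finish_creation() changes neither the list nor the reference
count. A call made afterwards takes the first branch and adds the
reference that the unused list expects.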

Fixes: a9f189716cf1 ("btrfs: move out now unused BG from the reclaim list")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 fs/btrfs/block-group.c | 13 +++++++++++--
 fs/btrfs/block-group.h |  5 +++++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 6753524b146c..f53297726238 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1640,13 +1640,14 @@ void btrfs_mark_bg_unused(struct btrfs_block_group *bg)
 {
 	struct btrfs_fs_info *fs_info = bg->fs_info;
 
-	trace_btrfs_add_unused_block_group(bg);
 	spin_lock(&fs_info->unused_bgs_lock);
 	if (list_empty(&bg->bg_list)) {
 		btrfs_get_block_group(bg);
+		trace_btrfs_add_unused_block_group(bg);
 		list_add_tail(&bg->bg_list, &fs_info->unused_bgs);
-	} else {
+	} else if (!test_bit(BLOCK_GROUP_FLAG_NEW, &bg->runtime_flags)) {
 		/* Pull out the block group from the reclaim_bgs list. */
+		trace_btrfs_add_unused_block_group(bg);
 		list_move_tail(&bg->bg_list, &fs_info->unused_bgs);
 	}
 	spin_unlock(&fs_info->unused_bgs_lock);
@@ -2668,6 +2669,7 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans)
 next:
 		btrfs_delayed_refs_rsv_release(fs_info, 1);
 		list_del_init(&block_group->bg_list);
+		clear_bit(BLOCK_GROUP_FLAG_NEW, &block_group->runtime_flags);
 	}
 	btrfs_trans_release_chunk_metadata(trans);
 }
@@ -2707,6 +2709,13 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
 	if (!cache)
 		return ERR_PTR(-ENOMEM);
 
+	/*
+	 * Mark it as new before adding it to the rbtree of block groups or any
+	 * list, so that no other task finds it and calls btrfs_mark_bg_unused()
+	 * before the new flag is set.
+	 */
+	set_bit(BLOCK_GROUP_FLAG_NEW, &cache->runtime_flags);
+
 	cache->length = size;
 	set_free_space_tree_thresholds(cache);
 	cache->flags = type;
diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index f204addc3fe8..0ed31112d932 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -70,6 +70,11 @@ enum btrfs_block_group_flags {
 	BLOCK_GROUP_FLAG_NEEDS_FREE_SPACE,
 	/* Indicate that the block group is placed on a sequential zone */
 	BLOCK_GROUP_FLAG_SEQUENTIAL_ZONE,
+	/*
+	 * Indicate the block group is in the list of new block groups of a
+	 * transaction.
+	 */
+	BLOCK_GROUP_FLAG_NEW,
 };
 
 enum btrfs_caching_type {
-- 
2.34.1



* Re: [PATCH] btrfs: fix use-after-free of new block group that became unused
  2023-06-28 16:13 [PATCH] btrfs: fix use-after-free of new block group that became unused fdmanana
@ 2023-06-29 18:00 ` David Sterba
  0 siblings, 0 replies; 2+ messages in thread
From: David Sterba @ 2023-06-29 18:00 UTC
  To: fdmanana; +Cc: linux-btrfs

On Wed, Jun 28, 2023 at 05:13:37PM +0100, fdmanana@kernel.org wrote:
> [...]

Added to misc-next, thanks.

