All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
@ 2021-09-14  6:57 Qu Wenruo
  2021-09-17 13:42 ` David Sterba
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Qu Wenruo @ 2021-09-14  6:57 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Hao Sun

[BUG]
There is a very detailed bug report that injected ENOMEM error could
leave a tree block locked while we return to user-space:

  BTRFS info (device loop0): enabling ssd optimizations
  FAULT_INJECTION: forcing a failure.
  name failslab, interval 1, probability 0, space 0, times 0
  CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
  rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
  Call Trace:
   __dump_stack lib/dump_stack.c:88 [inline]
   dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
   fail_dump lib/fault-inject.c:52 [inline]
   should_fail+0x13c/0x160 lib/fault-inject.c:146
   should_failslab+0x5/0x10 mm/slab_common.c:1328
   slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
   slab_alloc_node mm/slub.c:3120 [inline]
   slab_alloc mm/slub.c:3214 [inline]
   kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
   btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline]
   btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833
   __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415
   btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570
   btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768
   btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905
   btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530
   btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783
   lookup_open+0x660/0x780 fs/namei.c:3282
   open_last_lookups fs/namei.c:3352 [inline]
   path_openat+0x465/0xe20 fs/namei.c:3557
   do_filp_open+0xe3/0x170 fs/namei.c:3588
   do_sys_openat2+0x357/0x4a0 fs/open.c:1200
   do_sys_open+0x87/0xd0 fs/open.c:1216
   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
   do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x46ae99
  Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
  89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
  01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
  RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99
  RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800
  RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017
  R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0

  ================================================
  WARNING: lock held when returning to user space!
  5.15.0-rc1 #16 Not tainted
  ------------------------------------------------
  syz-executor/7579 is leaving the kernel with locks still held!
  1 lock held by syz-executor/7579:
   #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
  __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112

[CAUSE]
In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
extent buffer @buf is locked, but if later operations like adding
delayed tree ref fails, we just free @buf without unlocking it,
resulting above warning.

[FIX]
Unlock @buf in out_free_buf: tag.

Reported-by: Hao Sun <sunhao.th@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/extent-tree.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c88e7727a31a..8aa981ffe7b7 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4898,6 +4898,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 out_free_delayed:
 	btrfs_free_delayed_extent_op(extent_op);
 out_free_buf:
+	btrfs_tree_unlock(buf);
 	free_extent_buffer(buf);
 out_free_reserved:
 	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 0);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2021-09-14  6:57 [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block() Qu Wenruo
@ 2021-09-17 13:42 ` David Sterba
  2021-09-20  8:48 ` David Sterba
  2022-03-06 16:36 ` Denis Efremov
  2 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2021-09-17 13:42 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Hao Sun

On Tue, Sep 14, 2021 at 02:57:59PM +0800, Qu Wenruo wrote:
> [BUG]
> There is a very detailed bug report that injected ENOMEM error could
> leave a tree block locked while we return to user-space:
> 
>   BTRFS info (device loop0): enabling ssd optimizations
>   FAULT_INJECTION: forcing a failure.
>   name failslab, interval 1, probability 0, space 0, times 0
>   CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>   rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    __dump_stack lib/dump_stack.c:88 [inline]
>    dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
>    fail_dump lib/fault-inject.c:52 [inline]
>    should_fail+0x13c/0x160 lib/fault-inject.c:146
>    should_failslab+0x5/0x10 mm/slab_common.c:1328
>    slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
>    slab_alloc_node mm/slub.c:3120 [inline]
>    slab_alloc mm/slub.c:3214 [inline]
>    kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
>    btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline]
>    btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833
>    __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415
>    btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570
>    btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768
>    btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905
>    btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530
>    btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783
>    lookup_open+0x660/0x780 fs/namei.c:3282
>    open_last_lookups fs/namei.c:3352 [inline]
>    path_openat+0x465/0xe20 fs/namei.c:3557
>    do_filp_open+0xe3/0x170 fs/namei.c:3588
>    do_sys_openat2+0x357/0x4a0 fs/open.c:1200
>    do_sys_open+0x87/0xd0 fs/open.c:1216
>    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>    do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
>    entry_SYSCALL_64_after_hwframe+0x44/0xae
>   RIP: 0033:0x46ae99
>   Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
>   89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
>   01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
>   RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
>   RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99
>   RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800
>   RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017
>   R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0
> 
>   ================================================
>   WARNING: lock held when returning to user space!
>   5.15.0-rc1 #16 Not tainted
>   ------------------------------------------------
>   syz-executor/7579 is leaving the kernel with locks still held!
>   1 lock held by syz-executor/7579:
>    #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
>   __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112
> 
> [CAUSE]
> In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
> extent buffer @buf is locked, but if later operations like adding
> delayed tree ref fails, we just free @buf without unlocking it,
> resulting above warning.
> 
> [FIX]
> Unlock @buf in out_free_buf: tag.
> 
> Reported-by: Hao Sun <sunhao.th@gmail.com>
> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Added to misc-next, thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2021-09-14  6:57 [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block() Qu Wenruo
  2021-09-17 13:42 ` David Sterba
@ 2021-09-20  8:48 ` David Sterba
  2021-09-20  8:54   ` Qu Wenruo
                     ` (2 more replies)
  2022-03-06 16:36 ` Denis Efremov
  2 siblings, 3 replies; 10+ messages in thread
From: David Sterba @ 2021-09-20  8:48 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Hao Sun

On Tue, Sep 14, 2021 at 02:57:59PM +0800, Qu Wenruo wrote:
> [BUG]
> There is a very detailed bug report that injected ENOMEM error could
> leave a tree block locked while we return to user-space:
> 
>   BTRFS info (device loop0): enabling ssd optimizations
>   FAULT_INJECTION: forcing a failure.
>   name failslab, interval 1, probability 0, space 0, times 0
>   CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>   rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>    __dump_stack lib/dump_stack.c:88 [inline]
>    dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
>    fail_dump lib/fault-inject.c:52 [inline]
>    should_fail+0x13c/0x160 lib/fault-inject.c:146
>    should_failslab+0x5/0x10 mm/slab_common.c:1328
>    slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
>    slab_alloc_node mm/slub.c:3120 [inline]
>    slab_alloc mm/slub.c:3214 [inline]
>    kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
>    btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline]
>    btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833
>    __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415
>    btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570
>    btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768
>    btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905
>    btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530
>    btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783
>    lookup_open+0x660/0x780 fs/namei.c:3282
>    open_last_lookups fs/namei.c:3352 [inline]
>    path_openat+0x465/0xe20 fs/namei.c:3557
>    do_filp_open+0xe3/0x170 fs/namei.c:3588
>    do_sys_openat2+0x357/0x4a0 fs/open.c:1200
>    do_sys_open+0x87/0xd0 fs/open.c:1216
>    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>    do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
>    entry_SYSCALL_64_after_hwframe+0x44/0xae
>   RIP: 0033:0x46ae99
>   Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
>   89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
>   01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
>   RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
>   RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99
>   RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800
>   RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000
>   R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017
>   R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0
> 
>   ================================================
>   WARNING: lock held when returning to user space!
>   5.15.0-rc1 #16 Not tainted
>   ------------------------------------------------
>   syz-executor/7579 is leaving the kernel with locks still held!
>   1 lock held by syz-executor/7579:
>    #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
>   __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112
> 
> [CAUSE]
> In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
> extent buffer @buf is locked, but if later operations like adding
> delayed tree ref fails, we just free @buf without unlocking it,
> resulting above warning.
> 
> [FIX]
> Unlock @buf in out_free_buf: tag.
> 
> Reported-by: Hao Sun <sunhao.th@gmail.com>
> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
> Signed-off-by: Qu Wenruo <wqu@suse.com>

I found the following lockdep report, it's been with recent misc-next
and the functions on stack match what this patch touches but I haven't
done a deeper analysis so this could be a false trace (though the
warning seems legit).

The workload was some file creation/copy and relocation but I don't have
a more specific information.

[10898.966572] BTRFS info (device sdd10): balance: start -musage=50 -susage=50
[10898.980261] BTRFS info (device sdd10): relocating block group 195519578112 flags system
[10906.757623] BTRFS info (device sdd10): relocating block group 191148064768 flags metadata
[11401.635794] 
[11401.637392] ======================================================
[11401.643647] WARNING: possible circular locking dependency detected
[11401.649956] 5.15.0-rc1-git+ #810 Not tainted
[11401.654315] ------------------------------------------------------
[11401.660588] btrfs/11698 is trying to acquire lock:
[11401.665467] ffff8bb384bf7068 (btrfs-treloc-03#2){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11401.675102] 
[11401.675102] but task is already holding lock:
[11401.681029] ffff8bb3ce5a4110 (btrfs-tree-02/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11401.690417] 
[11401.690417] which lock already depends on the new lock.
[11401.690417] 
[11401.698700] 
[11401.698700] the existing dependency chain (in reverse order) is:
[11401.706272] 
[11401.706272] -> #2 (btrfs-tree-02/1){+.+.}-{3:3}:
[11401.712473]        __lock_acquire+0x3e5/0x730
[11401.716906]        lock_acquire.part.0+0x5f/0x190
[11401.721700]        down_write_nested+0x49/0x130
[11401.726310]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11401.731744]        btrfs_init_new_buffer+0x7d/0x2c0 [btrfs]
[11401.737510]        btrfs_alloc_tree_block+0x13b/0x350 [btrfs]
[11401.743459]        __btrfs_cow_block+0x144/0x600 [btrfs]
[11401.748967]        btrfs_cow_block+0x107/0x160 [btrfs]
[11401.754306]        btrfs_search_slot+0x67a/0xc00 [btrfs]
[11401.759809]        btrfs_lookup_dir_item+0x7c/0xe0 [btrfs]
[11401.765507]        __btrfs_unlink_inode+0xaa/0x4f0 [btrfs]
[11401.771184]        btrfs_unlink+0x87/0x100 [btrfs]
[11401.776154]        vfs_unlink+0x101/0x210
[11401.780250]        do_unlinkat+0x19c/0x2c0
[11401.784435]        __x64_sys_unlinkat+0x34/0x60
[11401.789057]        do_syscall_64+0x3d/0xb0
[11401.793253]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[11401.798914] 
[11401.798914] -> #1 (btrfs-tree-02){++++}-{3:3}:
[11401.804947]        __lock_acquire+0x3e5/0x730
[11401.809381]        lock_acquire.part.0+0x5f/0x190
[11401.814185]        down_write_nested+0x49/0x130
[11401.818810]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11401.824247]        btrfs_search_slot+0x2a1/0xc00 [btrfs]
[11401.829759]        do_relocation+0x12c/0x6f0 [btrfs]
[11401.834942]        relocate_tree_block+0x1a6/0x270 [btrfs]
[11401.840646]        relocate_tree_blocks+0xe8/0x260 [btrfs]
[11401.846358]        relocate_block_group+0x200/0x580 [btrfs]
[11401.852146]        btrfs_relocate_block_group+0x18b/0x350 [btrfs]
[11401.858455]        btrfs_relocate_chunk+0x38/0x120 [btrfs]
[11401.864166]        __btrfs_balance+0x2ea/0x490 [btrfs]
[11401.869511]        btrfs_balance+0x4d3/0x7c0 [btrfs]
[11401.874684]        btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
[11401.880382]        __x64_sys_ioctl+0x83/0xa0
[11401.884736]        do_syscall_64+0x3d/0xb0
[11401.888924]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[11401.894573] 
[11401.894573] -> #0 (btrfs-treloc-03#2){+.+.}-{3:3}:
[11401.900946]        check_prev_add+0x91/0xc30
[11401.905293]        validate_chain+0x56f/0x840
[11401.909740]        __lock_acquire+0x3e5/0x730
[11401.914186]        lock_acquire.part.0+0x5f/0x190
[11401.918979]        down_write_nested+0x49/0x130
[11401.923607]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11401.929064]        btrfs_lock_root_node+0x31/0x40 [btrfs]
[11401.934671]        btrfs_search_slot+0x58e/0xc00 [btrfs]
[11401.940164]        replace_path+0x57e/0xc70 [btrfs]
[11401.945241]        merge_reloc_root+0x222/0x8c0 [btrfs]
[11401.950667]        merge_reloc_roots+0xf0/0x290 [btrfs]
[11401.956119]        relocate_block_group+0x2d7/0x580 [btrfs]
[11401.961910]        btrfs_relocate_block_group+0x18b/0x350 [btrfs]
[11401.968204]        btrfs_relocate_chunk+0x38/0x120 [btrfs]
[11401.973905]        __btrfs_balance+0x2ea/0x490 [btrfs]
[11401.979239]        btrfs_balance+0x4d3/0x7c0 [btrfs]
[11401.984423]        btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
[11401.990117]        __x64_sys_ioctl+0x83/0xa0
[11401.994456]        do_syscall_64+0x3d/0xb0
[11401.998642]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[11402.004323] 
[11402.004323] other info that might help us debug this:
[11402.004323] 
[11402.012465] Chain exists of:
[11402.012465]   btrfs-treloc-03#2 --> btrfs-tree-02 --> btrfs-tree-02/1
[11402.012465] 
[11402.023382]  Possible unsafe locking scenario:
[11402.023382] 
[11402.029395]        CPU0                    CPU1
[11402.034004]        ----                    ----
[11402.038622]   lock(btrfs-tree-02/1);
[11402.042287]                                lock(btrfs-tree-02);
[11402.048288]                                lock(btrfs-tree-02/1);
[11402.054476]   lock(btrfs-treloc-03#2);
[11402.060166] 
[11402.060166]  *** DEADLOCK ***
[11402.060166] 
[11402.066212] 5 locks held by btrfs/11698:
[11402.070210]  #0: ffff8bb4fab09478 (sb_writers#9){.+.+}-{0:0}, at: btrfs_ioctl_balance+0x47/0x3e0 [btrfs]
[11402.079923]  #1: ffff8bb48520a370 (&fs_info->reclaim_bgs_lock){+.+.}-{3:3}, at: __btrfs_balance+0x179/0x490 [btrfs]
[11402.090584]  #2: ffff8bb4852088e0 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x183/0x350 [btrfs]
[11402.101965]  #3: ffff8bb4fab09698 (sb_internal#2){.+.+}-{0:0}, at: merge_reloc_root+0x11c/0x8c0 [btrfs]
[11402.111616]  #4: ffff8bb3ce5a4110 (btrfs-tree-02/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11402.121418] 
[11402.121418] stack backtrace:
[11402.125880] CPU: 7 PID: 11698 Comm: btrfs Not tainted 5.15.0-rc1-git+ #810
[11402.132843] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008
[11402.139470] Call Trace:
[11402.141989]  dump_stack_lvl+0x45/0x59
[11402.145736]  check_noncircular+0xf3/0x110
[11402.149827]  ? check_irq_usage+0xaa/0x3f0
[11402.153943]  check_prev_add+0x91/0xc30
[11402.157783]  validate_chain+0x56f/0x840
[11402.161719]  __lock_acquire+0x3e5/0x730
[11402.165654]  lock_acquire.part.0+0x5f/0x190
[11402.169924]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11402.175013]  ? lock_acquire+0xa0/0x150
[11402.178839]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11402.183949]  down_write_nested+0x49/0x130
[11402.188043]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11402.193127]  __btrfs_tree_lock+0x2c/0x140 [btrfs]
[11402.198039]  btrfs_lock_root_node+0x31/0x40 [btrfs]
[11402.203148]  btrfs_search_slot+0x58e/0xc00 [btrfs]
[11402.208117]  replace_path+0x57e/0xc70 [btrfs]
[11402.212680]  merge_reloc_root+0x222/0x8c0 [btrfs]
[11402.217593]  ? lock_release+0x68/0x140
[11402.221427]  ? _raw_spin_unlock+0x1f/0x40
[11402.225522]  merge_reloc_roots+0xf0/0x290 [btrfs]
[11402.230439]  relocate_block_group+0x2d7/0x580 [btrfs]
[11402.235721]  btrfs_relocate_block_group+0x18b/0x350 [btrfs]
[11402.241521]  btrfs_relocate_chunk+0x38/0x120 [btrfs]
[11402.246688]  __btrfs_balance+0x2ea/0x490 [btrfs]
[11402.251524]  btrfs_balance+0x4d3/0x7c0 [btrfs]
[11402.256181]  btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
[11402.261377]  __x64_sys_ioctl+0x83/0xa0
[11402.265205]  do_syscall_64+0x3d/0xb0
[11402.268853]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[11402.273985] RIP: 0033:0x7fedef5ff6c7
[11402.277642] Code: 00 00 00 48 8b 05 c1 67 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 67 2b 00 f7 d8 64 89 01 48
[11402.296539] RSP: 002b:00007ffda9289e78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[11402.304211] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fedef5ff6c7
[11402.311439] RDX: 00007ffda9289f10 RSI: 00000000c4009420 RDI: 0000000000000003
[11402.318657] RBP: 00007ffda928c866 R08: 00000000004c4cab R09: 0000000000000013
[11402.325877] R10: 0000000022494966 R11: 0000000000000246 R12: 0000000000000001
[11402.333115] R13: 00007ffda9289f10 R14: 0000000000000000 R15: 00007ffda9289f08
[11432.564100] BTRFS info (device sdd10): found 30943 extents, stage: move data extents

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2021-09-20  8:48 ` David Sterba
@ 2021-09-20  8:54   ` Qu Wenruo
  2021-09-20  8:56   ` Johannes Thumshirn
  2021-09-20 10:16   ` Filipe Manana
  2 siblings, 0 replies; 10+ messages in thread
From: Qu Wenruo @ 2021-09-20  8:54 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs, Hao Sun



On 2021/9/20 16:48, David Sterba wrote:
> On Tue, Sep 14, 2021 at 02:57:59PM +0800, Qu Wenruo wrote:
>> [BUG]
>> There is a very detailed bug report that injected ENOMEM error could
>> leave a tree block locked while we return to user-space:
>>
>>    BTRFS info (device loop0): enabling ssd optimizations
>>    FAULT_INJECTION: forcing a failure.
>>    name failslab, interval 1, probability 0, space 0, times 0
>>    CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16
>>    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
>>    rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>>    Call Trace:
>>     __dump_stack lib/dump_stack.c:88 [inline]
>>     dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
>>     fail_dump lib/fault-inject.c:52 [inline]
>>     should_fail+0x13c/0x160 lib/fault-inject.c:146
>>     should_failslab+0x5/0x10 mm/slab_common.c:1328
>>     slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
>>     slab_alloc_node mm/slub.c:3120 [inline]
>>     slab_alloc mm/slub.c:3214 [inline]
>>     kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
>>     btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline]
>>     btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833
>>     __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415
>>     btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570
>>     btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768
>>     btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905
>>     btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530
>>     btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783
>>     lookup_open+0x660/0x780 fs/namei.c:3282
>>     open_last_lookups fs/namei.c:3352 [inline]
>>     path_openat+0x465/0xe20 fs/namei.c:3557
>>     do_filp_open+0xe3/0x170 fs/namei.c:3588
>>     do_sys_openat2+0x357/0x4a0 fs/open.c:1200
>>     do_sys_open+0x87/0xd0 fs/open.c:1216
>>     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>     do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
>>     entry_SYSCALL_64_after_hwframe+0x44/0xae
>>    RIP: 0033:0x46ae99
>>    Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
>>    89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
>>    01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
>>    RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
>>    RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99
>>    RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800
>>    RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000
>>    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017
>>    R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0
>>
>>    ================================================
>>    WARNING: lock held when returning to user space!
>>    5.15.0-rc1 #16 Not tainted
>>    ------------------------------------------------
>>    syz-executor/7579 is leaving the kernel with locks still held!
>>    1 lock held by syz-executor/7579:
>>     #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
>>    __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112
>>
>> [CAUSE]
>> In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
>> extent buffer @buf is locked, but if later operations like adding
>> delayed tree ref fails, we just free @buf without unlocking it,
>> resulting above warning.
>>
>> [FIX]
>> Unlock @buf in out_free_buf: tag.
>>
>> Reported-by: Hao Sun <sunhao.th@gmail.com>
>> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>
> I found the following lockdep report, it's been with recent misc-next
> and the functions on stack match what this patch touches but I haven't
> done a deeper analysis so this could be a false trace (though the
> warning seems legit).
>
> The workload was some file creation/copy and relocation but I don't have
> a more specific information.
>
> [10898.966572] BTRFS info (device sdd10): balance: start -musage=50 -susage=50
> [10898.980261] BTRFS info (device sdd10): relocating block group 195519578112 flags system
> [10906.757623] BTRFS info (device sdd10): relocating block group 191148064768 flags metadata
> [11401.635794]
> [11401.637392] ======================================================
> [11401.643647] WARNING: possible circular locking dependency detected
> [11401.649956] 5.15.0-rc1-git+ #810 Not tainted
> [11401.654315] ------------------------------------------------------
> [11401.660588] btrfs/11698 is trying to acquire lock:
> [11401.665467] ffff8bb384bf7068 (btrfs-treloc-03#2){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.675102]
> [11401.675102] but task is already holding lock:
> [11401.681029] ffff8bb3ce5a4110 (btrfs-tree-02/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.690417]
> [11401.690417] which lock already depends on the new lock.
> [11401.690417]
> [11401.698700]
> [11401.698700] the existing dependency chain (in reverse order) is:
> [11401.706272]
> [11401.706272] -> #2 (btrfs-tree-02/1){+.+.}-{3:3}:
> [11401.712473]        __lock_acquire+0x3e5/0x730
> [11401.716906]        lock_acquire.part.0+0x5f/0x190
> [11401.721700]        down_write_nested+0x49/0x130
> [11401.726310]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.731744]        btrfs_init_new_buffer+0x7d/0x2c0 [btrfs]
> [11401.737510]        btrfs_alloc_tree_block+0x13b/0x350 [btrfs]
> [11401.743459]        __btrfs_cow_block+0x144/0x600 [btrfs]
> [11401.748967]        btrfs_cow_block+0x107/0x160 [btrfs]
> [11401.754306]        btrfs_search_slot+0x67a/0xc00 [btrfs]
> [11401.759809]        btrfs_lookup_dir_item+0x7c/0xe0 [btrfs]
> [11401.765507]        __btrfs_unlink_inode+0xaa/0x4f0 [btrfs]
> [11401.771184]        btrfs_unlink+0x87/0x100 [btrfs]
> [11401.776154]        vfs_unlink+0x101/0x210
> [11401.780250]        do_unlinkat+0x19c/0x2c0
> [11401.784435]        __x64_sys_unlinkat+0x34/0x60
> [11401.789057]        do_syscall_64+0x3d/0xb0
> [11401.793253]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11401.798914]
> [11401.798914] -> #1 (btrfs-tree-02){++++}-{3:3}:
> [11401.804947]        __lock_acquire+0x3e5/0x730
> [11401.809381]        lock_acquire.part.0+0x5f/0x190
> [11401.814185]        down_write_nested+0x49/0x130
> [11401.818810]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.824247]        btrfs_search_slot+0x2a1/0xc00 [btrfs]
> [11401.829759]        do_relocation+0x12c/0x6f0 [btrfs]
> [11401.834942]        relocate_tree_block+0x1a6/0x270 [btrfs]
> [11401.840646]        relocate_tree_blocks+0xe8/0x260 [btrfs]
> [11401.846358]        relocate_block_group+0x200/0x580 [btrfs]
> [11401.852146]        btrfs_relocate_block_group+0x18b/0x350 [btrfs]
> [11401.858455]        btrfs_relocate_chunk+0x38/0x120 [btrfs]
> [11401.864166]        __btrfs_balance+0x2ea/0x490 [btrfs]
> [11401.869511]        btrfs_balance+0x4d3/0x7c0 [btrfs]
> [11401.874684]        btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
> [11401.880382]        __x64_sys_ioctl+0x83/0xa0
> [11401.884736]        do_syscall_64+0x3d/0xb0
> [11401.888924]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11401.894573]
> [11401.894573] -> #0 (btrfs-treloc-03#2){+.+.}-{3:3}:
> [11401.900946]        check_prev_add+0x91/0xc30
> [11401.905293]        validate_chain+0x56f/0x840
> [11401.909740]        __lock_acquire+0x3e5/0x730
> [11401.914186]        lock_acquire.part.0+0x5f/0x190
> [11401.918979]        down_write_nested+0x49/0x130
> [11401.923607]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.929064]        btrfs_lock_root_node+0x31/0x40 [btrfs]
> [11401.934671]        btrfs_search_slot+0x58e/0xc00 [btrfs]
> [11401.940164]        replace_path+0x57e/0xc70 [btrfs]
> [11401.945241]        merge_reloc_root+0x222/0x8c0 [btrfs]
> [11401.950667]        merge_reloc_roots+0xf0/0x290 [btrfs]
> [11401.956119]        relocate_block_group+0x2d7/0x580 [btrfs]
> [11401.961910]        btrfs_relocate_block_group+0x18b/0x350 [btrfs]
> [11401.968204]        btrfs_relocate_chunk+0x38/0x120 [btrfs]
> [11401.973905]        __btrfs_balance+0x2ea/0x490 [btrfs]
> [11401.979239]        btrfs_balance+0x4d3/0x7c0 [btrfs]
> [11401.984423]        btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
> [11401.990117]        __x64_sys_ioctl+0x83/0xa0
> [11401.994456]        do_syscall_64+0x3d/0xb0
> [11401.998642]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11402.004323]
> [11402.004323] other info that might help us debug this:
> [11402.004323]
> [11402.012465] Chain exists of:
> [11402.012465]   btrfs-treloc-03#2 --> btrfs-tree-02 --> btrfs-tree-02/1
> [11402.012465]
> [11402.023382]  Possible unsafe locking scenario:
> [11402.023382]
> [11402.029395]        CPU0                    CPU1
> [11402.034004]        ----                    ----
> [11402.038622]   lock(btrfs-tree-02/1);
> [11402.042287]                                lock(btrfs-tree-02);
> [11402.048288]                                lock(btrfs-tree-02/1);
> [11402.054476]   lock(btrfs-treloc-03#2);

This means ABBA tree lock.

But the patch I submitted only touched the error handling part of
btrfs_alloc_tree_block(), and all the possible error causes are -ENOMEM.


The lockdep report just shows that under certain call path we're holding
the tree lock in a bad sequence, no matter if my patch is applied or not.

Thanks,
Qu

> [11402.060166]
> [11402.060166]  *** DEADLOCK ***
> [11402.060166]
> [11402.066212] 5 locks held by btrfs/11698:
> [11402.070210]  #0: ffff8bb4fab09478 (sb_writers#9){.+.+}-{0:0}, at: btrfs_ioctl_balance+0x47/0x3e0 [btrfs]
> [11402.079923]  #1: ffff8bb48520a370 (&fs_info->reclaim_bgs_lock){+.+.}-{3:3}, at: __btrfs_balance+0x179/0x490 [btrfs]
> [11402.090584]  #2: ffff8bb4852088e0 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x183/0x350 [btrfs]
> [11402.101965]  #3: ffff8bb4fab09698 (sb_internal#2){.+.+}-{0:0}, at: merge_reloc_root+0x11c/0x8c0 [btrfs]
> [11402.111616]  #4: ffff8bb3ce5a4110 (btrfs-tree-02/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.121418]
> [11402.121418] stack backtrace:
> [11402.125880] CPU: 7 PID: 11698 Comm: btrfs Not tainted 5.15.0-rc1-git+ #810
> [11402.132843] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008
> [11402.139470] Call Trace:
> [11402.141989]  dump_stack_lvl+0x45/0x59
> [11402.145736]  check_noncircular+0xf3/0x110
> [11402.149827]  ? check_irq_usage+0xaa/0x3f0
> [11402.153943]  check_prev_add+0x91/0xc30
> [11402.157783]  validate_chain+0x56f/0x840
> [11402.161719]  __lock_acquire+0x3e5/0x730
> [11402.165654]  lock_acquire.part.0+0x5f/0x190
> [11402.169924]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.175013]  ? lock_acquire+0xa0/0x150
> [11402.178839]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.183949]  down_write_nested+0x49/0x130
> [11402.188043]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.193127]  __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.198039]  btrfs_lock_root_node+0x31/0x40 [btrfs]
> [11402.203148]  btrfs_search_slot+0x58e/0xc00 [btrfs]
> [11402.208117]  replace_path+0x57e/0xc70 [btrfs]
> [11402.212680]  merge_reloc_root+0x222/0x8c0 [btrfs]
> [11402.217593]  ? lock_release+0x68/0x140
> [11402.221427]  ? _raw_spin_unlock+0x1f/0x40
> [11402.225522]  merge_reloc_roots+0xf0/0x290 [btrfs]
> [11402.230439]  relocate_block_group+0x2d7/0x580 [btrfs]
> [11402.235721]  btrfs_relocate_block_group+0x18b/0x350 [btrfs]
> [11402.241521]  btrfs_relocate_chunk+0x38/0x120 [btrfs]
> [11402.246688]  __btrfs_balance+0x2ea/0x490 [btrfs]
> [11402.251524]  btrfs_balance+0x4d3/0x7c0 [btrfs]
> [11402.256181]  btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
> [11402.261377]  __x64_sys_ioctl+0x83/0xa0
> [11402.265205]  do_syscall_64+0x3d/0xb0
> [11402.268853]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11402.273985] RIP: 0033:0x7fedef5ff6c7
> [11402.277642] Code: 00 00 00 48 8b 05 c1 67 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 67 2b 00 f7 d8 64 89 01 48
> [11402.296539] RSP: 002b:00007ffda9289e78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [11402.304211] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fedef5ff6c7
> [11402.311439] RDX: 00007ffda9289f10 RSI: 00000000c4009420 RDI: 0000000000000003
> [11402.318657] RBP: 00007ffda928c866 R08: 00000000004c4cab R09: 0000000000000013
> [11402.325877] R10: 0000000022494966 R11: 0000000000000246 R12: 0000000000000001
> [11402.333115] R13: 00007ffda9289f10 R14: 0000000000000000 R15: 00007ffda9289f08
> [11432.564100] BTRFS info (device sdd10): found 30943 extents, stage: move data extents
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2021-09-20  8:48 ` David Sterba
  2021-09-20  8:54   ` Qu Wenruo
@ 2021-09-20  8:56   ` Johannes Thumshirn
  2021-09-20  9:12     ` David Sterba
  2021-09-20 10:16   ` Filipe Manana
  2 siblings, 1 reply; 10+ messages in thread
From: Johannes Thumshirn @ 2021-09-20  8:56 UTC (permalink / raw)
  To: dsterba, Qu Wenruo; +Cc: linux-btrfs, Hao Sun

On 20/09/2021 10:48, David Sterba wrote:
> I found the following lockdep report, it's been with recent misc-next
> and the functions on stack match what this patch touches but I haven't
> done a deeper analysis so this could be a false trace (though the
> warning seems legit).
> 
> The workload was some file creation/copy and relocation but I don't have
> a more specific information.

This looks very much like a lockdep splat I know from zoned mode, but assumed
it's a zoned only problem and pushed on the task stack. Was this test on a 
regular SCSI/SATA drive? 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2021-09-20  8:56   ` Johannes Thumshirn
@ 2021-09-20  9:12     ` David Sterba
  0 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2021-09-20  9:12 UTC (permalink / raw)
  To: Johannes Thumshirn; +Cc: dsterba, Qu Wenruo, linux-btrfs, Hao Sun

On Mon, Sep 20, 2021 at 08:56:04AM +0000, Johannes Thumshirn wrote:
> On 20/09/2021 10:48, David Sterba wrote:
> > I found the following lockdep report, it's been with recent misc-next
> > and the functions on stack match what this patch touches but I haven't
> > done a deeper analysis so this could be a false trace (though the
> > warning seems legit).
> > 
> > The workload was some file creation/copy and relocation but I don't have
> > a more specific information.
> 
> This looks very much like a lockdep splat I know from zoned mode, but assumed
> it's a zoned only problem and pushed on the task stack. Was this test on a 
> regular SCSI/SATA drive? 

Yes, it's a multi-device fs, otherwise nothing special about it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2021-09-20  8:48 ` David Sterba
  2021-09-20  8:54   ` Qu Wenruo
  2021-09-20  8:56   ` Johannes Thumshirn
@ 2021-09-20 10:16   ` Filipe Manana
  2 siblings, 0 replies; 10+ messages in thread
From: Filipe Manana @ 2021-09-20 10:16 UTC (permalink / raw)
  To: David Sterba, Qu Wenruo, linux-btrfs, Hao Sun

On Mon, Sep 20, 2021 at 10:58 AM David Sterba <dsterba@suse.cz> wrote:
>
> On Tue, Sep 14, 2021 at 02:57:59PM +0800, Qu Wenruo wrote:
> > [BUG]
> > There is a very detailed bug report that injected ENOMEM error could
> > leave a tree block locked while we return to user-space:
> >
> >   BTRFS info (device loop0): enabling ssd optimizations
> >   FAULT_INJECTION: forcing a failure.
> >   name failslab, interval 1, probability 0, space 0, times 0
> >   CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16
> >   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> >   rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> >   Call Trace:
> >    __dump_stack lib/dump_stack.c:88 [inline]
> >    dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
> >    fail_dump lib/fault-inject.c:52 [inline]
> >    should_fail+0x13c/0x160 lib/fault-inject.c:146
> >    should_failslab+0x5/0x10 mm/slab_common.c:1328
> >    slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
> >    slab_alloc_node mm/slub.c:3120 [inline]
> >    slab_alloc mm/slub.c:3214 [inline]
> >    kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
> >    btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline]
> >    btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833
> >    __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415
> >    btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570
> >    btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768
> >    btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905
> >    btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530
> >    btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783
> >    lookup_open+0x660/0x780 fs/namei.c:3282
> >    open_last_lookups fs/namei.c:3352 [inline]
> >    path_openat+0x465/0xe20 fs/namei.c:3557
> >    do_filp_open+0xe3/0x170 fs/namei.c:3588
> >    do_sys_openat2+0x357/0x4a0 fs/open.c:1200
> >    do_sys_open+0x87/0xd0 fs/open.c:1216
> >    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >    do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
> >    entry_SYSCALL_64_after_hwframe+0x44/0xae
> >   RIP: 0033:0x46ae99
> >   Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
> >   89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> >   01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> >   RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
> >   RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99
> >   RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800
> >   RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000
> >   R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017
> >   R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0
> >
> >   ================================================
> >   WARNING: lock held when returning to user space!
> >   5.15.0-rc1 #16 Not tainted
> >   ------------------------------------------------
> >   syz-executor/7579 is leaving the kernel with locks still held!
> >   1 lock held by syz-executor/7579:
> >    #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
> >   __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112
> >
> > [CAUSE]
> > In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
> > extent buffer @buf is locked, but if later operations like adding
> > delayed tree ref fails, we just free @buf without unlocking it,
> > resulting above warning.
> >
> > [FIX]
> > Unlock @buf in out_free_buf: tag.
> >
> > Reported-by: Hao Sun <sunhao.th@gmail.com>
> > Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
> > Signed-off-by: Qu Wenruo <wqu@suse.com>
>
> I found the following lockdep report, it's been with recent misc-next
> and the functions on stack match what this patch touches but I haven't
> done a deeper analysis so this could be a false trace (though the
> warning seems legit).
>
> The workload was some file creation/copy and relocation but I don't have
> a more specific information.

The problem is known and is always triggered by brfs/187 for example.
It's unrelated to multi device filesystems or to Qu's patch.

The problem is that we have replace_path() locking nodes from a
relocation tree and then lock nodes from a subvolume tree,
while other paths do the opposite (like replace_file_extents() amongst
others), locking first from a subvolume tree and then from a
relocation tree.

I've reported that long ago on slack to Josef, which started to happen
shortly after the switch to rw semaphores for btree locks.
Unfortunately I don't think the problem is simple to fix.


>
> [10898.966572] BTRFS info (device sdd10): balance: start -musage=50 -susage=50
> [10898.980261] BTRFS info (device sdd10): relocating block group 195519578112 flags system
> [10906.757623] BTRFS info (device sdd10): relocating block group 191148064768 flags metadata
> [11401.635794]
> [11401.637392] ======================================================
> [11401.643647] WARNING: possible circular locking dependency detected
> [11401.649956] 5.15.0-rc1-git+ #810 Not tainted
> [11401.654315] ------------------------------------------------------
> [11401.660588] btrfs/11698 is trying to acquire lock:
> [11401.665467] ffff8bb384bf7068 (btrfs-treloc-03#2){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.675102]
> [11401.675102] but task is already holding lock:
> [11401.681029] ffff8bb3ce5a4110 (btrfs-tree-02/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.690417]
> [11401.690417] which lock already depends on the new lock.
> [11401.690417]
> [11401.698700]
> [11401.698700] the existing dependency chain (in reverse order) is:
> [11401.706272]
> [11401.706272] -> #2 (btrfs-tree-02/1){+.+.}-{3:3}:
> [11401.712473]        __lock_acquire+0x3e5/0x730
> [11401.716906]        lock_acquire.part.0+0x5f/0x190
> [11401.721700]        down_write_nested+0x49/0x130
> [11401.726310]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.731744]        btrfs_init_new_buffer+0x7d/0x2c0 [btrfs]
> [11401.737510]        btrfs_alloc_tree_block+0x13b/0x350 [btrfs]
> [11401.743459]        __btrfs_cow_block+0x144/0x600 [btrfs]
> [11401.748967]        btrfs_cow_block+0x107/0x160 [btrfs]
> [11401.754306]        btrfs_search_slot+0x67a/0xc00 [btrfs]
> [11401.759809]        btrfs_lookup_dir_item+0x7c/0xe0 [btrfs]
> [11401.765507]        __btrfs_unlink_inode+0xaa/0x4f0 [btrfs]
> [11401.771184]        btrfs_unlink+0x87/0x100 [btrfs]
> [11401.776154]        vfs_unlink+0x101/0x210
> [11401.780250]        do_unlinkat+0x19c/0x2c0
> [11401.784435]        __x64_sys_unlinkat+0x34/0x60
> [11401.789057]        do_syscall_64+0x3d/0xb0
> [11401.793253]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11401.798914]
> [11401.798914] -> #1 (btrfs-tree-02){++++}-{3:3}:
> [11401.804947]        __lock_acquire+0x3e5/0x730
> [11401.809381]        lock_acquire.part.0+0x5f/0x190
> [11401.814185]        down_write_nested+0x49/0x130
> [11401.818810]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.824247]        btrfs_search_slot+0x2a1/0xc00 [btrfs]
> [11401.829759]        do_relocation+0x12c/0x6f0 [btrfs]
> [11401.834942]        relocate_tree_block+0x1a6/0x270 [btrfs]
> [11401.840646]        relocate_tree_blocks+0xe8/0x260 [btrfs]
> [11401.846358]        relocate_block_group+0x200/0x580 [btrfs]
> [11401.852146]        btrfs_relocate_block_group+0x18b/0x350 [btrfs]
> [11401.858455]        btrfs_relocate_chunk+0x38/0x120 [btrfs]
> [11401.864166]        __btrfs_balance+0x2ea/0x490 [btrfs]
> [11401.869511]        btrfs_balance+0x4d3/0x7c0 [btrfs]
> [11401.874684]        btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
> [11401.880382]        __x64_sys_ioctl+0x83/0xa0
> [11401.884736]        do_syscall_64+0x3d/0xb0
> [11401.888924]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11401.894573]
> [11401.894573] -> #0 (btrfs-treloc-03#2){+.+.}-{3:3}:
> [11401.900946]        check_prev_add+0x91/0xc30
> [11401.905293]        validate_chain+0x56f/0x840
> [11401.909740]        __lock_acquire+0x3e5/0x730
> [11401.914186]        lock_acquire.part.0+0x5f/0x190
> [11401.918979]        down_write_nested+0x49/0x130
> [11401.923607]        __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11401.929064]        btrfs_lock_root_node+0x31/0x40 [btrfs]
> [11401.934671]        btrfs_search_slot+0x58e/0xc00 [btrfs]
> [11401.940164]        replace_path+0x57e/0xc70 [btrfs]
> [11401.945241]        merge_reloc_root+0x222/0x8c0 [btrfs]
> [11401.950667]        merge_reloc_roots+0xf0/0x290 [btrfs]
> [11401.956119]        relocate_block_group+0x2d7/0x580 [btrfs]
> [11401.961910]        btrfs_relocate_block_group+0x18b/0x350 [btrfs]
> [11401.968204]        btrfs_relocate_chunk+0x38/0x120 [btrfs]
> [11401.973905]        __btrfs_balance+0x2ea/0x490 [btrfs]
> [11401.979239]        btrfs_balance+0x4d3/0x7c0 [btrfs]
> [11401.984423]        btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
> [11401.990117]        __x64_sys_ioctl+0x83/0xa0
> [11401.994456]        do_syscall_64+0x3d/0xb0
> [11401.998642]        entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11402.004323]
> [11402.004323] other info that might help us debug this:
> [11402.004323]
> [11402.012465] Chain exists of:
> [11402.012465]   btrfs-treloc-03#2 --> btrfs-tree-02 --> btrfs-tree-02/1
> [11402.012465]
> [11402.023382]  Possible unsafe locking scenario:
> [11402.023382]
> [11402.029395]        CPU0                    CPU1
> [11402.034004]        ----                    ----
> [11402.038622]   lock(btrfs-tree-02/1);
> [11402.042287]                                lock(btrfs-tree-02);
> [11402.048288]                                lock(btrfs-tree-02/1);
> [11402.054476]   lock(btrfs-treloc-03#2);
> [11402.060166]
> [11402.060166]  *** DEADLOCK ***
> [11402.060166]
> [11402.066212] 5 locks held by btrfs/11698:
> [11402.070210]  #0: ffff8bb4fab09478 (sb_writers#9){.+.+}-{0:0}, at: btrfs_ioctl_balance+0x47/0x3e0 [btrfs]
> [11402.079923]  #1: ffff8bb48520a370 (&fs_info->reclaim_bgs_lock){+.+.}-{3:3}, at: __btrfs_balance+0x179/0x490 [btrfs]
> [11402.090584]  #2: ffff8bb4852088e0 (&fs_info->cleaner_mutex){+.+.}-{3:3}, at: btrfs_relocate_block_group+0x183/0x350 [btrfs]
> [11402.101965]  #3: ffff8bb4fab09698 (sb_internal#2){.+.+}-{0:0}, at: merge_reloc_root+0x11c/0x8c0 [btrfs]
> [11402.111616]  #4: ffff8bb3ce5a4110 (btrfs-tree-02/1){+.+.}-{3:3}, at: __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.121418]
> [11402.121418] stack backtrace:
> [11402.125880] CPU: 7 PID: 11698 Comm: btrfs Not tainted 5.15.0-rc1-git+ #810
> [11402.132843] Hardware name: empty empty/S3993, BIOS PAQEX0-3 02/24/2008
> [11402.139470] Call Trace:
> [11402.141989]  dump_stack_lvl+0x45/0x59
> [11402.145736]  check_noncircular+0xf3/0x110
> [11402.149827]  ? check_irq_usage+0xaa/0x3f0
> [11402.153943]  check_prev_add+0x91/0xc30
> [11402.157783]  validate_chain+0x56f/0x840
> [11402.161719]  __lock_acquire+0x3e5/0x730
> [11402.165654]  lock_acquire.part.0+0x5f/0x190
> [11402.169924]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.175013]  ? lock_acquire+0xa0/0x150
> [11402.178839]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.183949]  down_write_nested+0x49/0x130
> [11402.188043]  ? __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.193127]  __btrfs_tree_lock+0x2c/0x140 [btrfs]
> [11402.198039]  btrfs_lock_root_node+0x31/0x40 [btrfs]
> [11402.203148]  btrfs_search_slot+0x58e/0xc00 [btrfs]
> [11402.208117]  replace_path+0x57e/0xc70 [btrfs]
> [11402.212680]  merge_reloc_root+0x222/0x8c0 [btrfs]
> [11402.217593]  ? lock_release+0x68/0x140
> [11402.221427]  ? _raw_spin_unlock+0x1f/0x40
> [11402.225522]  merge_reloc_roots+0xf0/0x290 [btrfs]
> [11402.230439]  relocate_block_group+0x2d7/0x580 [btrfs]
> [11402.235721]  btrfs_relocate_block_group+0x18b/0x350 [btrfs]
> [11402.241521]  btrfs_relocate_chunk+0x38/0x120 [btrfs]
> [11402.246688]  __btrfs_balance+0x2ea/0x490 [btrfs]
> [11402.251524]  btrfs_balance+0x4d3/0x7c0 [btrfs]
> [11402.256181]  btrfs_ioctl_balance+0x31c/0x3e0 [btrfs]
> [11402.261377]  __x64_sys_ioctl+0x83/0xa0
> [11402.265205]  do_syscall_64+0x3d/0xb0
> [11402.268853]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [11402.273985] RIP: 0033:0x7fedef5ff6c7
> [11402.277642] Code: 00 00 00 48 8b 05 c1 67 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 91 67 2b 00 f7 d8 64 89 01 48
> [11402.296539] RSP: 002b:00007ffda9289e78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [11402.304211] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fedef5ff6c7
> [11402.311439] RDX: 00007ffda9289f10 RSI: 00000000c4009420 RDI: 0000000000000003
> [11402.318657] RBP: 00007ffda928c866 R08: 00000000004c4cab R09: 0000000000000013
> [11402.325877] R10: 0000000022494966 R11: 0000000000000246 R12: 0000000000000001
> [11402.333115] R13: 00007ffda9289f10 R14: 0000000000000000 R15: 00007ffda9289f08
> [11432.564100] BTRFS info (device sdd10): found 30943 extents, stage: move data extents



--
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2021-09-14  6:57 [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block() Qu Wenruo
  2021-09-17 13:42 ` David Sterba
  2021-09-20  8:48 ` David Sterba
@ 2022-03-06 16:36 ` Denis Efremov
  2022-03-07  0:15   ` Qu Wenruo
  2 siblings, 1 reply; 10+ messages in thread
From: Denis Efremov @ 2022-03-06 16:36 UTC (permalink / raw)
  To: Qu Wenruo, David Sterba; +Cc: Hao Sun, linux-btrfs, stable

Hi,


On 9/14/21 09:57, Qu Wenruo wrote:
> [BUG]
...
> 
>   ================================================
>   WARNING: lock held when returning to user space!
>   5.15.0-rc1 #16 Not tainted
>   ------------------------------------------------
>   syz-executor/7579 is leaving the kernel with locks still held!
>   1 lock held by syz-executor/7579:
>    #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
>   __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112
> 
> [CAUSE]
> In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
> extent buffer @buf is locked, but if later operations like adding
> delayed tree ref fails, we just free @buf without unlocking it,
> resulting above warning.

This patch fixes CVE-2021-4149. Commit 19ea40dddf18
"btrfs: unlock newly allocated extent buffer after error" upstream.
The patch was backported to kernels 5.15, 5.10, 5.4 because it contains
"CC: stable@vger.kernel.org # 5.4+" in the commit message.

However, it looks to me like kernels 4.9, 4.14, 4.19 are also vulnerable.
In v4.9 kernel there is btrfs_init_new_buffer() call:
btrfs_alloc_tree_block(...)
{
	...
	buf = btrfs_init_new_buffer(trans, root, ins.objectid, level);
	...
out_free_buf:                                                                    
        free_extent_buffer(buf);
	...
}

and btrfs_init_new_buffer() contains btrfs_tree_lock(buf) inside it.

The patch can be cherry-picked to v4.9 kernel without a conflict.

Probably, the error was introduced in the commit 67b7859e9bfa
"btrfs: handle ENOMEM in btrfs_alloc_tree_block" It's in the kernel
since v4.1

Can you confirm that kernels v4.9, 4.14, 4.19 are also vulnerable?

Thanks,
Denis

> 
> [FIX]
> Unlock @buf in out_free_buf: tag.
> 
> Reported-by: Hao Sun <sunhao.th@gmail.com>
> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/extent-tree.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index c88e7727a31a..8aa981ffe7b7 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4898,6 +4898,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
>  out_free_delayed:
>  	btrfs_free_delayed_extent_op(extent_op);
>  out_free_buf:
> +	btrfs_tree_unlock(buf);
>  	free_extent_buffer(buf);
>  out_free_reserved:
>  	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 0);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block()
  2022-03-06 16:36 ` Denis Efremov
@ 2022-03-07  0:15   ` Qu Wenruo
  2022-03-09  6:47     ` [PATCH] btrfs: unlock newly allocated extent buffer after error Denis Efremov
  0 siblings, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2022-03-07  0:15 UTC (permalink / raw)
  To: Denis Efremov, Qu Wenruo, David Sterba; +Cc: Hao Sun, linux-btrfs, stable



On 2022/3/7 00:36, Denis Efremov wrote:
> Hi,
>
>
> On 9/14/21 09:57, Qu Wenruo wrote:
>> [BUG]
> ...
>>
>>    ================================================
>>    WARNING: lock held when returning to user space!
>>    5.15.0-rc1 #16 Not tainted
>>    ------------------------------------------------
>>    syz-executor/7579 is leaving the kernel with locks still held!
>>    1 lock held by syz-executor/7579:
>>     #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
>>    __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112
>>
>> [CAUSE]
>> In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
>> extent buffer @buf is locked, but if later operations like adding
>> delayed tree ref fails, we just free @buf without unlocking it,
>> resulting above warning.
>
> This patch fixes CVE-2021-4149. Commit 19ea40dddf18
> "btrfs: unlock newly allocated extent buffer after error" upstream.
> The patch was backported to kernels 5.15, 5.10, 5.4 because it contains
> "CC: stable@vger.kernel.org # 5.4+" in the commit message.
>
> However, it looks to me like kernels 4.9, 4.14, 4.19 are also vulnerable.
> In v4.9 kernel there is btrfs_init_new_buffer() call:
> btrfs_alloc_tree_block(...)
> {
> 	...
> 	buf = btrfs_init_new_buffer(trans, root, ins.objectid, level);
> 	...
> out_free_buf:
>          free_extent_buffer(buf);
> 	...
> }
>
> and btrfs_init_new_buffer() contains btrfs_tree_lock(buf) inside it.
>
> The patch can be cherry-picked to v4.9 kernel without a conflict.
>
> Probably, the error was introduced in the commit 67b7859e9bfa
> "btrfs: handle ENOMEM in btrfs_alloc_tree_block" It's in the kernel
> since v4.1
>
> Can you confirm that kernels v4.9, 4.14, 4.19 are also vulnerable?

Oh, thanks for catching this, I'm never good at taking care of older
kernels.

But since those three are TLS kernels, they deserve the fix.

And yes, in those three versions, they have btrfs_tree_lock() called in
btrfs_init_new_buffer(), so they are also affected.

For the cause, your commit is completely correct.

So feel free to backport those patches to stable, with your new fixed-by
tag.

Thanks,
Qu
>
> Thanks,
> Denis
>
>>
>> [FIX]
>> Unlock @buf in out_free_buf: tag.
>>
>> Reported-by: Hao Sun <sunhao.th@gmail.com>
>> Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/extent-tree.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index c88e7727a31a..8aa981ffe7b7 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -4898,6 +4898,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
>>   out_free_delayed:
>>   	btrfs_free_delayed_extent_op(extent_op);
>>   out_free_buf:
>> +	btrfs_tree_unlock(buf);
>>   	free_extent_buffer(buf);
>>   out_free_reserved:
>>   	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 0);

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] btrfs: unlock newly allocated extent buffer after error
  2022-03-07  0:15   ` Qu Wenruo
@ 2022-03-09  6:47     ` Denis Efremov
  0 siblings, 0 replies; 10+ messages in thread
From: Denis Efremov @ 2022-03-09  6:47 UTC (permalink / raw)
  Cc: Qu Wenruo, Hao Sun, stable, David Sterba, Denis Efremov

From: Qu Wenruo <wqu@suse.com>

commit 19ea40dddf1833db868533958ca066f368862211 upstream.

[BUG]
There is a bug report that injected ENOMEM error could leave a tree
block locked while we return to user-space:

  BTRFS info (device loop0): enabling ssd optimizations
  FAULT_INJECTION: forcing a failure.
  name failslab, interval 1, probability 0, space 0, times 0
  CPU: 0 PID: 7579 Comm: syz-executor Not tainted 5.15.0-rc1 #16
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
  rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
  Call Trace:
   __dump_stack lib/dump_stack.c:88 [inline]
   dump_stack_lvl+0x8d/0xcf lib/dump_stack.c:106
   fail_dump lib/fault-inject.c:52 [inline]
   should_fail+0x13c/0x160 lib/fault-inject.c:146
   should_failslab+0x5/0x10 mm/slab_common.c:1328
   slab_pre_alloc_hook.constprop.99+0x4e/0xc0 mm/slab.h:494
   slab_alloc_node mm/slub.c:3120 [inline]
   slab_alloc mm/slub.c:3214 [inline]
   kmem_cache_alloc+0x44/0x280 mm/slub.c:3219
   btrfs_alloc_delayed_extent_op fs/btrfs/delayed-ref.h:299 [inline]
   btrfs_alloc_tree_block+0x38c/0x670 fs/btrfs/extent-tree.c:4833
   __btrfs_cow_block+0x16f/0x7d0 fs/btrfs/ctree.c:415
   btrfs_cow_block+0x12a/0x300 fs/btrfs/ctree.c:570
   btrfs_search_slot+0x6b0/0xee0 fs/btrfs/ctree.c:1768
   btrfs_insert_empty_items+0x80/0xf0 fs/btrfs/ctree.c:3905
   btrfs_new_inode+0x311/0xa60 fs/btrfs/inode.c:6530
   btrfs_create+0x12b/0x270 fs/btrfs/inode.c:6783
   lookup_open+0x660/0x780 fs/namei.c:3282
   open_last_lookups fs/namei.c:3352 [inline]
   path_openat+0x465/0xe20 fs/namei.c:3557
   do_filp_open+0xe3/0x170 fs/namei.c:3588
   do_sys_openat2+0x357/0x4a0 fs/open.c:1200
   do_sys_open+0x87/0xd0 fs/open.c:1216
   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
   do_syscall_64+0x34/0xb0 arch/x86/entry/common.c:80
   entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x46ae99
  Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
  89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
  01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007f46711b9c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000055
  RAX: ffffffffffffffda RBX: 000000000078c0a0 RCX: 000000000046ae99
  RDX: 0000000000000000 RSI: 00000000000000a1 RDI: 0000000020005800
  RBP: 00007f46711b9c80 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000017
  R13: 0000000000000000 R14: 000000000078c0a0 R15: 00007ffc129da6e0

  ================================================
  WARNING: lock held when returning to user space!
  5.15.0-rc1 #16 Not tainted
  ------------------------------------------------
  syz-executor/7579 is leaving the kernel with locks still held!
  1 lock held by syz-executor/7579:
   #0: ffff888104b73da8 (btrfs-tree-01/1){+.+.}-{3:3}, at:
  __btrfs_tree_lock+0x2e/0x1a0 fs/btrfs/locking.c:112

[CAUSE]
In btrfs_alloc_tree_block(), after btrfs_init_new_buffer(), the new
extent buffer @buf is locked, but if later operations like adding
delayed tree ref fail, we just free @buf without unlocking it,
resulting above warning.

[FIX]
Unlock @buf in out_free_buf: label.

Reported-by: Hao Sun <sunhao.th@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CACkBjsZ9O6Zr0KK1yGn=1rQi6Crh1yeCRdTSBxx9R99L4xdn-Q@mail.gmail.com/
CC: stable@vger.kernel.org # 4.1+
Fixes: 67b7859e9bfa ("btrfs: handle ENOMEM in btrfs_alloc_tree_block")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Denis Efremov <denis.e.efremov@oracle.com>
---
I added Fixes tag and changed kernel version in "CC: stable@..." line.

 fs/btrfs/extent-tree.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index bf46ed74eae6..d71f800e8bf6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8327,6 +8327,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans,
 out_free_delayed:
 	btrfs_free_delayed_extent_op(extent_op);
 out_free_buf:
+	btrfs_tree_unlock(buf);
 	free_extent_buffer(buf);
 out_free_reserved:
 	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 0);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-03-09  6:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-14  6:57 [PATCH] btrfs: unlock the newly allocated extent buffer in btrfs_alloc_tree_block() Qu Wenruo
2021-09-17 13:42 ` David Sterba
2021-09-20  8:48 ` David Sterba
2021-09-20  8:54   ` Qu Wenruo
2021-09-20  8:56   ` Johannes Thumshirn
2021-09-20  9:12     ` David Sterba
2021-09-20 10:16   ` Filipe Manana
2022-03-06 16:36 ` Denis Efremov
2022-03-07  0:15   ` Qu Wenruo
2022-03-09  6:47     ` [PATCH] btrfs: unlock newly allocated extent buffer after error Denis Efremov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.