* [PATCH V5 0/2] btrfs: device_list_mutex fix lockdep warn and cleanup @ 2021-08-31 1:21 Anand Jain 2021-08-31 1:21 ` [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs Anand Jain 2021-08-31 1:21 ` [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent Anand Jain 0 siblings, 2 replies; 12+ messages in thread From: Anand Jain @ 2021-08-31 1:21 UTC (permalink / raw) To: linux-btrfs, dsterba; +Cc: l There are no new patches here. These patches were once independent and or part of another set in the mainline list. But now, as these two patches are related, so they are brought together in this set. These two patches were at V2 and V4 (respectively) at their last reroll. Here I have marked them both to be at V5 to avoid any confusion. But there is no code change. Patch 1/2 is an old patch in the mailing list, now part of this set as a new cleanup patch 2/2 depends on it. Patch 2/2 was sent before as part of the patchset (btrf_show_devname related fixes). And, now in this set as it depends on 1/2. Anand Jain (2): btrfs: fix lockdep warning while mounting sprout fs btrfs: consolidate device_list_mutex in prepare_sprout to its parent fs/btrfs/volumes.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) -- 2.31.1 ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs 2021-08-31 1:21 [PATCH V5 0/2] btrfs: device_list_mutex fix lockdep warn and cleanup Anand Jain @ 2021-08-31 1:21 ` Anand Jain 2021-08-31 8:18 ` Nikolay Borisov ` (3 more replies) 2021-08-31 1:21 ` [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent Anand Jain 1 sibling, 4 replies; 12+ messages in thread From: Anand Jain @ 2021-08-31 1:21 UTC (permalink / raw) To: linux-btrfs, dsterba; +Cc: l Following test case reproduces lockdep warning. Test case: $ mkfs.btrfs -f <dev1> $ btrfstune -S 1 <dev1> $ mount <dev1> <mnt> $ btrfs device add <dev2> <mnt> -f $ umount <mnt> $ mount <dev2> <mnt> $ umount <mnt> The warning claims a possible ABBA deadlock between the threads initiated by [#1] btrfs device add and [#0] the mount. ====================================================== [ 540.743122] WARNING: possible circular locking dependency detected [ 540.743129] 5.11.0-rc7+ #5 Not tainted [ 540.743135] ------------------------------------------------------ [ 540.743142] mount/2515 is trying to acquire lock: [ 540.743149] ffffa0c5544c2ce0 (&fs_devs->device_list_mutex){+.+.}-{4:4}, at: clone_fs_devices+0x6d/0x210 [btrfs] [ 540.743458] but task is already holding lock: [ 540.743461] ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.743541] which lock already depends on the new lock. [ 540.743543] the existing dependency chain (in reverse order) is: [ 540.743546] -> #1 (btrfs-chunk-00){++++}-{4:4}: [ 540.743566] down_read_nested+0x48/0x2b0 [ 540.743585] __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.743650] btrfs_read_lock_root_node+0x70/0x200 [btrfs] [ 540.743733] btrfs_search_slot+0x6c6/0xe00 [btrfs] [ 540.743785] btrfs_update_device+0x83/0x260 [btrfs] [ 540.743849] btrfs_finish_chunk_alloc+0x13f/0x660 [btrfs] <--- device_list_mutex [ 540.743911] btrfs_create_pending_block_groups+0x18d/0x3f0 [btrfs] [ 540.743982] btrfs_commit_transaction+0x86/0x1260 [btrfs] [ 540.744037] btrfs_init_new_device+0x1600/0x1dd0 [btrfs] [ 540.744101] btrfs_ioctl+0x1c77/0x24c0 [btrfs] [ 540.744166] __x64_sys_ioctl+0xe4/0x140 [ 540.744170] do_syscall_64+0x4b/0x80 [ 540.744174] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 540.744180] -> #0 (&fs_devs->device_list_mutex){+.+.}-{4:4}: [ 540.744184] __lock_acquire+0x155f/0x2360 [ 540.744188] lock_acquire+0x10b/0x5c0 [ 540.744190] __mutex_lock+0xb1/0xf80 [ 540.744193] mutex_lock_nested+0x27/0x30 [ 540.744196] clone_fs_devices+0x6d/0x210 [btrfs] [ 540.744270] btrfs_read_chunk_tree+0x3c7/0xbb0 [btrfs] [ 540.744336] open_ctree+0xf6e/0x2074 [btrfs] [ 540.744406] btrfs_mount_root.cold.72+0x16/0x127 [btrfs] [ 540.744472] legacy_get_tree+0x38/0x90 [ 540.744475] vfs_get_tree+0x30/0x140 [ 540.744478] fc_mount+0x16/0x60 [ 540.744482] vfs_kern_mount+0x91/0x100 [ 540.744484] btrfs_mount+0x1e6/0x670 [btrfs] [ 540.744536] legacy_get_tree+0x38/0x90 [ 540.744537] vfs_get_tree+0x30/0x140 [ 540.744539] path_mount+0x8d8/0x1070 [ 540.744541] do_mount+0x8d/0xc0 [ 540.744543] __x64_sys_mount+0x125/0x160 [ 540.744545] do_syscall_64+0x4b/0x80 [ 540.744547] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 540.744551] other info that might help us debug this: [ 540.744552] Possible unsafe locking scenario: [ 540.744553] CPU0 CPU1 [ 540.744554] ---- ---- [ 540.744555] lock(btrfs-chunk-00); [ 540.744557] lock(&fs_devs->device_list_mutex); [ 540.744560] lock(btrfs-chunk-00); [ 540.744562] lock(&fs_devs->device_list_mutex); [ 540.744564] *** DEADLOCK *** [ 540.744565] 3 locks held by mount/2515: [ 540.744567] #0: ffffa0c56bf7a0e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.isra.16+0xdf/0x450 [ 540.744574] #1: ffffffffc05a9628 (uuid_mutex){+.+.}-{4:4}, at: btrfs_read_chunk_tree+0x63/0xbb0 [btrfs] [ 540.744640] #2: ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] [ 540.744708] stack backtrace: [ 540.744712] CPU: 2 PID: 2515 Comm: mount Not tainted 5.11.0-rc7+ #5 But the device_list_mutex in clone_fs_devices() is redundant, as explained below. Two threads [1] and [2] (below) could lead to clone_fs_device(). [1] open_ctree <== mount sprout fs btrfs_read_chunk_tree() mutex_lock(&uuid_mutex) <== global lock read_one_dev() open_seed_devices() clone_fs_devices() <== seed fs_devices mutex_lock(&orig->device_list_mutex) <== seed fs_devices [2] btrfs_init_new_device() <== sprouting mutex_lock(&uuid_mutex); <== global lock btrfs_prepare_sprout() lockdep_assert_held(&uuid_mutex) clone_fs_devices(seed_fs_device) <== seed fs_devices Both of these threads hold uuid_mutex which is sufficient to protect getting the seed device(s) freed while we are trying to clone it for sprouting [2] or mounting a sprout [1] (as above). A mounted seed device can not free/write/replace because it is read-only. An unmounted seed device can free by btrfs_free_stale_devices(), but it needs uuid_mutex. So this patch removes the unnecessary device_list_mutex in clone_fs_devices(). And adds a lockdep_assert_held(&uuid_mutex) in clone_fs_devices(). Reported-by: Su Yue <l@damenly.su> Signed-off-by: Anand Jain <anand.jain@oracle.com> --- v5: Vet test case in the changelog. v2: Remove Martin's Reported-by and Tested-by. Add Su's Reported-by. Add lockdep_assert_held check. Update the changelog, make it relevant to the current misc-next fs/btrfs/volumes.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index abdc392f6ef9..fa9fe47b5b68 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -558,6 +558,8 @@ static int btrfs_free_stale_devices(const char *path, struct btrfs_device *device, *tmp_device; int ret = 0; + lockdep_assert_held(&uuid_mutex); + if (path) ret = -ENOENT; @@ -988,11 +990,12 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) struct btrfs_device *orig_dev; int ret = 0; + lockdep_assert_held(&uuid_mutex); + fs_devices = alloc_fs_devices(orig->fsid, NULL); if (IS_ERR(fs_devices)) return fs_devices; - mutex_lock(&orig->device_list_mutex); fs_devices->total_devices = orig->total_devices; list_for_each_entry(orig_dev, &orig->devices, dev_list) { @@ -1024,10 +1027,8 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) device->fs_devices = fs_devices; fs_devices->num_devices++; } - mutex_unlock(&orig->device_list_mutex); return fs_devices; error: - mutex_unlock(&orig->device_list_mutex); free_fs_devices(fs_devices); return ERR_PTR(ret); } -- 2.31.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs 2021-08-31 1:21 ` [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs Anand Jain @ 2021-08-31 8:18 ` Nikolay Borisov 2021-09-02 23:51 ` Anand Jain 2021-08-31 12:37 ` Nikolay Borisov ` (2 subsequent siblings) 3 siblings, 1 reply; 12+ messages in thread From: Nikolay Borisov @ 2021-08-31 8:18 UTC (permalink / raw) To: Anand Jain, linux-btrfs, dsterba; +Cc: l On 31.08.21 г. 4:21, Anand Jain wrote: > Following test case reproduces lockdep warning. > > Test case: > > $ mkfs.btrfs -f <dev1> > $ btrfstune -S 1 <dev1> > $ mount <dev1> <mnt> > $ btrfs device add <dev2> <mnt> -f > $ umount <mnt> > $ mount <dev2> <mnt> > $ umount <mnt> > > The warning claims a possible ABBA deadlock between the threads initiated by > [#1] btrfs device add and [#0] the mount. > Send this as an xfstest <snip> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs 2021-08-31 8:18 ` Nikolay Borisov @ 2021-09-02 23:51 ` Anand Jain 0 siblings, 0 replies; 12+ messages in thread From: Anand Jain @ 2021-09-02 23:51 UTC (permalink / raw) To: Nikolay Borisov; +Cc: l, linux-btrfs, dsterba On 31/08/2021 16:18, Nikolay Borisov wrote: > > > On 31.08.21 г. 4:21, Anand Jain wrote: >> Following test case reproduces lockdep warning. >> >> Test case: >> >> $ mkfs.btrfs -f <dev1> >> $ btrfstune -S 1 <dev1> >> $ mount <dev1> <mnt> >> $ btrfs device add <dev2> <mnt> -f >> $ umount <mnt> >> $ mount <dev2> <mnt> >> $ umount <mnt> >> >> The warning claims a possible ABBA deadlock between the threads initiated by >> [#1] btrfs device add and [#0] the mount. >> > > Send this as an xfstest The above steps are the most common in the seed group of test cases. The first test case (btrfs/161) in the -g seed group already reproduces the lock dep warning. Any idea how do I reset the lockdep warning (without reboot) so that it will again report the same lockdep warning stack if it discovers again? Thanks, Anand > > <snip> > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs 2021-08-31 1:21 ` [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs Anand Jain 2021-08-31 8:18 ` Nikolay Borisov @ 2021-08-31 12:37 ` Nikolay Borisov 2021-09-01 0:49 ` Su Yue 2021-09-02 15:28 ` David Sterba 3 siblings, 0 replies; 12+ messages in thread From: Nikolay Borisov @ 2021-08-31 12:37 UTC (permalink / raw) To: Anand Jain, linux-btrfs, dsterba; +Cc: l On 31.08.21 г. 4:21, Anand Jain wrote: > Following test case reproduces lockdep warning. > > Test case: > > $ mkfs.btrfs -f <dev1> > $ btrfstune -S 1 <dev1> > $ mount <dev1> <mnt> > $ btrfs device add <dev2> <mnt> -f > $ umount <mnt> > $ mount <dev2> <mnt> > $ umount <mnt> > > The warning claims a possible ABBA deadlock between the threads initiated by > [#1] btrfs device add and [#0] the mount. > > ====================================================== > [ 540.743122] WARNING: possible circular locking dependency detected > [ 540.743129] 5.11.0-rc7+ #5 Not tainted > [ 540.743135] ------------------------------------------------------ > [ 540.743142] mount/2515 is trying to acquire lock: > [ 540.743149] ffffa0c5544c2ce0 (&fs_devs->device_list_mutex){+.+.}-{4:4}, at: clone_fs_devices+0x6d/0x210 [btrfs] > [ 540.743458] but task is already holding lock: > [ 540.743461] ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.743541] which lock already depends on the new lock. > [ 540.743543] the existing dependency chain (in reverse order) is: > > [ 540.743546] -> #1 (btrfs-chunk-00){++++}-{4:4}: > [ 540.743566] down_read_nested+0x48/0x2b0 > [ 540.743585] __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.743650] btrfs_read_lock_root_node+0x70/0x200 [btrfs] > [ 540.743733] btrfs_search_slot+0x6c6/0xe00 [btrfs] > [ 540.743785] btrfs_update_device+0x83/0x260 [btrfs] > [ 540.743849] btrfs_finish_chunk_alloc+0x13f/0x660 [btrfs] <--- device_list_mutex > [ 540.743911] btrfs_create_pending_block_groups+0x18d/0x3f0 [btrfs] > [ 540.743982] btrfs_commit_transaction+0x86/0x1260 [btrfs] > [ 540.744037] btrfs_init_new_device+0x1600/0x1dd0 [btrfs] > [ 540.744101] btrfs_ioctl+0x1c77/0x24c0 [btrfs] > [ 540.744166] __x64_sys_ioctl+0xe4/0x140 > [ 540.744170] do_syscall_64+0x4b/0x80 > [ 540.744174] entry_SYSCALL_64_after_hwframe+0x44/0xa9 In this callchain we lock device_list_mutex followed by locking the chunk root extent item > > [ 540.744180] -> #0 (&fs_devs->device_list_mutex){+.+.}-{4:4}: > [ 540.744184] __lock_acquire+0x155f/0x2360 > [ 540.744188] lock_acquire+0x10b/0x5c0 > [ 540.744190] __mutex_lock+0xb1/0xf80 > [ 540.744193] mutex_lock_nested+0x27/0x30 > [ 540.744196] clone_fs_devices+0x6d/0x210 [btrfs] > [ 540.744270] btrfs_read_chunk_tree+0x3c7/0xbb0 [btrfs] > [ 540.744336] open_ctree+0xf6e/0x2074 [btrfs] > [ 540.744406] btrfs_mount_root.cold.72+0x16/0x127 [btrfs] > [ 540.744472] legacy_get_tree+0x38/0x90 > [ 540.744475] vfs_get_tree+0x30/0x140 > [ 540.744478] fc_mount+0x16/0x60 > [ 540.744482] vfs_kern_mount+0x91/0x100 > [ 540.744484] btrfs_mount+0x1e6/0x670 [btrfs] > [ 540.744536] legacy_get_tree+0x38/0x90 > [ 540.744537] vfs_get_tree+0x30/0x140 > [ 540.744539] path_mount+0x8d8/0x1070 > [ 540.744541] do_mount+0x8d/0xc0 > [ 540.744543] __x64_sys_mount+0x125/0x160 > [ 540.744545] do_syscall_64+0x4b/0x80 > [ 540.744547] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Whereas in this call chain we first lock the chunk_root in btrfs_read_chunk_tree and then acquire device_list_mutex in clone_fs_devices. And this inversion of lock order could potentially lead to a deadlock. > > [ 540.744551] other info that might help us debug this: > [ 540.744552] Possible unsafe locking scenario: > > [ 540.744553] CPU0 CPU1 > [ 540.744554] ---- ---- > [ 540.744555] lock(btrfs-chunk-00); > [ 540.744557] lock(&fs_devs->device_list_mutex); > [ 540.744560] lock(btrfs-chunk-00); > [ 540.744562] lock(&fs_devs->device_list_mutex); > [ 540.744564] > *** DEADLOCK *** > > [ 540.744565] 3 locks held by mount/2515: > [ 540.744567] #0: ffffa0c56bf7a0e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.isra.16+0xdf/0x450 > [ 540.744574] #1: ffffffffc05a9628 (uuid_mutex){+.+.}-{4:4}, at: btrfs_read_chunk_tree+0x63/0xbb0 [btrfs] > [ 540.744640] #2: ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.744708] > stack backtrace: > [ 540.744712] CPU: 2 PID: 2515 Comm: mount Not tainted 5.11.0-rc7+ #5 > > But the device_list_mutex in clone_fs_devices() is redundant, as explained > below. > Two threads [1] and [2] (below) could lead to clone_fs_device(). > > [1] > open_ctree <== mount sprout fs > btrfs_read_chunk_tree() > mutex_lock(&uuid_mutex) <== global lock > read_one_dev() > open_seed_devices() > clone_fs_devices() <== seed fs_devices > mutex_lock(&orig->device_list_mutex) <== seed fs_devices > > [2] > btrfs_init_new_device() <== sprouting > mutex_lock(&uuid_mutex); <== global lock > btrfs_prepare_sprout() > lockdep_assert_held(&uuid_mutex) > clone_fs_devices(seed_fs_device) <== seed fs_devices > > Both of these threads hold uuid_mutex which is sufficient to protect > getting the seed device(s) freed while we are trying to clone it for > sprouting [2] or mounting a sprout [1] (as above). A mounted seed > device can not free/write/replace because it is read-only. An unmounted > seed device can free by btrfs_free_stale_devices(), but it needs uuid_mutex. > So this patch removes the unnecessary device_list_mutex in clone_fs_devices(). > And adds a lockdep_assert_held(&uuid_mutex) in clone_fs_devices(). The reason we hold the device_list_mutex is not due to freeing (only) but also due to possible racing mounts of filesystems seeded by the same device. I.e in btrfs_prepare_sprout we essentially "empty" the seed following the call to clone_fs_devices by splicing its devices on the newly allocated 'seed_device'. But since those processes are synchronized by uuid_mutex (i.e open_seed_devices) then I'd say this is fine. Reviewed-by: Nikolay Borisov <nborisov@suse.com> > > Reported-by: Su Yue <l@damenly.su> > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > v5: Vet test case in the changelog. > v2: Remove Martin's Reported-by and Tested-by. > Add Su's Reported-by. > Add lockdep_assert_held check. > Update the changelog, make it relevant to the current misc-next > > fs/btrfs/volumes.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index abdc392f6ef9..fa9fe47b5b68 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -558,6 +558,8 @@ static int btrfs_free_stale_devices(const char *path, > struct btrfs_device *device, *tmp_device; > int ret = 0; > > + lockdep_assert_held(&uuid_mutex); > + > if (path) > ret = -ENOENT; > > @@ -988,11 +990,12 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) > struct btrfs_device *orig_dev; > int ret = 0; > > + lockdep_assert_held(&uuid_mutex); > + > fs_devices = alloc_fs_devices(orig->fsid, NULL); > if (IS_ERR(fs_devices)) > return fs_devices; > > - mutex_lock(&orig->device_list_mutex); > fs_devices->total_devices = orig->total_devices; > > list_for_each_entry(orig_dev, &orig->devices, dev_list) { > @@ -1024,10 +1027,8 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig) > device->fs_devices = fs_devices; > fs_devices->num_devices++; > } > - mutex_unlock(&orig->device_list_mutex); > return fs_devices; > error: > - mutex_unlock(&orig->device_list_mutex); > free_fs_devices(fs_devices); > return ERR_PTR(ret); > } > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs 2021-08-31 1:21 ` [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs Anand Jain 2021-08-31 8:18 ` Nikolay Borisov 2021-08-31 12:37 ` Nikolay Borisov @ 2021-09-01 0:49 ` Su Yue 2021-09-02 15:28 ` David Sterba 3 siblings, 0 replies; 12+ messages in thread From: Su Yue @ 2021-09-01 0:49 UTC (permalink / raw) To: Anand Jain; +Cc: linux-btrfs, dsterba On Tue 31 Aug 2021 at 09:21, Anand Jain <anand.jain@oracle.com> wrote: > Following test case reproduces lockdep warning. > > Test case: > > $ mkfs.btrfs -f <dev1> > $ btrfstune -S 1 <dev1> > $ mount <dev1> <mnt> > $ btrfs device add <dev2> <mnt> -f > $ umount <mnt> > $ mount <dev2> <mnt> > $ umount <mnt> > > The warning claims a possible ABBA deadlock between the threads > initiated by > [#1] btrfs device add and [#0] the mount. > > ====================================================== > [ 540.743122] WARNING: possible circular locking dependency > detected > [ 540.743129] 5.11.0-rc7+ #5 Not tainted > [ 540.743135] > ------------------------------------------------------ > [ 540.743142] mount/2515 is trying to acquire lock: > [ 540.743149] ffffa0c5544c2ce0 > (&fs_devs->device_list_mutex){+.+.}-{4:4}, at: > clone_fs_devices+0x6d/0x210 [btrfs] > [ 540.743458] but task is already holding lock: > [ 540.743461] ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: > __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.743541] which lock already depends on the new lock. > [ 540.743543] the existing dependency chain (in reverse order) > is: > > [ 540.743546] -> #1 (btrfs-chunk-00){++++}-{4:4}: > [ 540.743566] down_read_nested+0x48/0x2b0 > [ 540.743585] __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.743650] btrfs_read_lock_root_node+0x70/0x200 [btrfs] > [ 540.743733] btrfs_search_slot+0x6c6/0xe00 [btrfs] > [ 540.743785] btrfs_update_device+0x83/0x260 [btrfs] > [ 540.743849] btrfs_finish_chunk_alloc+0x13f/0x660 [btrfs] <--- > device_list_mutex > [ 540.743911] btrfs_create_pending_block_groups+0x18d/0x3f0 > [btrfs] > [ 540.743982] btrfs_commit_transaction+0x86/0x1260 [btrfs] > [ 540.744037] btrfs_init_new_device+0x1600/0x1dd0 [btrfs] > [ 540.744101] btrfs_ioctl+0x1c77/0x24c0 [btrfs] > [ 540.744166] __x64_sys_ioctl+0xe4/0x140 > [ 540.744170] do_syscall_64+0x4b/0x80 > [ 540.744174] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 540.744180] -> #0 (&fs_devs->device_list_mutex){+.+.}-{4:4}: > [ 540.744184] __lock_acquire+0x155f/0x2360 > [ 540.744188] lock_acquire+0x10b/0x5c0 > [ 540.744190] __mutex_lock+0xb1/0xf80 > [ 540.744193] mutex_lock_nested+0x27/0x30 > [ 540.744196] clone_fs_devices+0x6d/0x210 [btrfs] > [ 540.744270] btrfs_read_chunk_tree+0x3c7/0xbb0 [btrfs] > [ 540.744336] open_ctree+0xf6e/0x2074 [btrfs] > [ 540.744406] btrfs_mount_root.cold.72+0x16/0x127 [btrfs] > [ 540.744472] legacy_get_tree+0x38/0x90 > [ 540.744475] vfs_get_tree+0x30/0x140 > [ 540.744478] fc_mount+0x16/0x60 > [ 540.744482] vfs_kern_mount+0x91/0x100 > [ 540.744484] btrfs_mount+0x1e6/0x670 [btrfs] > [ 540.744536] legacy_get_tree+0x38/0x90 > [ 540.744537] vfs_get_tree+0x30/0x140 > [ 540.744539] path_mount+0x8d8/0x1070 > [ 540.744541] do_mount+0x8d/0xc0 > [ 540.744543] __x64_sys_mount+0x125/0x160 > [ 540.744545] do_syscall_64+0x4b/0x80 > [ 540.744547] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 540.744551] other info that might help us debug this: > [ 540.744552] Possible unsafe locking scenario: > > [ 540.744553] CPU0 CPU1 > [ 540.744554] ---- ---- > [ 540.744555] lock(btrfs-chunk-00); > [ 540.744557] > lock(&fs_devs->device_list_mutex); > [ 540.744560] lock(btrfs-chunk-00); > [ 540.744562] lock(&fs_devs->device_list_mutex); > [ 540.744564] > *** DEADLOCK *** > > [ 540.744565] 3 locks held by mount/2515: > [ 540.744567] #0: ffffa0c56bf7a0e0 > (&type->s_umount_key#42/1){+.+.}-{4:4}, at: > alloc_super.isra.16+0xdf/0x450 > [ 540.744574] #1: ffffffffc05a9628 (uuid_mutex){+.+.}-{4:4}, at: > btrfs_read_chunk_tree+0x63/0xbb0 [btrfs] > [ 540.744640] #2: ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, > at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.744708] > stack backtrace: > [ 540.744712] CPU: 2 PID: 2515 Comm: mount Not tainted > 5.11.0-rc7+ #5 > > But the device_list_mutex in clone_fs_devices() is redundant, as > explained > below. > Two threads [1] and [2] (below) could lead to > clone_fs_device(). > > [1] > open_ctree <== mount sprout fs > btrfs_read_chunk_tree() > mutex_lock(&uuid_mutex) <== global lock > read_one_dev() > open_seed_devices() > clone_fs_devices() <== seed fs_devices > mutex_lock(&orig->device_list_mutex) <== seed fs_devices > > [2] > btrfs_init_new_device() <== sprouting > mutex_lock(&uuid_mutex); <== global lock > btrfs_prepare_sprout() > lockdep_assert_held(&uuid_mutex) > clone_fs_devices(seed_fs_device) <== seed fs_devices > > Both of these threads hold uuid_mutex which is sufficient to > protect > getting the seed device(s) freed while we are trying to clone it > for > sprouting [2] or mounting a sprout [1] (as above). A mounted > seed > device can not free/write/replace because it is read-only. An > unmounted > seed device can free by btrfs_free_stale_devices(), but it needs > uuid_mutex. > So this patch removes the unnecessary device_list_mutex in > clone_fs_devices(). > And adds a lockdep_assert_held(&uuid_mutex) in > clone_fs_devices(). > > Reported-by: Su Yue <l@damenly.su> > Signed-off-by: Anand Jain <anand.jain@oracle.com> > Tested-by: Su Yue <l@damenly.su> -- Su > --- > v5: Vet test case in the changelog. > v2: Remove Martin's Reported-by and Tested-by. > Add Su's Reported-by. > Add lockdep_assert_held check. > Update the changelog, make it relevant to the current > misc-next > > fs/btrfs/volumes.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index abdc392f6ef9..fa9fe47b5b68 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -558,6 +558,8 @@ static int btrfs_free_stale_devices(const > char *path, > struct btrfs_device *device, *tmp_device; > int ret = 0; > > + lockdep_assert_held(&uuid_mutex); > + > if (path) > ret = -ENOENT; > > @@ -988,11 +990,12 @@ static struct btrfs_fs_devices > *clone_fs_devices(struct btrfs_fs_devices *orig) > struct btrfs_device *orig_dev; > int ret = 0; > > + lockdep_assert_held(&uuid_mutex); > + > fs_devices = alloc_fs_devices(orig->fsid, NULL); > if (IS_ERR(fs_devices)) > return fs_devices; > > - mutex_lock(&orig->device_list_mutex); > fs_devices->total_devices = orig->total_devices; > > list_for_each_entry(orig_dev, &orig->devices, dev_list) { > @@ -1024,10 +1027,8 @@ static struct btrfs_fs_devices > *clone_fs_devices(struct btrfs_fs_devices *orig) > device->fs_devices = fs_devices; > fs_devices->num_devices++; > } > - mutex_unlock(&orig->device_list_mutex); > return fs_devices; > error: > - mutex_unlock(&orig->device_list_mutex); > free_fs_devices(fs_devices); > return ERR_PTR(ret); > } ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs 2021-08-31 1:21 ` [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs Anand Jain ` (2 preceding siblings ...) 2021-09-01 0:49 ` Su Yue @ 2021-09-02 15:28 ` David Sterba 3 siblings, 0 replies; 12+ messages in thread From: David Sterba @ 2021-09-02 15:28 UTC (permalink / raw) To: Anand Jain; +Cc: linux-btrfs, dsterba, l On Tue, Aug 31, 2021 at 09:21:28AM +0800, Anand Jain wrote: > Following test case reproduces lockdep warning. > > Test case: > > $ mkfs.btrfs -f <dev1> > $ btrfstune -S 1 <dev1> > $ mount <dev1> <mnt> > $ btrfs device add <dev2> <mnt> -f > $ umount <mnt> > $ mount <dev2> <mnt> > $ umount <mnt> > > The warning claims a possible ABBA deadlock between the threads initiated by > [#1] btrfs device add and [#0] the mount. > > ====================================================== > [ 540.743122] WARNING: possible circular locking dependency detected > [ 540.743129] 5.11.0-rc7+ #5 Not tainted > [ 540.743135] ------------------------------------------------------ > [ 540.743142] mount/2515 is trying to acquire lock: > [ 540.743149] ffffa0c5544c2ce0 (&fs_devs->device_list_mutex){+.+.}-{4:4}, at: clone_fs_devices+0x6d/0x210 [btrfs] > [ 540.743458] but task is already holding lock: > [ 540.743461] ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.743541] which lock already depends on the new lock. > [ 540.743543] the existing dependency chain (in reverse order) is: > > [ 540.743546] -> #1 (btrfs-chunk-00){++++}-{4:4}: > [ 540.743566] down_read_nested+0x48/0x2b0 > [ 540.743585] __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.743650] btrfs_read_lock_root_node+0x70/0x200 [btrfs] > [ 540.743733] btrfs_search_slot+0x6c6/0xe00 [btrfs] > [ 540.743785] btrfs_update_device+0x83/0x260 [btrfs] > [ 540.743849] btrfs_finish_chunk_alloc+0x13f/0x660 [btrfs] <--- device_list_mutex > [ 540.743911] btrfs_create_pending_block_groups+0x18d/0x3f0 [btrfs] > [ 540.743982] btrfs_commit_transaction+0x86/0x1260 [btrfs] > [ 540.744037] btrfs_init_new_device+0x1600/0x1dd0 [btrfs] > [ 540.744101] btrfs_ioctl+0x1c77/0x24c0 [btrfs] > [ 540.744166] __x64_sys_ioctl+0xe4/0x140 > [ 540.744170] do_syscall_64+0x4b/0x80 > [ 540.744174] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 540.744180] -> #0 (&fs_devs->device_list_mutex){+.+.}-{4:4}: > [ 540.744184] __lock_acquire+0x155f/0x2360 > [ 540.744188] lock_acquire+0x10b/0x5c0 > [ 540.744190] __mutex_lock+0xb1/0xf80 > [ 540.744193] mutex_lock_nested+0x27/0x30 > [ 540.744196] clone_fs_devices+0x6d/0x210 [btrfs] > [ 540.744270] btrfs_read_chunk_tree+0x3c7/0xbb0 [btrfs] > [ 540.744336] open_ctree+0xf6e/0x2074 [btrfs] > [ 540.744406] btrfs_mount_root.cold.72+0x16/0x127 [btrfs] > [ 540.744472] legacy_get_tree+0x38/0x90 > [ 540.744475] vfs_get_tree+0x30/0x140 > [ 540.744478] fc_mount+0x16/0x60 > [ 540.744482] vfs_kern_mount+0x91/0x100 > [ 540.744484] btrfs_mount+0x1e6/0x670 [btrfs] > [ 540.744536] legacy_get_tree+0x38/0x90 > [ 540.744537] vfs_get_tree+0x30/0x140 > [ 540.744539] path_mount+0x8d8/0x1070 > [ 540.744541] do_mount+0x8d/0xc0 > [ 540.744543] __x64_sys_mount+0x125/0x160 > [ 540.744545] do_syscall_64+0x4b/0x80 > [ 540.744547] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > [ 540.744551] other info that might help us debug this: > [ 540.744552] Possible unsafe locking scenario: > > [ 540.744553] CPU0 CPU1 > [ 540.744554] ---- ---- > [ 540.744555] lock(btrfs-chunk-00); > [ 540.744557] lock(&fs_devs->device_list_mutex); > [ 540.744560] lock(btrfs-chunk-00); > [ 540.744562] lock(&fs_devs->device_list_mutex); > [ 540.744564] > *** DEADLOCK *** > > [ 540.744565] 3 locks held by mount/2515: > [ 540.744567] #0: ffffa0c56bf7a0e0 (&type->s_umount_key#42/1){+.+.}-{4:4}, at: alloc_super.isra.16+0xdf/0x450 > [ 540.744574] #1: ffffffffc05a9628 (uuid_mutex){+.+.}-{4:4}, at: btrfs_read_chunk_tree+0x63/0xbb0 [btrfs] > [ 540.744640] #2: ffffa0c54a7932b8 (btrfs-chunk-00){++++}-{4:4}, at: __btrfs_tree_read_lock+0x32/0x200 [btrfs] > [ 540.744708] > stack backtrace: > [ 540.744712] CPU: 2 PID: 2515 Comm: mount Not tainted 5.11.0-rc7+ #5 > > But the device_list_mutex in clone_fs_devices() is redundant, as explained > below. > Two threads [1] and [2] (below) could lead to clone_fs_device(). > > [1] > open_ctree <== mount sprout fs > btrfs_read_chunk_tree() > mutex_lock(&uuid_mutex) <== global lock > read_one_dev() > open_seed_devices() > clone_fs_devices() <== seed fs_devices > mutex_lock(&orig->device_list_mutex) <== seed fs_devices > > [2] > btrfs_init_new_device() <== sprouting > mutex_lock(&uuid_mutex); <== global lock > btrfs_prepare_sprout() > lockdep_assert_held(&uuid_mutex) > clone_fs_devices(seed_fs_device) <== seed fs_devices > > Both of these threads hold uuid_mutex which is sufficient to protect > getting the seed device(s) freed while we are trying to clone it for > sprouting [2] or mounting a sprout [1] (as above). A mounted seed > device can not free/write/replace because it is read-only. An unmounted > seed device can free by btrfs_free_stale_devices(), but it needs uuid_mutex. > So this patch removes the unnecessary device_list_mutex in clone_fs_devices(). > And adds a lockdep_assert_held(&uuid_mutex) in clone_fs_devices(). > > Reported-by: Su Yue <l@damenly.su> > Signed-off-by: Anand Jain <anand.jain@oracle.com> For the record, this patch has been added to misc-next. ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent 2021-08-31 1:21 [PATCH V5 0/2] btrfs: device_list_mutex fix lockdep warn and cleanup Anand Jain 2021-08-31 1:21 ` [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs Anand Jain @ 2021-08-31 1:21 ` Anand Jain 2021-08-31 13:03 ` Nikolay Borisov 2021-09-17 15:37 ` David Sterba 1 sibling, 2 replies; 12+ messages in thread From: Anand Jain @ 2021-08-31 1:21 UTC (permalink / raw) To: linux-btrfs, dsterba; +Cc: l btrfs_prepare_sprout() moves seed devices into its own struct fs_devices, so that its parent function btrfs_init_new_device() can add the new sprout device to fs_info->fs_devices. Both btrfs_prepare_sprout() and btrfs_init_new_device() needs device_list_mutex. But they are holding it sequentially, thus creates a small window to an opportunity to race. Close this opportunity and hold device_list_mutex common to both btrfs_init_new_device() and btrfs_prepare_sprout(). Signed-off-by: Anand Jain <anand.jain@oracle.com> --- RFC because IMO the cleanup of device_list_mutex makes sense even though there isn't another thread that could race potentially race as of now. Depends on [PATCH v2] btrfs: fix lockdep warning while mounting sprout fs which removed the device_list_mutex from clone_fs_devices() otherwise this patch will cause a double mutex error. v2: fix the missing mutex_unlock in the error return v3: - v4: - v5: - (Except for the change in below SO comments) fs/btrfs/volumes.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index fa9fe47b5b68..53ead67b625c 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2369,6 +2369,8 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) u64 super_flags; lockdep_assert_held(&uuid_mutex); + lockdep_assert_held(&fs_devices->device_list_mutex); + if (!fs_devices->seeding) return -EINVAL; @@ -2400,7 +2402,6 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) INIT_LIST_HEAD(&seed_devices->alloc_list); mutex_init(&seed_devices->device_list_mutex); - mutex_lock(&fs_devices->device_list_mutex); list_splice_init_rcu(&fs_devices->devices, &seed_devices->devices, synchronize_rcu); list_for_each_entry(device, &seed_devices->devices, dev_list) @@ -2416,7 +2417,6 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) generate_random_uuid(fs_devices->fsid); memcpy(fs_devices->metadata_uuid, fs_devices->fsid, BTRFS_FSID_SIZE); memcpy(disk_super->fsid, fs_devices->fsid, BTRFS_FSID_SIZE); - mutex_unlock(&fs_devices->device_list_mutex); super_flags = btrfs_super_flags(disk_super) & ~BTRFS_SUPER_FLAG_SEEDING; @@ -2591,10 +2591,12 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path device->dev_stats_valid = 1; set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE); + mutex_lock(&fs_devices->device_list_mutex); if (seeding_dev) { btrfs_clear_sb_rdonly(sb); ret = btrfs_prepare_sprout(fs_info); if (ret) { + mutex_unlock(&fs_devices->device_list_mutex); btrfs_abort_transaction(trans, ret); goto error_trans; } @@ -2604,7 +2606,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path device->fs_devices = fs_devices; - mutex_lock(&fs_devices->device_list_mutex); mutex_lock(&fs_info->chunk_mutex); list_add_rcu(&device->dev_list, &fs_devices->devices); list_add(&device->dev_alloc_list, &fs_devices->alloc_list); -- 2.31.1 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent 2021-08-31 1:21 ` [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent Anand Jain @ 2021-08-31 13:03 ` Nikolay Borisov 2021-09-03 3:08 ` Anand Jain 2021-09-17 15:37 ` David Sterba 1 sibling, 1 reply; 12+ messages in thread From: Nikolay Borisov @ 2021-08-31 13:03 UTC (permalink / raw) To: Anand Jain, linux-btrfs, dsterba; +Cc: l On 31.08.21 г. 4:21, Anand Jain wrote: > btrfs_prepare_sprout() moves seed devices into its own struct fs_devices, > so that its parent function btrfs_init_new_device() can add the new sprout > device to fs_info->fs_devices. > > Both btrfs_prepare_sprout() and btrfs_init_new_device() needs > device_list_mutex. But they are holding it sequentially, thus creates a > small window to an opportunity to race. Close this opportunity and hold > device_list_mutex common to both btrfs_init_new_device() and > btrfs_prepare_sprout(). > > Signed-off-by: Anand Jain <anand.jain@oracle.com> That's a moot point, what's important is that btrfs_prepare_sprout leaves the fs_devices in a consistent state and btrfs_init_new_device also takes the lock when it's about to modify the devices list, so that's fine as well. While the patch itself won't do any harm I think it's irrelevant. <snip> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent 2021-08-31 13:03 ` Nikolay Borisov @ 2021-09-03 3:08 ` Anand Jain 0 siblings, 0 replies; 12+ messages in thread From: Anand Jain @ 2021-09-03 3:08 UTC (permalink / raw) To: Nikolay Borisov, dsterba; +Cc: l, linux-btrfs On 31/08/2021 21:03, Nikolay Borisov wrote: > > > On 31.08.21 г. 4:21, Anand Jain wrote: >> btrfs_prepare_sprout() moves seed devices into its own struct fs_devices, >> so that its parent function btrfs_init_new_device() can add the new sprout >> device to fs_info->fs_devices. >> >> Both btrfs_prepare_sprout() and btrfs_init_new_device() needs >> device_list_mutex. But they are holding it sequentially, thus creates a >> small window to an opportunity to race. Close this opportunity and hold >> device_list_mutex common to both btrfs_init_new_device() and >> btrfs_prepare_sprout(). >> >> Signed-off-by: Anand Jain <anand.jain@oracle.com> > > That's a moot point, what's important is that btrfs_prepare_sprout > leaves the fs_devices in a consistent state and btrfs_init_new_device > also takes the lock when it's about to modify the devices list, so > that's fine as well. While the patch itself won't do any harm I think > it's irrelevant. This patch is for the cleanup of the device_list_mutex. Thanks, Anand ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent 2021-08-31 1:21 ` [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent Anand Jain 2021-08-31 13:03 ` Nikolay Borisov @ 2021-09-17 15:37 ` David Sterba 2021-09-18 0:10 ` Anand Jain 1 sibling, 1 reply; 12+ messages in thread From: David Sterba @ 2021-09-17 15:37 UTC (permalink / raw) To: Anand Jain; +Cc: linux-btrfs, dsterba, l On Tue, Aug 31, 2021 at 09:21:29AM +0800, Anand Jain wrote: > btrfs_prepare_sprout() moves seed devices into its own struct fs_devices, > so that its parent function btrfs_init_new_device() can add the new sprout > device to fs_info->fs_devices. > > Both btrfs_prepare_sprout() and btrfs_init_new_device() needs > device_list_mutex. But they are holding it sequentially, thus creates a > small window to an opportunity to race. Close this opportunity and hold > device_list_mutex common to both btrfs_init_new_device() and > btrfs_prepare_sprout(). I don't se what exactly would go wrong with the separate device list locking, but I see at least one potential problem with the new code. > > Signed-off-by: Anand Jain <anand.jain@oracle.com> > --- > RFC because IMO the cleanup of device_list_mutex makes sense even though > there isn't another thread that could race potentially race as of now. > > Depends on > [PATCH v2] btrfs: fix lockdep warning while mounting sprout fs > which removed the device_list_mutex from clone_fs_devices() otherwise > this patch will cause a double mutex error. > > v2: fix the missing mutex_unlock in the error return > v3: - > v4: - > v5: - (Except for the change in below SO comments) > > fs/btrfs/volumes.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index fa9fe47b5b68..53ead67b625c 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -2369,6 +2369,8 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) > u64 super_flags; > > lockdep_assert_held(&uuid_mutex); > + lockdep_assert_held(&fs_devices->device_list_mutex); > + > if (!fs_devices->seeding) > return -EINVAL; > > @@ -2400,7 +2402,6 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) > INIT_LIST_HEAD(&seed_devices->alloc_list); > mutex_init(&seed_devices->device_list_mutex); A few lines before this one there's alloc_fs_devices and clone_fs_devices, both allocating memory. This would happen under a big lock as device_list_mutex also protects superblock write. This is a pattern to avoid. A rough idea would be to split btrfs_prepare_sprout into parts where the allocations are not done under the lock and the locked part. It could be partially inlined to btrfs_init_new_device. > > - mutex_lock(&fs_devices->device_list_mutex); > list_splice_init_rcu(&fs_devices->devices, &seed_devices->devices, > synchronize_rcu); > list_for_each_entry(device, &seed_devices->devices, dev_list) > @@ -2416,7 +2417,6 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) > generate_random_uuid(fs_devices->fsid); > memcpy(fs_devices->metadata_uuid, fs_devices->fsid, BTRFS_FSID_SIZE); > memcpy(disk_super->fsid, fs_devices->fsid, BTRFS_FSID_SIZE); > - mutex_unlock(&fs_devices->device_list_mutex); > > super_flags = btrfs_super_flags(disk_super) & > ~BTRFS_SUPER_FLAG_SEEDING; > @@ -2591,10 +2591,12 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path > device->dev_stats_valid = 1; > set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE); > > + mutex_lock(&fs_devices->device_list_mutex); > if (seeding_dev) { > btrfs_clear_sb_rdonly(sb); > ret = btrfs_prepare_sprout(fs_info); > if (ret) { > + mutex_unlock(&fs_devices->device_list_mutex); > btrfs_abort_transaction(trans, ret); > goto error_trans; > } > @@ -2604,7 +2606,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path > > device->fs_devices = fs_devices; > > - mutex_lock(&fs_devices->device_list_mutex); > mutex_lock(&fs_info->chunk_mutex); > list_add_rcu(&device->dev_list, &fs_devices->devices); > list_add(&device->dev_alloc_list, &fs_devices->alloc_list); > -- > 2.31.1 ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent 2021-09-17 15:37 ` David Sterba @ 2021-09-18 0:10 ` Anand Jain 0 siblings, 0 replies; 12+ messages in thread From: Anand Jain @ 2021-09-18 0:10 UTC (permalink / raw) To: dsterba, linux-btrfs, dsterba, l On 17/09/2021 23:37, David Sterba wrote: > On Tue, Aug 31, 2021 at 09:21:29AM +0800, Anand Jain wrote: >> btrfs_prepare_sprout() moves seed devices into its own struct fs_devices, >> so that its parent function btrfs_init_new_device() can add the new sprout >> device to fs_info->fs_devices. >> >> Both btrfs_prepare_sprout() and btrfs_init_new_device() needs >> device_list_mutex. But they are holding it sequentially, thus creates a >> small window to an opportunity to race. Close this opportunity and hold >> device_list_mutex common to both btrfs_init_new_device() and >> btrfs_prepare_sprout(). > > I don't se what exactly would go wrong with the separate device list > locking, but I see at least one potential problem with the new code. > >> >> Signed-off-by: Anand Jain <anand.jain@oracle.com> >> --- >> RFC because IMO the cleanup of device_list_mutex makes sense even though >> there isn't another thread that could race potentially race as of now. >> >> Depends on >> [PATCH v2] btrfs: fix lockdep warning while mounting sprout fs >> which removed the device_list_mutex from clone_fs_devices() otherwise >> this patch will cause a double mutex error. >> >> v2: fix the missing mutex_unlock in the error return >> v3: - >> v4: - >> v5: - (Except for the change in below SO comments) >> >> fs/btrfs/volumes.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index fa9fe47b5b68..53ead67b625c 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -2369,6 +2369,8 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) >> u64 super_flags; >> >> lockdep_assert_held(&uuid_mutex); >> + lockdep_assert_held(&fs_devices->device_list_mutex); >> + >> if (!fs_devices->seeding) >> return -EINVAL; >> >> @@ -2400,7 +2402,6 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) >> INIT_LIST_HEAD(&seed_devices->alloc_list); >> mutex_init(&seed_devices->device_list_mutex); BTW mutex_init here will go, as the sprout's private fs_devices::device_list_mutex is unused. It is a pending cleanup. > A few lines before this one there's alloc_fs_devices and > clone_fs_devices, both allocating memory. This would happen under a big > lock as device_list_mutex also protects superblock write. This is a > pattern to avoid. Oh. That's right. Thx. One way is to flag NOFS alloc. > A rough idea would be to split btrfs_prepare_sprout into parts where the > allocations are not done under the lock and the locked part. It could be > partially inlined to btrfs_init_new_device. I think you mean something like this... btrfs_init_new_device() <snip> if seeding_dev alloc_prepare_sprout mutex_lock(&fs_devices->device_list_mutex); if seeding_dev finish_prepare_sprout <snip> mutex_unlock(&fs_devices->device_list_mutex); I am trying. Thanks, Anand >> >> - mutex_lock(&fs_devices->device_list_mutex); >> list_splice_init_rcu(&fs_devices->devices, &seed_devices->devices, >> synchronize_rcu); >> list_for_each_entry(device, &seed_devices->devices, dev_list) >> @@ -2416,7 +2417,6 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) >> generate_random_uuid(fs_devices->fsid); >> memcpy(fs_devices->metadata_uuid, fs_devices->fsid, BTRFS_FSID_SIZE); >> memcpy(disk_super->fsid, fs_devices->fsid, BTRFS_FSID_SIZE); >> - mutex_unlock(&fs_devices->device_list_mutex); >> >> super_flags = btrfs_super_flags(disk_super) & >> ~BTRFS_SUPER_FLAG_SEEDING; >> @@ -2591,10 +2591,12 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path >> device->dev_stats_valid = 1; >> set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE); >> >> + mutex_lock(&fs_devices->device_list_mutex); >> if (seeding_dev) { >> btrfs_clear_sb_rdonly(sb); >> ret = btrfs_prepare_sprout(fs_info); >> if (ret) { >> + mutex_unlock(&fs_devices->device_list_mutex); >> btrfs_abort_transaction(trans, ret); >> goto error_trans; >> } >> @@ -2604,7 +2606,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path >> >> device->fs_devices = fs_devices; >> >> - mutex_lock(&fs_devices->device_list_mutex); >> mutex_lock(&fs_info->chunk_mutex); >> list_add_rcu(&device->dev_list, &fs_devices->devices); >> list_add(&device->dev_alloc_list, &fs_devices->alloc_list); >> -- >> 2.31.1 ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2021-09-18 0:10 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-31 1:21 [PATCH V5 0/2] btrfs: device_list_mutex fix lockdep warn and cleanup Anand Jain 2021-08-31 1:21 ` [PATCH V5 1/2] btrfs: fix lockdep warning while mounting sprout fs Anand Jain 2021-08-31 8:18 ` Nikolay Borisov 2021-09-02 23:51 ` Anand Jain 2021-08-31 12:37 ` Nikolay Borisov 2021-09-01 0:49 ` Su Yue 2021-09-02 15:28 ` David Sterba 2021-08-31 1:21 ` [PATCH RFC V5 2/2] btrfs: consolidate device_list_mutex in prepare_sprout to its parent Anand Jain 2021-08-31 13:03 ` Nikolay Borisov 2021-09-03 3:08 ` Anand Jain 2021-09-17 15:37 ` David Sterba 2021-09-18 0:10 ` Anand Jain
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.