On 2020/1/11 上午12:11, Josef Bacik wrote: > While running xfstests with compression on I noticed I was panicing on > btrfs/154. I bisected this down to my inc_block_group_ro patches, which > was strange. > > What was happening is with my patches we now use btrfs_can_overcommit() > to see if we can flip a block group read only. Before this would fail > because we weren't taking into account the usable un-allocated space for > allocating chunks. With my patches we were allowed to do the balance, > which is technically correct. > > However this test is testing restriping with a degraded mount, something > that isn't working right because Anand's fix for the test was never > actually merged. > > So now we're trying to allocate a chunk and cannot because we want to > allocate a RAID1 chunk, but there's only 1 device that's available for > usage. This results in an ENOSPC in one of the BUG_ON(ret) paths in > relocation (and a tricky path that is going to take many more patches to > fix.) > > But we shouldn't even be making it this far, we don't have enough > devices to restripe. The problem is we're using btrfs_num_devices(), > which for some reason includes missing devices. That's not actually > what we want, we want the rw_devices. > > Fix this by getting the rw_devices. With this patch we're no longer > panicing with my other patches applied, and we're in fact erroring out > at the correct spot instead of at inc_block_group_ro. The fact that > this was working before was just sheer dumb luck. > > Fixes: e4d8ec0f65b9 ("Btrfs: implement online profile changing") > Signed-off-by: Josef Bacik > --- > fs/btrfs/volumes.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 7483521a928b..a92059555754 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -3881,7 +3881,14 @@ int btrfs_balance(struct btrfs_fs_info *fs_info, > } > } > > - num_devices = btrfs_num_devices(fs_info); > + /* > + * rw_devices can be messed with by rm_device and device replace, so > + * take the chunk_mutex to make sure we have a relatively consistent > + * view of the fs at this point. > + */ > + mutex_lock(&fs_info->chunk_mutex); > + num_devices = fs_info->fs_devices->rw_devices; > + mutex_unlock(&fs_info->chunk_mutex); chunk_mutex is the correct lock for rw_devices counter and alloc_list. So, Reviewed-by: Qu Wenruo Thanks, Qu > > /* > * SINGLE profile on-disk has no profile bit, but in-memory we have a >