On 2020/7/31 下午10:08, David Sterba wrote: > On Fri, Jul 31, 2020 at 07:29:11PM +0800, Qu Wenruo wrote: >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -4720,6 +4720,18 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size) >> } >> >> mutex_lock(&fs_info->chunk_mutex); >> + /* >> + * Also clear any CHUNK_TRIMMED and CHUNK_ALLOCATED bits beyond the >> + * current device boundary. >> + * This shouldn't fail, as alloc_state should only utilize those two >> + * bits, thus we shouldn't alloc new memory for clearing the status. > > If this fails or not depends on implementation details of > clear_extent_bits and this comment will get out of sync eventually, so I > don't think it should be that specific. > > If the new_size is somewhere in the middle of an existing state, it'll > need to be split anyway, no? Nope. Because in alloc_state we only have two bits utilized, CHUNK_TRIMMED and CHUNK_ALLOCATED. Thus what we're doing is to clear all utilized bits. > > alloc_state |-----+++++| > clear |------------------------- ... (u64)-1| > > So we'd need to keep the state "-" and unset bits only from "+", and > this will require a split. In this case, we would only reduce the the size of the existing status, or just remove it completely. > > But I still have doubts about just clearing the range, why are there any > device->alloc_state entries at all after device is shrunk? Because the alloc_state is mostly only utilized by trim facility, thus existing functions won't bother clearing/setting it. In this particular case, previous fstrim run would set the CHUNK_TRIMMED bit for all unallocated range (except the super reserve). Then shrink doesn't clear the exceed range, and cause problem. Thus clearing the bit in btrfs_shrink_device() makes sense. > Using > clear_extent_bits here is not wrong if we look at the end result of > clearing the range, but otherwise it leaves some state information > and allocated memory behind. > Not that complex case, just plain not fully considered corner case. Thanks, Qu