On 2019/8/7 下午5:43, Anand Jain wrote: > > >> On 7 Aug 2019, at 4:55 PM, Qu Wenruo wrote: >> >> >> >> On 2019/8/7 下午4:44, Filipe Manana wrote: >>> On Wed, Aug 7, 2019 at 9:35 AM Anand Jain wrote: >>>> >>>> Commit 6ba9fc8e628b (btrfs: Ensure btrfs_trim_fs can trim the whole >>>> filesystem) makes sure we always trim starting from the first block group. >>>> However it also removed the range.start validity check which is set in the >>>> context of the user, where its range is from 0 to maximum of filesystem >>>> totalbytes and so we have to check its validity in the kernel. >>>> >>>> Also as in the fstrim(8) [1] the kernel layers may modify the trim range. >>>> >>>> [1] >>>> Further, the kernel block layer reserves the right to adjust the discard >>>> ranges to fit raid stripe geometry, non-trim capable devices in a LVM >>>> setup, etc. These reductions would not be reflected in fstrim_range.len >>>> (the --length option). >>>> >>>> This patch undos the deleted range::start validity check. >>>> >>>> Fixes: 6ba9fc8e628b (btrfs: Ensure btrfs_trim_fs can trim the whole filesystem) >>>> Signed-off-by: Anand Jain >>>> --- >>>> With this patch fstests generic/260 is successful now. >>>> >>>> fs/btrfs/ioctl.c | 2 ++ >>>> 1 file changed, 2 insertions(+) >>>> >>>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c >>>> index b431f7877e88..9345fcdf80c7 100644 >>>> --- a/fs/btrfs/ioctl.c >>>> +++ b/fs/btrfs/ioctl.c >>>> @@ -521,6 +521,8 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg) >>>> return -EOPNOTSUPP; >>>> if (copy_from_user(&range, arg, sizeof(range))) >>>> return -EFAULT; >>>> + if (range.start > btrfs_super_total_bytes(fs_info->super_copy)) >>>> + return -EINVAL; >>> >>> This makes it impossible to trim block groups that start at an offset >>> greater then the super_total_bytes values. >> > > what did I miss? As there is no limit on the range::length > so we can still trim beyond super_total_bytes. So I don’t > understand why not? More below. > > >> Exactly. >> >>> In fact, in extreme cases >>> it's possible all block groups start at offsets larger then >>> super_total_bytes. >>> Have you considered that, or am I missing something? >> >> And, I have already mentioned exactly the same reason in that commit >> message. >> >> To address it once again, the bytenr is btrfs logical address space, has >> nothing to do with any device. >> Thus it can be [0, U64MAX]. >> > > Fundamentally, logical address space has no relevance in the user context, > so also I don’t understand your view on how anyone shall use the range::start > even if there is no check? range::start == bg_bytenr, range::len = bg_len to trim only a bg. And that bg_bytenr is at 128T, since the fs has gone through several balance. But there is only one device, and its size is only 1T. Please tell me how to trim that block group only? > > As in the man page it's ok to adjust the range internally, and as length > can be up to U64MAX we can still trim beyond super_total_bytes? As I said already, super_total_bytes makes no sense in logical address space. As super_total_bytes is just the sum of all devices, it's a device layer thing, nothing to do with logical address space. You're mixing logical bytenr with something not even a device physical offset, how can it be correct? Let me make it more clear, btrfs has its own logical address space unrelated to whatever the devices mapping are. It's always [0, U64_MAX], no matter how many devices there are. If btrfs is implemented using dm, it should be more clear. (single device btrfs) | (dm linear, 0 ~ U64_MAX, virtual devices)<- that's logical address space | | | | | | | \- (dm raid1, 1T ~ 1T + 128M, devid1 XXX, devid2 XXX) | | \------ (dm raid0, 2T ~ 2T + 1G, devid1 XXX, devid2 XXX) | \---------- (dm raid1, 128G ~ 128G + 128M, devi1 XXX, devid xxx) \-------------- (dm raid0, 1M ~ 1M + 1G, devid1 xxx, devid2 xxx). If we're trim such fs layout, you tell me which offset you should use. Thanks, Qu > > Thanks, Anand > > >> Thanks, >> Qu >> >>> >>> The change log is also vague to me, doesn't explain why you are >>> re-adding that check. >>> >>> Thanks. >>> >>>> >>>> /* >>>> * NOTE: Don't truncate the range using super->total_bytes. Bytenr of >>>> -- >>>> 2.21.0 (Apple Git-120) >>>> >>> >>> >> >