On 2019/1/8 下午7:45, Qu Wenruo wrote: > > > On 2019/1/8 下午7:16, Filipe Manana wrote: >> On Tue, Jan 8, 2019 at 6:11 AM Qu Wenruo wrote: >>> >>> [BUG] >>> Linux v5.0-rc1 will fail fstests/btrfs/163 with the following kernel >>> message: >>> >>> BTRFS error (device dm-6): dev extent devid 1 physical offset 13631488 len 8388608 is beyond device boundary 0 >>> BTRFS error (device dm-6): failed to verify dev extents against chunks: -117 >>> BTRFS error (device dm-6): open_ctree failed >>> >>> [CAUSE] >>> Commit cf90d884b347 ("btrfs: Introduce mount time chunk <-> dev extent >>> mapping check") introduced strict check on dev extents. >>> >>> We use btrfs_find_device() with dev uuid and fs uuid set to NULL, and >>> only dependent on @devid to find the real device. >>> >>> For seed devices, we call clone_fs_devices() in open_seed_devices() to >>> allow us search seed devices directly. >>> >>> However clone_fs_devices() just populates devices with devid and dev >>> uuid, without populating other essential members, like disk_total_bytes. >>> >>> This makes any device returned by btrfs_find_device(fs_info, devid, >>> NULL, NULL) is just a dummy, with 0 disk_total_bytes, and any dev >>> extents on the seed device will not pass the device boundary check. >>> >>> [FIX] >>> This patch will try to verify the device returned by btrfs_find_device() >>> and if it's a dummy then re-search in seed devices. >>> >>> Reported-by: Filipe Manana >>> Fixes: cf90d884b347 ("btrfs: Introduce mount time chunk <-> dev extent mapping check") >>> Signed-off-by: Qu Wenruo >>> --- >>> fs/btrfs/volumes.c | 12 ++++++++++++ >>> 1 file changed, 12 insertions(+) >>> >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 2576b1a379c9..3e4f8f88353e 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -7825,6 +7825,18 @@ static int verify_one_dev_extent(struct btrfs_fs_info *fs_info, >>> ret = -EUCLEAN; >>> goto out; >>> } >>> + >>> + /* It's possible this device is a dummy for seed device */ >>> + if (dev->disk_total_bytes == 0) { >>> + dev = find_device(fs_info->fs_devices->seed, devid, NULL); >>> + if (!dev) { >>> + btrfs_err(fs_info, "failed to find seed devid %llu", >>> + devid); >>> + ret = -EUCLEAN; >>> + goto out; >>> + } >>> + } >> >> Why just not pass the FSID (can be taken from fs_info->super_copy) to >> the previous call to btrfs_find_device()? >> It's a lot simpler. > > Then btrfs_find_device() will just return NULL. > We still need to verify the dev extent of the seed device not to exceed > seed device boundary. OK, it's not the case. Even we can pass correct dev uuid into btrfs_find_device(), it will still give us the dummy device, with 0 disk_total_bytes. The problem is, we have a screwed up fs_info->fs_devices for seed device from the very beginning. For a seeding fs, devid 1 is seed device and devid 2 is real rw device. Then our fs_devices looks like: fs_info->fs_devices |- devices | |- devid 1 | | dummy, devid 1 dev uuid, any else is unpopulated | | created by open_seed_devices()-> clone_fs_devices() | |- devid 2 | real one, devid 2 dev uuid, everything is good |- seeding |- devices |- devid 1 real seed device. So, at btrfs_find_device() call time, we must pass the *seeding* fsid, to locate the real seed device. Or we can only get a dummy. This patch is just a quick and dirty fix. The correct way to fix is to not screw up fs_devices any more with such meaningless dummy in fs_devices. Especially when btrfs_find_device() has already handled such case. I don't understand why we do such meaningless clone_fs_devices() call, but I don't think the proper fix may catch the v5.0 window. Thanks, Qu > > What we need is device uuid, however we can't easily get that dev uuid > from dev extent item. > > Thanks, > Qu > >> >>> + >>> if (physical_offset + physical_len > dev->disk_total_bytes) { >>> btrfs_err(fs_info, >>> "dev extent devid %llu physical offset %llu len %llu is beyond device boundary %llu", >>> -- >>> 2.20.1 >>> >> >> >