On Feb 11, 2019, at 8:23 AM, Matthew Wilcox wrote:
>
> On Mon, Feb 11, 2019 at 10:43:06AM +0100, Carlos Maiolino wrote:
>> - The general idea, is to provide a way for FIEMAP ioctls to return the device
>>   id where each extent is physically located.
>
> How does userspace get to use this information?  If I call fiemap() and
> it tells me extent 1 is on device 0x12345678 and extent 2 is on device
> 0x34567812, what can I do with that information?

For filesystems that may store a file on different devices, filefrag
will print out which device the file is located on, so that users can
see where the file is located.

Programs (e.g. a mythical LILO that used FIEMAP instead of FIBMAP) could
check fe_device to see whether the whole file is located on the same
block device or not, and not allow booting from such a file.

> Bear in mind that glibc uses a different dev_t from the kernel.

That is glibc's problem.  The kernel would return fe_device using the
same dev_t that it uses for stat.st_dev and friends.  Even so, the
majority of users will care about "these blocks/files are on a different
device than those other blocks/files" and not the exact meaning of the
bits.

>> - This is particularly useful for those filesystems where the file extents are
>>   located on a different block device other than that associated with the
>>   superblock, for example, btrfs using multiple devices, and XFS when using a
>>   real-time device.
>
> Darrick said it was useful for _inside_ the kernel.  How is it useful
> for outside the kernel?

In my experience, this can be very useful for users to understand how
their file is allocated if there are performance or other issues with a
particular device.  Also, in some respects, it is _required_ for
multi-device filesystems, since it makes it clear that block 123 on one
device is not related to the same block number on a different device.

It may well be that ext4 will get some kind of multi-device capability
in the future (e.g. with the existing ext4 SMR patch using a separate
flash journal device and file data being permanently kept in the journal
instead of the HDD, or storing all the metadata on a flash device and
all data on a HDD device).

Cheers, Andreas