From: Andreas Dilger <firstname.lastname@example.org>
To: David Howells <email@example.com>, Christoph Hellwig <firstname.lastname@example.org>
Cc: Qu Wenruo <email@example.com>, linux-fsdevel <firstname.lastname@example.org>, Al Viro <email@example.com>, "Theodore Y. Ts'o" <firstname.lastname@example.org>, "Darrick J. Wong" <email@example.com>, Chris Mason <firstname.lastname@example.org>, Josef Bacik <email@example.com>, David Sterba <firstname.lastname@example.org>, linux-ext4 <email@example.com>, linux-xfs <firstname.lastname@example.org>, linux-btrfs <email@example.com>, Linux Kernel Mailing List <firstname.lastname@example.org>
Subject: Re: Problems with determining data presence by examining extents?
Date: Wed, 15 Jan 2020 12:48:44 -0700
Message-ID: <C0F67EC5-7B5D-4179-9F28-95B84D9CC326@dilger.ca>
In-Reply-To: <20200115133101.GA28583@lst.de>

On Jan 15, 2020, at 6:31 AM, Christoph Hellwig <email@example.com> wrote:
>
> On Wed, Jan 15, 2020 at 09:10:44PM +0800, Qu Wenruo wrote:
>>> That allows userspace to distinguish fe_physical addresses that may be
>>> on different devices. This isn't in the kernel yet, since it is mostly
>>> useful only for Btrfs and nobody has implemented it there. I can give
>>> you details if working on this for Btrfs is of interest to you.
>>
>> IMHO it's not good enough.
>>
>> The concern is, one extent can exist on multiple devices (mirrors for
>> RAID1/RAID10/RAID1C2/RAID1C3, or stripes for RAID5/6).
>> I didn't see how it can be easily implemented even with extra fields.
>>
>> And even if we implement it, it can be too complex or bug prone to fill
>> per-device info.
>
> It's also completely bogus for the use cases to start with. fiemap
> is a debug tool reporting the file system layout. Using it for anything
> related to actual data storage and data integrity is a recipe for
> disaster. As said the right thing for the use case would be something
> like the NFS READ_PLUS operation. If we can't get that easily it can
> be emulated using lseek SEEK_DATA / SEEK_HOLE assuming no other thread
> could be writing to the file, or the raciness doesn't matter.

I don't think either of those will be any better than FIEMAP, if the
reason is that the underlying filesystem is filling in holes with actual
data blocks to optimize the IO pattern. SEEK_HOLE would not find a hole
in the block allocation, and would happily return the block of zeroes to
the caller. Also, it isn't clear whether SEEK_HOLE considers an allocated
but unwritten extent to be a hole or data.

I think what is needed here is an fadvise/ioctl that tells the filesystem
"don't allocate blocks unless actually written" for that file. Storing
anything in a separate data structure is a recipe for disaster, since it
will become inconsistent after a crash, or filesystem corruption+e2fsck,
and will unnecessarily bloat the on-disk metadata for every file to hold
redundant information.

I don't see COW/reflink/compression as being a problem in this case,
since what cachefiles cares about is whether there is _any_ data for a
given logical offset, not where/how the data is stored.

If FIEMAP were used for a btrfs backing filesystem, it would need the
"EXTENT_DATA_COMPRESSED" feature to be implemented as well, so that it
can distinguish the logical vs. physical allocations. I don't think that
would be needed for SEEK_HOLE and SEEK_DATA, so long as they handle
unwritten extents properly (and are correctly implemented in the first
place; some filesystems fall back to always returning the next block for
SEEK_DATA).

Cheers, Andreas
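
For readers following the SEEK_DATA / SEEK_HOLE suggestion quoted above, here is a minimal sketch of how userspace might walk a file's data ranges with lseek. It is an illustration only, not code from this thread or from cachefiles: the names data_segments and demo are hypothetical, and it assumes a platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE (Linux does). It is subject to exactly the caveats raised in the message: a filesystem that fills holes with real zeroed blocks will report them as data, a filesystem without hole support may report the whole file as one data range, and the walk is racy if another thread writes to the file.

```python
import errno
import os
import tempfile


def data_segments(fd):
    """Return [(start, end), ...] byte ranges the filesystem reports as data.

    Hypothetical helper: alternates lseek(SEEK_DATA) and lseek(SEEK_HOLE).
    The result is only what the filesystem chooses to report.
    """
    size = os.fstat(fd).st_size
    segments = []
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # no data at or after offset: the rest is a hole
            raise
        # There is always an implicit hole at end of file, so this succeeds
        # unless the file shrinks underneath us.
        end = os.lseek(fd, start, os.SEEK_HOLE)
        if end <= start:
            break  # defensive: file changed concurrently
        segments.append((start, end))
        offset = end
    return segments


def demo():
    """Create a 4 MiB sparse temporary file with 4 KiB written at 1 MiB."""
    with tempfile.TemporaryFile() as f:
        f.truncate(4 * 1024 * 1024)   # extend without writing: a hole on most filesystems
        f.seek(1024 * 1024)
        f.write(b"x" * 4096)
        f.flush()
        return data_segments(f.fileno())
```

Calling demo() on a filesystem with hole support typically yields a single range around the written 4 KiB (possibly rounded out to block boundaries); on a filesystem that falls back to treating everything as data it yields one range covering the whole file, which is the fallback behaviour mentioned at the end of the message.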