From: David Howells <email@example.com>
To: Qu Wenruo <firstname.lastname@example.org>
Cc: email@example.com, firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org
Subject: Re: Problems with determining data presence by examining extents?
Date: Wed, 15 Jan 2020 14:50:22 +0000
Message-ID: <email@example.com>
In-Reply-To: <firstname.lastname@example.org>

Qu Wenruo <email@example.com> wrote:

> "Unaligned" means "unaligned to fs sector size". In btrfs it's page
> size, thus it shouldn't be a problem for your 256K block size.

Cool.

> > Same answer as above. Btw, since I'm using DIO reads and writes, would these
> > get compressed?
>
> Yes. DIO will also be compressed unless you set the inode to nocompression.
>
> And you may not like this btrfs internal design:
> Compressed extent can only be as large as 128K (uncompressed size).
>
> So 256K block write will be split into 2 extents anyway.
> And since compressed extent will cause non-continuous physical offset,
> it will always be two extents to fiemap, even you're always writing in
> 256K block size.
>
> Not sure if this matters though.

Not a problem, provided I can read them with a single DIO read. I just need
to know whether the data is present. I don't need to know where it is or what
hoops the filesystem goes through to get it.

> > I'm not sure this isn't the same answer as above either, except if this
> > results in parts of the file being "filled in" with blocks of zeros that I
> > haven't supplied.
>
> The example would be, you have written 256K data, all filled with 0xaa.
> And it committed to disk.
> Then the next time you write another 256K data, all filled with 0xaa.
> Then instead of writing this data onto disk, the fs chooses to reuse
> your previous written data, doing a reflink to it.

That's fine as long as the filesystem says it's there when I ask for it.
Having it shared isn't a problem.

But that brings me back to the original issue and that's the potential problem
of the filesystem optimising storage by adding or removing blocks of zero
bytes. If either of those can happen, I cannot rely on the filesystem
metadata.

> So fiemap would report your latter 256K has the same bytenr of your
> previous 256K write (since it's reflinked), and with SHARED flag.

It might be better for me to use SEEK_HOLE than fiemap - barring the slight
issues that SEEK_HOLE has no upper bound and that writes may be taking place
at the same time.

David
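
For illustration only, here is a minimal sketch of what a SEEK_DATA/SEEK_HOLE presence check for one block might look like from userspace. It is not code from this thread: the helper name block_present and the constant BLOCK_SIZE are hypothetical, and the 256K value is simply the block size discussed above. It only reports the filesystem's own view of where data and holes are, so it inherits the caveats already raised: the filesystem may add or remove blocks of zeros, the hole search has no upper bound, and concurrent writes can change the answer between the two lseek() calls.

```python
import errno
import os

# Assumption: 256K blocks, as discussed in the thread.
BLOCK_SIZE = 256 * 1024


def block_present(fd, offset, length=BLOCK_SIZE):
    """Return True if the filesystem reports [offset, offset + length) as data.

    Hypothetical helper. Note that lseek() moves the file offset of fd,
    which does not matter if all I/O uses pread()/pwrite().
    """
    end = offset + length
    try:
        # First byte at or after `offset` that the filesystem reports as data.
        data_start = os.lseek(fd, offset, os.SEEK_DATA)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return False  # no data at or after offset (hole up to EOF, or past EOF)
        raise
    if data_start != offset:
        return False  # the block starts inside a hole
    # Next hole at or after `offset`; EOF counts as an implicit hole.
    hole_start = os.lseek(fd, offset, os.SEEK_HOLE)
    return hole_start >= end
```

On a filesystem that does not implement hole detection, the kernel reports the whole file as data, so a True result from this check is only as trustworthy as the filesystem's extent reporting.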