From: David Howells <firstname.lastname@example.org>
To: email@example.com, firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com
Cc: firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org, email@example.com,
    firstname.lastname@example.org
Subject: Problems with determining data presence by examining extents?
Date: Tue, 14 Jan 2020 16:48:29 +0000
Message-ID: <email@example.com>

Again with regard to my rewrite of fscache and cachefiles:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter

I've got rid of my use of bmap()!  Hooray!

However, I'm informed that I can't trust the extent map of a backing file to
tell me accurately whether content exists in a file because:

 (a) Not-quite-contiguous extents may be joined by insertion of blocks of
     zeros by the filesystem optimising itself.  This would give me a false
     positive when trying to detect the presence of data.

 (b) Blocks of zeros that I write into the file may get punched out by
     filesystem optimisation since a read back would be expected to read zeros
     there anyway, provided it's below the EOF.  This would give me a false
     negative.

Is there some setting I can use to prevent these scenarios on a file - or can
one be added?

Without being able to trust the filesystem to tell me accurately what I've
written into it, I have to use some other mechanism.  Currently, I've switched
to storing a map in an xattr with 1 bit per 256k block, but that gets hard to
use if the file grows particularly large and also has integrity consequences -
though those are hopefully limited as I'm now using DIO to store data into the
cache.

If it helps, I'm downloading data in aligned 256k blocks and storing data in
those same aligned 256k blocks, so if that makes it easier...

David
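
For illustration only, the "1 bit per 256k block" map described above could
look roughly like the following user-space sketch.  This is not the cachefiles
code: the function names, the LSB-first bit order within each byte and the
in-memory layout are all assumptions made for the example, and the step of
loading and storing the map through an xattr is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one bit per aligned 256 KiB block, as in the message. */
#define CACHE_BLOCK_SHIFT 18
#define CACHE_BLOCK_SIZE  ((uint64_t)1 << CACHE_BLOCK_SHIFT)

/* Number of map bytes needed to cover a file of file_size bytes. */
static size_t bitmap_bytes_for(uint64_t file_size)
{
	uint64_t blocks = (file_size + CACHE_BLOCK_SIZE - 1) >> CACHE_BLOCK_SHIFT;

	return (size_t)((blocks + 7) / 8);
}

/*
 * Record that the block containing byte 'offset' has been written.
 * Returns false if the block lies beyond the end of the map.
 * Bit order (LSB first within a byte) is a hypothetical choice.
 */
static bool bitmap_mark_block(uint8_t *map, size_t map_len, uint64_t offset)
{
	uint64_t block = offset >> CACHE_BLOCK_SHIFT;

	assert(map != NULL);
	if (block / 8 >= map_len)
		return false;
	map[block / 8] |= (uint8_t)(1u << (block % 8));
	return true;
}

/* Test whether the block containing byte 'offset' is marked present. */
static bool bitmap_block_present(const uint8_t *map, size_t map_len,
				 uint64_t offset)
{
	uint64_t block = offset >> CACHE_BLOCK_SHIFT;

	assert(map != NULL);
	if (block / 8 >= map_len)
		return false;	/* beyond the map: treat as not cached */
	return (map[block / 8] >> (block % 8)) & 1u;
}
```

With this layout bitmap_bytes_for() gives 512 KiB of map for a 1 TiB file,
which shows why the approach gets awkward for large files: Linux's generic
xattr value limit (XATTR_SIZE_MAX) is 64 KiB, enough for only 128 GiB at this
granularity, and individual filesystems may allow less.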