Linux-ext4 Archive on lore.kernel.org
* Problems with determining data presence by examining extents?
@ 2020-01-14 16:48 David Howells
From: David Howells @ 2020-01-14 16:48 UTC
  To: linux-fsdevel, viro, hch, tytso, adilger.kernel, darrick.wong,
	clm, josef, dsterba
  Cc: dhowells, linux-ext4, linux-xfs, linux-btrfs, linux-kernel

Again with regard to my rewrite of fscache and cachefiles:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter

I've got rid of my use of bmap()!  Hooray!

However, I'm informed that I can't trust the extent map of a backing file to
tell me accurately whether content exists in a file because:

 (a) Not-quite-contiguous extents may be joined by insertion of blocks of
     zeros by the filesystem optimising itself.  This would give me a false
     positive when trying to detect the presence of data.

 (b) Blocks of zeros that I write into the file may get punched out by
     filesystem optimisation since a read back would be expected to read zeros
     there anyway, provided it's below the EOF.  This would give me a false
     negative.

Is there some setting I can use to prevent these scenarios on a file - or can
one be added?
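
For illustration, the kind of presence check that (a) and (b) undermine can be
sketched from userspace.  This is a minimal sketch, assuming lseek() with
SEEK_DATA as the probe; the interface actually used inside cachefiles is not
stated here, and block_has_data() and demo() are hypothetical names.

```python
import errno
import os
import tempfile

BLOCK_SIZE = 256 * 1024  # the aligned 256k granule used for the cache


def block_has_data(fd: int, block: int) -> bool:
    """Report whether the filesystem claims any data inside the given block.

    This trusts the filesystem's own view of data and holes, so it is
    subject to the false positives of (a) and false negatives of (b).
    """
    start = block * BLOCK_SIZE
    try:
        # Find the first offset >= start that the filesystem reports as data.
        data_at = os.lseek(fd, start, os.SEEK_DATA)
    except OSError as exc:
        # ENXIO: no data at or after 'start' (hole up to EOF, or past EOF).
        if exc.errno == errno.ENXIO:
            return False
        raise
    return data_at < start + BLOCK_SIZE


def demo() -> list:
    """Write one non-zero block into a sparse file and probe all four blocks."""
    with tempfile.TemporaryFile() as f:
        f.truncate(4 * BLOCK_SIZE)
        os.pwrite(f.fileno(), b"\xff" * BLOCK_SIZE, 2 * BLOCK_SIZE)
        return [block_has_data(f.fileno(), b) for b in range(4)]
```

Block 2 of demo() is reported as present; what the probe says about the other
three blocks depends on the filesystem, which is exactly the problem.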

Without being able to trust the filesystem to tell me accurately what I've
written into it, I have to use some other mechanism.  Currently, I've switched
to storing a map in an xattr with 1 bit per 256k block, but that gets hard to
use if the file grows particularly large and also has integrity consequences -
though those are hopefully limited as I'm now using DIO to store data into the
cache.
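
To put a number on "particularly large", here is a rough sketch of the bitmap
arithmetic.  The bit ordering (least significant bit first) and the helper
names are assumptions made for illustration, not the layout the code actually
uses.  XATTR_SIZE_MAX is the 64 KiB upper bound Linux places on a single xattr
value; individual filesystems may allow considerably less.

```python
BLOCK_SIZE = 256 * 1024      # one bit of the map covers one 256k block
XATTR_SIZE_MAX = 64 * 1024   # Linux VFS limit on a single xattr value


def map_size_bytes(file_size: int) -> int:
    """Bytes of bitmap needed to cover file_size at 1 bit per block."""
    blocks = -(-file_size // BLOCK_SIZE)  # ceiling division
    return -(-blocks // 8)


def mark_present(bitmap: bytearray, offset: int) -> None:
    """Set the bit for the block containing 'offset' (assumed LSB-first)."""
    block = offset // BLOCK_SIZE
    bitmap[block // 8] |= 1 << (block % 8)


def is_present(bitmap: bytearray, offset: int) -> bool:
    """Test the bit for the block containing 'offset'."""
    block = offset // BLOCK_SIZE
    byte = block // 8
    return byte < len(bitmap) and bool(bitmap[byte] & (1 << (block % 8)))


# Largest file a single maximum-size xattr could describe: 128 GiB.
MAX_COVERED = XATTR_SIZE_MAX * 8 * BLOCK_SIZE
```

Under these assumptions a 1 TiB backing file needs a 512 KiB map, eight times
what one xattr can hold at best, so the map would have to be split or moved
elsewhere well before that point.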

If it helps, I'm downloading data in aligned 256k blocks and storing it in
those same aligned 256k blocks, in case that makes things easier.

David



Thread overview: 24+ messages
2020-01-14 16:48 Problems with determining data presence by examining extents? David Howells
2020-01-14 22:49 ` Theodore Y. Ts'o
2020-01-15  3:54 ` Qu Wenruo
2020-01-15 12:46   ` Andreas Dilger
2020-01-15 13:10     ` Qu Wenruo
2020-01-15 13:31       ` Christoph Hellwig
2020-01-15 19:48         ` Andreas Dilger
2020-01-16 10:16           ` Christoph Hellwig
2020-01-15 20:55         ` David Howells
2020-01-15 22:11           ` Andreas Dilger
2020-01-15 23:09           ` David Howells
2020-01-26 18:19             ` Zygo Blaxell
2020-01-15 14:35       ` David Howells
2020-01-15 14:48         ` Christoph Hellwig
2020-01-15 14:59         ` David Howells
2020-01-16 10:13           ` Christoph Hellwig
2020-01-17 16:43           ` David Howells
2020-01-15 14:20   ` David Howells
2020-01-15  8:38 ` Christoph Hellwig
2020-01-15 13:50 ` David Howells
2020-01-15 14:05 ` David Howells
2020-01-15 14:24   ` Qu Wenruo
2020-01-15 14:50   ` David Howells
2020-01-15 14:15 ` David Howells
