linux-ext4.vger.kernel.org archive mirror
From: David Howells <dhowells@redhat.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: dhowells@redhat.com, linux-fsdevel@vger.kernel.org,
	viro@zeniv.linux.org.uk, hch@lst.de, tytso@mit.edu,
	adilger.kernel@dilger.ca, darrick.wong@oracle.com, clm@fb.com,
	josef@toxicpanda.com, dsterba@suse.com,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-btrfs@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Problems with determining data presence by examining extents?
Date: Wed, 15 Jan 2020 14:50:22 +0000
Message-ID: <27263.1579099822@warthog.procyon.org.uk>
In-Reply-To: <6330a53c-781b-83d7-8293-405787979736@gmx.com>

Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:

> "Unaligned" means "unaligned to fs sector size". In btrfs it's page
> size, thus it shouldn't be a problem for your 256K block size.

Cool.

> > Same answer as above.  Btw, since I'm using DIO reads and writes, would these
> > get compressed?
> 
> Yes. DIO will also be compressed unless you set the inode to nocompression.
> 
> And you may not like this btrfs internal design:
> Compressed extent can only be as large as 128K (uncompressed size).
> 
> So 256K block write will be split into 2 extents anyway.
> And since compressed extents will have non-contiguous physical offsets,
> fiemap will always report two extents, even if you're always writing in
> 256K block size.
> 
> Not sure if this matters though.

Not a problem, provided I can read them with a single DIO read.  I just need
to know whether the data is present.  I don't need to know where it is or what
hoops the filesystem goes through to get it.
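
To illustrate why the extent count itself doesn't matter for a presence
check, here is a minimal sketch in Python.  It assumes the extents have
already been obtained from some fiemap-style query as (logical offset,
length) pairs; the function name range_covered and the example layout are
hypothetical, and physical offsets are deliberately ignored.

```python
def range_covered(extents, start, length):
    """Return True if the extents jointly cover [start, start + length).

    extents: iterable of (logical_offset, length) pairs, as a fiemap-style
    query might report them. Where the data physically lives is ignored.
    """
    end = start + length
    pos = start
    for logical, ext_len in sorted(extents):
        if logical > pos:
            break  # gap before this extent: part of the range is missing
        pos = max(pos, logical + ext_len)
        if pos >= end:
            return True
    return pos >= end


K = 1024
# Hypothetical layout: one 256K write stored as two 128K compressed extents.
two_extents = [(0, 128 * K), (128 * K, 128 * K)]
print(range_covered(two_extents, 0, 256 * K))      # both halves reported
print(range_covered([(0, 128 * K)], 0, 256 * K))   # second half missing
```

This only answers "does the filesystem report the whole block as mapped?";
it says nothing about whether that report can be trusted.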

> > I'm not sure this isn't the same answer as above either, except if this
> > results in parts of the file being "filled in" with blocks of zeros that I
> > haven't supplied.
> 
> The example would be, you have written 256K of data, all filled with 0xaa.
> And it's committed to disk.
> Then the next time you write another 256K of data, all filled with 0xaa.
> Then instead of writing this data onto disk, the fs chooses to reuse
> your previously written data, doing a reflink to it.

That's fine as long as the filesystem says it's there when I ask for it.
Having it shared isn't a problem.

But that brings me back to the original issue and that's the potential problem
of the filesystem optimising storage by adding or removing blocks of zero
bytes.  If either of those can happen, I cannot rely on the filesystem
metadata.

> So fiemap would report your latter 256K has the same bytenr as your
> previous 256K write (since it's reflinked), and with the SHARED flag.

It might be better for me to use SEEK_HOLE than fiemap - barring the slight
issues that SEEK_HOLE has no upper bound and that writes may be taking place
at the same time.
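
A rough sketch of what such a check could look like from userspace, in
Python, using lseek() with SEEK_DATA and SEEK_HOLE.  The helper names are
hypothetical and this is not what the kernel-side code would do; it also
does nothing about concurrent writes, and because SEEK_HOLE takes no upper
bound the filesystem may scan well beyond the block being asked about.

```python
import errno
import os


def range_has_data(fd, start, length):
    """Return True if SEEK_DATA finds any data in [start, start + length)."""
    try:
        data_off = os.lseek(fd, start, os.SEEK_DATA)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            # No data at or after start (hole up to EOF, or start >= EOF).
            return False
        raise
    return data_off < start + length


def range_fully_present(fd, start, length):
    """Return True if no hole is reported inside [start, start + length)."""
    try:
        # Data must begin exactly at start, otherwise start is in a hole.
        if os.lseek(fd, start, os.SEEK_DATA) != start:
            return False
        # The next hole (EOF counts as an implicit hole) must lie at or
        # beyond the end of the range.
        hole_off = os.lseek(fd, start, os.SEEK_HOLE)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return False
        raise
    return hole_off >= start + length
```

On a filesystem that doesn't track holes, the whole file is reported as
data, so both helpers then only reflect the file size.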

David


