linux-xfs.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Filipe Manana <fdmanana@kernel.org>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	xfs <linux-xfs@vger.kernel.org>,
	Filipe Manana <fdmanana@suse.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH 1/2] fs: allow deduplication of eof block into the end of the destination file
Date: Tue, 7 Jan 2020 09:57:39 -0800
Message-ID: <20200107175739.GC472651@magnolia>
In-Reply-To: <CAL3q7H5+CMRkJ9yAa2AeB0aKtA=b_yW2g9JSQwCOhOtLNrH1iQ@mail.gmail.com>

On Tue, Jan 07, 2020 at 04:23:15PM +0000, Filipe Manana wrote:
> On Mon, Dec 16, 2019 at 6:28 PM <fdmanana@kernel.org> wrote:
> >
> > From: Filipe Manana <fdmanana@suse.com>
> >
> > We always round down, to a multiple of the filesystem's block size, the
> > length to deduplicate at generic_remap_check_len().  However this is only
> > needed if an attempt to deduplicate the last block into the middle of the
> > destination file is requested, since that leads to corruption if the
> > length of the source file is not block size aligned.  When an attempt to
> > deduplicate the last block into the end of the destination file is
> > requested, we should allow it because it is safe to do it - there's no
> > stale data exposure and we are prepared to compare the data ranges for
> > a length not aligned to the block (or page) size - in fact we even do
> > the data compare before adjusting the deduplication length.
> >
> > After btrfs was updated to use the generic helpers from VFS (by commit
> > 34a28e3d77535e ("Btrfs: use generic_remap_file_range_prep() for cloning
> > and deduplication")) we started getting user reports that deduplication
> > no longer reflinks the last block, and hence that users see lower
> > deduplication scores.  The main use case is deduplication of entire
> > files that have a size not aligned to the block size of the filesystem.
> >
> > We already allow cloning the last block to the end (and beyond) of the
> > destination file, so allow for deduplication as well.
> >
> > Link: https://lore.kernel.org/linux-btrfs/2019-1576167349.500456@svIo.N5dq.dFFD/
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> 
> Darrick, Al, any feedback?

Is there an fstest to check for correct operation of dedupe at or beyond
source and destfile EOF?  Particularly if one range is /not/ at EOF?
And that an mmap read of the EOF block will see zeroes past EOF before
and after the dedupe operation?

If I fallocate a 16k file, write 'X' into the first 5000 bytes,
write 'X' into the first 66,440 bytes (60k + 5000) of a second file, and
then try to dedupe (first file, 0-8k) with (second file, 60k-68k),
should that work?

I'm convinced that we could support dedupe to EOF when the ranges of the
two files both end at the respective file's EOF, but it's the weirder
corner cases that I worry about...

--D

> Thanks.
> 
> > ---
> >  fs/read_write.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/read_write.c b/fs/read_write.c
> > index 5bbf587f5bc1..7458fccc59e1 100644
> > --- a/fs/read_write.c
> > +++ b/fs/read_write.c
> > @@ -1777,10 +1777,9 @@ static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
> >   * else.  Assume that the offsets have already been checked for block
> >   * alignment.
> >   *
> > - * For deduplication we always scale down to the previous block because we
> > - * can't meaningfully compare post-EOF contents.
> > - *
> > - * For clone we only link a partial EOF block above the destination file's EOF.
> > + * For clone we only link a partial EOF block above or at the destination file's
> > + * EOF.  For deduplication we accept a partial EOF block only if it ends at the
> > + * destination file's EOF (can not link it into the middle of a file).
> >   *
> >   * Shorten the request if possible.
> >   */
> > @@ -1796,8 +1795,7 @@ static int generic_remap_check_len(struct inode *inode_in,
> >         if ((*len & blkmask) == 0)
> >                 return 0;
> >
> > -       if ((remap_flags & REMAP_FILE_DEDUP) ||
> > -           pos_out + *len < i_size_read(inode_out))
> > +       if (pos_out + *len < i_size_read(inode_out))
> >                 new_len &= ~blkmask;
> >
> >         if (new_len == *len)
> > --
> > 2.11.0
> >
