From: Dave Chinner <david@fromorbit.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Jan Kara <jack@suse.cz>,
lsf-pc@lists.linux-foundation.org,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: [LSF/MM TOPIC] Lazy file reflink
Date: Tue, 29 Jan 2019 11:18:26 +1100
Message-ID: <20190129001826.GV4205@dastard>
In-Reply-To: <CAOQ4uxgUDoSc_nVrLM1An_tH_0NMVonA8npJLBbi0ibD+mwnMw@mail.gmail.com>
On Tue, Jan 29, 2019 at 12:56:17AM +0200, Amir Goldstein wrote:
> > > > What I just described above is actually already implemented with
> > > > Overlayfs snapshots [1], but for many applications overlayfs
> > > > snapshots are not a practical solution.
> > > >
> > > > I have based my assumption that reflink of a large file may incur
> > > > lots of metadata updates on my limited knowledge of xfs reflink
> > > > implementation, but perhaps it is not the case for other filesystems?
> >
> > Comparatively speaking: compared to copying a large file, reflink is
> > cheap on any filesystem that implements it. Sure, reflinking on XFS
> > is CPU limited, IIRC, to ~10-20,000 extents per second per reflink
> > op per AG, but it's still faster than copying 10-20,000 extents
> > per second per copy op on all but the very fastest, unloaded nvme
> > SSDs...
> >
>
> I think the concern is the added metadata load on the rest of the
> users. Backup app doesn't care about the time it consumes to clone
> before backup. But this concern is not based on actual numbers.
So what is it based on?
> > Really, though, for this use case it'd make more sense to have "per
> > file freeze" semantics. i.e. if you want a consistent backup image
> > on snapshot capable storage, the process is usually "freeze
> > filesystem, snapshot fs, unfreeze fs, do backup from snapshot,
> > remove snapshot". We can already transparently block incoming
> > writes/modifications on files via the freeze mechanism, so why not
> > just extend that to per-file granularity so writes to the "very
> > large read-mostly file" block while it's being backed up....
> >
> > Indeed, this would probably only require a simple extension to
> > FIFREEZE/FITHAW - the parameter is currently ignored, but as defined
> > by XFS it was a "freeze level". Set this to 0xffffffff and then it
> > freezes just the fd passed in, not the whole filesystem.
> > Alternatively, FI_FREEZE_FILE/FI_THAW_FILE is simple to define...
> >
>
> I think it's a good idea to add file freeze semantics to the toolbox
> of useful things that could be accomplished with reflink.
Reflink is already atomic w.r.t. other writes - in what way does a
"file freeze" have any impact on a reflink operation? That is, apart
from preventing it from being done, because reflink can modify the
source inode on XFS, too....
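
For reference, a minimal sketch of what "reflink a file" looks like from
userspace, assuming Linux and the FICLONE ioctl from <linux/fs.h>. The
helper name clone_or_copy() and its return strings are made up for
illustration, and the fallback byte copy is an addition of this sketch:
unlike the clone, that copy is not atomic against concurrent writers.

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int)
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Reflink src to dst; fall back to a plain byte copy if cloning fails.

    Hypothetical helper. Returns "reflink" or "copy" so the caller can
    tell which path was taken.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # The ioctl is issued on the destination fd; the argument is
            # the source fd. The filesystem shares all extents in one op.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "reflink"
        except OSError:
            # Filesystem without reflink support, cross-device clone, etc.
            # This fallback is NOT atomic w.r.t. concurrent writes.
            shutil.copyfileobj(src, dst)
            return "copy"
```

Whether the clone path is taken depends on the filesystem (e.g. XFS
with reflink enabled); everywhere else the sketch silently degrades to
a copy.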
> Especially with your plans for subvolumes as files.
> How is that coming along, by the way?
If I didn't have to spend so much time fire-fighting broken stuff,
I might make more progress.
> Anyway, freeze semantics alone won't work for our backup application
> that needs to be non-intrusive. Even if writes to the large file are
> few, backup may take time, so blocking those few writes for that long
> is not acceptable.
So, reflink is too expensive because there are only occasional
writes, but blocking that occasional write is too expensive, too,
even though it is rare?
> Blocking the writes for the setup time of a reflink
> is exactly what I was proposing and in your analogy,
No, I proposed a way to provide a -point in time snapshot- of a
file that doesn't require reflink or any other special filesystem
support.
> the block
> device is frozen only for a short period of time for setting up the
> snapshot and not for the duration of the backup.
Right, it's frozen for as long as it takes to set up a -point in
time snapshot- that the backup can be taken from. You don't need
that to reflink a file. You need it if you want to do something
other than a reflink....
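
The whole-filesystem version of that sequence can be sketched roughly
as follows, assuming Linux and the existing FIFREEZE/FITHAW ioctls
(which need CAP_SYS_ADMIN). The names frozen(), snapshot_then_backup(),
take_snapshot and run_backup are hypothetical placeholders, and the
ioctl parameter exists only so the ordering can be exercised without
privileges. The per-file variant discussed above does not exist and is
not shown.

```python
import fcntl
import os
from contextlib import contextmanager

# From <linux/fs.h>: _IOWR('X', 119, int) and _IOWR('X', 120, int)
FIFREEZE = 0xC0045877
FITHAW = 0xC0045878


@contextmanager
def frozen(mountpoint, ioctl=fcntl.ioctl):
    """Block writes to the filesystem at mountpoint for the with-block."""
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        ioctl(fd, FIFREEZE, 0)
        try:
            yield
        finally:
            # Always thaw, even if taking the snapshot raised.
            ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)


def snapshot_then_backup(mountpoint, take_snapshot, run_backup,
                         ioctl=fcntl.ioctl):
    """Freeze only while the snapshot is set up, then back up from it.

    take_snapshot and run_backup are caller-supplied callables
    (hypothetical). take_snapshot must not write to the frozen
    filesystem, or it will block on the freeze it is running under.
    """
    with frozen(mountpoint, ioctl):
        snap = take_snapshot()
    # Filesystem is thawed again here; the long-running part happens
    # against the snapshot, not under the freeze.
    return run_backup(snap)
```

The point of the structure is that the freeze window covers only the
snapshot setup, not the duration of the backup.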
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com