From: nate <linux-xfs@linuxpowered.net>
To: linux-xfs@vger.kernel.org
Subject: Re: XFS reflink copy to different filesystem performance question
Date: Thu, 17 Mar 2022 09:43:55 -0700
Message-ID: <3d9539b0f931cbb28dc26d68806f0b11@linuxpowered.net>
In-Reply-To: <20220316222304.GR3927073@dread.disaster.area>

On 2022-03-16 15:23, Dave Chinner wrote:
> reflink is not dedupe. file clones simply make a copy by reference,
> so it doesn't duplicate the data in the first place. IOWs, it ends
> up with a single physical copy that has multiple references to it.
> 
> dedupe is done by a different operation, which requires comparing
> the data in two different locations and if they are the same
> reducing it to a single physical copy with multiple references.

Yeah, sorry, I didn't phrase that statement right, but I understand
the situation.

> IIUC, you are asking about whether you can run a reflink copy on
> the destination before you run rsync, then do a delta sync using
> rsync to only move the changed blocks, so only store the changed
> blocks in the backup image?
> 
> If so, then yes. This is how a reflink-based file-level backup farm
> would work. It is very similar to a hardlink based farm, but instead
> of keeping a repository of every version of every file that is
> backed up in an object store and then creating the directory
> structure via hardlinks to the object store, it creates the new
> directory structure with reflink copies of the previous version and
> then does delta updates to the files directly.

ok thanks
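
A minimal sketch of the workflow quoted above, assuming GNU cp
(--reflink=always) and rsync are installed and the backup tree lives
on a reflink-enabled XFS filesystem. The function names and paths are
hypothetical and not from this thread. --inplace is there because
rsync otherwise writes a temporary file and renames it over the
target, which would replace the reflinked copy entirely;
--no-whole-file is there because rsync skips its delta algorithm by
default when both paths are local.

```python
import subprocess


def build_backup_commands(source, previous, new):
    """Return the two commands for one backup generation, without running them.

    source   -- directory being backed up (hypothetical path)
    previous -- last backup generation on the reflink-capable filesystem
    new      -- generation to create; it must not exist yet
    """
    # Step 1: clone the previous generation by reference (no data copied).
    clone = ["cp", "-a", "--reflink=always", previous, new]
    # Step 2: delta-sync the source into the clone, updating files in place
    # so that unchanged blocks can stay shared with the previous generation.
    sync = [
        "rsync", "-a", "--delete", "--inplace", "--no-whole-file",
        source.rstrip("/") + "/", new.rstrip("/") + "/",
    ]
    return [clone, sync]


def run_backup(source, previous, new):
    """Run both steps, stopping at the first failure."""
    for argv in build_backup_commands(source, previous, new):
        subprocess.run(argv, check=True)
```

Calling run_backup("/data", "/backup/day1", "/backup/day2") would
clone day1 to day2 and then sync /data into day2. How much space that
actually saves depends on how the files change between runs.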


> I haven't confirmed anything, just made a guess same as you have.

Well, good enough for me. Thanks anyway!


> That sounds more like the dedupe process searching for duplicate
> blocks to dedupe....

I think so too.

> You can use FIEMAP (filefrag(1) or xfs_bmap(8)) to tell you if a
> specific extent is shared or not. But it cannot tell you how many
> references there are to it, nor what file those references belong
> to. For that, you need root permissions, ioctl_getfsmap(2) and
> rmapbt=1 support in your filesystem.

Sounds more complex than I would like to deal with.
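
For the simpler FIEMAP half of that (is an extent shared or not), a
rough sketch of what filefrag(1) does under the hood, assuming Linux
and the constants and struct layouts from <linux/fiemap.h>. The helper
names are hypothetical. As noted in the quote, this only reports
whether an extent is shared, not how many references exist or which
files hold them.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B      # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1          # flush the file before mapping
FIEMAP_EXTENT_LAST = 0x1        # last extent in the file
FIEMAP_EXTENT_SHARED = 0x2000   # extent is referenced by more than one owner

# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved
HEADER = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3]
EXTENT = struct.Struct("=QQQQQIIII")


def decode_extents(buf):
    """Decode a filled struct fiemap buffer into (logical, physical, length, flags)."""
    mapped = HEADER.unpack_from(buf, 0)[3]
    extents = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((fields[0], fields[1], fields[2], fields[5]))
    return extents


def file_extents(path, batch=128):
    """Return every extent of a file, querying FIEMAP in batches."""
    extents, start = [], 0
    with open(path, "rb") as fh:
        while True:
            buf = bytearray(HEADER.size + batch * EXTENT.size)
            HEADER.pack_into(buf, 0, start, 2**64 - 1 - start,
                             FIEMAP_FLAG_SYNC, 0, batch, 0)
            fcntl.ioctl(fh.fileno(), FS_IOC_FIEMAP, buf)  # kernel fills buf
            chunk = decode_extents(buf)
            if not chunk:
                break
            extents.extend(chunk)
            if chunk[-1][3] & FIEMAP_EXTENT_LAST:
                break
            start = chunk[-1][0] + chunk[-1][2]  # resume after the last extent
    return extents


def shared_ranges(extents):
    """Keep only the (logical offset, length) of extents flagged as shared."""
    return [(lo, length) for lo, _phys, length, flags in extents
            if flags & FIEMAP_EXTENT_SHARED]
```

shared_ranges(file_extents(path)) would then list the byte ranges of a
file that are backed by shared extents; an empty list means nothing in
that file is currently shared.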

> Unless you have an immediate use for filesystem metadata level
> introspection (generally unlikely), there's no need to enable it.

ok thanks for the info.

I am leaving the list now, thanks a bunch for the replies.

nate
