From: "André Malm" <admin@sheepa.org>
To: Chris Murphy <lists@colorremedies.com>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs send with parent different size depending on source of files.
Date: Tue, 19 Feb 2019 01:28:28 +0100
Message-ID: <c1fc8b54-9cf6-f191-57ba-7621edefd139@sheepa.org>
In-Reply-To: <CAJCQCtSc=T=Umg4bz2M6o1bZin7Y967827Jrfkeh=nCPLy6dGQ@mail.gmail.com>
Rsync is probably a bad idea, yes. I could btrfs send -p the changed
"new" master subvolume, then delete the old master subvolume, and then
reference the new master subvolume when transferring later on, I guess?
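A minimal sketch of that rotation, written as a dry run that only prints the commands. POOL, REMOTE and the snapshot names (master.1, master.2) are hypothetical placeholders, not names from this thread. It assumes master.1 is a read-only snapshot that already exists on both machines via send/receive.

```sh
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
# POOL, REMOTE and all subvolume names are hypothetical placeholders.
POOL=/mnt/pool
REMOTE=machine-b

plan_master_rotation() {
    old=$1   # read-only master snapshot present on both machines
    new=$2   # name for the new read-only master snapshot
    # Freeze the current state of the writable master subvolume.
    echo "btrfs subvolume snapshot -r $POOL/master $POOL/$new"
    # Send only the difference between the old and new master snapshots.
    echo "btrfs send -p $POOL/$old $POOL/$new | ssh $REMOTE btrfs receive $POOL"
    # Drop the old snapshot on both sides. Assumption: no child that still
    # needs $old as its send parent is waiting to be transferred.
    echo "btrfs subvolume delete $POOL/$old"
    echo "ssh $REMOTE btrfs subvolume delete $POOL/$old"
}

plan_master_rotation master.1 master.2
```

The delete steps are the risky part: a child that was created from the old master can only be sent incrementally while that old snapshot still exists on both ends.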
I'll explain the problem I'm trying to solve a bit better:
Say I have a program that will run in multiple instances. The program
requires a dataset of large files to run (say 20 GB). The dataset will be
updated over time, i.e. parts of it will change. These changes should only
apply to new instances of the program. The program will also generate
new data (both new files and changes to data in the shared
dataset) that is unique to that instance's child subvolume. Finally,
I need to transfer the program together with its generated data to
a remote machine to continue its processing there. What I want to
achieve is to avoid transferring the entire dataset when only small
parts of it are changed by the program. I also want to avoid having
duplicate copies of the data on the remote machine.
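Roughly, the per-instance flow I have in mind looks like the dry-run sketch below, which only prints the commands. POOL, REMOTE, master.1 and instance-42 are hypothetical placeholders. It assumes master.1 is a read-only snapshot of the dataset that the remote machine has already received.

```sh
#!/bin/sh
# Dry-run sketch: prints the commands for one instance instead of running them.
# POOL, REMOTE and all subvolume names are hypothetical placeholders.
POOL=/mnt/pool
REMOTE=machine-b

plan_instance_transfer() {
    master_snap=$1   # read-only master snapshot already on the remote
    instance=$2      # name of the per-instance child subvolume
    # Writable child that initially shares all extents with the master snapshot.
    echo "btrfs subvolume snapshot $POOL/$master_snap $POOL/$instance"
    # (the program runs inside $POOL/$instance at this point)
    # Freeze the child: btrfs send only accepts read-only subvolumes.
    echo "btrfs subvolume snapshot -r $POOL/$instance $POOL/$instance.ro"
    # Incremental stream relative to the master snapshot.
    echo "btrfs send -p $POOL/$master_snap $POOL/$instance.ro | ssh $REMOTE btrfs receive $POOL"
}

plan_instance_transfer master.1 instance-42
```

Later updates to the master would go into a new snapshot (master.2 and so on), so instances already created keep seeing the dataset as it was when they started.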
On 2019-02-19 01:16, Chris Murphy wrote:
> On Mon, Feb 18, 2019 at 4:58 PM André Malm <admin@sheepa.org> wrote:
>> I assume I would have to use rsync (with --inplace possibly) to keep the
>> master volume in sync between machines?
> Why?
>
> You previously said you didn't want to do that:
> " if I change / remove, say 10 GB worth of data from the
> master subvolume unrelated to the child subvolume I don't want those
> gigabytes sent down the wire with btrfs send as they are unrelated. "
>
> So instead of sending those changes with send you're going to send the
> changes with rsync which is even more inefficient?
>
>> Say for example I have a (large) file on master, on machine A, I cp
>> reflink it to a child subvolume. I then send -p child subvolume to
>> remote machine B (which already has the master volume).
> 1. The master snapshots must be identical snapshots on source and
> destination; it must have been replicated using send/receive or
> receive will complain.
> 2. I don't know what "send -p child subvolume to remote" means because
> -p option means two snapshots must be included and I don't know what
> two snapshots you're planning on using, so I can't answer the
> question.
>
>> Then I change
>> parts of the file on the master of machine A. I then rsync (?) the
>> master volume so it's the same across the machines. Can I then later send
>> -p the child volume, either back to the original machine (A) or to a 3rd
>> machine (C) given that the master volumes are synced?
> rsync'd subvolumes across volumes aren't considered identical by btrfs
> receive. I expect they can't be used for incremental send/receive.
> What you're trying to achieve, big picture, isn't clear. You're
> describing it in terms of send/receive and confusing it with rsync. I
> don't understand what rsync has to do with this.
>
> You realize if you merely move a large file within a subvolume, and
> then use rsync to a remote system, it will delete the original
> file and then sync over the network that same data to the new location?
> Rsync has no idea how to just move files around. Btrfs send-receive
> does.
>
>> About the efficiency I'm not planning to remove large amounts of data
>> that is used by child subvolumes (although some will be updated). But
>> given the unpredictability of what files will be used by child
>> subvolumes I might remove large unused amounts of data.
> OK?
>
>