linux-btrfs.vger.kernel.org archive mirror
From: Chris Murphy <lists@colorremedies.com>
To: "André Malm" <admin@sheepa.org>
Cc: Chris Murphy <lists@colorremedies.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs send with parent different size depending on source of files.
Date: Mon, 18 Feb 2019 20:54:24 -0700
Message-ID: <CAJCQCtQzGCvn3DjPQUndRv=MmhVGzz5Nr=JL7p8kJim9ABmN3g@mail.gmail.com>
In-Reply-To: <c1fc8b54-9cf6-f191-57ba-7621edefd139@sheepa.org>

On Mon, Feb 18, 2019 at 5:28 PM André Malm <admin@sheepa.org> wrote:
>
> Rsync is probably a bad idea, yes. I could btrfs send -p the changed
> "new" master subvolume, then delete the old master subvolume, and then
> reference the new master subvolume when transferring it later on, I guess?

I'm not sure how your application reacts to snapshots or reflinks, or
how it updates its files. All of that needs to be tested to see what
the incremental send size is, and if the resulting received snapshot
contains files with the integrity your application expects, and so on.
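
One way to check the incremental send size is to count the bytes of
the send stream instead of receiving it. This is only a sketch: the
helper name stream_bytes is made up for illustration, and the btrfs
invocation in the comment assumes read-only snapshots named
master.snap and child.snap and needs root.

```shell
#!/bin/sh
# Hypothetical helper: count the bytes arriving on stdin.
# tr strips the padding some wc implementations add.
stream_bytes() {
    wc -c | tr -d ' '
}

# Assumed usage (not executed here; needs root and read-only snapshots):
#   btrfs send -p master.snap child.snap | stream_bytes
```

Comparing that number across the different ways the application
updates its files shows which update pattern keeps the increment small.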

>
> I'll explain the problem I'm trying to solve a bit better:
>
> Say I have a program that will run in multiple instances. The program
> requires a dataset of large files to run (say 20GB). The dataset will be
> updated over time, i.e. parts of it change. These changes should only
> apply to new instances of the program. The program will also generate
> new data (both new files and changes to data in the shared dataset)
> that is unique to the instance of the child subvolume. Finally, I need
> to transfer the program together with its generated data to another
> remote machine to continue its processing there. What I want to achieve
> is to avoid transferring the entire dataset when only small parts of it
> are changed by the program. I also want to avoid having duplicate
> copies of the data on the remote machine.

Yep. Based on this description though, the only time I grok using
'btrfs send -p master.snap child.snap | btrfs receive /destination/'
is for the initial transfer of child. Master must already be fully
replicated on the destination. Now you can snapshot master and child
on separate schedules to account for their different use cases, and
send their increments independently of each other. Or in fact maybe
you'll realize you do have a use case for clone.
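
As a rough illustration of that sequence, here is a dry-run sketch
that only prints the pipelines instead of running them. The helper
names full_send and incr_send and the later snapshot names
master.snap2 and child.snap2 are hypothetical; a real run needs
read-only snapshots and root.

```shell
#!/bin/sh
# Dry-run sketch: each helper only PRINTS the pipeline it stands for.
# Snapshot names and /destination/ are placeholders, not real paths.

# Full send: done once so the receiver holds a complete copy of master.
full_send() {
    printf 'btrfs send %s | btrfs receive %s\n' "$1" "$2"
}

# Incremental send: only the differences between parent $1 and snapshot $2.
incr_send() {
    printf 'btrfs send -p %s %s | btrfs receive %s\n' "$1" "$2" "$3"
}

# 1. Replicate master completely.
full_send master.snap /destination/
# 2. Initial transfer of child, expressed as a delta against master.
incr_send master.snap child.snap /destination/
# 3. Later on, each subvolume sends increments against its own
#    previous snapshot, on its own schedule.
incr_send master.snap master.snap2 /destination/
incr_send child.snap child.snap2 /destination/
```

After step 2, master and child no longer depend on each other for
sending: each one only needs its own previous snapshot to be present
on both sides.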

Have you looked at GlusterFS or Ceph for this use case? I kinda wonder
whether it would be simpler to have a clustered file system make all of
the send/receive stuff go away, so that your data is replicated pretty
much immediately and is always available to all computers. *shrug*
That's off topic, but I'm curious if there are ways to simplify this
for your use case.



-- 
Chris Murphy
