linux-btrfs.vger.kernel.org archive mirror
From: Hugo Mills <hugo@carfax.org.uk>
To: Graham Cobb <g.btrfs@cobb.uk.net>
Cc: Eric Levy <contact@ericlevy.name>, linux-btrfs@vger.kernel.org
Subject: Re: receive failing for incremental streams
Date: Thu, 16 Dec 2021 11:38:48 +0000
Message-ID: <20211216113848.GA21083@savella.carfax.org.uk>
In-Reply-To: <e479561d-98be-5da2-4853-e697eb9690b3@cobb.uk.net>

On Thu, Dec 16, 2021 at 10:24:09AM +0000, Graham Cobb wrote:
> On 16/12/2021 01:13, Eric Levy wrote:
> >> Later you snapshot /data again to create /data-2 on the source
> >> system. You btrfs-send /data-2 to the other system again and it
> >> creates a new read-only subvolume - you tell btrfs-receive what to
> >> call it and where to put it, let's say you call it /copy-data-2 -
> >> using the data in the stream and reusing some extents from the
> >> existing /copy-data-1. /copy-data-2 is now a (read-only) copy of
> >> /data-2 from the source system.
> >>
> >> How you use that copy is up to you. If you are just taking backups
> >> you probably do nothing with it unless you have a problem (it will
> >> form part of the source for data for any future /copy-data-3). If
> >> you want to use it to initialize a read-write subvolume on the
> >> destination system you can take a read-write snapshot of
> >> /copy-data-2 to create a new subvolume (say /my-new-data) on the
> >> destination system.
> > 
> > Such is close to what I have always understood about receive, but the
> > confusion is that the second receive command makes no reference to the
> > subvolume created by the first command. How do I ultimately create a
> > restore target that combines the original full capture with the
> > incremental differences?
> 
> It's just magic. Seriously, as long as you have already restored the
> parent (and any clone sources, if you have specified those) to the same
> filesystem, btrfs will find them and clone the necessary files into the
> new subvolume.

   This is what happens:

Sending machine                        Receiving machine

$ send A
    Send all the data of A
    plus its UUID (uA)

                                       $ receive
                                           Make a new subvol, A'
                                           Write all the data to it
                                           Set "received_uuid" on A' to uA
                                           Make A' read-only

$ send B -p A
    Send the differences between A
    and B, plus their UUIDs, uA and uB

                                       $ receive
                                           Find the subvol with
                                           "received_uuid" == uA (this is A')
                                           Snapshot it to B'
                                           Modify B' using the differences
                                           Set "received_uuid" of B' to uB
                                           Make B' read-only
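
   One way to see this matching for yourself is to compare the "UUID"
of the sent snapshot with the "Received UUID" of the received copy;
both fields are printed by "btrfs subvolume show". The following is
only a minimal sketch that assumes that output layout. The helper name
uuid_field and the paths in the usage note are made up for
illustration.

```shell
# Print the value of one named field from `btrfs subvolume show` output
# read on stdin. Hypothetical helper: it only parses text, it does not
# call btrfs itself.
uuid_field() {
    awk -F':' -v key="$1" '
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim the field name
        }
        name == key {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)    # trim the value
            print val
        }
    '
}
```

   For example, "btrfs subvolume show /data-1 | uuid_field UUID" on the
sender and "btrfs subvolume show /recv/data-1 | uuid_field 'Received UUID'"
on the receiver should print the same value if the receive worked.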


> > When I ask how I use it, I mean what commands do I enter into the
> > system.
> 
> Assume subvolume called /data.
> 
> On the sending side...
> 
> btrfs subv snapshot -r /data /data-1
> btrfs send /data-1 -f data-1.send
> 
> Later, to generate the incremental stream from /data-1...
> 
> btrfs subv snapshot -r /data /data-2
> btrfs send -p /data-1 /data-2 -f data-2.send
> 
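> When you want to restore (receive takes an existing directory on the
> destination btrfs filesystem and creates the subvolume inside it,
> under its original name)...
> 
> mkdir /recv
> btrfs receive -f data-1.send /recv
> btrfs receive -f data-2.send /recv
> 
> If you want read-write access to the data you need to create a new
> subvolume...
> 
> btrfs subv snapshot /recv/data-2 /new-data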
> 
> [I haven't tested these so sorry for any mistakes - hopefully you get
> the idea]
> 
> > 
> > Note in my case I archive the streams into regular (compressed) files
> > for later recovery.
> 
> I considered doing that but I don't recommend it. The biggest issue is
> that you have to keep all the incrementals since the last full backup,
> as all the steps must complete in order to restore. This means that if
> something has gone wrong with the archive (even a single bit corruption,
> or an unexpected truncation) all the incremental streams after that
> point are useless. btrfs receive doesn't have a "try hard" mode - it
> will just fail unless all the sources it needs, and the stream it is
> processing, are perfect. And you don't know, unless you do regular test
> restorations.
> 
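
   If you do archive streams as files, one cheap way to at least
detect bit corruption or truncation is to keep a checksum beside each
stream. This is a sketch only: the function names are hypothetical,
it assumes sha256sum from coreutils is available, and it can tell you
that a stream is damaged but cannot repair it or replace a real test
restoration.

```shell
# Record a SHA-256 checksum next to an archived stream file.
# The checksum file stores the path exactly as given here, so verify
# from the same working directory or use absolute paths.
record_checksum() {
    sha256sum "$1" > "$1.sha256"
}

# Succeed only if the stream still matches its recorded checksum.
verify_checksum() {
    sha256sum -c "$1.sha256" > /dev/null 2>&1
}
```

   Usage would be along the lines of "record_checksum data-1.send"
right after the send, and "verify_checksum data-1.send" before any
attempt to receive it.
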
> In the end I decided I would keep the archive subvolumes themselves, not
> the streams. Even in the worst case, this takes very little more space
> (assuming you have turned on compression) - after all the cloned data is
> still cloned. And even if something has been corrupted you can still get
> at undamaged files in the various subvolumes. And if you make sure that
> each send stream is only using the directly previous snapshot as its
> clone source, you can remove any older snapshots that you like without
> making later subvolumes useless.
> 
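
   As a rough sketch of keeping subvolumes rather than streams, the
send can be piped straight into a receive on the backup filesystem.
The helper below only prints the pipeline so it can be reviewed before
it is run. The function name and all paths are hypothetical, and it
assumes the destination directory already exists on a btrfs filesystem
that holds the received copy of the parent.

```shell
# Print (do not run) the pipeline that replicates snapshot $1 into
# directory $2, incrementally against parent snapshot $3 if given.
replicate_cmd() {
    snap=$1
    dest_dir=$2
    parent=${3:-}
    if [ -n "$parent" ]; then
        printf 'btrfs send -p %s %s | btrfs receive %s\n' \
            "$parent" "$snap" "$dest_dir"
    else
        printf 'btrfs send %s | btrfs receive %s\n' "$snap" "$dest_dir"
    fi
}
```

   For instance, "replicate_cmd /data-1 /backup" gives the initial full
copy and "replicate_cmd /data-2 /backup /data-1" the incremental one
that uses only the directly previous snapshot as its parent.
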
> Once I decided that, I ended up using btrbk - which makes a good job of
> managing the backup and archive subvolumes, on both the source system
> and the destination system. Of course, many other tools are available.
> 

-- 
Hugo Mills             | Computer Science is not about computers, any more
hugo@... carfax.org.uk | than astronomy is about telescopes.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                       Edsger Dijkstra

Thread overview: 9+ messages
2021-12-15 20:27 receive failing for incremental streams Eric Levy
2021-12-15 23:35 ` Graham Cobb
2021-12-15 23:52   ` Eric Levy
2021-12-16  0:55     ` Graham Cobb
2021-12-16  1:13       ` Eric Levy
2021-12-16 10:24         ` Graham Cobb
2021-12-16 11:38           ` Hugo Mills [this message]
2021-12-18 23:53             ` Eric Levy
2021-12-16  5:36 ` Andrei Borzenkov
